gemma.cpp/gemma
Charles Zhao 50ee1a3e92 Write SBS progressively.
(1) Write directly to the file in BlobWriter::Add and destroy the MatOwner to release its RAM.

(2) Write a fake header to indicate that this is V2, then write the correct header and directory at the end of the file.

(3) Tested loading SBS files written both the old way and the new way; both worked.
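
The progressive-write idea in this commit message can be illustrated with a minimal, self-contained sketch. It is not the gemma.cpp implementation: the `ProgressiveWriter` class, the `DirEntry` struct, the `ReadAll` helper, the "DEMO" magic and the header layout are all hypothetical, and the real SBS V2 format may differ (for instance in where the final header lives). The sketch assumes one simple variant: write a placeholder header, stream each blob to disk as soon as it is added, append the directory at the end, then seek back and overwrite the placeholder.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical layout (NOT the real SBS format):
// [magic(4) | dir_offset(8) | num_blobs(8)] [blob bytes...] [directory]
struct DirEntry {
  uint64_t offset;  // Byte offset of the blob from the start of the file.
  uint64_t size;    // Blob size in bytes.
};

class ProgressiveWriter {
 public:
  explicit ProgressiveWriter(const std::string& path)
      : out_(path, std::ios::binary | std::ios::trunc) {
    WriteHeader(0, 0);  // Placeholder; patched in Finish().
  }

  // Writes the blob immediately, so the caller may free its buffer afterwards.
  void Add(const void* data, uint64_t size) {
    const uint64_t offset = static_cast<uint64_t>(out_.tellp());
    out_.write(static_cast<const char*>(data),
               static_cast<std::streamsize>(size));
    dir_.push_back({offset, size});
  }

  // Appends the directory, then overwrites the placeholder header.
  bool Finish() {
    const uint64_t dir_offset = static_cast<uint64_t>(out_.tellp());
    for (const DirEntry& e : dir_) {
      out_.write(reinterpret_cast<const char*>(&e), sizeof(e));
    }
    out_.seekp(0);
    WriteHeader(dir_offset, dir_.size());
    out_.flush();
    return out_.good();
  }

 private:
  void WriteHeader(uint64_t dir_offset, uint64_t num_blobs) {
    out_.write("DEMO", 4);
    out_.write(reinterpret_cast<const char*>(&dir_offset), sizeof(dir_offset));
    out_.write(reinterpret_cast<const char*>(&num_blobs), sizeof(num_blobs));
  }

  std::ofstream out_;
  std::vector<DirEntry> dir_;  // Only small metadata stays in memory.
};

// Reads every blob back by following the header and directory.
std::vector<std::string> ReadAll(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  char magic[4];
  uint64_t dir_offset = 0, num_blobs = 0;
  in.read(magic, 4);
  assert(std::memcmp(magic, "DEMO", 4) == 0);
  in.read(reinterpret_cast<char*>(&dir_offset), sizeof(dir_offset));
  in.read(reinterpret_cast<char*>(&num_blobs), sizeof(num_blobs));
  std::vector<std::string> blobs;
  for (uint64_t i = 0; i < num_blobs; ++i) {
    DirEntry e;
    in.seekg(static_cast<std::streamoff>(dir_offset + i * sizeof(DirEntry)));
    in.read(reinterpret_cast<char*>(&e), sizeof(e));
    std::string blob(e.size, '\0');
    in.seekg(static_cast<std::streamoff>(e.offset));
    in.read(&blob[0], static_cast<std::streamsize>(e.size));
    blobs.push_back(blob);
  }
  return blobs;
}
```

In this sketch only the per-blob offsets and sizes are kept in memory until `Finish()`, which is what would allow each tensor buffer to be released right after `Add()` returns.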

PiperOrigin-RevId: 789306837
2025-07-31 06:05:38 -07:00
bindings Rename GetModelConfig->Config 2025-07-29 10:18:12 -07:00
evals Add MMLU eval to github 2024-05-20 10:20:53 -07:00
activations.h 1.14x batch decode speedup: parallelize RMSNorm ops 2025-07-30 00:55:45 -07:00
attention.cc 1.14x batch decode speedup: parallelize RMSNorm ops 2025-07-30 00:55:45 -07:00
attention.h Back to f32 kv_cache, but via typedef 2025-07-21 07:05:35 -07:00
configs.cc Internal change. 2025-07-29 08:21:29 -07:00
configs.h Add blob_path to config deduction message 2025-07-11 18:58:56 -07:00
configs_test.cc Minor: rename compression/shared -> types.h 2025-05-13 06:53:21 -07:00
gemma-inl.h 1.14x batch decode speedup: parallelize RMSNorm ops 2025-07-30 00:55:45 -07:00
gemma.cc Write SBS progressively. 2025-07-31 06:05:38 -07:00
gemma.h Fix file mapping: was letting the smart pointer go out of scope 2025-07-30 04:30:10 -07:00
gemma_args.h 1.14x batch decode speedup: parallelize RMSNorm ops 2025-07-30 00:55:45 -07:00
griffin.cc Speed up builds by skipping rarely used targets 2025-06-17 05:44:20 -07:00
griffin.h Major refactor: clarify query_idx (global) vs qi. Refs #607 2025-06-16 02:42:02 -07:00
kv_cache.cc De-singleton ThreadingContext so callers can pass in their own 2025-07-22 02:08:46 -07:00
kv_cache.h De-singleton ThreadingContext so callers can pass in their own 2025-07-22 02:08:46 -07:00
model_store.cc Write SBS progressively. 2025-07-31 06:05:38 -07:00
model_store.h Write SBS progressively. 2025-07-31 06:05:38 -07:00
run.cc Fix file mapping: was letting the smart pointer go out of scope 2025-07-30 04:30:10 -07:00
tensor_info.cc Major refactor to de-templatize gemma-inl and weights 2025-06-02 23:01:35 -07:00
tensor_info.h Minor: rename compression/shared -> types.h 2025-05-13 06:53:21 -07:00
tensor_info_test.cc Minor: ModelWeightsPtrs -> WeightsPtrs 2025-07-11 06:11:51 -07:00
tokenizer.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
tokenizer.h 6x large-batch, short-prompt prefill speedup 2025-06-10 09:56:20 -07:00
vit.cc 1.14x batch decode speedup: parallelize RMSNorm ops 2025-07-30 00:55:45 -07:00
vit.h Minor: ModelWeightsPtrs -> WeightsPtrs 2025-07-11 06:11:51 -07:00
weights.cc Fix file mapping: was letting the smart pointer go out of scope 2025-07-30 04:30:10 -07:00
weights.h Fix file mapping: was letting the smart pointer go out of scope 2025-07-30 04:30:10 -07:00