gemma.cpp

Charles Zhao	50ee1a3e92	Write SBS progressively. (1) Write directly to the file in BlobWriter::Add and destroy the MatOwner to release its RAM. (2) Write a fake header to indicate that this is V2, then write the correct header and directory at the end of the file. (3) Tested loading SBS files written the old way and the new way; both worked. PiperOrigin-RevId: 789306837	2025-07-31 06:05:38 -07:00
bindings	Rename GetModelConfig->Config	2025-07-29 10:18:12 -07:00
evals	Add MMLU eval to github	2024-05-20 10:20:53 -07:00
activations.h	1.14x batch decode speedup: parallelize RMSNorm ops	2025-07-30 00:55:45 -07:00
attention.cc	1.14x batch decode speedup: parallelize RMSNorm ops	2025-07-30 00:55:45 -07:00
attention.h	Back to f32 kv_cache, but via typedef	2025-07-21 07:05:35 -07:00
configs.cc	Internal change.	2025-07-29 08:21:29 -07:00
configs.h	Add blob_path to config deduction message	2025-07-11 18:58:56 -07:00
configs_test.cc	Minor: rename compression/shared -> types.h	2025-05-13 06:53:21 -07:00
gemma-inl.h	1.14x batch decode speedup: parallelize RMSNorm ops	2025-07-30 00:55:45 -07:00
gemma.cc	Write SBS progressively.	2025-07-31 06:05:38 -07:00
gemma.h	Fix file mapping: was letting the smart pointer go out of scope	2025-07-30 04:30:10 -07:00
gemma_args.h	1.14x batch decode speedup: parallelize RMSNorm ops	2025-07-30 00:55:45 -07:00
griffin.cc	Speed up builds by skipping rarely used targets	2025-06-17 05:44:20 -07:00
griffin.h	Major refactor: clarify query_idx (global) vs qi. Refs #607	2025-06-16 02:42:02 -07:00
kv_cache.cc	De-singleton ThreadingContext so callers can pass in their own	2025-07-22 02:08:46 -07:00
kv_cache.h	De-singleton ThreadingContext so callers can pass in their own	2025-07-22 02:08:46 -07:00
model_store.cc	Write SBS progressively.	2025-07-31 06:05:38 -07:00
model_store.h	Write SBS progressively.	2025-07-31 06:05:38 -07:00
run.cc	Fix file mapping: was letting the smart pointer go out of scope	2025-07-30 04:30:10 -07:00
tensor_info.cc	Major refactor to de-templatize gemma-inl and weights	2025-06-02 23:01:35 -07:00
tensor_info.h	Minor: rename compression/shared -> types.h	2025-05-13 06:53:21 -07:00
tensor_info_test.cc	Minor: ModelWeightsPtrs -> WeightsPtrs	2025-07-11 06:11:51 -07:00
tokenizer.cc	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
tokenizer.h	6x large-batch, short-prompt prefill speedup	2025-06-10 09:56:20 -07:00
vit.cc	1.14x batch decode speedup: parallelize RMSNorm ops	2025-07-30 00:55:45 -07:00
vit.h	Minor: ModelWeightsPtrs -> WeightsPtrs	2025-07-11 06:11:51 -07:00
weights.cc	Fix file mapping: was letting the smart pointer go out of scope	2025-07-30 04:30:10 -07:00
weights.h	Fix file mapping: was letting the smart pointer go out of scope	2025-07-30 04:30:10 -07:00