gemma.cpp


Jan Wassenberg 9a02d6be68 Add --prompt_file and testdata for it. Refs #608 Linux terminals truncate input after 4096 chars. testdata is Frankenstein from Project Gutenberg, which is long out of copyright. Also fix loss of coherence after long context caused by incorrect IsGlobalLayer. Move that to config.h and use max_seq_len as the initializer to make this clear. Also avoid dynamic allocation for GriffinActivations. PiperOrigin-RevId: 772333225		2025-06-16 23:41:07 -07:00
bindings	Major refactor: clarify query_idx (global) vs qi. Refs #607	2025-06-16 02:42:02 -07:00
evals	Add MMLU eval to github	2024-05-20 10:20:53 -07:00
activations.h	Add --prompt_file and testdata for it. Refs #608	2025-06-16 23:41:07 -07:00
attention.cc	Add --prompt_file and testdata for it. Refs #608	2025-06-16 23:41:07 -07:00
attention.h	Split Activations into Griffin/Attention to reduce memory usage for attention-only tests.	2025-06-16 07:52:59 -07:00
configs.cc	Add --prompt_file and testdata for it. Refs #608	2025-06-16 23:41:07 -07:00
configs.h	Add --prompt_file and testdata for it. Refs #608	2025-06-16 23:41:07 -07:00
configs_test.cc	Minor: rename compression/shared -> types.h	2025-05-13 06:53:21 -07:00
gemma-inl.h	Further cleanup: separate MatMulEnv arg	2025-06-05 20:48:32 -07:00
gemma.cc	Split Activations into Griffin/Attention to reduce memory usage for attention-only tests.	2025-06-16 07:52:59 -07:00
gemma.h	Add `Append` method to `AllQueries`	2025-06-16 20:39:27 +08:00
gemma_args.h	Add --prompt_file and testdata for it. Refs #608	2025-06-16 23:41:07 -07:00
griffin.cc	Add --prompt_file and testdata for it. Refs #608	2025-06-16 23:41:07 -07:00
griffin.h	Major refactor: clarify query_idx (global) vs qi. Refs #607	2025-06-16 02:42:02 -07:00
kv_cache.cc	MatPtr-ify KV, shared div_seq_len, --seq_len flag	2025-06-11 09:49:38 -07:00
kv_cache.h	MatPtr-ify KV, shared div_seq_len, --seq_len flag	2025-06-11 09:49:38 -07:00
model_store.cc	Fix paligemma_test, refs #588	2025-06-03 04:45:22 -07:00
model_store.h	Remove backprop/	2025-05-28 07:01:17 -07:00
run.cc	Add --prompt_file and testdata for it. Refs #608	2025-06-16 23:41:07 -07:00
tensor_info.cc	Major refactor to de-templatize gemma-inl and weights	2025-06-02 23:01:35 -07:00
tensor_info.h	Minor: rename compression/shared -> types.h	2025-05-13 06:53:21 -07:00
tensor_info_test.cc	Major refactor to de-templatize gemma-inl and weights	2025-06-02 23:01:35 -07:00
tokenizer.cc	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
tokenizer.h	6x large-batch, short-prompt prefill speedup	2025-06-10 09:56:20 -07:00
vit.cc	Split Activations into Griffin/Attention to reduce memory usage for attention-only tests.	2025-06-16 07:52:59 -07:00
vit.h	Further cleanup: separate MatMulEnv arg	2025-06-05 20:48:32 -07:00
weights.cc	Avoid warning about inability to map, unless explicitly requested	2025-06-05 09:10:08 -07:00
weights.h	Split gemma-inl into separate source files	2025-06-05 05:36:44 -07:00