gemma.cpp/gemma
Daniel Keysers a8e08778d4 Add an additional QueryModel() overload to GemmaEnv.
Use args only in GemmaEnv constructor, store everything else in RuntimeConfig.
Add runtime option to turn off thread spinning.

PiperOrigin-RevId: 670467320
2024-09-03 02:25:19 -07:00
evals Add MMLU eval to github 2024-05-20 10:20:53 -07:00
instantiations Rename Gemma9B and Gemma27B to Gemma2_9B and Gemma2_27B. 2024-08-09 02:09:06 -07:00
activations.h Major MatMul update, 1.9-2.3x speedup on Zen4 via bf16 mul 2024-08-16 07:52:20 -07:00
common.cc Rename Gemma9B and Gemma27B to Gemma2_9B and Gemma2_27B. 2024-08-09 02:09:06 -07:00
common.h Major MatMul update, 1.9-2.3x speedup on Zen4 via bf16 mul 2024-08-16 07:52:20 -07:00
configs.h Major MatMul update, 1.9-2.3x speedup on Zen4 via bf16 mul 2024-08-16 07:52:20 -07:00
gemma-inl.h Fix asan failure in local attention computation. 2024-09-02 07:06:10 -07:00
gemma.cc Add an additional QueryModel() overload to GemmaEnv. 2024-09-03 02:25:19 -07:00
gemma.h Add an additional QueryModel() overload to GemmaEnv. 2024-09-03 02:25:19 -07:00
kv_cache.cc Major revamp #2 of Prefill: fix token order, parallel for multi-query 2024-07-25 03:28:55 -07:00
kv_cache.h Major revamp #2 of Prefill: fix token order, parallel for multi-query 2024-07-25 03:28:55 -07:00
run.cc Simplify pos handling, auto-increment output arg 2024-08-15 09:25:26 -07:00
tokenizer.cc 7x compile time speedup: shard gemma.cc 2024-07-03 06:35:04 -07:00
tokenizer.h 7x compile time speedup: shard gemma.cc 2024-07-03 06:35:04 -07:00
weights.cc 1.3x prefill, 0.95x decode: matmul replacing last matvec 2024-08-12 03:36:01 -07:00
weights.h Major MatMul update, 1.9-2.3x speedup on Zen4 via bf16 mul 2024-08-16 07:52:20 -07:00