gemma.cpp

Jan Wassenberg	aaf51898b6	2024-07-25 03:28:55 -07:00

Major revamp #2 of Prefill: fix token order, parallel for multi-query

- Allocate only the required KV caches and activation batch size
- Add flags for batch sizes
- Const-correct interface: Span of const int.
- Also clean up the KVCache arg to a span.
- Move kPrefillBatchSize into RuntimeConfig and remove related global constants.

PiperOrigin-RevId: 655893197

benchmark.cc	Major revamp #2 of Prefill: fix token order, parallel for multi-query	2024-07-25 03:28:55 -07:00
benchmark_helper.cc	Major revamp #2 of Prefill: fix token order, parallel for multi-query	2024-07-25 03:28:55 -07:00
benchmark_helper.h	Major revamp #2 of Prefill: fix token order, parallel for multi-query	2024-07-25 03:28:55 -07:00
benchmarks.cc	Move benchmark_helper to evals/, weights_raw to compression/.	2024-07-08 01:13:23 -07:00
cross_entropy.cc	Cleanup: add ModelInfo struct, remove gcpp::	2024-07-02 07:11:15 -07:00
cross_entropy.h	Declutter gemma/ directory, move binaries to evals/ and util/.	2024-07-01 09:51:04 -07:00
debug_prompt.cc	Move benchmark_helper to evals/, weights_raw to compression/.	2024-07-08 01:13:23 -07:00
gemma_test.cc	Major revamp #2 of Prefill: fix token order, parallel for multi-query	2024-07-25 03:28:55 -07:00
run_mmlu.cc	Move benchmark_helper to evals/, weights_raw to compression/.	2024-07-08 01:13:23 -07:00