gemma.cpp/evals
Jan Wassenberg aaf51898b6 Major revamp #2 of Prefill: fix token order, parallel for multi-query
- Allocate only the required KV caches and activation batch size
- Add flags for batch sizes
- Const-correct interface: Span of const int.
- Also clean up the KVCache arg to a span.
- Move kPrefillBatchSize into RuntimeConfig and remove related global constants.

PiperOrigin-RevId: 655893197
2024-07-25 03:28:55 -07:00
benchmark.cc         Major revamp #2 of Prefill: fix token order, parallel for multi-query  2024-07-25 03:28:55 -07:00
benchmark_helper.cc  Major revamp #2 of Prefill: fix token order, parallel for multi-query  2024-07-25 03:28:55 -07:00
benchmark_helper.h   Major revamp #2 of Prefill: fix token order, parallel for multi-query  2024-07-25 03:28:55 -07:00
benchmarks.cc        Move benchmark_helper to evals/, weights_raw to compression/.          2024-07-08 01:13:23 -07:00
cross_entropy.cc     Cleanup: add ModelInfo struct, remove gcpp::                           2024-07-02 07:11:15 -07:00
cross_entropy.h      Declutter gemma/ directory, move binaries to evals/ and util/.         2024-07-01 09:51:04 -07:00
debug_prompt.cc      Move benchmark_helper to evals/, weights_raw to compression/.          2024-07-08 01:13:23 -07:00
gemma_test.cc        Major revamp #2 of Prefill: fix token order, parallel for multi-query  2024-07-25 03:28:55 -07:00
run_mmlu.cc          Move benchmark_helper to evals/, weights_raw to compression/.          2024-07-08 01:13:23 -07:00