gemma.cpp/evals
Jan Wassenberg ba6131311a Fix gemma_batch_bench for flash attention
q_T rows do not change.
Also repeat prefill to reflect perf after autotuning.

PiperOrigin-RevId: 805319377
2025-09-10 05:32:34 -07:00
benchmark.cc          Rename GetModelConfig->Config  2025-07-29 10:18:12 -07:00
benchmark_helper.cc   Cleanup: split CacheInfo from Allocator, MatMul helper functions  2025-09-08 02:23:58 -07:00
benchmark_helper.h    Replace mt19937 with new generator to enable parallel sampling  2025-09-04 23:49:10 -07:00
benchmarks.cc         Introduce QueryResult in GemmaEnv and add a shortcut for WrapAndTokenize.  2024-10-14 04:45:21 -07:00
cross_entropy.cc      1.15x speedup: parallel sampling, enabled by new RNG  2025-09-05 07:24:02 -07:00
cross_entropy.h       Move MatMulEnv out of Gemma to enable concurrent calls  2025-06-23 01:20:09 -07:00
debug_prompt.cc       Move fields, io* and blob* from compression/ into io/  2025-05-06 11:17:19 -07:00
gemma_batch_bench.cc  Fix gemma_batch_bench for flash attention  2025-09-10 05:32:34 -07:00
gemma_test.cc         Remove Griffin support  2025-09-05 02:35:40 -07:00
prompts.h             Benchmark gemma.cpp with different length inputs.  2024-10-10 15:59:26 -07:00
run_mmlu.cc           Replace mt19937 with new generator to enable parallel sampling  2025-09-04 23:49:10 -07:00