gemma.cpp/evals
Jan Wassenberg d1638587f0 1.14x batch decode speedup: parallelize RMSNorm ops
Activations was over-parallelized, use single pool instead.
Also improve profiler zone annotations,
pass through worker args (for tracking concurrency), now non-optional.

PiperOrigin-RevId: 788790976
2025-07-30 00:55:45 -07:00
File                  Last commit                                                                Date
benchmark.cc          Rename GetModelConfig->Config                                              2025-07-29 10:18:12 -07:00
benchmark_helper.cc   Rename GetModelConfig->Config                                              2025-07-29 10:18:12 -07:00
benchmark_helper.h    Rename GetModelConfig->Config                                              2025-07-29 10:18:12 -07:00
benchmarks.cc         Introduce QueryResult in GemmaEnv and add a shortcut for WrapAndTokenize.  2024-10-14 04:45:21 -07:00
cross_entropy.cc      1.14x batch decode speedup: parallelize RMSNorm ops                        2025-07-30 00:55:45 -07:00
cross_entropy.h       Move MatMulEnv out of Gemma to enable concurrent calls                     2025-06-23 01:20:09 -07:00
debug_prompt.cc       Move fields, io* and blob* from compression/ into io/                      2025-05-06 11:17:19 -07:00
gemma_batch_bench.cc  batch_bench tweak: more output                                             2025-06-20 06:09:18 -07:00
gemma_test.cc         Rename GetModelConfig->Config                                              2025-07-29 10:18:12 -07:00
prompts.h             Benchmark gemma.cpp with different length inputs.                          2024-10-10 15:59:26 -07:00
run_mmlu.cc           Move MatMulEnv out of Gemma to enable concurrent calls                     2025-06-23 01:20:09 -07:00