gemma.cpp

Jan Wassenberg d1638587f0	2025-07-30 00:55:45 -07:00
1.14x batch decode speedup: parallelize RMSNorm ops
Activations was over-parallelized, use single pool instead. Also improve profiler zone annotations, pass through worker args (for tracking concurrency), now non-optional.
PiperOrigin-RevId: 788790976
benchmark.cc	Rename GetModelConfig->Config	2025-07-29 10:18:12 -07:00
benchmark_helper.cc	Rename GetModelConfig->Config	2025-07-29 10:18:12 -07:00
benchmark_helper.h	Rename GetModelConfig->Config	2025-07-29 10:18:12 -07:00
benchmarks.cc	Introduce QueryResult in GemmaEnv and add a shortcut for WrapAndTokenize.	2024-10-14 04:45:21 -07:00
cross_entropy.cc	1.14x batch decode speedup: parallelize RMSNorm ops	2025-07-30 00:55:45 -07:00
cross_entropy.h	Move MatMulEnv out of Gemma to enable concurrent calls	2025-06-23 01:20:09 -07:00
debug_prompt.cc	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
gemma_batch_bench.cc	batch_bench tweak: more output	2025-06-20 06:09:18 -07:00
gemma_test.cc	Rename GetModelConfig->Config	2025-07-29 10:18:12 -07:00
prompts.h	Benchmark gemma.cpp with different length inputs.	2024-10-10 15:59:26 -07:00
run_mmlu.cc	Move MatMulEnv out of Gemma to enable concurrent calls	2025-06-23 01:20:09 -07:00