gemma.cpp

Jan Wassenberg	f74d496879	2024-11-27 01:12:00 -08:00

Threading/infra improvements.
* Add ParallelizeRange helpers and partitioning helpers
Refactor Pinning class, store original affinity (required to construct another NestedPools after pinning happened)

Compress:
* prevent Compress printing stats in tests
* zero-pad tensors

Matmul:
* add matmul_unit_test (TODO) and bench_matmul
* matmul_test: change norm to row vectors (that is what is added) and include bf16 rounding error
* Prepare for L2/L3 retrieval

PiperOrigin-RevId: 700603811

benchmark.cc	Eliminated TConfig.	2024-10-17 05:04:22 -07:00
benchmark_helper.cc	Threading/infra improvements.	2024-11-27 01:12:00 -08:00
benchmark_helper.h	Use NestedPools, add NUMA infra	2024-10-18 08:11:18 -07:00
benchmarks.cc	Introduce QueryResult in GemmaEnv and add a shortcut for WrapAndTokenize.	2024-10-14 04:45:21 -07:00
cross_entropy.cc	Make top_k a runtime argument (instead of a model argument).	2024-11-13 09:48:59 -08:00
cross_entropy.h	Introduce QueryResult in GemmaEnv and add a shortcut for WrapAndTokenize.	2024-10-14 04:45:21 -07:00
debug_prompt.cc	Introduce QueryResult in GemmaEnv and add a shortcut for WrapAndTokenize.	2024-10-14 04:45:21 -07:00
gemma_batch_bench.cc	Add a simple benchmark for batching.	2024-11-21 10:59:49 -08:00
gemma_test.cc	Make top_k a runtime argument (instead of a model argument).	2024-11-13 09:48:59 -08:00
prompts.h	Benchmark gemma.cpp with different length inputs.	2024-10-10 15:59:26 -07:00
run_mmlu.cc	Make top_k a runtime argument (instead of a model argument).	2024-11-13 09:48:59 -08:00