gemma.cpp/gemma
Jan Wassenberg 02ce1e344f Use NestedPools, add NUMA infra
Improve threading.h and fix thread counts for single-package/cluster systems.
Temporarily forces use of a single socket. Prefill 29.28 tps, decode 6.92 tps.

Also fix benchmarks.cc build, update tensor allocator to Allocator

PiperOrigin-RevId: 687307167
2024-10-18 08:11:18 -07:00
Name             Last commit                                                 Date
evals            Add MMLU eval to github                                     2024-05-20 10:20:53 -07:00
instantiations   Eliminated TConfig.                                         2024-10-17 05:04:22 -07:00
activations.h    Use NestedPools, add NUMA infra                             2024-10-18 08:11:18 -07:00
common.cc        Eliminated TConfig.                                         2024-10-17 05:04:22 -07:00
common.h         Fix PaliGemma's GenerateImageTokensT().                     2024-10-18 01:34:13 -07:00
configs.cc       Fix PaliGemma's GenerateImageTokensT().                     2024-10-18 01:34:13 -07:00
configs.h        Fix PaliGemma's GenerateImageTokensT().                     2024-10-18 01:34:13 -07:00
configs_test.cc  Eliminated TConfig.                                         2024-10-17 05:04:22 -07:00
gemma-inl.h      Use NestedPools, add NUMA infra                             2024-10-18 08:11:18 -07:00
gemma.cc         Use NestedPools, add NUMA infra                             2024-10-18 08:11:18 -07:00
gemma.h          Use NestedPools, add NUMA infra                             2024-10-18 08:11:18 -07:00
kv_cache.cc      Eliminated TConfig.                                         2024-10-17 05:04:22 -07:00
kv_cache.h       Eliminated TConfig.                                         2024-10-17 05:04:22 -07:00
run.cc           Use NestedPools, add NUMA infra                             2024-10-18 08:11:18 -07:00
tokenizer.cc     Add support for PaliGemma Vision-LM (224x224) to gemma.cpp  2024-09-23 10:09:38 -07:00
tokenizer.h      7x compile time speedup: shard gemma.cc                     2024-07-03 06:35:04 -07:00
weights.cc       Use NestedPools, add NUMA infra                             2024-10-18 08:11:18 -07:00
weights.h        Fix PaliGemma's GenerateImageTokensT().                     2024-10-18 01:34:13 -07:00