Name                 Last commit                                                    Date
bindings             Replace RowVectorBatch with MatStorageT                        2025-05-12 09:16:12 -07:00
evals                Add MMLU eval to github                                        2024-05-20 10:20:53 -07:00
instantiations       Eliminated TConfig.                                            2024-10-17 05:04:22 -07:00
activations.h        1.31x batch prefill, 1.24x batch decode speedup: NUMA binding  2025-05-16 07:42:13 -07:00
common.cc            Huge refactor of weight handling and model loading.            2025-05-06 04:44:21 -07:00
common.h             Huge refactor of weight handling and model loading.            2025-05-06 04:44:21 -07:00
configs.cc           Minor: rename compression/shared -> types.h                    2025-05-13 06:53:21 -07:00
configs.h            Minor: rename compression/shared -> types.h                    2025-05-13 06:53:21 -07:00
configs_test.cc      Minor: rename compression/shared -> types.h                    2025-05-13 06:53:21 -07:00
gemma-inl.h          1.31x batch prefill, 1.24x batch decode speedup: NUMA binding  2025-05-16 07:42:13 -07:00
gemma.cc             3.8x speedup of weights loading via preadv on Linux            2025-05-15 11:55:15 -07:00
gemma.h              Replace RowVectorBatch with MatStorageT                        2025-05-12 09:16:12 -07:00
gemma_args.h         Cleanup: remove unused kCyclic, remove 2 suffix                2025-05-13 01:06:41 -07:00
kv_cache.cc          Replace RowVectorBatch with MatStorageT                        2025-05-12 09:16:12 -07:00
kv_cache.h           Cleanup: remove unused kCyclic, remove 2 suffix                2025-05-13 01:06:41 -07:00
model_store.cc       Fix the wrapping field of the deduced model config             2025-05-13 23:02:03 +08:00
model_store.h        Move fields, io* and blob* from compression/ into io/          2025-05-06 11:17:19 -07:00
run.cc               Minor: rename compression/shared -> types.h                    2025-05-13 06:53:21 -07:00
tensor_info.cc       Minor: rename compression/shared -> types.h                    2025-05-13 06:53:21 -07:00
tensor_info.h        Minor: rename compression/shared -> types.h                    2025-05-13 06:53:21 -07:00
tensor_info_test.cc  Minor: rename compression/shared -> types.h                    2025-05-13 06:53:21 -07:00
tokenizer.cc         Huge refactor of weight handling and model loading.            2025-05-06 04:44:21 -07:00
tokenizer.h          Huge refactor of weight handling and model loading.            2025-05-06 04:44:21 -07:00
weights.cc           1.31x batch prefill, 1.24x batch decode speedup: NUMA binding  2025-05-16 07:42:13 -07:00
weights.h            1.31x batch prefill, 1.24x batch decode speedup: NUMA binding  2025-05-16 07:42:13 -07:00