gemma.cpp

Biruk Mammo d834c07042 Exposes `GemmaAttention::DotSoftmaxWeightedSum` for experimentation.
Also in this change:
* The computation for a single `q` is factored out and exposed.
* Strided `ConstMat` views into the KV caches are introduced to enable experimentation with various KV cache layouts.
PiperOrigin-RevId: 756339313
2025-05-08 09:19:04 -07:00
bindings	cleanup, new conversation methods, bugfixes	2025-05-07 08:52:44 -07:00
evals	Add MMLU eval to github	2024-05-20 10:20:53 -07:00
instantiations	Eliminated TConfig.	2024-10-17 05:04:22 -07:00
activations.h	Rename-only: remove Allocator2 etc suffixes now that refactoring is complete	2025-05-06 09:12:43 -07:00
common.cc	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
common.h	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
configs.cc	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
configs.h	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
configs_test.cc	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
gemma-inl.h	Exposes `GemmaAttention::DotSoftmaxWeightedSum` for experimentation.	2025-05-08 09:19:04 -07:00
gemma.cc	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
gemma.h	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
gemma_args.h	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
kv_cache.cc	Cleanup: include fixes/comments, fix leak, vector reserve	2025-04-22 12:01:46 -07:00
kv_cache.h	Cleanup: include fixes/comments, fix leak, vector reserve	2025-04-22 12:01:46 -07:00
model_store.cc	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
model_store.h	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
run.cc	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
tensor_info.cc	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
tensor_info.h	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
tensor_info_test.cc	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
tokenizer.cc	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
tokenizer.h	Huge refactor of weight handling and model loading.	2025-05-06 04:44:21 -07:00
weights.cc	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
weights.h	Fix gcc build error and gemma3 crash, thanks @ufownl, fixes #551	2025-05-07 00:59:18 -07:00