gemma.cpp/gemma
Biruk Mammo d834c07042 Exposes `GemmaAttention::DotSoftmaxWeightedSum` for experimentation.
Also in this change:
* The computation for a single `q` is factored out and exposed.
* Strided `ConstMat` views into the KV caches are introduced to enable experimentation with various KV cache layouts.

PiperOrigin-RevId: 756339313
2025-05-08 09:19:04 -07:00
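The single-`q` computation and the strided views named in the commit message above can be illustrated with a small standalone sketch. It is not the gemma.cpp implementation: `StridedView` is a hypothetical stand-in for the `ConstMat` views, and `SingleQueryAttention`, its signature, and the plain scalar loops are assumptions made for illustration only. It shows the three steps the function name suggests (dot products against K, softmax, weighted sum over V) for one query vector.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical read-only strided view over float data (illustration only).
// Row r starts at data + r * stride; only the first `cols` entries belong to it.
struct StridedView {
  const float* data;
  size_t rows;
  size_t cols;
  size_t stride;  // elements between consecutive rows, expected >= cols

  const float* Row(size_t r) const {
    assert(r < rows);
    return data + r * stride;
  }
};

// Attention for a single query vector q:
//   out = sum_t softmax(scale * <q, K[t]>)[t] * V[t]
std::vector<float> SingleQueryAttention(const std::vector<float>& q,
                                        const StridedView& k,
                                        const StridedView& v, float scale) {
  assert(k.rows == v.rows && k.rows > 0);
  assert(k.cols == q.size());

  // 1) Dot: one scaled logit per cached position.
  std::vector<float> w(k.rows);
  float max_logit = std::numeric_limits<float>::lowest();
  for (size_t t = 0; t < k.rows; ++t) {
    const float* k_row = k.Row(t);
    float dot = 0.0f;
    for (size_t i = 0; i < k.cols; ++i) dot += q[i] * k_row[i];
    w[t] = dot * scale;
    if (w[t] > max_logit) max_logit = w[t];
  }

  // 2) Softmax: subtract the maximum logit for numerical stability.
  float sum = 0.0f;
  for (float& x : w) {
    x = std::exp(x - max_logit);
    sum += x;
  }
  for (float& x : w) x /= sum;

  // 3) Weighted sum of the V rows.
  std::vector<float> out(v.cols, 0.0f);
  for (size_t t = 0; t < v.rows; ++t) {
    const float* v_row = v.Row(t);
    for (size_t i = 0; i < v.cols; ++i) out[i] += w[t] * v_row[i];
  }
  return out;
}
```

Because K and V are only reached through `Row()`, the same function works whether the cache stores K and V in separate buffers or interleaves them per token (for example `[k0, k1, v0, v1]` with a stride of 4). Only the view's `data` pointer and `stride` change, which is the kind of layout experiment the commit message refers to.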
bindings cleanup, new conversation methods, bugfixes 2025-05-07 08:52:44 -07:00
evals Add MMLU eval to github 2024-05-20 10:20:53 -07:00
instantiations Eliminated TConfig. 2024-10-17 05:04:22 -07:00
activations.h Rename-only: remove Allocator2 etc suffixes now that refactoring is complete 2025-05-06 09:12:43 -07:00
common.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
common.h Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
configs.cc Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
configs.h Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
configs_test.cc Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
gemma-inl.h Exposes `GemmaAttention::DotSoftmaxWeightedSum` for experimentation. 2025-05-08 09:19:04 -07:00
gemma.cc Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
gemma.h Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
gemma_args.h Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
kv_cache.cc Cleanup: include fixes/comments, fix leak, vector reserve 2025-04-22 12:01:46 -07:00
kv_cache.h Cleanup: include fixes/comments, fix leak, vector reserve 2025-04-22 12:01:46 -07:00
model_store.cc Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
model_store.h Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
run.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
tensor_info.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
tensor_info.h Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
tensor_info_test.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
tokenizer.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
tokenizer.h Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
weights.cc Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
weights.h Fix gcc build error and gemma3 crash, thanks @ufownl, fixes #551 2025-05-07 00:59:18 -07:00