gemma.cpp/util
Jan Wassenberg e890d46f30 1.31x batch prefill, 1.24x batch decode speedup: NUMA binding
Bind only the weights; binding the MatMul output worsens batch=1 prefill.
Update gemma_batch_bench to use --decode_qbatch.
Fix/remove prefill_activations in gemma-inl.h.

Refactor:
- Use BasePageBytes directly when binding
- Move BindB/C to .cc by de-templatizing
- Remove MatOwners::AllocateFor because it is weights-specific (binding or not)
- Disband MatOwners, replace with vector
PiperOrigin-RevId: 759610477
2025-05-16 07:42:13 -07:00
File | Last commit | Date
allocator.cc | Cleanup: remove unused kCyclic, remove 2 suffix | 2025-05-13 01:06:41 -07:00
allocator.h | 1.31x batch prefill, 1.24x batch decode speedup: NUMA binding | 2025-05-16 07:42:13 -07:00
args.h | Move fields, io* and blob* from compression/ into io/ | 2025-05-06 11:17:19 -07:00
basics.h | Minor cleanup: enable 0,0 Extents2D, add SerializedSpan typedef, include fixes | 2025-04-08 03:35:55 -07:00
mat.cc | 1.31x batch prefill, 1.24x batch decode speedup: NUMA binding | 2025-05-16 07:42:13 -07:00
mat.h | 1.31x batch prefill, 1.24x batch decode speedup: NUMA binding | 2025-05-16 07:42:13 -07:00
test_util.h | Minor cleanup/fixes: | 2024-09-09 06:58:09 -07:00
threading.cc | Rename-only: remove Allocator2 etc suffixes now that refactoring is complete | 2025-05-06 09:12:43 -07:00
threading.h | Cleanup: remove unused kCyclic, remove 2 suffix | 2025-05-13 01:06:41 -07:00
threading_context.cc | Rename-only: remove Allocator2 etc suffixes now that refactoring is complete | 2025-05-06 09:12:43 -07:00
threading_context.h | Rename-only: remove Allocator2 etc suffixes now that refactoring is complete | 2025-05-06 09:12:43 -07:00
threading_test.cc | Rename-only: remove Allocator2 etc suffixes now that refactoring is complete | 2025-05-06 09:12:43 -07:00
topology.cc | Add new singleton Allocator2 instead of monostate | 2025-04-08 09:00:59 -07:00
topology.h | Refactor Gemma ctor and improve pool NUMA support | 2025-03-14 10:19:00 -07:00