gemma.cpp

Jan Wassenberg e890d46f30 1.31x batch prefill, 1.24x batch decode speedup: NUMA binding	2025-05-16 07:42:13 -07:00

Only the weights; binding MatMul output worsens batch=1 prefill.
Update gemma_batch_bench to use --decode_qbatch.
Fix/remove prefill_activations in gemma-inl.h.
Refactor:
- use BasePageBytes directly when binding
- Move BindB/C to .cc by de-templatizing
- Remove MatOwners::AllocateFor because it is weights-specific (binding or not)
- Disband MatOwners, replace with vector
PiperOrigin-RevId: 759610477

allocator.cc	Cleanup: remove unused kCyclic, remove 2 suffix	2025-05-13 01:06:41 -07:00
allocator.h	1.31x batch prefill, 1.24x batch decode speedup: NUMA binding	2025-05-16 07:42:13 -07:00
args.h	Move fields, io* and blob* from compression/ into io/	2025-05-06 11:17:19 -07:00
basics.h	Minor cleanup: enable 0,0 Extents2D, add SerializedSpan typedef, include fixes	2025-04-08 03:35:55 -07:00
mat.cc	1.31x batch prefill, 1.24x batch decode speedup: NUMA binding	2025-05-16 07:42:13 -07:00
mat.h	1.31x batch prefill, 1.24x batch decode speedup: NUMA binding	2025-05-16 07:42:13 -07:00
test_util.h	Minor cleanup/fixes:	2024-09-09 06:58:09 -07:00
threading.cc	Rename-only: remove Allocator2 etc suffixes now that refactoring is complete	2025-05-06 09:12:43 -07:00
threading.h	Cleanup: remove unused kCyclic, remove 2 suffix	2025-05-13 01:06:41 -07:00
threading_context.cc	Rename-only: remove Allocator2 etc suffixes now that refactoring is complete	2025-05-06 09:12:43 -07:00
threading_context.h	Rename-only: remove Allocator2 etc suffixes now that refactoring is complete	2025-05-06 09:12:43 -07:00
threading_test.cc	Rename-only: remove Allocator2 etc suffixes now that refactoring is complete	2025-05-06 09:12:43 -07:00
topology.cc	Add new singleton Allocator2 instead of monostate	2025-04-08 09:00:59 -07:00
topology.h	Refactor Gemma ctor and improve pool NUMA support	2025-03-14 10:19:00 -07:00