gemma.cpp/util
Jan Wassenberg cf4d7ceb82 1.16x decode speedup: remove last MatVec in Attention
Precompute row pointers.
Remove no longer used MHA support; QStride -> qkv_dim.
Remove RowPtr from MatMul interface, use only MatPtrT.
Require opt-in define for NUQ to speed up builds.
Also fix io.cc on Windows.

PiperOrigin-RevId: 766228108
2025-06-02 09:40:29 -07:00
allocator.cc Cleanup: remove unused kCyclic, remove 2 suffix 2025-05-13 01:06:41 -07:00
allocator.h 1.31x batch prefill, 1.24x batch decode speedup: NUMA binding 2025-05-16 07:42:13 -07:00
args.h Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
basics.h Minor cleanup: enable 0,0 Extents2D, add SerializedSpan typedef, include fixes 2025-04-08 03:35:55 -07:00
mat.cc Remove backprop/ 2025-05-28 07:01:17 -07:00
mat.h 1.16x decode speedup: remove last MatVec in Attention 2025-06-02 09:40:29 -07:00
test_util.h Minor cleanup/fixes: 2024-09-09 06:58:09 -07:00
threading.cc Fix thread name when skipping packages/clusters 2025-06-01 23:50:11 -07:00
threading.h Cleanup: remove unused kCyclic, remove 2 suffix 2025-05-13 01:06:41 -07:00
threading_context.cc Rename-only: remove Allocator2 etc suffixes now that refactoring is complete 2025-05-06 09:12:43 -07:00
threading_context.h Rename-only: remove Allocator2 etc suffixes now that refactoring is complete 2025-05-06 09:12:43 -07:00
threading_test.cc Rename-only: remove Allocator2 etc suffixes now that refactoring is complete 2025-05-06 09:12:43 -07:00
topology.cc Fix thread name when skipping packages/clusters 2025-06-01 23:50:11 -07:00
topology.h Fix thread name when skipping packages/clusters 2025-06-01 23:50:11 -07:00