gemma.cpp/util
Jan Wassenberg 9efdcfd45c 1.07x batch decode speedup: more BF16 weights and activations
BF16 att_sums and ffw_out
Support BF16 B views without decompression
Support arbitrary types in MulByConstAndAdd, AddFrom

Also update profiler annotations in ops-inl.h

PiperOrigin-RevId: 766995010
2025-06-03 23:30:18 -07:00
allocator.cc Cleanup: remove unused kCyclic, remove 2 suffix 2025-05-13 01:06:41 -07:00
allocator.h 1.31x batch prefill, 1.24x batch decode speedup: NUMA binding 2025-05-16 07:42:13 -07:00
args.h Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
basics.h Minor cleanup: enable 0,0 Extents2D, add SerializedSpan typedef, include fixes 2025-04-08 03:35:55 -07:00
mat.cc Remove backprop/ 2025-05-28 07:01:17 -07:00
mat.h 1.07x batch decode speedup: more BF16 weights and activations 2025-06-03 23:30:18 -07:00
test_util.h Minor cleanup/fixes: 2024-09-09 06:58:09 -07:00
threading.cc Fix thread name when skipping packages/clusters 2025-06-01 23:50:11 -07:00
threading.h Cleanup: remove unused kCyclic, remove 2 suffix 2025-05-13 01:06:41 -07:00
threading_context.cc Rename-only: remove Allocator2 etc suffixes now that refactoring is complete 2025-05-06 09:12:43 -07:00
threading_context.h Rename-only: remove Allocator2 etc suffixes now that refactoring is complete 2025-05-06 09:12:43 -07:00
threading_test.cc Rename-only: remove Allocator2 etc suffixes now that refactoring is complete 2025-05-06 09:12:43 -07:00
topology.cc Fix thread name when skipping packages/clusters 2025-06-01 23:50:11 -07:00
topology.h Fix thread name when skipping packages/clusters 2025-06-01 23:50:11 -07:00