gemma.cpp/util
Jan Wassenberg d1638587f0 1.14x batch decode speedup: parallelize RMSNorm ops
Activations was over-parallelized, use single pool instead.
Also improve profiler zone annotations,
pass through worker args (for tracking concurrency), now non-optional.

PiperOrigin-RevId: 788790976
2025-07-30 00:55:45 -07:00
allocator.cc Cleanup: remove unused kCyclic, remove 2 suffix 2025-05-13 01:06:41 -07:00
allocator.h 1.31x batch prefill, 1.24x batch decode speedup: NUMA binding 2025-05-16 07:42:13 -07:00
args.h Move fields, io* and blob* from compression/ into io/ 2025-05-06 11:17:19 -07:00
basics.h Minor cleanup: enable 0,0 Extents2D, add SerializedSpan typedef, include fixes 2025-04-08 03:35:55 -07:00
mat.cc De-singleton ThreadingContext so callers can pass in their own 2025-07-22 02:08:46 -07:00
mat.h De-singleton ThreadingContext so callers can pass in their own 2025-07-22 02:08:46 -07:00
test_util.h Minor cleanup/fixes: 2024-09-09 06:58:09 -07:00
threading.cc Avoid affinity related warnings on Apple. Refs #625 2025-07-03 08:22:31 -07:00
threading.h 1.14x batch decode speedup: parallelize RMSNorm ops 2025-07-30 00:55:45 -07:00
threading_context.cc De-singleton ThreadingContext so callers can pass in their own 2025-07-22 02:08:46 -07:00
threading_context.h De-singleton ThreadingContext so callers can pass in their own 2025-07-22 02:08:46 -07:00
threading_test.cc De-singleton ThreadingContext so callers can pass in their own 2025-07-22 02:08:46 -07:00
topology.cc Avoid affinity related warnings on Apple. Refs #625 2025-07-03 08:22:31 -07:00
topology.h Fix thread name when skipping packages/clusters 2025-06-01 23:50:11 -07:00