gemma.cpp/compression
Jan Wassenberg 343482c7ef 1.02x batch decode speedup: BF16 KV cache
ops-inl.h: Vectorize Rope(), template
Remove unused MulBy, and extra-arg overloads of MulByConst and Softmax
Fix for DecompressAndZeroPad: ensure second vector filled

PiperOrigin-RevId: 772779163
2025-06-17 23:21:59 -07:00
Name                Last commit                                            Date
python              Speed up builds by skipping rarely used targets        2025-06-17 05:44:20 -07:00
BUILD.bazel         Minor: rename compression/shared -> types.h            2025-05-13 06:53:21 -07:00
analyze.h           Minor: rename compression/shared -> types.h            2025-05-13 06:53:21 -07:00
compress-inl.h      1.02x batch decode speedup: BF16 KV cache              2025-06-17 23:21:59 -07:00
compress.cc         Minor cleanup, on-demand NUQ buffer allocation         2025-04-16 10:49:43 -07:00
compress.h          Minor: rename compression/shared -> types.h            2025-05-13 06:53:21 -07:00
compress_test.cc    Speed up builds by skipping rarely used targets        2025-06-17 05:44:20 -07:00
distortion.h        Refactor/cleanup, remove even_odd                      2024-09-04 09:25:13 -07:00
distortion_test.cc  Minor: rename compression/shared -> types.h            2025-05-13 06:53:21 -07:00
nuq-inl.h           Minor: rename compression/shared -> types.h            2025-05-13 06:53:21 -07:00
nuq_test.cc         Speed up builds by skipping rarely used targets        2025-06-17 05:44:20 -07:00
sfp-inl.h           Minor: rename compression/shared -> types.h            2025-05-13 06:53:21 -07:00
sfp_test.cc         Speed up builds by skipping rarely used targets        2025-06-17 05:44:20 -07:00
test_util-inl.h     1.16x decode speedup: remove last MatVec in Attention  2025-06-02 09:40:29 -07:00
types.h             Speed up builds by skipping rarely used targets        2025-06-17 05:44:20 -07:00