Commit Graph

13 Commits

Author SHA1 Message Date
Jan Wassenberg c29e9752c7 Refactor/cleanup, remove even_odd
* New compression/shared.h, remove sfp.h
* Remove unused DistortionStats b_l1_
* Move exact arithmetic functions into fp_arith
* Remove even_odd optimization for MatVec (mostly unused)
* use BF16 typedef more widely
* Add kMaxSFP constant

PiperOrigin-RevId: 670996386
2024-09-04 09:25:13 -07:00
Jan Wassenberg 4033ed9e78 Avoid duplication of RMSNorm, support all activation/weight types
Add test for RMSNorm
Rename VectorizedRopeAndMulBy -> RopeAndMulBy

Move test_util to util/

PiperOrigin-RevId: 668332927
2024-08-28 01:26:55 -07:00
Jan Wassenberg 2308514e5a Experiment with compensated dot product.
ULP difference vs exact is 0..1, vs 200-5000 for previous.
Runtime overhead is 2.5-4x for f32 input.

PiperOrigin-RevId: 668084019
2024-08-27 12:05:35 -07:00
Jan Wassenberg 1617e1a33d SFP speedup: 1.14x f32, 1.19x bf16 dot = 1.02x prefill
12->9 ops by recognizing the upper/lower bytes are simply shifted.

PiperOrigin-RevId: 659609241
2024-08-05 10:59:13 -07:00
Jan Wassenberg 5c3e5f7038 Remove no longer required stats.h - use Highway version instead
PiperOrigin-RevId: 640440379
2024-06-05 01:37:48 -07:00
Jan Wassenberg c5c9fc300c Enable even/odd for SFP. Refs #166
Disable it for float32 because there is not enough benefit.

PiperOrigin-RevId: 631788326
2024-05-08 07:09:06 -07:00
Jan Wassenberg b5a9ade75f 2x speedup of SFP decode (1.4x overall) on AVX3_DL+.
Thanks @nzmichaelh for suggesting table lookups!

PiperOrigin-RevId: 631337524
2024-05-07 01:46:43 -07:00
Jan Wassenberg a939b5fc9f Update distortion.h to weighted average, add distortion_test.
More thorough checks in sfp_test and nuq_test.
nuq_test: use deterministic input generator.

PiperOrigin-RevId: 625602019
2024-04-17 01:44:19 -07:00
Jan Wassenberg a982ec1287 Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.

PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
Jan Wassenberg 61e031fe98 Towards building tests without GUnit Refs #29
PiperOrigin-RevId: 618032987
2024-03-21 19:28:02 -07:00
Jan Wassenberg 24add61dd9 Fix SFP/NUQ for bf16 rounding in Highway
SFP: Avoid rounding twice, and more robust TestDot.
NUQ: also more robust SNR, minor touchups to header.

PiperOrigin-RevId: 618030096
2024-03-21 19:06:19 -07:00
Jan Wassenberg bb9b023502 Support Bazel builds. Fixes #16
Also fix nuq/sfp-inl: warning, cast, and disable SCALAR

PiperOrigin-RevId: 612704056
2024-03-04 22:07:25 -08:00
Austin Huang e29cd566cf initial commit 2024-02-21 03:31:22 +00:00