gemma.cpp

Jan Wassenberg 8c0a8834c1	2024-09-10 02:22:19 -07:00

Major compression update, arbitrary-len unpack + new Dot

Compression:
* Implement {any packed} x {bf16, f32} 'Load2' and DecompressAndZeroPad
* New compression test for all packed formats, add to GEMMA_TEST_FILES, remove from sfp/nuq_test
* Decompress->DecompressAndZeroPad, use PackedSpan for args with bounds checking
* NUQ: support arbitrary-length enc/dec
* New compression/shared, remove sfp.h and nuq.h
* Move Store2 into Traits and provide Compress2 wrapper
* Remove unused Decompress()-with-pool overload
* Simplify CompressedArrayLen, rename to CompressedArrayElements
* Remove unused DistortionStats b_l1_

Misc:
* Add compensated and Kahan dot, support any length
* Use same Dot function everywhere
* Move exact arithmetic functions into fp_arith
* use FloatPtr and MatPtr typedefs in tests; less stack usage
* Rename args to packed/raw
* Remove Traits::Name, instead TypeName<T>()
* Move kMaxSFP and kClusters/kGroupSize into Sfp/NuqStream

PiperOrigin-RevId: 672868468
dot-inl.h	Major compression update, arbitrary-len unpack + new Dot	2024-09-10 02:22:19 -07:00
dot_test.cc	Major compression update, arbitrary-len unpack + new Dot	2024-09-10 02:22:19 -07:00
fp_arith-inl.h	Refactor/cleanup, remove even_odd	2024-09-04 09:25:13 -07:00
gemma_matvec_test.cc	Major compression update, arbitrary-len unpack + new Dot	2024-09-10 02:22:19 -07:00
matmul-inl.h	Major compression update, arbitrary-len unpack + new Dot	2024-09-10 02:22:19 -07:00
matmul.h	Major MatMul update, 1.9-2.3x speedup on Zen4 via bf16 mul	2024-08-16 07:52:20 -07:00
matmul_test.cc	Major compression update, arbitrary-len unpack + new Dot	2024-09-10 02:22:19 -07:00
matvec-inl.h	Major compression update, arbitrary-len unpack + new Dot	2024-09-10 02:22:19 -07:00
ops-inl.h	Major compression update, arbitrary-len unpack + new Dot	2024-09-10 02:22:19 -07:00
ops_test.cc	Major compression update, arbitrary-len unpack + new Dot	2024-09-10 02:22:19 -07:00