Commit Graph

3 Commits

Author SHA1 Message Date
Jan Wassenberg cf4d7ceb82 1.16x decode speedup: remove last MatVec in Attention
Precompute row pointers.
Remove no longer used MHA support; QStride -> qkv_dim.
Remove RowPtr from MatMul interface, use only MatPtrT.
Require opt-in define for NUQ to speed up builds.
Also fix io.cc on Windows.

PiperOrigin-RevId: 766228108
2025-06-02 09:40:29 -07:00
Jan Wassenberg 3890eb5412 Remove backprop/
Also remove MatPtrT::Packed(); use PackedScale1 instead where const, or Row(0).

PiperOrigin-RevId: 764243198
2025-05-28 07:01:17 -07:00
Jan Wassenberg 2038dfd9cc Minor: rename compression/shared -> types.h
PiperOrigin-RevId: 758199851
2025-05-13 06:53:21 -07:00
Renamed from compression/shared.h (Browse further)