Commit Graph

2 Commits

Author SHA1 Message Date
Jan Wassenberg cf4d7ceb82 1.16x decode speedup: remove last MatVec in Attention
Precompute row pointers.
Remove no longer used MHA support; QStride -> qkv_dim.
Remove RowPtr from MatMul interface, use only MatPtrT.
Require opt-in define for NUQ to speed up builds.
Also fix io.cc on Windows.

PiperOrigin-RevId: 766228108
2025-06-02 09:40:29 -07:00
Jan Wassenberg 627cc04db9 Decouple MatMul from gemma-inl: precompile for all input types
Call MatMulStatic instead of MatMul.

Also fix build error due to Highway's Lanes not being constexpr.

PiperOrigin-RevId: 763777269
2025-05-27 07:08:58 -07:00