Jan Wassenberg
cf4d7ceb82
1.16x decode speedup: remove last MatVec in Attention
Precompute row pointers.
Remove no longer used MHA support; QStride -> qkv_dim.
Remove RowPtr from MatMul interface, use only MatPtrT.
Require opt-in define for NUQ to speed up builds.
Also fix io.cc on Windows.
PiperOrigin-RevId: 766228108
2025-06-02 09:40:29 -07:00
Jan Wassenberg
cb188d4a0e
Fix RowT issue and improve Griffin (currently still broken)
Use type-safe MatPtrT via dynamic_cast, avoid/remove unsafe RowT
activations: Griffin tensors are now padded
Griffin: add batching support, fix conv1d_cache allocation
weights: bundle to TensorToRead, add kNoPad flag, fix SplitW1
const-correct fix for ForEachTensor
blob_store: move BlobIO2 to .cc and rename BlobIO
PiperOrigin-RevId: 760610094
2025-05-19 07:02:10 -07:00
Jan Wassenberg
c443adee33
3.8x speedup of weights loading via preadv on Linux
Also move BlobReader reading functionality to weights.cc
PiperOrigin-RevId: 759240310
2025-05-15 11:55:15 -07:00
Jan Wassenberg
d538a6d6c6
Cleanup: remove unused kCyclic, remove 2 suffix
Also remove now unused allocator arg and fix warnings (cast, struct/class mismatch)
PiperOrigin-RevId: 758098495
2025-05-13 01:06:41 -07:00
Jan Wassenberg
a0ff98ea60
Entirely remove constexpr on PaddedDirEnd. Refs #551
Apparently GCC 9.4 does not handle HWY_CXX17_CONSTEXPR as we intend.
PiperOrigin-RevId: 755967709
2025-05-07 12:48:19 -07:00
Jan Wassenberg
e9ecb7794d
Fix gcc build error and gemma3 crash, thanks @ufownl, fixes #551
PiperOrigin-RevId: 755729478
2025-05-07 00:59:18 -07:00
Jan Wassenberg
c8d92948f4
Move fields, io* and blob* from compression/ into io/
PiperOrigin-RevId: 755445712
2025-05-06 11:17:19 -07:00