105 Commits

Author SHA1 Message Date
Jan Wassenberg 992a2cbbc0 De-templatize Activations, add RowVectorBatch class
Also remove most kBatchSize args.

PiperOrigin-RevId: 653185525
2024-07-17 04:38:15 -07:00
Daniel Keysers ff34370aac Simplify FFW by using MatMul_4x4_Batch_Add.
Affects only the griffin model, where prefill TPS improves by about 70%.

PiperOrigin-RevId: 652878176
2024-07-16 09:41:23 -07:00
Kan Wu f519ab6693 Refactor configurables.
PiperOrigin-RevId: 651259154
2024-07-10 21:30:58 -07:00
Daniel Keysers 063bbaa683 Add more comments to attention computation (and some small restructuring).
PiperOrigin-RevId: 650929097
2024-07-10 02:39:07 -07:00
Jan Wassenberg c7c3daa624 7x compile time speedup: shard gemma.cc
Use overloaded functions defined in gemma/instantiations.
Also split out activations.h.

PiperOrigin-RevId: 649053122
2024-07-03 06:35:04 -07:00