Jan Wassenberg
|
992a2cbbc0
|
De-templatize Activations, add RowVectorBatch class
Also remove most kBatchSize args.
PiperOrigin-RevId: 653185525
|
2024-07-17 04:38:15 -07:00 |
Daniel Keysers
|
ff34370aac
|
Simplify FFW by using MatMul_4x4_Batch_Add.
Affects only the Griffin model, where prefill TPS improves by about 70%.
PiperOrigin-RevId: 652878176
|
2024-07-16 09:41:23 -07:00 |
Kan Wu
|
f519ab6693
|
Refactor configurables.
PiperOrigin-RevId: 651259154
|
2024-07-10 21:30:58 -07:00 |
Daniel Keysers
|
063bbaa683
|
Add more comments to attention computation (and some small restructuring).
PiperOrigin-RevId: 650929097
|
2024-07-10 02:39:07 -07:00 |
Jan Wassenberg
|
c7c3daa624
|
7x compile time speedup: shard gemma.cc
Use overloaded functions defined in gemma/instantiations.
Also split out activations.h.
PiperOrigin-RevId: 649053122
|
2024-07-03 06:35:04 -07:00 |