Jan Wassenberg
|
6773e4517c
|
Split Activations into Griffin/Attention to reduce memory usage for attention-only tests.
PiperOrigin-RevId: 772025282
|
2025-06-16 07:52:59 -07:00 |
Jan Wassenberg
|
c027a45a2e
|
MatPtr-ify KV, shared div_seq_len, --seq_len flag
PiperOrigin-RevId: 770194455
|
2025-06-11 09:49:38 -07:00 |
Jan Wassenberg
|
6ee628ba38
|
Further cleanup: separate MatMulEnv arg
move row_ptrs into MatMulEnv
Consistent arg order: layer, activations, kv_cache, env
PiperOrigin-RevId: 767886386
|
2025-06-05 20:48:32 -07:00 |
Jan Wassenberg
|
3a266c662c
|
Split gemma-inl into separate source files
weights, mat: zero-initialize padding, required since the MatMul "avoid B decompress" optimization.
PiperOrigin-RevId: 767562313
|
2025-06-05 05:36:44 -07:00 |