Ray Smith
|
d2806fb1dd
|
Fixed MSan error by correcting the padding of k_cache and v_cache
PiperOrigin-RevId: 879644219
|
2026-03-06 08:11:17 -08:00 |
Ray Smith
|
49cb438b1e
|
Rollback of erroneous rollback.
PiperOrigin-RevId: 877376165
|
2026-03-02 06:50:26 -08:00 |
The gemma.cpp Authors
|
a3d994915f
|
No public description
PiperOrigin-RevId: 877333188
|
2026-03-02 04:32:29 -08:00 |
Ray Smith
|
16c1b29b89
|
Rewrote flash attention to use BF16, transposed k and v, rewrote the task distribution, increased parallelism on decode, and doubled the registers used for the core of flash attention.
PiperOrigin-RevId: 877308306
|
2026-03-02 03:11:01 -08:00 |
Martin Stolle
|
49d420aeaf
|
Add some comments.
PiperOrigin-RevId: 834173319
|
2025-11-19 01:09:15 -08:00 |
Ray Smith
|
8a100c1e8d
|
Added access to flash attention internals to TileFlashAttention4
PiperOrigin-RevId: 826011137
|
2025-10-30 06:50:05 -07:00 |