Jeff Bolz
a0374a67e2
vulkan: Handle updated FA dim2/3 definition (#14518)
* vulkan: Handle updated FA dim2/3 definition
Pack mask boolean and n_head_log2 into a single dword to keep the push
constant block under the 128B limit.
* handle null mask for gqa
* allow gqa with dim3>1
2025-07-05 09:26:04 +02:00
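The packing mentioned in a0374a67e2 (mask boolean and n_head_log2 sharing one dword) can be illustrated with a minimal sketch. The bit layout here is an assumption for illustration only: the flag is placed in bit 16 and n_head_log2 in the low 16 bits. The function and constant names are hypothetical and are not taken from the commit. The 128B figure matches the minimum maxPushConstantsSize that Vulkan guarantees, so saving one dword can matter when the block is nearly full.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout (an assumption, not taken from the commit):
//   bit 16     : 1 if a mask tensor is present, 0 otherwise
//   bits 0..15 : n_head_log2
constexpr uint32_t MASK_FLAG_SHIFT  = 16;
constexpr uint32_t N_HEAD_LOG2_BITS = 0xFFFFu;

// Pack both values into a single 32-bit push-constant field.
constexpr uint32_t pack_mask_n_head_log2(bool has_mask, uint32_t n_head_log2) {
    return (static_cast<uint32_t>(has_mask) << MASK_FLAG_SHIFT) |
           (n_head_log2 & N_HEAD_LOG2_BITS);
}

// Recover the mask flag from the packed dword.
constexpr bool unpack_has_mask(uint32_t packed) {
    return ((packed >> MASK_FLAG_SHIFT) & 1u) != 0;
}

// Recover n_head_log2 from the packed dword.
constexpr uint32_t unpack_n_head_log2(uint32_t packed) {
    return packed & N_HEAD_LOG2_BITS;
}

// Two separate uint32_t fields would cost 8 bytes; the packed form costs 4.
static_assert(sizeof(uint32_t) == 4, "a dword is 4 bytes");
```

On the shader side, the same shift and mask would recover the two values from the single field; a null mask is then represented simply by the flag bit being 0.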
Jeff Bolz
2b72bedec1
vulkan: support mixed/deepseekR1 FA head sizes (#14509)
* vulkan: better parameterize FA by head sizes
* vulkan: support mixed/deepseekR1 FA head sizes
2025-07-03 20:21:14 +02:00
Jeff Bolz
8875523eb3
vulkan: support softmax/FA batch and broadcast (#14449)
2025-07-02 15:48:33 +03:00
Jeff Bolz
2f5a4e1e09
vulkan: move common FA code to flash_attn_base.comp (#13556)
* vulkan: move common FA code to flash_attn_base.comp
* vulkan: move common FA index/stride setup code to flash_attn_base.comp
* build fix
2025-05-17 09:14:55 +02:00