llama.cpp/ggml/src/ggml-vulkan/vulkan-shaders
Jeff Bolz 3976dfbe00 vulkan: support im2col_3d (#15795) 2025-09-07 13:50:26 -05:00
CMakeLists.txt vulkan: Fix GGML_VULKAN_SHADER_DEBUG_INFO (#14427) 2025-06-27 22:35:30 -05:00
acc.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
add.comp vulkan: optimize rms_norm, and allow the work to spread across multiple SMs (#15281) 2025-08-23 13:16:17 -05:00
add_id.comp llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
argmax.comp vulkan : fix out-of-bounds access in argmax kernel (#15342) 2025-08-15 16:16:36 +02:00
argsort.comp vulkan: Optimize argsort (#15354) 2025-08-17 10:41:45 +02:00
clamp.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
concat.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
contig_copy.comp vulkan: Add bfloat16 support (#12554) 2025-05-01 20:49:39 +02:00
conv2d_dw.comp sync : ggml (#13268) 2025-05-02 20:54:30 +03:00
conv2d_mm.comp vulkan: Use coopmat2 for conv2d (#14982) 2025-08-03 14:23:57 +02:00
conv_transpose_1d.comp ggml-vulkan: adds support for op CONV_TRANSPOSE_1D (#13813) 2025-06-04 22:02:00 +02:00
copy.comp vulkan: Add bfloat16 support (#12554) 2025-05-01 20:49:39 +02:00
copy_from_quant.comp llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
copy_to_quant.comp vulkan: add RTE variants for glu/add/sub/mul/div (#14653) 2025-07-15 21:32:11 +02:00
cos.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
count_equal.comp vulkan: implement several ops relevant for ggml_opt (#11769) 2025-02-17 07:55:57 +01:00
dequant_f32.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_funcs.comp llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
dequant_funcs_cm2.comp llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
dequant_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_iq1_m.comp vulkan: fix warnings (#13626) 2025-05-20 21:35:16 +00:00
dequant_iq1_s.comp vulkan: initial support for IQ1_S and IQ1_M quantizations (#11528) 2025-02-15 09:01:40 +01:00
dequant_iq2_s.comp vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360) 2025-01-29 18:29:39 +01:00
dequant_iq2_xs.comp vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360) 2025-01-29 18:29:39 +01:00
dequant_iq2_xxs.comp vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360) 2025-01-29 18:29:39 +01:00
dequant_iq3_s.comp vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360) 2025-01-29 18:29:39 +01:00
dequant_iq3_xxs.comp vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360) 2025-01-29 18:29:39 +01:00
dequant_iq4_nl.comp vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360) 2025-01-29 18:29:39 +01:00
dequant_iq4_xs.comp vulkan: initial support for IQ4_XS quantization (#11501) 2025-02-06 07:09:59 +01:00
dequant_mxfp4.comp llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
dequant_q2_k.comp vulkan: fix noncontig check for mat_mul_id splitting (#14683) 2025-07-15 21:51:09 +02:00
dequant_q3_k.comp vulkan: fix noncontig check for mat_mul_id splitting (#14683) 2025-07-15 21:51:09 +02:00
dequant_q4_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_1.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_k.comp vulkan: fix noncontig check for mat_mul_id splitting (#14683) 2025-07-15 21:51:09 +02:00
dequant_q5_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_1.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_k.comp vulkan: fix noncontig check for mat_mul_id splitting (#14683) 2025-07-15 21:51:09 +02:00
dequant_q6_k.comp vulkan: fix noncontig check for mat_mul_id splitting (#14683) 2025-07-15 21:51:09 +02:00
dequant_q8_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
diag_mask_inf.comp vulkan: fix diag_mask_inf (#11323) 2025-01-23 08:01:17 +01:00
div.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
exp.comp vulkan: add exp operation (#15456) 2025-08-21 17:00:16 +02:00
flash_attn.comp vulkan: clamp matmul and FA results to the max finite value (#15652) 2025-08-31 08:27:57 +02:00
flash_attn_base.comp vulkan: Support FA with any multiple of 8 head sizes (#15537) 2025-08-24 11:24:25 +02:00
flash_attn_cm1.comp vulkan: clamp matmul and FA results to the max finite value (#15652) 2025-08-31 08:27:57 +02:00
flash_attn_cm2.comp vulkan: clamp matmul and FA results to the max finite value (#15652) 2025-08-31 08:27:57 +02:00
flash_attn_split_k_reduce.comp vulkan: clamp matmul and FA results to the max finite value (#15652) 2025-08-31 08:27:57 +02:00
geglu.comp ggml : implement REGLU/GEGLU/SWIGLU ops (#14158) 2025-06-29 11:04:10 +02:00
geglu_erf.comp ggml : implement GEGLU_ERF and GEGLU_QUICK ops (#14445) 2025-07-03 23:07:22 +02:00
geglu_quick.comp ggml : implement GEGLU_ERF and GEGLU_QUICK ops (#14445) 2025-07-03 23:07:22 +02:00
gelu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
gelu_erf.comp add GELU_ERF (#14455) 2025-07-01 10:14:21 +02:00
gelu_quick.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_binary_head.comp vulkan: fuse adds (#15252) 2025-08-16 11:48:22 -05:00
generic_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_unary_head.comp vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 2025-01-16 22:47:10 +01:00
get_rows.comp vulkan: handle large sizes for get_rows (#15686) 2025-08-31 10:13:27 +02:00
get_rows_quant.comp vulkan: handle large sizes for get_rows (#15686) 2025-08-31 10:13:27 +02:00
glu_head.comp llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
glu_main.comp ggml : implement REGLU/GEGLU/SWIGLU ops (#14158) 2025-06-29 11:04:10 +02:00
group_norm.comp vulkan: fix group_norm (#10496) 2024-11-26 16:45:05 +01:00
hardsigmoid.comp ggml vulkan: add hardsigmoid and hardswish operations (#15762) 2025-09-03 20:22:55 +02:00
hardswish.comp ggml vulkan: add hardsigmoid and hardswish operations (#15762) 2025-09-03 20:22:55 +02:00
im2col.comp vulkan/cuda: Fix im2col when KW!=KH (#14789) 2025-07-21 13:35:40 +02:00
im2col_3d.comp vulkan: support im2col_3d (#15795) 2025-09-07 13:50:26 -05:00
l2_norm.comp llama: Add support for RWKV v7 architecture (#12412) 2025-03-18 07:27:50 +08:00
leaky_relu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
mul_mat_split_k_reduce.comp vulkan: optimize and reenable split_k (#10637) 2024-12-03 20:29:54 +01:00
mul_mat_vec.comp vulkan: Add bfloat16 support (#12554) 2025-05-01 20:49:39 +02:00
mul_mat_vec_base.comp Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903) 2025-09-01 16:19:07 +02:00
mul_mat_vec_iq1_m.comp vulkan: initial support for IQ1_S and IQ1_M quantizations (#11528) 2025-02-15 09:01:40 +01:00
mul_mat_vec_iq1_s.comp vulkan: initial support for IQ1_S and IQ1_M quantizations (#11528) 2025-02-15 09:01:40 +01:00
mul_mat_vec_iq2_s.comp vulkan: workaround for AMD Windows driver 16 bit unpack8 bug (#12472) 2025-03-21 20:27:47 +01:00
mul_mat_vec_iq2_xs.comp vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595) 2025-02-28 09:42:52 +01:00
mul_mat_vec_iq2_xxs.comp vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595) 2025-02-28 09:42:52 +01:00
mul_mat_vec_iq3_s.comp vulkan: workaround for AMD Windows driver 16 bit unpack8 bug (#12472) 2025-03-21 20:27:47 +01:00
mul_mat_vec_iq3_xxs.comp vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595) 2025-02-28 09:42:52 +01:00
mul_mat_vec_nc.comp vulkan: Support ne[3]>1 in noncontig matrix-vector multiply (#15015) 2025-08-02 10:48:30 +02:00
mul_mat_vec_p021.comp vulkan: Optimize mul_mat_vec p021 and nc shaders (#12505) 2025-03-22 09:40:11 +01:00
mul_mat_vec_q2_k.comp mat vec double buffer (#12188) 2025-03-10 19:28:11 +00:00
mul_mat_vec_q3_k.comp mat vec double buffer (#12188) 2025-03-10 19:28:11 +00:00
mul_mat_vec_q4_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q5_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q6_k.comp mat vec double buffer (#12188) 2025-03-10 19:28:11 +00:00
mul_mat_vecq.comp Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903) 2025-09-01 16:19:07 +02:00
mul_mm.comp vulkan: Use larger loads in scalar/coopmat1 matmul (#15729) 2025-09-07 18:53:07 +02:00
mul_mm_cm2.comp vulkan: add missing clamps in new mul_mat_id paths (#15702) 2025-09-01 21:01:10 +02:00
mul_mmq.comp Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903) 2025-09-01 16:19:07 +02:00
mul_mmq_funcs.comp Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903) 2025-09-01 16:19:07 +02:00
multi_add.comp vulkan: workaround MoltenVK compile failure in multi_add (#15506) 2025-08-24 10:48:21 +02:00
norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
opt_step_adamw.comp vulkan: implement several ops relevant for ggml_opt (#11769) 2025-02-17 07:55:57 +01:00
opt_step_sgd.comp finetune: SGD optimizer, more CLI args (#13873) 2025-08-14 12:03:57 +02:00
pad.comp vulkan: Support pad_ext (#15794) 2025-09-07 19:00:49 +02:00
pool2d.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
quantize_q8_1.comp Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903) 2025-09-01 16:19:07 +02:00
reglu.comp ggml : implement REGLU/GEGLU/SWIGLU ops (#14158) 2025-06-29 11:04:10 +02:00
relu.comp vulkan: Additional type support for unary, binary, and copy (#13266) 2025-05-04 07:17:16 +02:00
repeat.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
repeat_back.comp vulkan: implement several ops relevant for ggml_opt (#11769) 2025-02-17 07:55:57 +01:00
rms_norm.comp vulkan: optimize rms_norm, and allow the work to spread across multiple SMs (#15281) 2025-08-23 13:16:17 -05:00
rms_norm_back.comp vulkan: implement more backpropagation operators (#11914) 2025-02-25 12:04:45 +01:00
rms_norm_partials.comp vulkan: optimize rms_norm, and allow the work to spread across multiple SMs (#15281) 2025-08-23 13:16:17 -05:00
roll.comp vulkan : implement ggml_roll (ggml/1290) 2025-07-12 14:25:44 +03:00
rope_head.comp vulkan: add RTE variants for glu/add/sub/mul/div (#14653) 2025-07-15 21:32:11 +02:00
rope_multi.comp vulkan : fix rope with partial rotation and non-cont src (#14582) 2025-07-08 15:21:21 +02:00
rope_neox.comp vulkan : fix rope with partial rotation and non-cont src (#14582) 2025-07-08 15:21:21 +02:00
rope_norm.comp vulkan : fix rope with partial rotation and non-cont src (#14582) 2025-07-08 15:21:21 +02:00
rope_vision.comp vulkan: support multi/vision rope, and noncontiguous rope (#11902) 2025-02-16 08:52:23 +01:00
rte.comp vulkan: add RTE variants for glu/add/sub/mul/div (#14653) 2025-07-15 21:32:11 +02:00
scale.comp ggml : add ggml_scale_bias (#14417) 2025-07-09 18:16:12 +02:00
sigmoid.comp vulkan: Additional type support for unary, binary, and copy (#13266) 2025-05-04 07:17:16 +02:00
silu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
silu_back.comp vulkan: implement more backpropagation operators (#11914) 2025-02-25 12:04:45 +01:00
sin.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
soft_max.comp llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
soft_max_back.comp vulkan: implement more backpropagation operators (#11914) 2025-02-25 12:04:45 +01:00
sqrt.comp vulkan: support sqrt (#15370) 2025-08-17 16:03:09 +02:00
square.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
sub.comp vulkan: implement several ops relevant for ggml_opt (#11769) 2025-02-17 07:55:57 +01:00
sum_rows.comp vulkan : support ggml_mean (#15393) 2025-08-23 08:35:21 +02:00
swiglu.comp ggml : implement REGLU/GEGLU/SWIGLU ops (#14158) 2025-06-29 11:04:10 +02:00
swiglu_oai.comp llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
tanh.comp vulkan: Additional type support for unary, binary, and copy (#13266) 2025-05-04 07:17:16 +02:00
test_bfloat16_support.comp vulkan: Add bfloat16 support (#12554) 2025-05-01 20:49:39 +02:00
test_coopmat2_support.comp vulkan: compile a test shader in cmake to check for coopmat2 support (#10713) 2024-12-08 09:05:55 +01:00
test_coopmat_support.comp Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (#11117) 2025-01-08 09:18:13 +01:00
test_integer_dot_support.comp Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135) 2025-03-31 14:37:01 +02:00
timestep_embedding.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
types.comp vulkan: Use larger loads in scalar/coopmat1 matmul (#15729) 2025-09-07 18:53:07 +02:00
upscale.comp vulkan : remove unused vars (#0) 2025-07-12 14:25:44 +03:00
utils.comp vulkan: fuse adds (#15252) 2025-08-16 11:48:22 -05:00
vulkan-shaders-gen.cpp vulkan: support im2col_3d (#15795) 2025-09-07 13:50:26 -05:00
wkv6.comp rwkv6: add wkv6 support for Vulkan backend (#10829) 2024-12-16 22:00:46 +01:00
wkv7.comp llama: Add support for RWKV v7 architecture (#12412) 2025-03-18 07:27:50 +08:00