llama.cpp/ggml/src/ggml-webgpu/wgsl-shaders
Reese Levine c1258830b2
ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687)
* Implement l2_norm, set, tri

* Add DIAG/SOLVE_TRI

* Add SSM_CONV

* Better get_rows and gated_delta_net to support qwen3.5

* Clean up, update ops.md

* Fix binding_index type for wasm

* Fix read write annotations

* cleanups
2026-03-19 08:45:28 -07:00
argmax.wgsl ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
argsort.wgsl ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
argsort_merge.wgsl ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
binary.wgsl ggml-webgpu: Support non-contiguous `src0` and overlapping `src0/src1` in binary ops (#19850) 2026-03-02 07:59:53 -08:00
common_decls.tmpl ggml webgpu: shader library organization (#19530) 2026-02-18 07:51:02 -07:00
concat.wgsl Add concat op to webgpu. (#20068) 2026-03-04 11:19:00 -08:00
cpy.tmpl.wgsl ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
cumsum.wgsl ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
embed_wgsl.py ggml webgpu: shader library organization (#19530) 2026-02-18 07:51:02 -07:00
flash_attn.wgsl ggml-webgpu: improve flastAttention performance by software pipelining (#19151) 2026-01-29 14:05:30 -08:00
gated_delta_net.wgsl ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687) 2026-03-19 08:45:28 -07:00
get_rows.wgsl ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687) 2026-03-19 08:45:28 -07:00
glu.tmpl.wgsl ggml webgpu: support for rope,div,sub,glu,scale,cont operators (#16187) 2025-09-30 09:57:51 -07:00
memset.wgsl ggml WebGPU: add support for quantization types (#15440) 2025-08-22 11:28:03 -07:00
mul_mat.wgsl ggml webgpu: fix workgroup dispatch limit for large batch sizes (#19965) 2026-03-02 19:35:11 -08:00
mul_mat_decls.tmpl ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling (#20173) 2026-03-10 09:14:27 -07:00
mul_mat_reg_tile.wgsl ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling (#20173) 2026-03-10 09:14:27 -07:00
mul_mat_subgroup_matrix.wgsl ggml webgpu: fix workgroup dispatch limit for large batch sizes (#19965) 2026-03-02 19:35:11 -08:00
mul_mat_vec.wgsl ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling (#20173) 2026-03-10 09:14:27 -07:00
pad.wgsl ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
repeat.wgsl ggml-webgpu: Add supports for `GGML_OP_REPEAT` (#20230) 2026-03-11 14:40:36 -07:00
rope.tmpl.wgsl model: add support for qwen3vl series (#16780) 2025-10-30 16:19:14 +01:00
row_norm.wgsl ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687) 2026-03-19 08:45:28 -07:00
scale.wgsl ggml webgpu: shader library organization (#19530) 2026-02-18 07:51:02 -07:00
set.wgsl ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687) 2026-03-19 08:45:28 -07:00
set_rows.wgsl ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
soft_max.tmpl.wgsl ggml webgpu: actually add softmax, fix rms_norm offset (#16400) 2025-10-04 20:59:31 -07:00
solve_tri.wgsl ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687) 2026-03-19 08:45:28 -07:00
ssm_conv.wgsl ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687) 2026-03-19 08:45:28 -07:00
sum_rows.wgsl ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
unary.wgsl ggml-webgpu: Add supports for `DIAG` and `TRI` (#20664) 2026-03-18 21:08:35 -07:00