llama.cpp

Reese Levine c1258830b2 ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687)	2026-03-19 08:45:28 -07:00
* Implement l2_norm, set, tri
* Add DIAG/SOLVE_TRI
* Add SSM_CONV
* Better get_rows and gated_delta_net to support qwen3.5
* Clean up, update ops.md
* Fix binding_index type for wasm
* Fix read write annotations
* cleanups
argmax.wgsl	ggml webgpu: support for backend sampling (#18880 )	2026-01-16 16:12:43 -08:00
argsort.wgsl	ggml webgpu: support for backend sampling (#18880 )	2026-01-16 16:12:43 -08:00
argsort_merge.wgsl	ggml webgpu: support for backend sampling (#18880 )	2026-01-16 16:12:43 -08:00
binary.wgsl	ggml-webgpu: Support non-contiguous `src0` and overlapping `src0/src1` in binary ops (#19850 )	2026-03-02 07:59:53 -08:00
common_decls.tmpl	ggml webgpu: shader library organization (#19530 )	2026-02-18 07:51:02 -07:00
concat.wgsl	Add concat op to webgpu. (#20068 )	2026-03-04 11:19:00 -08:00
cpy.tmpl.wgsl	ggml webgpu: support for backend sampling (#18880 )	2026-01-16 16:12:43 -08:00
cumsum.wgsl	ggml webgpu: support for backend sampling (#18880 )	2026-01-16 16:12:43 -08:00
embed_wgsl.py	ggml webgpu: shader library organization (#19530 )	2026-02-18 07:51:02 -07:00
flash_attn.wgsl	ggml-webgpu: improve flastAttention performance by software pipelining (#19151 )	2026-01-29 14:05:30 -08:00
gated_delta_net.wgsl	ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687 )	2026-03-19 08:45:28 -07:00
get_rows.wgsl	ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687 )	2026-03-19 08:45:28 -07:00
glu.tmpl.wgsl	ggml webgpu: support for rope,div,sub,glu,scale,cont operators (#16187 )	2025-09-30 09:57:51 -07:00
memset.wgsl	ggml WebGPU: add support for quantization types (#15440 )	2025-08-22 11:28:03 -07:00
mul_mat.wgsl	ggml webgpu: fix workgroup dispatch limit for large batch sizes (#19965 )	2026-03-02 19:35:11 -08:00
mul_mat_decls.tmpl	ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling (#20173 )	2026-03-10 09:14:27 -07:00
mul_mat_reg_tile.wgsl	ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling (#20173 )	2026-03-10 09:14:27 -07:00
mul_mat_subgroup_matrix.wgsl	ggml webgpu: fix workgroup dispatch limit for large batch sizes (#19965 )	2026-03-02 19:35:11 -08:00
mul_mat_vec.wgsl	ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling (#20173 )	2026-03-10 09:14:27 -07:00
pad.wgsl	ggml webgpu: support for backend sampling (#18880 )	2026-01-16 16:12:43 -08:00
repeat.wgsl	ggml-webgpu: Add supports for `GGML_OP_REPEAT` (#20230 )	2026-03-11 14:40:36 -07:00
rope.tmpl.wgsl	model: add support for qwen3vl series (#16780 )	2025-10-30 16:19:14 +01:00
row_norm.wgsl	ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687 )	2026-03-19 08:45:28 -07:00
scale.wgsl	ggml webgpu: shader library organization (#19530 )	2026-02-18 07:51:02 -07:00
set.wgsl	ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687 )	2026-03-19 08:45:28 -07:00
set_rows.wgsl	ggml webgpu: support for backend sampling (#18880 )	2026-01-16 16:12:43 -08:00
soft_max.tmpl.wgsl	ggml webgpu: actually add softmax, fix rms_norm offset (#16400 )	2025-10-04 20:59:31 -07:00
solve_tri.wgsl	ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687 )	2026-03-19 08:45:28 -07:00
ssm_conv.wgsl	ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization (#20687 )	2026-03-19 08:45:28 -07:00
sum_rows.wgsl	ggml webgpu: support for backend sampling (#18880 )	2026-01-16 16:12:43 -08:00
unary.wgsl	ggml-webgpu: Add supports for `DIAG` and `TRI` (#20664 )	2026-03-18 21:08:35 -07:00