llama.cpp/docs/ops
Reese Levine a89002f07b
ggml webgpu: support for backend sampling (#18880)
* ggml webgpu: add SOFTPLUS unary operator

Implements SOFTPLUS (log(1 + exp(x))) with f16/f32 support. Uses f32
precision for intermediate calculations to prevent f16 overflow.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* Follow Vulkan backend numerical stability pattern

* ggml webgpu: add EXPM1 unary operator

Implements EXPM1 (exp(x) - 1) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add FLOOR unary operator

Implements FLOOR (rounds down to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add CEIL unary operator

Implements CEIL (rounds up to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add ROUND unary operator

Implements ROUND (rounds to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add TRUNC unary operator

Implements TRUNC (truncates towards zero) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* docs : update WebGPU support for unary operators (FLOOR, CEIL, ROUND, TRUNC, EXPM1, SOFTPLUS)

* Updates to webgpu get_memory

* Add argmax

* Add argmax,cumsum,sum,sum_rows

* Add necessary CPY/GET_ROWS operators

* Support for argsort using multi-pass strategy

* Update set_rows for i32 indices, move to pre-wgsl

* Port unary operators to pre-wgsl and support FILL

* Implement PAD

* Add support for top-k

* clean up, scope pipeline init mutex

* fix newline

* Add support for log

* Update LOG for better precision, and ops doc

---------

Co-authored-by: Abhijit Ramesh <abhijitramesh2k@gmail.com>
2026-01-16 16:12:43 -08:00
..
BLAS.csv docs(ggml): update backend ops (#18734) 2026-01-10 18:48:17 +08:00
CANN.csv docs : update ops.md for CANN backend (#18654) 2026-01-16 13:32:17 +01:00
CPU.csv docs : update cpu and cuda ops (#17890) 2025-12-09 23:31:29 +01:00
CUDA.csv docs : update cpu and cuda ops (#17890) 2025-12-09 23:31:29 +01:00
Metal.csv metal : add count_equal op (#18314) 2025-12-31 10:39:48 +02:00
OpenCL.csv docs : update opencl ops (#17904) 2025-12-10 15:20:00 +01:00
SYCL.csv [SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (#17826) 2025-12-15 10:35:15 +08:00
Vulkan.csv ops.md: update vulkan support (#17661) 2025-12-01 15:26:21 -06:00
WebGPU.csv ggml webgpu: support for backend sampling (#18880) 2026-01-16 16:12:43 -08:00
ZenDNN.csv ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) 2025-12-07 00:13:33 +08:00
zDNN.csv docs(ggml): update backend ops (#18734) 2026-01-10 18:48:17 +08:00