llama.cpp/ggml/src/ggml-webgpu
Reese Levine d006858316
ggml-webgpu: move from parameter buffer pool to single buffer with offsets (#21278)
* Work towards removing bitcast

* Move rest of existing types over

* Add timeout back to wait and remove synchronous set_tensor/memset_tensor

* move to unpackf16 for wider compatibility

* cleanup

* Remove deadlock condition in free_bufs

* Start work on removing parameter buffer pools

* Simplify and optimize further

* simplify profile futures

* Fix stride

* Try using a single command buffer per batch

* formatting
2026-04-03 11:40:14 -07:00
wgsl-shaders                ggml-webgpu: add vectorized flash attention (#20709)  2026-04-02 10:40:42 -07:00
CMakeLists.txt              ggml webgpu: add support for emscripten builds (#17184)  2025-12-03 10:25:34 +01:00
ggml-webgpu-shader-lib.hpp  ggml-webgpu: move from parameter buffer pool to single buffer with offsets (#21278)  2026-04-03 11:40:14 -07:00
ggml-webgpu.cpp             ggml-webgpu: move from parameter buffer pool to single buffer with offsets (#21278)  2026-04-03 11:40:14 -07:00
pre_wgsl.hpp                ggml webgpu: initial flashattention implementation (#18610)  2026-01-08 08:23:39 -08:00