llama.cpp

Reese Levine 82764c341a	2026-04-01 08:38:24 +03:00
ggml webgpu: quantized buffers to u32 + wider browser/device support (#21046)
* Work towards removing bitcast
* Move rest of existing types over
* Add timeout back to wait and remove synchronous set_tensor/memset_tensor
* move to unpackf16 for wider compatibility
* cleanup
* Remove deadlock condition in free_bufs
wgsl-shaders	ggml webgpu: quantized buffers to u32 + wider browser/device support (#21046)	2026-04-01 08:38:24 +03:00
CMakeLists.txt	ggml webgpu: add support for emscripten builds (#17184)	2025-12-03 10:25:34 +01:00
ggml-webgpu-shader-lib.hpp	ggml webgpu: quantized buffers to u32 + wider browser/device support (#21046)	2026-04-01 08:38:24 +03:00
ggml-webgpu.cpp	ggml webgpu: quantized buffers to u32 + wider browser/device support (#21046)	2026-04-01 08:38:24 +03:00
pre_wgsl.hpp	ggml webgpu: initial flashattention implementation (#18610)	2026-01-08 08:23:39 -08:00