llama.cpp

Reese Levine 45363632cb ggml WebGPU: add support for quantization types (#15440)	2025-08-22 11:28:03 -07:00
* Begin work on set_rows
* Work on set rows
* Add error buffers for reporting unsupported SET_ROWS indices
* Remove extra comments
* Work on templating for different types in shaders
* Work on shader type generation
* Working q4_0 mul_mat and some templating for different types
* Add q4_0_f16 matmul and fix device init
* Add matmul support for basic quantization types
* Add q2_k and q3_k quantization
* Add rest of k-quants
* Get firt i-quant working
* Closer to supporting all i-quants
* Support rest of i-quants
* Cleanup code
* Fix python formatting
* debug
* Bugfix for memset
* Add padding to end of buffers on creation
* Simplify bit-shifting
* Update usage of StringView
cpy.wgsl	ggml: Add initial WebGPU backend (#14521)	2025-07-16 18:18:51 +03:00
embed_wgsl.py	ggml WebGPU: add support for quantization types (#15440)	2025-08-22 11:28:03 -07:00
memset.wgsl	ggml WebGPU: add support for quantization types (#15440)	2025-08-22 11:28:03 -07:00
mul_mat.tmpl.wgsl	ggml WebGPU: add support for quantization types (#15440)	2025-08-22 11:28:03 -07:00
set_rows.wgsl	ggml: Add basic SET_ROWS support in WebGPU (#15137)	2025-08-06 15:14:40 -07:00