llama.cpp

Akarshan Biswas cd1fce6d4f SYCL: Add set_rows support for quantized types (#14883)	2025-07-28 20:32:15 +05:30

* SYCL: Add set_rows support for quantized types

  This commit adds support for GGML_OP_SET_ROWS operation for various
  quantized tensor types (Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, IQ4_NL) and BF16
  type in the SYCL backend.

  The quantization/dequantization copy kernels were moved from cpy.cpp
  to cpy.hpp to make them available for set_rows.cpp.

  This addresses part of the TODOs mentioned in the code.

* Use get_global_linear_id() instead

  ggml-ci

* Fix formatting

  ggml-ci

* Use const for ne11 and size_t variables in set_rows_sycl_q

  ggml-ci

* Increase block size for q kernel to 256

  ggml-ci

* Cleanup imports

* Add float.h to cpy.hpp

cmake	cmake : Indent ggml-config.cmake (ggml/1310)	2025-07-28 08:15:01 +03:00
include	ggml: Add initial WebGPU backend (#14521)	2025-07-16 18:18:51 +03:00
src	SYCL: Add set_rows support for quantized types (#14883)	2025-07-28 20:32:15 +05:30
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml-cpu : disable GGML_NNPA by default due to instability (#14880)	2025-07-25 19:09:03 +02:00