llama.cpp/ggml
Akarshan Biswas cd1fce6d4f
SYCL: Add set_rows support for quantized types (#14883)
* SYCL: Add set_rows support for quantized types

This commit adds support for the GGML_OP_SET_ROWS operation for various
quantized tensor types (Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, IQ4_NL) and the BF16
type in the SYCL backend.

The quantization/dequantization copy kernels were moved from cpy.cpp
to cpy.hpp so that set_rows.cpp can use them.

This addresses part of the TODOs mentioned in the code.

* Use get_global_linear_id() instead

ggml-ci

* Fix formatting

ggml-ci

* Use const for ne11 and size_t variables in set_rows_sycl_q

ggml-ci

* Increase block size for q kernel to 256

ggml-ci

* Cleanup imports

* Add float.h to cpy.hpp
2025-07-28 20:32:15 +05:30
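
As a rough illustration of what the commit message above describes, the following host-side C++ sketch writes F32 source rows into a quantized destination at the rows named by an index tensor. It is not the SYCL kernel from the commit: BlockQ8, quantize_block and set_rows_q8_ref are hypothetical names, the per-block scale is a plain float rather than the 16-bit half that ggml's Q8_0 stores, and the loop over gid only imitates a flat global work-item index, one quantization block per work item.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified Q8_0-style block: QK values share one scale.
// Assumption: the scale is a float here; ggml's Q8_0 uses a 16-bit half.
constexpr int QK = 32;

struct BlockQ8 {
    float  d;       // per-block scale
    int8_t qs[QK];  // quantized values in [-127, 127]
};

// Quantize QK consecutive floats into one block (absmax scaling).
inline void quantize_block(const float * x, BlockQ8 & out) {
    float amax = 0.0f;
    for (int j = 0; j < QK; ++j) {
        amax = std::fmax(amax, std::fabs(x[j]));
    }
    const float d  = amax / 127.0f;
    const float id = d != 0.0f ? 1.0f / d : 0.0f;  // avoid division by zero
    out.d = d;
    for (int j = 0; j < QK; ++j) {
        out.qs[j] = static_cast<int8_t>(std::lround(x[j] * id));
    }
}

// Hypothetical reference for set_rows with a quantized destination:
//   dst[row_idx[i], :] = quantize(src[i, :])
// src holds row_idx.size() rows of ne00 floats; ne00 must be a multiple of QK.
void set_rows_q8_ref(const std::vector<float> &   src,
                     const std::vector<int64_t> & row_idx,
                     size_t                       ne00,
                     std::vector<BlockQ8> &       dst) {
    const size_t blocks_per_row = ne00 / QK;
    const size_t total          = row_idx.size() * blocks_per_row;
    // Each iteration stands in for one work item with flat id `gid`.
    for (size_t gid = 0; gid < total; ++gid) {
        const size_t i       = gid / blocks_per_row;  // source row
        const size_t b       = gid % blocks_per_row;  // block within the row
        const size_t dst_row = static_cast<size_t>(row_idx[i]);
        quantize_block(&src[i * ne00 + b * QK], dst[dst_row * blocks_per_row + b]);
    }
}
```

Splitting the flat index into a row and a block within that row keeps the work per item uniform, which is one plausible reason for launching such a kernel over a single linear range with a fixed block size.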
cmake cmake : Indent ggml-config.cmake (ggml/1310) 2025-07-28 08:15:01 +03:00
include ggml: Add initial WebGPU backend (#14521) 2025-07-16 18:18:51 +03:00
src SYCL: Add set_rows support for quantized types (#14883) 2025-07-28 20:32:15 +05:30
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml-cpu : disable GGML_NNPA by default due to instability (#14880) 2025-07-25 19:09:03 +02:00