llama.cpp

Alfred	ce734a8a2f	2025-12-19 09:42:28 -08:00
ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed-precision matmul operations (#17977)
* feat: implement real Q8_0
* feat: adding cmake option for configuring FP32 quantize group size
* typo: set() shall be used
Co-authored-by: ngdxzy <zhenyu_xu@uri.edu>
hexagon	ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed-precision matmul operations (#17977)	2025-12-19 09:42:28 -08:00
BLIS.md	make : deprecate (#10514)	2024-12-02 21:22:53 +02:00
CANN.md	CANN: GGML_CANN_ACL_GRAPH works only USE_ACL_GRAPH enabled (#16861)	2025-11-12 14:37:52 +08:00
CUDA-FEDORA.md	docs: update: improve the Fedoa CUDA guide (#12536)	2025-03-24 11:02:26 +00:00
OPENCL.md	opencl: update doc (#17011)	2025-11-04 16:02:36 -08:00
SYCL.md	added note for old Intel hardware pre sycl (#18017)	2025-12-16 17:45:09 +08:00
ZenDNN.md	ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690)	2025-12-07 00:13:33 +08:00
zDNN.md	ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690)	2025-12-07 00:13:33 +08:00