llama.cpp/docs/backend
Latest commit: 213c4a0b81 — Neo Zhang, 2026-03-08 12:00:07 +08:00
[SYCL] support Flash Attention for fp32/fp16/Q4/Q5/Q8 (#20190)

* support flash-attention for fp32/fp16/Q4/Q5/Q8
* rm warning
* update for JIT
VirtGPU         ggml-virtgpu: add backend documentation (#19354)                2026-02-09 20:15:42 +08:00
snapdragon      chore : correct typos [no ci] (#20041)                          2026-03-05 08:50:21 +01:00
BLIS.md         make : deprecate (#10514)                                       2024-12-02 21:22:53 +02:00
CANN.md         chore : correct typos [no ci] (#20041)                          2026-03-05 08:50:21 +01:00
CUDA-FEDORA.md  docs: update: improve the Fedora CUDA guide (#12536)            2025-03-24 11:02:26 +00:00
OPENCL.md       docs: add linux to index (#18907)                               2026-01-18 18:03:35 +08:00
SYCL.md         [SYCL] support Flash Attention for fp32/fp16/Q4/Q5/Q8 (#20190)  2026-03-08 12:00:07 +08:00
VirtGPU.md      ggml-virtgpu: improve the reliability of the code (#19846)      2026-02-26 20:00:57 +08:00
ZenDNN.md       ggml-zendnn: update code for latest ZenDNN API (#19923)         2026-02-27 08:43:41 +08:00
zDNN.md         ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690)          2025-12-07 00:13:33 +08:00