llama.cpp

Yu, Zijun 65e1b1af6d Fix after rebasing		2026-01-15 11:19:15 -08:00
- Layouts of cache k and cache v are unified: [seq, n_head, head_size]
- Add CPY and FLASH_ATTN_EXT, flash attn is not used yet
- Skip test-backend-ops due to flash attn test crash
- Add mutex around graph conversion to avoid test-thread-safety fail in the future
- Update NPU config
- Update GPU config to disable SDPA opt to make phi-3 run
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	fix build error	2026-01-15 10:10:00 -08:00
src	Fix after rebasing	2026-01-15 11:19:15 -08:00
.gitignore	…
CMakeLists.txt	Refactor: clean, fix warning	2026-01-15 10:20:18 -08:00