llama.cpp/ggml
Georgi Gerganov (commit 1725e316c1)
models : optimize qwen3next graph (#19375)
* models : optimize qwen3next graph

* cont

* wip

* cont : remove redundant q, g chunking

* minor

* avoid passing masks around

* avoid concats during chunking (see the first sketch below)

* naming + shapes

* update names and use prefix to disable CUDA graphs (see the second sketch below)
2026-02-14 12:57:36 +02:00
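
The "avoid concats during chunking" item refers to a common ggml graph-building pattern: rather than producing per-chunk result tensors and joining them with ggml_concat, the full output is allocated once and each chunk result is copied into a strided view of it. The sketch below is a minimal, hypothetical illustration of that pattern, not the actual qwen3next graph code; the shapes, the ggml_scale stand-in for the per-chunk math, and the availability of the CPU helper ggml_graph_compute_with_ctx from ggml-cpu.h are assumptions, while ggml_view_2d, ggml_cpy, and ggml_build_forward_expand are standard ggml calls.

    #include "ggml.h"
    #include "ggml-cpu.h" // assumed: CPU helpers, e.g. ggml_graph_compute_with_ctx
    #include <stdio.h>

    int main(void) {
        struct ggml_init_params params = {
            /*.mem_size   =*/ 32*1024*1024,
            /*.mem_buffer =*/ NULL,
            /*.no_alloc   =*/ false,
        };
        struct ggml_context * ctx = ggml_init(params);

        const int64_t d = 8, n_tokens = 16, chunk = 4; // illustrative sizes

        struct ggml_tensor * x   = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, d, n_tokens);
        struct ggml_tensor * out = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, d, n_tokens);

        // fill the input directly (no_alloc == false, so data lives in the context)
        float * xd = (float *) x->data;
        for (int64_t i = 0; i < d*n_tokens; ++i) {
            xd[i] = 1.0f;
        }

        struct ggml_cgraph * gf = ggml_new_graph(ctx);

        // walk the sequence in fixed-size chunks: read a view of the input, do the
        // per-chunk work, and copy the result into the matching view of the
        // preallocated output -- no ggml_concat nodes are ever created
        for (int64_t i0 = 0; i0 < n_tokens; i0 += chunk) {
            struct ggml_tensor * xc = ggml_view_2d(ctx, x,   d, chunk, x->nb[1],   i0*x->nb[1]);
            struct ggml_tensor * oc = ggml_view_2d(ctx, out, d, chunk, out->nb[1], i0*out->nb[1]);
            struct ggml_tensor * yc = ggml_scale(ctx, xc, 2.0f); // stand-in for the real per-chunk math
            ggml_build_forward_expand(gf, ggml_cpy(ctx, yc, oc));
        }

        ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/4);
        printf("out[0] = %f\n", ((const float *) out->data)[0]); // 2.0
        ggml_free(ctx);
        return 0;
    }

Writing each chunk result through a view keeps the node count flat and avoids the intermediate tensors a chain of concat nodes would allocate.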
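The last item mentions using a name prefix to disable CUDA graphs. The commit log does not spell out the mechanism here, so the sketch below only illustrates the general shape of such a check under stated assumptions: the marker prefix, helper names, and decision policy are hypothetical, while ggml_set_name, ggml_graph_n_nodes, and ggml_graph_node are real ggml API calls. The actual prefix and check live in the CUDA backend and may differ.

    #include "ggml.h"
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    #define NO_GRAPH_PREFIX "no-graph:" // hypothetical marker, not the real prefix

    // graph-construction side: tag a node whose launch parameters change between
    // steps, so a capturing backend would fall back to eager kernel launches
    static void tag_no_graph(struct ggml_tensor * t) {
        char name[GGML_MAX_NAME];
        snprintf(name, sizeof(name), NO_GRAPH_PREFIX "%s", t->name);
        ggml_set_name(t, name);
    }

    // backend side: scan the graph once before capture; any marked node disables
    // CUDA graph usage for the whole graph
    static bool graph_capturable(struct ggml_cgraph * gf) {
        for (int i = 0; i < ggml_graph_n_nodes(gf); ++i) {
            const struct ggml_tensor * node = ggml_graph_node(gf, i);
            if (strncmp(node->name, NO_GRAPH_PREFIX, strlen(NO_GRAPH_PREFIX)) == 0) {
                return false;
            }
        }
        return true;
    }

    int main(void) {
        struct ggml_init_params params = {
            /*.mem_size   =*/ 16*1024*1024,
            /*.mem_buffer =*/ NULL,
            /*.no_alloc   =*/ false,
        };
        struct ggml_context * ctx = ggml_init(params);

        struct ggml_tensor * a = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 8);
        struct ggml_tensor * b = ggml_sqr(ctx, a);
        ggml_set_name(b, "example-node"); // illustrative name

        struct ggml_cgraph * gf = ggml_new_graph(ctx);
        ggml_build_forward_expand(gf, b);

        printf("capturable before tagging: %d\n", graph_capturable(gf)); // 1
        tag_no_graph(b);
        printf("capturable after tagging:  %d\n", graph_capturable(gf)); // 0

        ggml_free(ctx);
        return 0;
    }

Keying the opt-out on tensor names is attractive because names already flow from graph construction to every backend, so no new per-node field or API is needed.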
Name            Last commit                                                                   Last update
cmake
include         ggml-virtgpu: make the code thread safe (#19204)                              2026-02-04 10:46:18 +08:00
src             models : optimize qwen3next graph (#19375)                                    2026-02-14 12:57:36 +02:00
.gitignore
CMakeLists.txt  Bump cmake max version (needed for Windows on Snapdragon builds) (#19188)     2026-02-01 14:13:38 -08:00