llama.cpp

Georgi Gerganov	1725e316c1	models : optimize qwen3next graph (#19375)	2026-02-14 12:57:36 +02:00
* models : optimizing qwen3next graph
* cont
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* cont : remove redundant q, g chunking
* minor
* minor
* avoid passing masks around
* avoid concats during chunking
* naming + shapes
* update names and use prefix to disable CUDA graphs
CMakeLists.txt	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
ggml-metal-common.cpp	models : optimize qwen3next graph (#19375)	2026-02-14 12:57:36 +02:00
ggml-metal-common.h	metal : refactor + optimize v2 (#15995)	2025-09-17 20:38:12 +03:00
ggml-metal-context.h	metal : support virtual devices (#18919)	2026-02-02 14:29:44 +02:00
ggml-metal-context.m	metal : fix event synchronization in cpy_tensor_async (#19402)	2026-02-07 07:37:15 +02:00
ggml-metal-device.cpp	metal : update sum_rows kernel to support float4 (#19524)	2026-02-12 11:35:28 +02:00
ggml-metal-device.h	metal : consolidate bin kernels (#19390)	2026-02-07 10:35:56 +02:00
ggml-metal-device.m	metal : fix ACC op (#19427)	2026-02-14 09:54:03 +02:00
ggml-metal-impl.h	metal : update sum_rows kernel to support float4 (#19524)	2026-02-12 11:35:28 +02:00
ggml-metal-ops.cpp	metal : fix ACC op (#19427)	2026-02-14 09:54:03 +02:00
ggml-metal-ops.h	metal : support GGML_OP_SET (#19548)	2026-02-13 07:34:52 +02:00
ggml-metal.cpp	metal : add missing includes (#19348)	2026-02-05 08:05:09 +02:00
ggml-metal.metal	metal : update sum_rows kernel to support float4 (#19524)	2026-02-12 11:35:28 +02:00