llama.cpp/ggml/src/ggml-metal
Georgi Gerganov 1725e316c1
models : optimize qwen3next graph (#19375)
* models : optimizing qwen3next graph

* cont

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* cont : remove redundant q, g chunking

* minor

* minor

* avoid passing masks around

* avoid concats during chunking

* naming + shapes

* update names and use prefix to disable CUDA graphs
2026-02-14 12:57:36 +02:00
CMakeLists.txt         docs : Minor cleanups (#19252)                                  2026-02-02 08:38:55 +02:00
ggml-metal-common.cpp  models : optimize qwen3next graph (#19375)                      2026-02-14 12:57:36 +02:00
ggml-metal-common.h    metal : refactor + optimize v2 (#15995)                         2025-09-17 20:38:12 +03:00
ggml-metal-context.h   metal : support virtual devices (#18919)                        2026-02-02 14:29:44 +02:00
ggml-metal-context.m   metal : fix event synchronization in cpy_tensor_async (#19402)  2026-02-07 07:37:15 +02:00
ggml-metal-device.cpp  metal : update sum_rows kernel to support float4 (#19524)       2026-02-12 11:35:28 +02:00
ggml-metal-device.h    metal : consolidate bin kernels (#19390)                        2026-02-07 10:35:56 +02:00
ggml-metal-device.m    metal : fix ACC op (#19427)                                     2026-02-14 09:54:03 +02:00
ggml-metal-impl.h      metal : update sum_rows kernel to support float4 (#19524)       2026-02-12 11:35:28 +02:00
ggml-metal-ops.cpp     metal : fix ACC op (#19427)                                     2026-02-14 09:54:03 +02:00
ggml-metal-ops.h       metal : support GGML_OP_SET (#19548)                            2026-02-13 07:34:52 +02:00
ggml-metal.cpp         metal : add missing includes (#19348)                           2026-02-05 08:05:09 +02:00
ggml-metal.metal       metal : update sum_rows kernel to support float4 (#19524)       2026-02-12 11:35:28 +02:00