models : optimizing qwen3next graph

* cont
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* cont : remove redundant q, g chunking
* minor
* minor
* avoid passing masks around
* avoid concats during chunking
* naming + shapes
* update names and use prefix to disable CUDA graphs

| Name | | |
|---|---|---|
| CMakeLists.txt | ||
| ggml-metal-common.cpp | ||
| ggml-metal-common.h | ||
| ggml-metal-context.h | ||
| ggml-metal-context.m | ||
| ggml-metal-device.cpp | ||
| ggml-metal-device.h | ||
| ggml-metal-device.m | ||
| ggml-metal-impl.h | ||
| ggml-metal-ops.cpp | ||
| ggml-metal-ops.h | ||
| ggml-metal.cpp | ||
| ggml-metal.metal | ||