llama.cpp/ggml/src/ggml-metal
Georgi Gerganov 2f966b8ed8
clip : use FA (#16837)
* clip : use FA

* cont : add warning about unsupported ops

* implement "auto" mode for clip flash attn

* clip : print more detailed op support info during warmup

* cont : remove obsolete comment [no ci]

* improve debugging message

* trailing space

* metal : remove stray return

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-11-02 21:21:48 +01:00
CMakeLists.txt         metal : refactor + optimize v2 (#15995)              2025-09-17 20:38:12 +03:00
ggml-metal-common.cpp  metal : fix loop bound in ggml_mem_ranges (#16412)   2025-10-03 19:18:56 +03:00
ggml-metal-common.h    metal : refactor + optimize v2 (#15995)              2025-09-17 20:38:12 +03:00
ggml-metal-context.h   metal : refactor + optimize v2 (#15995)              2025-09-17 20:38:12 +03:00
ggml-metal-context.m   metal : fuse non-sequential nodes (#16102)           2025-09-28 09:34:05 +03:00
ggml-metal-device.cpp  vulkan: fix shmem overrun in mmq id shader (#16873)  2025-10-31 08:14:49 +01:00
ggml-metal-device.h    metal : add `CONV_TRANSPOSE_2D` (#16542)             2025-10-17 09:33:58 +03:00
ggml-metal-device.m    clip : use FA (#16837)                               2025-11-02 21:21:48 +01:00
ggml-metal-impl.h      model: add support for qwen3vl series (#16780)       2025-10-30 16:19:14 +01:00
ggml-metal-ops.cpp     metal : add `CONV_TRANSPOSE_2D` (#16542)             2025-10-17 09:33:58 +03:00
ggml-metal-ops.h       metal : add `CONV_TRANSPOSE_2D` (#16542)             2025-10-17 09:33:58 +03:00
ggml-metal.cpp         metal : mark FA blocks (#16372)                      2025-10-08 10:57:53 +03:00
ggml-metal.metal       clip : use FA (#16837)                               2025-11-02 21:21:48 +01:00