llama.cpp/ggml
Francis Couture-Harpin 96b3d411e0 ggml-quants : allow using vdotq_s32 in TQ2_0 vec_dot
Not yet tested on hardware which supports it,
might not work or might not even compile. But also it might.
It should make the performance better on recent ARM CPUs.

* ggml-quants : remove comment about possible format change of TQ2_0

Making it slightly more convenient for AVX512
but less convenient for everything else is not worth the trouble.
2024-08-07 15:08:41 -04:00
..
cmake llama : reorganize source code + improve CMake (#8006) 2024-06-26 18:33:02 +03:00
include ggml : remove q1_3 and q2_2 2024-08-02 20:16:26 -04:00
src ggml-quants : allow using vdotq_s32 in TQ2_0 vec_dot 2024-08-07 15:08:41 -04:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt feat: Support Moore Threads GPU (#8383) 2024-07-28 01:41:25 +02:00