llama.cpp/ggml
HyperFoldUK 39137bfe63 ggml : integrate sparse-ternary-fma for TQ2_0 quantization
- Add adapter layer for TQ2_0 encoding conversion
- Implement branchless bitwise encoding conversion
- Add SIMD-accelerated Q8_K to int32 type conversion
- Integrate with ggml_vec_dot_tq2_0_q8_K_generic via threshold dispatch
- Add TQ2_0 test cases to test-backend-ops
- Include sparse-ternary-fma library (dense SIMD kernel)
- 2.3x throughput improvement on AVX-512
2026-01-14 05:20:27 -05:00
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include ggml-webgpu: Fix GGML_MEM_ALIGN to 8 for emscripten. (#18628) 2026-01-08 08:36:42 -08:00
src ggml : integrate sparse-ternary-fma for TQ2_0 quantization 2026-01-14 05:20:27 -05:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml : integrate sparse-ternary-fma for TQ2_0 quantization 2026-01-14 05:20:27 -05:00