HyperFoldUK
|
39137bfe63
|
ggml : integrate sparse-ternary-fma for TQ2_0 quantization
- Add adapter layer for TQ2_0 encoding conversion
- Implement branchless bitwise encoding conversion
- Add SIMD-accelerated Q8_K to int32 type conversion
- Integrate with ggml_vec_dot_tq2_0_q8_K_generic via threshold dispatch
- Add TQ2_0 test cases to test-backend-ops
- Include sparse-ternary-fma library (dense SIMD kernel)
- Improve throughput 2.3x on AVX-512
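
The branchless decoding and threshold dispatch named in the list above can be sketched roughly as follows. This is an illustration, not the code from this commit. It assumes each 2-bit field holds 0, 1 or 2 for the weights -1, 0 and +1, and it packs four weights per byte in sequence, which is simpler than the real TQ2_0 block layout. The names `ternary_decode`, `ternary_dot_scalar`, `ternary_dot_dispatch` and the cutoff `TERNARY_DISPATCH_MIN_ELEMS` are hypothetical; the commit does not state the actual threshold.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cutoff: the commit does not state the real threshold. */
#define TERNARY_DISPATCH_MIN_ELEMS 1024

/* Signature shared by the scalar path and any dense (SIMD) kernel. */
typedef int32_t (*ternary_dot_fn)(const uint8_t *w, const int8_t *a, size_t n);

/* Decode the 2-bit field in `slot` (0..3) of `packed` to -1, 0 or +1.
 * Assumes the field stores the weight plus one, so no branch is needed:
 * shift, mask, subtract. */
static inline int32_t ternary_decode(uint8_t packed, unsigned slot) {
    return (int32_t)((packed >> (slot * 2u)) & 3u) - 1;
}

/* Scalar dot product of n ternary weights with n int8 activations.
 * Simplified layout: weight i lives in byte i / 4, slot i % 4. */
static inline int32_t ternary_dot_scalar(const uint8_t *w, const int8_t *a,
                                         size_t n) {
    int32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += ternary_decode(w[i / 4], (unsigned)(i % 4)) * (int32_t)a[i];
    }
    return sum;
}

/* Threshold dispatch: use the dense kernel only when one is supplied and
 * the input is long enough to amortize its setup cost; otherwise fall
 * back to the scalar path. */
static inline int32_t ternary_dot_dispatch(const uint8_t *w, const int8_t *a,
                                           size_t n, ternary_dot_fn dense) {
    if (dense != NULL && n >= TERNARY_DISPATCH_MIN_ELEMS) {
        return dense(w, a, n);
    }
    return ternary_dot_scalar(w, a, n);
}
```

Passing the dense kernel as a function pointer keeps the sketch independent of any particular SIMD library; calling `ternary_dot_dispatch(w, a, n, NULL)` always takes the scalar path.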
|
2026-01-14 05:20:27 -05:00 |