HappyZ happyz
happyz synced commits to refs/pull/20797/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:06 -07:00
c2e224d829 issues: add openvino backends (#20932)
8c7957ca33 common : add standard Hugging Face cache support (#20775)
e852eb4901 llama-fit: fix regex pattern for gate_up tensors (#20910)
312d870a89 common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912)
Compare 8 commits »
happyz synced commits to refs/pull/20801/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:06 -07:00
3fc6f1aed1 ggml-backend: re-enable graph reuse with pipeline parallelism (#20927)
29771a0a4c vendor : update cpp-httplib to 0.39.0 (#20933)
42ebce3beb common : fix get_gguf_split_info (#20946)
a94fdb090a WebUI: fix edit msg form textarea height (#20830)
Compare 14 commits »
happyz synced commits to refs/pull/20819/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:06 -07:00
2d2d9c2062 common : add a WARNING for HF cache migration (#20935)
92080b4396 metal : add FLOOR, CEIL, ROUND, TRUNC unary ops (#20930)
342d6125bc metal : add FA instantiations for HSK=512, HSV=512 (#20902)
c2e224d829 issues: add openvino backends (#20932)
Compare 12 commits »
happyz synced commits to refs/pull/20793/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:05 -07:00
312d870a89 common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912)
7cadbfce10 hexagon: general DMA and Binary Op fixes for large strides (#20918)
1fb2290a51 Add codeowners for scripts/snapdragon and docs/snapdragon (#20915)
1772701f99 opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
Compare 6 commits »
happyz synced commits to refs/pull/20742/head at happyz/llama.cpp from mirror 2026-03-24 07:02:05 -07:00
b9cb6b651b ci: use ninja multi-config for vulkan-x64 build
a987c02a56 Added explicit build types for Ninja
7842cf622c ci: fix windows ci errors from an erroneous revert
231d441a4a ci: missed one self-hosted step
c06031138f ci: revert ninja from self-hosted runners
Compare 92 commits »
happyz synced commits to refs/pull/20742/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:05 -07:00
b9cb6b651b ci: use ninja multi-config for vulkan-x64 build
2d2d9c2062 common : add a WARNING for HF cache migration (#20935)
92080b4396 metal : add FLOOR, CEIL, ROUND, TRUNC unary ops (#20930)
342d6125bc metal : add FA instantiations for HSK=512, HSV=512 (#20902)
Compare 25 commits »
happyz synced commits to refs/pull/20759/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:05 -07:00
3fc6f1aed1 ggml-backend: re-enable graph reuse with pipeline parallelism (#20927)
29771a0a4c vendor : update cpp-httplib to 0.39.0 (#20933)
42ebce3beb common : fix get_gguf_split_info (#20946)
a94fdb090a WebUI: fix edit msg form textarea height (#20830)
Compare 33 commits »
happyz synced commits to refs/pull/20792/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:05 -07:00
e852eb4901 llama-fit: fix regex pattern for gate_up tensors (#20910)
312d870a89 common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912)
7cadbfce10 hexagon: general DMA and Binary Op fixes for large strides (#20918)
1fb2290a51 Add codeowners for scripts/snapdragon and docs/snapdragon (#20915)
Compare 12 commits »
happyz synced commits to refs/pull/20700/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:04 -07:00
312d870a89 common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912)
7cadbfce10 hexagon: general DMA and Binary Op fixes for large strides (#20918)
1fb2290a51 Add codeowners for scripts/snapdragon and docs/snapdragon (#20915)
1772701f99 opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
Compare 10 commits »
happyz synced commits to refs/pull/20716/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:04 -07:00
312d870a89 common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912)
7cadbfce10 hexagon: general DMA and Binary Op fixes for large strides (#20918)
1fb2290a51 Add codeowners for scripts/snapdragon and docs/snapdragon (#20915)
1772701f99 opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
Compare 10 commits »
happyz synced commits to refs/pull/20728/head at happyz/llama.cpp from mirror 2026-03-24 07:02:04 -07:00
6e0691e5bd ggml-webgpu: port all AOT operators to JIT
c9dc43333f readme : clarify MODEL_ENDPOINT usage (#20941)
2d2d9c2062 common : add a WARNING for HF cache migration (#20935)
92080b4396 metal : add FLOOR, CEIL, ROUND, TRUNC unary ops (#20930)
342d6125bc metal : add FA instantiations for HSK=512, HSV=512 (#20902)
Compare 22 commits »
happyz synced commits to refs/pull/20728/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:04 -07:00
6e0691e5bd ggml-webgpu: port all AOT operators to JIT
c9dc43333f readme : clarify MODEL_ENDPOINT usage (#20941)
2d2d9c2062 common : add a WARNING for HF cache migration (#20935)
92080b4396 metal : add FLOOR, CEIL, ROUND, TRUNC unary ops (#20930)
Compare 22 commits »
happyz synced commits to refs/pull/20729/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:04 -07:00
c9dc43333f readme : clarify MODEL_ENDPOINT usage (#20941)
2d2d9c2062 common : add a WARNING for HF cache migration (#20935)
92080b4396 metal : add FLOOR, CEIL, ROUND, TRUNC unary ops (#20930)
342d6125bc metal : add FA instantiations for HSK=512, HSV=512 (#20902)
Compare 14 commits »
happyz synced commits to refs/pull/20626/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:03 -07:00
342d6125bc metal : add FA instantiations for HSK=512, HSV=512 (#20902)
c2e224d829 issues: add openvino backends (#20932)
8c7957ca33 common : add standard Hugging Face cache support (#20775)
e852eb4901 llama-fit: fix regex pattern for gate_up tensors (#20910)
Compare 8 commits »
happyz synced commits to refs/pull/20628/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:03 -07:00
312d870a89 common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912)
7cadbfce10 hexagon: general DMA and Binary Op fixes for large strides (#20918)
1fb2290a51 Add codeowners for scripts/snapdragon and docs/snapdragon (#20915)
1772701f99 opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
Compare 10 commits »
happyz synced commits to refs/pull/20633/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:03 -07:00
3fc6f1aed1 ggml-backend: re-enable graph reuse with pipeline parallelism (#20927)
29771a0a4c vendor : update cpp-httplib to 0.39.0 (#20933)
42ebce3beb common : fix get_gguf_split_info (#20946)
a94fdb090a WebUI: fix edit msg form textarea height (#20830)
Compare 16 commits »
happyz synced commits to refs/pull/20644/head at happyz/llama.cpp from mirror 2026-03-24 07:02:03 -07:00
caa8fba0cc Added check for dst_t to cuda_cast template for float
7fd898beac Renamed k to ne
53450f12f1 Removed stale code
fa79ea639e Forced F32 path for NVFP4/Cublas and removed Fusion/TensorScale
342d6125bc metal : add FA instantiations for HSK=512, HSV=512 (#20902)
Compare 29 commits »
happyz synced commits to refs/pull/20644/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:03 -07:00
caa8fba0cc Added check for dst_t to cuda_cast template for float
7fd898beac Renamed k to ne
53450f12f1 Removed stale code
2d2d9c2062 common : add a WARNING for HF cache migration (#20935)
Compare 12 commits »
happyz synced commits to refs/pull/20590/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:02 -07:00
312d870a89 common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912)
7cadbfce10 hexagon: general DMA and Binary Op fixes for large strides (#20918)
1fb2290a51 Add codeowners for scripts/snapdragon and docs/snapdragon (#20915)
1772701f99 opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
Compare 10 commits »
happyz synced commits to refs/pull/20596/merge at happyz/llama.cpp from mirror 2026-03-24 07:02:02 -07:00
312d870a89 common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912)
7cadbfce10 hexagon: general DMA and Binary Op fixes for large strides (#20918)
1fb2290a51 Add codeowners for scripts/snapdragon and docs/snapdragon (#20915)
1772701f99 opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
Compare 10 commits »