HappyZ happyz
happyz synced new reference refs/tags/b8069 to happyz/llama.cpp from mirror 2026-02-16 06:02:19 -08:00
happyz synced commits to refs/pull/19645/head at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00
7d0be2c483 cuda : enable CUDA graphs for MMID BS <= 4
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
ff4affb4c1 sync : ggml
55d58599c8 ggml : bump version to 0.9.7 (ggml/1425)
Compare 12 commits »
happyz synced commits to refs/pull/19645/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
7d0be2c483 cuda : enable CUDA graphs for MMID BS <= 4
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
Compare 15 commits »
happyz synced commits to refs/pull/9206/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
ff4affb4c1 sync : ggml
Compare 90 commits »
happyz synced commits to refs/pull/19651/head at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00
e62f6849bd Merge branch 'joyai-llm-flash' of https://github.com/dranger003/llama.cpp into joyai-llm-flash
f5f2a087d8 add missing vocab type section
5255b32ccc Update convert_hf_to_gguf.py
629e49994c Update convert_hf_to_gguf_update.py
7af1cce091 llama-vocab: create a new pre-tokenizer name for joyai-llm.
Compare 9 commits »
happyz synced commits to refs/pull/19651/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
e62f6849bd Merge branch 'joyai-llm-flash' of https://github.com/dranger003/llama.cpp into joyai-llm-flash
f5f2a087d8 add missing vocab type section
5255b32ccc Update convert_hf_to_gguf.py
Compare 11 commits »
happyz synced commits to refs/pull/19622/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:17 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 5 commits »
happyz synced commits to refs/pull/19625/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:17 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 5 commits »
happyz synced commits to refs/pull/19635/head at happyz/llama.cpp from mirror 2026-02-16 06:02:17 -08:00
bdc1dda64f chat : remove dead thinking code from qwen3_coder_xml
ac0f256df0 chat : route Step-3.5-Flash to Nemotron v3 PEG parser, add tests
db38820013 common : fix Step-3.5-Flash format detection and thinking support
Compare 3 commits »
happyz synced commits to refs/pull/19635/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:17 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
bdc1dda64f chat : remove dead thinking code from qwen3_coder_xml
Compare 8 commits »
happyz synced commits to refs/pull/19616/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:16 -08:00
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 4 commits »
happyz synced commits to refs/pull/19611/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:16 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
373da0e276 Apply suggestion from @ngxson
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
Compare 6 commits »
happyz synced commits to refs/pull/19612/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:16 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 5 commits »
happyz synced commits to refs/pull/19620/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:16 -08:00
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 4 commits »
happyz synced commits to refs/pull/19608/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:15 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 5 commits »
happyz synced commits to refs/pull/19611/head at happyz/llama.cpp from mirror 2026-02-16 06:02:15 -08:00
373da0e276 Apply suggestion from @ngxson
happyz synced commits to refs/pull/19609/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:15 -08:00
2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 5 commits »
happyz synced commits to refs/pull/19597/head at happyz/llama.cpp from mirror 2026-02-16 06:02:14 -08:00
c70946a83e cont : add comments
db58c044b0 cont : keep qwen35 and qwen35moe graphs intact
3bff6927ab models : add llm_build_delta_net_base
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 19 commits »
happyz synced commits to refs/pull/19594/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:14 -08:00
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 4 commits »
happyz synced commits to refs/pull/19595/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:14 -08:00
cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)
d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)
267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)
Compare 4 commits »