HappyZ

happyz synced new reference refs/tags/b8069 to happyz/llama.cpp from mirror 2026-02-16 06:02:19 -08:00

happyz synced commits to refs/pull/19645/head at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00

7d0be2c483 cuda : enable CUDA graphs for MMID BS <= 4

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

ff4affb4c1 sync : ggml

55d58599c8 ggml : bump version to 0.9.7 (ggml/1425)

Compare 12 commits »

happyz synced commits to refs/pull/19645/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00

ad993ca286 Merge 7d0be2c483 into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

7d0be2c483 cuda : enable CUDA graphs for MMID BS <= 4

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

Compare 15 commits »

happyz synced commits to refs/pull/9206/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00

d73ce52091 Merge 74342d48c2 into cc45f2ada6

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

ff4affb4c1 sync : ggml

Compare 90 commits »

happyz synced commits to refs/pull/19651/head at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00

e62f6849bd Merge branch 'joyai-llm-flash' of https://github.com/dranger003/llama.cpp into joyai-llm-flash

f5f2a087d8 add missing vocab type section

5255b32ccc Update convert_hf_to_gguf.py

629e49994c Update convert_hf_to_gguf_update.py

7af1cce091 llama-vocab: create a new pre-tokenizer name for joyai-llm.

Compare 9 commits »

happyz synced commits to refs/pull/19651/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:18 -08:00

1e1ce8b1f3 Merge e62f6849bd into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

e62f6849bd Merge branch 'joyai-llm-flash' of https://github.com/dranger003/llama.cpp into joyai-llm-flash

f5f2a087d8 add missing vocab type section

5255b32ccc Update convert_hf_to_gguf.py

Compare 11 commits »

happyz synced commits to refs/pull/19622/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:17 -08:00

2571720a44 Merge 907e7be410 into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 5 commits »

happyz synced commits to refs/pull/19625/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:17 -08:00

856d20988b Merge 32d504cd94 into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 5 commits »

happyz synced commits to refs/pull/19635/head at happyz/llama.cpp from mirror 2026-02-16 06:02:17 -08:00

bdc1dda64f chat : remove dead thinking code from qwen3_coder_xml

ac0f256df0 chat : route Step-3.5-Flash to Nemotron v3 PEG parser, add tests

db38820013 common : fix Step-3.5-Flash format detection and thinking support

Compare 3 commits »

happyz synced commits to refs/pull/19635/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:17 -08:00

973463b53e Merge bdc1dda64f into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

bdc1dda64f chat : remove dead thinking code from qwen3_coder_xml

Compare 8 commits »

happyz synced commits to refs/pull/19616/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:16 -08:00

fc21fc9584 Merge f14fd0c7f2 into cc45f2ada6

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 4 commits »

happyz synced commits to refs/pull/19611/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:16 -08:00

a375f6ff4b Merge 373da0e276 into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

373da0e276 Apply suggestion from @ngxson

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

Compare 6 commits »

happyz synced commits to refs/pull/19612/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:16 -08:00

1f84fb9112 Merge 4343ae3d65 into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 5 commits »

happyz synced commits to refs/pull/19620/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:16 -08:00

24a46b30f8 Merge 4018b9ca80 into cc45f2ada6

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 4 commits »

happyz synced commits to refs/pull/19608/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:15 -08:00

0306d99a83 Merge 0a835c1ccd into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 5 commits »

happyz synced commits to refs/pull/19611/head at happyz/llama.cpp from mirror 2026-02-16 06:02:15 -08:00

373da0e276 Apply suggestion from @ngxson

happyz synced commits to refs/pull/19609/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:15 -08:00

df7533af58 Merge 9937626f47 into 2ba9adc093

2ba9adc093 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 5 commits »

happyz synced commits to refs/pull/19597/head at happyz/llama.cpp from mirror 2026-02-16 06:02:14 -08:00

c70946a83e cont : add comments

db58c044b0 cont : keep qwen35 and qwen35moe graphs intact

3bff6927ab models : add llm_build_delta_net_base

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 19 commits »

happyz synced commits to refs/pull/19594/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:14 -08:00

8826b36180 Merge 661e03eb1b into cc45f2ada6

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 4 commits »

happyz synced commits to refs/pull/19595/merge at happyz/llama.cpp from mirror 2026-02-16 06:02:14 -08:00

8f47409ff6 Merge 2e17f6a931 into cc45f2ada6

cc45f2ada6 models : deduplicate delta-net graphs for Qwen family (#19597)

d5dfc33027 graph : fix KQ mask, lora, cvec reuse checks (#19644)

267ba5a1d9 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (#19132)

Compare 4 commits »