HappyZ

happyz synced commits to refs/pull/19504/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:28 -08:00

28b9f9ef22 Merge 3db6e5ef22 into 752584d5f5

752584d5f5 model: support GLM MoE DSA arch (NOTE: indexer is not yet supported) (#19460)

3db6e5ef22 add permuted test-case

2f0ac21d4b cuda: add support for non-contig q,k,v

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

happyz synced commits to refs/pull/19478/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00

9e092a3480 Merge 49a5ff40e2 into 752584d5f5

752584d5f5 model: support GLM MoE DSA arch (NOTE: indexer is not yet supported) (#19460)

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

happyz synced commits to refs/pull/19488/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00

0c20d4d32d Merge cbe37e3b67 into cc2aa81513

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)

happyz synced commits to refs/pull/19489/head at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00

27104fc7f3 ggml: add GGML_OP_ISTFT with CPU and Vulkan backends

happyz synced commits to refs/pull/19489/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00

c707eee74b Merge 27104fc7f3 into 752584d5f5

752584d5f5 model: support GLM MoE DSA arch (NOTE: indexer is not yet supported) (#19460)

27104fc7f3 ggml: add GGML_OP_ISTFT with CPU and Vulkan backends

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

happyz synced commits to refs/pull/19493/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00

879e5bc82b Merge c591189213 into cc2aa81513

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)

happyz synced commits to refs/pull/19440/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:26 -08:00

6d70f74ec8 Merge f5f2203ed4 into cc2aa81513

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)

happyz synced commits to refs/pull/19455/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:26 -08:00

a6e0a50c6a Merge 0b695d240d into 25224c8021

25224c8021 llama : remove deprecated codecvt (#19565)

2f5d8f8edc vendor : update BoringSSL to 0.20260211.0 (#19562)

bb96bfd361 memory : fix kv cache size for hybrid models (#19559)

0644baefde metal : improve concurrency (#19555)

happyz synced commits to refs/pull/19460/head at happyz/llama.cpp from mirror 2026-02-13 06:02:26 -08:00

1daef5f85f minor fix and cleanup

happyz synced commits to refs/pull/19472/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:26 -08:00

6cdedc79eb Merge 2162bec1fc into cc2aa81513

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)

happyz synced commits to refs/pull/19422/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:25 -08:00

c6a07e517f Merge 1b44835c2b into 752584d5f5

752584d5f5 model: support GLM MoE DSA arch (NOTE: indexer is not yet supported) (#19460)

1b44835c2b simd_gemm: convert everything to int

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

happyz synced commits to refs/pull/19427/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:25 -08:00

5c1a8aa850 Merge 15a484dee6 into cc2aa81513

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)

happyz synced commits to refs/pull/19433/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:25 -08:00

0af49e0ace Merge 170f2f9079 into cc2aa81513

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)

happyz synced commits to refs/pull/19434/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:25 -08:00

cd3e3cef70 Merge 05dfc18d55 into 25224c8021

25224c8021 llama : remove deprecated codecvt (#19565)

2f5d8f8edc vendor : update BoringSSL to 0.20260211.0 (#19562)

bb96bfd361 memory : fix kv cache size for hybrid models (#19559)

0644baefde metal : improve concurrency (#19555)

happyz synced commits to refs/pull/19418/head at happyz/llama.cpp from mirror 2026-02-13 06:02:24 -08:00

88d5e5cd36 Update ROCm docker container to 7.2 release

4d688f9ebb (webui) FEATURE: Enable adding or injecting System Message into chat (#19556)

ff599039a9 scripts : add support for forks in pr2wt.sh (#19540)

f486ce9f30 (webui) REFACTOR: UI primitives and polish (#19551)

38adc7d469 WebUI Architecture Cleanup (#19541)

happyz synced commits to refs/pull/19418/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:24 -08:00

b5797e1291 Merge 88d5e5cd36 into cc2aa81513

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)

happyz synced commits to refs/pull/19422/head at happyz/llama.cpp from mirror 2026-02-13 06:02:24 -08:00

1b44835c2b simd_gemm: convert everything to int

8d1be6c4cd use RM=4 for arm

9c660ddafe move another memset out of the loop

happyz synced commits to refs/pull/19399/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:23 -08:00

8a17adf418 Merge b4bc24e58b into 25224c8021

25224c8021 llama : remove deprecated codecvt (#19565)

2f5d8f8edc vendor : update BoringSSL to 0.20260211.0 (#19562)

bb96bfd361 memory : fix kv cache size for hybrid models (#19559)

0644baefde metal : improve concurrency (#19555)

happyz synced commits to refs/pull/19404/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:23 -08:00

6d464dbc4e Merge cfa27c64d1 into 25224c8021

25224c8021 llama : remove deprecated codecvt (#19565)

2f5d8f8edc vendor : update BoringSSL to 0.20260211.0 (#19562)

bb96bfd361 memory : fix kv cache size for hybrid models (#19559)

0644baefde metal : improve concurrency (#19555)

happyz synced commits to refs/pull/19409/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:23 -08:00

efed63bc5c Merge 1f42650078 into cc2aa81513

cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)

0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)

b2ecc0cdb4 support --verbose-prompt (#19576)

5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
