HappyZ happyz
happyz synced commits to refs/pull/19504/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:28 -08:00
752584d5f5 model: support GLM MoE DSA arch (NOTE: indexer is not yet supported) (#19460)
3db6e5ef22 add permuted test-case
2f0ac21d4b cuda: add support for non-contig q,k,v
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
Compare 17 commits »
happyz synced commits to refs/pull/19478/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00
752584d5f5 model: support GLM MoE DSA arch (NOTE: indexer is not yet supported) (#19460)
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
Compare 15 commits »
happyz synced commits to refs/pull/19488/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
Compare 14 commits »
happyz synced commits to refs/pull/19489/head at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00
27104fc7f3 ggml: add GGML_OP_ISTFT with CPU and Vulkan backends
happyz synced commits to refs/pull/19489/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00
752584d5f5 model: support GLM MoE DSA arch (NOTE: indexer is not yet supported) (#19460)
27104fc7f3 ggml: add GGML_OP_ISTFT with CPU and Vulkan backends
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
Compare 16 commits »
happyz synced commits to refs/pull/19493/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:27 -08:00
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
Compare 16 commits »
happyz synced commits to refs/pull/19440/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:26 -08:00
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
Compare 16 commits »
happyz synced commits to refs/pull/19455/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:26 -08:00
25224c8021 llama : remove deprecated codecvt (#19565)
2f5d8f8edc vendor : update BoringSSL to 0.20260211.0 (#19562)
bb96bfd361 memory : fix kv cache size for hybrid models (#19559)
0644baefde metal : improve concurrency (#19555)
Compare 9 commits »
happyz synced commits to refs/pull/19460/head at happyz/llama.cpp from mirror 2026-02-13 06:02:26 -08:00
1daef5f85f minor fix and cleanup
happyz synced commits to refs/pull/19472/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:26 -08:00
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
Compare 14 commits »
happyz synced commits to refs/pull/19422/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:25 -08:00
752584d5f5 model: support GLM MoE DSA arch (NOTE: indexer is not yet supported) (#19460)
1b44835c2b simd_gemm: convert everything to int
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
Compare 18 commits »
happyz synced commits to refs/pull/19427/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:25 -08:00
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
Compare 14 commits »
happyz synced commits to refs/pull/19433/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:25 -08:00
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
Compare 14 commits »
happyz synced commits to refs/pull/19434/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:25 -08:00
25224c8021 llama : remove deprecated codecvt (#19565)
2f5d8f8edc vendor : update BoringSSL to 0.20260211.0 (#19562)
bb96bfd361 memory : fix kv cache size for hybrid models (#19559)
0644baefde metal : improve concurrency (#19555)
Compare 8 commits »
happyz synced commits to refs/pull/19418/head at happyz/llama.cpp from mirror 2026-02-13 06:02:24 -08:00
88d5e5cd36 Update ROCm docker container to 7.2 release
4d688f9ebb (webui) FEATURE: Enable adding or injecting System Message into chat (#19556)
ff599039a9 scripts : add support for forks in pr2wt.sh (#19540)
f486ce9f30 (webui) REFACTOR: UI primitives and polish (#19551)
38adc7d469 WebUI Architecture Cleanup (#19541)
Compare 39 commits »
happyz synced commits to refs/pull/19418/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:24 -08:00
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
Compare 15 commits »
happyz synced commits to refs/pull/19422/head at happyz/llama.cpp from mirror 2026-02-13 06:02:24 -08:00
1b44835c2b simd_gemm: convert everything to int
8d1be6c4cd use RM=4 for arm
9c660ddafe move another memset out of the loop
Compare 3 commits »
happyz synced commits to refs/pull/19399/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:23 -08:00
25224c8021 llama : remove deprecated codecvt (#19565)
2f5d8f8edc vendor : update BoringSSL to 0.20260211.0 (#19562)
bb96bfd361 memory : fix kv cache size for hybrid models (#19559)
0644baefde metal : improve concurrency (#19555)
Compare 9 commits »
happyz synced commits to refs/pull/19404/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:23 -08:00
25224c8021 llama : remove deprecated codecvt (#19565)
2f5d8f8edc vendor : update BoringSSL to 0.20260211.0 (#19562)
bb96bfd361 memory : fix kv cache size for hybrid models (#19559)
0644baefde metal : improve concurrency (#19555)
Compare 9 commits »
happyz synced commits to refs/pull/19409/merge at happyz/llama.cpp from mirror 2026-02-13 06:02:23 -08:00
cc2aa81513 Fix wrong memcpy length for block_interleave == 4 (#19575)
0e21991472 fix vulkan ggml_acc only works in 3d but not 4d (#19426)
b2ecc0cdb4 support --verbose-prompt (#19576)
5065da554e CUDA: loop over ne2*ne3 in case it overflows (#19538)
Compare 14 commits »