HappyZ happyz
happyz synced commits to refs/pull/20454/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:56 -07:00
3840d3c356 Merge branch 'ggml-org:master' into feat/moe-expert-profiling
227ed28e12 webui: MCP Diagnostics improvements (#21803)
bafae27654 Remove extra conditional check on debug mode. (#21798)
Compare 4 commits »
happyz synced commits to refs/pull/20453/head at happyz/llama.cpp from mirror 2026-04-13 07:02:55 -07:00
71efc236e7 Merge branch 'ggml-org:master' into feat/qlora-training
227ed28e12 webui: MCP Diagnostics improvements (#21803)
bafae27654 Remove extra conditional check on debug mode. (#21798)
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
Compare 242 commits »
happyz synced commits to refs/pull/20242/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:55 -07:00
bafae27654 Remove extra conditional check on debug mode. (#21798)
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
21a4933042 mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)
Compare 53 commits »
happyz synced commits to refs/pull/20062/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:53 -07:00
ce8fd4b1a6 server: Expose build_info in router mode (#21835)
9f5e1edb10 CUDA: Limit DeviceSegmentedSort to immediate mode (#21718)
920b3e78cb mtmd: use causal attn for gemma 4 audio (#21824)
974c8c94cc webui: add setting for first-line chat titles (#21797)
Compare 14 commits »
happyz synced commits to refs/pull/20075/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:53 -07:00
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
21a4933042 mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)
1e9d771e2c convert : force f16 or f32 on step3-vl conv weights (#21646)
Compare 7 commits »
happyz synced commits to refs/pull/20009/head at happyz/llama.cpp from mirror 2026-04-13 07:02:52 -07:00
f2e77e90a8 Merge branch 'ggml-org:master' into qwen3-rerank-instruct
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
21a4933042 mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)
1e9d771e2c convert : force f16 or f32 on step3-vl conv weights (#21646)
Compare 589 commits »
happyz synced commits to refs/pull/20009/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:52 -07:00
f2e77e90a8 Merge branch 'ggml-org:master' into qwen3-rerank-instruct
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
21a4933042 mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)
Compare 10 commits »
happyz synced commits to refs/pull/19941/head at happyz/llama.cpp from mirror 2026-04-13 07:02:51 -07:00
6c0bec52f4 The great quant laboratory squash
aa00911d12 common : add download cancellation and temp file cleanup (#21813)
ce8fd4b1a6 server: Expose build_info in router mode (#21835)
9f5e1edb10 CUDA: Limit DeviceSegmentedSort to immediate mode (#21718)
920b3e78cb mtmd: use causal attn for gemma 4 audio (#21824)
Compare 610 commits »
happyz synced commits to refs/pull/19941/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:51 -07:00
6c0bec52f4 The great quant laboratory squash
75f3bc94e6 vulkan: Flash Attention DP4A shader for quantized KV cache (#20797)
aa00911d12 common : add download cancellation and temp file cleanup (#21813)
ce8fd4b1a6 server: Expose build_info in router mode (#21835)
Compare 547 commits »
happyz synced commits to refs/pull/19833/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:50 -07:00
75f3bc94e6 vulkan: Flash Attention DP4A shader for quantized KV cache (#20797)
aa00911d12 common : add download cancellation and temp file cleanup (#21813)
ce8fd4b1a6 server: Expose build_info in router mode (#21835)
9f5e1edb10 CUDA: Limit DeviceSegmentedSort to immediate mode (#21718)
Compare 62 commits »
happyz synced commits to refs/pull/18908/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:50 -07:00
bafae27654 Remove extra conditional check on debug mode. (#21798)
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
21a4933042 mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)
Compare 7 commits »
happyz synced commits to refs/pull/18923/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:50 -07:00
bafae27654 Remove extra conditional check on debug mode. (#21798)
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
21a4933042 mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)
Compare 112 commits »
happyz synced commits to refs/pull/19527/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:50 -07:00
aa00911d12 common : add download cancellation and temp file cleanup (#21813)
ce8fd4b1a6 server: Expose build_info in router mode (#21835)
9f5e1edb10 CUDA: Limit DeviceSegmentedSort to immediate mode (#21718)
920b3e78cb mtmd: use causal attn for gemma 4 audio (#21824)
Compare 13 commits »
happyz synced commits to refs/pull/18890/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:49 -07:00
75f3bc94e6 vulkan: Flash Attention DP4A shader for quantized KV cache (#20797)
aa00911d12 common : add download cancellation and temp file cleanup (#21813)
ce8fd4b1a6 server: Expose build_info in router mode (#21835)
9f5e1edb10 CUDA: Limit DeviceSegmentedSort to immediate mode (#21718)
Compare 18 commits »
happyz synced commits to refs/pull/18465/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:49 -07:00
920b3e78cb mtmd: use causal attn for gemma 4 audio (#21824)
974c8c94cc webui: add setting for first-line chat titles (#21797)
227ed28e12 webui: MCP Diagnostics improvements (#21803)
bafae27654 Remove extra conditional check on debug mode. (#21798)
Compare 9 commits »
happyz synced commits to refs/pull/18588/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:49 -07:00
aa00911d12 common : add download cancellation and temp file cleanup (#21813)
ce8fd4b1a6 server: Expose build_info in router mode (#21835)
9f5e1edb10 CUDA: Limit DeviceSegmentedSort to immediate mode (#21718)
920b3e78cb mtmd: use causal attn for gemma 4 audio (#21824)
Compare 17 commits »
happyz synced commits to refs/pull/17374/head at happyz/llama.cpp from mirror 2026-04-13 07:02:48 -07:00
04b7af9563 workaround unit test failure for TOP_K
b5249e9f43 Merge remote-tracking branch 'origin/master' into set-default-subgroup-for-intel
e34f042154 CUDA: fuse muls (#21665)
d132f22fc9 HIP: add CDNA4 (gfx950) architecture support for MI350X/MI355X (#21570)
d6f3030047 ggml: backend-agnostic tensor parallelism (experimental) (#19378)
Compare 20 commits »
happyz synced commits to refs/pull/17374/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:48 -07:00
04b7af9563 workaround unit test failure for TOP_K
920b3e78cb mtmd: use causal attn for gemma 4 audio (#21824)
974c8c94cc webui: add setting for first-line chat titles (#21797)
227ed28e12 webui: MCP Diagnostics improvements (#21803)
Compare 11 commits »
happyz synced commits to refs/pull/18373/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:48 -07:00
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
21a4933042 mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)
1e9d771e2c convert : force f16 or f32 on step3-vl conv weights (#21646)
Compare 36 commits »
happyz synced commits to refs/pull/16692/merge at happyz/llama.cpp from mirror 2026-04-13 07:02:47 -07:00
bafae27654 Remove extra conditional check on debug mode. (#21798)
873c825611 sycl: disable Q1_0 in backend and cleanup unused variables (#21807)
82764d8f40 mtmd: fix crash when sending image under 2x2 pixels (#21711)
21a4933042 mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) (#19441)
Compare 112 commits »