HappyZ happyz
happyz synced commits to refs/pull/20844/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:10 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 9 commits »
happyz synced commits to refs/pull/20828/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:09 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 9 commits »
happyz synced commits to refs/pull/20830/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:09 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 8 commits »
happyz synced commits to refs/pull/20834/head at happyz/llama.cpp from mirror 2026-03-22 19:02:09 -07:00
2d1d26c1ca tests: clean up
77aded2bdc args: lessen the blow of the deprecation
e777916d2f chore: clean up refactor
Compare 3 commits »
happyz synced commits to refs/pull/20834/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:09 -07:00
2d1d26c1ca tests: clean up
77aded2bdc args: lessen the blow of the deprecation
e777916d2f chore: clean up refactor
81bc4d3ddc server: fix Host header (#20843)
Compare 6 commits »
happyz synced commits to refs/pull/20836/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:09 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 9 commits »
happyz synced commits to refs/pull/20817/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:08 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 9 commits »
happyz synced commits to refs/pull/20819/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:08 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 9 commits »
happyz synced commits to refs/pull/20822/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:08 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 9 commits »
happyz synced commits to refs/pull/20823/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:08 -07:00
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
23c9182ce8 jinja : refactor token advancement (#20864)
Compare 8 commits »
happyz synced commits to refs/pull/20799/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:07 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 9 commits »
happyz synced commits to refs/pull/20801/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:07 -07:00
81bc4d3ddc server: fix Host header (#20843)
f40a80b4f3 support bf16 and quantized type (#20803)
db9d8aa428 ggml-cuda: native bf16 flash attention for vec kernel (#20525)
ccb87fa3ee [CUDA] Increase number of output elements per-thread block if the K-dimension is small (#20635)
Compare 6 commits »
happyz synced commits to refs/pull/20802/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:07 -07:00
81bc4d3ddc server: fix Host header (#20843)
f40a80b4f3 support bf16 and quantized type (#20803)
db9d8aa428 ggml-cuda: native bf16 flash attention for vec kernel (#20525)
ccb87fa3ee [CUDA] Increase number of output elements per-thread block if the K-dimension is small (#20635)
Compare 6 commits »
happyz synced commits to refs/pull/20804/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:07 -07:00
ec2b787ebe mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
d3ac030a5d mtmd : fix LightOnOCR image preprocessing (#20877)
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
Compare 8 commits »
happyz synced commits to refs/pull/20811/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:07 -07:00
81bc4d3ddc server: fix Host header (#20843)
f40a80b4f3 support bf16 and quantized type (#20803)
db9d8aa428 ggml-cuda: native bf16 flash attention for vec kernel (#20525)
ccb87fa3ee [CUDA] Increase number of output elements per-thread block if the K-dimension is small (#20635)
Compare 5 commits »
happyz synced commits to refs/pull/20784/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:06 -07:00
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
23c9182ce8 jinja : refactor token advancement (#20864)
81bc4d3ddc server: fix Host header (#20843)
Compare 8 commits »
happyz synced commits to refs/pull/20792/merge at happyz/llama.cpp from mirror 2026-03-22 19:02:06 -07:00
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
23c9182ce8 jinja : refactor token advancement (#20864)
81bc4d3ddc server: fix Host header (#20843)
Compare 12 commits »
happyz synced commits to refs/pull/20775/head at happyz/llama.cpp from mirror 2026-03-22 19:01:46 -07:00
3645fee1ed Check all inputs
happyz synced commits to refs/pull/20773/merge at happyz/llama.cpp from mirror 2026-03-22 19:01:46 -07:00
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
23c9182ce8 jinja : refactor token advancement (#20864)
81bc4d3ddc server: fix Host header (#20843)
f40a80b4f3 support bf16 and quantized type (#20803)
Compare 8 commits »
happyz synced commits to refs/pull/20775/merge at happyz/llama.cpp from mirror 2026-03-22 19:01:46 -07:00
3645fee1ed Check all inputs
49bfddeca1 server: allow router to report child instances sleep status (#20849)
bd3f1d9d65 CUDA: fix BF16 FA compilation (#20865)
23c9182ce8 jinja : refactor token advancement (#20864)
Compare 8 commits »