HappyZ happyz
happyz synced commits to refs/pull/19361/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:49 -08:00
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
7fcf1ef45d metal : skip loading all-zero mask (#19337)
Compare 9 commits »
happyz synced commits to refs/pull/19361/head at happyz/llama.cpp from mirror 2026-02-06 06:02:49 -08:00
b69a7156d2 server: to_json_oaicompat cached_tokens
ec919d7cbd tests : fix fetch_server_test_models.py
22cae83218 metal : adaptive CPU/GPU interleave based on number of nodes (#19369)
449ec2ab07 vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)
3795cc1e89 benches : update models + numbers (#19359)
Compare 5 commits »
happyz synced commits to refs/pull/19368/head at happyz/llama.cpp from mirror 2026-02-06 06:02:49 -08:00
ac6d09c63c move buffer_view to llama-impl.h
happyz synced commits to refs/pull/19362/head at happyz/llama.cpp from mirror 2026-02-06 06:02:49 -08:00
d176ae1c61 build: add GGML_DISABLE_MOE_SUM_CUDA compile flag for moe_sum comparison
happyz synced commits to refs/pull/19356/head at happyz/llama.cpp from mirror 2026-02-06 06:02:48 -08:00
7d0ad88bfa Add missing x86 arch-fallback
7d5ac45bda remaining comments from dev removed
21b8b4924a Reverted unintended reformat
8a5e84cb5b gemm finished
28fb08937a wip: GEMM implementation
Compare 121 commits »
happyz synced commits to refs/pull/19356/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:48 -08:00
7d0ad88bfa Add missing x86 arch-fallback
7d5ac45bda remaining comments from dev removed
21b8b4924a Reverted unintended reformat
8a5e84cb5b gemm finished
Compare 16 commits »
happyz synced commits to refs/pull/19357/head at happyz/llama.cpp from mirror 2026-02-06 06:02:48 -08:00
e514593221 fix ci build and test errors
happyz synced commits to refs/pull/19357/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:48 -08:00
e514593221 fix ci build and test errors
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
Compare 8 commits »
happyz synced commits to refs/pull/19341/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
7fcf1ef45d metal : skip loading all-zero mask (#19337)
Compare 7 commits »
happyz synced commits to refs/pull/19349/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00
06bf3796f4 unicode : MSVC regex fix (#19340)
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
Compare 8 commits »
happyz synced commits to refs/pull/19347/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
7fcf1ef45d metal : skip loading all-zero mask (#19337)
Compare 7 commits »
happyz synced commits to refs/pull/19354/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
f40f47e219 Revert "CODEOWNERS: add /docs/backend/GGML-VirtGPU/ -> kpouget"
Compare 8 commits »
happyz synced commits to refs/pull/19354/head at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00
f40f47e219 Revert "CODEOWNERS: add /docs/backend/GGML-VirtGPU/ -> kpouget"
happyz synced commits to refs/pull/19338/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:46 -08:00
06bf3796f4 unicode : MSVC regex fix (#19340)
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
dac6e60c62 Rename ne* to ne0* for consistent variable naming
e4611f3b33 Fix rope_norm
Compare 13 commits »
happyz synced commits to refs/pull/19339/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:46 -08:00
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
7fcf1ef45d metal : skip loading all-zero mask (#19337)
Compare 7 commits »
happyz synced commits to refs/pull/19340/head at happyz/llama.cpp from mirror 2026-02-06 06:02:46 -08:00
ba4853ef7a Use const_iterator and remove specializations
happyz synced commits to refs/pull/19338/head at happyz/llama.cpp from mirror 2026-02-06 06:02:46 -08:00
dac6e60c62 Rename ne* to ne0* for consistent variable naming
e4611f3b33 Fix rope_norm
99b7b155a8 Fix rope_vision
5f08773a4d Fix rope_multi
f7c330aa7e Rename variables + fix rope_neox
Compare 24 commits »
happyz synced commits to refs/pull/19326/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:45 -08:00
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
7fcf1ef45d metal : skip loading all-zero mask (#19337)
Compare 7 commits »
happyz synced commits to refs/pull/19317/head at happyz/llama.cpp from mirror 2026-02-06 06:02:45 -08:00
9c28f4c0af Merge branch 'ggml-org:master' into llama-quantize-help-cleanup
22cae83218 metal : adaptive CPU/GPU interleave based on number of nodes (#19369)
449ec2ab07 vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)
3795cc1e89 benches : update models + numbers (#19359)
b828e18c75 docker : fix vulkan build (#19352)
Compare 21 commits »
happyz synced commits to refs/pull/19317/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:45 -08:00
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
7fcf1ef45d metal : skip loading all-zero mask (#19337)
Compare 8 commits »