HappyZ

happyz synced commits to refs/pull/19361/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:49 -08:00

d0b9b1bdbf Merge b69a7156d2 into 3688c4f504

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

7fcf1ef45d metal : skip loading all-zero mask (#19337)

happyz synced commits to refs/pull/19361/head at happyz/llama.cpp from mirror 2026-02-06 06:02:49 -08:00

b69a7156d2 server: to_json_oaicompat cached_tokens

ec919d7cbd tests : fix fetch_server_test_models.py

22cae83218 metal : adaptive CPU/GPU interleave based on number of nodes (#19369)

449ec2ab07 vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)

3795cc1e89 benches : update models + numbers (#19359)

happyz synced commits to refs/pull/19368/head at happyz/llama.cpp from mirror 2026-02-06 06:02:49 -08:00

ac6d09c63c move buffer_view to llama-impl.h

happyz synced commits to refs/pull/19362/head at happyz/llama.cpp from mirror 2026-02-06 06:02:49 -08:00

d176ae1c61 build: add GGML_DISABLE_MOE_SUM_CUDA compile flag for moe_sum comparison

happyz synced commits to refs/pull/19356/head at happyz/llama.cpp from mirror 2026-02-06 06:02:48 -08:00

7d0ad88bfa Add missing x86 arch-fallback

7d5ac45bda remaining comments from dev removed

21b8b4924a Reverted unintended reformat

8a5e84cb5b gemm finished

28fb08937a wip: GEMM implementation

happyz synced commits to refs/pull/19356/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:48 -08:00

84591a8fa2 Merge 7d0ad88bfa into 3688c4f504

7d0ad88bfa Add missing x86 arch-fallback

7d5ac45bda remaining comments from dev removed

21b8b4924a Reverted unintended reformat

8a5e84cb5b gemm finished

happyz synced commits to refs/pull/19357/head at happyz/llama.cpp from mirror 2026-02-06 06:02:48 -08:00

e514593221 fix ci build and test errors

happyz synced commits to refs/pull/19357/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:48 -08:00

58fa2cb67f Merge e514593221 into 3688c4f504

e514593221 fix ci build and test errors

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

happyz synced commits to refs/pull/19341/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00

218697b548 Merge 1ea10cd774 into 3688c4f504

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

7fcf1ef45d metal : skip loading all-zero mask (#19337)

happyz synced commits to refs/pull/19349/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00

ea30a5b063 Merge 76363cd0af into 06bf3796f4

06bf3796f4 unicode : MSVC regex fix (#19340)

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

happyz synced commits to refs/pull/19347/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00

f69f1b6e44 Merge dc9d90020a into 3688c4f504

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

7fcf1ef45d metal : skip loading all-zero mask (#19337)

happyz synced commits to refs/pull/19354/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00

349a829756 Merge f40f47e219 into 3688c4f504

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

f40f47e219 Revert "CODEOWNERS: add /docs/backend/GGML-VirtGPU/ -> kpouget"

happyz synced commits to refs/pull/19354/head at happyz/llama.cpp from mirror 2026-02-06 06:02:47 -08:00

f40f47e219 Revert "CODEOWNERS: add /docs/backend/GGML-VirtGPU/ -> kpouget"

happyz synced commits to refs/pull/19338/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:46 -08:00

f8dccbd1f7 Merge dac6e60c62 into 06bf3796f4

06bf3796f4 unicode : MSVC regex fix (#19340)

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

dac6e60c62 Rename ne* to ne0* for consistent variable naming

e4611f3b33 Fix rope_norm

happyz synced commits to refs/pull/19339/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:46 -08:00

8d731cad10 Merge a0c5c26fb9 into 3688c4f504

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

7fcf1ef45d metal : skip loading all-zero mask (#19337)

happyz synced commits to refs/pull/19340/head at happyz/llama.cpp from mirror 2026-02-06 06:02:46 -08:00

ba4853ef7a Use const_iterator and remove specializations

happyz synced commits to refs/pull/19338/head at happyz/llama.cpp from mirror 2026-02-06 06:02:46 -08:00

dac6e60c62 Rename ne* to ne0* for consistent variable naming

e4611f3b33 Fix rope_norm

99b7b155a8 Fix rope_vision

5f08773a4d Fix rope_multi

f7c330aa7e Rename variables + fix rope_neox

happyz synced commits to refs/pull/19326/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:45 -08:00

a5b255e94e Merge f8b02b56a9 into 3688c4f504

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

7fcf1ef45d metal : skip loading all-zero mask (#19337)

happyz synced commits to refs/pull/19317/head at happyz/llama.cpp from mirror 2026-02-06 06:02:45 -08:00

9c28f4c0af Merge branch 'ggml-org:master' into llama-quantize-help-cleanup

22cae83218 metal : adaptive CPU/GPU interleave based on number of nodes (#19369)

449ec2ab07 vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)

3795cc1e89 benches : update models + numbers (#19359)

b828e18c75 docker : fix vulkan build (#19352)

happyz synced commits to refs/pull/19317/merge at happyz/llama.cpp from mirror 2026-02-06 06:02:45 -08:00

fa48e1c43d Merge 9c28f4c0af into 3688c4f504

3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)

1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)

f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)

7fcf1ef45d metal : skip loading all-zero mask (#19337)
