HappyZ happyz
happyz synced commits to refs/pull/19113/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:11 -08:00
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
Compare 14 commits »
happyz synced commits to refs/pull/19121/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:11 -08:00
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
Compare 8 commits »
happyz synced commits to refs/pull/19101/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:11 -08:00
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
db6adb3c88 tests: reduce number of FA test permutations (#19381)
dfde5993ea common : add common_speculative_is_compat() (#19270)
06bf3796f4 unicode : MSVC regex fix (#19340)
Compare 5 commits »
happyz synced commits to refs/pull/18993/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:10 -08:00
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
Compare 8 commits »
happyz synced commits to refs/pull/18981/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:10 -08:00
dd52e3fd0c opencl: refactor cumsum
c4b57de54c remove unused argument
afaf17d767 OpenCL: add CUMSUM op support
b83111815e model : support Step3.5-Flash (#19283)
Compare 52 commits »
happyz synced commits to refs/pull/19098/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:10 -08:00
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
Compare 8 commits »
happyz synced commits to refs/pull/18981/head at happyz/llama.cpp from mirror 2026-02-06 18:02:10 -08:00
dd52e3fd0c opencl: refactor cumsum
c4b57de54c remove unused argument
afaf17d767 OpenCL: add CUMSUM op support
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
Compare 182 commits »
happyz synced commits to refs/pull/18890/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:09 -08:00
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
Compare 8 commits »
happyz synced commits to refs/pull/18908/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:09 -08:00
db6adb3c88 tests: reduce number of FA test permutations (#19381)
dfde5993ea common : add common_speculative_is_compat() (#19270)
06bf3796f4 unicode : MSVC regex fix (#19340)
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
Compare 18 commits »
happyz synced commits to refs/pull/18963/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:09 -08:00
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
db6adb3c88 tests: reduce number of FA test permutations (#19381)
dfde5993ea common : add common_speculative_is_compat() (#19270)
06bf3796f4 unicode : MSVC regex fix (#19340)
Compare 34 commits »
happyz synced commits to refs/pull/18968/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:09 -08:00
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
db6adb3c88 tests: reduce number of FA test permutations (#19381)
dfde5993ea common : add common_speculative_is_compat() (#19270)
Compare 6 commits »
happyz synced commits to refs/pull/18862/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:08 -08:00
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
Compare 8 commits »
happyz synced commits to refs/pull/18858/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:08 -08:00
06bf3796f4 unicode : MSVC regex fix (#19340)
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
1946e46f4c vulkan: For coopmat2 FA, use fp16 accumulators for the final result (#19376)
f9bd518a6b vulkan: make FA mask/softcap enables spec constants (#19309)
Compare 12 commits »
happyz synced commits to refs/pull/18886/head at happyz/llama.cpp from mirror 2026-02-06 18:02:08 -08:00
64f05859db add llama_context graph_type
a2860dc85e Merge branch 'master' into xsn/mtp_model
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
Compare 209 commits »
happyz synced commits to refs/pull/18886/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:08 -08:00
64f05859db add llama_context graph_type
a2860dc85e Merge branch 'master' into xsn/mtp_model
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
Compare 11 commits »
happyz synced commits to refs/pull/18816/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:07 -08:00
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
Compare 8 commits »
happyz synced commits to refs/pull/18825/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:07 -08:00
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
537eadb1b9 sycl: add F16 support for GGML_OP_CEIL (#19306)
db6adb3c88 tests: reduce number of FA test permutations (#19381)
Compare 6 commits »
happyz synced commits to refs/pull/18756/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:07 -08:00
db6adb3c88 tests: reduce number of FA test permutations (#19381)
dfde5993ea common : add common_speculative_is_compat() (#19270)
06bf3796f4 unicode : MSVC regex fix (#19340)
3688c4f504 Kimi-Linear support (backend agnostic + MLA KV cache) (#18755)
Compare 20 commits »
happyz synced commits to refs/pull/18792/head at happyz/llama.cpp from mirror 2026-02-06 18:02:07 -08:00
ae33204660 Fix bad permute
6d0a9adace Change to decay mask approach
fb80f6e370 Adapt autoregressive version from @ymcki
7317889cd4 Refactor and optimize
7399880f12 Remove old methods.
Compare 10 commits »
happyz synced commits to refs/pull/18792/merge at happyz/llama.cpp from mirror 2026-02-06 18:02:07 -08:00
b83111815e model : support Step3.5-Flash (#19283)
3228e77287 gguf-py : bump sentencepiece version (#19319)
7fbd36c50c ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)
ae33204660 Fix bad permute
Compare 13 commits »