Commit Graph

26 Commits

Author SHA1 Message Date
Yee Man Chan 2c8cd844d0 added new names for n_experts, n_experts_used and score_func in TextModel and removed their code in KimiLinear in convert_hf_to_gguf.py. Removed unnecessary ggml_cont and GGML_ASSERT in kimi-linear.cpp 2026-02-01 08:42:01 +08:00
Yee Man Chan 6216273ede removed all ggml_cont before ggml_reshape_4d 2026-01-29 08:46:33 +08:00
Yee Man Chan a6b2c450c8 changed hparams.kda_head_dim to hparams.n_embd_head_kda. added TODO comment for class llama_graph_mem_hybrid_k 2026-01-29 08:35:35 +08:00
Yee Man Chan 0444a4faa0 remove ssm_o_norm_b 2026-01-27 13:19:55 +08:00
Yee Man Chan f1525b3695 new class llm_graph_input_mem_hybrid_k to get around the new MLA change. switch the concat order of ggml_concat calls in kimi-linear.cpp to accommodate MLA changes. Removed support for exp_probs_b.weight 2026-01-27 11:25:13 +08:00
Yee Man Chan 560190af97 fixed find_hparam calls. Fixed e_score_correction_bias to use bias instead of weight. Removed all ssm_conv bias terms. 2026-01-21 22:12:21 +08:00
Yee Man Chan 0aea18e718 merged dt_bias to SSM_DT. Do -exp(log_A) in convert_hf_to_gguf.py. 2026-01-16 12:02:27 +08:00
Yee Man Chan c163dff4c0 sync fork and comment fixing in kimi-linear.cpp 2026-01-14 18:01:44 +08:00
Yee Man Chan 2882915258 created static function causal_conv1d to abstract similar code for q/k/v 2026-01-14 17:26:00 +08:00
Yee Man Chan 18ae7f4684 removed unnecessary ggml_cont before ggml_reshape 2026-01-14 03:22:53 +08:00
Yee Man Chan 22bc582a82 return ggml_tensor * pair in kda_autoregressive and kda_chunking as in ngxson's Qwen3Next improvement 2026-01-12 20:32:19 +08:00
Yee Man Chan 59182f5e06 fix trailing whitespace 2026-01-11 22:06:48 +08:00
Yee Man Chan 93afbedc96 moved const llama_model & model; around to follow qwen3next format and see if it can pass the -Wunused-private-field error 2026-01-11 21:44:54 +08:00
Yee Man Chan 6ae66fc40d fix trailing spaces 2026-01-11 21:31:35 +08:00
Yee Man Chan b9360c7fe1 MLA KV cache support 2026-01-11 15:58:46 +08:00
Yee Man Chan d26fe50178 Moved Aqk computation out of the loop 2026-01-10 08:45:57 +08:00
Yee Man Chan 6150bb7b17 no clamp version 2026-01-09 20:11:45 +08:00
Yee Man Chan f99913dd5f replaced Akk and Aqk with mul_mat and clamp 2026-01-08 13:40:17 +08:00
Yee Man Chan 1099cbf694 build_kda_autoregressive is implemented to replace build_kda_recurrent for faster inference. sync'd to b7682 2026-01-07 18:42:31 +08:00
Yee Man Chan e3542ff8a2 fixed some comments 2026-01-06 11:35:25 +08:00
Yee Man Chan cfed14e31b naive chunking form implemented 2026-01-06 11:23:53 +08:00
Yee Man Chan aba181ebad removed LOG_INFO 2026-01-05 19:21:06 +08:00
Yee Man Chan 66c0c5d8d4 Kimi Linear backend agnostic 2026-01-05 16:35:19 +08:00
Yee Man Chan a0269af292 removed all hard code 2025-12-06 11:51:16 +08:00
Yee Man Chan 9f1265fec1 removed some hard coded code 2025-12-05 19:51:02 +08:00
Yee Man Chan 27baad43d5 kimi linear model implementation 2025-12-02 08:35:14 +08:00