Xuan Son Nguyen
cbb37dd4cd
improve function args handling
2025-12-31 11:29:40 +01:00
Xuan Son Nguyen
1b213ae5e7
add placeholder for tojson
2025-12-30 21:52:47 +01:00
Xuan Son Nguyen
4479c382ce
demo: type inference
2025-12-30 17:26:23 +01:00
Xuan Son Nguyen
9c0fa6f810
rm workarounds
2025-12-30 16:07:23 +01:00
Xuan Son Nguyen
9e9a70f72f
more fixes
2025-12-29 15:07:18 +01:00
Xuan Son Nguyen
026730e8e3
more fix, more tests
2025-12-29 12:53:31 +01:00
Xuan Son Nguyen
1cf25734a9
more tests
2025-12-29 10:53:32 +01:00
Xuan Son Nguyen
2a31c9a30c
a lot of fixes
2025-12-29 00:38:29 +01:00
Xuan Son Nguyen
1784a57e7b
impl global_from_json
2025-12-28 23:15:48 +01:00
Xuan Son Nguyen
55fe96a9df
add jinja-value.cpp
2025-12-28 22:49:31 +01:00
Xuan Son Nguyen
c7f246e7a5
allow func to access ctx
2025-12-28 22:15:10 +01:00
Xuan Son Nguyen
adad34f64d
add filter_statement
2025-12-28 22:02:22 +01:00
Xuan Son Nguyen
9a8a45ff3b
mostly works
2025-12-28 21:32:55 +01:00
Xuan Son Nguyen
45df0c91e7
testing more templates
2025-12-28 19:50:09 +01:00
Xuan Son Nguyen
db09a7468d
fix negate test
2025-12-28 19:07:01 +01:00
Xuan Son Nguyen
acb0effa25
allow print source on exception
2025-12-28 18:45:41 +01:00
Xuan Son Nguyen
64e29a5848
add mk_stmt
2025-12-28 17:48:14 +01:00
Xuan Son Nguyen
7f17608ea4
use shared_ptr for values
2025-12-28 17:46:25 +01:00
Xuan Son Nguyen
4331e9c8e9
keyword arguments and slicing array
2025-12-28 17:23:29 +01:00
Xuan Son Nguyen
45c194622e
support bound functions
2025-12-28 15:33:14 +01:00
Xuan Son Nguyen
4ca114b095
track input string even after transformations
2025-12-28 12:48:35 +01:00
Xuan Son Nguyen
81310d29c1
render gemma tmpl ok
2025-12-28 12:04:23 +01:00
Xuan Son Nguyen
10835f2720
eval with is_user_input
2025-12-27 23:25:20 +01:00
Xuan Son Nguyen
c08f4ddf01
use mk_val
2025-12-27 22:28:54 +01:00
Xuan Son Nguyen
da7bbe5813
wip
2025-12-27 22:25:19 +01:00
Xuan Son Nguyen
7ed11f78f9
add more builtins
2025-12-27 22:10:45 +01:00
Xuan Son Nguyen
15b3dbab05
add string builtins
2025-12-27 21:52:50 +01:00
Xuan Son Nguyen
5a041e65b8
fix map object
2025-12-27 20:38:06 +01:00
Xuan Son Nguyen
d8ef00e610
bin ops works!
2025-12-27 20:16:46 +01:00
Xuan Son Nguyen
8d1e9a0d12
shadow naming
2025-12-27 16:06:23 +01:00
Xuan Son Nguyen
7ad6eb39ca
binary_expression::execute
2025-12-27 16:00:07 +01:00
Xuan Son Nguyen
8cea1ed6b0
parser ok
2025-12-27 12:55:01 +01:00
Xuan Son Nguyen
7ac8e98b28
clean up
2025-12-27 12:35:19 +01:00
Xuan Son Nguyen
a6e0ae7a85
demo
2025-12-27 12:22:34 +01:00
Xuan Son Nguyen
a35fcb00b5
add vm types
2025-12-27 12:12:07 +01:00
Xuan Son Nguyen
15b7c50e95
lexer
2025-12-25 21:08:51 +01:00
Xuan Son Nguyen
8d8030142e
jinja vm
2025-12-25 00:19:23 +01:00
Xuan-Son Nguyen
4cbafad4f0
model: support MiMo-V2-Flash (#18328)
* mimov2: convert ok
* rename mimov2 --> mimo2
* fix conversion
* runnable not incorrect
* use sink
* add_sliding_window_pattern
* add swa and per-layer n_head_kv
* correct params
* somewhat working
* correct gating func
* nits
* mimo2: wire RMS eps + MoE bias + converter guards
* add co-author
Co-authored-by: Aaryan-Kapoor <Aaryan-Kapoor@users.noreply.github.com>
* use add_rope_freq_base_swa
---------
Co-authored-by: Aaryan Kapoor <aaryankapoor2006@gmail.com>
Co-authored-by: Aaryan-Kapoor <Aaryan-Kapoor@users.noreply.github.com>
2025-12-24 23:07:08 +01:00
Aadeshveer Singh
c184284230
fit-params : fix race condition in fit-params output (#18276)
2025-12-24 15:57:38 +01:00
Aman Gupta
c8a2417d7b
CUDA: experimental native mxfp4 support for blackwell (#17906)
* CUDA: experimental native mxfp4 support for blackwell
* optimize load_tiles
* optimize quantize_mxfp4
* cleanup
* first pass review: formatting
* use interleaved layout for mma
* mmq: add assert for size
* use __nv_fp4x4_e2m1
* use iter_k as 512, cleanup
* Use 1200 as blackwell instead of 1000
* address review comments
* mmq: fix stride
* quantize.cu: use reference impl of e8m0 scale
* address review comments
* add 120f-virtual + minor fixes
---------
Co-authored-by: Aman Gupta <aman>
2025-12-24 22:28:26 +08:00
Saba Fallah
54132f1b1f
model : support for LlamaBidirectionalModel architecture (#18220)
* model: llama-embed-nemotron
* minor: python lint
* changed arch-name
* templated llm_build_llama to be used for both llama and llama-embed arch
2025-12-24 14:02:36 +01:00
Jeff Bolz
2a9ea2020c
vulkan: fix command buffer corruption in ggml_backend_vk_event_wait (#18302)
2025-12-24 12:36:34 +01:00
Wang Weixuan
ce7a6dc0fc
CANN : refactor ACL graph cache (#17752)
Move the graph property checking code into methods of LRU cache.
Signed-off-by: Wang Weixuan <wangweixvan@gmail.com>
2025-12-24 17:50:24 +08:00
Jesse Ikonen
1ce0126b18
docs: Fix typos in SYCL documentation (#18269)
2025-12-24 17:19:47 +08:00
Ruben Ortlam
7f459c98e7
vulkan: use fewer FA rows for small cache runs (#18280)
2025-12-24 08:59:14 +01:00
TianHao324
cf2ffc02bc
CANN: Uses yarn_ramp cache in ROPE (#17725)
2025-12-24 14:55:33 +08:00
ddh0
10355dc7d0
common: add `LLAMA_ARG_OVERRIDE_TENSOR` env var for `-ot` arg (#18267)
2025-12-24 14:19:12 +08:00
Xuan-Son Nguyen
5ee4e43f26
server: return_progress to also report 0% processing state (#18305)
2025-12-23 21:49:05 +01:00
Pascal
5b6c9bc0f3
webui: apply webui_settings on first load (#18223)
* webui: apply webui_settings on first load
The webui_settings from /props were not applied on initial load
when default_generation_settings.params was null.
Now syncs whenever serverProps is available, regardless of params;
works for both single-model and router modes.
* chore: update webui build output
2025-12-23 15:48:03 +01:00
Xuan-Son Nguyen
849d021104
server: fix crash with model not having BOS/EOS (#18321)
2025-12-23 14:39:36 +01:00