Commit Graph

703 Commits

Author SHA1 Message Date
Xuan Son Nguyen a66e4a4f5d make output a bit cleaner 2026-01-01 23:07:45 +01:00
Xuan Son Nguyen 61c25c3fbf trailing spaces 2026-01-01 22:48:42 +01:00
Xuan Son Nguyen a10fbc77a3 no more std::regex 2026-01-01 22:48:17 +01:00
Xuan Son Nguyen d34efd9626 rm type inference 2025-12-31 11:43:53 +01:00
Xuan Son Nguyen cbb37dd4cd improve function args handling 2025-12-31 11:29:40 +01:00
Xuan Son Nguyen 1b213ae5e7 add placeholder for tojson 2025-12-30 21:52:47 +01:00
Xuan Son Nguyen 4479c382ce demo: type inferrence 2025-12-30 17:26:23 +01:00
Xuan Son Nguyen 9c0fa6f810 rm workarounds 2025-12-30 16:07:23 +01:00
Xuan Son Nguyen 9e9a70f72f more fixes 2025-12-29 15:07:18 +01:00
Xuan Son Nguyen 026730e8e3 more fix, more tests 2025-12-29 12:53:31 +01:00
Xuan Son Nguyen 1cf25734a9 more tests 2025-12-29 10:53:32 +01:00
Xuan Son Nguyen 2a31c9a30c a lot of fixes 2025-12-29 00:38:29 +01:00
Xuan Son Nguyen 1784a57e7b impl global_from_json 2025-12-28 23:15:48 +01:00
Xuan Son Nguyen 55fe96a9df add jinja-value.cpp 2025-12-28 22:49:31 +01:00
Xuan Son Nguyen c7f246e7a5 allow func to access ctx 2025-12-28 22:15:10 +01:00
Xuan Son Nguyen adad34f64d add filter_statement 2025-12-28 22:02:22 +01:00
Xuan Son Nguyen 9a8a45ff3b mostly works 2025-12-28 21:32:55 +01:00
Xuan Son Nguyen 45df0c91e7 testing more templates 2025-12-28 19:50:09 +01:00
Xuan Son Nguyen db09a7468d fix negate test 2025-12-28 19:07:01 +01:00
Xuan Son Nguyen acb0effa25 allow print source on exception 2025-12-28 18:45:41 +01:00
Xuan Son Nguyen 64e29a5848 add mk_stmt 2025-12-28 17:48:14 +01:00
Xuan Son Nguyen 7f17608ea4 use shared_ptr for values 2025-12-28 17:46:25 +01:00
Xuan Son Nguyen 4331e9c8e9 keyword arguments and slicing array 2025-12-28 17:23:29 +01:00
Xuan Son Nguyen 45c194622e support binded functions 2025-12-28 15:33:14 +01:00
Xuan Son Nguyen 4ca114b095 track input string even after transformations 2025-12-28 12:48:35 +01:00
Xuan Son Nguyen 81310d29c1 render gemma tmpl ok 2025-12-28 12:04:23 +01:00
Xuan Son Nguyen 10835f2720 eval with is_user_input 2025-12-27 23:25:20 +01:00
Xuan Son Nguyen c08f4ddf01 use mk_val 2025-12-27 22:28:54 +01:00
Xuan Son Nguyen da7bbe5813 wip 2025-12-27 22:25:19 +01:00
Xuan Son Nguyen 7ed11f78f9 add more builtins 2025-12-27 22:10:45 +01:00
Xuan Son Nguyen 15b3dbab05 add string builtins 2025-12-27 21:52:50 +01:00
Xuan Son Nguyen 5a041e65b8 fix map object 2025-12-27 20:38:06 +01:00
Xuan Son Nguyen d8ef00e610 bin ops works! 2025-12-27 20:16:46 +01:00
Xuan Son Nguyen 8d1e9a0d12 shadow naming 2025-12-27 16:06:23 +01:00
Xuan Son Nguyen 7ad6eb39ca binary_expression::execute 2025-12-27 16:00:07 +01:00
Xuan Son Nguyen 8cea1ed6b0 parser ok 2025-12-27 12:55:01 +01:00
Xuan Son Nguyen 7ac8e98b28 clean up 2025-12-27 12:35:19 +01:00
Xuan Son Nguyen a6e0ae7a85 demo 2025-12-27 12:22:34 +01:00
Xuan Son Nguyen a35fcb00b5 add vm types 2025-12-27 12:12:07 +01:00
Xuan Son Nguyen 15b7c50e95 lexer 2025-12-25 21:08:51 +01:00
Xuan Son Nguyen 8d8030142e jinja vm 2025-12-25 00:19:23 +01:00
ddh0 10355dc7d0 common: add `LLAMA_ARG_OVERRIDE_TENSOR` env var for `-ot` arg (#18267) 2025-12-24 14:19:12 +08:00
Johannes Gäßler 147a521636 tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
Aldehir Rojas 9496bbb808 common : reorganize includes to prioritize vendored deps (#18222) 2025-12-20 21:43:21 -06:00
Xuan-Son Nguyen ddcb75dd8a server: add auto-sleep after N seconds of idle (#18228) 2025-12-21 02:24:42 +01:00
* implement sleeping at queue level

* implement server-context suspend

* add test

* add docs

* optimization: add fast path

* make sure to free llama_init

* nits

* fix use-after-free

* allow /models to be accessed during sleeping, fix use-after-free

* don't allow accessing /models during sleep, it is not thread-safe

* fix data race on accessing props and model_meta

* small clean up

* trailing whitespace

* rm outdated comments
Xuan-Son Nguyen 9e39a1e6a9 server: support load model on startup, support preset-only options (#18206) 2025-12-20 09:25:27 +01:00
* server: support autoload model, support preset-only options

* add docs

* load-on-startup

* fix

* Update common/arg.cpp

Co-authored-by: Pascal <admin@serveurperso.com>

---------

Co-authored-by: Pascal <admin@serveurperso.com>
Pascal 14931a826e arg: fix order to use short form before long form (#18196) 2025-12-19 18:01:56 +01:00
* arg: fix order to use short form before long form

* arg: update doc

* arg: update test-arg-parser

* arg: address review feedback from ngxson

simplified to check first.length() <= last.length() only
fixed: --sampler-seq, --rerank, --draft ordering
note: middle positions in 3+ arg sets are not verified

* arg: update doc
Xuan-Son Nguyen 98c1c7a7bf presets: refactor, allow cascade presets from different sources, add global section (#18169) 2025-12-19 12:08:20 +01:00
* presets: refactor, allow cascade presets from different sources

* update docs

* fix neg arg handling

* fix empty mmproj

* also filter out server-controlled args before to_ini()

* skip loading custom_models if not specified

* fix unset_reserved_args

* fix crash on windows
Xuan-Son Nguyen 8ea958d4d9 model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106) 2025-12-19 00:18:01 +01:00
* ASR with LFM2-Audio-1.5B

* Set rope_theta

* Fix comment

* Remove rope_theta setting

* Address PR feedback

* rename functions to conformer

* remove some redundant ggml_cont

* fix missing tensor

* add prefix "a." for conv tensors

* remove redundant reshape

* clean up

* add test model

---------

Co-authored-by: Tarek Dakhran <tarek@liquid.ai>
Xuan-Son Nguyen 4d1316c440 arg: fix ASAN error on sampler_type_names empty (#18167) 2025-12-18 14:30:32 +01:00