Commit Graph

699 Commits

Author SHA1 Message Date
Xuan Son Nguyen cbb37dd4cd improve function args handling 2025-12-31 11:29:40 +01:00
Xuan Son Nguyen 1b213ae5e7 add placeholder for tojson 2025-12-30 21:52:47 +01:00
Xuan Son Nguyen 4479c382ce demo: type inferrence 2025-12-30 17:26:23 +01:00
Xuan Son Nguyen 9c0fa6f810 rm workarounds 2025-12-30 16:07:23 +01:00
Xuan Son Nguyen 9e9a70f72f more fixes 2025-12-29 15:07:18 +01:00
Xuan Son Nguyen 026730e8e3 more fix, more tests 2025-12-29 12:53:31 +01:00
Xuan Son Nguyen 1cf25734a9 more tests 2025-12-29 10:53:32 +01:00
Xuan Son Nguyen 2a31c9a30c a lot of fixes 2025-12-29 00:38:29 +01:00
Xuan Son Nguyen 1784a57e7b impl global_from_json 2025-12-28 23:15:48 +01:00
Xuan Son Nguyen 55fe96a9df add jinja-value.cpp 2025-12-28 22:49:31 +01:00
Xuan Son Nguyen c7f246e7a5 allow func to access ctx 2025-12-28 22:15:10 +01:00
Xuan Son Nguyen adad34f64d add filter_statement 2025-12-28 22:02:22 +01:00
Xuan Son Nguyen 9a8a45ff3b mostly works 2025-12-28 21:32:55 +01:00
Xuan Son Nguyen 45df0c91e7 testing more templates 2025-12-28 19:50:09 +01:00
Xuan Son Nguyen db09a7468d fix negate test 2025-12-28 19:07:01 +01:00
Xuan Son Nguyen acb0effa25 allow print source on exception 2025-12-28 18:45:41 +01:00
Xuan Son Nguyen 64e29a5848 add mk_stmt 2025-12-28 17:48:14 +01:00
Xuan Son Nguyen 7f17608ea4 use shared_ptr for values 2025-12-28 17:46:25 +01:00
Xuan Son Nguyen 4331e9c8e9 keyword arguments and slicing array 2025-12-28 17:23:29 +01:00
Xuan Son Nguyen 45c194622e support binded functions 2025-12-28 15:33:14 +01:00
Xuan Son Nguyen 4ca114b095 track input string even after transformations 2025-12-28 12:48:35 +01:00
Xuan Son Nguyen 81310d29c1 render gemma tmpl ok 2025-12-28 12:04:23 +01:00
Xuan Son Nguyen 10835f2720 eval with is_user_input 2025-12-27 23:25:20 +01:00
Xuan Son Nguyen c08f4ddf01 use mk_val 2025-12-27 22:28:54 +01:00
Xuan Son Nguyen da7bbe5813 wip 2025-12-27 22:25:19 +01:00
Xuan Son Nguyen 7ed11f78f9 add more builtins 2025-12-27 22:10:45 +01:00
Xuan Son Nguyen 15b3dbab05 add string builtins 2025-12-27 21:52:50 +01:00
Xuan Son Nguyen 5a041e65b8 fix map object 2025-12-27 20:38:06 +01:00
Xuan Son Nguyen d8ef00e610 bin ops works! 2025-12-27 20:16:46 +01:00
Xuan Son Nguyen 8d1e9a0d12 shadow naming 2025-12-27 16:06:23 +01:00
Xuan Son Nguyen 7ad6eb39ca binary_expression::execute 2025-12-27 16:00:07 +01:00
Xuan Son Nguyen 8cea1ed6b0 parser ok 2025-12-27 12:55:01 +01:00
Xuan Son Nguyen 7ac8e98b28 clean up 2025-12-27 12:35:19 +01:00
Xuan Son Nguyen a6e0ae7a85 demo 2025-12-27 12:22:34 +01:00
Xuan Son Nguyen a35fcb00b5 add vm types 2025-12-27 12:12:07 +01:00
Xuan Son Nguyen 15b7c50e95 lexer 2025-12-25 21:08:51 +01:00
Xuan Son Nguyen 8d8030142e jinja vm 2025-12-25 00:19:23 +01:00
ddh0 10355dc7d0 common: add `LLAMA_ARG_OVERRIDE_TENSOR` env var for `-ot` arg (#18267) 2025-12-24 14:19:12 +08:00
Johannes Gäßler 147a521636 tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
Aldehir Rojas 9496bbb808 common : reorganize includes to prioritize vendored deps (#18222) 2025-12-20 21:43:21 -06:00
Xuan-Son Nguyen ddcb75dd8a
server: add auto-sleep after N seconds of idle (#18228)
* implement sleeping at queue level

* implement server-context suspend

* add test

* add docs

* optimization: add fast path

* make sure to free llama_init

* nits

* fix use-after-free

* allow /models to be accessed during sleeping, fix use-after-free

* don't allow accessing /models during sleep, it is not thread-safe

* fix data race on accessing props and model_meta

* small clean up

* trailing whitespace

* rm outdated comments
2025-12-21 02:24:42 +01:00
Xuan-Son Nguyen 9e39a1e6a9
server: support load model on startup, support preset-only options (#18206)
* server: support autoload model, support preset-only options

* add docs

* load-on-startup

* fix

* Update common/arg.cpp

Co-authored-by: Pascal <admin@serveurperso.com>

---------

Co-authored-by: Pascal <admin@serveurperso.com>
2025-12-20 09:25:27 +01:00
Pascal 14931a826e
arg: fix order to use short form before long form (#18196)
* arg: fix order to use short form before long form

* arg: update doc

* arg: update test-arg-parser

* arg: address review feedback from ngxson

simplified to check first.length() <= last.length() only
fixed: --sampler-seq, --rerank, --draft ordering
note: middle positions in 3+ arg sets are not verified

* arg: update doc
2025-12-19 18:01:56 +01:00
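
The ordering check described in #18196 above (compare only the first and last names of an argument set) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function name `short_form_first` and the use of a `std::vector<std::string>` for the name list are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the repository): returns true when the
// first name in an argument set is no longer than the last one, i.e. the
// short form ("-s") is listed before the long form ("--seed").
// As the commit notes, middle positions in sets of 3+ names are not verified.
bool short_form_first(const std::vector<std::string> & names) {
    if (names.size() < 2) {
        return true; // a single name (or none) is trivially ordered
    }
    return names.front().length() <= names.back().length();
}
```

Under these assumptions, `short_form_first({"-s", "--seed"})` is true and `short_form_first({"--draft", "-d"})` is false.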
Xuan-Son Nguyen 98c1c7a7bf
presets: refactor, allow cascade presets from different sources, add global section (#18169)
* presets: refactor, allow cascade presets from different sources

* update docs

* fix neg arg handling

* fix empty mmproj

* also filter out server-controlled args before to_ini()

* skip loading custom_models if not specified

* fix unset_reserved_args

* fix crash on windows
2025-12-19 12:08:20 +01:00
Xuan-Son Nguyen 8ea958d4d9
model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106)
* ASR with LFM2-Audio-1.5B

* Set rope_theta

* Fix comment

* Remove rope_theta setting

* Address PR feedback

* rename functions to conformer

* remove some redundant ggml_cont

* fix missing tensor

* add prefix "a." for conv tensors

* remove redundant reshape

* clean up

* add test model

---------

Co-authored-by: Tarek Dakhran <tarek@liquid.ai>
2025-12-19 00:18:01 +01:00
Xuan-Son Nguyen 4d1316c440 arg: fix ASAN error on sampler_type_names empty (#18167) 2025-12-18 14:30:32 +01:00
Pascal 6ce3d85796
server: (webui) add --webui-config (#18028)
* server/webui: add server-side WebUI config support

Add CLI arguments --webui-config (inline JSON) and --webui-config-file
(file path) to configure WebUI default settings from server side.

Backend changes:
- Parse JSON once in server_context::load_model() for performance
- Cache parsed config in webui_settings member (zero overhead on /props)
- Add proper error handling in router mode with try/catch
- Expose webui_settings in /props endpoint for both router and child modes

Frontend changes:
- Add 14 configurable WebUI settings via parameter sync
- Add tests for webui settings extraction
- Fix subpath support with base path in API calls

Addresses feedback from @ngxson and @ggerganov

* server: address review feedback from ngxson

* server: regenerate README with llama-gen-docs
2025-12-17 21:45:45 +01:00
Georgi Gerganov 4301e27319
common : restore grammar-based rejection sampling (#18137)
* common : restart grammar-based rejection sampling

* sampling : allow null samplers
2025-12-17 19:46:00 +02:00
Johannes Gäßler a2c199e479 common: clarify instructions for bug reports (#18134) 2025-12-17 18:44:13 +01:00
Pascal 487674fbb3
common: fix --override-kv to support comma-separated values (#18056)
* common: fix --override-kv to support comma-separated values

* Update common/arg.cpp

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

* common: deprecate repeated arguments, suggest comma-separated values

* common: add comma escape support for --override-kv

* common: optimize duplicate detection with insert().second

Co-authored-by: personalmountains <46615898+personalmountains@users.noreply.github.com>

* common: migrate all repeated args to comma-separated syntax

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: personalmountains <46615898+personalmountains@users.noreply.github.com>
2025-12-17 11:36:23 +02:00
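
Two ideas named in #18056 above, comma-separated values with an escape for literal commas and duplicate detection via `insert().second`, can be sketched as below. This is an illustration only, not the code from `common/arg.cpp`: the function names are hypothetical, and the backslash escape (`\,`) is an assumption, since the commit message does not state the escape syntax.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: split on unescaped commas.
// ASSUMPTION: "\," denotes a literal comma; the real escape syntax may differ.
std::vector<std::string> split_escaped_commas(const std::string & input) {
    std::vector<std::string> parts;
    std::string cur;
    for (size_t i = 0; i < input.size(); ++i) {
        const char c = input[i];
        if (c == '\\' && i + 1 < input.size() && input[i + 1] == ',') {
            cur += ',';           // escaped comma stays inside the current value
            ++i;                  // skip the comma we just consumed
        } else if (c == ',') {
            parts.push_back(cur); // unescaped comma ends the current value
            cur.clear();
        } else {
            cur += c;
        }
    }
    parts.push_back(cur);         // last value (possibly empty)
    return parts;
}

// Hypothetical helper: collect every repeated occurrence of a value.
// std::set::insert returns a pair whose .second is false when the element
// was already present, so one call both records and tests membership.
std::vector<std::string> find_duplicates(const std::vector<std::string> & items) {
    std::set<std::string> seen;
    std::vector<std::string> dups;
    for (const auto & item : items) {
        if (!seen.insert(item).second) {
            dups.push_back(item);
        }
    }
    return dups;
}
```

Using `insert().second` avoids a separate lookup before insertion, which is presumably the optimization the commit bullet refers to.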