Commit Graph

1 Commits

Author SHA1 Message Date
Xuan-Son Nguyen ddcb75dd8a
server: add auto-sleep after N seconds of idle (#18228)
* implement sleeping at queue level

* implement server-context suspend

* add test

* add docs

* optimization: add fast path

* make sure to free llama_init

* nits

* fix use-after-free

* allow /models to be accessed during sleeping, fix use-after-free

* don't allow accessing /models during sleep, it is not thread-safe

* fix data race on accessing props and model_meta

* small clean up

* trailing whitespace

* rm outdated comments
2025-12-21 02:24:42 +01:00