Commit Graph

13 Commits

Author SHA1 Message Date
Georgi Gerganov b8eb3b3501
wip fix tests 2025-12-06 16:13:27 +02:00
Oliver Simons 7668999518
Merge branch 'master' into gpu-sampling
Let's keep `master`'s cumsum implementation for its likely better AMD
perf and add back pure-CUB-implementation in follow-up commit
2025-12-05 14:41:08 +01:00
Georgi Gerganov 6958d41366
sampling : check backend support during init 2025-12-04 17:29:08 +02:00
Xuan-Son Nguyen c4c10bfb86
server: move msg diffs tracking to HTTP thread (#17740)
* server: move msg diffs tracking to HTTP thread

* wip

* tool call tests ok

* minor : style

* cont : fix

* move states to server_response_reader

* add safe-guard

* fix

* fix 2

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-12-04 15:46:08 +01:00
Daniel Bevenius c0b182f4d6
Merge remote-tracking branch 'upstream/master' into backend-sampling 2025-12-04 08:17:50 +01:00
Xuan-Son Nguyen 13628d8bdb
server: add --media-path for local media files (#17697)
* server: add --media-path for local media files

* remove unused fn
2025-12-02 22:49:20 +01:00
Daniel Bevenius 2595818a68
Merge remote-tracking branch 'upstream/master' into backend-sampling 2025-12-02 12:07:01 +01:00
Xuan-Son Nguyen 5d6bd842ea
server: remove default "gpt-3.5-turbo" model name (#17668)
* server: remove default "gpt-3.5-turbo" model name

* do not reflect back model name from request

* fix test
2025-12-02 11:38:57 +01:00
Daniel Bevenius 3e9a258c14
Merge remote-tracking branch 'upstream/master' into gpu-sampling 2025-12-02 09:26:04 +01:00
Xuan-Son Nguyen ecf74a8417
mtmd: add mtmd_context_params::warmup option (#17652)
* mtmd: add mtmd_context_params::warmup option

* reuse the common_params::warmup
2025-12-01 21:32:25 +01:00
Georgi Gerganov c187003d81
llama : naming 2025-11-30 00:05:47 +02:00
Georgi Gerganov 467746e3ad
Merge branch 'master' into HEAD 2025-11-29 23:17:25 +02:00
Xuan-Son Nguyen ab49f094d2
server: move server-context to its own cpp|h (#17595)
* git mv

* add server-context.h

* add server-context.h

* clean up headers

* cont : cleanup

* also expose server_response_reader (to be used by CLI)

* fix windows build

* decouple server_routes and server_http

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-29 22:04:44 +01:00