Georgi Gerganov
b8eb3b3501
wip fix tests
2025-12-06 16:13:27 +02:00
Oliver Simons
7668999518
Merge branch 'master' into gpu-sampling
...
Let's keep `master`'s cumsum implementation for its likely better AMD
perf and add back the pure-CUB implementation in a follow-up commit
2025-12-05 14:41:08 +01:00
Georgi Gerganov
6958d41366
sampling : check backend support during init
2025-12-04 17:29:08 +02:00
Xuan-Son Nguyen
c4c10bfb86
server: move msg diffs tracking to HTTP thread (#17740)
...
* server: move msg diffs tracking to HTTP thread
* wip
* tool call tests ok
* minor : style
* cont : fix
* move states to server_response_reader
* add safe-guard
* fix
* fix 2
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-12-04 15:46:08 +01:00
Daniel Bevenius
c0b182f4d6
Merge remote-tracking branch 'upstream/master' into backend-sampling
2025-12-04 08:17:50 +01:00
Xuan-Son Nguyen
13628d8bdb
server: add --media-path for local media files (#17697)
...
* server: add --media-path for local media files
* remove unused fn
2025-12-02 22:49:20 +01:00
Daniel Bevenius
2595818a68
Merge remote-tracking branch 'upstream/master' into backend-sampling
2025-12-02 12:07:01 +01:00
Xuan-Son Nguyen
5d6bd842ea
server: remove default "gpt-3.5-turbo" model name (#17668)
...
* server: remove default "gpt-3.5-turbo" model name
* do not reflect back model name from request
* fix test
2025-12-02 11:38:57 +01:00
Daniel Bevenius
3e9a258c14
Merge remote-tracking branch 'upstream/master' into gpu-sampling
2025-12-02 09:26:04 +01:00
Xuan-Son Nguyen
ecf74a8417
mtmd: add mtmd_context_params::warmup option (#17652)
...
* mtmd: add mtmd_context_params::warmup option
* reuse the common_params::warmup
2025-12-01 21:32:25 +01:00
Georgi Gerganov
c187003d81
llama : naming
2025-11-30 00:05:47 +02:00
Georgi Gerganov
467746e3ad
Merge branch 'master' into HEAD
2025-11-29 23:17:25 +02:00
Xuan-Son Nguyen
ab49f094d2
server: move server-context to its own cpp|h (#17595)
...
* git mv
* add server-context.h
* add server-context.h
* clean up headers
* cont : cleanup
* also expose server_response_reader (to be used by CLI)
* fix windows build
* decouple server_routes and server_http
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-29 22:04:44 +01:00