Leszek Hanusz
fd3cb9bbdd
Merge branch 'master' into notebook
2026-02-17 01:57:31 +01:00
AesSedai
d612901116
perplexity: add proper batching ( #19661 )
2026-02-16 18:44:44 +02:00
Leszek Hanusz
2377b8c81e
Merge branch 'master' into notebook
2026-02-16 02:22:25 +01:00
Adrien Gallouët
9e118b97c4
build : remove LLAMA_HTTPLIB option ( #19623 )
...
This option was introduced as a workaround because cpp-httplib could not
build on visionOS. Since it has been fixed and now compiles on all platforms,
we can remove it and simplify many things.
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2026-02-15 15:38:50 +01:00
Anav Prasad
01d8eaa28d
mtmd : Add Nemotron Nano 12B v2 VL support ( #19547 )
...
* nemotron nano v2 vlm support added
* simplified code; addressed reviews
* pre-downsample position embeddings during GGUF conversion for fixed input size
2026-02-14 14:07:00 +01:00
iMil
badba89320
NetBSD build support ( #19589 )
2026-02-14 09:47:01 +01:00
Aleksander Grygier
baa12f3831
webui: Architecture and UI improvements ( #19596 )
2026-02-14 09:06:41 +01:00
Sigbjørn Skjæret
b2ecc0cdb4
support --verbose-prompt ( #19576 )
2026-02-13 12:49:10 +01:00
Aleksander Grygier
5174d7206f
webui: UI and routing fixes ( #19586 )
...
* chore: update webui build output
* fix: Scroll issues in DropdownMenuSearchable
* webui: fix redirect to root ignoring base path
* fix: Word wrapping
* fix: remove obsolete modality UI tests causing CI failures
- Remove VisionModality/AudioModality test stories
- Remove mockServerProps usage and imports
- Simplify Default test (remove dropdown interaction checks)
- Simplify FileAttachments test (remove mocks)
* feat: Improve formatting performance time
---------
Co-authored-by: Pascal <admin@serveurperso.com>
2026-02-13 12:31:00 +01:00
Aleksander Grygier
4c61875bf8
webui: Add switcher to Chat Message UI to show raw LLM output ( #19571 )
2026-02-12 19:55:51 +01:00
Aleksander Grygier
4d688f9ebb
(webui) FEATURE: Enable adding or injecting System Message into chat ( #19556 )
...
* feat: Enable adding System Prompt per-chat
* fix: Save draft message in Chat Form when adding System Prompt from new chat view
* fix: Proper system message deletion logic
* chore: Formatting
* chore: update webui build output
2026-02-12 13:56:08 +01:00
Aleksander Grygier
f486ce9f30
(webui) REFACTOR: UI primitives and polish ( #19551 )
...
* webui: UI primitives and polish (non-MCP)
* chore: update webui build output
2026-02-12 12:21:00 +01:00
Aleksander Grygier
38adc7d469
WebUI Architecture Cleanup ( #19541 )
...
* webui: architecture foundation (non-MCP core refactors)
* chore: update webui build output
2026-02-12 11:22:27 +01:00
RichardScottOZ
fa16e517a3
server : fix typo in README.md for features list ( #19510 )
...
extra l for full
2026-02-12 08:56:25 +01:00
AesSedai
e463bbdf65
model: Add Kimi-K2.5 support ( #19170 )
...
* Move dequant_model to after the text_config merge
Add new kimi-k2.5 keys to mtmd convert
Update V_MMPROJ tensor mapping for new mm_projector.proj keys
Update V_M_IMP_NORM for new mm_projector.pre_norm key
* Fix a couple of oversights
* Add image support for Kimi-K2.5
* Revert changes to KimiVLForConditionalGeneration
* Fix an assert crash
* Fix permute accidentally swapping w / h
* Kimi-K2.5: Use merged QKV for vision
* Kimi-K2.5: pre-convert vision QK to use build_rope_2d
* Kimi-K2.5: support non-interleaved rope for vision
* Kimi-K2.5: fix min / max pixel
* Kimi-K2.5: remove v/o permutes, unnecessary
* Kimi-K2.5: update permute name to match
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Kimi-K2.5: replace build_rope_2d ggml_cont with ggml_view_3d pointers
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-02-11 16:47:30 +01:00
Georgi Gerganov
6d95707827
model : fix wavtokenizer embedding notions ( #19479 )
2026-02-11 07:52:20 +02:00
JJJYmmm
fc0fe40049
models : support qwen3.5 series ( #19468 )
...
* support qwen3.5 series
* remove deepstack for now, and some code clean
* code clean
* add FULL_ATTENTION_INTERVAL metadata
* code clean
* reorder v heads for linear attention to avoid expensive interleaved repeat
2026-02-10 18:00:26 +02:00
Daniel Bevenius
66d403c480
tts : fix typos in README.md [no ci] ( #19463 )
2026-02-10 07:30:41 +01:00
Leszek Hanusz
8a6843aac1
Fix ApiChatCompletionRequest
2026-02-10 03:14:14 +01:00
Leszek Hanusz
8e125febc9
Don't use ChatService.notifyTimings
2026-02-10 01:54:05 +01:00
Leszek Hanusz
a35e4c4d81
Use a separate callbacks argument for sendCompletion
2026-02-10 01:20:14 +01:00
Leszek Hanusz
8f79f1fccb
Removing non-stream /completion implementation + fix api
2026-02-10 00:39:26 +01:00
Tarek Dakhran
262364e31d
mtmd: Implement tiling for LFM2-VL ( #19454 )
2026-02-09 17:30:32 +01:00
손희준
820ebfa6f4
Server: log when converting requests to chat completions format ( #19457 )
...
* Log converting requests
* Print as debug instead of info [no ci]
---------
Co-authored-by: openingnow <>
2026-02-09 16:22:57 +01:00
Sascha Rogmann
292f6908cd
spec : remove check rate ( #19377 )
...
* spec: remove parameter spec-ngram-check-rate
* spec : renamed statistics vars
* spec : add n_call_begin, n_call_accept
* spec : don't enable key-map-stats
2026-02-09 15:30:50 +02:00
Adrien Gallouët
5fa1c190d9
rpc : update from common.cpp ( #19400 )
...
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2026-02-08 09:06:45 +01:00
Georgi Gerganov
eb449cdfa4
server : improve context checkpoint logic ( #19408 )
2026-02-08 09:40:04 +02:00
ddh0
5999b50eb0
llama-quantize : cleanup `--help` output ( #19317 )
...
* cleanup `llama-quantize --help` output
some much needed TLC
* remove future argument
oops, spoiler
* cleanup of cleanup
2026-02-08 09:22:38 +02:00
Georgi Gerganov
dfde5993ea
common : add common_speculative_is_compat() ( #19270 )
...
* llama : add llama_memory_can_rm_suffix()
* Revert "llama : add llama_memory_can_rm_suffix()"
This reverts commit d30e59b62a.
* spec : check if the target context is compatible for spec decoding
2026-02-06 16:47:22 +02:00
Leszek Hanusz
a0c5c26fb9
Fix calculation of total tokens after undo/redo
2026-02-05 02:33:39 +01:00
Leszek Hanusz
4659a36ffd
Add 42px min height to the statistics to avoid flickering height problems + remove unused imports
2026-02-04 18:44:22 +01:00
Leszek Hanusz
77dc99cd9a
Remove [DONE] check
2026-02-04 18:11:27 +01:00
Leszek Hanusz
031e426005
Run npm run format
2026-02-04 16:31:44 +01:00
Leszek Hanusz
393faf0166
Put completion api service in separate file
2026-02-04 16:29:53 +01:00
Leszek Hanusz
251ba9d72a
Put tokenize in a separate file
2026-02-04 15:58:54 +01:00
Leszek Hanusz
efd274ab3d
chore: update webui build output
2026-02-04 14:25:20 +01:00
Daniel Bevenius
25f40ca65f
completion : simplify batch (embd) processing ( #19286 )
...
* completion : simplify batch (embd) processing
This commit simplifies the processing of embd by removing the for loop
that currently exists which uses params.n_batch as its increment. This
commit also removes the clamping of n_eval as the size of embd is always
at most the size of params.n_batch.
The motivation is to clarify the code as it is currently a little
confusing when looking at this for loop in isolation and thinking that
it can process multiple batches.
* add an assert to verify n_eval is not greater than n_batch
2026-02-04 05:43:28 +01:00
Leszek Hanusz
ad3b8df38f
Remove currentConfig.model
2026-02-04 02:03:59 +01:00
Leszek Hanusz
f20b17a087
Remove inputContent var and use tokenize only when needed
2026-02-04 01:23:24 +01:00
Leszek Hanusz
9cf4742adb
Fix tokenize with router on
2026-02-04 00:21:56 +01:00
Leszek Hanusz
03077cf297
Merge branch 'master' into notebook
2026-02-03 03:04:31 +01:00
Leszek Hanusz
210dc6a2c0
Running npm run format
2026-02-03 02:27:10 +01:00
Leszek Hanusz
9dc75f2664
Fix npm run check errors
2026-02-03 02:22:32 +01:00
Leszek Hanusz
f42d889a47
Fix vertical alignment of Generate tooltip shortcut info
2026-02-03 02:14:28 +01:00
Leszek Hanusz
fb2095e815
Show total number of tokens by using tokenizer
2026-02-03 01:50:52 +01:00
Leszek Hanusz
3657a8a7ad
Implement shortcuts for the notebook page
2026-02-02 23:59:36 +01:00
Leszek Hanusz
7892b259cb
Add last undo/redo for notebook page
2026-02-02 22:39:07 +01:00
Leszek Hanusz
f041a864ed
Use same dialog for server errors on notebook page
2026-02-02 21:29:48 +01:00
Leszek Hanusz
11e3cd81ce
Protect window from accidental closure if the notebook is not empty as it is not saved
2026-02-02 21:15:24 +01:00
Xuan-Son Nguyen
07a7412a3b
mtmd: add min/max pixels gguf metadata ( #19273 )
2026-02-02 20:59:06 +01:00