* webui: auto-refresh /props on inference start to resync model metadata
- Add no-cache headers to /props and /slots
- Throttle slot checks to 30s
- Prevent concurrent fetches with promise guard
- Trigger refresh from chat streaming for legacy and ModelSelector
- Show dynamic serverWarning when using cached data
* fix: restore proper legacy behavior in webui by using unified /props refresh
Updated assistant message bubbles to show each message's stored model when available,
falling back to the current server model only when the per-message value is missing
When the model selector is disabled, now fetches /props and prioritizes that model name
over chunk metadata, then persists it with the streamed message so legacy mode properly
reflects the backend configuration
* fix: detect first valid SSE chunk and refresh server props once
* fix: removed the slots availability throttle constant and state
* webui: purge ai-generated cruft
* chore: update webui static build
* webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe dialog
Extended MarkdownContent to flag previewable code languages,
add a preview button alongside copy controls, manage preview
dialog state, and share styling for the new button group
Introduced CodePreviewDialog.svelte, a sandboxed iframe modal
for rendering HTML/JS previews with consistent dialog controls
* webui: fullscreen HTML preview dialog using bits-ui
* Update tools/server/webui/src/lib/components/app/misc/CodePreviewDialog.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/components/app/misc/MarkdownContent.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* webui: pedantic style tweak for CodePreviewDialog close button
* webui: remove overengineered preview language logic
* chore: update webui static build
---------
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* webui: recognize AsciiDoc files as valid text files
* webui: add an updated static webui build
* webui: add the updated dependency list
* webui: re-add an updated static webui build
This also reverts commit 742dbb8379.
* vulkan: fuse mul_mat+add and mul_mat_id+add_id
The fusion is only applied for the mat-vec mul paths.
* Apply suggestions from code review
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* fix 32b build
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* CUDA: Remove unneded bias/gate dims in fused mmvq
Pointed out
[here](https://github.com/ggml-org/llama.cpp/pull/16847#discussion_r2476798989)
that only a single value is needed per target col per thread
* Apply suggestions from code review
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Fix "Error 991-D: extra braces are nonstandard" during compilation
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* CUDA: Volta tensor core support for MMF
* more generic checks for hardware support
* Update ggml/src/ggml-cuda/mmf.cuh
Co-authored-by: Aman Gupta <amangupta052@gmail.com>
---------
Co-authored-by: Aman Gupta <amangupta052@gmail.com>
* Experimenting crash fix
* added assert for aborting and fixed comment
* changed to check if a pipeline is empty or not
* Moved function in class definition
* replaced with is_empty
* Modified is_empty to check only unaligned pipelines
* respect input size when getting/setting tensor data
allows partial repacking/copying when get tensor size is smaller than the actual tensor
* Removed duplicate repack_mxfp4_mxfp4x4x2 function