llama.cpp

Commit Graph

Author	SHA1	Message	Date
Aleksander Grygier	8a95ec3ea6	feat: Improve MCP Server selection UI + lazy load health checks	2026-01-19 19:01:32 +01:00
Aleksander Grygier	cafb9c09d3	feat: UI improvements	2026-01-19 16:56:02 +01:00
Aleksander Grygier	54192b05fb	feat: Simplify MCP server enabling logic per chat Refactors MCP server enabling logic to remove the dependency on global settings. This simplifies the logic by directly checking the per-chat override status, and removes the need to pass the global enabled state as a parameter. Additionally: - Only shows MCP servers that are enabled in settings in the selector. - Sorts the servers by whether they are enabled for the current chat.	2026-01-19 16:43:53 +01:00
Aleksander Grygier	62ed7f112d	chore: update webui build output	2026-01-19 16:26:16 +01:00
Aleksander Grygier	d37683942b	fix: Missing onModelChange callback running assistant message re-generation	2026-01-19 16:25:49 +01:00
Pascal	d6dfe8e064	chore: update webui build output	2026-01-19 12:12:52 +01:00
Pascal	058929d453	fix: acurate tool_response display	2026-01-19 12:11:06 +01:00
Pascal	d92b621346	fix: unify MCP server label logic with simplified fallback	2026-01-18 13:10:03 +01:00
Pascal	16a03eea36	chore: update webui build output	2026-01-18 10:43:45 +01:00
Pascal	d8af98f1ed	refactor: remove multimodal validation from model selector Remove all frontend validation logic that prevented users from selecting models based on multimodal capabilities. This refactoring removes restrictive UI code while maintaining full functionality - Vision models can describe images as text - That text remains useful for non-vision models - Chaining vision -> non-vision is a valid workflow - Users know their use case better than the UI - Users can return to vision models when needed	2026-01-18 10:42:01 +01:00
Pascal	5c28b7a2ee	chore: update webui build output	2026-01-17 18:38:50 +01:00
Pascal	fca7177eae	fix: ignore assistant attachments (MCP) for modality detection	2026-01-17 18:36:41 +01:00
Pascal	3572667788	chore: update webui build output	2026-01-17 16:35:54 +01:00
Pascal	506da17931	refactor: eliminate MCP circular dependency - Change architecture from mcpStore <-> mcpClient to mcpClient -> mcpStore - Remove bidirectional callback pattern (setCallback, notify methods) - Add updateState/updateHealthCheck public methods in mcpStore - Replace callback calls with direct mcpStore method calls - Remove unused imports (browser, HealthCheckState) and constructor - Fixes CI: ReferenceError Cannot access mcpClient before initialization	2026-01-17 16:30:42 +01:00
Pascal	9b3417703f	fix: remove obsolete modality UI tests causing CI failures - Remove VisionModality/AudioModality test stories - Remove mockServerProps usage and imports - Simplify Default test (remove dropdown interaction checks) - Simplify FileAttachments test (remove mocks)	2026-01-17 16:30:36 +01:00
Pascal	a723238245	chore: update webui build output	2026-01-16 19:52:23 +01:00
Pascal	229aba7c3e	fix: strip reasoning content and UI proprietary tags from prompts TODO: add toggle and ensure backend API compliance for reasoning format	2026-01-16 19:50:36 +01:00
Pascal	f09395821b	chore: update webui build output	2026-01-16 15:22:46 +01:00
Pascal	78c6380222	refactor: remove reasoning after first turn filter	2026-01-16 15:19:50 +01:00
Pascal	2973c64609	refactor: inline reasoning with tags, remove fixed thinking field	2026-01-16 15:19:42 +01:00
Xuan-Son Nguyen	c15395f73c	common : implement new jinja template engine (#18462 ) * jinja vm * lexer * add vm types * demo * clean up * parser ok * binary_expression::execute * shadow naming * bin ops works! * fix map object * add string builtins * add more builtins * wip * use mk_val * eval with is_user_input * render gemma tmpl ok * track input string even after transformations * support binded functions * keyword arguments and slicing array * use shared_ptr for values * add mk_stmt * allow print source on exception * fix negate test * testing more templates * mostly works * add filter_statement * allow func to access ctx * add jinja-value.cpp * impl global_from_json * a lot of fixes * more tests * more fix, more tests * more fixes * rm workarounds * demo: type inferrence * add placeholder for tojson * improve function args handling * rm type inference * no more std::regex * trailing spaces * make testing more flexible * make output a bit cleaner * (wip) redirect minja calls * test: add --output * fix crash on macro kwargs * add minimal caps system * add some workarounds * rm caps_apply_workarounds * get rid of preprocessing * more fixes * fix test-chat-template * move test-chat-jinja into test-chat-template * rm test-chat-jinja from cmake * test-chat-template: use common * fix build * fix build (2) * rename vm --> interpreter * improve error reporting * correct lstrip behavior * add tojson * more fixes * disable tests for COMMON_CHAT_FORMAT_GENERIC * make sure tojson output correct order * add object.length * fully functional selectattr / rejectattr * improve error reporting * more builtins added, more fixes * create jinja rendering tests * fix testing.h path * adjust whitespace rules * more fixes * temporary disable test for ibm-granite * r/lstrip behavior matched with hf.js * minimax, glm4.5 ok * add append and pop * kimi-k2 ok * test-chat passed * fix lstrip_block * add more jinja tests * cast to unsigned char * allow dict key to be numeric * nemotron: rm windows newline * tests ok * fix test * rename interpreter --> runtime * fix build * add more checks * bring back generic format support * fix Apertus * [json.exception.out_of_range.403] key 'content' not found * rm generic test * refactor input marking * add docs * fix windows build * clarify error message * improved tests * split/rsplit with maxsplit * non-inverse maxsplit forgot to change after simplifying * implement separators for tojson and fix indent * i like to move it move it * rename null -- > none * token::eof * some nits + comments * add exception classes for lexer and parser * null -> none * rename global -> env * rm minja * update docs * docs: add input marking caveats * imlement missing jinja-tests functions * oops * support trim filter with args, remove bogus to_json reference * numerous argument fixes * updated tests * implement optional strip chars parameter * use new chars parameter * float filter also has default * always leave at least one decimal in float string * jinja : static analysis + header cleanup + minor fixes * add fuzz test * add string.cpp * fix chat_template_kwargs * nits * fix build * revert * unrevert sorry :) * add fuzz func_args, refactor to be safer * fix array.map() * loosen ensure_vals max count condition, add not impl for map(int) * hopefully fix windows * check if empty first * normalize newlines --------- Co-authored-by: Alde Rojas <hello@alde.dev> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2026-01-16 11:22:06 +01:00
Pascal	a1550ab77d	chore: update webui build output	2026-01-16 11:02:17 +01:00
Pascal	db37b712b2	feat: resolve MCP attachment images via rehype plugin LLM can reference tool-generated images using markdown links like, plugin resolves attachment names to base64 from message.extra when present, regular HTTP/data URLs pass through unchanged (no regression) - rehypeResolveAttachmentImages plugin in markdown pipeline - Pass message prop to MarkdownContent and AgenticContent - Force processor reactivity on message.extra changes - Filter assistant images from API context (display-only)	2026-01-16 10:49:28 +01:00
Pascal	a3c2144c1d	feat: persist base64 attachments from tool results	2026-01-16 08:07:20 +01:00
Pascal	a377605f60	webui: fix custom headers persistence in UI (derived)	2026-01-15 20:36:14 +01:00
Pascal	3360f60b94	webui: fix custom headers persistence in UI	2026-01-15 20:13:01 +01:00
ddh0	13f1e4a9ca	llama : add adaptive-p sampler (#17927 ) * initial commit for branch * simplify constants * add params to `struct common_params_sampling`, add reference to PR * explicitly clamp `min_target` and `max_target` to `[0.0, 1.0]` * add args, rename `queue_size` -> `window_size` * improved comments * minor * remove old unused code from algorithm * minor * add power law case to `common_sampler_init`, add sampler name mappings * clarify behaviour when `window_size = 0` * add missing enums * remove `target_range` param, make `target == 1` no-op, cleanup code * oops, straggler * add missing parameters in `server-task.cpp` * copy from author ref: https://gist.github.com/MrJackSpade/9be99c7efbba7b95a41377e123b7b069 * remove old debug log, style nit * fix compiler warning, add commented-out logging per token * re-write + change parameters + simplify * oops forgot args.cpp * fix leftover `window_size` * add missing values to `common_params_sampling::print()` * with logging * does this fix it? * no, but does this? * update default decay * optimize * fix bad merge my git skills are lacking * silence `missing initializer for member` * update default decay to 0.9 * fix logging * format (double) * add power law to the new `samplers` vector * log sampler init values * improve logging messages in llama_sampler_power_law * remove extraneous logging * simplify target computation last commit with debug logging! * remove debug logging, explicitly clamp params at init * add `use_power_law` flag + logic, minor cleanup * update `power-law` -> `adaptive-p` * fix cold start EMA - `ctx->weighted_sum` is now initialized and reset to `target / (1.0f - clamped_decay)` - `ctx->total_weight` is now initialized and reset to `1.0f / (1.0f - clamped_decay)` this fixes a "cold start" problem with the moving average * update `SHARPNESS` constant to `10.0f` * minor style fixes no functional changes * minor style fixes cont. * update `llama_sampler_adaptive_p_i` for backend sampling (ref: #17004) * separate into `apply` + `accept` functions * `pending_token_idx`: switch from `llama_token` to `int32` functionally identical (`llama.h` has `typedef int32_t llama_token;`), but its more correct now * don't transform logits <= -1e9f * fix masking in backend top-p, min-p * address review comments * typo in comments `RND` -> `RNG` * add docs * add recommended values in completion docs * address PR feedback * remove trailing whitespace (for CI `editorconfig`) * add to adaptive-p to `common_sampler_types_from_chars`	2026-01-15 19:16:29 +02:00
Aleksander Grygier	cffc3b46ae	fix: Word wrapping	2026-01-15 17:59:57 +01:00
Xuan-Son Nguyen	a04c2b06a3	server: improve slots scheduling for n_cmpl (#18789 ) * server : make sure children tasks are scheduled to launch with parent * fix * add comment pointing to this PR * fix * clean up * more debug messages * add pop_deferred_task with specific ID version * improve the logic * simple approach * no double move * correct return type of launch_slots_with_parent_task	2026-01-15 17:10:28 +01:00
Georgi Gerganov	39173bcacb	context : reserve new scheduler when graph topology changes (#18547 ) * context : reserve new scheduler when graph topology changes * cont : fix * cont : fix reserve * cont : reserve only when changes occur + timing * context : add comments * llama : reserve on sampler changes * common : allow null common_sampler * server : task declares needs (embd, logits, sampling) * server : do not init sampler if not needed * llama : fix need_reserve when unsetting a sampler * server : consolidate slot reset/clear logic	2026-01-15 16:39:17 +02:00
Aleksander Grygier	5417a439ef	chore: update webui build output	2026-01-15 11:39:10 +01:00
Aleksander Grygier	30a585bb96	feat: UI improvements	2026-01-14 17:32:57 +01:00
Aleksander Grygier	886939c550	chore: update webui build output	2026-01-14 14:39:32 +01:00
Aleksander Grygier	39848ee12f	feat: UI improvement	2026-01-14 14:26:41 +01:00
Aleksander Grygier	c1ac8d7326	chore: update webui build output	2026-01-14 13:22:01 +01:00
Aleksander Grygier	afdae742e3	Merge remote-tracking branch 'ggml-org/master' into allozaur/mcp-mvp	2026-01-14 13:20:25 +01:00
Aleksander Grygier	b11b32ea28	chore: update webui build output	2026-01-14 12:47:13 +01:00
Aleksander Grygier	06efeb6eb9	chore: update webui build output	2026-01-14 11:49:26 +01:00
Aleksander Grygier	f89bcb90ca	feat: MCP Server Details	2026-01-14 11:45:47 +01:00
Aleksander Grygier	120f3c978c	chore: update webui build output	2026-01-12 18:27:54 +01:00
Aleksander Grygier	5407b2efab	feat: MCP connection details WIP	2026-01-12 18:26:48 +01:00
Radoslav Gerganov	bcf7546160	server : add arg for disabling prompt caching (#18776 ) * server : add arg for disabling prompt caching Disabling prompt caching is useful for clients who are restricted to sending only OpenAI-compat requests and want deterministic responses. * address review comments * address review comments	2026-01-12 19:21:34 +02:00
Aleksander Grygier	0009c0c300	refactor: MCP types and health check	2026-01-12 18:12:08 +01:00
Aleksander Grygier	0180becb8b	chore: update webui build output	2026-01-12 15:26:46 +01:00
Aleksander Grygier	08c1acd1db	refactor: KeyValuePairs component	2026-01-12 15:25:43 +01:00
Aleksander Grygier	392a6dce0d	chore: update webui build output	2026-01-12 15:15:19 +01:00
Aleksander Grygier	a44332b528	refactor: DRY	2026-01-12 15:10:18 +01:00
Aleksander Grygier	80e829a248	chore: update webui build output	2026-01-12 14:49:11 +01:00
Aleksander Grygier	60ef752d0f	refactor: Architecture improvements	2026-01-12 14:45:24 +01:00
Aleksander Grygier	a63a421952	chore: update webui build output	2026-01-12 14:18:15 +01:00
Aleksander Grygier	58ab834b18	refactor: MCP state management + stores/clients relationship	2026-01-12 14:17:06 +01:00
Xuan-Son Nguyen	ce3bf9b1a4	server: update docs for sleeping [no ci] (#18777 )	2026-01-12 13:01:24 +01:00
Aleksander Grygier	9c53bd4486	chore: update webui build output	2026-01-12 11:16:18 +01:00
Aleksander Grygier	528a560a25	fix: Distinguish streaming vs incomplete tool calls in UI	2026-01-12 11:15:58 +01:00
Aleksander Grygier	aa9054367a	chore: update webui build output	2026-01-12 11:10:24 +01:00
Aleksander Grygier	cead02ee58	fix: Restore live reactive UI progress for tool calls	2026-01-12 11:07:56 +01:00
Aleksander Grygier	c6843d0054	chore: update webui build output	2026-01-12 11:02:42 +01:00
Aleksander Grygier	b5226ebd86	Merge origin/allozaur/mcp-mvp: enable streaming of tool call arguments Resolves conflicts by: - Keeping clean store architecture (agentic.svelte.ts delegates to client) - Updating agentic.client.ts to use TOOL_ARGS_START/END format - Accepting remote AgenticContent.svelte with direct JSON parsing - Updating ChatMessageAssistant to match new AgenticContent props	2026-01-12 10:55:34 +01:00
Aleksander Grygier	01dfe0ee4c	chore: update webui build output	2026-01-12 10:37:12 +01:00
Aleksander Grygier	144148125b	refactor: Cleanup	2026-01-12 10:28:59 +01:00
Pascal	a02acca38d	fix: reset tool call state between turns	2026-01-10 19:14:13 +01:00
Pascal	b7288a4dd7	webui: enable streaming of tool call arguments	2026-01-10 18:59:57 +01:00
Georgi Gerganov	f307926482	server : adjust unified KV cache tests (#18716 )	2026-01-10 17:51:56 +02:00
Xuan-Son Nguyen	9ac2693a30	server: fix n_cmpl not skipping processing prompt (#18663 ) * server: fix n_cmpl not skipping processing * fix infinite loop on empty batch * cont : init child samplers + modify child logic * cont : cleanup * cont : improve n_cmpl logic - launch the parent task first so it finds the slot with best cache - parent task waits for child tasks to be launched - when a child task finishes - remove its cache * cont : remove redundant function * cont : reduce parent checks * fix : nullptr task dereference --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2026-01-10 00:00:41 +01:00
Pascal	ec8fd7876b	Webui/file upload (#18694 ) * webui: fix restrictive file type validation * webui: simplify file processing logic * chore: update webui build output * webui: remove file picker extension whitelist (1/2) * webui: remove file picker extension whitelist (2/2) * chore: update webui build output * refactor: Cleanup * chore: update webui build output * fix: update ChatForm storybook test after removing accept attribute * chore: update webui build output * refactor: more cleanup * chore: update webui build output	2026-01-09 16:45:32 +01:00
Georgi Gerganov	53eb9435da	server : fix timing of prompt/generation (#18713 )	2026-01-09 12:59:50 +02:00
Georgi Gerganov	f5f8812f7c	server : use different seeds for child completions (#18700 ) * server : use different seeds for child completions * cont : handle default seed * cont : note	2026-01-09 09:33:50 +02:00
Pascal	74b119e81e	webui: prevent mobile dropdown immediate close on synthetic click	2026-01-08 22:48:56 +01:00
Pascal	d000d84201	webui: fix redirect to root ignoring base path	2026-01-08 15:33:23 +01:00
Aleksander Grygier	2c0add6a90	Merge remote-tracking branch 'origin/allozaur/mcp-mvp' into allozaur/mcp-mvp	2026-01-08 15:02:05 +01:00
Aleksander Grygier	e3ca595651	chore: update webui build output	2026-01-08 14:54:45 +01:00
Aleksander Grygier	6f7750489e	refactor: Types	2026-01-08 14:45:47 +01:00
Aleksander Grygier	dfd3031b17	refactor: Componentize McpServerCard	2026-01-08 14:18:30 +01:00
Aleksander Grygier	835c06e0d1	refactor: Cleanup	2026-01-08 14:18:12 +01:00
Aleksander Grygier	ddbb7dc2e5	fix: Remove redundant CSS class	2026-01-08 14:11:52 +01:00
Adrien Gallouët	55abc39355	vendor : update cpp-httplib to 0.30.0 (#18660 ) * vendor : update cpp-httplib to 0.30.0 * common : allow custom headers when downloading	2026-01-08 13:53:54 +01:00
Aleksander Grygier	bf2a793f42	refactor: Cleanup	2026-01-08 13:49:55 +01:00
Aleksander Grygier	089f38230c	feat: Add TruncatedText component	2026-01-08 13:02:46 +01:00
Aleksander Grygier	06febe08b7	fix: Collapsible box trigger	2026-01-08 12:48:15 +01:00
Aleksander Grygier	223c6333e9	refactor: Cleanup	2026-01-08 12:46:10 +01:00
Aleksander Grygier	b0ba550928	refactor: Cleanup	2026-01-08 12:03:36 +01:00
Aleksander Grygier	56b34bf63b	refactor: Collapsible Content Block & small fixes	2026-01-08 09:17:24 +01:00
Aleksander Grygier	d89ada8cee	chore: update webui build output	2026-01-07 15:46:32 +01:00
Aleksander Grygier	98bce85b1f	refactor: Cleanup	2026-01-07 15:44:23 +01:00
Aleksander Grygier	b9adc00d3f	chore: update webui build output	2026-01-07 14:27:48 +01:00
Aleksander Grygier	10e5ad1396	feat: UI improvements	2026-01-07 14:01:27 +01:00
Aleksander Grygier	bc07e0723d	feat: Always show Mcp Selector	2026-01-07 14:01:27 +01:00
Pascal	4c095df509	fix: remove double scrollbar in model selector by using Bits UI content available height	2026-01-07 12:23:03 +01:00
R	3d26a09dc7	server : add thinking content blocks to Anthropic Messages API (#18551 ) * server : add thinking content blocks to Anthropic Messages API Add support for returning reasoning/thinking content in Anthropic API responses when using models with --reasoning-format deepseek and the thinking parameter enabled. - Non-streaming: adds thinking block before text in content array - Streaming: emits thinking_delta events with correct block indices - Partial streaming: tracks reasoning state across chunks via anthropic_has_reasoning member variable Tested with bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF model. * server : fix Anthropic API streaming for thinking content blocks Add signature field and fix duplicate content_block_start events in Anthropic Messages API streaming responses for reasoning models. * server: refactor Anthropic streaming state to avoid raw pointer Replace raw pointer to task_result_state with direct field copies: - Copy state fields in update() before processing chunk - Use local copies in to_json_anthropic() instead of dereferencing - Pre-compute state updates for next chunk in update() This makes the data flow clearer and avoids unsafe pointer patterns.	2026-01-06 16:17:13 +01:00
Tarek Dakhran	73d284a250	model : add LFM2-ColBert-350M (#18607 ) * model : add LFM2-ColBert-350M * llama_model_n_embd_out() - returns `hparams.n_embd_out` if set and fallbacks to `hparams.n_embd`	2026-01-05 19:52:56 +01:00
Aleksander Grygier	2d6020b574	feat: Enable adding System Prompt per-chat	2026-01-05 14:30:11 +01:00
Vladislav Sayapin	da143b9940	server : fix router child env in containerized environments (#18562 )	2026-01-05 14:12:05 +01:00
Aleksander Grygier	469263668f	fix: UI	2026-01-05 11:59:31 +01:00
Aleksander Grygier	cf37390434	chore: update webui build output	2026-01-05 11:57:23 +01:00
Aleksander Grygier	f3734b5b7c	feat: UI improvements	2026-01-05 11:53:53 +01:00
Pascal	653f85fedd	webui: raw tool result display, strip only leading/trailing newlines to preserve indentation	2026-01-05 09:01:31 +01:00
Pascal	fc7218ae11	webui: split raw output into backend parsing and frontend display options	2026-01-05 09:01:31 +01:00
Pascal	4f9d9d41b9	webui: remove legacy wrapper and restore WebSocket transport	2026-01-05 09:01:31 +01:00
Pascal	183d9eebff	webui: remove unused imports	2026-01-05 09:01:31 +01:00
Aleksander Grygier	f7ea69fa18	chore: update webui build output	2026-01-05 09:01:31 +01:00

1 2 3 4 5 ...

477 Commits