llama.cpp

Commit Graph

Author	SHA1	Message	Date
Pascal	3360f60b94	webui: fix custom headers persistence in UI	2026-01-15 20:13:01 +01:00
ddh0	13f1e4a9ca	llama : add adaptive-p sampler (#17927) * initial commit for branch * simplify constants * add params to `struct common_params_sampling`, add reference to PR * explicitly clamp `min_target` and `max_target` to `[0.0, 1.0]` * add args, rename `queue_size` -> `window_size` * improved comments * minor * remove old unused code from algorithm * minor * add power law case to `common_sampler_init`, add sampler name mappings * clarify behaviour when `window_size = 0` * add missing enums * remove `target_range` param, make `target == 1` no-op, cleanup code * oops, straggler * add missing parameters in `server-task.cpp` * copy from author ref: https://gist.github.com/MrJackSpade/9be99c7efbba7b95a41377e123b7b069 * remove old debug log, style nit * fix compiler warning, add commented-out logging per token * re-write + change parameters + simplify * oops forgot args.cpp * fix leftover `window_size` * add missing values to `common_params_sampling::print()` * with logging * does this fix it? * no, but does this? * update default decay * optimize * fix bad merge my git skills are lacking * silence `missing initializer for member` * update default decay to 0.9 * fix logging * format (double) * add power law to the new `samplers` vector * log sampler init values * improve logging messages in llama_sampler_power_law * remove extraneous logging * simplify target computation last commit with debug logging! * remove debug logging, explicitly clamp params at init * add `use_power_law` flag + logic, minor cleanup * update `power-law` -> `adaptive-p` * fix cold start EMA - `ctx->weighted_sum` is now initialized and reset to `target / (1.0f - clamped_decay)` - `ctx->total_weight` is now initialized and reset to `1.0f / (1.0f - clamped_decay)` this fixes a "cold start" problem with the moving average * update `SHARPNESS` constant to `10.0f` * minor style fixes no functional changes * minor style fixes cont. * update `llama_sampler_adaptive_p_i` for backend sampling (ref: #17004) * separate into `apply` + `accept` functions * `pending_token_idx`: switch from `llama_token` to `int32` functionally identical (`llama.h` has `typedef int32_t llama_token;`), but its more correct now * don't transform logits <= -1e9f * fix masking in backend top-p, min-p * address review comments * typo in comments `RND` -> `RNG` * add docs * add recommended values in completion docs * address PR feedback * remove trailing whitespace (for CI `editorconfig`) * add to adaptive-p to `common_sampler_types_from_chars`	2026-01-15 19:16:29 +02:00
Aleksander Grygier	cffc3b46ae	fix: Word wrapping	2026-01-15 17:59:57 +01:00
Xuan-Son Nguyen	a04c2b06a3	server: improve slots scheduling for n_cmpl (#18789) * server : make sure children tasks are scheduled to launch with parent * fix * add comment pointing to this PR * fix * clean up * more debug messages * add pop_deferred_task with specific ID version * improve the logic * simple approach * no double move * correct return type of launch_slots_with_parent_task	2026-01-15 17:10:28 +01:00
Georgi Gerganov	39173bcacb	context : reserve new scheduler when graph topology changes (#18547) * context : reserve new scheduler when graph topology changes * cont : fix * cont : fix reserve * cont : reserve only when changes occur + timing * context : add comments * llama : reserve on sampler changes * common : allow null common_sampler * server : task declares needs (embd, logits, sampling) * server : do not init sampler if not needed * llama : fix need_reserve when unsetting a sampler * server : consolidate slot reset/clear logic	2026-01-15 16:39:17 +02:00
Aleksander Grygier	5417a439ef	chore: update webui build output	2026-01-15 11:39:10 +01:00
Aleksander Grygier	30a585bb96	feat: UI improvements	2026-01-14 17:32:57 +01:00
Aleksander Grygier	886939c550	chore: update webui build output	2026-01-14 14:39:32 +01:00
Aleksander Grygier	39848ee12f	feat: UI improvement	2026-01-14 14:26:41 +01:00
Aleksander Grygier	c1ac8d7326	chore: update webui build output	2026-01-14 13:22:01 +01:00
Aleksander Grygier	afdae742e3	Merge remote-tracking branch 'ggml-org/master' into allozaur/mcp-mvp	2026-01-14 13:20:25 +01:00
Aleksander Grygier	b11b32ea28	chore: update webui build output	2026-01-14 12:47:13 +01:00
Aleksander Grygier	06efeb6eb9	chore: update webui build output	2026-01-14 11:49:26 +01:00
Aleksander Grygier	f89bcb90ca	feat: MCP Server Details	2026-01-14 11:45:47 +01:00
Aleksander Grygier	120f3c978c	chore: update webui build output	2026-01-12 18:27:54 +01:00
Aleksander Grygier	5407b2efab	feat: MCP connection details WIP	2026-01-12 18:26:48 +01:00
Radoslav Gerganov	bcf7546160	server : add arg for disabling prompt caching (#18776) * server : add arg for disabling prompt caching Disabling prompt caching is useful for clients who are restricted to sending only OpenAI-compat requests and want deterministic responses. * address review comments * address review comments	2026-01-12 19:21:34 +02:00
Aleksander Grygier	0009c0c300	refactor: MCP types and health check	2026-01-12 18:12:08 +01:00
Aleksander Grygier	0180becb8b	chore: update webui build output	2026-01-12 15:26:46 +01:00
Aleksander Grygier	08c1acd1db	refactor: KeyValuePairs component	2026-01-12 15:25:43 +01:00
Aleksander Grygier	392a6dce0d	chore: update webui build output	2026-01-12 15:15:19 +01:00
Aleksander Grygier	a44332b528	refactor: DRY	2026-01-12 15:10:18 +01:00
Aleksander Grygier	80e829a248	chore: update webui build output	2026-01-12 14:49:11 +01:00
Aleksander Grygier	60ef752d0f	refactor: Architecture improvements	2026-01-12 14:45:24 +01:00
Aleksander Grygier	a63a421952	chore: update webui build output	2026-01-12 14:18:15 +01:00
Aleksander Grygier	58ab834b18	refactor: MCP state management + stores/clients relationship	2026-01-12 14:17:06 +01:00
Xuan-Son Nguyen	ce3bf9b1a4	server: update docs for sleeping [no ci] (#18777)	2026-01-12 13:01:24 +01:00
Aleksander Grygier	9c53bd4486	chore: update webui build output	2026-01-12 11:16:18 +01:00
Aleksander Grygier	528a560a25	fix: Distinguish streaming vs incomplete tool calls in UI	2026-01-12 11:15:58 +01:00
Aleksander Grygier	aa9054367a	chore: update webui build output	2026-01-12 11:10:24 +01:00
Aleksander Grygier	cead02ee58	fix: Restore live reactive UI progress for tool calls	2026-01-12 11:07:56 +01:00
Aleksander Grygier	c6843d0054	chore: update webui build output	2026-01-12 11:02:42 +01:00
Aleksander Grygier	b5226ebd86	Merge origin/allozaur/mcp-mvp: enable streaming of tool call arguments Resolves conflicts by: - Keeping clean store architecture (agentic.svelte.ts delegates to client) - Updating agentic.client.ts to use TOOL_ARGS_START/END format - Accepting remote AgenticContent.svelte with direct JSON parsing - Updating ChatMessageAssistant to match new AgenticContent props	2026-01-12 10:55:34 +01:00
Aleksander Grygier	01dfe0ee4c	chore: update webui build output	2026-01-12 10:37:12 +01:00
Aleksander Grygier	144148125b	refactor: Cleanup	2026-01-12 10:28:59 +01:00
Pascal	a02acca38d	fix: reset tool call state between turns	2026-01-10 19:14:13 +01:00
Pascal	b7288a4dd7	webui: enable streaming of tool call arguments	2026-01-10 18:59:57 +01:00
Georgi Gerganov	f307926482	server : adjust unified KV cache tests (#18716)	2026-01-10 17:51:56 +02:00
Xuan-Son Nguyen	9ac2693a30	server: fix n_cmpl not skipping processing prompt (#18663) * server: fix n_cmpl not skipping processing * fix infinite loop on empty batch * cont : init child samplers + modify child logic * cont : cleanup * cont : improve n_cmpl logic - launch the parent task first so it finds the slot with best cache - parent task waits for child tasks to be launched - when a child task finishes - remove its cache * cont : remove redundant function * cont : reduce parent checks * fix : nullptr task dereference --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2026-01-10 00:00:41 +01:00
Pascal	ec8fd7876b	Webui/file upload (#18694) * webui: fix restrictive file type validation * webui: simplify file processing logic * chore: update webui build output * webui: remove file picker extension whitelist (1/2) * webui: remove file picker extension whitelist (2/2) * chore: update webui build output * refactor: Cleanup * chore: update webui build output * fix: update ChatForm storybook test after removing accept attribute * chore: update webui build output * refactor: more cleanup * chore: update webui build output	2026-01-09 16:45:32 +01:00
Georgi Gerganov	53eb9435da	server : fix timing of prompt/generation (#18713)	2026-01-09 12:59:50 +02:00
Georgi Gerganov	f5f8812f7c	server : use different seeds for child completions (#18700) * server : use different seeds for child completions * cont : handle default seed * cont : note	2026-01-09 09:33:50 +02:00
Pascal	74b119e81e	webui: prevent mobile dropdown immediate close on synthetic click	2026-01-08 22:48:56 +01:00
Pascal	d000d84201	webui: fix redirect to root ignoring base path	2026-01-08 15:33:23 +01:00
Aleksander Grygier	2c0add6a90	Merge remote-tracking branch 'origin/allozaur/mcp-mvp' into allozaur/mcp-mvp	2026-01-08 15:02:05 +01:00
Aleksander Grygier	e3ca595651	chore: update webui build output	2026-01-08 14:54:45 +01:00
Aleksander Grygier	6f7750489e	refactor: Types	2026-01-08 14:45:47 +01:00
Aleksander Grygier	dfd3031b17	refactor: Componentize McpServerCard	2026-01-08 14:18:30 +01:00
Aleksander Grygier	835c06e0d1	refactor: Cleanup	2026-01-08 14:18:12 +01:00
Aleksander Grygier	ddbb7dc2e5	fix: Remove redundant CSS class	2026-01-08 14:11:52 +01:00
Adrien Gallouët	55abc39355	vendor : update cpp-httplib to 0.30.0 (#18660) * vendor : update cpp-httplib to 0.30.0 * common : allow custom headers when downloading	2026-01-08 13:53:54 +01:00
Aleksander Grygier	bf2a793f42	refactor: Cleanup	2026-01-08 13:49:55 +01:00
Aleksander Grygier	089f38230c	feat: Add TruncatedText component	2026-01-08 13:02:46 +01:00
Aleksander Grygier	06febe08b7	fix: Collapsible box trigger	2026-01-08 12:48:15 +01:00
Aleksander Grygier	223c6333e9	refactor: Cleanup	2026-01-08 12:46:10 +01:00
Aleksander Grygier	b0ba550928	refactor: Cleanup	2026-01-08 12:03:36 +01:00
Aleksander Grygier	56b34bf63b	refactor: Collapsible Content Block & small fixes	2026-01-08 09:17:24 +01:00
Aleksander Grygier	d89ada8cee	chore: update webui build output	2026-01-07 15:46:32 +01:00
Aleksander Grygier	98bce85b1f	refactor: Cleanup	2026-01-07 15:44:23 +01:00
Aleksander Grygier	b9adc00d3f	chore: update webui build output	2026-01-07 14:27:48 +01:00
Aleksander Grygier	10e5ad1396	feat: UI improvements	2026-01-07 14:01:27 +01:00
Aleksander Grygier	bc07e0723d	feat: Always show Mcp Selector	2026-01-07 14:01:27 +01:00
Pascal	4c095df509	fix: remove double scrollbar in model selector by using Bits UI content available height	2026-01-07 12:23:03 +01:00
R	3d26a09dc7	server : add thinking content blocks to Anthropic Messages API (#18551) * server : add thinking content blocks to Anthropic Messages API Add support for returning reasoning/thinking content in Anthropic API responses when using models with --reasoning-format deepseek and the thinking parameter enabled. - Non-streaming: adds thinking block before text in content array - Streaming: emits thinking_delta events with correct block indices - Partial streaming: tracks reasoning state across chunks via anthropic_has_reasoning member variable Tested with bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF model. * server : fix Anthropic API streaming for thinking content blocks Add signature field and fix duplicate content_block_start events in Anthropic Messages API streaming responses for reasoning models. * server: refactor Anthropic streaming state to avoid raw pointer Replace raw pointer to task_result_state with direct field copies: - Copy state fields in update() before processing chunk - Use local copies in to_json_anthropic() instead of dereferencing - Pre-compute state updates for next chunk in update() This makes the data flow clearer and avoids unsafe pointer patterns.	2026-01-06 16:17:13 +01:00
Tarek Dakhran	73d284a250	model : add LFM2-ColBert-350M (#18607) * model : add LFM2-ColBert-350M * llama_model_n_embd_out() - returns `hparams.n_embd_out` if set and fallbacks to `hparams.n_embd`	2026-01-05 19:52:56 +01:00
Aleksander Grygier	2d6020b574	feat: Enable adding System Prompt per-chat	2026-01-05 14:30:11 +01:00
Vladislav Sayapin	da143b9940	server : fix router child env in containerized environments (#18562)	2026-01-05 14:12:05 +01:00
Aleksander Grygier	469263668f	fix: UI	2026-01-05 11:59:31 +01:00
Aleksander Grygier	cf37390434	chore: update webui build output	2026-01-05 11:57:23 +01:00
Aleksander Grygier	f3734b5b7c	feat: UI improvements	2026-01-05 11:53:53 +01:00
Pascal	653f85fedd	webui: raw tool result display, strip only leading/trailing newlines to preserve indentation	2026-01-05 09:01:31 +01:00
Pascal	fc7218ae11	webui: split raw output into backend parsing and frontend display options	2026-01-05 09:01:31 +01:00
Pascal	4f9d9d41b9	webui: remove legacy wrapper and restore WebSocket transport	2026-01-05 09:01:31 +01:00
Pascal	183d9eebff	webui: remove unused imports	2026-01-05 09:01:31 +01:00
Aleksander Grygier	f7ea69fa18	chore: update webui build output	2026-01-05 09:01:31 +01:00
Aleksander Grygier	c5d01fbb8f	feat: Improve agentic tool call streaming display with 'in progress' state	2026-01-05 09:01:31 +01:00
Aleksander Grygier	f755673c6f	feat: Enhance MCP server dropdown with search, popularity sorting, and per-chat overrides	2026-01-05 09:01:31 +01:00
Aleksander Grygier	81ad2d5569	feat: Add per-chat MCP server overrides	2026-01-05 09:01:31 +01:00
Aleksander Grygier	865c28a96d	chore: update webui build output	2026-01-05 09:01:31 +01:00
Aleksander Grygier	2592471d11	feat: Add image load error fallback in MarkdownContent	2026-01-05 09:01:31 +01:00
Aleksander Grygier	069be7b517	feat: Implement lazy MCP client shutdown	2026-01-05 09:01:31 +01:00
Aleksander Grygier	9571e07687	feat: Enhance tool call streaming UI and output format	2026-01-05 09:01:31 +01:00
Aleksander Grygier	260375819d	feat: Display and manage servers in ChatForm actions	2026-01-05 09:01:31 +01:00
Aleksander Grygier	74345d8785	feat: Integrate server management dialog into chat settings	2026-01-05 09:01:31 +01:00
Aleksander Grygier	dde5e1582c	feat: Implement dedicated server management UI components	2026-01-05 09:01:31 +01:00
Aleksander Grygier	c24d5e36f0	refactor: Centralize health check logic in store	2026-01-05 09:01:31 +01:00
Aleksander Grygier	f87b10ee66	feat: Enhance server config with headers and schema normalization	2026-01-05 09:01:31 +01:00
Aleksander Grygier	778ad550b1	feat: Add McpLogo Svelte component	2026-01-05 09:01:31 +01:00
Aleksander Grygier	c1c2234a62	refactor: Consolidate UI CSS classes into shared module	2026-01-05 09:01:31 +01:00
Aleksander Grygier	883d2a4f15	chore: update webui build output	2026-01-05 09:01:31 +01:00
Aleksander Grygier	7d5fd37324	feat: Raw LLM output switch per message	2026-01-05 09:01:31 +01:00
Aleksander Grygier	03464a0780	refactor: Tool call handling	2026-01-05 09:01:31 +01:00
Aleksander Grygier	3e7318f09d	docs: Update high-level architecture diagrams for MCP integration	2026-01-05 09:01:15 +01:00
Aleksander Grygier	219be7807e	feat: Add AgenticContent component for enhanced tool call rendering	2026-01-05 09:01:15 +01:00
Aleksander Grygier	52b1a1bffa	refactor: Update ChatStore to leverage mcpStore for agentic flow	2026-01-05 09:01:15 +01:00
Aleksander Grygier	60475dca3c	feat: Implement agentic orchestration within ChatService	2026-01-05 09:01:15 +01:00
Aleksander Grygier	5f5d5ab45f	feat: Introduce reactive mcpStore for client lifecycle management	2026-01-05 09:01:15 +01:00
Aleksander Grygier	9ab2326e79	feat: Refactor MCP client to use official SDK	2026-01-05 09:01:15 +01:00
Aleksander Grygier	4dbcb5cdfd	feat: Add @modelcontextprotocol/sdk and zod dependencies	2026-01-05 09:01:15 +01:00
Aleksander Grygier	8024ae540f	refactor: Update Agentic and MCP config parsing to use new utils and constants	2026-01-05 09:01:15 +01:00


452 Commits
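
Note on commit 13f1e4a9ca ("fix cold start EMA"): the message states that `weighted_sum` is initialized and reset to `target / (1 - decay)` and `total_weight` to `1 / (1 - decay)`. The sketch below illustrates why that seeding removes a cold start: those are the steady-state values of a decayed sum, so the average equals `target` before any sample arrives. This is a rough illustration only, not the llama.cpp code; the class name `DecayedMean`, the `update` rule, and the clamp bounds are assumptions not taken from the commit.

```python
class DecayedMean:
    """Exponentially decayed mean, seeded at its steady state (hypothetical sketch)."""

    def __init__(self, target: float, decay: float) -> None:
        self.target = target
        # Assumed clamp: keeps (1 - decay) away from zero. The real bounds may differ.
        self.decay = min(max(decay, 0.0), 0.99)
        self.reset()

    def reset(self) -> None:
        # Seeding described in the commit message:
        #   weighted_sum = target / (1 - decay), total_weight = 1 / (1 - decay)
        self.weighted_sum = self.target / (1.0 - self.decay)
        self.total_weight = 1.0 / (1.0 - self.decay)

    def update(self, value: float) -> None:
        # Assumed update rule: decay the history, then add the new observation.
        self.weighted_sum = self.decay * self.weighted_sum + value
        self.total_weight = self.decay * self.total_weight + 1.0

    def mean(self) -> float:
        return self.weighted_sum / self.total_weight


if __name__ == "__main__":
    ema = DecayedMean(target=0.5, decay=0.9)
    print(ema.mean())   # starts at the target rather than at zero
    ema.update(1.0)
    print(ema.mean())   # moves part of the way toward the new observation
```

Under this assumed update rule, `total_weight` stays at `1 / (1 - decay)` after every step, so the first observation carries the same weight as any later one. Seeding both accumulators at zero would instead make the mean equal to the first observation alone.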