llama.cpp/tools/server/webui/tests
Pascal d0fa2c9fbb
Send reasoning content back to the model across turns via the reasoning_content API field (#21036)
* webui: send reasoning_content back to model in context

Preserve assistant reasoning across turns by extracting it from
internal tags and sending it as a separate reasoning_content field
in the API payload. The server and its Jinja chat templates then
render it in each model's native format (e.g. <think> tags for
Qwen, GLM, DeepSeek, ...).

Adds an "Exclude reasoning from context" toggle in Settings > Developer
(off by default, so reasoning is preserved). Includes unit tests.
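
The flow above can be sketched as follows. This is a minimal illustration, not the actual webui code: the helper names (`splitReasoning`, `toPayloadMessage`) and the `ChatMessage` shape are assumptions; only the `reasoning_content` field name and the `<think>` tag come from the commit description.

```typescript
// Hypothetical sketch of preserving reasoning across turns.
// Assumed message shape; only `reasoning_content` is from the API.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
  reasoning_content?: string;
}

// Split an assistant reply into visible content and the reasoning
// captured inside an internal <think>...</think> block, if any.
function splitReasoning(raw: string): { content: string; reasoning: string | null } {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) return { content: raw, reasoning: null };
  return {
    content: raw.replace(match[0], "").trim(),
    reasoning: match[1].trim(),
  };
}

// Build the assistant message for the next request. Reasoning is kept
// unless the "Exclude reasoning from context" toggle is enabled.
function toPayloadMessage(raw: string, excludeReasoning: boolean): ChatMessage {
  const { content, reasoning } = splitReasoning(raw);
  const msg: ChatMessage = { role: "assistant", content };
  if (reasoning !== null && !excludeReasoning) {
    msg.reasoning_content = reasoning;
  }
  return msg;
}
```

With the toggle off, a reply like `<think>check the units</think>Answer: 42` would be resent as `{ role: "assistant", content: "Answer: 42", reasoning_content: "check the units" }`; with the toggle on, the `reasoning_content` field is simply omitted.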

* webui: add syncable parameter for excludeReasoningFromContext

* chore: update webui build output
2026-03-27 08:17:35 +01:00
client server: introduce API for serving / loading / unloading multiple models (#17470) 2025-12-01 19:41:04 +01:00
e2e server: introduce API for serving / loading / unloading multiple models (#17470) 2025-12-01 19:41:04 +01:00
stories Pre-MCP UI and architecture cleanup (#19685) 2026-02-17 13:47:45 +01:00
unit Send reasoning content back to the model across turns via the reasoning_content API field (#21036) 2026-03-27 08:17:35 +01:00