llama.cpp

Commit Graph

Author	SHA1	Message	Date
Christopher Albert	adef64cb9f	server: fix reasoning item content format handling for multi-turn Accept all valid reasoning item content formats in multi-turn input: - Array of objects: [{"type":"reasoning_text","text":"..."}] (spec format) - Plain string: "thinking about it" (OpenCode format) - Null: content:null with encrypted_content (Codex, openai/codex#11834) - Omitted entirely: no content field present Previously threw "item['content'] is not an array" for non-array formats, breaking OpenCode multi-turn conversations. The encrypted_content field is accepted but ignored for local models (no server-side decryption). Add 4 tests covering each format variant. Refs: openai/codex#11834, anomalyco/opencode#19081	2026-03-31 06:37:49 +02:00
Christopher Albert	35f62f9eb3	server: fix streaming event bugs and tighten test assertions Code fixes: - build_oai_resp_metadata accepts status param; completed_at is null when status is in_progress (was always set to timestamp) - response.created/in_progress events use zeroed usage (was passing actual prompt tokens before response was logically started) - Function call item IDs are now generated once per tool call in update() and reused consistently across output_item.added, function_call_arguments.delta, and output_item.done events (was generating independent random IDs in each path) - Clean up commented-out status checks in server-common.cpp Test fixes: - Assert sequence_number on every event unconditionally (was using weak "if present" guard) - Check actual values not just key presence in streaming created event test (completed_at is None, usage tokens are 0, etc.) Refs: ggml-org/llama.cpp#21174 (patrick review)	2026-03-30 18:24:39 +02:00
Christopher Albert	5d51bbef1c	server: add streaming compliance tests for Responses API - test_responses_stream_created_event_has_full_response: verify response.created contains all 24+ fields with status in_progress - test_responses_stream_all_events_have_sequence_number: every event has sequence_number and they are strictly increasing across stream - test_responses_stream_delta_events_have_indices: output_index and content_index present on all delta/added events All 14 tests pass (2 original + 9 from previous commit + 3 new).	2026-03-30 18:13:29 +02:00
Christopher Albert	467266ba4c	server: add tests for Responses API compliance and Codex compatibility Add 8 new tests covering the changes in this PR: - test_responses_schema_fields: verify all 24+ Response object fields - test_responses_stream_schema_fields: verify sequence_number, output_index, content_index on streaming events - test_responses_non_function_tool_skipped: web_search/code_interpreter tool types return 200 instead of 400 - test_responses_mixed_tool_types: non-function tools filtered, function tools retained (not rejected at parsing layer) - test_responses_extra_keys_stripped: store, include, prompt_cache_key, web_search, text, truncation, metadata don't cause errors - test_responses_developer_role: developer messages merged into system - test_responses_input_text_type: input_text accepted for EasyInputMessage - test_responses_function_call_id_fields: output items have correct ids All 10 tests pass (2 existing + 8 new).	2026-03-30 15:04:01 +02:00
손희준	fbbf3ad190	server: /v1/responses (partial) (#18486) * from previous PR * Make instruction(system) as first message * Convert [input_message] (text/image/file) * Rename convert_responses_to_chatcmpl(body) -> response_body * Initial tool call support * Erase instructions field from chatcmpl body * Feed reasoning texts to chat template * Use std::vector instead of opaque json array * Make output_item.added events consistent * Move `server_task_result_cmpl_partial::update` from header to source * Match ID of output_item.added and .done events * Add function_call only if there is no "fc_" prefix * Add function call output at non-streaming API * Test if ID is persistent * Add doc * Fix style - use trailing comma * Rewrite state management * catch up with upstream/master * Fix style - "type" is the first item of SSE data * Explicitly check "instructions" from response_body * Make lambdas static * Check if reasoning content exists * Add `oai_resp_id` to task_result_state(also initialized at ctor), server_task_result_cmpl_partial, and server_task_result_cmpl_final * Reject `input_file` since it is not supported by chatcmpl * Add "fc_" prefix to non-streaming function call id as coderabbit pointed out --------- Co-authored-by: openingnow <>	2026-01-21 17:47:23 +01:00

5 Commits
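
Commit adef64cb9f lists four accepted shapes for a reasoning item's content in multi-turn input: an array of objects, a plain string, null, and an omitted field. The sketch below illustrates that normalization in Python. It is not the server's C++ code: the function name `reasoning_text_from_item` is hypothetical, and concatenating multiple `reasoning_text` parts is an assumption, since the commit message does not say how several parts are combined. As the commit describes, `encrypted_content` is accepted but ignored.

```python
from typing import Any


def reasoning_text_from_item(item: dict[str, Any]) -> str:
    """Return the plain reasoning text of a reasoning item, or "" if none.

    Hypothetical helper for illustration only; not part of llama.cpp.
    """
    # .get() returns None both when "content" is omitted and when it is null,
    # so the Codex-style {"content": null, "encrypted_content": ...} shape and
    # the "no content field" shape are handled by the same branch.
    content = item.get("content")
    if content is None:
        return ""

    # Plain string (the OpenCode format named in the commit message).
    if isinstance(content, str):
        return content

    # Array of objects (the spec format): keep only reasoning_text parts.
    if isinstance(content, list):
        parts = []
        for part in content:
            if isinstance(part, dict) and part.get("type") == "reasoning_text":
                parts.append(str(part.get("text", "")))
        # Assumption: multiple parts are joined without a separator.
        return "".join(parts)

    raise ValueError("item['content'] must be an array, a string, or null")
```

Under these assumptions, all four variants from the commit message yield a string, and only an unexpected type such as a number is rejected.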