llama.cpp

손희준 fbbf3ad190 server: /v1/responses (partial) (#18486)	2026-01-21 17:47:23 +01:00

* from previous PR
* Make instruction(system) as first message
* Convert [input_message] (text/image/file)
* Rename convert_responses_to_chatcmpl(body) -> response_body
* Initial tool call support
* Erase instructions field from chatcmpl body
* Feed reasoning texts to chat template
* Use std::vector instead of opaque json array
* Make output_item.added events consistent
* Move `server_task_result_cmpl_partial::update` from header to source
* Match ID of output_item.added and .done events
* Add function_call only if there is no "fc_" prefix
* Add function call output at non-streaming API
* Test if ID is persistent
* Add doc
* Fix style - use trailing comma
* Rewrite state management
* catch up with upstream/master
* Fix style - "type" is the first item of SSE data
* Explicitly check "instructions" from response_body
* Make lambdas static
* Check if reasoning content exists
* Add `oai_resp_id` to task_result_state(also initialized at ctor), server_task_result_cmpl_partial, and server_task_result_cmpl_final
* Reject `input_file` since it is not supported by chatcmpl
* Add "fc_" prefix to non-streaming function call id as coderabbit pointed out

---------

Co-authored-by: openingnow <>

batched-bench	tool/ex/tests: consistently free ctx, then model (#18168)	2025-12-22 11:00:37 +01:00
cli	cli : fix reasoning responses in CLI (#18961)	2026-01-20 18:23:25 +01:00
completion	llama : add adaptive-p sampler (#17927)	2026-01-15 19:16:29 +02:00
cvector-generator	common : refactor common_sampler + grammar logic changes (#17937)	2025-12-14 10:11:13 +02:00
export-lora	cmake : Do not install tools on iOS targets (#15903)	2025-09-16 09:54:44 +07:00
fit-params	llama-fit-params: free memory target per device (#18679)	2026-01-08 10:07:58 +01:00
gguf-split	cli: new CLI experience (#17824)	2025-12-10 15:28:59 +01:00
imatrix	common : refactor common_sampler + grammar logic changes (#17937)	2025-12-14 10:11:13 +02:00
llama-bench	Setting mmap and direct_io to false as default in llama-bench.cpp (#18841)	2026-01-16 09:46:51 +01:00
mtmd	mtmd : Fix ASR for LFM2.5-Audio-1.5B (#18876)	2026-01-16 11:23:08 +01:00
perplexity	common : refactor common_sampler + grammar logic changes (#17937)	2025-12-14 10:11:13 +02:00
quantize	quantize: prevent input/output file collision (#18451)	2025-12-31 23:29:03 +08:00
rpc	Install rpc-server when GGML_RPC is ON. (#17149)	2025-11-11 10:53:59 +00:00
server	server: /v1/responses (partial) (#18486)	2026-01-21 17:47:23 +01:00
tokenize	cmake : Do not install tools on iOS targets (#15903)	2025-09-16 09:54:44 +07:00
tts	refactor : remove libcurl, use OpenSSL when available (#18828)	2026-01-14 18:02:47 +01:00
CMakeLists.txt	cmake: only build cli when server is enabled (#18670)	2026-01-09 16:43:26 +01:00