llama.cpp

Commit Graph

Author	SHA1	Message	Date
hanishkvc	2a8bd1c9e7	SimpleChatTC: Actual tool call implementations simplified These no longer need to worry about * setting up the console.log related redirection to capture the generated outputs, nor about * setting up a dynamic function for executing the needed tool call related code The web worker setup to help run tool calls in a relatively isolated environment independent of the main browser env, takes care of these. One needs to only worry about getting the handle to the web worker to use and inturn pass the need code wrt the tool call to it.	2025-12-04 19:41:39 +05:30
hanishkvc	14d67f6c3c	SimpleChatTC: Pass around structured objects wrt tool worker The request for code to run as well as the resultant response data both need to follow a structured object convention, so that it is easy to map a request and the corresponding response to some extent.	2025-12-04 19:41:39 +05:30
hanishkvc	510c65c721	SimpleChatTC: Initial skeleton of a simple toolsworker	2025-12-04 19:41:39 +05:30
hanishkvc	a6bccf934e	SimpleChatTC:ToolsConsole:Cleanup a bit, add basic set of notes Try ensure as well as verify that original console.log is saved and not overwritten. Throw an exception if things seem off wrt same. Also ensure to add a newline at end of console.log messages	2025-12-04 19:41:39 +05:30
hanishkvc	2701cb3a1e	SimpleChatTC: Move console.log trapping into its own module So that it can be used from different modules, if required.	2025-12-04 19:41:39 +05:30
hanishkvc	45d8a00738	SimpleChatTC: Update readme wrt --jinja argument and bit more	2025-12-04 19:41:39 +05:30
hanishkvc	a8c8176d09	SimpleChatTC: Tool Calling UI elements use up horizontal space	2025-12-04 19:41:39 +05:30
hanishkvc	1e5b638beb	SimpleChatTC: Update readme with bit more details, Cleaner UI Also avoid showing Tool calling UI elements, when not needed to be shown.	2025-12-04 19:41:39 +05:30
hanishkvc	bfe789706e	SimpleChatTC: Let user trigger tool call, instead of automatic Instead of automatically calling any requested tool by the GenAi / llm, that is from the tail end of the handle user submit btn click, Now if the GenAi/LLM has requested any tool to be called, then enable the Tool Run related UI elements and fill them with the tool name and tool args. In turn the user can verify if they are ok with the tool being called and the arguments being passed to it. Rather they can even fix any errors in the tool usage like the arithmetic expr to calculate that is being passed to simple_calculator or the javascript code being passed to run_javascript_function_code If user is ok with the tool call being requested, then trigger the same. The results if any will be automatically placed into the user query text area. User can cross verify if they are ok with the result and or modify it suitably if required and inturn submit the same to the GenAi/LLM.	2025-12-04 19:41:39 +05:30
hanishkvc	1fc44c971d	SimpleChatTC: Add ui elements for tool call verify and trigger Instead of automatically calling the requested tool with supplied arguments, rather allow user to verify things before triggering the tool. NOTE: User already provided control over tool_response before submitting it to the ai assistant.	2025-12-04 19:41:38 +05:30
hanishkvc	fd662b4b0b	SimpleChatTC: ToolCall hs info in normal assistant-user chat flow Also as part of same, wrap the request details in the assistant block using a similar tagging format as the tool_response in user block.	2025-12-04 19:41:38 +05:30
hanishkvc	30aa2f4c6b	SimpleChatTC: Update the readme.md wrt tool calling a bit	2025-12-04 19:41:38 +05:30
hanishkvc	63b5c6d76d	SimpleChatTC: Cleanup the function description a bit to better describe how it will be run, so that genai/llm while creating the code to run, will hopefully take care of any nuances required.	2025-12-04 19:41:38 +05:30
hanishkvc	a80da9a652	SimpleChatTC: Pass toolname to the tool handler So that when tool handler writes the result to the tc_switch, it can make use of the same, to write to the right location. NOTE: This also fixes the issue with I forgetting to rename the key in js_run wrt writing of result.	2025-12-04 19:41:38 +05:30
hanishkvc	f7284a8b89	SimpleChatTC: Move tool calling to tools, try trap async failures Move tool calling logic into tools module. Try trap async promise failures by awaiting results of tool calling and putting full thing in an outer try catch. Have forgotten the nitty gritties of JS flow, this might help, need to check.	2025-12-04 19:41:38 +05:30
hanishkvc	ef85ed41d4	SimpleChatTC: Clarify some type definitions to avoid warnings ie in vs code with ts-check	2025-12-04 19:41:38 +05:30
hanishkvc	a408e5e017	SimpleChatTC: More clearer description of toolcalls execution env Should hopefully ensure that the GenAi/LLM will generate appropriate code/expression as the argument to pass to these tool calls, to some extent.	2025-12-04 19:41:38 +05:30
hanishkvc	b4776da670	SimpleChatTC: Trap any exception raised during tool call and inform the GenAi/LLM about the same	2025-12-04 19:41:38 +05:30
hanishkvc	17c5daa52c	SimpleChatTC: Cleanup initial/1st go toolcall flow As output generated by any tool/function call is currently placed into the TextArea provided for End user (for their queries), because the GenAi (engine/LLM) may be expecting the tool response to be sent as a user role data with tool_response tag surrounding the results from the tool call. So also now at the end of submit btn click handling, the end user input text area is not cleared, if there was a tool call handled, for above reasons. Also given that running a simple arithmetic expression in itself doesn't generate any output, so wrap them in a console.log, to help capture the result using the console.log trapping flow that is already setup.	2025-12-04 19:41:38 +05:30
hanishkvc	301910c3a1	SimpleChatTC: Implement a simple toolcall handling flow Checks for toolname to be defined or not in the GenAi's response If toolname is set, then check if a corresponding tool/func exists, and if so call the same by passing it the GenAi provided toolargs as a object. Inturn the text generated by the tool/func is captured and put into the user input entry text box, with tool_response tag around it.	2025-12-04 19:41:38 +05:30
hanishkvc	fa63a86c71	SimpleChatTC:tooljs: Trap console.log and store in new result key The implementations of javascript and simple_calculator now use provided helpers to trap console.log messages when they execute the code / expression provided by GenAi and inturn store the captured log messages in the newly added result key in tc_switch This should help trap the output generated by the provided code or expression as the case maybe and inturn return the same to the GenAi, for its further processing.	2025-12-04 19:41:38 +05:30
hanishkvc	6d43011003	SimpleChatTC: Saner/Robust AssistantResponse content_equiv Previously if content was empty, it would have always sent the toolcall info related version even if there was no toolcall info in it. Fixed now to return empty string, if both content and toolname are empty.	2025-12-04 19:41:38 +05:30
hanishkvc	383c19c99b	SimpleChatTC: twins wrt streamed response handling As there could be failure wrt getting the response from the ai server some where in between a long response spread over multiple parts, the logic uses the latestResponse to cache the response as it is being received. However once the full response is got, one needs to transfer it to a new instance of AssistantResponse class, so that latestResponse can be cleared, while the new instance can be used in other locations in the flow as needed. Achieve the same now.	2025-12-04 19:41:38 +05:30
hanishkvc	53f85d09be	SimpleChatTC: AssistantResponse everywhere initial go Switch oneshot handler to use AssistantResponse, inturn currently only handle the normal content in the response. TODO: If any tool_calls in the oneshot response, it is currently not handled. Inturn switch the generic/toplevel handle response logic to use AssistantResponse class, given that both oneshot and the multipart/streaming flows use/return it. Inturn add trimmedContent member to AssistantResponse class and make the generic handle response logic to save the trimmed content into this. Update users of trimmed to work with this structure.	2025-12-04 19:41:38 +05:30
hanishkvc	3f3aa8d043	SimpleChatTC: AssistantResponse class initial go Make latestResponse into a new class based type instance wrt ai assistant response, which is what it represents. Move clearing, appending fields' values and getting assistant's response info (irrespective of a content or toolcall response) into this new class and inturn use the same.	2025-12-04 19:41:38 +05:30
hanishkvc	5a26831ad2	SimpleChatTC: Show toolcall being generated by ai - Temp	2025-12-04 19:41:38 +05:30
hanishkvc	e73bc4550b	SimpleChatTC: Avoid null content, Fix oversight wrt finish_reason I was wrongly checking for finish_reason to be non null, before trying to extract the genai content/toolcalls, have fixed this oversight with the new flow in progress. I had added few debug logs to identify the above issue, need to remove them later. Note: given that debug logs are disabled by replacing the debug function during this program's initialisation, which I had forgotten about, I didnt get the debug messages and had to scratch my head a bit, before realising this and the other issue ;) Also either when I had originally implemented simplechat 1+ years back, or later due to changes on the server end, the streaming flow sends a initial null wrt the content, where it only sets the role. This was not handled in my flow on the client side, so a null was getting prepended to the chat messages/responses from the server. This has been fixed now in the new generic flow.	2025-12-04 19:41:38 +05:30
hanishkvc	63430dc9f7	SimpleChatTC: Extract streamed field - assume only 1f at any time Update response_extract_stream to check for which field is being currently streamed ie is it normal content or tool call func name or tool call func args and then return the field name and extracted value. Previously it was always assumed that only normal content will be returned. Currently it is assumed that the server will only stream one of the 3 supported fields at any time and not more than one of them at the same time. TODO: Have to also add logic to extract the reasoning field later, ie wrt gen ai models which give out their thinking. Have updated append_response to expect both the key and the value wrt the latestResponse object, which it will be manipulated. Previously it was always assumed that content is what will be got and inturn appended.	2025-12-04 19:41:38 +05:30
hanishkvc	bfe7ef69fa	SimpleChatTC: Skeleton to handle diff fields when streaming Changed latestResponse type to an object instead of a string. Inturn it contains entries for content, toolname and toolargs. Added a custom clear logic due to the same and used it to replace the previously simple assigning of empty string to latestResponse. For now in all places where latestReponse is used, I have replaced with latestReponse.content. Next need to handle identifying the field being streamed and inturn append to it. Also need to add logic to call tool, when tool_call triggered by genai.	2025-12-04 19:41:38 +05:30
hanishkvc	32f5278e8c	SimpleChatTC: use tcpdump to dbg hs; check if ai aware of tools	2025-12-04 19:41:38 +05:30
hanishkvc	6167cdff9f	SimpleChatTC: Bring in the tools meta into the main flow	2025-12-04 19:41:38 +05:30
hanishkvc	46f0304105	SimpleChatTC: More generic tooljs, SimpCalc, some main skeleton Make tooljs structure and flow more generic Add a simple_calculator tool/function call logic Add initial skeleton wrt the main tools.mjs file.	2025-12-04 19:41:38 +05:30
hanishkvc	f1aa0ee778	SimpleChatTC: Add skeleton for a javascript interpreter tool call Define the meta that needs to be passed to the GenAi Engine. Define the logic that implements the tool call, if called. Implement the flow/structure such that a single tool calls implementation file can define multiple tool calls.	2025-12-04 19:41:38 +05:30
hanishkvc	48c9f07982	SimpleChatTC: Update test shell script a bit Enable streaming by default, to check the handshake before going on to change the code, given that haven't looked into this for more than a year now and have been busy with totally different stuff. Also updated the user messages used for testing a bit	2025-12-04 19:41:38 +05:30
hanishkvc	9341c507f2	SimpleChatTools: Add boolean to allow user control of tools use	2025-12-04 19:41:38 +05:30
hanishkvc	4282a4277a	SimpleChatToolCalling: Test/Explore srvr initial hs using cmdline	2025-12-04 19:41:38 +05:30
Daniel Bevenius	817d743cc1	examples : add missing code block end marker [no ci] (#17756) This commit adds the missing code block end marker in simple-cmake-pkg to correct the formatting.	2025-12-04 14:17:30 +01:00
Daniel Bevenius	bd4ef13476	common : skip model validation when --help is requested (#17755) This commit skips the model validation check when the user specifies the --help option. The motivation for this is that currently an error is thrown before the --help could be processed. Now skips validation if params.usage is set, allowing help to display without requiring --model. Resolves: https://github.com/ggml-org/llama.cpp/issues/17754	2025-12-04 13:36:50 +01:00
Alberto Cabrera Pérez	87a2084c45	ggml-cpu : remove asserts always evaluating to false (#17728)	2025-12-04 13:16:38 +01:00
SmartestWashingMachine	3659aa28e9	convert: use existing local chat_template if mistral-format model has one. (#17749) * conversion: use existing local chat_template.jinja file if mistral-format model has one. * fix --mistral-format mistakenly assuming some <=v7 chat template names are file paths and reading them. * Update convert_hf_to_gguf.py - change from exists() to is_file() Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-12-04 12:12:45 +01:00
Adrien Gallouët	2a73f81f8a	cmake : simplify build info detection using standard variables (#17423) The current approach has several drawbacks. Mostly, when cross-compiling, invoking the compiler binary directly to query the machine hardware can behave unexpectedly depending on the toolchain wrapper (using COMPILER_TARGET, CFLAGS, etc). As CMake is the official tool to build llama.cpp, I propose to only rely on it to get those variables (`CMAKE_SYSTEM_NAME` and `CMAKE_SYSTEM_PROCESSOR`). Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2025-12-04 12:42:13 +02:00
Sigbjørn Skjæret	7dba049b07	ci : disable ggml-ci-x64-amd-* (#17753)	2025-12-04 11:25:08 +01:00
Adrien Gallouët	83c1171529	common: use native MultiByteToWideChar (#17738) `std::codecvt_utf8<wchar_t>` is deprecated and produces warnings: common/common.cpp:792:31: warning: 'codecvt_utf8<wchar_t>' is deprecated [-Wdeprecated-declarations] 792 | std::wstring_convert<std::codecvt_utf8<wchar_t>> converter; | Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2025-12-04 12:06:49 +02:00
Georgi Gerganov	0d1324856f	metal : use params per pipeline instance (#17739)	2025-12-04 10:34:11 +02:00
Georgi Gerganov	a67ef0f47f	llama : fix sanity checks during quantization (#17721)	2025-12-04 10:33:42 +02:00
Adrien Gallouët	ef75a89fdb	build : move _WIN32_WINNT definition to headers (#17736) Previously, cmake was forcing `_WIN32_WINNT=0x0A00` for MinGW builds, This caused "macro redefined" warnings with toolchains that define the version. This also removes the `GGML_WIN_VER` variable as it is no longer needed. Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2025-12-04 07:04:02 +01:00
Jeff Bolz	d8b5cdc4fe	build: enable parallel builds in msbuild using MTT (#17708) * build: enable parallel builds in msbuild using MTT * check LLAMA_STANDALONE	2025-12-03 22:42:29 -06:00
Herman Semenoff	dea9ba27cb	ggml-cpu: remove duplicate conditional check 'iid' (#17650)	2025-12-04 05:03:19 +08:00
Piotr Wilkin (ilintar)	c6d1a00aa7	Add a couple of file types to the text section (#17670) * Add a couple of file types to the text section * Format + regenerate index * Rebuild after rebase	2025-12-03 21:45:06 +01:00
SmartestWashingMachine	424c579455	convert : support latest mistral-common (fix conversion with --mistral-format) (#17712) * fix convert_hf_to_gguf.py failing with --mistral-format using later mistral-common versions. * use get_one_valid_tokenizer_file from mistral-common if available and fallback to old logic otherwise. * use file name instead of file path for get_one_valid_tokenizer_file. * fix --mistral-format tokenizer file failing for tokenizers in subdirectories. * move get_one_valid_tokenizer_file import to avoid nested try-except.	2025-12-03 21:15:04 +01:00

7608 Commits