Commit Graph

7608 Commits

Author SHA1 Message Date
hanishkvc 2a8bd1c9e7 SimpleChatTC: Actual tool call implementations simplified
These no longer need to worry about

* setting up the console.log related redirection to capture
  the generated outputs, nor about
* setting up a dynamic function for executing the needed
  tool call related code

The web worker setup to help run tool calls in a relatively
isolated environment independent of the main browser env,
takes care of these.

One only needs to worry about getting the handle to the
web worker to use and in turn passing the needed code for the
tool call to it.
2025-12-04 19:41:39 +05:30
hanishkvc 14d67f6c3c SimpleChatTC: Pass around structured objects wrt tool worker
The request for code to run as well as the resultant response data
both need to follow a structured object convention, so that it is
easy to map a request and the corresponding response to some extent.
2025-12-04 19:41:39 +05:30
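
The structured object convention described in the commit above could look roughly like the following. This is a minimal sketch only: the field names (`id`, `name`, `code`, `data`) and the helper functions are hypothetical and not taken from the commit.

```javascript
// Hypothetical message shapes; the field names (id, name, code, data) are
// assumptions for illustration, not taken from the commit.

/** Build a request asking the tool worker to run some tool call code. */
function makeToolRequest(id, name, code) {
    return { id: id, name: name, code: code };
}

/** Build the response for a request, echoing id and name so both can be matched. */
function makeToolResponse(request, data) {
    return { id: request.id, name: request.name, data: data };
}

/** Check whether a response belongs to a given request. */
function isResponseFor(request, response) {
    return request.id === response.id && request.name === response.name;
}

const req = makeToolRequest(1, "simple_calculator", "console.log(2+3)");
const res = makeToolResponse(req, "5\n");
console.log(isResponseFor(req, res)); // true
```

Echoing the request's identifying fields in the response is one simple way to map a response back to its request when several may be in flight.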
hanishkvc 510c65c721 SimpleChatTC: Initial skeleton of a simple toolsworker 2025-12-04 19:41:39 +05:30
hanishkvc a6bccf934e SimpleChatTC:ToolsConsole:Cleanup a bit, add basic set of notes
Try to ensure, as well as verify, that the original console.log is
saved and not overwritten. Throw an exception if things seem off
in this regard.

Also ensure to add a newline at end of console.log messages
2025-12-04 19:41:39 +05:30
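
The console.log trapping described in the commit above might be sketched as follows. The helper and variable names here are hypothetical; only the ideas (save the original, refuse to overwrite it, append a newline per message) come from the commit message.

```javascript
// Hypothetical helper names; only the idea comes from the commit message.
let gOrigConsoleLog = null; // holds the original console.log while trapping
let gCaptured = "";         // accumulated output of trapped console.log calls

function consoleRedirected(...args) {
    // Append a newline at the end of each logged message.
    gCaptured += args.map(String).join(" ") + "\n";
}

function consoleTrap() {
    // Refuse to trap twice, so the saved original is never overwritten.
    if (gOrigConsoleLog !== null || console.log === consoleRedirected) {
        throw new Error("console.log already trapped");
    }
    gOrigConsoleLog = console.log;
    gCaptured = "";
    console.log = consoleRedirected;
}

function consoleRestore() {
    if (gOrigConsoleLog === null) {
        throw new Error("console.log was not trapped");
    }
    console.log = gOrigConsoleLog;
    gOrigConsoleLog = null;
    return gCaptured;
}

consoleTrap();
console.log("2+3 =", 2 + 3);
const captured = consoleRestore();
console.log(JSON.stringify(captured)); // "2+3 = 5\n"
```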
hanishkvc 2701cb3a1e SimpleChatTC: Move console.log trapping into its own module
So that it can be used from different modules, if required.
2025-12-04 19:41:39 +05:30
hanishkvc 45d8a00738 SimpleChatTC: Update readme wrt --jinja argument and bit more 2025-12-04 19:41:39 +05:30
hanishkvc a8c8176d09 SimpleChatTC: Tool Calling UI elements use up horizontal space 2025-12-04 19:41:39 +05:30
hanishkvc 1e5b638beb SimpleChatTC: Update readme with bit more details, Cleaner UI
Also avoid showing Tool calling UI elements, when not needed to
be shown.
2025-12-04 19:41:39 +05:30
hanishkvc bfe789706e SimpleChatTC: Let user trigger tool call, instead of automatic
Instead of automatically calling any requested tool by the GenAi
/ llm, that is from the tail end of the handle user submit btn
click,

Now if the GenAi/LLM has requested any tool to be called, then
enable the Tool Run related UI elements and fill them with the
tool name and tool args.

In turn the user can verify if they are ok with the tool being
called and the arguments being passed to it. They can even fix
any errors in the tool usage, like the arithmetic expr to
calculate that is being passed to simple_calculator or the
javascript code being passed to run_javascript_function_code.

If user is ok with the tool call being requested, then trigger
the same.

The results if any will be automatically placed into the user
query text area.

User can cross verify if they are ok with the result and/or
modify it suitably if required, and in turn submit the same to
the GenAi/LLM.
2025-12-04 19:41:39 +05:30
hanishkvc 1fc44c971d SimpleChatTC: Add ui elements for tool call verify and trigger
Instead of automatically calling the requested tool with supplied
arguments, rather allow user to verify things before triggering the
tool.

NOTE: User already provided control over tool_response before
submitting it to the ai assistant.
2025-12-04 19:41:38 +05:30
hanishkvc fd662b4b0b SimpleChatTC: ToolCall hs info in normal assistant-user chat flow
Also as part of same, wrap the request details in the assistant
block using a similar tagging format as the tool_response in user
block.
2025-12-04 19:41:38 +05:30
hanishkvc 30aa2f4c6b SimpleChatTC: Update the readme.md wrt tool calling a bit 2025-12-04 19:41:38 +05:30
hanishkvc 63b5c6d76d SimpleChatTC: Cleanup the function description a bit
to better describe how it will be run, so that the genai/llm, while
creating the code to run, will hopefully take care of any nuances
required.
2025-12-04 19:41:38 +05:30
hanishkvc a80da9a652 SimpleChatTC: Pass toolname to the tool handler
So that when tool handler writes the result to the tc_switch, it
can make use of the same, to write to the right location.

NOTE: This also fixes the issue of my forgetting to rename the
key in js_run wrt writing of result.
2025-12-04 19:41:38 +05:30
hanishkvc f7284a8b89 SimpleChatTC: Move tool calling to tools, try trap async failures
Move tool calling logic into tools module.

Try to trap async promise failures by awaiting the results of tool
calling and putting the full thing in an outer try catch. Have forgotten
the nitty gritties of JS flow, this might help, need to check.
2025-12-04 19:41:38 +05:30
hanishkvc ef85ed41d4 SimpleChatTC: Clarify some type definitions to avoid warnings
ie in vs code with ts-check
2025-12-04 19:41:38 +05:30
hanishkvc a408e5e017 SimpleChatTC: More clearer description of toolcalls execution env
Should hopefully ensure that the GenAi/LLM will generate appropriate
code/expression as the argument to pass to these tool calls, to
some extent.
2025-12-04 19:41:38 +05:30
hanishkvc b4776da670 SimpleChatTC: Trap any exception raised during tool call
and inform the GenAi/LLM about the same
2025-12-04 19:41:38 +05:30
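
Trapping an exception and reporting it back as the tool result, as the commit above describes, could be sketched like this. The wrapper name and the wording of the error text are assumptions for illustration.

```javascript
// Hypothetical wrapper; the function name and message format are assumptions.
function runToolSafely(toolFn, args) {
    try {
        return String(toolFn(args));
    } catch (err) {
        // Report the failure as the tool result, so the model can react to it.
        const msg = err instanceof Error ? err.message : String(err);
        return "Tool/Function call raised an exception: " + msg;
    }
}

const ok = runToolSafely((a) => a.x + a.y, { x: 2, y: 3 });
const failed = runToolSafely(() => { throw new Error("boom"); }, {});
console.log(ok);     // 5
console.log(failed); // Tool/Function call raised an exception: boom
```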
hanishkvc 17c5daa52c SimpleChatTC: Cleanup initial/1st go toolcall flow
The output generated by any tool/function call is currently placed
into the TextArea provided for the end user (for their queries), because
the GenAi (engine/LLM) may be expecting the tool response to be
sent as user role data with a tool_response tag surrounding the
results from the tool call. For the same reason, at the end of submit
btn click handling, the end user input text area is no longer cleared
if a tool call was handled.

Also, given that running a simple arithmetic expression in itself
doesn't generate any output, wrap it in a console.log, to
help capture the result using the console.log trapping flow that
is already set up.
2025-12-04 19:41:38 +05:30
hanishkvc 301910c3a1 SimpleChatTC: Implement a simple toolcall handling flow
Checks for toolname to be defined or not in the GenAi's response

If toolname is set, then check if a corresponding tool/func exists,
and if so call the same by passing it the GenAi provided toolargs
as an object.

In turn the text generated by the tool/func is captured and put
into the user input entry text box, with tool_response tag around
it.
2025-12-04 19:41:38 +05:30
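
The flow in the commit above (check toolname, look up the tool, call it with the args as an object, wrap the output in a tool_response tag) might be sketched as below. The table shape, the `add_numbers` tool and the exact tag syntax are hypothetical and used only for illustration.

```javascript
// Hypothetical tool table; the shape and the add_numbers tool are assumptions.
const tcSwitch = {
    add_numbers: { handler: (args) => String(args.a + args.b) },
};

/**
 * Returns the text to place into the user input box,
 * or undefined if no known tool was requested.
 */
function handleToolCall(toolname, toolargsJson) {
    if (!toolname) {
        return undefined; // the model did not request a tool
    }
    if (!Object.prototype.hasOwnProperty.call(tcSwitch, toolname)) {
        return undefined; // no corresponding tool/func exists
    }
    const args = JSON.parse(toolargsJson); // pass the args on as an object
    const output = tcSwitch[toolname].handler(args);
    // The exact tag syntax is an assumption.
    return "<tool_response>" + output + "</tool_response>";
}

console.log(handleToolCall("add_numbers", '{"a": 2, "b": 3}'));
// <tool_response>5</tool_response>
console.log(handleToolCall("", "{}")); // undefined
```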
hanishkvc fa63a86c71 SimpleChatTC:tooljs: Trap console.log and store in new result key
The implementations of javascript and simple_calculator now use
provided helpers to trap console.log messages when they execute
the code / expression provided by GenAi and in turn store the
captured log messages in the newly added result key in tc_switch.

This should help trap the output generated by the provided code
or expression, as the case may be, and in turn return the same to the
GenAi, for its further processing.
2025-12-04 19:41:38 +05:30
hanishkvc 6d43011003 SimpleChatTC: Saner/Robust AssistantResponse content_equiv
Previously if content was empty, it would have always sent the
toolcall info related version even if there was no toolcall info
in it. Fixed now to return empty string, if both content and
toolname are empty.
2025-12-04 19:41:38 +05:30
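
The fixed behaviour described in the commit above can be illustrated with a small sketch. The class below is a simplified stand-in, not the actual AssistantResponse class, and the textual format used for the tool call info is an assumption.

```javascript
// Simplified stand-in for illustration; not the actual AssistantResponse class.
class AssistantResponseSketch {
    constructor(content = "", toolname = "", toolargs = "") {
        this.content = content;
        this.toolname = toolname;
        this.toolargs = toolargs;
    }

    /** Return the content if any, else the tool call info if any, else "". */
    content_equiv() {
        if (this.content !== "") {
            return this.content;
        }
        if (this.toolname !== "") {
            // The tag format here is an assumption.
            return "<tool_call>" + this.toolname + " " + this.toolargs + "</tool_call>";
        }
        return ""; // both content and toolname are empty
    }
}

console.log(new AssistantResponseSketch("hi").content_equiv());               // hi
console.log(new AssistantResponseSketch("", "calc", "{}").content_equiv());   // <tool_call>calc {}</tool_call>
console.log(JSON.stringify(new AssistantResponseSketch().content_equiv()));   // ""
```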
hanishkvc 383c19c99b SimpleChatTC: twins wrt streamed response handling
As there could be a failure wrt getting the response from the ai
server somewhere in between a long response spread over multiple
parts, the logic uses the latestResponse to cache the response
as it is being received. However once the full response is received,
one needs to transfer it to a new instance of AssistantResponse
class, so that latestResponse can be cleared, while the new
instance can be used in other locations in the flow as needed.

Achieve the same now.
2025-12-04 19:41:38 +05:30
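
The "twin" idea from the commit above, copying the cached response into a fresh instance before clearing the cache, might look roughly like this. The class and method names are hypothetical stand-ins.

```javascript
// Hypothetical stand-in class; names are assumptions for illustration.
class StreamedResponse {
    constructor() {
        this.clear();
    }

    clear() {
        this.content = "";
        this.toolname = "";
        this.toolargs = "";
    }

    /** Create an independent copy of an existing response. */
    static newFrom(old) {
        const twin = new StreamedResponse();
        twin.content = old.content;
        twin.toolname = old.toolname;
        twin.toolargs = old.toolargs;
        return twin;
    }
}

const latestResponse = new StreamedResponse();
latestResponse.content = "Hello";                        // cached while streaming
const finished = StreamedResponse.newFrom(latestResponse);
latestResponse.clear();                                  // cache is free for reuse
console.log(finished.content);                           // Hello
console.log(latestResponse.content === "");              // true
```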
hanishkvc 53f85d09be SimpleChatTC: AssistantResponse everywhere initial go
Switch oneshot handler to use AssistantResponse, which in turn currently
only handles the normal content in the response.

TODO: If any tool_calls in the oneshot response, it is currently
not handled.

In turn switch the generic/toplevel handle response logic to use
AssistantResponse class, given that both oneshot and the
multipart/streaming flows use/return it.

In turn add trimmedContent member to AssistantResponse class and
make the generic handle response logic save the trimmed content
into this. Update users of trimmed to work with this structure.
2025-12-04 19:41:38 +05:30
hanishkvc 3f3aa8d043 SimpleChatTC: AssistantResponse class initial go
Make latestResponse into a new class based type instance wrt
ai assistant response, which is what it represents.

Move clearing, appending fields' values and getting assistant's
response info (irrespective of a content or toolcall response)
into this new class and in turn use the same.
2025-12-04 19:41:38 +05:30
hanishkvc 5a26831ad2 SimpleChatTC: Show toolcall being generated by ai - Temp 2025-12-04 19:41:38 +05:30
hanishkvc e73bc4550b SimpleChatTC: Avoid null content, Fix oversight wrt finish_reason
I was wrongly checking for finish_reason to be non null, before
trying to extract the genai content/toolcalls, have fixed this
oversight with the new flow in progress.

I had added a few debug logs to identify the above issue, need to
remove them later. Note: given that debug logs are disabled by
replacing the debug function during this program's initialisation,
which I had forgotten about, I didn't get the debug messages and
had to scratch my head a bit, before realising this and the other
issue ;)

Also either when I had originally implemented simplechat 1+ years
back, or later due to changes on the server end, the streaming
flow sends an initial null wrt the content, where it only sets the
role. This was not handled in my flow on the client side, so a
null was getting prepended to the chat messages/responses from the
server. This has been fixed now in the new generic flow.
2025-12-04 19:41:38 +05:30
hanishkvc 63430dc9f7 SimpleChatTC: Extract streamed field - assume only 1f at any time
Update response_extract_stream to check for which field is being
currently streamed ie is it normal content or tool call func name
or tool call func args and then return the field name and extracted
value.

Previously it was always assumed that only normal content will be
returned.

Currently it is assumed that the server will only stream one of the
3 supported fields at any time and not more than one of them at the
same time.

TODO: Have to also add logic to extract the reasoning field later,
ie wrt gen ai models which give out their thinking.

Have updated append_response to expect both the key and the value
wrt the latestResponse object, which it will manipulate.

Previously it was always assumed that content is what will be received
and in turn appended.
2025-12-04 19:41:38 +05:30
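
The field detection described in the commit above could be sketched as follows, assuming the server streams OpenAI-compatible chat completion chunks (`choices[0].delta.content` and `choices[0].delta.tool_calls[0].function.name` / `.arguments`). The function names, the accumulator keys and the sample chunks are hypothetical; like the commit, the sketch assumes only one of the three fields is streamed per chunk.

```javascript
// Hypothetical names; assumes OpenAI-compatible streaming chunks.

/** Return [key, value] for the field streamed in this chunk, or null if none. */
function extractStreamedField(chunk) {
    const delta = chunk.choices[0].delta;
    if (delta.content) {
        return ["content", delta.content];
    }
    const fn = delta.tool_calls?.[0]?.function;
    if (fn?.name) {
        return ["toolname", fn.name];
    }
    if (fn?.arguments) {
        return ["toolargs", fn.arguments];
    }
    return null; // e.g. the initial chunk that only sets the role (content is null)
}

/** Append the extracted value to the matching field of the accumulator. */
function appendResponse(acc, key, value) {
    acc[key] += value;
}

// Illustrative chunks; the argument payload is made up for this example.
const chunks = [
    { choices: [{ delta: { role: "assistant", content: null } }] },
    { choices: [{ delta: { tool_calls: [{ function: { name: "simple_calculator", arguments: "" } }] } }] },
    { choices: [{ delta: { tool_calls: [{ function: { arguments: '{"expr":' } }] } }] },
    { choices: [{ delta: { tool_calls: [{ function: { arguments: '"2+3"}' } }] } }] },
];

const acc = { content: "", toolname: "", toolargs: "" };
for (const chunk of chunks) {
    const field = extractStreamedField(chunk);
    if (field !== null) {
        appendResponse(acc, field[0], field[1]);
    }
}
console.log(acc.toolname); // simple_calculator
console.log(acc.toolargs); // {"expr":"2+3"}
```

Skipping chunks that carry no recognised field also avoids prepending a null when the first chunk only sets the role.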
hanishkvc bfe7ef69fa SimpleChatTC: Skeleton to handle diff fields when streaming
Changed latestResponse type to an object instead of a string.
In turn it contains entries for content, toolname and toolargs.

Added a custom clear logic due to the same and used it to replace
the previously simple assigning of empty string to latestResponse.

For now in all places where latestResponse is used, I have replaced
it with latestResponse.content.

Next need to handle identifying the field being streamed and in turn
append to it. Also need to add logic to call tool, when tool_call
triggered by genai.
2025-12-04 19:41:38 +05:30
hanishkvc 32f5278e8c SimpleChatTC: use tcpdump to dbg hs; check if ai aware of tools 2025-12-04 19:41:38 +05:30
hanishkvc 6167cdff9f SimpleChatTC: Bring in the tools meta into the main flow 2025-12-04 19:41:38 +05:30
hanishkvc 46f0304105 SimpleChatTC: More generic tooljs, SimpCalc, some main skeleton
Make tooljs structure and flow more generic

Add a simple_calculator tool/function call logic

Add initial skeleton wrt the main tools.mjs file.
2025-12-04 19:41:38 +05:30
hanishkvc f1aa0ee778 SimpleChatTC: Add skeleton for a javascript interpretor tool call
Define the meta that needs to be passed to the GenAi Engine.

Define the logic that implements the tool call, if called.

Implement the flow/structure such that a single tool call
implementation file can define multiple tool calls.
2025-12-04 19:41:38 +05:30
hanishkvc 48c9f07982 SimpleChatTC: Update test shell script a bit
Enable streaming by default, to check the handshake before going
on to change the code, given that I haven't looked into this for more
than a year now and have been busy with totally different stuff.

Also updated the user messages used for testing a bit
2025-12-04 19:41:38 +05:30
hanishkvc 9341c507f2 SimpleChatTools: Add boolean to allow user control of tools use 2025-12-04 19:41:38 +05:30
hanishkvc 4282a4277a SimpleChatToolCalling: Test/Explore srvr initial hs using cmdline 2025-12-04 19:41:38 +05:30
Daniel Bevenius 817d743cc1
examples : add missing code block end marker [no ci] (#17756)
This commit adds the missing code block end marker in simple-cmake-pkg
to correct the formatting.
2025-12-04 14:17:30 +01:00
Daniel Bevenius bd4ef13476
common : skip model validation when --help is requested (#17755)
This commit skips the model validation check when the user specifies the
--help option.

The motivation for this is that currently an error is thrown before
--help can be processed. Validation is now skipped if params.usage is
set, allowing help to display without requiring --model.

Resolves: https://github.com/ggml-org/llama.cpp/issues/17754
2025-12-04 13:36:50 +01:00
Alberto Cabrera Pérez 87a2084c45
ggml-cpu : remove asserts always evaluating to false (#17728) 2025-12-04 13:16:38 +01:00
SmartestWashingMachine 3659aa28e9
convert: use existing local chat_template if mistral-format model has one. (#17749)
* conversion: use existing local chat_template.jinja file if mistral-format model has one.

* fix --mistral-format mistakenly assuming some <=v7 chat template names are file paths and reading them.

* Update convert_hf_to_gguf.py - change from exists() to is_file()

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-12-04 12:12:45 +01:00
Adrien Gallouët 2a73f81f8a
cmake : simplify build info detection using standard variables (#17423)
The current approach has several drawbacks. Mostly, when
cross-compiling, invoking the compiler binary directly to query the
machine hardware can behave unexpectedly depending on the toolchain
wrapper (using COMPILER_TARGET, CFLAGS, etc).

As CMake is the official tool to build llama.cpp, I propose to only rely
on it to get those variables (`CMAKE_SYSTEM_NAME` and
`CMAKE_SYSTEM_PROCESSOR`).

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-12-04 12:42:13 +02:00
Sigbjørn Skjæret 7dba049b07
ci : disable ggml-ci-x64-amd-* (#17753) 2025-12-04 11:25:08 +01:00
Adrien Gallouët 83c1171529
common: use native MultiByteToWideChar (#17738)
`std::codecvt_utf8<wchar_t>` is deprecated and produces warnings:

    common/common.cpp:792:31: warning: 'codecvt_utf8<wchar_t>' is deprecated [-Wdeprecated-declarations]
      792 |     std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
          |

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-12-04 12:06:49 +02:00
Georgi Gerganov 0d1324856f
metal : use params per pipeline instance (#17739) 2025-12-04 10:34:11 +02:00
Georgi Gerganov a67ef0f47f
llama : fix sanity checks during quantization (#17721) 2025-12-04 10:33:42 +02:00
Adrien Gallouët ef75a89fdb
build : move _WIN32_WINNT definition to headers (#17736)
Previously, cmake was forcing `_WIN32_WINNT=0x0A00` for MinGW builds.
This caused "macro redefined" warnings with toolchains that define the version.

This also removes the `GGML_WIN_VER` variable as it is no longer needed.

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-12-04 07:04:02 +01:00
Jeff Bolz d8b5cdc4fe
build: enable parallel builds in msbuild using MTT (#17708)
* build: enable parallel builds in msbuild using MTT

* check LLAMA_STANDALONE
2025-12-03 22:42:29 -06:00
Herman Semenoff dea9ba27cb
ggml-cpu: remove duplicate conditional check 'iid' (#17650) 2025-12-04 05:03:19 +08:00
Piotr Wilkin (ilintar) c6d1a00aa7
Add a couple of file types to the text section (#17670)
* Add a couple of file types to the text section

* Format + regenerate index

* Rebuild after rebase
2025-12-03 21:45:06 +01:00
SmartestWashingMachine 424c579455
convert : support latest mistral-common (fix conversion with --mistral-format) (#17712)
* fix convert_hf_to_gguf.py failing with --mistral-format using later mistral-common versions.

* use get_one_valid_tokenizer_file from mistral-common if available and fallback to old logic otherwise.

* use file name instead of file path for get_one_valid_tokenizer_file.

* fix --mistral-format tokenizer file failing for tokenizers in subdirectories.

* move get_one_valid_tokenizer_file import to avoid nested try-except.
2025-12-03 21:15:04 +01:00