Commit Graph

774 Commits

Author SHA1 Message Date
hanishkvc f97efb86e4 SimpleChatTC:SimpleProxy:Pdf2Text: js side initial plumbing
Expose pdf2text tool call to ai server and handshake with simple
proxy for the same.
2025-12-04 19:41:39 +05:30
hanishkvc 6054ddfb65 SimpleChatTC:SimpleProxy:Pdf2Text: Initial go 2025-12-04 19:41:39 +05:30
hanishkvc 5ec29087ea SimpleChatTC:SimpleProxy:Pdf2Text: Move handling url to its own 2025-12-04 19:41:39 +05:30
hanishkvc ecfdb66c94 SimpleChatTC:SimpleProxy:Pdf2Text:Initial plumbing
Get the pdf2text request for processing.
2025-12-04 19:41:39 +05:30
hanishkvc da98a961ab SimpleChatTC:SimpleProxy: Enable allowing or not requested feature 2025-12-04 19:41:39 +05:30
hanishkvc 0b2329e5de SimpleChatTC: Update readme 2025-12-04 19:41:39 +05:30
hanishkvc 6a8ced244c SimpleChatTC:Raise Error on Ai Chat server handshake NotOk resp 2025-12-04 19:41:39 +05:30
hanishkvc 91f39b7197 SimpleChatTC:Move chat server handshake to SimpleChat 2025-12-04 19:41:39 +05:30
hanishkvc 482517543b SimpleChatTC:Separate out actual nw handshake - initial go
move the actual chat handshake with ai server into separate code,
to an extent.

also initial anchor to trap handshake http error responses

Rather come to think of it, its better to move this into SimpleChat
class.

Use finally to ensure any needed cleanup for handle_user_submit
occurs within itself.
2025-12-04 19:41:39 +05:30
hanishkvc 8d7eece81c SimpleChatTC:ToolsWorker: Update note to flow with chat session id 2025-12-04 19:41:39 +05:30
hanishkvc 8ab8727f70 SimpleChatTC:DataStore: update readme 2025-12-04 19:41:39 +05:30
hanishkvc 5935ecceca SimpleChatTC:DataStore:Cleanup:Msg, duplicate on routing side
Avoid the duplicate plumbing code and use a common ops plumbing
helper.

Remove args[key] oversight from DataStoreList msg on web worker
2025-12-04 19:41:39 +05:30
hanishkvc 57dd228512 SimpleChatTC:DataStore:List keys - the plumbing 2025-12-04 19:41:39 +05:30
hanishkvc 2d497069d2 SimpleChatTC:DataStore:list - web worker side logic
The basic skeleton added on the web worker side for listing keys.

TODO: Avoid duplication of similar code to an extent across some
of these db ops.
2025-12-04 19:41:39 +05:30
hanishkvc 7f8eb04875 SimpleChatTC:DataStore:Delete a record - the plumbing side 2025-12-04 19:41:39 +05:30
hanishkvc bd7f7cb72a SimpleChatTC:DataStore: Delete a record - the db web worker side 2025-12-04 19:41:39 +05:30
hanishkvc d80e438cfa SimpleChatTC:DataStore:Put, stringify undefined, readme
Update the descriptions of set and get to indicate the possible
corner cases or rather semantic in such situations.

Update the readme also a bit. The auto save and restore mentioned
has nothing to do with the new data store mechanism.
2025-12-04 19:41:39 +05:30
hanishkvc 2dad246d53 SimpleChatTC:DataStore: Dont ignore the error paths
And indexedDB add isnt the one to be happy with updating existing
key.
2025-12-04 19:41:39 +05:30
hanishkvc 4ad88f0da8 SimpleChatTC:DataStore:Eagerness to Wrong JSON conversions
In the eagerness of initial skeleton, had forgotten that the
root/generic tool call router takes care of parsing the json string
into a object, before calling the tool call, so no need to try
parse again. Fixed the same.

Hadnt converted the object based response from data store related
calls in the db web worker, into json string before passing to the
generic tool response callback, fixed the same.

- Rather the thought of making the ChatMsgEx.createAllInOne handle
string or object set aside for now, to keep things simple and
consistent to the greatest extent possible across different flows.

And good news - flow is working at least for the overall happy path
Need to check what corner cases are lurking like calling set on
same key more than once, seemed to have some flow oddity, which I
need to check later.

Also maybe change the field name to value from data in the response
to get, to match the field name convention of set. GPT-OSS is fine
with it. But worst case micro / nano / pico models may trip up, in
worst case, so better to keep things consistent.
2025-12-04 19:41:39 +05:30
hanishkvc 797b702251 SimpleChatTC:DataStore:FuncCallArgs: Any type not supported
So mention that maybe ai can send complex objects in stringified
form. Rather once type of value is set to string, ai should normally
do it, but no harm in hinting.
2025-12-04 19:41:39 +05:30
hanishkvc b080ddf5c3 SimpleChatTC:DataStore: Remaining plumbing to try this
Update tooldb logic to match that needed for the db logic and its
web worker.

Bring in the remaining aspects of db helpers into tools flow.
2025-12-04 19:41:39 +05:30
hanishkvc 2f58542713 SimpleChatTC:DataStore: Duplicate tooljs to tooldb initial skel 2025-12-04 19:41:39 +05:30
hanishkvc aedffe1df0 SimpleChatTC:DataStore: Initial skeleton of a Db WebWorker
Create the DB store

Try Get and Set operations

The post back to main thread done from asynchronous paths.

NOTE: given that it has been ages since indexed db was used,
so this is a logical implementation by referring to mdn as needed.
2025-12-04 19:41:39 +05:30
hanishkvc 4f857575f5 SimpleChatTC:UICleanup:ShowMessage: Update readme 2025-12-04 19:41:39 +05:30
hanishkvc 9e9016f7fe SimpleChatTC:UICleanup: WordBreaks, Print avoid side vertical
Define rules to ensure that chat message contents wrap so as to
avoid overflowing beyond the size of the screen being viewed.

The style used for chat message role to be placed with vertical
oriented text adjacent to the actual message content on the side
seems to be creating issue with blank pages in some browsers,
so avoid that styling when one is printing.
2025-12-04 19:41:39 +05:30
hanishkvc f3593a9611 SimpleChatTC:ShowMessage:Show any number of toolcalls
Also make reasoning easily identifiable in the chat
2025-12-04 19:41:39 +05:30
hanishkvc 41ef449db1 SimpleChatTC:ShowMessage: Separate out the content parts 2025-12-04 19:41:39 +05:30
hanishkvc ae9f7971a5 SimpleChatTC:CSS: Instead of hardcoded btn minwidth use padding 2025-12-04 19:41:39 +05:30
hanishkvc 2f07288e40 SimpleChatTC:ShowMessage: containers, role, contents
Separate out the message ui block into a container containing
a role block and contents container block.

This will allow theming of these separately, if required.
As part of same, currently the role has been put to the side
of the message with vertical text flow.
2025-12-04 19:41:39 +05:30
hanishkvc a4f247d730 SimpleChatTC:Cleanup:Move showing message into ShowMessage 2025-12-04 19:41:39 +05:30
hanishkvc 59effa6ea8 SimpleChatTC:Cleanup: tool resp xml, some allowed domains
Add a newline between name and content in the xml representation
of the tool response, so that it is more easy to distinguish things

Add github, linkedin and apnews domains to allowed.domains for
simpleproxy.py
2025-12-04 19:41:39 +05:30
hanishkvc cf06c8682b SimpleChatTC:Reasoning+: Update readme wrt reasoning, flow cleanup
Also cleanup the minimal based showing of chat messages a bit

And add github.com to allowed list
2025-12-04 19:41:39 +05:30
hanishkvc 937aa57528 SimpleChatTC:MultiChatUI:ChatShow cleanup of Initial skeleton
Fix up the initial skeleton / logic as needed.

Remember that we are working with potentially a subset of chat
messages from the session, given the sliding window logic of
context managing on client ui side, so fix up the logic to use
the right subset of messages array and not the global xchat
when deciding whether a message is the last or last but one,
which need special handling wrt Assistant (with toolcall) and
Tool (ie response) messages.

Moving tool call ui setup as well as tool call response got ui
setup into ChatShow of MultiChatUI ensures that switching between
chat sessions handle the ui wrt tool call triggering ui and tool
call response submission related ui as needed properly.

Rather even loading a previously auto saved chat session if it had
tool call or tool call response to be handled, the chat ui will be
setup as needed to continue that session properly.
2025-12-04 19:41:39 +05:30
hanishkvc 2cc10f6705 SimpleChatTC:MultiChatUI.ChatShow: Mov SimpleChat.Show in -initial
Also take care of updating the toolcall ui if needed from within
this.
2025-12-04 19:41:39 +05:30
hanishkvc 62bce9ebfb SimpleChatTC:Show: Cleanup
Update existing flow so that next Tool Role message is handled
directly from within
2025-12-04 19:41:39 +05:30
hanishkvc aa17edfa78 SimpleChatTC:SimpleProxy: Include some news sites in allowed domains 2025-12-04 19:41:39 +05:30
hanishkvc 47d9550131 SimpleChatTC:Reasoning: Cleanup the initial go
Rather simplify and make the content_equiv provide a relatively
simple and neat representation of the reasoning with content and
toolcall as the cases may be.

Also remove the partial new para that I had introduced in the
initial go for reasoning.
2025-12-04 19:41:39 +05:30
hanishkvc dbb5512b20 SimpleChatTC:Reasoning: Initial Go 2025-12-04 19:41:39 +05:30
hanishkvc 25df32b553 SimpleChatTC:ChatSessionID: Get all handlers to account for chatid
This should ensure that tool call responses can be mapped back to
the chat session for which it was triggered.
2025-12-04 19:41:39 +05:30
hanishkvc 734beb08f5 SimpleChatTC:ChatSessionID through the tool call cycle
Pass chatId to tool call, and use chatId in got tool call resp,
to decide which chat session the async tool call resp belongs to
and in turn whether the auto submit timer should be started if auto
is enabled.
2025-12-04 19:41:39 +05:30
hanishkvc 13d312fe0d SimpleChatTC:ToolTemp: Ensure add removes non promoted ToolTemp 2025-12-04 19:41:39 +05:30
hanishkvc 4eb3322017 SimpleChatTC:ToolCallErrPath:ToolTemp and MultiChatUIChatShow
Update the immediate tool call triggering failure and tool call
response timeout paths to use the new ToolTemp and MultiChatUI
based chat show logics.

Actual tool call itself generating errors, is already handled
in the previous commit changes.
2025-12-04 19:41:39 +05:30
hanishkvc e79faebde1 SimpleChatTC:ToolTemp and ChatShow
Add a new role ToolTemp, which is used to maintain any tool call
response on the client ui side, without submitting it to the server
ie till user or auto submit triggers the submitting of that tool
call response.

Whenever a tool call response is got, create a ToolTemp role based
message in the corresponding chat session. And dont directly update
the user query input area, rather leave it to the updated simplechat
show and the new multichatui chat_show helper and in turn whether the
current chat session active in ui is same as the one for which the
tool call response has been received.

TODO: Currently the response message is added to the current
active chat session, but this needs to be changed by tracking
chatId/session through the full tool call cycle and then adding
the tool call response in the related chat session, and in turn
updating or not the ui based on whether that chat session is
still the active chat session in ui or not, given that tool call
gets handled in a asynchronous way.

Now when that tool call response is submitted, promote the equiv
tool temp role based message that should be in the session's chat
history as the last message into becoming a normal tool response
message.

SimpleChat.show has been updated to take care of showing any
ToolTemp role message in the user query input area.

A newer chat_show helper added to MultiChatUI, that takes care of
calling SimpleChat.show, provided the chat_show is being requested
for the currently active in ui, chat session. As well as to take
care of passing both the ChatDiv and elInUser. Converts users of
SimpleChat.show to use MultiChatUI.chat_show
2025-12-04 19:41:39 +05:30
hanishkvc 84403973cd SimpleChatTC:SimpleProxy: once in a bluemoon transformed bearer
instead of using the shared bearer token as is, hash it with
current year and use the hash.

keep /aum path out of auth check.

in future bearer token could be transformed more often, as well as
with additional nonce/dynamic token from server got during initial
/aum handshake as also running counter and so ...

NOTE: All this circus is not good enough, given that currently the
simpleproxy.py handshakes work over http. However these skeletons
put in place, for future, if needed.

TODO: There is a once in a blue moon race when the year transitions
between client generating the request and server handling the req.
But otherwise year transitions dont matter because client always
creates fresh token, and server checks for year change to generate
fresh token if required.
2025-12-04 19:41:39 +05:30
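The year-based transformation described in this commit could look roughly like the sketch below. It is only an illustration under assumptions: the commit does not name the hash function or how the year and token are combined, so SHA-256 over "year:token" and the names transformed_bearer and BearerCache are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def transformed_bearer(shared_token: str, year: Optional[int] = None) -> str:
    """Hash the shared token together with a year (hypothetical helper).

    Assumption: SHA-256 over "year:token"; the commit does not say which
    hash or layout simpleproxy.py actually uses.
    """
    if year is None:
        year = time.gmtime().tm_year
    return hashlib.sha256(f"{year}:{shared_token}".encode("utf-8")).hexdigest()


class BearerCache:
    """Server side: regenerate the expected token only when the year changes."""

    def __init__(self, shared_token: str):
        self.shared_token = shared_token
        self.year = None
        self.expected = ""

    def check(self, auth_header: str, year: Optional[int] = None) -> bool:
        if year is None:
            year = time.gmtime().tm_year
        if year != self.year:
            # Year changed (or first call): compute a fresh expected token.
            self.year = year
            self.expected = transformed_bearer(self.shared_token, year)
        wanted = f"Bearer {self.expected}".encode("utf-8")
        return hmac.compare_digest(auth_header.encode("utf-8"), wanted)
```

The client would send "Authorization: Bearer <transformed_bearer(token)>" on each request. As the commit notes, none of this is secure while the handshake runs over plain http.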
hanishkvc 0552ff9098 SimpleChatTC:SimpleProxy:ClientUI: Send Authorization bearer
User can configure the bearer token to send
2025-12-04 19:41:39 +05:30
hanishkvc 044d1cf535 SimpleChatTC:tools.proxyUrl: rename to just proxyUrl
Next will be adding a proxyAuth field also to tools.
2025-12-04 19:41:39 +05:30
hanishkvc 6d08cda9c8 SimpleChatTC:SimpleProxy: Check for bearer authorization
As noted in the comments in code, this is a very insecure flow
for now.
2025-12-04 19:41:39 +05:30
hanishkvc 3f1fd289eb SimpleChatTC:SimpleProxy:BearerInsecure a needed config
Add a config entry called bearer.insecure which will contain a
token used for bearer auth of http requests

Make bearer.insecure and allowed.domains as needed configs, and
exit program if they arent got through cmdline or config file.
2025-12-04 19:41:39 +05:30
hanishkvc 0caa2e8101 SimpleChatTC:SimpleProxy: Prg Parameters handling cleanup - next
Ensure load_config gets called on encountering --config in cmdline,
so that the user has control over whether cmdline or config file
will decide the final value of any given parameter.

Ensure that str type values in cmdline are picked up directly, without
running them through ast.literal_eval, because otherwise one will have to
ensure through the cmdline arg mechanism that string quote is retained
for literal_eval

Have the """ function note/description immediately below the def line
so that it is interpreted as a function description.
2025-12-04 19:41:39 +05:30
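A minimal sketch of the parameter handling described in this commit and the one before it: a table of known parameters with expected types, exit on any problem, str values taken as is, other types passed through ast.literal_eval, and the config file loaded at the point where --config appears so that argument order decides which source wins. The parameter names come from the commit messages; the table and the function names set_param, load_config and process_args are illustrative assumptions, not the actual simpleproxy.py code.

```python
import ast
import json
import sys

# Known parameters and their expected types (illustrative table).
KNOWN = {
    "--port": int,
    "--config": str,
    "--debug": bool,
    "--allowed.domains": list,
    "--bearer.insecure": str,
}


def set_param(cfg, name, raw, from_cmdline):
    """Validate one parameter and store it, exiting on any issue."""
    if name not in KNOWN:
        sys.exit(f"ERRR: unknown parameter {name}")
    want = KNOWN[name]
    if from_cmdline and want is not str:
        # Non-string cmdline values arrive as text and need evaluating.
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            sys.exit(f"ERRR: cannot parse value for {name}: {raw!r}")
    else:
        # str cmdline values and already-typed json values are used directly.
        value = raw
    if type(value) is not want:
        sys.exit(f"ERRR: {name} expects {want.__name__}, got {type(value).__name__}")
    cfg[name] = value


def load_config(cfg, path):
    """Config keys match the cmdline names without the -- prefix."""
    with open(path, encoding="utf-8") as f:
        for key, value in json.load(f).items():
            set_param(cfg, "--" + key, value, from_cmdline=False)


def process_args(argv):
    """Process name/value pairs in order; later entries override earlier ones."""
    cfg = {}
    i = 0
    while i < len(argv):
        name = argv[i]
        if i + 1 >= len(argv):
            sys.exit(f"ERRR: missing value for {name}")
        set_param(cfg, name, argv[i + 1], from_cmdline=True)
        if name == "--config":
            load_config(cfg, cfg["--config"])
        i += 2
    return cfg
```

With this ordering rule, "--config x.json --port 9000" ends with port 9000, while "--port 9000 --config x.json" ends with whatever port the config file sets.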
hanishkvc f221a2c356 SimpleChatTC:SimpleProxy:LoadConfig ProcessArgs cleanup - initial
Now both follow a similar mechanism and do the following

* exit on finding any issue, so that things are in a known
  state from usage perspective, without any confusion/overlook

* check if the cmdlineArgCmd/configCmd being processed is a known
  one or not.

* check value of the cmd is of the expected type

* have a generic flow which can accommodate more cmds in future
  in a simple way
2025-12-04 19:41:39 +05:30
hanishkvc a1b33ecd1c SimpleChatTC:ToolCallResponseTimeout: Allow end user to control
Moved it into Me->tools, so that end user can modify the same as
required from the settings ui.

TODO: Currently, if tc response is got after a tool call timed out
and user submitted default timed out error response, the delayed
actual response when it is got may overwrite any new content in
user query box, this needs to be tackled.
2025-12-04 19:41:39 +05:30
hanishkvc 252fb91e95 SimpleChatTC:WebSearchPlus: Update readme, Wikipedia in allowed
If using wikipedia or so, remember to have sufficient context window
in general wrt the ai engine as well as wrt the handshake / chat
end point.
2025-12-04 19:41:39 +05:30
hanishkvc 221b5a9228 SimpleChatTC:ToolCallWeby: Cleanup the toolweb module flow
Avoid code duplication, by creating helpers for setup and toolcall.

Also send indication of the path that will be used, when checking
for simpleproxy.py server to be running at runtime setup.
2025-12-04 19:41:39 +05:30
hanishkvc de6f370d3b SimpleChatTC:ToolCall:SearchWebText using UrlText
Initial go at implementing a web search tool call, which uses the
existing UrlText support of the bundled simpleproxy.py.

It allows user to control the search engine to use, by allowing
them to set the search engine url template.

The logic comes with search engine url template strings for
duckduckgo, brave, bing and google. With duckduckgo set by default.
2025-12-04 19:41:39 +05:30
hanishkvc 978ee3db1e SimpleChatTC:ToolCalling:Seprat out JSWebWorker and ProxyBasedWeb
Remove the unneeded (belonging to the other file) stuff from tooljs
and toolweb files.

Update tools manager to make use of the new toolweb module
2025-12-04 19:41:39 +05:30
hanishkvc d00e5b341a SimpleChatTC:Duplicate tooljs.mjs to toolweb.mjs
So as to split browser js webworker based tool calls from web
related tool calls.
2025-12-04 19:41:39 +05:30
hanishkvc 8c8ddb1e59 SimpleChatTC:Update and cleanup the readme a bit
include info about the auto option within tools.

use nonwrapped text wrt certain sections, so that the markdown
readme can be viewed properly wrt the structure of the content
in it.
2025-12-04 19:41:39 +05:30
hanishkvc 2192ae6dd3 SimpleChatTC:Cleanup whitespace - github editorconfig checker
Add missing newline to ending bracket line of json config file
2025-12-04 19:41:39 +05:30
hanishkvc f74ce327e5 SimpleChatTC: Cleanup whitespaces
identified by llama.cpp editorconfig check

* convert tab to spaces in json config file
* remove extra space at end of line
2025-12-04 19:41:39 +05:30
hanishkvc fb968347b0 SimpleChatTC:AutoToolCalls: Track and clear related timers
also cleanup the existing toolResponseTimeout timer to be in the
same structure and have similar flow convention.
2025-12-04 19:41:39 +05:30
hanishkvc 45f9db9963 SimpleChatTC:Auto tool calling control to end user
Instead of enforcing always explicit user triggered tool calling,
now user is given the option whether to use explicit user triggered
tool calling or to use auto triggering after showing tool details
for a user specified amount of seconds.

NOTE: The current logic doesnt account for user clicking the buttons
before the autoclick triggers; need to cancel the auto clicks, if
user triggers before autoclick, ie in future.
2025-12-04 19:41:39 +05:30
hanishkvc 9e97880dde SimpleChatTC:SimpleProxy:Cleanup
avoid logically duplicate debug log
2025-12-04 19:41:39 +05:30
hanishkvc 4c1c363504 SimpleChatTC:SimpleProxy: debug dumps to identify funny bing
bing raised a challenge for chrome triggered search requests after
few requests, which were spread few minutes apart, while still
seemingly allowing wget based search to continue (again spread
few minutes apart).

Added a simple helper to trace this, use --debug True to enable
same.
2025-12-04 19:41:39 +05:30
hanishkvc dbb24fec77 SimpleChatTC:ToolResponse: Use browser dom for xml/html safe
Instead of simple concatenating of tool call id, name and result
now use browser's dom logic to create the xml structure used for
now to store these within content field.

This should take care of transforming / escaping any xml special
chars in the result, so that extracting them later for putting
into different fields in the server handshake doesnt have any
problem.
2025-12-04 19:41:39 +05:30
hanishkvc 90d232dc4a SimpleChatTC:SimpleProxy: Update readme wrt mimicking client req
ie during proxying
2025-12-04 19:41:39 +05:30
hanishkvc 74226a0992 SimpleChatTC:ToolCall response relaxed handling
Use DOMParser parseFromString in text/html mode rather than text/xml
as it makes it more relaxed without worrying about special chars
of xml like & et al.
2025-12-04 19:41:39 +05:30
hanishkvc c109da870f SimpleChatTC:SimpleProxy: mimicking got req helps wrt duckduckgo
mimicking got req in generated req helps with duckduckgo also and
not just yahoo.

also update allowed.domains to allow a url generated by ai when
trying to access the bing's news aggregation url
2025-12-04 19:41:39 +05:30
hanishkvc bebf846157 SimpleChatTC:SimpleProxy:Cleanup a bit
The tagging of messages wrt ValidateUrl and UrlReq

Also dump req

Move check for --allowed.domains to ValidateUrl

NOTE: Also with mimicking of user agent et al. from got request to
the generated request, yahoo search/news is returning results now,
instead of the bland error before.
2025-12-04 19:41:39 +05:30
hanishkvc d0b9103176 SimpleChatTC:SimpleProxy:Try mimic real client using got req info
ie include User-Agent, Accept-Language and Accept in the generated
request using equivalent values got in the request being proxied.
2025-12-04 19:41:39 +05:30
hanishkvc e6e0adbe90 SimpleChatTC:SimpleProxy: Some debug prints which give info 2025-12-04 19:41:39 +05:30
hanishkvc 17365ed4b9 SimpleChatTC: Update readme a bit 2025-12-04 19:41:39 +05:30
hanishkvc 840cab0b1c SimpleChatTC:SimpleProxy: Include a sample config file
with allowed domains set to few sites in general to show its use

this includes some sites which allow search to be carried out
through them as well as provide news aggregation
2025-12-04 19:41:39 +05:30
hanishkvc 370326b1ec SimpleChatTC:SimpleProxy: Cleanup domain filtering and general
Had confused between js and python wrt accessing dictionary
contents and its consequence on non existent key. Fixed it.

Use different error ids to distinguish between failure in common
urlreq and the specific urltext and urlraw helpers.
2025-12-04 19:41:39 +05:30
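The JS versus Python difference mentioned in this commit is presumably the following: reading a missing key from a JS object yields undefined, whereas indexing a Python dict with a missing key raises KeyError. A small illustration, with a made-up config dict and hypothetical helper names:

```python
config = {"port": 3128}  # made-up example data


def get_or_none(cfg, key):
    # dict.get returns None (or a supplied default) for a missing key,
    # which is the closest match to the JS "undefined" behaviour.
    return cfg.get(key)


def raises_key_error(cfg, key):
    # Plain indexing raises KeyError when the key is absent.
    try:
        cfg[key]
    except KeyError:
        return True
    return False
```

Checking with "key in cfg" before indexing is the other common way to avoid the exception.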
hanishkvc 71ad609db6 SimpleChatTC:SimpleProxy: AllowedDomains based filtering
Allow fetching from only specified allowed.domains
2025-12-04 19:41:39 +05:30
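One possible shape for such a filter is sketched below. It assumes allowed.domains is a list of plain domain names matched exactly or as a parent domain; the commit does not say how the real check matches (it could equally use patterns), and is_allowed is a hypothetical name.

```python
from typing import Iterable
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: Iterable[str]) -> bool:
    """Return True if the url's host is an allowed domain or a subdomain of one."""
    host = urlparse(url).hostname  # already lowercased; None if no host part
    if host is None:
        return False
    for domain in allowed_domains:
        domain = domain.lower()
        # Require a dot boundary so "evilwikipedia.org" does not match "wikipedia.org".
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

A request whose url fails this check would be rejected before any fetch is attempted.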
hanishkvc 58954c8814 SimpleChatTC:SimpleProxy: Update doc following python convention 2025-12-04 19:41:39 +05:30
hanishkvc 62dcd506e3 SimpleChatTC:SimpleProxy:Allow for loading json based config file
The config entries should be named same as their equivalent cmdline
argument entries but without the -- prefix
2025-12-04 19:41:39 +05:30
hanishkvc aac5213104 SimpleChatTC:Tools: Show available tool names
Dont allow tool names to be changed in settings page
2025-12-04 19:41:39 +05:30
hanishkvc aa8c8040cf SimpleChatTC:Cleanup:ChatProps: apiEP 2025-12-04 19:41:39 +05:30
hanishkvc ad65659a63 SimpleChatTC:Cleanup:ChatProps: bTrimGarbage
Also remove more inner/detailed stuff from show info in not bAll
mode, given that many of the previous differentiated stuff have
been moved into chatProps and in turn shown for now
2025-12-04 19:41:39 +05:30
hanishkvc 82be13aa33 SimpleChatTC:Cleanup:ChatProps: bCompletionInsertStandardRolePrefix 2025-12-04 19:41:39 +05:30
hanishkvc 734f74c908 SimpleChatTC:Cleanup:ChatProps: bCompletionFreshChatAlways
Moved into Me.chatProps
2025-12-04 19:41:39 +05:30
hanishkvc 78ccca056f SimpleChatTC:Cleanup:ChatProps: iRecentUserMsgCnt
Update Me class

Update show settings

Update show props info

Update readme
2025-12-04 19:41:39 +05:30
hanishkvc 7409b29862 SimpleChatTC:Cleanup:ChatProps: Move bStream into it 2025-12-04 19:41:39 +05:30
hanishkvc a54fa472dd SimpleChatTC:ShowObjPropsEdit:Any depth trapping of ui setup - t2
Fix up the oversights wrt any depth trapping flow

Remember to start the propWithTree being checked/trapped with :
to indicate the root of the prop hierarchy and also use : as sep
between the elements of the props hierarchy tree

Also had forgotten about the goof up possible with using the in operator
in a condition statement to check whether an array contains an entry of
interest in JS, fixed it now.
2025-12-04 19:41:39 +05:30
hanishkvc 8d7eb68712 SimpleChatTC:ShowObjPropsEdit:Any depth trapping of ui setup
Maintain the current property hierarchy to its root over recursive
calls.

Allow callers to specify the props to be trapped using the prop
hierarchy.

Pass the prop hierarchy to the fTrapper.

This should allow one to trap any prop wrt its editing ui setup,
irrespective of whether it is a prop of the main object passed,
or a member of a child prop of the main object passed or so ...

Update the setting up of ChatHistoryInCtxt and ApiEndPoint to follow
the new semantic/flow.
2025-12-04 19:41:39 +05:30
hanishkvc b19e754322 SimpleChatTC:Cleanup:Rename func arg to match semantic better 2025-12-04 19:41:39 +05:30
hanishkvc 03426f0276 SimpleChatTC:Cleanup:EditObjProps: rename vars followingConvention
Part 1 - add el prefix wrt the element handle related vars
2025-12-04 19:41:39 +05:30
hanishkvc 3e490cefc5 SimpleChatTC:Cleanup: Move bTools and toolFetchProxyUrl into tools
Also update the readme wrt same and related
2025-12-04 19:41:39 +05:30
hanishkvc 303af1800e SimpleChatTC:ShowInfo:Clean up layout of showing of props data
Also ensure when switching between sessions, the full set of props
info is shown.
2025-12-04 19:41:39 +05:30
hanishkvc 0e21d67e8a SimpleChatTC:ShowInfo: Allow showing minimal info set, if needed 2025-12-04 19:41:39 +05:30
hanishkvc fc26e47222 SimpleChatTC:ShowObjPropsInfo: Use sections to indicate relations
Also create a top level div wrt whole. And allow class to be
specified for the same as well as the top level legend, optionally
2025-12-04 19:41:39 +05:30
hanishkvc 24ba85026e SimpleChatTC:ShowInfo: Make logic recursive, avoid JSON.stringify 2025-12-04 19:41:39 +05:30
hanishkvc 34b2beea1a SimpleChatTC:ShowInfo: Create and use common automated info show
Also fetch info from ai-server, and place path and ctx size into
current Me instance and include in show info.
2025-12-04 19:41:39 +05:30
hanishkvc 2a94cb3786 SimpleChatTC:Fetch:Proxy URL rename and in settings 2025-12-04 19:41:39 +05:30
hanishkvc 98d43fac7f SimpleChatTC:WebFetch: Try confirm simpleproxy before enabling 2025-12-04 19:41:39 +05:30
hanishkvc a6aa563a18 SimpleChatTC:WebFetch: Check for the specific proxy paths 2025-12-04 19:41:39 +05:30
hanishkvc 80dbbb89a5 SimpleChatTC:WebFetch: Enable only if something at proxyUrl
NOTE: not a robust check, just tries to establish a http connection
for now and doesnt really check if it is the specific proxy srvr
of interest or not.
2025-12-04 19:41:39 +05:30
hanishkvc fa0a6919cb SimpleChatTC: Update/Cleanup readme 2025-12-04 19:41:39 +05:30
hanishkvc 8ca77e455a SimpleChatTC:NonStreaming: Update oneshot mode wrt tool calls
Take care of the possibility of content not being there as well as
take care of retrieving the tool calls for further processing.

With this tool calls should work in non streaming mode also
2025-12-04 19:41:39 +05:30
hanishkvc 3e0cf2a2df SimpleChatTC:ObjPropsEdit: Obj within Obj aware fRefiner
Use same to set a placeholder for Authorization entry in headers
2025-12-04 19:41:39 +05:30
hanishkvc f874c69983 SimpleChatTC:UiShowObjPropsEdit allow refining 2025-12-04 19:41:39 +05:30
hanishkvc 6253c717b3 SimpleChatTC:Trappable UiShowObjPropsEdit for custom handling
Use it to handle apiEP and iRecentUserMsgCnt in more user friendly
way, where they get a selection to choose from.
2025-12-04 19:41:39 +05:30
hanishkvc 3718a39c06 SimpleChatTC:Use generic obj props edit for settings in general
Bring more user controllable properties into this new settings ui
2025-12-04 19:41:39 +05:30
hanishkvc 756b128539 SimpleChatTC:UI:ObjPropEdits handle objects, use for gMe 2025-12-04 19:41:39 +05:30
hanishkvc b771e42dc1 SimpleChatTC:UI:Common helper to edit obj members of few types
Make the previously relatively generic flow wrt apiRequestOptions
settings into a fully generic reusable by others flow.

Rather had stopped short of it, when previously moved onto other
things at that time.
2025-12-04 19:41:39 +05:30
hanishkvc 6e5b532313 SimpleChatTC:UI: el_get/el_set to avoid warnings 2025-12-04 19:41:39 +05:30
hanishkvc 04644761e6 SimpleChatTC:Tools: Pick proxy server address from document[gMe] 2025-12-04 19:41:39 +05:30
hanishkvc 9b55775e8a SimpleChatTC:WebFetch: Update readme to reflect the new names 2025-12-04 19:41:39 +05:30
hanishkvc 42f91df261 SimpleChatTC:WebFetch:Trap Non Ok status and raise error
So that the same error path is used for logical error wrt http req
also, without needing a different path for it.

Dont forget to return the resp text/json/..., so that the contents
are passed along the promise then chain
2025-12-04 19:41:39 +05:30
hanishkvc d04c8cd38d SimpleChatTC:SimpleProxy: Ensure CORS related headers sent always
Add a new send headers common helper and use the same wrt the
overridden send_error as well as do_OPTIONS

This ensures that if there is any error during proxy operations,
the send_error propagates to the fetch from any browser properly
without browser intercepting it with a CORS error
2025-12-04 19:41:39 +05:30
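A rough sketch of the idea, built on Python's http.server: a common helper adds the CORS headers, and both the overridden send_error and do_OPTIONS call it. The class name, the helper name send_headers_common, the exact header values and the placeholder do_GET are assumptions for illustration, not the actual simpleproxy.py code.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ProxyHandler(BaseHTTPRequestHandler):
    """Illustrative handler, not the real simpleproxy.py class."""

    def send_headers_common(self):
        # Hypothetical common helper: CORS headers attached to every response.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Methods", "GET, OPTIONS")
        self.send_header("Access-Control-Allow-Headers", "Authorization, Content-Type")

    def send_error(self, code, message=None, explain=None):
        # Overridden so that error responses also carry the CORS headers;
        # otherwise the browser reports a CORS failure instead of the error.
        body = (message or "error").encode("utf-8")
        self.send_response(code, message)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.send_headers_common()
        self.end_headers()
        self.wfile.write(body)

    def do_OPTIONS(self):
        # Answer the browser's preflight request.
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.send_headers_common()
        self.end_headers()

    def do_GET(self):
        # Placeholder: every path is treated as unknown in this sketch.
        self.send_error(404, "unknown path")

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Create a server bound to localhost; port 0 picks a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), ProxyHandler)
```

Calling make_server(3128).serve_forever() would run it; a browser fetch against any path then sees the 404 with its message rather than an opaque CORS rejection.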
hanishkvc c2fb0cd241 SimpleChatTC:WebFetch: Cleanup the names and descriptions a bit 2025-12-04 19:41:39 +05:30
hanishkvc 73a144c44d SimpleChatTC:SimpleProxy:HtmlParser more generic and flexible
also now track header, footer and nav so that they arent captured
2025-12-04 19:41:39 +05:30
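The approach described here and in the related UrlText commits (capture body text, skip scripts, styles, header, footer and nav) can be sketched with Python's html.parser as below. The class and function names are hypothetical, and the real helper may track tags differently.

```python
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collect text inside <body>, ignoring a set of unwanted blocks."""

    SKIP_TAGS = {"script", "style", "header", "footer", "nav"}

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0  # > 0 while inside any skipped block
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in self.SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and self.skip_depth == 0:
            self.parts.append(data)


def extract_text(html: str) -> str:
    """Return the captured body text of a (possibly malformed) html string."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Using a depth counter rather than a single flag keeps the skipping correct when skipped blocks are nested, for example a nav inside a header.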
hanishkvc cd226e8dae SimpleChatTC: Update readme wrt web fetch and related simple proxy 2025-12-04 19:41:39 +05:30
hanishkvc 8b950fd348 SimpleChatTC:WebFetch:UrlEnc url2fetch b4Passing toProxy asQuery
Ensures that if the url being requested has any query strings in
it then things dont get messed up, when the url to get, including its
query, is extracted from the proxy request's query string
2025-12-04 19:41:39 +05:30
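The problem and the fix can be shown as a round trip. The commit itself concerns the browser side (where encodeURIComponent would be the usual tool); the sketch below replays the same idea in Python for illustration. The path urltext comes from the commit subjects, while the query key url and both function names are assumptions.

```python
from urllib.parse import parse_qs, quote, urlsplit


def build_proxy_request(proxy_url: str, target_url: str) -> str:
    """Percent-encode the whole target url before using it as a query value."""
    return f"{proxy_url}/urltext?url={quote(target_url, safe='')}"


def extract_target(request_url: str) -> str:
    """Proxy side: recover the target url from the request's query string."""
    query = urlsplit(request_url).query
    return parse_qs(query)["url"][0]
```

Without the encoding, a target such as https://example.com/search?q=llama&page=2 loses everything after the first & because page=2 is parsed as a separate parameter of the proxy request.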
hanishkvc 9ff2c596ee SimpleChatTC:SimpleProxy:Options just in case 2025-12-04 19:41:39 +05:30
hanishkvc 9c7d6cc0e4 SimpleChatTC:WebUrlText:Update name and desc to see if preferred 2025-12-04 19:41:39 +05:30
hanishkvc bf63b8f45a SimpleChatTC:SimpleProxy:UrlText: Slightly better trimming
First identify lines which have only whitespace and replace them
with lines with only newline char in them.

Next strip out adjacent lines, if they have only newlines
2025-12-04 19:41:39 +05:30
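The two steps above can be sketched with two regular expression passes. This is an assumption-laden illustration: the commit does not say whether a run of blank lines is reduced to one blank line or removed entirely, so this version keeps a single blank line, and strip_blank_runs is a hypothetical name.

```python
import re


def strip_blank_runs(text: str) -> str:
    """Normalise whitespace-only lines, then collapse runs of blank lines."""
    # Step 1: a line holding only whitespace becomes an empty line,
    # leaving just its newline char behind.
    text = re.sub(r"^[ \t\r\f\v]+$", "", text, flags=re.MULTILINE)
    # Step 2: three or more newlines in a row (two or more blank lines)
    # shrink to two newlines, that is, a single blank line.
    return re.sub(r"\n{3,}", "\n\n", text)
```

Doing the whitespace normalisation first is what lets the second pass treat lines with stray spaces or tabs as part of the same blank run.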
hanishkvc 266e825c68 SimpleChatTC:SimpleProxy:UrlText: Try strip empty lines somewhat 2025-12-04 19:41:39 +05:30
hanishkvc 82ab08ec1a SimpleChatTC:WebUrl FetchStrip through simple proxy 2025-12-04 19:41:39 +05:30
hanishkvc b46bbc542a SimpleChatTC:SimpleProxy:UrlText: Avoid style blocks also 2025-12-04 19:41:39 +05:30
hanishkvc f493e1af59 SimpleChatTC:SimpleProxy:UrlText: Capture body except for scripts 2025-12-04 19:41:39 +05:30
hanishkvc 45b05df21b SimpleChatTC:SimpleProxy: Switch to html.parser
As html can be malformed, xml ElementTree XMLParser cant handle
the same properly, so switch to the HtmlParser helper class that is
provided by python and try extend it.

Currently a minimal skeleton to just start it out, which captures
only the body contents.
2025-12-04 19:41:39 +05:30
hanishkvc d5f4183f7c SimpleChatTC:SimpleProxy: ElementTree, No _UrlopenRet
As _UrlopenRet not exposed for use outside urllib, so decode and
encode the data.

Add skeleton to try get the html/xml tree top elements
2025-12-04 19:41:39 +05:30
hanishkvc 6537559360 SimpleChatTC:SimpleProxy:Common UrlReq helper for UrlRaw & UrlText
Declare the result of UrlReq as a DataClass, so that one doesnt
goof up wrt updating and accessing members.

Duplicate UrlRaw into UrlText, need to add Text extracting from
html next for UrlText
2025-12-04 19:41:39 +05:30
hanishkvc e600e62e86 SimpleChatTC:SimpleProxy: Cleanup few messages 2025-12-04 19:41:39 +05:30
hanishkvc c25b1968cd SimpleChatTC:WebFetch: Update to use internal SimpleProxy.py 2025-12-04 19:41:39 +05:30
hanishkvc 3bab4de0e8 SimpleChatTC:SimpleProxy:UrlRaw: Fixup basic oversight wrt 1st go 2025-12-04 19:41:39 +05:30
hanishkvc 73ef9f7d46 SimpleChatTC:SimpleProxy:implement handle_urlraw
A basic go at it
2025-12-04 19:41:39 +05:30
hanishkvc 73054a5832 SimpleChatTC:SimpleProxy: Extract and check path, route to handlers 2025-12-04 19:41:39 +05:30
hanishkvc c99788e290 SimpleChatTC:SimpleProxy: Cleanup for basic run 2025-12-04 19:41:39 +05:30
hanishkvc 80fd065993 SimpleChatTC:SimpleProxy: Start server, Show requested path 2025-12-04 19:41:39 +05:30
hanishkvc 05c0ade8be SimpleChatTC:SimpleProxy:Process args --port 2025-12-04 19:41:39 +05:30
hanishkvc 8fc74ef923 SimpleChatTC:WebFetchThroughProxy:Initial go creating request 2025-12-04 19:41:39 +05:30
hanishkvc 09ce19a95a SimpleChatTC: update readme wrt promise related trapping 2025-12-04 19:41:39 +05:30
hanishkvc f0a3886d1e SimpleChatTC:Ensure fetch's promise chain is also trapped
Dont forget to map members of got entity from fetch to things
from saved original promise, because remember what is got is a promise.

also

add some comments around certain decisions and needed exploration
2025-12-04 19:41:39 +05:30
hanishkvc 77d3e43cb4 SimpleChatTC: Allow await in generated code that will be evald 2025-12-04 19:41:39 +05:30
hanishkvc 92e5b2133e SimpleChatTC:Promises: trap normal fetch (dont care await or not) 2025-12-04 19:41:39 +05:30
hanishkvc 0241b7b469 SimpleChatTC:TrapPromise: log the trapping
also possible refinement wrt trapping, if needed, added as comment

all or allSettled to use or not is the question.

whether to wait for a round trip through the related event loop or
not is also a question.
2025-12-04 19:41:39 +05:30
hanishkvc 3d661793ef SimpleChatTC:ChatMessageEx: 1st go at trying to track promises 2025-12-04 19:41:39 +05:30
hanishkvc 7dbbc46390 SimpleChatTC:ChatMessageEx: Better tool result extractor 2025-12-04 19:41:39 +05:30
hanishkvc 61b70bfa5d SimpleChatTC:Readme: Updated wrt new relativelyProper toolCallsHS
Also update the sliding window context size to last 9 chat messages
so that there is a sufficiently large context for multi turn tool
calls based adjusting by ai and user, without needing to go full
hog, which has the issue of overflowing the currently set context
window wrt the loaded ai model.
2025-12-04 19:41:39 +05:30
hanishkvc 152deb5d5a SimpleChatTC:ChatMessageEx:While at it also ns_delete
these common helpers avoid needing ignore tagging to ts-check, in
places where valid constructs have been used which go beyond strict
structured js handling that is tried to be achieved using it, but
are still valid and legal.
2025-12-04 19:41:39 +05:30
hanishkvc cc65a2f7a3 SimpleChatTC:ChatMessageEx: Build tool role result fully
Expand the xml format id, name and content in content field of
tool result into appropriate fields in the tool result message sent
to the genai/llm engine on the server.
2025-12-04 19:41:39 +05:30
hanishkvc ebc7f88b53 SimpleChatTC:Propagate toolcall id through tool call chain
Use HTMLElement's dataset to maintain tool call id along with
the element which maintains the toolname.

Pass it along to the tools manager and in turn to the actual tool
calls, and through them to the web worker handling the tool call
related code, which in turn returns it as part of the obj
used to return the tool call result.

Embed the tool call id, function name and function result into
the content field of chat message as an xml structure

Also make use of tool role to send back the tool call result.
Do note that currently the id, name and content are all embedded
into the content field of the tool role message sent to the
ai engine on the server.

NOTE: Use the user query entry area for showing tool call result
in the above mentioned xml form, as well as for user to enter
their own queries. Based on the presence of the xml format data at
the beginning, the logic will treat it as a tool result and if not
then as a normal user query.

The css has been updated to help show tool results/msgs in a
lightyellow background
2025-12-04 19:41:39 +05:30
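
The embedding and detection described in the commit above could look roughly like the following. This is a minimal sketch, not the code from the commit: the tag names (`tool_response`, `id`, `name`, `content`), the helper names and the `tool_call_id` field are assumptions made for illustration.

```javascript
// Hypothetical helpers; tag and field names are assumed, not taken from the commit.

/** Wrap a tool call result into an xml-like block for the user entry area. */
function buildToolResponse(id, name, content) {
    return `<tool_response>\n<id>${id}</id>\n<name>${name}</name>\n<content>${content}</content>\n</tool_response>`;
}

/**
 * Decide whether the text in the user entry area is a tool result or a
 * normal user query, based on the xml block being present at the beginning.
 * Note: no escaping is done, so a literal </content> inside the result
 * would end the match early.
 */
function parseUserEntry(text) {
    const re = /^<tool_response>\s*<id>([\s\S]*?)<\/id>\s*<name>([\s\S]*?)<\/name>\s*<content>([\s\S]*?)<\/content>\s*<\/tool_response>/;
    const m = text.match(re);
    if (!m) return { role: "user", content: text };
    return { role: "tool", tool_call_id: m[1], name: m[2], content: m[3] };
}

const toolMsg = parseUserEntry(buildToolResponse("call_1", "simple_calculator", "3\n"));
const userMsg = parseUserEntry("what is 1+2?");
console.log(toolMsg.role, userMsg.role);
```

Keeping the result as editable text in the entry area is what lets the user inspect or correct it before it is submitted.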
hanishkvc 2bb3d747e6 SimpleChatTC:ChatMessageEx: send tool_calls, only if needed 2025-12-04 19:41:39 +05:30
hanishkvc 2ef201ff8d SimpleChatTC:Load allows old and new ChatMessage(Ex) formats 2025-12-04 19:41:39 +05:30
hanishkvc 475858a4b3 SimpleChatTC:ChatMessageEx: Cleanup remaining stuff
wrt ChatMessageEx related required flow as well as avoid warnings
2025-12-04 19:41:39 +05:30
hanishkvc 963b9f4661 SimpleChatTC:ChatMessageEx: Recent chat users upd
Users of recent_chat updated to work with ChatMessageEx

As part of same recent_chat_ns also added, for the case where the
array of chat messages can be passed as is ie in the chat mode,
provided it has only the network handshake representation of the
messages.
2025-12-04 19:41:39 +05:30
hanishkvc 4d9e3d1566 SimpleChatTC:ChatMessageEx: Upd Add, rm sysPromptAtBeginOnly hlpr
Simplify Add semantic by expecting any validation of stuff before
adding to be done by the callers of Add and not by add itself.

Also update it to expect ChatMessageEx object

Update all users of add to follow the new syntax and semantic.

Remove the old and unused AddSysPromptOnlyAtBegin helper
2025-12-04 19:41:39 +05:30
hanishkvc c65c1d5f0f SimpleChatTC:ChatMessageEx: RecentChat, GetSystemLatest
GetSystemLatest and its users updated wrt ChatMessageEx.

RecentChat updated wrt ChatMessageEx. Also now irrespective of
whether full history is being retrieved or only a subset, both
cases refer to the ChatMessageEx instances in SimpleChat.xchat
without creating new instances of anything.
2025-12-04 19:41:39 +05:30
hanishkvc 343d414dd3 SimpleChatTC:ChatMessageEx: ods load, system prompt related
these have been updated to work with ChatMessageEx to an extent
2025-12-04 19:41:39 +05:30
hanishkvc abbf927557 SimpleChatTC:ChatMessageEx: add update_oneshot
response_extract logic moved directly into ChatMessageEx as update
oneshot, with suitable adjustments. Inturn use the same directly.
2025-12-04 19:41:39 +05:30
hanishkvc 361f6968d1 SimpleChatTC:ChatMessage: remove ResponseExtractStream
Use the equivalent update_stream directly added to ChatMessageEx.

update_stream is also more generic to some extent and also directly
implemented by the ChatMessageEx class.
2025-12-04 19:41:39 +05:30
hanishkvc 32dd63ee1d SimpleChatTC:ChatMessageEx:cleanup, HasToolCalls, ContentEquiv
Update HasToolCalls and ContentEquiv to work with new structure
2025-12-04 19:41:39 +05:30
hanishkvc aa229a1f99 SimpleChatTC:ChatMessageEx: UpdateStream logic
Rename ChatMessage to ChatMessageEx.

Add typedefs for NSToolCall and NSChatMessage, they represent the
way the corresponding data is structured in network hs.

Add logic to build the ChatMessageEx from data got over network in
streaming mode.
2025-12-04 19:41:39 +05:30
hanishkvc 2c29c2d589 SimpleChatTC:ChatMessage: AssistantResponse into chat message class
Modify the constructor, newFrom and clear towards this goal.
2025-12-04 19:41:39 +05:30
hanishkvc 37faf8611a SimpleChatTC: update descs to indicate use of web workers
ie wrt the tool calls provided.
2025-12-04 19:41:39 +05:30
hanishkvc c2112618c0 SimpleChatTC: Update readme.md wrt latest updates. 2k maxtokens 2025-12-04 19:41:39 +05:30
hanishkvc 1789f5f1e2 SimpleChatTC: Increase the sliding window context to Last4 QA
As the tool calling, if enabled, will need access to last few
user query and ai assistant responses (which will also include
in them the tool call requests and the corresponding results),
so that the model can build answers based on its tool call reqs
and got responses, and also given that most of the models these
days have sufficiently large context windows, so the sliding
window context implemented by SimpleChat logic has been increased
by default to include the last 4 queries and their responses, roughly.
2025-12-04 19:41:39 +05:30
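
A sliding window over the chat history, as described in the commit above, can be sketched as below. The function name, the message shape and the choice to count individual messages (rather than query/response pairs) are assumptions for illustration; the actual SimpleChat logic may differ.

```javascript
// Hypothetical sketch of a sliding window over chat messages.

/**
 * Return the latest system prompt (if any) followed by the last `lastN`
 * non-system messages. A negative `lastN` means full history.
 */
function recentChat(messages, lastN) {
    if (lastN < 0) return messages.slice();
    const system = messages.filter((m) => m.role === "system").slice(-1);
    const rest = messages.filter((m) => m.role !== "system");
    // Compute the start index explicitly, since slice(-0) would return everything
    const start = Math.max(0, rest.length - lastN);
    return [...system, ...rest.slice(start)];
}

const history = [
    { role: "system", content: "You are helpful" },
    { role: "user", content: "q1" },
    { role: "assistant", content: "a1" },
    { role: "user", content: "q2" },
    { role: "assistant", content: "a2" },
    { role: "user", content: "q3" },
];
const windowed = recentChat(history, 2);
console.log(windowed.map((m) => m.content));
```

A larger window keeps the tool call requests and their results visible to the model, at the cost of using up more of the model's context.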
hanishkvc a0f6762fda SimpleChatTC: Web worker flow initial go cleanup
Had forgotten to specify type as module wrt web worker, in order
to allow it to import the toolsconsole module.

Had forgotten to maintain the id of the timeout handler, which is
needed to clear/stop the timeout handler from triggering, if tool
call response is got well in time.

As I am currently reverting the console redirection at end of
handling a tool call code in the web worker message handler, I
need to setup the redirection each time. Also I had forgotten
to clear the console.log capture data space, before a new tool
call code is executed, this is also fixed by this change.

TODO: Need to abort the tool call code execution in the web worker
if possible in future, if the client / browser side times out
waiting for tool call response, ie if the tool call code is taking
up too much time.
2025-12-04 19:41:39 +05:30
hanishkvc 148ec1c41a SimpleChatTC: Get ready for decoupled tool call response
tools manager/module

* setup the web worker that will help execute the tool call related
  codes in a js environment that is isolated from the browsers main
  js environment

* pass the web worker to the tool call providers, for them to use

* dont wait for the result from the tool call, as it will be got
  later asynchronously through a message

* allow users of the tools manager to register a call back, which
  will be called when ever a message is got from the web worker
  containing response wrt previously requested tool call execution.

simplechat

* decouple toolcall response handling and toolcall requesting logic

* setup a timeout to take back control if tool call takes up too
  much time. In turn help alert the ai model, that the tool call
  took up too much time and so was aborted, by placing an appropriate
  tagged tool response into user query area.

* register a call back that will be called when response is got
  asynchronously wrt any requested tool calls.
  In turn take care of updating the user query area with response
  got wrt the tool call, along with tool response tag around it.
2025-12-04 19:41:39 +05:30
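
The decoupled request/response flow with a timeout, as listed in the commit above, might be sketched as follows. The class and method names are hypothetical, and the web worker is replaced by a plain callback so that the sketch runs outside a browser; ignoring late results after a timeout is a design choice of this sketch, not something the commit states.

```javascript
// Hypothetical sketch: tool call requests whose results arrive later via a message.
class ToolCallTracker {
    constructor(onResponse, timeoutMs = 1000) {
        this.onResponse = onResponse; // registered call back
        this.timeoutMs = timeoutMs;
        this.pending = new Map(); // tool call id -> timeout handle
    }

    /** Send the request and return at once, without waiting for the result. */
    request(id, name, sendToWorker) {
        const timer = setTimeout(() => {
            this.pending.delete(id);
            this.onResponse(id, name, "Tool call took too long, aborting");
        }, this.timeoutMs);
        // Keep the handle, so that the timeout can be cleared on a timely response
        this.pending.set(id, timer);
        sendToWorker({ id, name });
    }

    /** Called whenever a message with a tool call result is got from the worker. */
    handleWorkerMessage({ id, name, data }) {
        const timer = this.pending.get(id);
        if (timer === undefined) return; // unknown or already timed out, ignore
        clearTimeout(timer);
        this.pending.delete(id);
        this.onResponse(id, name, data);
    }
}

const responses = [];
const tracker = new ToolCallTracker((id, name, data) => responses.push({ id, name, data }));
// Stand-in for the web worker: it replies immediately in this sketch
tracker.request("call1", "simple_calculator", (req) =>
    tracker.handleWorkerMessage({ ...req, data: "3\n" }));
console.log(responses);
```

In a browser the `sendToWorker` callback would post a message to the worker, and `handleWorkerMessage` would be driven by the worker's message event.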
hanishkvc 2a8bd1c9e7 SimpleChatTC: Actual tool call implementations simplified
These no longer need to worry about

* setting up the console.log related redirection to capture
  the generated outputs, nor about
* setting up a dynamic function for executing the needed
  tool call related code

The web worker setup to help run tool calls in a relatively
isolated environment independent of the main browser env,
takes care of these.

One needs to only worry about getting the handle to the
web worker to use and in turn pass the needed code wrt the
tool call to it.
2025-12-04 19:41:39 +05:30
hanishkvc 14d67f6c3c SimpleChatTC: Pass around structured objects wrt tool worker
The request for code to run as well as the resultant response data
both need to follow a structured object convention, so that it is
easy to map a request and the corresponding response to some extent.
2025-12-04 19:41:39 +05:30
hanishkvc 510c65c721 SimpleChatTC: Initial skeleton of a simple toolsworker 2025-12-04 19:41:39 +05:30
hanishkvc a6bccf934e SimpleChatTC:ToolsConsole:Cleanup a bit, add basic set of notes
Try ensure as well as verify that original console.log is saved
and not overwritten. Throw an exception if things seem off wrt
same.

Also ensure to add a newline at end of console.log messages
2025-12-04 19:41:39 +05:30
hanishkvc 2701cb3a1e SimpleChatTC: Move console.log trapping into its own module
So that it can be used from different modules, if required.
2025-12-04 19:41:39 +05:30
hanishkvc 45d8a00738 SimpleChatTC: Update readme wrt --jinja argument and bit more 2025-12-04 19:41:39 +05:30
hanishkvc a8c8176d09 SimpleChatTC: Tool Calling UI elements use up horizontal space 2025-12-04 19:41:39 +05:30
hanishkvc 1e5b638beb SimpleChatTC: Update readme with bit more details, Cleaner UI
Also avoid showing Tool calling UI elements, when not needed to
be shown.
2025-12-04 19:41:39 +05:30
hanishkvc bfe789706e SimpleChatTC: Let user trigger tool call, instead of automatic
Instead of automatically calling any requested tool by the GenAi
/ llm, that is from the tail end of the handle user submit btn
click,

Now if the GenAi/LLM has requested any tool to be called, then
enable the Tool Run related UI elements and fill them with the
tool name and tool args.

In turn the user can verify if they are ok with the tool being
called and the arguments being passed to it. Rather they can
even fix any errors in the tool usage like the arithmetic expr
to calculate that is being passed to simple_calculator or the
javascript code being passed to run_javascript_function_code

If user is ok with the tool call being requested, then trigger
the same.

The results if any will be automatically placed into the user
query text area.

User can cross verify if they are ok with the result and/or
modify it suitably if required and in turn submit the same to
the GenAi/LLM.
2025-12-04 19:41:39 +05:30
hanishkvc 1fc44c971d SimpleChatTC: Add ui elements for tool call verify and trigger
Instead of automatically calling the requested tool with supplied
arguments, rather allow user to verify things before triggering the
tool.

NOTE: User already provided control over tool_response before
submitting it to the ai assistant.
2025-12-04 19:41:38 +05:30
hanishkvc fd662b4b0b SimpleChatTC: ToolCall hs info in normal assistant-user chat flow
Also as part of same, wrap the request details in the assistant
block using a similar tagging format as the tool_response in user
block.
2025-12-04 19:41:38 +05:30
hanishkvc 30aa2f4c6b SimpleChatTC: Update the readme.md wrt tool calling a bit 2025-12-04 19:41:38 +05:30
hanishkvc 63b5c6d76d SimpleChatTC: Cleanup the function description a bit
to better describe how it will be run, so that genai/llm while
creating the code to run, will hopefully take care of any nuances
required.
2025-12-04 19:41:38 +05:30
hanishkvc a80da9a652 SimpleChatTC: Pass toolname to the tool handler
So that when tool handler writes the result to the tc_switch, it
can make use of the same, to write to the right location.

NOTE: This also fixes the issue of my forgetting to rename the
key in js_run wrt writing of result.
2025-12-04 19:41:38 +05:30
hanishkvc f7284a8b89 SimpleChatTC: Move tool calling to tools, try trap async failures
Move tool calling logic into tools module.

Try trap async promise failures by awaiting results of tool calling
and putting full thing in an outer try catch. Have forgotten the
nitty gritties of JS flow, this might help, need to check.
2025-12-04 19:41:38 +05:30
hanishkvc ef85ed41d4 SimpleChatTC: Clarify some type definitions to avoid warnings
ie in vs code with ts-check
2025-12-04 19:41:38 +05:30
hanishkvc a408e5e017 SimpleChatTC: Clearer description of toolcalls execution env
Should hopefully ensure that the GenAi/LLM will generate appropriate
code/expression as the argument to pass to these tool calls, to
some extent.
2025-12-04 19:41:38 +05:30
hanishkvc b4776da670 SimpleChatTC: Trap any exception raised during tool call
and inform the GenAi/LLM about the same
2025-12-04 19:41:38 +05:30
hanishkvc 17c5daa52c SimpleChatTC: Cleanup initial/1st go toolcall flow
As output generated by any tool/function call is currently placed
into the TextArea provided for End user (for their queries), because
the GenAi (engine/LLM) may be expecting the tool response to be
sent as a user role data with tool_response tag surrounding the
results from the tool call. So also now at the end of submit btn
click handling, the end user input text area is not cleared, if
there was a tool call handled, for above reasons.

Also, given that running a simple arithmetic expression in itself
doesn't generate any output, wrap it in a console.log, to
help capture the result using the console.log trapping flow that
is already setup.
2025-12-04 19:41:38 +05:30
hanishkvc 301910c3a1 SimpleChatTC: Implement a simple toolcall handling flow
Checks for toolname to be defined or not in the GenAi's response

If toolname is set, then check if a corresponding tool/func exists,
and if so call the same by passing it the GenAi provided toolargs
as a object.

Inturn the text generated by the tool/func is captured and put
into the user input entry text box, with tool_response tag around
it.
2025-12-04 19:41:38 +05:30
hanishkvc fa63a86c71 SimpleChatTC:tooljs: Trap console.log and store in new result key
The implementations of javascript and simple_calculator now use
provided helpers to trap console.log messages when they execute
the code / expression provided by GenAi and inturn store the
captured log messages in the newly added result key in tc_switch

This should help trap the output generated by the provided code
or expression as the case maybe and inturn return the same to the
GenAi, for its further processing.
2025-12-04 19:41:38 +05:30
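
The console.log trapping described in the commit above can be illustrated with a small sketch. The helper names here are hypothetical, and using `new Function` to run the supplied code is an assumption about how such code might be executed, not a statement about the actual tooljs implementation.

```javascript
// Hypothetical sketch of trapping console.log output while running supplied code.
const originalConsoleLog = console.log;
let captured = "";

/** Redirect console.log into a string buffer, clearing any earlier capture. */
function consoleRedirect() {
    captured = "";
    console.log = (...args) => {
        captured += args.map(String).join(" ") + "\n";
    };
}

/** Restore the saved original console.log. */
function consoleRevert() {
    console.log = originalConsoleLog;
}

/** Run the given code string and return whatever it logged. */
function runTrapped(code) {
    consoleRedirect();
    try {
        new Function(code)();
    } catch (err) {
        // Report the failure as part of the result, so the caller can see it
        captured += `Error: ${err.message}\n`;
    } finally {
        consoleRevert();
    }
    return captured;
}

const result = runTrapped("console.log(6 * 7)");
console.log(JSON.stringify(result));
```

Wrapping a bare arithmetic expression in `console.log(...)` before running it is what makes its value show up in the captured output.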
hanishkvc 6d43011003 SimpleChatTC: Saner/Robust AssistantResponse content_equiv
Previously if content was empty, it would have always sent the
toolcall info related version even if there was no toolcall info
in it. Fixed now to return empty string, if both content and
toolname are empty.
2025-12-04 19:41:38 +05:30
hanishkvc 383c19c99b SimpleChatTC: twins wrt streamed response handling
As there could be failure wrt getting the response from the ai
server some where in between a long response spread over multiple
parts, the logic uses the latestResponse to cache the response
as it is being received. However once the full response is got,
one needs to transfer it to a new instance of AssistantResponse
class, so that latestResponse can be cleared, while the new
instance can be used in other locations in the flow as needed.

Achieve the same now.
2025-12-04 19:41:38 +05:30
hanishkvc 53f85d09be SimpleChatTC: AssistantResponse everywhere initial go
Switch oneshot handler to use AssistantResponse, in turn currently
only handle the normal content in the response.

TODO: If any tool_calls in the oneshot response, it is currently
not handled.

Inturn switch the generic/toplevel handle response logic to use
AssistantResponse class, given that both oneshot and the
multipart/streaming flows use/return it.

Inturn add trimmedContent member to AssistantResponse class and
make the generic handle response logic to save the trimmed content
into this. Update users of trimmed to work with this structure.
2025-12-04 19:41:38 +05:30
hanishkvc 3f3aa8d043 SimpleChatTC: AssistantResponse class initial go
Make latestResponse into a new class based type instance wrt
ai assistant response, which is what it represents.

Move clearing, appending fields' values and getting assistant's
response info (irrespective of a content or toolcall response)
into this new class and inturn use the same.
2025-12-04 19:41:38 +05:30
hanishkvc 5a26831ad2 SimpleChatTC: Show toolcall being generated by ai - Temp 2025-12-04 19:41:38 +05:30
hanishkvc e73bc4550b SimpleChatTC: Avoid null content, Fix oversight wrt finish_reason
I was wrongly checking for finish_reason to be non null, before
trying to extract the genai content/toolcalls, have fixed this
oversight with the new flow in progress.

I had added few debug logs to identify the above issue, need to
remove them later. Note: given that debug logs are disabled by
replacing the debug function during this program's initialisation,
which I had forgotten about, I didnt get the debug messages and
had to scratch my head a bit, before realising this and the other
issue ;)

Also either when I had originally implemented simplechat 1+ years
back, or later due to changes on the server end, the streaming
flow sends an initial null wrt the content, where it only sets the
role. This was not handled in my flow on the client side, so a
null was getting prepended to the chat messages/responses from the
server. This has been fixed now in the new generic flow.
2025-12-04 19:41:38 +05:30
hanishkvc 63430dc9f7 SimpleChatTC: Extract streamed field - assume only 1f at any time
Update response_extract_stream to check for which field is being
currently streamed ie is it normal content or tool call func name
or tool call func args and then return the field name and extracted
value.

Previously it was always assumed that only normal content will be
returned.

Currently it is assumed that the server will only stream one of the
3 supported fields at any time and not more than one of them at the
same time.

TODO: Have to also add logic to extract the reasoning field later,
ie wrt gen ai models which give out their thinking.

Have updated append_response to expect both the key and the value
wrt the latestResponse object, which will be manipulated.

Previously it was always assumed that content is what will be got
and inturn appended.
2025-12-04 19:41:38 +05:30
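
The field identification described in the commit above could be sketched as below. The chunk layout (`choices[0].delta` with `content` or `tool_calls[0].function`) assumes an OpenAI-compatible streaming response, and the function names and the sample argument payload are illustrative only.

```javascript
// Hypothetical sketch: find which one of the 3 supported fields a chunk carries.

/** Return { key, value } for the field being streamed, or null if none. */
function extractStreamedField(chunk) {
    const delta = chunk?.choices?.[0]?.delta;
    if (!delta) return null;
    // Normal content; a null content (role only chunk) is skipped
    if (typeof delta.content === "string") {
        return { key: "content", value: delta.content };
    }
    const fn = delta.tool_calls?.[0]?.function;
    if (fn) {
        if (typeof fn.name === "string") return { key: "toolname", value: fn.name };
        if (typeof fn.arguments === "string") return { key: "toolargs", value: fn.arguments };
    }
    return null;
}

/** Append the extracted value to the matching key of the response object. */
function appendResponse(latestResponse, field) {
    if (field) latestResponse[field.key] += field.value;
}

const latestResponse = { content: "", toolname: "", toolargs: "" };
const chunks = [
    { choices: [{ delta: { role: "assistant", content: null } }] },
    { choices: [{ delta: { tool_calls: [{ function: { name: "simple_calculator" } }] } }] },
    { choices: [{ delta: { tool_calls: [{ function: { arguments: '{"expr":' } }] } }] },
    { choices: [{ delta: { tool_calls: [{ function: { arguments: '"1+2"}' } }] } }] },
];
for (const chunk of chunks) appendResponse(latestResponse, extractStreamedField(chunk));
console.log(latestResponse);
```

Because at most one field is returned per chunk, a server that streams two fields in the same chunk would lose one of them under this sketch, which matches the stated assumption.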
hanishkvc bfe7ef69fa SimpleChatTC: Skeleton to handle diff fields when streaming
Changed latestResponse type to an object instead of a string.
Inturn it contains entries for content, toolname and toolargs.

Added a custom clear logic due to the same and used it to replace
the previously simple assigning of empty string to latestResponse.

For now in all places where latestResponse is used, I have replaced
with latestResponse.content.

Next need to handle identifying the field being streamed and inturn
append to it. Also need to add logic to call tool, when tool_call
triggered by genai.
2025-12-04 19:41:38 +05:30
hanishkvc 32f5278e8c SimpleChatTC: use tcpdump to dbg hs; check if ai aware of tools 2025-12-04 19:41:38 +05:30
hanishkvc 6167cdff9f SimpleChatTC: Bring in the tools meta into the main flow 2025-12-04 19:41:38 +05:30
hanishkvc 46f0304105 SimpleChatTC: More generic tooljs, SimpCalc, some main skeleton
Make tooljs structure and flow more generic

Add a simple_calculator tool/function call logic

Add initial skeleton wrt the main tools.mjs file.
2025-12-04 19:41:38 +05:30
hanishkvc f1aa0ee778 SimpleChatTC: Add skeleton for a javascript interpreter tool call
Define the meta that needs to be passed to the GenAi Engine.

Define the logic that implements the tool call, if called.

Implement the flow/structure such that a single tool calls
implementation file can define multiple tool calls.
2025-12-04 19:41:38 +05:30
hanishkvc 48c9f07982 SimpleChatTC: Update test shell script a bit
Enable streaming by default, to check the handshake before going
on to change the code, given that I haven't looked into this for more
than a year now and have been busy with totally different stuff.

Also updated the user messages used for testing a bit
2025-12-04 19:41:38 +05:30
hanishkvc 9341c507f2 SimpleChatTools: Add boolean to allow user control of tools use 2025-12-04 19:41:38 +05:30
hanishkvc 4282a4277a SimpleChatToolCalling: Test/Explore srvr initial hs using cmdline 2025-12-04 19:41:38 +05:30
Adrien Gallouët ef75a89fdb
build : move _WIN32_WINNT definition to headers (#17736)
Previously, cmake was forcing `_WIN32_WINNT=0x0A00` for MinGW builds.
This caused "macro redefined" warnings with toolchains that define the version.

This also removes the `GGML_WIN_VER` variable as it is no longer needed.

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-12-04 07:04:02 +01:00
Piotr Wilkin (ilintar) c6d1a00aa7
Add a couple of file types to the text section (#17670)
* Add a couple of file types to the text section

* Format + regenerate index

* Rebuild after rebase
2025-12-03 21:45:06 +01:00
Aleksander Grygier e9f9483464
Use OpenAI-compatible `/v1/models` endpoint by default (#17689)
* refactor: Data fetching via stores

* chore: update webui build output

* refactor: Use OpenAI compat `/v1/models` endpoint by default to list models

* chore: update webui build output

* chore: update webui build output
2025-12-03 20:49:09 +01:00
Andika Wasisto 41c5e02f42
webui: Fix zero pasteLongTextToFileLen to disable conversion being overridden (#17445)
* webui: Fix zero pasteLongTextToFileLen to disable conversion being overridden

Zero pasteLongTextToFileLen should disable the conversion, but it was
overwritten with 2500.

* Apply suggestions from code review

* Update webui build
2025-12-03 20:45:17 +01:00
Pascal e7c2cf1356
server: add router multi-model tests (#17704) (#17722)
* llama-server: add router multi-model tests (#17704)

Add 4 test cases for model router:
- test_router_unload_model: explicit model unloading
- test_router_models_max_evicts_lru: LRU eviction with --models-max
- test_router_no_models_autoload: --no-models-autoload flag behavior
- test_router_api_key_required: API key authentication

Tests use async model loading with polling and graceful skip when
insufficient models available for eviction testing.

utils.py changes:
- Add models_max, models_dir, no_models_autoload attributes to ServerProcess
- Handle JSONDecodeError for non-JSON error responses (fallback to text)

* llama-server: update test models to new HF repos

* add offline

* llama-server: fix router LRU eviction test and add preloading

Fix eviction test: load 2 models first, verify state, then load
3rd to trigger eviction. Previous logic loaded all 3 at once,
causing first model to be evicted before verification could occur.

Add module fixture to preload models via ServerPreset.load_all()
and mark test presets as offline to use cached models

* llama-server: fix split model download on Windows

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
2025-12-03 15:10:37 +01:00
Adrien Gallouët 1257491047
server : fix bad fmt, size() is a size_type (#17735)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-12-03 15:47:22 +02:00
Aldehir Rojas 0a8026e768
common : introduce composable PEG parser combinators for chat parsing (#17136)
* common : implement parser combinators to simplify chat parsing

* add virtual destructor to parser_base

* fix memory leak from circular references of rules

* implement gbnf grammar building

* remove unused private variable

* create a base visitor and implement id assignment as a visitor

* fix const ref for grammar builder

* clean up types, friend classes, and class declarations

* remove builder usage from until_parser

* Use a counter class to help assign rule ids

* cache everything

* add short description for each parser

* create a type for the root parser

* implement repetition parser

* Make optional, one_or_more, and zero_or_more subclasses of repetition

* improve context constructor

* improve until parsing and add benchmarks

* remove cached() pattern, cache in parser_base with specialized parsing functions for each parser

* improve json parsing performance to better match legacy parsing

* fix const auto * it for windows

* move id assignment to classes instead of using a visitor

* create named rules in the command r7b example

* use '.' for any in GBNF

* fix parens around choices in gbnf grammar

* add convenience operators to turn strings to literals

* add free-form operators for const char * to simplify defining literals

* simplify test case parser

* implement semantic actions

* remove groups in favor of actions and a scratchpad

* add built in actions for common operations

* add actions to command r7b example

* use std::default_searcher for platforms that don't have bm

* improve parser_type handling and add cast helper

* add partial result type to better control when to run actions

* fix bug in until()

* run actions on partial results by default

* use common_chat_msg for result

* add qwen3 example wip

* trash partial idea and simplify

* move action arguments to a struct

* implement aho-corasick matcher for until_parser and to build exclusion grammars

* use std::string for input, since std::string_view is incompatible with std::regex

* Refactor tests

* improve qwen3 example

* implement sax-style parsing and refactor

* fix json string in test

* rename classes to use common_chat_ prefix

* remove is_ suffix from functions

* rename from id_counter to just counter

* Final refactored tests

* Fix executable name and editorconfig-checker

* Third time's the charm...

* add trigger parser to begin lazy grammar rule generation

* working lazy grammar

* refactor json rules now that we check for reachability

* reduce pointer usage

* print out grammars in example

* rename to chat-peg-parser* and common_chat_peg_parser*

* Revert unrelated changes

* New macros for CMakeLists to enable multi-file compilations

* starting unicode support

* add unicode support to char_parser

* use unparsed args as additional sources

* Refactor tests to new harness

* Fix CMakeLists

* fix rate calculation

* add unicode tests

* fix trailing whitespace and line endings

skip-checks: true

* Helpers + rewrite qwen3 with helpers

* Fix whitespace

* extract unicode functions to separate file

* refactor parse unicode function

* fix compiler error

* improve construction of sequence/choice parsers

* be less clever

* add make_parser helper function

* expand usage of make_parser, alias common_chat_msg_peg_parser_builder to builder in source

* lower bench iterations

* add unicode support to until_parser

* add unicode support to json_string_parser

* clean up unicode tests

* reduce unicode details to match src/unicode.cpp

* simplify even further

* remove unused functions

* fix type

* reformat char class parsing

* clean up json string parser

* clean up + fix diagnostics

* reorder includes

* compact builder functions

* replace action_parser with capture_parser, rename env to semantics

* rename env to semantics

* clean up common_chat_parse_context

* move type() to below constant

* use default constructor for common_chat_peg_parser

* make all operators functions for consistency

* fix compilation errors in test-optional.cpp

* simplify result values

* rename json_string_unquoted to json_string_content

* Move helper to separate class, add separate explicit and helper classes

* Whitespace

* Change + to append()

* Reformat

* Add extra helpers, tests and Minimax example

* Add some extra optional debugging prints + real example of how to use them

* fix bug in repetitions when min_count = 0 reports failures

* dump rule in debug

* fix token accumulation and assert parsing never fails

* indent debug by depth

* use LOG_* in tests so logs sync up with test logs

* - Add selective testing
- Refactor all messaging to use LOG_ERR
- Fix lack of argument / tool name capturing
- Temporary fix for double event capture

* refactor rule() and introduce ref()

* clean up visitor

* clean up indirection in root parser w.r.t rules

* store shared ptr directly in parser classes

* replace aho-corasick automation with a simple trie

* Reset prev for qwen3 helper example variant

* refactor to use value semantics with std::variant/std::visit

* simplify trie_matcher result

* fix linting issues

* add annotations to rules

* revert test workaround

* implement serializing the parser

* remove redundant parsers

* remove tests

* gbnf generation fixes

* remove LOG_* use in tests

* update gbnf tests to test entire grammar

* clean up gbnf generation and fix a few bugs

* fix typo in test output

* remove implicit conversion rules

* improve test output

* rename trie_matcher to trie

* simplify trie to just know if a node is the end of a word

* remove common_chat_ prefix and ensure a common_peg_ prefix to all types

* rename chat-peg-parser -> peg-parser

* promote chat-peg-parser-helper to chat-peg-parser

* checkpoint

* use a static_assert to ensure we handle every branch

* inline trivial peg parser builders

* use json strings for now

* implement basic and native chat peg parser builders/extractors

* resolve refs to their rules

* remove packrat caching (for now)

* update tests

* compare parsers with incremental input

* benchmark both complete and incremental parsing

* add raw string generation from json schema

* add support for string schemas in gbnf generation

* fix qwen example to include \n

* tidy up example

* rename extractor to mapper

* rename ast_arena to ast

* place basic tests into one

* use gbnf_format_literal from json-schema-to-grammar

* integrate parser with common/chat and server

* clean up schema and serialization

* add json-schema raw string tests

* clean up json creation and remove capture parser

* trim spaces from reasoning and content

* clean up redundant rules and comments

* rename input_is_complete to is_partial to match rest of project

* simplify json rules

* remove extraneous file

* remove comment

* implement += and |= operators

* add comments to qwen3 implementation

* reorder arguments to common_chat_peg_parse

* remove commented outdated tests

* add explicit copy constructor

* fix operators and constness

* wip: update test-chat for qwen3-coder

* bring json parser closer to json-schema-to-grammar rules

* trim trailing space for most things

* fix qwen3 coder rules w.r.t. trailing spaces

* group rules

* do not trim trailing space from string args

* tweak spacing of qwen3 grammar

* update qwen3-coder tests

* qwen3-coder small fixes

* place parser in common_chat_syntax to simplify invocation

* use std::set to collect rules to keep order predictable for tests

* initialize parser to make certain platforms happy

* revert back to std::unordered_set, sort rule names at the end instead

* uncomment rest of chat tests

* define explicit default constructor

* improve arena init and server integration

* fix chat test

* add json_member()

* add a comprehensive native example

* clean up example qwen test and add response_format example to native test

* make build_peg_parser accept std::function instead of template

* change peg parser parameters into const ref

* push tool call on tool open for constructed parser

* add parsing documentation

* clean up some comments

* add json schema support to qwen3-coder

* add id initializer in tests

* remove grammar debug line from qwen3-coder

* refactor qwen3-coder to use sequence over operators

* only call common_chat_peg_parse if appropriate format

* simplify qwen3-coder space handling

* revert qwen3-coder implementation

* revert json-schema-to-grammar changes

* remove unnecessary forward declaration

* small adjustment to until_parser

* rename C/C++ files to use dashes

* codeowners : add aldehir to peg-parser and related files

---------

Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>
2025-12-03 12:45:32 +02:00
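The bullet "revert back to std::unordered_set, sort rule names at the end instead" describes a common way to keep output predictable for tests: deduplicate in a hash set, then sort once. A minimal sketch of that idea follows; `collect_rule_names` is a hypothetical name for illustration, not a function from the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the commit): gather unique rule names in a
// hash set, whose iteration order is unspecified, then sort once at the end
// so the resulting order is deterministic.
inline std::vector<std::string> collect_rule_names(const std::vector<std::string> & refs) {
    std::unordered_set<std::string> seen(refs.begin(), refs.end());
    std::vector<std::string> names(seen.begin(), seen.end());
    std::sort(names.begin(), names.end());
    return names;
}
```

Sorting once at the end keeps insertion cheap while collecting and still gives tests a stable order to compare against.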
Pascal 5ceed62421
server: fix duplicate HTTP headers in multiple models mode (#17698)
* llama-server: fix duplicate HTTP headers in multiple models mode (#17693)

* llama-server: address review feedback from ngxson

- restrict scope of header after std::move
- simplify header check (remove unordered_set)
2025-12-03 10:28:43 +01:00
Xuan-Son Nguyen 13628d8bdb
server: add --media-path for local media files (#17697)
* server: add --media-path for local media files

* remove unused fn
2025-12-02 22:49:20 +01:00
Xuan-Son Nguyen a96283adc4
mtmd: fix --no-warmup (#17695) 2025-12-02 22:48:08 +01:00
Chad Voegele c4357dcc35
Server: Change Invalid Schema from Server Error (500) to User Error (400) (#17572)
* Make invalid schema a user error (400)

* Move invalid_argument exception handler to ex_wrapper

* Fix test

* Simplify test back to original pattern
2025-12-02 17:33:50 +01:00
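The change above treats an invalid schema as a caller mistake (400) rather than a server failure (500), with the `invalid_argument` handling placed in a shared wrapper. A rough sketch of that mapping is shown here; `run_with_status` is a hypothetical stand-in and does not reproduce the server's actual `ex_wrapper`.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical wrapper (an assumption, not the server's real code): runs a
// handler and maps exceptions to an HTTP status code.
// std::invalid_argument is treated as the caller's fault (400); any other
// exception is treated as a server fault (500).
inline int run_with_status(const std::function<void()> & handler) {
    try {
        handler();
        return 200;
    } catch (const std::invalid_argument &) {
        return 400;
    } catch (const std::exception &) {
        return 500;
    } catch (...) {
        return 500;
    }
}
```

The more specific `catch` has to come first, since `std::invalid_argument` derives from `std::exception`.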
Xuan-Son Nguyen 5d6bd842ea
server: remove default "gpt-3.5-turbo" model name (#17668)
* server: remove default "gpt-3.5-turbo" model name

* do not reflect back model name from request

* fix test
2025-12-02 11:38:57 +01:00
senhtry fd3abe849e
server: fixing naming conflict res_error in server-models.cpp (#17679) 2025-12-02 11:18:39 +01:00
Xuan-Son Nguyen 682e6658bb
server: explicitly set exec path when create new instance (#17669)
* Revert "rm unused fn"

This reverts commit f2dbe9c087.

* server: explicitly set exec path when create new instance

* put back TODO

* only call get_server_exec_path() once

* add fallback logic
2025-12-02 10:25:11 +01:00
Aleksander Grygier cee92af553
Add context info to server error (#17663)
* fix: Add context info to server error

* chore: update webui build output
2025-12-02 09:20:57 +01:00
Xuan-Son Nguyen ecf74a8417
mtmd: add mtmd_context_params::warmup option (#17652)
* mtmd: add mtmd_context_params::warmup option

* reuse the common_params::warmup
2025-12-01 21:32:25 +01:00
Xuan-Son Nguyen ec18edfcba
server: introduce API for serving / loading / unloading multiple models (#17470)
* server: add model management and proxy

* fix compile error

* does this fix windows?

* fix windows build

* use subprocess.h, better logging

* add test

* fix windows

* feat: Model/Router server architecture WIP

* more stable

* fix unsafe pointer

* also allow terminate loading model

* add is_active()

* refactor: Architecture improvements

* tmp apply upstream fix

* address most problems

* address thread safety issue

* address review comment

* add docs (first version)

* address review comment

* feat: Improved UX for model information, modality interactions etc

* chore: update webui build output

* refactor: Use only the message data `model` property for displaying model used info

* chore: update webui build output

* add --models-dir param

* feat: New Model Selection UX WIP

* chore: update webui build output

* feat: Add auto-mic setting

* feat: Attachments UX improvements

* implement LRU

* remove default model path

* better --models-dir

* add env for args

* address review comments

* fix compile

* refactor: Chat Form Submit component

* add endpoint docs

* Merge remote-tracking branch 'webui/allozaur/server_model_management_v1_2' into xsn/server_model_maagement_v1_2

Co-authored-by: Aleksander <aleksander.grygier@gmail.com>

* feat: Add copy to clipboard to model name in model info dialog

* feat: Model unavailable UI state for model selector

* feat: Chat Form Actions UI logic improvements

* feat: Auto-select model from last assistant response

* chore: update webui build output

* expose args and exit_code in API

* add note

* support extra_args on loading model

* allow reusing args if auto_load

* typo docs

* oai-compat /models endpoint

* cleaner

* address review comments

* feat: Use `model` property for displaying the `repo/model-name` naming format

* refactor: Attachments data

* chore: update webui build output

* refactor: Enum imports

* feat: Improve Model Selector responsiveness

* chore: update webui build output

* refactor: Cleanup

* refactor: Cleanup

* refactor: Formatters

* chore: update webui build output

* refactor: Copy To Clipboard Icon component

* chore: update webui build output

* refactor: Cleanup

* chore: update webui build output

* refactor: UI badges

* chore: update webui build output

* refactor: Cleanup

* refactor: Cleanup

* chore: update webui build output

* add --models-allow-extra-args for security

* nits

* add stdin_file

* fix merge

* fix: Retrieve lost setting after resolving merge conflict

* refactor: DatabaseStore -> DatabaseService

* refactor: Database, Conversations & Chat services + stores architecture improvements (WIP)

* refactor: Remove redundant settings

* refactor: Multi-model business logic WIP

* chore: update webui build output

* feat: Switching models logic for ChatForm or when regenerating messages + modality detection logic

* chore: update webui build output

* fix: Add `untrack` inside chat processing info data logic to prevent infinite effect

* fix: Regenerate

* feat: Remove redundant settings + rearrange

* fix: Audio attachments

* refactor: Icons

* chore: update webui build output

* feat: Model management and selection features WIP

* chore: update webui build output

* refactor: Improve server properties management

* refactor: Icons

* chore: update webui build output

* feat: Improve model loading/unloading status updates

* chore: update webui build output

* refactor: Improve API header management via utility functions

* remove support for extra args

* set hf_repo/docker_repo as model alias when possible

* refactor: Remove ConversationsService

* refactor: Chat requests abort handling

* refactor: Server store

* tmp webui build

* refactor: Model modality handling

* chore: update webui build output

* refactor: Processing state reactivity

* fix: UI

* refactor: Services/Stores syntax + logic improvements

Refactors components to access stores directly instead of using exported getter functions.

This change centralizes store access and logic, simplifying component code and improving maintainability by reducing the number of exported functions and promoting direct store interaction.

Removes exported getter functions from `chat.svelte.ts`, `conversations.svelte.ts`, `models.svelte.ts` and `settings.svelte.ts`.

* refactor: Architecture cleanup

* feat: Improve statistic badges

* feat: Condition available models based on modality + better model loading strategy & UX

* docs: Architecture documentation

* feat: Update logic for PDF as Image

* add TODO for http client

* refactor: Enhance model info and attachment handling

* chore: update webui build output

* refactor: Components naming

* chore: update webui build output

* refactor: Cleanup

* refactor: DRY `getAttachmentDisplayItems` function + fix UI

* chore: update webui build output

* fix: Modality detection improvement for text-based PDF attachments

* refactor: Cleanup

* docs: Add info comment

* refactor: Cleanup

* re

* refactor: Cleanup

* refactor: Cleanup

* feat: Attachment logic & UI improvements

* refactor: Constants

* feat: Improve UI sidebar background color

* chore: update webui build output

* refactor: Utils imports + move types to `app.d.ts`

* test: Fix Storybook mocks

* chore: update webui build output

* test: Update Chat Form UI tests

* refactor: Tooltip Provider from core layout

* refactor: Tests to separate location

* decouple server_models from server_routes

* test: Move demo test to tests/server

* refactor: Remove redundant method

* chore: update webui build output

* also route anthropic endpoints

* fix duplicated arg

* fix invalid ptr to shutdown_handler

* server : minor

* rm unused fn

* add ?autoload=true|false query param

* refactor: Remove redundant code

* docs: Update README documentations + architecture & data flow diagrams

* fix: Disable autoload on calling server props for the model

* chore: update webui build output

* fix ubuntu build

* fix: Model status reactivity

* fix: Modality detection for MODEL mode

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-12-01 19:41:04 +01:00
Xuan-Son Nguyen 7733409734
common: improve verbosity level definitions (#17630)
* common: improve verbosity level definitions

* string_format

* update autogen docs
2025-12-01 14:38:13 +01:00
Tarek Dakhran 2ba719519d
model: LFM2-VL fixes (#17577)
* Adjust to pytorch

* Add antialiasing upscale

* Increase number of patches to 1024

* Handle default marker insertion for LFM2

* Switch to flag

* Reformat

* Cuda implementation of antialias kernel

* Change placement in ops.cpp

* consistent float literals

* Pad only for LFM2

* Address PR feedback

* Rollback default marker placement changes

* Fallback to CPU implementation for antialias implementation of upscale
2025-11-30 21:57:31 +01:00
Xuan-Son Nguyen 7f8ef50cce
clip: fix nb calculation for qwen3-vl (#17594) 2025-11-30 15:33:55 +01:00
Xuan-Son Nguyen 3c136b21a3
cli: add migration warning (#17620) 2025-11-30 15:32:43 +01:00
Xuan-Son Nguyen ab49f094d2
server: move server-context to its own cpp|h (#17595)
* git mv

* add server-context.h

* add server-context.h

* clean up headers

* cont : cleanup

* also expose server_response_reader (to be used by CLI)

* fix windows build

* decouple server_routes and server_http

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-29 22:04:44 +01:00
Haiyue Wang 8c32d9d96d
server: explicitly set the function name in lambda (#17538)
As [1] explains, the actual debug message looks like:
	"res    operator(): operator() : queue result stop"

With the name set explicitly, the message is easier to use for debugging:
	"res    operator(): recv : queue result stop"

The remaining "operator()" on the left is generated by 'RES_DBG() ... __func__'.

[1]: https://clang.llvm.org/extra/clang-tidy/checks/bugprone/lambda-function-name.html

Signed-off-by: Haiyue Wang <haiyuewa@163.com>
2025-11-29 18:43:29 +01:00
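The behavior this commit works around can be reproduced in isolation: inside a lambda, `__func__` names the closure's call operator rather than the enclosing function. The sketch below assumes GCC or Clang (other compilers may spell the name differently); `FUNC_NAME` and the three functions are hypothetical and only stand in for a logging macro such as `RES_DBG`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a logging macro that records __func__.
#define FUNC_NAME() std::string(__func__)

// In a named function, __func__ is the function's own name.
inline std::string name_in_function() {
    return FUNC_NAME();
}

// In a lambda, __func__ refers to the closure's call operator, so the
// enclosing function's name is lost (GCC and Clang report "operator()").
inline std::string name_in_lambda() {
    auto recv = []() { return FUNC_NAME(); };
    return recv();
}

// Passing the intended name explicitly avoids relying on __func__.
inline std::string name_explicit() {
    auto recv = []() { return std::string("recv"); };
    return recv();
}
```

Spelling the name out costs a string literal per call site, but the log line then identifies the code path instead of a generic operator name.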
Igor Smirnov 0874693b44
common : fix json schema with '\' in literals (#17307)
* Fix json schema with '\' in literals

* Add "literal string with escapes" test
2025-11-29 17:06:32 +01:00
o7si 3ce7a65c2f
server: fix: /metrics endpoint returning JSON-escaped Prometheus format (#17386)
* fix: /metrics endpoint returning JSON-escaped Prometheus format

* mod: remove string overload from ok() method
2025-11-28 19:14:00 +01:00
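To illustrate the kind of problem the title describes: if a line-oriented text body is sent through a JSON string encoder, newlines become the two characters `\n` and the whole payload is wrapped in quotes, which a Prometheus scraper cannot parse. The sketch below is an assumption-laden illustration; `json_quote`, `response`, `metrics_as_json`, `metrics_as_text` and the metric names are all hypothetical and not taken from the server code.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal JSON string encoder (handles only ", \ and newline).
inline std::string json_quote(const std::string & s) {
    std::string out = "\"";
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            default:   out += c;      break;
        }
    }
    out += '"';
    return out;
}

// Hypothetical response type: a content type plus the bytes sent on the wire.
struct response {
    std::string content_type;
    std::string body;
};

// Problematic shape: the text is serialized as a JSON string, so every
// newline is escaped and the payload is wrapped in quotes.
inline response metrics_as_json(const std::string & text) {
    return {"application/json", json_quote(text)};
}

// Intended shape: the text exposition format is sent verbatim.
inline response metrics_as_text(const std::string & text) {
    return {"text/plain; version=0.0.4", text};
}
```

With an input such as `"requests_total 3\nerrors_total 0\n"`, the JSON variant produces a single quoted line, while the text variant keeps one metric per line.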
Fredrik Hultin ddf9f94389
server : add Anthropic Messages API support (#17570)
* server : add Anthropic Messages API support

* remove -@pytest.mark.slow from tool calling/jinja tests

* server : remove unused code and slow/skip on test_anthropic_vision_base64_with_multimodal_model in test_anthropic_api.py

* server : removed redundant n field logic in anthropic_params_from_json

* server : use single error object instead of error_array in streaming response handler for /v1/chat/completions and use unordered_set instead of set in to_json_anthropic_stream()

* server : refactor Anthropic API to use OAI conversion

* make sure basic test always go first

* clean up

* clean up api key check, add test

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-11-28 12:57:04 +01:00
Xuan-Son Nguyen e509411cf1
server: enable jinja by default, update docs (#17524)
* server: enable jinja by default, update docs

* fix tests
2025-11-27 01:02:50 +01:00
Han Qingzhe 1d594c295c
clip: (minicpmv) fix resampler kq_scale (#17516)
* debug:"solve minicpmv precision problem"

* “debug minicpmv”

* Apply suggestion from @ngxson

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
2025-11-26 21:44:07 +01:00
Pascal b1846f1c8e
webui: add rehype plugin to restore HTML in Markdown table cells (#17477)
* webui: add rehype plugin to restore HTML in Markdown table cells

The remark/rehype pipeline neutralizes inline HTML as literal text
(remarkLiteralHtml) so that XML/HTML snippets in LLM responses display
as-is instead of being rendered. This causes <br> and <ul> markup in
table cells to show as plain text.

This plugin traverses the HAST post-conversion, parses whitelisted HTML
patterns (<br>, <ul><li>) from text nodes, and replaces them with actual
HAST element nodes. For lists, adjacent siblings must be combined first
as the AST fragmentation breaks pattern matching.

Strict validation rejects malformed markup, keeping it as raw text.

* chore: update webui build output
2025-11-25 08:01:02 +01:00
Xuan-Son Nguyen b8372eecd9
server: split server.cpp code into server/common/task/queue (#17362)
* add server-task, server-common

* add server-queue

* rm redundant includes

* move enum stop_type to server-task

* server : headers cleanup

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-24 14:41:53 +01:00
Pascal 0c7220db56
webui: minor settings reorganization and add disable autoscroll option (#17452)
* webui: added a dedicated 'Display' settings section that groups visualization options

* webui: added a Display setting to toggle automatic chat scrolling

* chore: update webui build output
2025-11-23 18:42:00 +01:00
Aleksander Grygier 4c91f2633f
Improved file naming & structure for UI components (#17405)
* refactor: Component files naming & structure

* chore: update webui build output

* refactor: Dialog titles + components naming

* chore: update webui build output

* refactor: Imports

* chore: update webui build output
2025-11-20 14:07:31 +01:00
Georgi Gerganov 196f5083ef
common : more accurate sampling timing (#17382)
* common : more accurate sampling timing

* eval-callback : minor fixes

* cont : add time_meas impl

* cont : fix log msg [no ci]

* cont : fix multiple definitions of time_meas

* llama-cli : exclude chat template init from time measurement

* cont : print percentage of unaccounted time

* cont : do not reset timings
2025-11-20 13:40:10 +02:00
Aleksander Grygier 99c53d6558
webui: Add a "Continue" Action for Assistant Message (#16971)
* feat: Add "Continue" action for assistant messages

* feat: Continuation logic & prompt improvements

* chore: update webui build output

* feat: Improve logic for continuing the assistant message

* chore: update webui build output

* chore: Linting

* chore: update webui build output

* fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with assistant message

* chore: update webui build output

* feat: Enable "Continue" button based on config & non-reasoning model type

* chore: update webui build output

* chore: Update packages with `npm audit fix`

* fix: Remove redundant error

* chore: update webui build output

* chore: Update `.gitignore`

* fix: Add missing change

* feat: Add auto-resizing for Edit Assistant/User Message textareas

* chore: update webui build output
2025-11-19 14:39:50 +01:00
o7si 97cb3fd5ae
fix: resolve undefined variable 'svr' compilation error (#17348) 2025-11-18 10:10:47 +02:00
Xuan-Son Nguyen 0de8878c96
server: split HTTP into its own interface (#17216)
* server: split HTTP into its own interface

* move server-http and httplib to its own file

* add the remaining endpoints

* fix exception/error handling

* renaming

* missing header

* fix missing windows header

* fix error responses from http layer

* fix slot save/restore handler

* fix case where only one stream chunk is returned

* add NOMINMAX

* do not call sink.write on empty data

* use safe_json_to_str for SSE

* clean up

* add some comments

* improve usage of next()

* bring back the "server is listening on" message

* more generic handler

* add req.headers

* move the chat template print to init()

* add req.path

* cont : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-17 22:05:44 +01:00
Georgi Gerganov 5b2093becc
server : handle context overflow during decode (#17267)
* server : handle context overflow during decode

* server : minor refactor
2025-11-16 09:23:37 +02:00
Aleksander Grygier 22e1ce2f81
webui: Fix clickability around chat processing statistics UI (#17278)
* fix: Better pointer events handling in chat processing info elements

* chore: update webui build output
2025-11-15 22:41:41 +01:00
Pascal 1411d9275a
webui: add OAI-Compat Harmony tool-call streaming visualization and persistence in chat UI (#16618)
* webui: add OAI-Compat Harmony tool-call live streaming visualization and persistence in chat UI

- Purely visual and diagnostic change, no effect on model context, prompt
  construction, or inference behavior

- Captured assistant tool call payloads during streaming and non-streaming
  completions, and persisted them in chat state and storage for downstream use

- Exposed parsed tool call labels beneath the assistant's model info line
  with graceful fallback when parsing fails

- Added tool call badges beneath assistant responses that expose JSON tooltips
  and copy their payloads when clicked, matching the existing model badge styling

- Added a user-facing setting to toggle tool call visibility to the Developer
  settings section directly under the model selector option

* webui: remove scroll listener causing unnecessary layout updates (model selector)

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* chore: npm run format & update webui build output

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
2025-11-15 21:09:32 +01:00
Ankur Verma c7b7db0445
mtmd-cli: Avoid logging to stdout for model loading messages in mtmd-cli (#17277) 2025-11-15 12:41:16 +01:00
Xuan-Son Nguyen 9b17d74ab7
mtmd: add mtmd_log_set (#17268) 2025-11-14 15:56:19 +01:00
Georgi Gerganov d396b43748
server : fix "can batch with" bug (#17263) 2025-11-14 14:03:45 +02:00
Aleksander Grygier f1bad23f88
Better UX for handling multiple attachments in WebUI (#17246) 2025-11-14 01:19:08 +01:00
Xuan-Son Nguyen c4abcb2457
server: fixing naming conflict res_error (#17243) 2025-11-13 20:53:47 +01:00
Aleksander Grygier 8e878f0cb4
Update packages + upgrade Storybook to v10 (#17201)
* chore: Update packages + upgrade Storybook to v10

* fix: Increase timeout for UI tests
2025-11-12 19:01:48 +01:00
Xuan-Son Nguyen 00c94083b3
server: (refactor) implement generator-based API for task results (#17174)
* server: (refactor) implement generator-based API for task results

* improve

* moving some code

* fix "Response ended prematurely"

* add sink.done before return false

* rm redundant check

* rm unused var

* rename generator --> reader
2025-11-12 18:50:52 +01:00
Xuan-Son Nguyen ee8dd5c658
server: move res_error/res_ok to static function (#17167) 2025-11-12 14:17:24 +01:00
Adrien Gallouët 78010a0d52
cmake : move OpenSSL linking to vendor/cpp-httplib (#17177)
* cmake : move OpenSSL linking to vendor/cpp-httplib

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* bring back httplib 0.27.0

* add -DLLAMA_HTTPLIB

* update cmake config for visionos

---------

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-11-12 12:32:50 +01:00
Xuan-Son Nguyen 1d45b4228f
vendor: split httplib to cpp/h files (#17150)
* vendor: split httplib to cpp/h files

* move defines

* include httplib if curl is not used

* add TODO

* fix build ios

* fix build visionos instead
2025-11-11 13:32:58 +01:00
Mike Abbott 4a5b8aff40
cmake : add version to all shared object files (#17091)
When compiling llama.cpp in Yocto, it fails QA checks because the generated so files aren't versioned.  This applies a version to all generated so files, allowing the package to build without errors.
2025-11-11 13:19:50 +02:00
Nicolas B. Pierron d2d626938a
Install rpc-server when GGML_RPC is ON. (#17149) 2025-11-11 10:53:59 +00:00
Gabe Goodhart 0c74f32632
memory: Hybrid context shift (#17009)
* feat(memory): Only fail partial erasure of recurrent tail

The recurrent state is always assumed to be the state as of the last update
from the final token in the sequence. When doing a partial erasure, if the
range does not include the final token, the erasure can be considered a
success since any memory used for the sequence prior to the final token
(which is no memory) has been successfully removed.

There is one potential case that this doesn't address which is the pruning
of cache to remove sensitive data from the context. This wouldn't work for
attention cache partial removal (in the middle) either since the KV state
is linearly-dependent and states in later sequence positions would still be
based on the state from the sensitive data, even if that data is no longer
cached, so I don't think this is relevant, but it is worth noting that the
semantics of this change for a partial erasure in the middle of the cache
are essentially "my context is already compressed" and not "all trace of
the removed tokens has been removed."

https://github.com/ggml-org/llama.cpp/issues/16768
Branch: HybridContextShift-16768

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(main): Check the output of seq_rm for prefix matching

This prefix matching is explicitly attempting to remove the tokens at the
end of the sequence that don't match. This is the operation that can't be
performed on a recurrent cache due to the state being updated in place, so
if this removal fails, we need to clear the whole cache.

https://github.com/ggml-org/llama.cpp/issues/16768
Branch: HybridContextShift-16768

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix(memory): Fix condition for partial erasure failure if p0 > pos

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

Co-authored-by: compilade <git@compilade.net>

* style: Fix extra parens

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* fix(main.cpp): Set n_matching_session_tokens to 0 on cache clear

https://github.com/ggml-org/llama.cpp/issues/16768
Branch: HybridContextShift-16768

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

---------

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: compilade <git@compilade.net>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-10 17:14:23 +02:00
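The erasure semantics described in this commit can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the actual memory implementation: a recurrent sequence is reduced to the position of its final token, and `erase_result`, `recurrent_erase` and `reuse_prefix` are hypothetical names.

```cpp
#include <cassert>
#include <climits>

enum class erase_result { nothing_to_do, cleared, rejected };

// Simplified, hypothetical model of one recurrent sequence: the only stored
// state is the one as of the final position `last_pos`.
// [p0, p1) is the range to erase; a negative bound means "open-ended".
inline erase_result recurrent_erase(int last_pos, int p0, int p1) {
    if (p0 < 0) p0 = 0;
    if (p1 < 0) p1 = INT_MAX;
    const bool covers_last = p0 <= last_pos && last_pos < p1;
    if (!covers_last) {
        // No state is held for positions other than the last one, so there
        // is nothing to remove and the erasure counts as a success.
        return erase_result::nothing_to_do;
    }
    if (p0 == 0) {
        // The whole sequence is removed, so the state can simply be dropped.
        return erase_result::cleared;
    }
    // The range cuts into the tail: the in-place state cannot be rewound.
    return erase_result::rejected;
}

// Caller-side handling for prefix matching: try to drop the tokens from
// `n_matching` onward; if that is rejected, the whole cache has to be
// cleared and no tokens can be reused.
inline int reuse_prefix(int last_pos, int n_matching) {
    if (recurrent_erase(last_pos, n_matching, -1) == erase_result::rejected) {
        return 0;
    }
    return n_matching;
}
```

For a sequence whose final token is at position 9, erasing `[2, 5)` succeeds trivially, erasing `[5, end)` is rejected, and a start position past the final token (`p0 > pos`) is again a no-op, which is the case the later fix in this commit addresses.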
Georgi Gerganov f914544b16
batched-bench : add "separate text gen" mode (#17103) 2025-11-10 12:59:29 +02:00