llama.cpp

Commit Graph

Author	SHA1	Message	Date
hanishkvc	f97efb86e4	SimpleChatTC:SimpleProxy:Pdf2Text: js side initial plumbing Expose pdf2text tool call to ai server and handshake with simple proxy for the same.	2025-12-04 19:41:39 +05:30
hanishkvc	6054ddfb65	SimpleChatTC:SimpleProxy:Pdf2Text: Initial go	2025-12-04 19:41:39 +05:30
hanishkvc	5ec29087ea	SimpleChatTC:SimpleProxy:Pdf2Text: Move handling url to its own	2025-12-04 19:41:39 +05:30
hanishkvc	ecfdb66c94	SimpleChatTC:SimpleProxy:Pdf2Text:Initial plumbing Get the pdf2text request for processing.	2025-12-04 19:41:39 +05:30
hanishkvc	da98a961ab	SimpleChatTC:SimpleProxy: Enable allowing or not requested feature	2025-12-04 19:41:39 +05:30
hanishkvc	0b2329e5de	SimpleChatTC: Update readme	2025-12-04 19:41:39 +05:30
hanishkvc	6a8ced244c	SimpleChatTC:Raise Error on Ai Chat server handshake NotOk resp	2025-12-04 19:41:39 +05:30
hanishkvc	91f39b7197	SimpleChatTC:Move chat server handshake to SimpleChat	2025-12-04 19:41:39 +05:30
hanishkvc	482517543b	SimpleChatTC:Seperate out actual nw handshake - initial go move the actual chat handshake with ai server into a seperate code to an extent. also initial anchor to trap handshake http error responses Rather come to think of it, its better to move this into SimpleChat class. Use finally to ensure any needed cleanup for handle_user_submit occurs within itself.	2025-12-04 19:41:39 +05:30
hanishkvc	8d7eece81c	SimpleChatTC:ToolsWorker: Update note to flow with chat session id	2025-12-04 19:41:39 +05:30
hanishkvc	8ab8727f70	SimpleChatTC:DataStore: update readme	2025-12-04 19:41:39 +05:30
hanishkvc	5935ecceca	SimpleChatTC:DataStore:Cleanup:Msg, duplicate on routing side Avoid the duplicate plumbing code and use a common ops plumbing helper. Remove args[key] oversight from DataStoreList msg on webworkr	2025-12-04 19:41:39 +05:30
hanishkvc	57dd228512	SimpleChatTC:DataStore:List keys - the plumbing	2025-12-04 19:41:39 +05:30
hanishkvc	2d497069d2	SimpleChatTC:DataStore:list - web worker side logic The basic skeleton added on the web worker side for listing keys. TODO: Avoid duplication of similar code to an extent across some of these db ops.	2025-12-04 19:41:39 +05:30
hanishkvc	7f8eb04875	SimpleChatTC:DataStore:Delete a record - the plumbing side	2025-12-04 19:41:39 +05:30
hanishkvc	bd7f7cb72a	SimpleChatTC:DataStore: Delete a record - the db web worker side	2025-12-04 19:41:39 +05:30
hanishkvc	d80e438cfa	SimpleChatTC:DataStore:Put, stringify undefined, readme Update the descriptions of set and get to indicate the possible corner cases or rather semantic in such situations. Update the readme also a bit. The auto save and restore mentioned has nothing to do with the new data store mechanism.	2025-12-04 19:41:39 +05:30
hanishkvc	2dad246d53	SimpleChatTC:DataStore: Dont ignore the error paths And indexedDB add isnt the one to be happy with updating existing key.	2025-12-04 19:41:39 +05:30
hanishkvc	4ad88f0da8	SimpleChatTC:DataStore:Eagerness to Wrong JSON conversions In the eagerness of initial skeleton, had forgotten that the root/generic tool call router takes care of parsing the json string into a object, before calling the tool call, so no need to try parse again. Fixed the same. Hadnt converted the object based response from data store related calls in the db web worker, into json string before passing to the generic tool response callback, fixed the same. - Rather the though of making the ChatMsgEx.createAllInOne handle string or object set aside for now, to keep things simple and consistant to the greatest extent possible across different flows. And good news - flow is working atleast for the overall happy path Need to check what corner cases are lurking like calling set on same key more than once, seemed to have some flow oddity, which I need to check later. Also maybe change the field name to value from data in the response to get, to match the field name convention of set. GPT-OSS is fine with it. But worst case micro / nano / pico models may trip up, in worst case, so better to keep things consistent.	2025-12-04 19:41:39 +05:30
hanishkvc	797b702251	SimpleChatTC:DataStore:FuncCallArgs: Any type not supported So mention that may be ai can send complex objects in stringified form. Rather once type of value is set to string, ai should normally do it, but no harm is hinting.	2025-12-04 19:41:39 +05:30
hanishkvc	b080ddf5c3	SimpleChatTC:DataStore: Remaining plumbing to try this Update tooldb logic to match that needed for the db logic and its web worker. Bring in the remaining aspects of db helpers into tools flow.	2025-12-04 19:41:39 +05:30
hanishkvc	2f58542713	SimpleChatTC:DataStore: Duplicate tooljs to tooldb initial skel	2025-12-04 19:41:39 +05:30
hanishkvc	aedffe1df0	SimpleChatTC:DataStore: Initial skeleton of a Db WebWorker Create the DB store Try Get and Set operations The post back to main thread done from asynchronous paths. NOTE: given that it has been ages since indexed db was used, so this is a logical implementation by refering to mdn as needed.	2025-12-04 19:41:39 +05:30
hanishkvc	4f857575f5	SimpleChatTC:UICleanup:ShowMessage: Update readme	2025-12-04 19:41:39 +05:30
hanishkvc	9e9016f7fe	SimpleChatTC:UICleanup: WordBreaks, Print avoid side vertical Define rules to ensure that chat message contents wrap so as to avoid overflowing beyond the size of the screen being viewed. The style used for chat message role to be placed with vertical oriented text adjacent to the actual message content on the side seems to be creating issue with blank pages in some browsers, so avoid that styling when one is printing.	2025-12-04 19:41:39 +05:30
hanishkvc	f3593a9611	SimpleChatTC:ShowMessage:Show any number of toolcalls Also make reasoning easily identifiable in the chat	2025-12-04 19:41:39 +05:30
hanishkvc	41ef449db1	SimpleChatTC:ShowMessage: Seperate out the content parts	2025-12-04 19:41:39 +05:30
hanishkvc	ae9f7971a5	SimpleChatTC:CSS: Instead of hardcoded btn minwidth use padding	2025-12-04 19:41:39 +05:30
hanishkvc	2f07288e40	SimpleChatTC:ShowMessage: containers, role, contents Seperate out the message ui block into a container containing a role block and contents container block. This will allow themeing of these seperately, if required. As part of same, currently the role has been put to the side of the message with vertical text flow.	2025-12-04 19:41:39 +05:30
hanishkvc	a4f247d730	SimpleChatTC:Cleanup:Move showing message into ShowMessage	2025-12-04 19:41:39 +05:30
hanishkvc	59effa6ea8	SimpleChatTC:Cleanup: tool resp xml, some allowed domains Add a newline between name and content in the xml representation of the tool response, so that it is more easy to distinguish things Add github, linkedin and apnews domains to allowed.domains for simpleproxy.py	2025-12-04 19:41:39 +05:30
hanishkvc	cf06c8682b	SimpleChatTC:Reasoning+: Update readme wrt reasoning, flow cleanup Also cleanup the minimal based showing of chat messages a bit And add github.com to allowed list	2025-12-04 19:41:39 +05:30
hanishkvc	937aa57528	SimpleChatTC:MultiChatUI:ChatShow cleanup of Initial skeleton Fix up the initial skeleton / logic as needed. Remember that we are working with potentially a subset of chat messages from the session, given the sliding window logic of context managing on client ui side, so fix up the logic to use the right subset of messages array and not the global xchat when deciding whether a message is the last or last but one, which need special handling wrt Assistant (with toolcall) and Tool (ie response) messages. Moving tool call ui setup as well as tool call response got ui setup into ChatShow of MultiChatUI ensures that switching between chat sessions handle the ui wrt tool call triggering ui and tool call response submission related ui as needed properly. Rather even loading a previously auto saved chat session if it had tool call or tool call response to be handled, the chat ui will be setup as needed to continue that session properly.	2025-12-04 19:41:39 +05:30
hanishkvc	2cc10f6705	SimpleChatTC:MultiChatUI.ChatShow: Mov SimpleChat.Show in -initial Also take care of updating the toolcall ui if needed from within this.	2025-12-04 19:41:39 +05:30
hanishkvc	62bce9ebfb	SimpleChatTC:Show: Cleanup Update existing flow so that next Tool Role message is handled directly from within	2025-12-04 19:41:39 +05:30
hanishkvc	aa17edfa78	SimpleChatTC:SimpleProxy: Include some news sites in allowed domains	2025-12-04 19:41:39 +05:30
hanishkvc	47d9550131	SimpleChatTC:Reasoning: Cleanup the initial go Rather simplify and make the content_equiv provide a relatively simple and neat representation of the reasoning with content and toolcall as the cases may be. Also remove the partial new para that I had introduced in the initial go for reasoning.	2025-12-04 19:41:39 +05:30
hanishkvc	dbb5512b20	SimpleChatTC:Reasoning: Initial Go	2025-12-04 19:41:39 +05:30
hanishkvc	25df32b553	SimpleChatTC:ChatSessionID: Get all handlers to account for chatid This should ensure that tool call responses can be mapped back to the chat session for which it was triggered.	2025-12-04 19:41:39 +05:30
hanishkvc	734beb08f5	SimpleChatTC:ChatSessionID through the tool call cycle Pass chatId to tool call, and use chatId in got tool call resp, to decide as to to which chat session the async tool call resp belongs and inturn if auto submit timer should be started if auto is enabled.	2025-12-04 19:41:39 +05:30
hanishkvc	13d312fe0d	SimpleChatTC:ToolTemp: Ensure add removes non promoted ToolTemp	2025-12-04 19:41:39 +05:30
hanishkvc	4eb3322017	SimpleChatTC:ToolCallErrPath:ToolTemp and MultiChatUIChatShow Update the immidiate tool call triggering failure and tool call response timeout paths to use the new ToolTemp and MultiChatUI based chat show logics. Actual tool call itself generating errors, is already handled in the previous commit changes.	2025-12-04 19:41:39 +05:30
hanishkvc	e79faebde1	SimpleChatTC:ToolTemp and ChatShow Add a new role ToolTemp, which is used to maintain any tool call response on the client ui side, without submitting it to the server ie till user or auto submit triggers the submitting of that tool call response. When ever a tool call response is got, create a ToolTemp role based message in the corresponding chat session. And dont directly update the user query input area, rather leave it to the updated simplechat show and the new multichatui chat_show helper and inturn whether the current chat session active in ui is same as the one for which the tool call response has been recieved. TODO: Currently the response message is added to the current active chat session, but this needs to be changed by tracking chatId/session through the full tool call cycle and then adding the tool call response in the related chat session, and inturn updating or not the ui based on whether that chat session is still the active chat session in ui or not, given that tool call gets handled in a asynchronous way. Now when that tool call response is submitted, promote the equiv tool temp role based message that should be in the session's chat history as the last message into becoming a normal tool response message. SimpleChat.show has been updated to take care of showing any ToolTemp role message in the user query input area. A newer chat_show helper added to MultiChatUI, that takes care of calling SimpleChat.show, provided the chat_show is being requested for the currently active in ui, chat session. As well as to take care of passing both the ChatDiv and elInUser. Converts users of SimpleChat.show to use MultiChatUI.chat_show	2025-12-04 19:41:39 +05:30
hanishkvc	84403973cd	SimpleChatTC:SimpleProxy: once in a bluemoon transformed bearer instead of using the shared bearer token as is, hash it with current year and use the hash. keep /aum path out of auth check. in future bearer token could be transformed more often, as well as with additional nounce/dynamic token from server got during initial /aum handshake as also running counter and so ... NOTE: All these circus not good enough, given that currently the simpleproxy.py handshakes work over http. However these skeletons put in place, for future, if needed. TODO: There is a once in a bluemoon race when the year transitions between client generating the request and server handling the req. But other wise year transitions dont matter bcas client always creates fresh token, and server checks for year change to genrate fresh token if required.	2025-12-04 19:41:39 +05:30
hanishkvc	0552ff9098	SimpleChatTC:SimpleProxy:ClientUI: Send Authorization bearer User can configure the bearer token to send	2025-12-04 19:41:39 +05:30
hanishkvc	044d1cf535	SimpleChatTC:tools.proxyUrl: rename to just proxyUrl Next will be adding a proxyAuth field also to tools.	2025-12-04 19:41:39 +05:30
hanishkvc	6d08cda9c8	SimpleChatTC:SimpleProxy: Check for bearer authorization As noted in the comments in code, this is a very insecure flow for now.	2025-12-04 19:41:39 +05:30
hanishkvc	3f1fd289eb	SimpleChatTC:SimpleProxy:BearerInsecure a needed config Add a config entry called bearer.insecure which will contain a token used for bearer auth of http requests Make bearer.insecure and allowed.domains as needed configs, and exit program if they arent got through cmdline or config file.	2025-12-04 19:41:39 +05:30
hanishkvc	0caa2e8101	SimpleChatTC:SimpleProxy: Prg Parameters handling cleanup - next Ensure load_config gets called on encountering --config in cmdline, so that the user has control over whether cmdline or config file will decide the final value of any given parameter. Ensure that str type values in cmdline are picked up directly, without running them through ast.literal_eval, bcas otherwise one will have to ensure throught the cmdline arg mechanism that string quote is retained for literal_eval Have the """ function note/description below def line immidiately so that it is interpreted as a function description.	2025-12-04 19:41:39 +05:30
hanishkvc	f221a2c356	SimpleChatTC:SimpleProxy:LoadConfig ProcessArgs cleanup - initial Now both follow a similar mechanism and do the following * exit on finding any issue, so that things are in a known state from usage perspective, without any confusion/overlook * check if the cmdlineArgCmd/configCmd being processed is a known one or not. * check value of the cmd is of the expected type * have a generic flow which can accomodate more cmds in future in a simple way	2025-12-04 19:41:39 +05:30
hanishkvc	a1b33ecd1c	SimpleChatTC:ToolCallResponseTimeout: Allow end user to control Moved it into Me->tools, so that end user can modify the same as required from the settings ui. TODO: Currently, if tc response is got after a tool call timed out and user submitted default timed out error response, the delayed actual response when it is got may overwrite any new content in user query box, this needs to be tackled.	2025-12-04 19:41:39 +05:30
hanishkvc	252fb91e95	SimpleChatTC:WebSearchPlus: Update readme, Wikipedia in allowed If using wikipedia or so, remember to have sufficient context window in general wrt the ai engine as well as wrt the handshake / chat end point.	2025-12-04 19:41:39 +05:30
hanishkvc	221b5a9228	SimpleChatTC:ToolCallWeby: Cleanup the toolweb module flow Avoid code duplication, by creating helpers for setup and toolcall. Also send indication of the path that will be used, when checking for simpleproxy.py server to be running at runtime setup.	2025-12-04 19:41:39 +05:30
hanishkvc	de6f370d3b	SimpleChatTC:ToolCall:SearchWebText using UrlText Initial go at implementing a web search tool call, which uses the existing UrlText support of the bundled simpleproxy.py. It allows user to control the search engine to use, by allowing them to set the search engine url template. The logic comes with search engine url template strings for duckduckgo, brave, bing and google. With duckduckgo set by default.	2025-12-04 19:41:39 +05:30
hanishkvc	978ee3db1e	SimpleChatTC:ToolCalling:Seprat out JSWebWorker and ProxyBasedWeb Remove the unneed (belonging to the other file) stuff from tooljs and toolweb files. Update tools manager to make use of the new toolweb module	2025-12-04 19:41:39 +05:30
hanishkvc	d00e5b341a	SimpleChatTC:Duplicate tooljs.mjs to toolweb.mjs So as to split browser js webworker based tool calls from web related tool calls.	2025-12-04 19:41:39 +05:30
hanishkvc	8c8ddb1e59	SimpleChatTC:Update and cleanup the readme a bit include info about the auto option within tools. use nonwrapped text wrt certain sections, so that the markdown readme can be viewed properly wrt the structure of the content in it.	2025-12-04 19:41:39 +05:30
hanishkvc	2192ae6dd3	SimpleChatTC:Cleanup whitespace - github editorconfig checker Add missing newline to ending bracket line of json config file	2025-12-04 19:41:39 +05:30
hanishkvc	f74ce327e5	SimpleChatTC: Cleanup whitespaces identified by llama.cpp editorconfig check * convert tab to spaces in json config file * remove extra space at end of line	2025-12-04 19:41:39 +05:30
hanishkvc	fb968347b0	SimpleChatTC:AutoToolCalls: Track and clear related timers also cleanup the existing toolResponseTimeout timer to be in the same structure and have similar flow convention.	2025-12-04 19:41:39 +05:30
hanishkvc	45f9db9963	SimpleChatTC:Auto tool calling control to end user Instead of enforcing always explicit user triggered tool calling, now user is given the option whether to use explicit user triggered tool calling or to use auto triggering after showing tool details for a user specified amount of seconds. NOTE: The current logic doesnt account for user clicking the buttons before the autoclick triggers; need to cancel the auto clicks, if user triggers before autoclick, ie in future.	2025-12-04 19:41:39 +05:30
hanishkvc	9e97880dde	SimpleChatTC:SimpleProxy:Cleanup avoid logically duplicate debug log	2025-12-04 19:41:39 +05:30
hanishkvc	4c1c363504	SimpleChatTC:SimpleProxy: debug dumps to identify funny bing bing raised a challenge for chrome triggered search requests after few requests, which were spread few minutes apart, while still seemingly allowing wget based search to continue (again spread few minutes apart). Added a simple helper to trace this, use --debug True to enable same.	2025-12-04 19:41:39 +05:30
hanishkvc	dbb24fec77	SimpleChatTC:ToolResponse: Use browser dom for xml/html safe Instead of simple concatenating of tool call id, name and result now use browser's dom logic to create the xml structure used for now to store these within content field. This should take care of transforming / escaping any xml special chars in the result, so that extracting them later for putting into different fields in the server handshake doesnt have any problem.	2025-12-04 19:41:39 +05:30
hanishkvc	90d232dc4a	SimpleChatTC:SimpleProxy: Update readme wrt mimicing client req ie during proxying	2025-12-04 19:41:39 +05:30
hanishkvc	74226a0992	SimpleChatTC:ToolCall response relaxed handling Use DOMParser parseFromString in text/html mode rather than text/xml as it makes it more relaxed without worrying about special chars of xml like & etal	2025-12-04 19:41:39 +05:30
hanishkvc	c109da870f	SimpleChatTC:SimpleProxy: mimicing got req helps wrt duckduckgo mimicing got req in generated req helps with duckduckgo also and not just yahoo. also update allowed.domains to allow a url generated by ai when trying to access the bing's news aggregation url	2025-12-04 19:41:39 +05:30
hanishkvc	bebf846157	SimpleChatTC:SimpleProxy:Cleanup a bit The tagging of messages wrt ValidateUrl and UrlReq Also dump req Move check for --allowed.domains to ValidateUrl NOTE: Also with mimicing of user agent etal from got request to the generated request, yahoo search/news is returning results now, instead of the bland error before.	2025-12-04 19:41:39 +05:30
hanishkvc	d0b9103176	SimpleChatTC:SimpleProxy:Try mimic real client using got req info ie include User-Agent, Accept-Language and Accept in the generated request using equivalent values got in the request being proxied.	2025-12-04 19:41:39 +05:30
hanishkvc	e6e0adbe90	SimpleChatTC:SimpleProxy: Some debug prints which give info	2025-12-04 19:41:39 +05:30
hanishkvc	17365ed4b9	SimpleChatTC: Update readme a bit	2025-12-04 19:41:39 +05:30
hanishkvc	840cab0b1c	SimpleChatTC:SimpleProxy: Include a sample config file with allowed domains set to few sites in general to show its use this includes some sites which allow search to be carried out through them as well as provide news aggregation	2025-12-04 19:41:39 +05:30
hanishkvc	370326b1ec	SimpleChatTC:SimpleProxy: Cleanup domain filtering and general Had confused between js and python wrt accessing dictionary contents and its consequence on non existent key. Fixed it. Use different error ids to distinguish between failure in common urlreq and the specific urltext and urlraw helpers.	2025-12-04 19:41:39 +05:30
hanishkvc	71ad609db6	SimpleChatTC:SimpleProxy: AllowedDomains based filtering Allow fetching from only specified allowed.domains	2025-12-04 19:41:39 +05:30
hanishkvc	58954c8814	SimpleChatTC:SimpleProxy: Update doc following python convention	2025-12-04 19:41:39 +05:30
hanishkvc	62dcd506e3	SimpleChatTC:SimpleProxy:Allow for loading json based config file The config entries should be named same as their equivalent cmdline argument entries but without the -- prefix	2025-12-04 19:41:39 +05:30
hanishkvc	aac5213104	SimpleChatTC:Tools: Show available tool names Dont allow tool names to be changed in settings page	2025-12-04 19:41:39 +05:30
hanishkvc	aa8c8040cf	SimpleChatTC:Cleanup:ChatProps: apiEP	2025-12-04 19:41:39 +05:30
hanishkvc	ad65659a63	SimpleChatTC:Cleanup:ChatProps: bTrimGarbage Also remove more inner/detailed stuff from show info in not bAll mode, given that many of the previous differentiated stuff have been moved into chatProps and inturn shown for now	2025-12-04 19:41:39 +05:30
hanishkvc	82be13aa33	SimpleChatTC:Cleanup:ChatProps: bCompletionInsertStandardRolePrefix	2025-12-04 19:41:39 +05:30
hanishkvc	734f74c908	SimpleChatTC:Cleanup:ChatProps: bCompletionFreshChatAlways Moved into Me.chatProps	2025-12-04 19:41:39 +05:30
hanishkvc	78ccca056f	SimpleChatTC:Cleanup:ChatProps: iRecentUserMsgCnt Update Me class Update show settings Update show props info Update readme	2025-12-04 19:41:39 +05:30
hanishkvc	7409b29862	SimpleChatTC:Cleanup:ChatProps: Move bStream into it	2025-12-04 19:41:39 +05:30
hanishkvc	a54fa472dd	SimpleChatTC:ShowObjPropsEdit:Any depth trapping of ui setup - t2 Fix up the oversights wrt any depth trapping flow Remember to start the propWithTree being checked/trapped with : to indicate the root of the prop hierarchy and also use : as sep between the elements of the props hierarchy tree Also had forgotten about the goof up possible with using in in a condition statement to check for array to contain a entry of interest in JS, fixed it now.	2025-12-04 19:41:39 +05:30
hanishkvc	8d7eb68712	SimpleChatTC:ShowObjPropsEdit:Any depth trapping of ui setup Maintain the current property hierarchy to its root over recursive calls. Allow callers to specify the props to be trapped using the prop hierarchy. Pass the prop hierarchy to the fTrapper. This should allow one to trap any prop wrt its editing ui setup, irrespective of whether it is a prop of the main object passed, or a member of a child prop of the main object passed or so ... Update the setting up of ChatHistoryInCtxt and ApiEndPoint to follow the new semantic/flow.	2025-12-04 19:41:39 +05:30
hanishkvc	b19e754322	SimpleChatTC:Cleanup:Rename func arg to match semantic better	2025-12-04 19:41:39 +05:30
hanishkvc	03426f0276	SimpleChatTC:Cleanup:EditObjProps: rename vars followingConvention Part 1 - add el prefix wrt the element handle related vars	2025-12-04 19:41:39 +05:30
hanishkvc	3e490cefc5	SimpleChatTC:Cleanup: Move bTools and toolFetchProxyUrl into tools Also update the readme wrt same and related	2025-12-04 19:41:39 +05:30
hanishkvc	303af1800e	SimpleChatTC:ShowInfo:Clean up layout of showing of props data Also ensure when switching between sessions, the full set of props info is shown.	2025-12-04 19:41:39 +05:30
hanishkvc	0e21d67e8a	SimpleChatTC:ShowInfo: Allow showing minimal info set, if needed	2025-12-04 19:41:39 +05:30
hanishkvc	fc26e47222	SimpleChatTC:ShowObjPropsInfo: Use sections to indicate relations Also create a top level div wrt whole. And allow class to be specified for the same as well as the top level legend, optionally	2025-12-04 19:41:39 +05:30
hanishkvc	24ba85026e	SimpleChatTC:ShowInfo: Make logic recursive, avoid JSON.stringify	2025-12-04 19:41:39 +05:30
hanishkvc	34b2beea1a	SimpleChatTC:ShowInfo: Create and use common automated info show Also fetch info from ai-server, and place path and ctx size into current Me instance and include in show info.	2025-12-04 19:41:39 +05:30
hanishkvc	2a94cb3786	SimpleChatTC:Fetch:Proxy URL rename and in settings	2025-12-04 19:41:39 +05:30
hanishkvc	98d43fac7f	SimpleChatTC:WebFetch: Try confirm simpleproxy before enabling	2025-12-04 19:41:39 +05:30
hanishkvc	a6aa563a18	SimpleChatTC:WebFetch: Check for the specific proxy paths	2025-12-04 19:41:39 +05:30
hanishkvc	80dbbb89a5	SimpleChatTC:WebFetch: Enable only if something at proxyUrl NOTE: not a robust check, just tries to establish a http connection for now and doesnt really check if it is the specific proxy srvr of interest or not.	2025-12-04 19:41:39 +05:30
hanishkvc	fa0a6919cb	SimpleChatTC: Update/Cleanup readme	2025-12-04 19:41:39 +05:30
hanishkvc	8ca77e455a	SimpleChatTC:NonStreaming: Update oneshot mode wrt tool calls Take care of the possibility of content not being there as well as take care of retrieving the tool calls for further processing. With this tool calls should work in non streaming mode also	2025-12-04 19:41:39 +05:30
hanishkvc	3e0cf2a2df	SimpleChatTC:ObjPropsEdit: Obj within Obj aware fRefiner Use same to set a placeholder for Authorization entry in headers	2025-12-04 19:41:39 +05:30
hanishkvc	f874c69983	SimpleChatTC:UiShowObjPropsEdit allow refining	2025-12-04 19:41:39 +05:30
hanishkvc	6253c717b3	SimpleChatTC:Trappable UiShowObjPropsEdit for custom handling Use it to handle apiEP and iRecentUserMsgCnt in more user friendly way, where they get a selection to choose from.	2025-12-04 19:41:39 +05:30
hanishkvc	3718a39c06	SimpleChatTC:Use generic obj props edit for settings in general Bring more user controllable properties into this new settings ui	2025-12-04 19:41:39 +05:30
hanishkvc	756b128539	SimpleChatTC:UI:ObjPropEdits handle objects, use for gMe	2025-12-04 19:41:39 +05:30
hanishkvc	b771e42dc1	SimpleChatTC:UI:Common helper to edit obj members of few types Make the previously relatively generic flow wrt apiRequestOptions settings into a fully generic reusable by others flow. Rather had stopped short of it, when previously moved onto other things at that time.	2025-12-04 19:41:39 +05:30
hanishkvc	6e5b532313	SimpleChatTC:UI: el_get/el_set to avoid warnings	2025-12-04 19:41:39 +05:30
hanishkvc	04644761e6	SimpleChatTC:Tools: Pick proxy server address from document[gMe]	2025-12-04 19:41:39 +05:30
hanishkvc	9b55775e8a	SimpleChatTC:WebFetch: Update readme to reflect the new names	2025-12-04 19:41:39 +05:30
hanishkvc	42f91df261	SimpleChatTC:WebFetch:Trap Non Ok status and raise error So that the same error path is used for logical error wrt http req also, without needing a different path for it. Dont forget to return the resp text/json/..., so that the contents are passed along the promise then chain	2025-12-04 19:41:39 +05:30
hanishkvc	d04c8cd38d	SimpleChatTC:SimpleProxy: Ensure CORS related headers sent always Add a new send headers common helper and use the same wrt the overridden send_error as well as do_OPTIONS This ensures that if there is any error during proxy opertions, the send_error propogates to the fetch from any browser properly without browser intercepting it with a CORS error	2025-12-04 19:41:39 +05:30
hanishkvc	c2fb0cd241	SimpleChatTC:WebFetch: Cleanup the names and descriptions a bit	2025-12-04 19:41:39 +05:30
hanishkvc	73a144c44d	SimpleChatTC:SimpleProxy:HtmlParser more generic and flexible also now track header, footer and nav so that they arent captured	2025-12-04 19:41:39 +05:30
hanishkvc	cd226e8dae	SimpleChatTC: Update readme wrt web fetch and related simple proxy	2025-12-04 19:41:39 +05:30
hanishkvc	8b950fd348	SimpleChatTC:WebFetch:UrlEnc url2fetch b4Passing toProxy asQuery Ensures that if the url being requested as any query strings in them then things dont get messed up, when the url to get inc its query is extracted from the proxy request's query string	2025-12-04 19:41:39 +05:30
hanishkvc	9ff2c596ee	SimpleChatTC:SimpleProxy:Options just in case	2025-12-04 19:41:39 +05:30
hanishkvc	9c7d6cc0e4	SimpleChatTC:WebUrlText:Update name and desc to see if prefered	2025-12-04 19:41:39 +05:30
hanishkvc	bf63b8f45a	SimpleChatTC:SimpleProxy:UrlText: Slightly better trimming First identify lines which have only whitespace and replace them with lines with only newline char in them. Next strip out adjacent lines, if they have only newlines	2025-12-04 19:41:39 +05:30
hanishkvc	266e825c68	SimpleChatTC:SimpleProxy:UrlText: Try strip empty lines some what	2025-12-04 19:41:39 +05:30
hanishkvc	82ab08ec1a	SimpleChatTC:WebUrl FetchStrip through simple proxy	2025-12-04 19:41:39 +05:30
hanishkvc	b46bbc542a	SimpleChatTC:SimpleProxy:UrlText: Avoid style blocks also	2025-12-04 19:41:39 +05:30
hanishkvc	f493e1af59	SimpleChatTC:SimpleProxy:UrlText: Capture body except for scripts	2025-12-04 19:41:39 +05:30
hanishkvc	45b05df21b	SimpleChatTC:SimpleProxy: Switch to html.parser As html can be malformed, xml ElementTree XMLParser cant handle the same properly, so switch to the HtmlParser helper class that is provided by python and try extend it. Currently a minimal skeleton to just start it out, which captures only the body contents.	2025-12-04 19:41:39 +05:30
hanishkvc	d5f4183f7c	SimpleChatTC:SimpleProxy: ElementTree, No _UrlopenRet As _UrlopenRet not exposed for use outside urllib, so decode and encode the data. Add skeleton to try get the html/xml tree top elements	2025-12-04 19:41:39 +05:30
hanishkvc	6537559360	SimpleChatTC:SimpleProxy:Common UrlReq helper for UrlRaw & UrlText Declare the result of UrlReq as a DataClass, so that one doesnt goof up wrt updating and accessing members. Duplicate UrlRaw into UrlText, need to add Text extracting from html next for UrlText	2025-12-04 19:41:39 +05:30
hanishkvc	e600e62e86	SimpleChatTC:SimpleProxy: Cleanup few messages	2025-12-04 19:41:39 +05:30
hanishkvc	c25b1968cd	SimpleChatTC:WebFetch: Update to use internal SimpleProxy.py	2025-12-04 19:41:39 +05:30
hanishkvc	3bab4de0e8	SimpleChatTC:SimpleProxy:UrlRaw: Fixup basic oversight wrt 1st go	2025-12-04 19:41:39 +05:30
hanishkvc	73ef9f7d46	SimpleChatTC:SimpleProxy:implement handle_urlraw A basic go at it	2025-12-04 19:41:39 +05:30
hanishkvc	73054a5832	SimpleChatTC:SimpleProxy: Extract and check path, route to handlers	2025-12-04 19:41:39 +05:30
hanishkvc	c99788e290	SimpleChatTC:SimpleProxy: Cleanup for basic run	2025-12-04 19:41:39 +05:30
hanishkvc	80fd065993	SimpleChatTC:SimpleProxy: Start server, Show requested path	2025-12-04 19:41:39 +05:30
hanishkvc	05c0ade8be	SimpleChatTC:SimpleProxy:Process args --port	2025-12-04 19:41:39 +05:30
hanishkvc	8fc74ef923	SimpleChatTC:WebFetchThroughProxy:Initial go creating request	2025-12-04 19:41:39 +05:30
hanishkvc	09ce19a95a	SimpleChatTC: update readme wrt promise related trapping	2025-12-04 19:41:39 +05:30
hanishkvc	f0a3886d1e	SimpleChatTC:Ensure fetch's promise chain is also trapped Dont forget to map members of got entity from fetch to things from saved original promise, bcas remember what is got is a promise. also add some comments around certain decisions and needed exploration	2025-12-04 19:41:39 +05:30
hanishkvc	77d3e43cb4	SimpleChatTC: Allow await in generated code that will be evald	2025-12-04 19:41:39 +05:30
hanishkvc	92e5b2133e	SimpleChatTC:Promises: trap normal fetch (dont care await or not)	2025-12-04 19:41:39 +05:30
hanishkvc	0241b7b469	SimpleChatTC:TrapPromise: log the trapping also possible refinement wrt trapping, if needed, added as comment all or allSettled to use or not is the question. whether to wait for a round trip through the related event loop or not is also a question.	2025-12-04 19:41:39 +05:30
hanishkvc	3d661793ef	SimpleChatTC:ChatMessageEx: 1st go at trying to track promises	2025-12-04 19:41:39 +05:30
hanishkvc	7dbbc46390	SimpleChatTC:ChatMessageEx: Better tool result extractor	2025-12-04 19:41:39 +05:30
hanishkvc	61b70bfa5d	SimpleChatTC:Readme: Updated wrt new relativelyProper toolCallsHS Also update the sliding window context size to last 9 chat messages so that there is a sufficiently large context for multi turn tool calls based adjusting by ai and user, without needing to go full hog, which has the issue of overflowing the currently set context window wrt the loaded ai model.	2025-12-04 19:41:39 +05:30
hanishkvc	152deb5d5a	SimpleChatTC:ChatMessageEx:While at it also ns_delete these common helpers avoid needing ignore tagging to ts-check, in places where valid constructs have been used which go beyond strict structured js handling that is tried to be achieved using it, but are still valid and legal.	2025-12-04 19:41:39 +05:30
hanishkvc	cc65a2f7a3	SimpleChatTC:ChatMessageEx: Build tool role result fully Expand the xml format id, name and content in content field of tool result into appropriate fields in the tool result message sent to the genai/llm engine on the server.	2025-12-04 19:41:39 +05:30
hanishkvc	ebc7f88b53	SimpleChatTC:Propagate toolcall id through tool call chain Use HTMLElement's dataset to maintain tool call id along with the element which maintains the toolname. Pass it along to the tools manager and inturn the actual tool calls and through them to the web worker handling the tool call related code and inturn returning it back as part of the obj which is used to return the tool call result. Embed the tool call id, function name and function result into the content field of chat message in terms of a xml structure Also make use of tool role to send back the tool call result. Do note that currently the id, name and content are all embedded into the content field of the tool role message sent to the ai engine on the server. NOTE: Use the user query entry area for showing tool call result in the above mentioned xml form, as well as for user to enter their own queries. Based on presence of the xml format data at beginning the logic will treat it as a tool result and if not then as a normal user query. The css has been updated to help show tool results/msgs in a lightyellow background	2025-12-04 19:41:39 +05:30
hanishkvc	2bb3d747e6	SimpleChatTC:ChatMessageEx: send tool_calls, only if needed	2025-12-04 19:41:39 +05:30
hanishkvc	2ef201ff8d	SimpleChatTC:Load allows old and new ChatMessage(Ex) formats	2025-12-04 19:41:39 +05:30
hanishkvc	475858a4b3	SimpleChatTC:ChatMessageEx: Cleanup remaining stuff wrt ChatMessageEx related required flow as well as avoid warnings	2025-12-04 19:41:39 +05:30
hanishkvc	963b9f4661	SimpleChatTC:ChatMessageEx: Recent chat users upd Users of recent_chat updated to work with ChatMessageEx As part of same recent_chat_ns also added, for the case where the array of chat messages can be passed as is ie in the chat mode, provided it has only the network handshake representation of the messages.	2025-12-04 19:41:39 +05:30
hanishkvc	4d9e3d1566	SimpleChatTC:ChatMessageEx: Upd Add, rm sysPromptAtBeginOnly hlpr Simplify Add semantic by expecting any validation of stuff before adding to be done by the callers of Add and not by add itself. Also update it to expect ChatMessageEx object Update all users of add to follow the new syntax and semantic. Remove the old and unused AddSysPromptOnlyAtBegin helper	2025-12-04 19:41:39 +05:30
hanishkvc	c65c1d5f0f	SimpleChatTC:ChatMessageEx: RecentChat, GetSystemLatest GetSystemLatest and its users updated wrt ChatMessageEx. RecentChat updated wrt ChatMessageEx. Also now irrespective of whether full history is being retrieved or only a subset, both cases refer to the ChatMessageEx instances in SimpleChat.xchat without creating new instances of anything.	2025-12-04 19:41:39 +05:30
hanishkvc	343d414dd3	SimpleChatTC:ChatMessageEx: ods load, system prompt related these have been updated to work with ChatMessageEx to an extent	2025-12-04 19:41:39 +05:30
hanishkvc	abbf927557	SimpleChatTC:ChatMessageEx: add update_oneshot response_extract logic moved directly into ChatMessageEx as update oneshot, with suitable adjustments. Inturn use the same directly.	2025-12-04 19:41:39 +05:30
hanishkvc	361f6968d1	SimpleChatTC:ChatMessage: remove ResponseExtractStream Use the equivalent update_stream directly added to ChatMessageEx. update_stream is also more generic to some extent and also directly implemented by the ChatMessageEx class.	2025-12-04 19:41:39 +05:30
hanishkvc	32dd63ee1d	SimpleChatTC:ChatMessageEx:cleanup, HasToolCalls, ContentEquiv Update HasToolCalls and ContentEquiv to work with new structure	2025-12-04 19:41:39 +05:30
hanishkvc	aa229a1f99	SimpleChatTC:ChatMessageEx: UpdateStream logic Rename ChatMessage to ChatMessageEx. Add typedefs for NSToolCall and NSChatMessage, they represent the way the corresponding data is structured in network hs. Add logic to build the ChatMessageEx from data got over network in streaming mode.	2025-12-04 19:41:39 +05:30
hanishkvc	2c29c2d589	SimpleChatTC:ChatMessage: AssistantResponse into chat message class Modify the constructor, newFrom and clear towards this goal.	2025-12-04 19:41:39 +05:30
hanishkvc	37faf8611a	SimpleChatTC: update descs to indicate use of web workers ie wrt the tool calls provided.	2025-12-04 19:41:39 +05:30
hanishkvc	c2112618c0	SimpleChatTC: Update readme.md wrt latest updates. 2k maxtokens	2025-12-04 19:41:39 +05:30
hanishkvc	1789f5f1e2	SimpleChatTC: Increase the sliding window context to Last4 QA As the tool calling, if enabled, will need access to last few user query and ai assistant responses (which will also include in them the tool call requests and the corresponding results), so that the model can build answers based on its tool call reqs and got responses, and also given that most of the models these days have sufficiently large context windows, so the sliding window context implemented by SimpleChat logic has been increased by default to include last 4 queries and their responses roughly.	2025-12-04 19:41:39 +05:30
hanishkvc	a0f6762fda	SimpleChatTC: Web worker flow initial go cleanup Had forgotten to specify type as module wrt web worker, in order to allow it to import the toolsconsole module. Had forgotten to maintain the id of the timeout handler, which is needed to clear/stop the timeout handler from triggering, if tool call response is got well in time. As I am currently reverting the console redirection at end of handling a tool call code in the web worker message handler, I need to setup the redirection each time. Also I had forgotten to clear the console.log capture data space, before a new tool call code is executed, this is also fixed by this change. TODO: Need to abort the tool call code execution in the web worker if possible in future, if the client / browser side times out waiting for tool call response, ie if the tool call code is taking up too much time.	2025-12-04 19:41:39 +05:30
hanishkvc	148ec1c41a	SimpleChatTC: Get ready for decoupled tool call response tools manager/module * setup the web worker that will help execute the tool call related codes in a js environment that is isolated from the browsers main js environment * pass the web worker to the tool call providers, for them to use * dont wait for the result from the tool call, as it will be got later asynchronously through a message * allow users of the tools manager to register a call back, which will be called whenever a message is got from the web worker containing response wrt previously requested tool call execution. simplechat * decouple toolcall response handling and toolcall requesting logic * setup a timeout to take back control if tool call takes up too much time. Inturn help alert the ai model, that the tool call took up too much time and so was aborted, by placing an appropriate tagged tool response into user query area. * register a call back that will be called when response is got asynchronously wrt any requested tool calls. In turn take care of updating the user query area with response got wrt the tool call, along with tool response tag around it.	2025-12-04 19:41:39 +05:30
hanishkvc	2a8bd1c9e7	SimpleChatTC: Actual tool call implementations simplified These no longer need to worry about * setting up the console.log related redirection to capture the generated outputs, nor about * setting up a dynamic function for executing the needed tool call related code The web worker setup to help run tool calls in a relatively isolated environment independent of the main browser env, takes care of these. One needs to only worry about getting the handle to the web worker to use and inturn pass the needed code wrt the tool call to it.	2025-12-04 19:41:39 +05:30
hanishkvc	14d67f6c3c	SimpleChatTC: Pass around structured objects wrt tool worker The request for code to run as well as the resultant response data both need to follow a structured object convention, so that it is easy to map a request and the corresponding response to some extent.	2025-12-04 19:41:39 +05:30
hanishkvc	510c65c721	SimpleChatTC: Initial skeleton of a simple toolsworker	2025-12-04 19:41:39 +05:30
hanishkvc	a6bccf934e	SimpleChatTC:ToolsConsole:Cleanup a bit, add basic set of notes Try ensure as well as verify that original console.log is saved and not overwritten. Throw an exception if things seem off wrt same. Also ensure to add a newline at end of console.log messages	2025-12-04 19:41:39 +05:30
hanishkvc	2701cb3a1e	SimpleChatTC: Move console.log trapping into its own module So that it can be used from different modules, if required.	2025-12-04 19:41:39 +05:30
hanishkvc	45d8a00738	SimpleChatTC: Update readme wrt --jinja argument and bit more	2025-12-04 19:41:39 +05:30
hanishkvc	a8c8176d09	SimpleChatTC: Tool Calling UI elements use up horizontal space	2025-12-04 19:41:39 +05:30
hanishkvc	1e5b638beb	SimpleChatTC: Update readme with bit more details, Cleaner UI Also avoid showing Tool calling UI elements, when not needed to be shown.	2025-12-04 19:41:39 +05:30
hanishkvc	bfe789706e	SimpleChatTC: Let user trigger tool call, instead of automatic Instead of automatically calling any requested tool by the GenAi / llm, that is from the tail end of the handle user submit btn click, Now if the GenAi/LLM has requested any tool to be called, then enable the Tool Run related UI elements and fill them with the tool name and tool args. In turn the user can verify if they are ok with the tool being called and the arguments being passed to it. Rather they can even fix any errors in the tool usage like the arithmetic expr to calculate that is being passed to simple_calculator or the javascript code being passed to run_javascript_function_code If user is ok with the tool call being requested, then trigger the same. The results if any will be automatically placed into the user query text area. User can cross verify if they are ok with the result and or modify it suitably if required and inturn submit the same to the GenAi/LLM.	2025-12-04 19:41:39 +05:30
hanishkvc	1fc44c971d	SimpleChatTC: Add ui elements for tool call verify and trigger Instead of automatically calling the requested tool with supplied arguments, rather allow user to verify things before triggering the tool. NOTE: User already provided control over tool_response before submitting it to the ai assistant.	2025-12-04 19:41:38 +05:30
hanishkvc	fd662b4b0b	SimpleChatTC: ToolCall hs info in normal assistant-user chat flow Also as part of same, wrap the request details in the assistant block using a similar tagging format as the tool_response in user block.	2025-12-04 19:41:38 +05:30
hanishkvc	30aa2f4c6b	SimpleChatTC: Update the readme.md wrt tool calling a bit	2025-12-04 19:41:38 +05:30
hanishkvc	63b5c6d76d	SimpleChatTC: Cleanup the function description a bit to better describe how it will be run, so that genai/llm while creating the code to run, will hopefully take care of any nuances required.	2025-12-04 19:41:38 +05:30
hanishkvc	a80da9a652	SimpleChatTC: Pass toolname to the tool handler So that when tool handler writes the result to the tc_switch, it can make use of the same, to write to the right location. NOTE: This also fixes the issue of my forgetting to rename the key in js_run wrt writing of result.	2025-12-04 19:41:38 +05:30
hanishkvc	f7284a8b89	SimpleChatTC: Move tool calling to tools, try trap async failures Move tool calling logic into tools module. Try trap async promise failures by awaiting results of tool calling and putting full thing in an outer try catch. Have forgotten the nitty gritties of JS flow, this might help, need to check.	2025-12-04 19:41:38 +05:30
hanishkvc	ef85ed41d4	SimpleChatTC: Clarify some type definitions to avoid warnings ie in vs code with ts-check	2025-12-04 19:41:38 +05:30
hanishkvc	a408e5e017	SimpleChatTC: Clearer description of toolcalls execution env Should hopefully ensure that the GenAi/LLM will generate appropriate code/expression as the argument to pass to these tool calls, to some extent.	2025-12-04 19:41:38 +05:30
hanishkvc	b4776da670	SimpleChatTC: Trap any exception raised during tool call and inform the GenAi/LLM about the same	2025-12-04 19:41:38 +05:30
hanishkvc	17c5daa52c	SimpleChatTC: Cleanup initial/1st go toolcall flow As output generated by any tool/function call is currently placed into the TextArea provided for End user (for their queries), bcas the GenAi (engine/LLM) may be expecting the tool response to be sent as a user role data with tool_response tag surrounding the results from the tool call. So also now at the end of submit btn click handling, the end user input text area is not cleared, if there was a tool call handled, for above reasons. Also given that running a simple arithmetic expression in itself doesnt generate any output, so wrap them in a console.log, to help capture the result using the console.log trapping flow that is already setup.	2025-12-04 19:41:38 +05:30
hanishkvc	301910c3a1	SimpleChatTC: Implement a simple toolcall handling flow Checks for toolname to be defined or not in the GenAi's response If toolname is set, then check if a corresponding tool/func exists, and if so call the same by passing it the GenAi provided toolargs as a object. Inturn the text generated by the tool/func is captured and put into the user input entry text box, with tool_response tag around it.	2025-12-04 19:41:38 +05:30
hanishkvc	fa63a86c71	SimpleChatTC:tooljs: Trap console.log and store in new result key The implementations of javascript and simple_calculator now use provided helpers to trap console.log messages when they execute the code / expression provided by GenAi and inturn store the captured log messages in the newly added result key in tc_switch This should help trap the output generated by the provided code or expression as the case maybe and inturn return the same to the GenAi, for its further processing.	2025-12-04 19:41:38 +05:30
hanishkvc	6d43011003	SimpleChatTC: Saner/Robust AssistantResponse content_equiv Previously if content was empty, it would have always sent the toolcall info related version even if there was no toolcall info in it. Fixed now to return empty string, if both content and toolname are empty.	2025-12-04 19:41:38 +05:30
hanishkvc	383c19c99b	SimpleChatTC: twins wrt streamed response handling As there could be failure wrt getting the response from the ai server some where in between a long response spread over multiple parts, the logic uses the latestResponse to cache the response as it is being received. However once the full response is got, one needs to transfer it to a new instance of AssistantResponse class, so that latestResponse can be cleared, while the new instance can be used in other locations in the flow as needed. Achieve the same now.	2025-12-04 19:41:38 +05:30
hanishkvc	53f85d09be	SimpleChatTC: AssistantResponse everywhere initial go Switch oneshot handler to use AssistantResponse, inturn currently only handle the normal content in the response. TODO: If any tool_calls in the oneshot response, it is currently not handled. Inturn switch the generic/toplevel handle response logic to use AssistantResponse class, given that both oneshot and the multipart/streaming flows use/return it. Inturn add trimmedContent member to AssistantResponse class and make the generic handle response logic to save the trimmed content into this. Update users of trimmed to work with this structure.	2025-12-04 19:41:38 +05:30
hanishkvc	3f3aa8d043	SimpleChatTC: AssistantResponse class initial go Make latestResponse into a new class based type instance wrt ai assistant response, which is what it represents. Move clearing, appending fields' values and getting assistant's response info (irrespective of a content or toolcall response) into this new class and inturn use the same.	2025-12-04 19:41:38 +05:30
hanishkvc	5a26831ad2	SimpleChatTC: Show toolcall being generated by ai - Temp	2025-12-04 19:41:38 +05:30
hanishkvc	e73bc4550b	SimpleChatTC: Avoid null content, Fix oversight wrt finish_reason I was wrongly checking for finish_reason to be non null, before trying to extract the genai content/toolcalls, have fixed this oversight with the new flow in progress. I had added few debug logs to identify the above issue, need to remove them later. Note: given that debug logs are disabled by replacing the debug function during this program's initialisation, which I had forgotten about, I didnt get the debug messages and had to scratch my head a bit, before realising this and the other issue ;) Also either when I had originally implemented simplechat 1+ years back, or later due to changes on the server end, the streaming flow sends an initial null wrt the content, where it only sets the role. This was not handled in my flow on the client side, so a null was getting prepended to the chat messages/responses from the server. This has been fixed now in the new generic flow.	2025-12-04 19:41:38 +05:30
hanishkvc	63430dc9f7	SimpleChatTC: Extract streamed field - assume only 1f at any time Update response_extract_stream to check for which field is being currently streamed ie is it normal content or tool call func name or tool call func args and then return the field name and extracted value. Previously it was always assumed that only normal content will be returned. Currently it is assumed that the server will only stream one of the 3 supported fields at any time and not more than one of them at the same time. TODO: Have to also add logic to extract the reasoning field later, ie wrt gen ai models which give out their thinking. Have updated append_response to expect both the key and the value wrt the latestResponse object, which will be manipulated. Previously it was always assumed that content is what will be got and inturn appended.	2025-12-04 19:41:38 +05:30
hanishkvc	bfe7ef69fa	SimpleChatTC: Skeleton to handle diff fields when streaming Changed latestResponse type to an object instead of a string. Inturn it contains entries for content, toolname and toolargs. Added a custom clear logic due to the same and used it to replace the previously simple assigning of empty string to latestResponse. For now in all places where latestResponse is used, I have replaced with latestResponse.content. Next need to handle identifying the field being streamed and inturn append to it. Also need to add logic to call tool, when tool_call triggered by genai.	2025-12-04 19:41:38 +05:30
hanishkvc	32f5278e8c	SimpleChatTC: use tcpdump to dbg hs; check if ai aware of tools	2025-12-04 19:41:38 +05:30
hanishkvc	6167cdff9f	SimpleChatTC: Bring in the tools meta into the main flow	2025-12-04 19:41:38 +05:30
hanishkvc	46f0304105	SimpleChatTC: More generic tooljs, SimpCalc, some main skeleton Make tooljs structure and flow more generic Add a simple_calculator tool/function call logic Add initial skeleton wrt the main tools.mjs file.	2025-12-04 19:41:38 +05:30
hanishkvc	f1aa0ee778	SimpleChatTC: Add skeleton for a javascript interpreter tool call Define the meta that needs to be passed to the GenAi Engine. Define the logic that implements the tool call, if called. Implement the flow/structure such that a single tool calls implementation file can define multiple tool calls.	2025-12-04 19:41:38 +05:30
hanishkvc	48c9f07982	SimpleChatTC: Update test shell script a bit Enable streaming by default, to check the handshake before going on to change the code, given that havent looked into this for more than a year now and have been busy with totally different stuff. Also updated the user messages used for testing a bit	2025-12-04 19:41:38 +05:30
hanishkvc	9341c507f2	SimpleChatTools: Add boolean to allow user control of tools use	2025-12-04 19:41:38 +05:30
hanishkvc	4282a4277a	SimpleChatToolCalling: Test/Explore srvr initial hs using cmdline	2025-12-04 19:41:38 +05:30
Adrien Gallouët	ef75a89fdb	build : move _WIN32_WINNT definition to headers (#17736) Previously, cmake was forcing `_WIN32_WINNT=0x0A00` for MinGW builds, This caused "macro redefined" warnings with toolchains that define the version. This also removes the `GGML_WIN_VER` variable as it is no longer needed. Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2025-12-04 07:04:02 +01:00
Piotr Wilkin (ilintar)	c6d1a00aa7	Add a couple of file types to the text section (#17670) * Add a couple of file types to the text section * Format + regenerate index * Rebuild after rebase	2025-12-03 21:45:06 +01:00
Aleksander Grygier	e9f9483464	Use OpenAI-compatible `/v1/models` endpoint by default (#17689) * refactor: Data fetching via stores * chore: update webui build output * refactor: Use OpenAI compat `/v1/models` endpoint by default to list models * chore: update webui build output * chore: update webui build output	2025-12-03 20:49:09 +01:00
Andika Wasisto	41c5e02f42	webui: Fix zero pasteLongTextToFileLen to disable conversion being overridden (#17445) * webui: Fix zero pasteLongTextToFileLen to disable conversion being overridden Zero pasteLongTextToFileLen should disable the conversion, but it was overwritten with 2500. * Apply suggestions from code review * Update webui build	2025-12-03 20:45:17 +01:00
Pascal	e7c2cf1356	server: add router multi-model tests (#17704) (#17722) * llama-server: add router multi-model tests (#17704) Add 4 test cases for model router: - test_router_unload_model: explicit model unloading - test_router_models_max_evicts_lru: LRU eviction with --models-max - test_router_no_models_autoload: --no-models-autoload flag behavior - test_router_api_key_required: API key authentication Tests use async model loading with polling and graceful skip when insufficient models available for eviction testing. utils.py changes: - Add models_max, models_dir, no_models_autoload attributes to ServerProcess - Handle JSONDecodeError for non-JSON error responses (fallback to text) * llama-server: update test models to new HF repos * add offline * llama-server: fix router LRU eviction test and add preloading Fix eviction test: load 2 models first, verify state, then load 3rd to trigger eviction. Previous logic loaded all 3 at once, causing first model to be evicted before verification could occur. Add module fixture to preload models via ServerPreset.load_all() and mark test presets as offline to use cached models * llama-server: fix split model download on Windows --------- Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>	2025-12-03 15:10:37 +01:00
Adrien Gallouët	1257491047	server : fix bad fmt, size() is a size_type (#17735) Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2025-12-03 15:47:22 +02:00
Aldehir Rojas	0a8026e768	common : introduce composable PEG parser combinators for chat parsing (#17136) * common : implement parser combinators to simplify chat parsing * add virtual destructor to parser_base * fix memory leak from circular references of rules * implement gbnf grammar building * remove unused private variable * create a base visitor and implement id assignment as a visitor * fix const ref for grammar builder * clean up types, friend classes, and class declarations * remove builder usage from until_parser * Use a counter class to help assign rule ids * cache everything * add short description for each parser * create a type for the root parser * implement repetition parser * Make optional, one_or_more, and zero_or_more subclasses of repetition * improve context constructor * improve until parsing and add benchmarks * remove cached() pattern, cache in parser_base with specialized parsing functions for each parser * improve json parsing performance to better match legacy parsing * fix const auto * it for windows * move id assignment to classes instead of using a visitor * create named rules in the command r7b example * use '.' for any in GBNF * fix parens around choices in gbnf grammar * add convenience operators to turn strings to literals * add free-form operators for const char * to simplify defining literals * simplify test case parser * implement semantic actions * remove groups in favor of actions and a scratchpad * add built in actions for common operations * add actions to command r7b example * use std::default_searcher for platforms that don't have bm * improve parser_type handling and add cast helper * add partial result type to better control when to run actions * fix bug in until() * run actions on partial results by default * use common_chat_msg for result * add qwen3 example wip * trash partial idea and simplify * move action arguments to a struct * implement aho-corasick matcher for until_parser and to build exclusion grammars * use std::string for input, since std::string_view is incompatible with std::regex * Refactor tests * improve qwen3 example * implement sax-style parsing and refactor * fix json string in test * rename classes to use common_chat_ prefix * remove is_ suffix from functions * rename from id_counter to just counter * Final refactored tests * Fix executable name and editorconfig-checker * Third time's the charm... * add trigger parser to begin lazy grammar rule generation * working lazy grammar * refactor json rules now that we check for reachability * reduce pointer usage * print out grammars in example * rename to chat-peg-parser* and common_chat_peg_parser* * Revert unrelated changes * New macros for CMakeLists to enable multi-file compilations * starting unicode support * add unicode support to char_parser * use unparsed args as additional sources * Refactor tests to new harness * Fix CMakeLists * fix rate calculation * add unicode tests * fix trailing whitespace and line endings skip-checks: true * Helpers + rewrite qwen3 with helpers * Fix whitespace * extract unicode functions to separate file * refactor parse unicode function * fix compiler error * improve construction of sequence/choice parsers * be less clever * add make_parser helper function * expand usage of make_parser, alias common_chat_msg_peg_parser_builder to builder in source * lower bench iterations * add unicode support to until_parser * add unicode support to json_string_parser * clean up unicode tests * reduce unicode details to match src/unicode.cpp * simplify even further * remove unused functions * fix type * reformat char class parsing * clean up json string parser * clean up + fix diagnostics * reorder includes * compact builder functions * replace action_parser with capture_parser, rename env to semantics * rename env to semantics * clean up common_chat_parse_context * move type() to below constant * use default constructor for common_chat_peg_parser * make all operators functions for consistency * fix compilation errors in test-optional.cpp * simplify result values * rename json_string_unquoted to json_string_content * Move helper to separate class, add separate explicit and helper classes * Whitespace * Change + to append() * Reformat * Add extra helpers, tests and Minimax example * Add some extra optional debugging prints + real example of how to use them * fix bug in repetitions when min_count = 0 reports failures * dump rule in debug * fix token accumulation and assert parsing never fails * indent debug by depth * use LOG_* in tests so logs sync up with test logs * - Add selective testing - Refactor all messaging to use LOG_ERR - Fix lack of argument / tool name capturing - Temporary fix for double event capture * refactor rule() and introduce ref() * clean up visitor * clean up indirection in root parser w.r.t rules * store shared ptr directly in parser classes * replace aho-corasick automation with a simple trie * Reset prev for qwen3 helper example variant * refactor to use value semantics with std::variant/std::visit * simplify trie_matcher result * fix linting issues * add annotations to rules * revert test workaround * implement serializing the parser * remove redundant parsers * remove tests * gbnf generation fixes * remove LOG_* use in tests * update gbnf tests to test entire grammar * clean up gbnf generation and fix a few bugs * fix typo in test output * remove implicit conversion rules * improve test output * rename trie_matcher to trie * simplify trie to just know if a node is the end of a word * remove common_chat_ prefix and ensure a common_peg_ prefix to all types * rename chat-peg-parser -> peg-parser * promote chat-peg-parser-helper to chat-peg-parser * checkpoint * use a static_assert to ensure we handle every branch * inline trivial peg parser builders * use json strings for now * implement basic and native chat peg parser builders/extractors * resolve refs to their rules * remove packrat caching (for now) * update tests * compare parsers with incremental input * benchmark both complete and incremental parsing * add raw string generation from json schema * add support for string schemas in gbnf generation * fix qwen example to include \n * tidy up example * rename extractor to mapper * rename ast_arena to ast * place basic tests into one * use gbnf_format_literal from json-schema-to-grammar * integrate parser with common/chat and server * clean up schema and serialization * add json-schema raw string tests * clean up json creation and remove capture parser * trim spaces from reasoning and content * clean up redundant rules and comments * rename input_is_complete to is_partial to match rest of project * simplify json rules * remove extraneous file * remove comment * implement += and |= operators * add comments to qwen3 implementation * reorder arguments to common_chat_peg_parse * remove commented outdated tests * add explicit copy constructor * fix operators and constness * wip: update test-chat for qwen3-coder * bring json parser closer to json-schema-to-grammar rules * trim trailing space for most things * fix qwen3 coder rules w.r.t. trailing spaces * group rules * do not trim trailing space from string args * tweak spacing of qwen3 grammar * update qwen3-coder tests * qwen3-coder small fixes * place parser in common_chat_syntax to simplify invocation * use std::set to collect rules to keep order predictable for tests * initialize parser to make certain platforms happy * revert back to std::unordered_set, sort rule names at the end instead * uncomment rest of chat tests * define explicit default constructor * improve arena init and server integration * fix chat test * add json_member() * add a comprehensive native example * clean up example qwen test and add response_format example to native test * make build_peg_parser accept std::function instead of template * change peg parser parameters into const ref * push tool call on tool open for constructed parser * add parsing documentation * clean up some comments * add json schema support to qwen3-coder * add id initializer in tests * remove grammar debug line from qwen3-coder * refactor qwen3-coder to use sequence over operators * only call common_chat_peg_parse if appropriate format * simplify qwen3-coder space handling * revert qwen3-coder implementation * revert json-schema-to-grammar changes * remove unnecessary forward declaration * small adjustment to until_parser * rename C/C++ files to use dashes * codeowners : add aldehir to peg-parser and related files --------- Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>	2025-12-03 12:45:32 +02:00
Pascal	5ceed62421	server: fix duplicate HTTP headers in multiple models mode (#17698 ) * llama-server: fix duplicate HTTP headers in multiple models mode (#17693) * llama-server: address review feedback from ngxson - restrict scope of header after std::move - simplify header check (remove unordered_set)	2025-12-03 10:28:43 +01:00
Xuan-Son Nguyen	13628d8bdb	server: add --media-path for local media files (#17697 ) * server: add --media-path for local media files * remove unused fn	2025-12-02 22:49:20 +01:00
Xuan-Son Nguyen	a96283adc4	mtmd: fix --no-warmup (#17695 )	2025-12-02 22:48:08 +01:00
Chad Voegele	c4357dcc35	Server: Change Invalid Schema from Server Error (500) to User Error (400) (#17572 ) * Make invalid schema a user error (400) * Move invalid_argument exception handler to ex_wrapper * Fix test * Simplify test back to original pattern	2025-12-02 17:33:50 +01:00
Xuan-Son Nguyen	5d6bd842ea	server: remove default "gpt-3.5-turbo" model name (#17668 ) * server: remove default "gpt-3.5-turbo" model name * do not reflect back model name from request * fix test	2025-12-02 11:38:57 +01:00
senhtry	fd3abe849e	server: fixing naming conflict res_error in server-models.cpp (#17679 )	2025-12-02 11:18:39 +01:00
Xuan-Son Nguyen	682e6658bb	server: explicitly set exec path when create new instance (#17669 ) * Revert "rm unused fn" This reverts commit `f2dbe9c087`. * server: explicitly set exec path when create new instance * put back TODO * only call get_server_exec_path() once * add fallback logic	2025-12-02 10:25:11 +01:00
Aleksander Grygier	cee92af553	Add context info to server error (#17663 ) * fix: Add context info to server error * chore: update webui build output	2025-12-02 09:20:57 +01:00
Xuan-Son Nguyen	ecf74a8417	mtmd: add mtmd_context_params::warmup option (#17652 ) * mtmd: add mtmd_context_params::warmup option * reuse the common_params::warmup	2025-12-01 21:32:25 +01:00
Xuan-Son Nguyen	ec18edfcba	server: introduce API for serving / loading / unloading multiple models (#17470 ) * server: add model management and proxy * fix compile error * does this fix windows? * fix windows build * use subprocess.h, better logging * add test * fix windows * feat: Model/Router server architecture WIP * more stable * fix unsafe pointer * also allow terminate loading model * add is_active() * refactor: Architecture improvements * tmp apply upstream fix * address most problems * address thread safety issue * address review comment * add docs (first version) * address review comment * feat: Improved UX for model information, modality interactions etc * chore: update webui build output * refactor: Use only the message data `model` property for displaying model used info * chore: update webui build output * add --models-dir param * feat: New Model Selection UX WIP * chore: update webui build output * feat: Add auto-mic setting * feat: Attachments UX improvements * implement LRU * remove default model path * better --models-dir * add env for args * address review comments * fix compile * refactor: Chat Form Submit component * ad endpoint docs * Merge remote-tracking branch 'webui/allozaur/server_model_management_v1_2' into xsn/server_model_maagement_v1_2 Co-authored-by: Aleksander <aleksander.grygier@gmail.com> * feat: Add copy to clipboard to model name in model info dialog * feat: Model unavailable UI state for model selector * feat: Chat Form Actions UI logic improvements * feat: Auto-select model from last assistant response * chore: update webui build output * expose args and exit_code in API * add note * support extra_args on loading model * allow reusing args if auto_load * typo docs * oai-compat /models endpoint * cleaner * address review comments * feat: Use `model` property for displaying the `repo/model-name` naming format * refactor: Attachments data * chore: update webui build output * refactor: Enum imports * feat: Improve Model Selector 
responsiveness * chore: update webui build output * refactor: Cleanup * refactor: Cleanup * refactor: Formatters * chore: update webui build output * refactor: Copy To Clipboard Icon component * chore: update webui build output * refactor: Cleanup * chore: update webui build output * refactor: UI badges * chore: update webui build output * refactor: Cleanup * refactor: Cleanup * chore: update webui build output * add --models-allow-extra-args for security * nits * add stdin_file * fix merge * fix: Retrieve lost setting after resolving merge conflict * refactor: DatabaseStore -> DatabaseService * refactor: Database, Conversations & Chat services + stores architecture improvements (WIP) * refactor: Remove redundant settings * refactor: Multi-model business logic WIP * chore: update webui build output * feat: Switching models logic for ChatForm or when regenerating messges + modality detection logic * chore: update webui build output * fix: Add `untrack` inside chat processing info data logic to prevent infinite effect * fix: Regenerate * feat: Remove redundant settigns + rearrange * fix: Audio attachments * refactor: Icons * chore: update webui build output * feat: Model management and selection features WIP * chore: update webui build output * refactor: Improve server properties management * refactor: Icons * chore: update webui build output * feat: Improve model loading/unloading status updates * chore: update webui build output * refactor: Improve API header management via utility functions * remove support for extra args * set hf_repo/docker_repo as model alias when posible * refactor: Remove ConversationsService * refactor: Chat requests abort handling * refactor: Server store * tmp webui build * refactor: Model modality handling * chore: update webui build output * refactor: Processing state reactivity * fix: UI * refactor: Services/Stores syntax + logic improvements Refactors components to access stores directly instead of using exported getter functions. 
This change centralizes store access and logic, simplifying component code and improving maintainability by reducing the number of exported functions and promoting direct store interaction. Removes exported getter functions from `chat.svelte.ts`, `conversations.svelte.ts`, `models.svelte.ts` and `settings.svelte.ts`. * refactor: Architecture cleanup * feat: Improve statistic badges * feat: Condition available models based on modality + better model loading strategy & UX * docs: Architecture documentation * feat: Update logic for PDF as Image * add TODO for http client * refactor: Enhance model info and attachment handling * chore: update webui build output * refactor: Components naming * chore: update webui build output * refactor: Cleanup * refactor: DRY `getAttachmentDisplayItems` function + fix UI * chore: update webui build output * fix: Modality detection improvement for text-based PDF attachments * refactor: Cleanup * docs: Add info comment * refactor: Cleanup * re * refactor: Cleanup * refactor: Cleanup * feat: Attachment logic & UI improvements * refactor: Constants * feat: Improve UI sidebar background color * chore: update webui build output * refactor: Utils imports + move types to `app.d.ts` * test: Fix Storybook mocks * chore: update webui build output * test: Update Chat Form UI tests * refactor: Tooltip Provider from core layout * refactor: Tests to separate location * decouple server_models from server_routes * test: Move demo test to tests/server * refactor: Remove redundant method * chore: update webui build output * also route anthropic endpoints * fix duplicated arg * fix invalid ptr to shutdown_handler * server : minor * rm unused fn * add ?autoload=true\|false query param * refactor: Remove redundant code * docs: Update README documentations + architecture & data flow diagrams * fix: Disable autoload on calling server props for the model * chore: update webui build output * fix ubuntu build * fix: Model status reactivity * fix: Modality 
detection for MODEL mode * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-12-01 19:41:04 +01:00
Xuan-Son Nguyen	7733409734	common: improve verbosity level definitions (#17630 ) * common: improve verbosity level definitions * string_format * update autogen docs	2025-12-01 14:38:13 +01:00
Tarek Dakhran	2ba719519d	model: LFM2-VL fixes (#17577 ) * Adjust to pytorch * Add antialiasing upscale * Increase number of patches to 1024 * Handle default marker insertion for LFM2 * Switch to flag * Reformat * Cuda implementation of antialias kernel * Change placement in ops.cpp * consistent float literals * Pad only for LFM2 * Address PR feedback * Rollback default marker placement changes * Fallback to CPU implementation for antialias implementation of upscale	2025-11-30 21:57:31 +01:00
Xuan-Son Nguyen	7f8ef50cce	clip: fix nb calculation for qwen3-vl (#17594 )	2025-11-30 15:33:55 +01:00
Xuan-Son Nguyen	3c136b21a3	cli: add migration warning (#17620 )	2025-11-30 15:32:43 +01:00
Xuan-Son Nguyen	ab49f094d2	server: move server-context to its own cpp|h (#17595 ) * git mv * add server-context.h * add server-context.h * clean up headers * cont : cleanup * also expose server_response_reader (to be used by CLI) * fix windows build * decouple server_routes and server_http --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-11-29 22:04:44 +01:00
Haiyue Wang	8c32d9d96d	server: explicitly set the function name in lambda (#17538 ) As [1] explained, the real debug message will be like: "res operator(): operator() : queue result stop" Set the name explicitly, the message is easy for debugging: "res operator(): recv : queue result stop" The left "operator()" is generated by 'RES_DBG() ... __func__' [1]: https://clang.llvm.org/extra/clang-tidy/checks/bugprone/lambda-function-name.html Signed-off-by: Haiyue Wang <haiyuewa@163.com>	2025-11-29 18:43:29 +01:00
Igor Smirnov	0874693b44	common : fix json schema with '\' in literals (#17307 ) * Fix json schema with '\' in literals * Add "literal string with escapes" test	2025-11-29 17:06:32 +01:00
o7si	3ce7a65c2f	server: fix: /metrics endpoint returning JSON-escaped Prometheus format (#17386 ) * fix: /metrics endpoint returning JSON-escaped Prometheus format * mod: remove string overload from ok() method	2025-11-28 19:14:00 +01:00
Fredrik Hultin	ddf9f94389	server : add Anthropic Messages API support (#17570 ) * server : add Anthropic Messages API support * remove -@pytest.mark.slow from tool calling/jinja tests * server : remove unused code and slow/skip on test_anthropic_vision_base64_with_multimodal_model in test_anthropic_api.py * server : removed redundant n field logic in anthropic_params_from_json * server : use single error object instead of error_array in streaming response handler for /v1/chat/completions and use unordered_set instead of set in to_json_anthropic_stream() * server : refactor Anthropic API to use OAI conversion * make sure basic test always go first * clean up * clean up api key check, add test --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-11-28 12:57:04 +01:00
Xuan-Son Nguyen	e509411cf1	server: enable jinja by default, update docs (#17524 ) * server: enable jinja by default, update docs * fix tests	2025-11-27 01:02:50 +01:00
Han Qingzhe	1d594c295c	clip: (minicpmv) fix resampler kq_scale (#17516 ) * debug:"solve minicpmv precision problem" * “debug minicpmv” * Apply suggestion from @ngxson --------- Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>	2025-11-26 21:44:07 +01:00
Pascal	b1846f1c8e	webui: add rehype plugin to restore HTML in Markdown table cells (#17477 ) * webui: add rehype plugin to restore HTML in Markdown table cells The remark/rehype pipeline neutralizes inline HTML as literal text (remarkLiteralHtml) so that XML/HTML snippets in LLM responses display as-is instead of being rendered. This causes <br> and <ul> markup in table cells to show as plain text. This plugin traverses the HAST post-conversion, parses whitelisted HTML patterns (<br>, <ul><li>) from text nodes, and replaces them with actual HAST element nodes. For lists, adjacent siblings must be combined first as the AST fragmentation breaks pattern matching. Strict validation rejects malformed markup, keeping it as raw text. * chore: update webui build output	2025-11-25 08:01:02 +01:00
Xuan-Son Nguyen	b8372eecd9	server: split server.cpp code into server/common/task/queue (#17362 ) * add server-task, server-common * add server-queue * rm redundant includes * move enum stop_type to server-task * server : headers cleanup --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-11-24 14:41:53 +01:00
Pascal	0c7220db56	webui: minor settings reorganization and add disable autoscroll option (#17452 ) * webui: added a dedicated 'Display' settings section that groups visualization options * webui: added a Display setting to toggle automatic chat scrolling * chore: update webui build output	2025-11-23 18:42:00 +01:00
Aleksander Grygier	4c91f2633f	Improved file naming & structure for UI components (#17405 ) * refactor: Component iles naming & structure * chore: update webui build output * refactor: Dialog titles + components namig * chore: update webui build output * refactor: Imports * chore: update webui build output	2025-11-20 14:07:31 +01:00
Georgi Gerganov	196f5083ef	common : more accurate sampling timing (#17382 ) * common : more accurate sampling timing * eval-callback : minor fixes * cont : add time_meas impl * cont : fix log msg [no ci] * cont : fix multiple definitions of time_meas * llama-cli : exclude chat template init from time measurement * cont : print percentage of unaccounted time * cont : do not reset timings	2025-11-20 13:40:10 +02:00
Aleksander Grygier	99c53d6558	webui: Add a "Continue" Action for Assistant Message (#16971 ) * feat: Add "Continue" action for assistant messages * feat: Continuation logic & prompt improvements * chore: update webui build output * feat: Improve logic for continuing the assistant message * chore: update webui build output * chore: Linting * chore: update webui build output * fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with assistant message * chore: update webui build output * feat: Enable "Continue" button based on config & non-reasoning model type * chore: update webui build output * chore: Update packages with `npm audit fix` * fix: Remove redundant error * chore: update webui build output * chore: Update `.gitignore` * fix: Add missing change * feat: Add auto-resizing for Edit Assistant/User Message textareas * chore: update webui build output	2025-11-19 14:39:50 +01:00
o7si	97cb3fd5ae	fix: resolve undefined variable 'svr' compilation error (#17348 )	2025-11-18 10:10:47 +02:00
Xuan-Son Nguyen	0de8878c96	server: split HTTP into its own interface (#17216 ) * server: split HTTP into its own interface * move server-http and httplib to its own file * add the remaining endpoints * fix exception/error handling * renaming * missing header * fix missing windows header * fix error responses from http layer * fix slot save/restore handler * fix case where only one stream chunk is returned * add NOMINMAX * do not call sink.write on empty data * use safe_json_to_str for SSE * clean up * add some comments * improve usage of next() * bring back the "server is listening on" message * more generic handler * add req.headers * move the chat template print to init() * add req.path * cont : minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-11-17 22:05:44 +01:00
Georgi Gerganov	5b2093becc	server : handle context overflow during decode (#17267 ) * server : handle context overflow during decode * server : minor refactor	2025-11-16 09:23:37 +02:00
Aleksander Grygier	22e1ce2f81	webui: Fix clickability around chat processing statistics UI (#17278 ) * fix: Better pointer events handling in chat processing info elements * chore: update webui build output	2025-11-15 22:41:41 +01:00
Pascal	1411d9275a	webui: add OAI-Compat Harmony tool-call streaming visualization and persistence in chat UI (#16618 ) * webui: add OAI-Compat Harmony tool-call live streaming visualization and persistence in chat UI - Purely visual and diagnostic change, no effect on model context, prompt construction, or inference behavior - Captured assistant tool call payloads during streaming and non-streaming completions, and persisted them in chat state and storage for downstream use - Exposed parsed tool call labels beneath the assistant's model info line with graceful fallback when parsing fails - Added tool call badges beneath assistant responses that expose JSON tooltips and copy their payloads when clicked, matching the existing model badge styling - Added a user-facing setting to toggle tool call visibility to the Developer settings section directly under the model selector option * webui: remove scroll listener causing unnecessary layout updates (model selector) * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * chore: npm run format & update webui build output * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>	2025-11-15 21:09:32 +01:00
Ankur Verma	c7b7db0445	mtmd-cli: Avoid logging to stdout for model loading messages in mtmd-cli (#17277 )	2025-11-15 12:41:16 +01:00
Xuan-Son Nguyen	9b17d74ab7	mtmd: add mtmd_log_set (#17268 )	2025-11-14 15:56:19 +01:00
Georgi Gerganov	d396b43748	server : fix "can batch with" bug (#17263 )	2025-11-14 14:03:45 +02:00
Aleksander Grygier	f1bad23f88	Better UX for handling multiple attachments in WebUI (#17246 )	2025-11-14 01:19:08 +01:00
Xuan-Son Nguyen	c4abcb2457	server: fixing naming conflict res_error (#17243 )	2025-11-13 20:53:47 +01:00
Aleksander Grygier	8e878f0cb4	Update packages + upgrade Storybook to v10 (#17201 ) * chore: Update packages + upgrade Storybook to v10 * fix: Increase timeout for UI tests	2025-11-12 19:01:48 +01:00
Xuan-Son Nguyen	00c94083b3	server: (refactor) implement generator-based API for task results (#17174 ) * server: (refactor) implement generator-based API for task results * improve * moving some code * fix "Response ended prematurely" * add sink.done before return false * rm redundant check * rm unused var * rename generator --> reader	2025-11-12 18:50:52 +01:00
Xuan-Son Nguyen	ee8dd5c658	server: move res_error/res_ok to static function (#17167 )	2025-11-12 14:17:24 +01:00
Adrien Gallouët	78010a0d52	cmake : move OpenSSL linking to vendor/cpp-httplib (#17177 ) * cmake : move OpenSSL linking to vendor/cpp-httplib Signed-off-by: Adrien Gallouët <angt@huggingface.co> * bring back httplib 0.27.0 * add -DLLAMA_HTTPLIB * update cmake config for visionos --------- Signed-off-by: Adrien Gallouët <angt@huggingface.co> Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-11-12 12:32:50 +01:00
Xuan-Son Nguyen	1d45b4228f	vendor: split httplib to cpp/h files (#17150 ) * vendor: split httplib to cpp/h files * move defines * include httplib if curl is not used * add TODO * fix build ios * fix build visionos instead	2025-11-11 13:32:58 +01:00
Mike Abbott	4a5b8aff40	cmake : add version to all shared object files (#17091 ) When compiling llama.cpp in Yocto, it fails QA checks because the generated so files aren't versioned. This applies a version to all generated so files, allowing the package to build without errors.	2025-11-11 13:19:50 +02:00
Nicolas B. Pierron	d2d626938a	Install rpc-server when GGML_RPC is ON. (#17149 )	2025-11-11 10:53:59 +00:00
Gabe Goodhart	0c74f32632	memory: Hybrid context shift (#17009 ) * feat(memory): Only fail partial erasure of recurrent tail The recurrent state is always assumed to be the state as of the last update from the final token in the sequence. When doing a partial erasure, if the range does not include the final token, the erasure can be considered a success since any memory used for the sequence prior to the final token (which is no memory) has been successfully removed. There is one potential case that this doesn't address which is the pruning of cache to remove sensitive data from the context. This wouldn't work for attention cache partial removal (in the middle) either since the KV state is linearly-dependent and states in later sequence positions would still be based on the state from the sensitive data, even if that data is no longer cached, so I don't think this is relevant, but it is worth noting that the semantics of this change for a partial erasure in the middle of the cache are essentially "my context is already compressed" and not "all trace of the removed tokens has been removed." https://github.com/ggml-org/llama.cpp/issues/16768 Branch: HybridContextShift-16768 Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * fix(main): Check the output of seq_rm for prefix matching This prefix matching is explicitly attempting to remove the tokens at the end of the sequence that don't match. This is the operation that can't be performed on a recurrent cache due to the state being updated in place, so if this removal fails, we need to clear the whole cache. 
https://github.com/ggml-org/llama.cpp/issues/16768 Branch: HybridContextShift-16768 Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * fix(memory): Fix condition for partial erasure failure if p0 > pos Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> Co-authored-by: compilade <git@compilade.net> * style: Fix extra parens Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * fix(main.cpp): Set n_matching_session_tokens to 0 on cache clear https://github.com/ggml-org/llama.cpp/issues/16768 Branch: HybridContextShift-16768 Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> --------- Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> Co-authored-by: compilade <git@compilade.net> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-11-10 17:14:23 +02:00
Georgi Gerganov	f914544b16	batched-bench : add "separate text gen" mode (#17103 )	2025-11-10 12:59:29 +02:00


774 Commits