Commit Graph

466 Commits

Author SHA1 Message Date
hanishkvc 06fd41a88e SimpleChatTC:WebTools: urltext-tag-drops python side - skel
Rename search-drops to urltext-tag-drops, to indicate its more
generic semantics. Search drops specified in the UI by the user
will be mapped to the urltext-tag-drops header entry of a urltext
web fetch request.

Implement a crude urltext-tag-drops logic in TextHtmlParser.
If the opening and closing tags in the html being parsed are
mismatched, in particular for the type of tag targeted for
dropping, things can go wrong.
2025-12-04 19:41:39 +05:30
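The tag-drop idea in the commit above can be sketched roughly as follows. This is only an illustration built on the standard library `html.parser`, not the project's TextHtmlParser: the class name `TagDropTextParser`, the helper `html_to_text` and the `{"tag": ..., "id": ...}` shape of a drop entry are all assumptions. It shares the caveat from the commit message: it counts nested tags of the dropped type, so mismatched tags in the input will confuse it.

```python
from html.parser import HTMLParser


class TagDropTextParser(HTMLParser):
    """Collect text, skipping any element whose (tag, id) pair is in drops."""

    def __init__(self, drops):
        super().__init__()
        # drops: list of {"tag": ..., "id": ...} dicts (assumed shape)
        self.drops = {(d["tag"], d["id"]) for d in drops}
        self.drop_tag = None  # tag name currently being dropped, if any
        self.depth = 0        # nesting depth of that tag name while dropping
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.drop_tag is not None:
            # Already dropping: only track nesting of the same tag name
            if tag == self.drop_tag:
                self.depth += 1
            return
        if (tag, dict(attrs).get("id")) in self.drops:
            self.drop_tag = tag
            self.depth = 1

    def handle_endtag(self, tag):
        if self.drop_tag is not None and tag == self.drop_tag:
            self.depth -= 1
            if self.depth == 0:
                self.drop_tag = None

    def handle_data(self, data):
        if self.drop_tag is None and data.strip():
            self.chunks.append(data.strip())

    def text(self):
        return "\n".join(self.chunks)


def html_to_text(html, drops):
    parser = TagDropTextParser(drops)
    parser.feed(html)
    parser.close()
    return parser.text()
```

For example, `html_to_text(page, [{"tag": "div", "id": "nav"}])` would skip the whole `<div id="nav">` subtree, including any divs nested inside it.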
hanishkvc f75bdb0e00 SimpleChatTC:WebTools And Search - headers and search drops - js
Allow the web tools handshake helper to pass additional header
entries provided by its caller.

Use this to send a list of tag and id pairs for the web search
tool, which will be used to drop divs matching the specified id.
2025-12-04 19:41:39 +05:30
hanishkvc 7fce3eeb2a SimpleChatTC:SettingsDefault:Enable cache prompt api option 2025-12-04 19:41:39 +05:30
hanishkvc 2cdf3f574c SimpleChatTC:SimpleProxy: Validate deps wrt enabled service paths
helps ensure only service paths that can be serviced are enabled

Use same to check for pypdf wrt pdftext
2025-12-04 19:41:39 +05:30
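A dependency check of the kind described above might look like the sketch below. The `PATH_DEPS` table and the `usable_paths` function are hypothetical names; only the pypdf requirement for pdftext comes from the commit message. `importlib.util.find_spec` is used so that a module's presence can be tested without importing it.

```python
import importlib.util

# Hypothetical mapping of service path -> python modules it needs
PATH_DEPS = {
    "urlraw": [],
    "urltext": [],
    "pdftext": ["pypdf"],
}


def usable_paths(enabled, path_deps=PATH_DEPS):
    """Return the enabled service paths whose required modules are installed."""
    ok = []
    for path in enabled:
        missing = [
            mod for mod in path_deps.get(path, [])
            if importlib.util.find_spec(mod) is None
        ]
        if missing:
            print(f"WARN: disabling {path}, missing modules: {missing}")
            continue
        ok.append(path)
    return ok
```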
hanishkvc e6fd0ed05a SimpleChatTC: ToolCalling enabled, Sliding window adjust
For AI models that don't support tool calling, chances are that
the tool call metadata shared will be silently ignored without
much issue.

So enable the tool calling feature by default, so that when one
is using an AI model with tool calling, the feature is readily
available for use.

Revert SlidingWindow ChatHistory in Context from last 10 to last 5
(2 more than the original, given more context support in today's
models) by default, given that tool handshakes now go through the
tools related side channel in the http handshake and aren't morphed
into the normal user-assistant channel of the handshake.
2025-12-04 19:41:39 +05:30
hanishkvc 1d1894ad14 SimpleChatTC:PdfText:Cleanup rename to follow a common convention
Rename path and tags/identifiers from Pdf2Text to PdfText

Rename the function call to pdf_to_text, this should also help
indicate semantic more unambiguously, just in case, especially
for smaller models.
2025-12-04 19:41:39 +05:30
hanishkvc 8501759f60 SimpleChatTC:Cleanup:UsageNote, Initial SettingsInfo shown
Usage Note
* Cleanup / fix some wording.
* Pick chat history handshaked len from config

Ensure the settings info is up to date wrt available tool names
by chaining a reshowing with tools manager initialisation.
2025-12-04 19:41:39 +05:30
hanishkvc a4483e3bc7 SimpleChatTC:Cleanup Usage Note and its presentation a bit
Make it a details block and update the content a bit
2025-12-04 19:41:39 +05:30
hanishkvc e10a826273 SimpleChatTC: Cleanup - remove older now unused show chat logic 2025-12-04 19:41:39 +05:30
hanishkvc 9efab62702 SimpleChatTC:SimpleProxy:Add generic arxiv.org entry to allowed 2025-12-04 19:41:39 +05:30
hanishkvc 3b929f934f SimpleChatTC:SimpleProxy:Switch web flow to use file helpers
This also indirectly adds support for local file system access
through the web / fetch (ie urlraw and urltext) service request paths.
2025-12-04 19:41:39 +05:30
hanishkvc e1cf2bae7e SimpleChatTC:SimpleProxy:Pdf2Text update /cleanup readme 2025-12-04 19:41:39 +05:30
hanishkvc 494d063657 SimpleChatTC:SimpleProxy: getting local / web file module ++
Added logic to help get a file from either the local file system
or from the web, based on the url specified.

Update pdfmagic module to use the same, so that it can support
both local as well as web based pdf.

Bring in the debug module, which I had forgotten to commit, after
moving debug helper code from simpleproxy.py to the debug module
2025-12-04 19:41:39 +05:30
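Fetching a file from either the local file system or the web, based on the url scheme, could be sketched along these lines. The function name `get_file` and its `timeout` parameter are assumptions for illustration; the project's actual file helper module may be organised quite differently.

```python
import urllib.request
from urllib.parse import urlparse, unquote


def get_file(url, timeout=10):
    """Return the raw bytes behind a file://, http:// or https:// url."""
    parsed = urlparse(url)
    if parsed.scheme == "file":
        # Local file: the url path (percent-decoded) is the file system path
        with open(unquote(parsed.path), "rb") as f:
            return f.read()
    if parsed.scheme in ("http", "https"):
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    raise ValueError(f"unsupported url scheme: {parsed.scheme!r}")
```

A caller such as a pdf extractor can then work on the returned bytes without caring where they came from.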
hanishkvc a3beacf16a SimpleChatTC:SimpleProxy:Pdf2Text cleanup page number handling
It's not necessary to always request a page number range.

Take care of page numbers starting from 1 while the underlying data
uses 0 as the starting index.
2025-12-04 19:41:39 +05:30
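The page number handling above amounts to mapping an optional 1-based inclusive range onto 0-based indices. A small sketch, where `page_indices` and its parameter names are hypothetical:

```python
def page_indices(total_pages, start_page=None, end_page=None):
    """Map an optional 1-based inclusive page range to 0-based indices.

    A missing start means the first page, a missing end means the last page.
    Out-of-range values are clamped to the document.
    """
    start = 1 if start_page is None else max(1, start_page)
    end = total_pages if end_page is None else min(total_pages, end_page)
    return range(start - 1, end)
```

With a 5 page document, pages 2 to 3 map to indices 1 and 2, and omitting both bounds covers indices 0 to 4.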
hanishkvc d012d127bf SimpleChatTC:SimpleProxy: Avoid circular deps wrt Type Checking
also move debug dump helper to its own module

also remember to specify the Class name in quotes, similar to
referring to a class within a member of the class wrt python type
checking.
2025-12-04 19:41:39 +05:30
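The two quoting cases mentioned above can be illustrated as follows. `Node` and `describe` are made-up names, and the standard library `BaseHTTPRequestHandler` stands in for a project class that would otherwise cause an import cycle.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers and skipped at runtime, so this import
    # cannot create a cycle. Stand-in for a class from another module.
    from http.server import BaseHTTPRequestHandler


class Node:
    def __init__(self, name: str):
        self.name = name

    def link(self, other: "Node") -> "Node":
        # Quoted because Node is not yet bound while its own body executes
        self.next = other
        return other


def describe(handler: "BaseHTTPRequestHandler") -> str:
    # Quoted because the name is never imported at runtime
    return f"{type(handler).__name__} for {handler.path}"
```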
hanishkvc 350d7d77e0 SimpleChatTC:SimpleProxy: Move web requests to its own module 2025-12-04 19:41:39 +05:30
hanishkvc a7de002fd0 SimpleChatTC:SimpleProxy:Move pdf logic into its own module 2025-12-04 19:41:39 +05:30
hanishkvc b18aed4449 SimpleChatTC:SimpleProxy: AuthAndRun hlpr for paths that check auth
Also trap any exceptions while handling and send exception info
to the client requesting service
2025-12-04 19:41:39 +05:30
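One possible shape for such a helper is sketched below. Everything specific here is an assumption: the name `auth_and_run`, the bearer token check, and the status codes are illustrative only and are not taken from simpleproxy.py.

```python
def auth_and_run(headers, expected_token, run):
    """Check auth, then run the handler; return (status, body).

    Any exception raised by the handler is trapped and reported back
    to the caller instead of being allowed to propagate.
    """
    if headers.get("Authorization") != f"Bearer {expected_token}":
        return 401, "unauthorized"
    try:
        return 200, run()
    except Exception as exc:
        # Send the exception info to the client requesting the service
        return 502, f"{type(exc).__name__}: {exc}"
```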
hanishkvc c597572e10 SimpleChatTC:SimpleProxy: Use urlvalidator
Add --allowed.schemes config entry as a needed config.

Setup the url validator.

Use this wrt urltext, urlraw and pdf2text

This allows the user to control whether local file access is enabled
or not. By default, local file access is allowed in the sample
simpleproxy.json config file.
2025-12-04 19:41:39 +05:30
hanishkvc 6cab95657f SimpleChatTC:SimpleProxy:UrlValidator initial go
Check if the specified scheme is allowed or not.

If allowed, then call the corresponding validator to check whether
the remaining part of the url is fine or not.
2025-12-04 19:41:39 +05:30
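The two-step check described above (scheme first, then a per-scheme validator for the rest of the url) might be sketched like this. The function names, the `(ok, reason)` return shape and the individual checks are assumptions; the real UrlValidator module likely checks more, such as an allowed domains list.

```python
from urllib.parse import urlparse


def validate_url(url, allowed_schemes, validators):
    """Return (ok, reason). validators maps scheme -> callable(parsed)."""
    parsed = urlparse(url)
    if parsed.scheme not in allowed_schemes:
        return False, f"scheme not allowed: {parsed.scheme!r}"
    check = validators.get(parsed.scheme)
    if check is None:
        return False, f"no validator for scheme: {parsed.scheme!r}"
    return check(parsed)


def check_file(parsed):
    return (True, "ok") if parsed.path else (False, "missing path")


def check_web(parsed):
    return (True, "ok") if parsed.hostname else (False, "missing host")


VALIDATORS = {"file": check_file, "http": check_web, "https": check_web}
```

Leaving `file` out of the allowed schemes list is then enough to turn off local file access.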
hanishkvc c8407a1240 SimpleChatTC:SimpleProxy:UrlValidator module initial skeleton
Copy validate_url and build initial skeleton
2025-12-04 19:41:39 +05:30
hanishkvc d3a893cac9 SimpleChatTC:Update notes 2025-12-04 19:41:39 +05:30
hanishkvc c21bef4ddd SimpleChatTC:Fixup auto toolcall wrt newer ChatShow flow
This is an initial go wrt the new overall flow, should work, but
need to cross check.
2025-12-04 19:41:39 +05:30
hanishkvc dd0a7ec500 SimpleChatTC:Pdf2Text: Make it work with a subset of pages
Initial go, need to review the code flow as well as test it out
2025-12-04 19:41:39 +05:30
hanishkvc 8bc7de4416 SimpleChatTC:TC Result truncating only if needed
As I was seeing the truncated message even for stripped plain text
web access, relooking at that initial go at truncating revealed
an oversight: the truncation logic triggered any time
iResultMaxDataLength was greater than 0, irrespective of whether
the actual result was smaller than the allowed limit or not,
thus adding the truncated message to the end of the result
unnecessarily. Have fixed that oversight.

Also the recent any number of args based simpleproxy handshake
helper in toolweb seems to be working (at least for the existing
single arg based calls).
2025-12-04 19:41:39 +05:30
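The fixed condition can be shown in a few lines. The original logic lives on the js side; this sketch uses Python only for consistency with the other examples here, and `limit_result` and `TRUNC_NOTE` are made-up names.

```python
TRUNC_NOTE = "\n\n... truncated ..."  # hypothetical marker text


def limit_result(data, max_len):
    """Truncate only when a limit is set AND the data actually exceeds it."""
    if max_len > 0 and len(data) > max_len:
        return data[:max_len] + TRUNC_NOTE
    return data
```

The oversight described above corresponds to testing only `max_len > 0`, which appends the note even to results that already fit.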
hanishkvc 63a8ddfbb9 SimpleChatTC:SimpleProxyHS: make helper work with any num of args
This makes the logic more generic, as well as prepares for additional
parameters to be passed to the simpleproxy.py helper handshakes.

Ex: Restrict extracted contents of a pdf to specified start and end
page numbers or so.
2025-12-04 19:41:39 +05:30
hanishkvc 61064baa19 SimpleChatTC:Pdf2Text and otherwise readme update
Half asleep as usual ;)
2025-12-04 19:41:39 +05:30
hanishkvc e077f23f9e SimpleChatTC:Pdf2Text: Refine desc and MaxResultDataLength
Needed to tweak the description further for the ai model to be
able to understand that it's ok to pass file:// scheme based urls.

Had forgotten how big web site pages have become, as well as the
need for more ResultDataLength with a one shot PDF read, to get
at least a good enough amount of content with large pdfs.
2025-12-04 19:41:39 +05:30
hanishkvc 21544eaf87 SimpleChatTC:ResultMaxDataLength, Desc
Allow user to limit the max amount of result data returned to ai
after a tool call.

It is set to 2K by default.

Update the pdf2text tool description to try to make the local file
path support more explicit.
2025-12-04 19:41:39 +05:30
hanishkvc dfeb94d3f6 SimpleChatTC:Pdf2Text: cleanup initial go
Make the description a bit more explicit about it supporting local
file paths as part of the url scheme, as the tested ai model was
complaining about not supporting the file url scheme. Need to check
if this new description will make things better.

Convert the text to bytes for writing to the http pipe.

Ensure CORS is kept happy by passing AccessControlAllowOrigin in
header.
2025-12-04 19:41:39 +05:30
hanishkvc f97efb86e4 SimpleChatTC:SimpleProxy:Pdf2Text: js side initial plumbing
Expose pdf2text tool call to ai server and handshake with simple
proxy for the same.
2025-12-04 19:41:39 +05:30
hanishkvc 6054ddfb65 SimpleChatTC:SimpleProxy:Pdf2Text: Initial go 2025-12-04 19:41:39 +05:30
hanishkvc 5ec29087ea SimpleChatTC:SimpleProxy:Pdf2Text: Move handling url to its own 2025-12-04 19:41:39 +05:30
hanishkvc ecfdb66c94 SimpleChatTC:SimpleProxy:Pdf2Text:Initial plumbing
Get the pdf2text request for processing.
2025-12-04 19:41:39 +05:30
hanishkvc da98a961ab SimpleChatTC:SimpleProxy: Enable allowing or not requested feature 2025-12-04 19:41:39 +05:30
hanishkvc 0b2329e5de SimpleChatTC: Update readme 2025-12-04 19:41:39 +05:30
hanishkvc 6a8ced244c SimpleChatTC:Raise Error on Ai Chat server handshake NotOk resp 2025-12-04 19:41:39 +05:30
hanishkvc 91f39b7197 SimpleChatTC:Move chat server handshake to SimpleChat 2025-12-04 19:41:39 +05:30
hanishkvc 482517543b SimpleChatTC:Seperate out actual nw handshake - initial go
move the actual chat handshake with ai server into separate code
to an extent.

also initial anchor to trap handshake http error responses

Come to think of it, it's better to move this into the SimpleChat
class.

Use finally to ensure any needed cleanup for handle_user_submit
occurs within itself.
2025-12-04 19:41:39 +05:30
hanishkvc 8d7eece81c SimpleChatTC:ToolsWorker: Update note to flow with chat session id 2025-12-04 19:41:39 +05:30
hanishkvc 8ab8727f70 SimpleChatTC:DataStore: update readme 2025-12-04 19:41:39 +05:30
hanishkvc 5935ecceca SimpleChatTC:DataStore:Cleanup:Msg, duplicate on routing side
Avoid the duplicate plumbing code and use a common ops plumbing
helper.

Remove args[key] oversight from DataStoreList msg on web worker
2025-12-04 19:41:39 +05:30
hanishkvc 57dd228512 SimpleChatTC:DataStore:List keys - the plumbing 2025-12-04 19:41:39 +05:30
hanishkvc 2d497069d2 SimpleChatTC:DataStore:list - web worker side logic
The basic skeleton added on the web worker side for listing keys.

TODO: Avoid duplication of similar code to an extent across some
of these db ops.
2025-12-04 19:41:39 +05:30
hanishkvc 7f8eb04875 SimpleChatTC:DataStore:Delete a record - the plumbing side 2025-12-04 19:41:39 +05:30
hanishkvc bd7f7cb72a SimpleChatTC:DataStore: Delete a record - the db web worker side 2025-12-04 19:41:39 +05:30
hanishkvc d80e438cfa SimpleChatTC:DataStore:Put, stringify undefined, readme
Update the descriptions of set and get to indicate the possible
corner cases or rather semantic in such situations.

Update the readme also a bit. The auto save and restore mentioned
has nothing to do with the new data store mechanism.
2025-12-04 19:41:39 +05:30
hanishkvc 2dad246d53 SimpleChatTC:DataStore: Dont ignore the error paths
Also, indexedDB add does not allow updating an existing key.
2025-12-04 19:41:39 +05:30
hanishkvc 4ad88f0da8 SimpleChatTC:DataStore:Eagerness to Wrong JSON conversions
In the eagerness of the initial skeleton, had forgotten that the
root/generic tool call router takes care of parsing the json string
into an object before calling the tool call, so no need to try to
parse again. Fixed the same.

Hadn't converted the object based response from data store related
calls in the db web worker into a json string before passing it to
the generic tool response callback. Fixed the same.

- The thought of making ChatMsgEx.createAllInOne handle either a
string or an object is set aside for now, to keep things simple and
consistent to the greatest extent possible across different flows.

And good news - the flow is working, at least for the overall happy
path. Need to check what corner cases are lurking; calling set on
the same key more than once seemed to have some flow oddity, which
I need to check later.

Also maybe change the field name from data to value in the response
to get, to match the field name convention of set. GPT-OSS is fine
with it, but micro / nano / pico models may trip up in the worst
case, so better to keep things consistent.
2025-12-04 19:41:39 +05:30
hanishkvc 797b702251 SimpleChatTC:DataStore:FuncCallArgs: Any type not supported
So mention that the ai can send complex objects in stringified
form. Once the type of value is set to string, the ai should
normally do this anyway, but there is no harm in hinting.
2025-12-04 19:41:39 +05:30