This simple scheme doesn't work. Rather, the pdf outline seems
to follow the logic below:
If a child list is found when processing the current list, don't
increment the numbering.
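
A minimal sketch of that numbering rule, assuming the outline arrives as a
nested Python list in which a child list directly follows its parent entry.
The function name and the plain-string entries are hypothetical stand-ins,
not the actual pdfmagic code.

```python
def number_outline(outline, prefix=""):
    """Flatten a nested outline into (number, title) pairs."""
    numbered = []
    count = 0
    for entry in outline:
        if isinstance(entry, list):
            # A child list belongs to the previous entry, so recurse
            # under that entry's number and do NOT increment the count.
            numbered.extend(number_outline(entry, f"{prefix}{count}."))
        else:
            count += 1
            numbered.append((f"{prefix}{count}", entry))
    return numbered


if __name__ == "__main__":
    for number, title in number_outline(["Intro", ["Scope", "Terms"], "Design"]):
        print(number, title)
```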
Rather, a chat with gpt-oss generated an assistant response which
included chat-content, chat-reasoning and chat-toolcall all in the
same response. On responding to the same with the tool call result,
the server http handshake responded with a 500 Internal Server Error.
So added this to get more details in this case, as well as in
general for the future.
To make it easier for the ai model to understand that this works
mainly for html pages and not, say, xml or pdf. For those one
needs to use the other explicit tool calls provided, like
fetchpdftext or fetchxmltext.
The server service path is renamed from urltext to htmltext.
SearchWebText is also updated to use htmltext now.
At the simpleproxy end
* Add the tag names hierarchy before the contents of a tag
* Remember to convert the tagDrops to lower case, as the HTMLParser
  base class seems to do that for tag names by default.
At the client ui end
* if undefined, remember to pass an empty list wrt tagDrops.
* cleanup the func description and also mention possible tagDrops
  for RSS feeds in the tool meta
Had forgotten to add a , after the simplechat entry.
Currently I am not strictly using the importmap feature, so the
error didn't create any problem, but the error was there and has
now been fixed.
Add a pending object which maintains the pending toolcallid wrt
each chat session, whenever a tool call is made.
In turn, whenever a tool call response is got, cross check if its
toolcallid matches that in the pending list. If so, accept the
tool call response and remove it from the pending list. If not,
just ignore the response.
NOTE: The current implementation supports only 1 pending tool call
at any time.
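
The bookkeeping described above can be sketched as follows. The actual
client is JavaScript; Python is used here only as a language-neutral
sketch, and the class and method names are hypothetical.

```python
class PendingToolCalls:
    """Track at most one pending tool call id per chat session."""

    def __init__(self):
        self._pending = {}  # chat id -> pending toolcallid

    def add(self, chat_id, toolcall_id):
        # A newer tool call replaces any older pending one for this chat.
        self._pending[chat_id] = toolcall_id

    def accept(self, chat_id, toolcall_id):
        """Return True and clear the entry only if the id matches."""
        if chat_id not in self._pending:
            return False
        if self._pending[chat_id] != toolcall_id:
            return False  # stale or unknown response, ignore it
        del self._pending[chat_id]
        return True


if __name__ == "__main__":
    pending = PendingToolCalls()
    pending.add("chat1", "call-1")
    print(pending.accept("chat1", "call-2"))  # mismatched id
    print(pending.accept("chat1", "call-1"))  # matching id
```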
NOTE: Had to change from an anonymous to an arrow function so as to
be able to get access to the ToolsManager instance (this) from
within the function, i.e. make use of the lexical binding semantics
of arrow functions.
Me.tools.toolNames is now directly updated by init of ToolsManager
The two then's in the old tools.init were unneeded even then, as
both could have been merged into a single then. However, with the
new flow, the 1st then is no longer required.
Also, the direct calling of the onmessage handler on the main thread
side, wrt an immediate result from a tool call, is now delayed by
one cycle through the event loop, using a setTimeout.
No longer expose the tools module through document, given that
the tools module mainly contains ToolsManager, whose only instance
is available through the global gMe.
Move the devel related exposing through the document object into a
function of its own.
Rename Tools to ToolsManager to convey its semantics better.
Move setup of workers onmessage callback as well as directly passing
result to these callbacks into ToolsManager.
Now that Workers have been moved into ToolsManager, and ToolsManager
has been instantiated as a member of Me, use the same in place of
prev workers of Me.
So that all tools related management logic sits in the tools module
itself, but is accessible from Me by having an instance of Tools.
The Workers moved into Tools class.
The tc_switch moved into Tools class.
The setup_workers, init, meta and tool_call moved into Tools class.
Given that Me is now passed to the tools logic during setup, have
the web worker handles in Me itself, instead of in tool related
modules.
Move setup of web worker related main thread callbacks, as well as
posting messages directly to these main thread callbacks, into Me.
Have the main classes defined independent of and away from the
runtime flow.
Move out the entry point, including runtime instantiation of the
core Me class (which in turn brings in other class instances as
needed), into its own main.js file.
With this one should be able to import simplechat.js into other
files, where one might need the SimpleChat or MultiChat or Me class
definitions.
Some ais don't seem to prefer using this direct helper, provided
for fetching pdf as text, on their own. Instead the ai (gptoss)
seems keen on fetching the raw pdf and extracting the text etc, so
now renaming the function call to try and make its semantics more
readily obvious, hopefully.
It sometimes (not always) seems to assume fetch_web_url_text can
convert pdf to text and return it. Maybe I need to place the
specific fetch pdf as text before the generic fetch web url text
and so on...
With the rename, the pdf specific fetch seems to be getting used
more.
Allow user to clear the existing chat. The user does have the
option to load the just cleared chat, if required.
Add icons wrt clearing chat and settings.
Update readme wrt searchDrops, auto settings ui creation
Rename tools-auto to tools-autoSecs, to make it easy to realise
that the value represents seconds.
Update the initial skeleton wrt the tag drops logic
* had forgotten to convert the object to a json string at the client
  end
* had confused between js and python and tried accessing the dict
  elements using . notation rather than [] notation in python.
* if the id filtered tag to be dropped is found, from then on
  track all other tags of the same type (independent of id),
  so that start and end tags can be matched. Because the end tag
  call won't have attributes, all other tags of the same type need
  to be tracked, for proper winding and unwinding to try to find
  the matching end tag.
* remember to reset the tracked drop tag type to None once the
  matching end tag at the same depth is found. Should avoid some
  unnecessary unwinding.
* set/fix the type wrt tagDrops explicitly to the needed depth and
  ensure the dummy one and any explicitly got one is of the right
  type.
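
The winding and unwinding described in the list above can be sketched as
below. This is a simplified illustration built on Python's standard
html.parser.HTMLParser, not the actual TextHtmlParser; the class name and
the tag/id dict layout are assumptions made for the example.

```python
import json
from html.parser import HTMLParser


class TagDropTextParser(HTMLParser):
    """Collect text, skipping everything inside tags listed in tag_drops."""

    def __init__(self, tag_drops):
        super().__init__()
        # HTMLParser reports tag names in lower case, so normalise ours too.
        self.tag_drops = [{"tag": d["tag"].lower(), "id": d["id"]} for d in tag_drops]
        self.drop_tag = None   # tag type currently being dropped, if any
        self.drop_depth = 0    # nesting depth of that tag type
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.drop_tag is not None:
            # End tags carry no attributes, so count every tag of this type.
            if tag == self.drop_tag:
                self.drop_depth += 1
            return
        tag_id = dict(attrs).get("id")
        for drop in self.tag_drops:
            if drop["tag"] == tag and drop["id"] == tag_id:
                self.drop_tag = tag
                self.drop_depth = 1
                return

    def handle_endtag(self, tag):
        if self.drop_tag is None or tag != self.drop_tag:
            return
        self.drop_depth -= 1
        if self.drop_depth == 0:
            self.drop_tag = None  # matching end tag found, stop dropping

    def handle_data(self, data):
        if self.drop_tag is None and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html, tag_drops_json):
    """Parse html and return its text with the listed tags dropped."""
    parser = TagDropTextParser(json.loads(tag_drops_json))
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

As in the original logic, mismatched opening and closing tags of the
dropped type will throw the depth count off.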
Tested with duckduckgo search engine and now the div based unneeded
header is avoided in returned search result.
Rename search-drops to urltext-tag-drops, to indicate its more
generic semantics. The search drops specified in the UI by the user
will be mapped to the urltext-tag-drops header entry of a urltext
web fetch request.
Implement a crude urltext-tag-drops logic in TextHtmlParser.
If there is any mismatch between opening and closing tags in the
html being parsed, and in turn wrt the type of tag being targeted
for dropping, things can mess up.
Allow the web tools handshake helper to pass additional header
entries provided by its caller.
Make use of this to send a list of tag and id pairs wrt the web
search tool. This will be used to drop divs matching the specified
id.
Chances are that for ai models which don't support tool calling,
the tool calls meta data shared will be silently ignored without
much issue.
So enabling the tool calling feature by default, so that in case
one is using an ai model with tool calling, the feature is readily
available for use.
Revert SlidingWindow ChatHistory in Context from last 10 to last 5
(2 more than the original, given more context support in today's
models) by default, given that tool handshakes now go through the
tools related side channel in the http handshake and aren't morphed
into the normal user-assistant channel of the handshake.
Rename path and tags/identifiers from Pdf2Text to PdfText
Rename the function call to pdf_to_text. This should also help
indicate its semantics more unambiguously, just in case, especially
for smaller models.
Usage Note
* Cleanup / fix some wording.
* Pick chat history handshaked len from config
Ensure the settings info is up to date wrt the available tool names
by chaining a reshowing with the tools manager initialisation.
Added logic to help get a file from either the local file system
or from the web, based on the url specified.
Update pdfmagic module to use the same, so that it can support
both local as well as web based pdf.
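
A minimal sketch of such a helper, assuming the choice is made purely on
the url scheme. The function name get_file and the 30 second timeout are
hypothetical, and this is not the actual simpleproxy helper.

```python
import urllib.parse
import urllib.request


def get_file(url):
    """Return the bytes behind url, read from disk or fetched over http(s)."""
    parts = urllib.parse.urlsplit(url)
    if parts.scheme in ("", "file"):
        # Plain paths and file:// urls are read from the local file system.
        local_path = urllib.request.url2pathname(parts.path)
        with open(local_path, "rb") as fh:
            return fh.read()
    if parts.scheme in ("http", "https"):
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.read()
    raise ValueError(f"unsupported url scheme: {parts.scheme!r}")
```

A pdf helper could then pass the returned bytes to its pdf reader through
an in-memory io.BytesIO buffer, whichever source they came from.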
Bring in the debug module, which I had forgotten to commit, after
moving the debug helper code from simpleproxy.py to the debug module.
Also move the debug dump helper to its own module.
Also remember to specify the class name in quotes, similar to
referring to a class within a member of the class, wrt python type
checking.
Add --allowed.schemes config entry as a needed config.
Setup the url validator.
Use this wrt urltext, urlraw and pdf2text.
This allows the user to control whether local file access is
enabled or not. By default, in the sample simpleproxy.json config
file, local file access is allowed.
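
A minimal sketch of a scheme based url validator, assuming the allowed
schemes arrive as a plain list of strings. The function name and the
(ok, reason) return shape are hypothetical, not the actual simpleproxy
code.

```python
import urllib.parse


def validate_url(url, allowed_schemes):
    """Return (ok, reason) depending on whether the url scheme is allowed."""
    if not url:
        return False, "empty url"
    scheme = urllib.parse.urlsplit(url).scheme  # urlsplit lower-cases it
    if not scheme:
        return False, "url has no scheme"
    if scheme not in allowed_schemes:
        return False, f"scheme {scheme!r} is not allowed"
    return True, ""


if __name__ == "__main__":
    # Leaving "file" out of the list disables local file access.
    print(validate_url("file:///tmp/sample.pdf", ["http", "https"]))
    print(validate_url("https://example.com/sample.pdf", ["http", "https"]))
```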