SimpleChatTC:XmlFiltered: Avoid showing skipped tags as no content

Dont even insert skipped tags as tag blocks with empty content.

This should make the resultant xml cleaner and make it use less
space.
This commit is contained in:
hanishkvc 2025-11-08 00:27:43 +05:30
parent 143f9c0b1a
commit 0628226ea1
2 changed files with 17 additions and 7 deletions

View File

@ -259,19 +259,22 @@ class XMLFilterParser(html.parser.HTMLParser):
return True return True
def handle_starttag(self, tag: str, attrs: list[tuple[str, str | None]]): def handle_starttag(self, tag: str, attrs: list[tuple[str, str | None]]):
self.lastTrackedCB = "starttag"
self.prefixTags.append(tag) self.prefixTags.append(tag)
if not self.do_capture():
return
self.lastTrackedCB = "starttag"
self.prefix += "\t" self.prefix += "\t"
self.text += f"\n{self.prefix}<{tag}>" self.text += f"\n{self.prefix}<{tag}>"
def handle_endtag(self, tag: str): def handle_endtag(self, tag: str):
if (self.lastTrackedCB == "endtag"): if self.do_capture():
self.text += f"\n{self.prefix}</{tag}>" if (self.lastTrackedCB == "endtag"):
else: self.text += f"\n{self.prefix}</{tag}>"
self.text += f"</{tag}>" else:
self.lastTrackedCB = "endtag" self.text += f"</{tag}>"
self.lastTrackedCB = "endtag"
self.prefix = self.prefix[:-1]
self.prefixTags.pop() self.prefixTags.pop()
self.prefix = self.prefix[:-1]
def handle_data(self, data: str): def handle_data(self, data: str):
if self.do_capture(): if self.do_capture():

View File

@ -664,6 +664,7 @@ sliding window based drop off or even before they kick in, this can help in many
* renamed and updated logic wrt xml fetching to be fetch_xml_filtered. allow one to use re to identify * renamed and updated logic wrt xml fetching to be fetch_xml_filtered. allow one to use re to identify
the tags to be filtered in a fine grained manner including filtering based on tag heirarchy the tags to be filtered in a fine grained manner including filtering based on tag heirarchy
* avoid showing empty skipped tag blocks
* logic which shows the generated tool call has been updated to trap errors when parsing the function call * logic which shows the generated tool call has been updated to trap errors when parsing the function call
arguments generated by the ai. This ensures that the chat ui itself doesnt get stuck in it. Instead now arguments generated by the ai. This ensures that the chat ui itself doesnt get stuck in it. Instead now
@ -683,6 +684,12 @@ Handle multimodal handshaking with ai models.
Add fetch_rss and may be different document formats processing related tool calling, in turn through Add fetch_rss and may be different document formats processing related tool calling, in turn through
the simpleproxy.py if and where needed. the simpleproxy.py if and where needed.
* Using xmlfiltered and tagDropREs of
* ["^rss:channel:item:(?!title).+$"] one can fetch and extract out all the titles.
* ["^rss:channel:item:(?!title|link|description).+$"] one can fetch and extract out all the
titles along with corresponding links and descriptions
* rather with some minimal proding and guidance gpt-oss generated this to use xmlfiltered to read rss
Save used config entries along with the auto saved chat sessions and inturn give option to reload the Save used config entries along with the auto saved chat sessions and inturn give option to reload the
same when saved chat is loaded. same when saved chat is loaded.