llama.cpp/tools/server/public_simplechat/local.tools
hanishkvc c316f5a2bd SimpleChatTC:WebTools:UrlText:HtmlParser: tag drops - refine
Update the initial skeleton wrt the tag drops logic

* had forgotten to convert object to json string at the client end
* had confused between js and python and tried accessing the dict
  elements using . notation rather than [] notation in python.
* if the id-filtered tag to be dropped is found, from then on
  track all other tags of the same type (independent of id),
  so that start and end tags can be matched. Because the end tag
  call won't have attributes, all other tags of the same type need
  to be tracked, for proper winding and unwinding to try to find
  the matching end tag.
* remember to reset the tracked drop tag type to None once the
  matching end tag at the same depth is found. This should avoid
  some unnecessary unwinding.
* set/fix the type wrt tagDrops explicitly to needed depth and
  ensure the dummy one and any explicitly got one is of right type.
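
The tag drop tracking described in the points above could look roughly like the sketch below. It is built on Python's standard html.parser and is not the code in webmagic.py: the class name TagDropTextParser, the helper extract_text, and the tagDrops shape (a JSON list of {"tag": ..., "id": ...} entries) are assumptions made for illustration.

```python
import json
from html.parser import HTMLParser


class TagDropTextParser(HTMLParser):
    """Collect page text while skipping elements matched by a tag/id drop list."""

    def __init__(self, tag_drops):
        super().__init__()
        # Assumed shape: [{"tag": "div", "id": "header"}, ...]
        self.tag_drops = tag_drops
        self.drop_tag = None  # tag type tracked while inside a dropped element
        self.drop_depth = 0   # nesting depth of that tag type
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.drop_tag is not None:
            # End tags carry no attributes, so count every tag of the
            # tracked type, whatever its id, to pair starts with ends.
            if tag == self.drop_tag:
                self.drop_depth += 1
            return
        attr_id = dict(attrs).get("id")
        for drop in self.tag_drops:
            # Python dicts need [] access, not JavaScript-style dot access.
            if drop["tag"] == tag and drop["id"] == attr_id:
                self.drop_tag = tag
                self.drop_depth = 1
                return

    def handle_endtag(self, tag):
        if self.drop_tag is None or tag != self.drop_tag:
            return
        self.drop_depth -= 1
        if self.drop_depth == 0:
            # Matching end tag found at the same depth: stop tracking.
            self.drop_tag = None

    def handle_data(self, data):
        text = data.strip()
        if self.drop_tag is None and text:
            self.chunks.append(text)

    def get_text(self):
        return "\n".join(self.chunks)


def extract_text(html, tag_drops_json):
    """Parse html, dropping the elements listed in the JSON string tag_drops_json."""
    parser = TagDropTextParser(json.loads(tag_drops_json))
    parser.feed(html)
    parser.close()
    return parser.get_text()
```

In this sketch the client is expected to send tagDrops as a JSON string (for example via JSON.stringify on the JS side), which extract_text decodes with json.loads. A void element such as img used as a drop target would never see an end tag, so a real implementation would need to handle that case separately.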

Tested with the DuckDuckGo search engine; the unneeded div-based
header is now dropped from the returned search result.
2025-12-04 19:41:39 +05:30
..
debug.py SimpleChatTC:SimpleProxy: getting local / web file module ++ 2025-12-04 19:41:39 +05:30
filemagic.py SimpleChatTC:SimpleProxy:Switch web flow to use file helpers 2025-12-04 19:41:39 +05:30
pdfmagic.py SimpleChatTC:PdfText:Cleanup rename to follow a common convention 2025-12-04 19:41:39 +05:30
simpleproxy.json SimpleChatTC:SimpleProxy:Add generic arxiv.org entry to allowed 2025-12-04 19:41:39 +05:30
simpleproxy.py SimpleChatTC:SimpleProxy: Validate deps wrt enabled service paths 2025-12-04 19:41:39 +05:30
urlvalidator.py SimpleChatTC:SimpleProxy: Use urlvalidator 2025-12-04 19:41:39 +05:30
webmagic.py SimpleChatTC:WebTools:UrlText:HtmlParser: tag drops - refine 2025-12-04 19:41:39 +05:30