llama.cpp

Commit Graph

Author	SHA1	Message	Date
hanishkvc	dd0a7ec500	SimpleChatTC:Pdf2Text: Make it work with a subset of pages Initial go, need to review the code flow as well as test it out	2025-12-04 19:41:39 +05:30
hanishkvc	dfeb94d3f6	SimpleChatTC:Pdf2Text: cleanup initial go Make the description bit more explicit with it supporting local file paths as part of the url scheme, as the tested ai model was cribbing about not supporting file url scheme. Need to check if this new description will make things better. Convert the text to bytes for writing to the http pipe. Ensure CORS is kept happy by passing AccessControlAllowOrigin in header.	2025-12-04 19:41:39 +05:30
hanishkvc	6054ddfb65	SimpleChatTC:SimpleProxy:Pdf2Text: Initial go	2025-12-04 19:41:39 +05:30
hanishkvc	5ec29087ea	SimpleChatTC:SimpleProxy:Pdf2Text: Move handling url to its own	2025-12-04 19:41:39 +05:30
hanishkvc	ecfdb66c94	SimpleChatTC:SimpleProxy:Pdf2Text:Initial plumbing Get the pdf2text request for processing.	2025-12-04 19:41:39 +05:30
hanishkvc	da98a961ab	SimpleChatTC:SimpleProxy: Enable allowing or not requested feature	2025-12-04 19:41:39 +05:30
hanishkvc	84403973cd	SimpleChatTC:SimpleProxy: once in a bluemoon transformed bearer instead of using the shared bearer token as is, hash it with current year and use the hash. keep /aum path out of auth check. in future bearer token could be transformed more often, as well as with additional nonce/dynamic token from server got during initial /aum handshake as also running counter and so ... NOTE: All these circus not good enough, given that currently the simpleproxy.py handshakes work over http. However these skeletons put in place, for future, if needed. TODO: There is a once in a bluemoon race when the year transitions between client generating the request and server handling the req. But otherwise year transitions dont matter because client always creates fresh token, and server checks for year change to generate fresh token if required.	2025-12-04 19:41:39 +05:30
hanishkvc	6d08cda9c8	SimpleChatTC:SimpleProxy: Check for bearer authorization As noted in the comments in code, this is a very insecure flow for now.	2025-12-04 19:41:39 +05:30
hanishkvc	3f1fd289eb	SimpleChatTC:SimpleProxy:BearerInsecure a needed config Add a config entry called bearer.insecure which will contain a token used for bearer auth of http requests Make bearer.insecure and allowed.domains as needed configs, and exit program if they arent got through cmdline or config file.	2025-12-04 19:41:39 +05:30
hanishkvc	0caa2e8101	SimpleChatTC:SimpleProxy: Prg Parameters handling cleanup - next Ensure load_config gets called on encountering --config in cmdline, so that the user has control over whether cmdline or config file will decide the final value of any given parameter. Ensure that str type values in cmdline are picked up directly, without running them through ast.literal_eval, because otherwise one will have to ensure through the cmdline arg mechanism that string quote is retained for literal_eval Have the """ function note/description below def line immediately so that it is interpreted as a function description.	2025-12-04 19:41:39 +05:30
hanishkvc	f221a2c356	SimpleChatTC:SimpleProxy:LoadConfig ProcessArgs cleanup - initial Now both follow a similar mechanism and do the following * exit on finding any issue, so that things are in a known state from usage perspective, without any confusion/overlook * check if the cmdlineArgCmd/configCmd being processed is a known one or not. * check value of the cmd is of the expected type * have a generic flow which can accommodate more cmds in future in a simple way	2025-12-04 19:41:39 +05:30
hanishkvc	9e97880dde	SimpleChatTC:SimpleProxy:Cleanup avoid logically duplicate debug log	2025-12-04 19:41:39 +05:30
hanishkvc	4c1c363504	SimpleChatTC:SimpleProxy: debug dumps to identify funny bing bing raised a challenge for chrome triggered search requests after few requests, which were spread few minutes apart, while still seemingly allowing wget based search to continue (again spread few minutes apart). Added a simple helper to trace this, use --debug True to enable same.	2025-12-04 19:41:39 +05:30
hanishkvc	bebf846157	SimpleChatTC:SimpleProxy:Cleanup a bit The tagging of messages wrt ValidateUrl and UrlReq Also dump req Move check for --allowed.domains to ValidateUrl NOTE: Also with mimicking of user agent et al. from got request to the generated request, yahoo search/news is returning results now, instead of the bland error before.	2025-12-04 19:41:39 +05:30
hanishkvc	d0b9103176	SimpleChatTC:SimpleProxy:Try mimic real client using got req info ie include User-Agent, Accept-Language and Accept in the generated request using equivalent values got in the request being proxied.	2025-12-04 19:41:39 +05:30
hanishkvc	e6e0adbe90	SimpleChatTC:SimpleProxy: Some debug prints which give info	2025-12-04 19:41:39 +05:30
hanishkvc	840cab0b1c	SimpleChatTC:SimpleProxy: Include a sample config file with allowed domains set to few sites in general to show its use this includes some sites which allow search to be carried out through them as well as provide news aggregation	2025-12-04 19:41:39 +05:30
hanishkvc	370326b1ec	SimpleChatTC:SimpleProxy: Cleanup domain filtering and general Had confused between js and python wrt accessing dictionary contents and its consequence on non existent key. Fixed it. Use different error ids to distinguish between failure in common urlreq and the specific urltext and urlraw helpers.	2025-12-04 19:41:39 +05:30
hanishkvc	71ad609db6	SimpleChatTC:SimpleProxy: AllowedDomains based filtering Allow fetching from only specified allowed.domains	2025-12-04 19:41:39 +05:30
hanishkvc	58954c8814	SimpleChatTC:SimpleProxy: Update doc following python convention	2025-12-04 19:41:39 +05:30
hanishkvc	62dcd506e3	SimpleChatTC:SimpleProxy:Allow for loading json based config file The config entries should be named same as their equivalent cmdline argument entries but without the -- prefix	2025-12-04 19:41:39 +05:30
hanishkvc	98d43fac7f	SimpleChatTC:WebFetch: Try confirm simpleproxy before enabling	2025-12-04 19:41:39 +05:30
hanishkvc	d04c8cd38d	SimpleChatTC:SimpleProxy: Ensure CORS related headers sent always Add a new send headers common helper and use the same wrt the overridden send_error as well as do_OPTIONS This ensures that if there is any error during proxy operations, the send_error propagates to the fetch from any browser properly without browser intercepting it with a CORS error	2025-12-04 19:41:39 +05:30
hanishkvc	73a144c44d	SimpleChatTC:SimpleProxy:HtmlParser more generic and flexible also now track header, footer and nav so that they arent captured	2025-12-04 19:41:39 +05:30
hanishkvc	9ff2c596ee	SimpleChatTC:SimpleProxy:Options just in case	2025-12-04 19:41:39 +05:30
hanishkvc	bf63b8f45a	SimpleChatTC:SimpleProxy:UrlText: Slightly better trimming First identify lines which have only whitespace and replace them with lines with only newline char in them. Next strip out adjacent lines, if they have only newlines	2025-12-04 19:41:39 +05:30
hanishkvc	266e825c68	SimpleChatTC:SimpleProxy:UrlText: Try strip empty lines some what	2025-12-04 19:41:39 +05:30
hanishkvc	b46bbc542a	SimpleChatTC:SimpleProxy:UrlText: Avoid style blocks also	2025-12-04 19:41:39 +05:30
hanishkvc	f493e1af59	SimpleChatTC:SimpleProxy:UrlText: Capture body except for scripts	2025-12-04 19:41:39 +05:30
hanishkvc	45b05df21b	SimpleChatTC:SimpleProxy: Switch to html.parser As html can be malformed, xml ElementTree XMLParser cant handle the same properly, so switch to the HtmlParser helper class that is provided by python and try extend it. Currently a minimal skeleton to just start it out, which captures only the body contents.	2025-12-04 19:41:39 +05:30
hanishkvc	d5f4183f7c	SimpleChatTC:SimpleProxy: ElementTree, No _UrlopenRet As _UrlopenRet not exposed for use outside urllib, so decode and encode the data. Add skeleton to try get the html/xml tree top elements	2025-12-04 19:41:39 +05:30
hanishkvc	6537559360	SimpleChatTC:SimpleProxy:Common UrlReq helper for UrlRaw & UrlText Declare the result of UrlReq as a DataClass, so that one doesnt goof up wrt updating and accessing members. Duplicate UrlRaw into UrlText, need to add Text extracting from html next for UrlText	2025-12-04 19:41:39 +05:30
hanishkvc	e600e62e86	SimpleChatTC:SimpleProxy: Cleanup few messages	2025-12-04 19:41:39 +05:30
hanishkvc	3bab4de0e8	SimpleChatTC:SimpleProxy:UrlRaw: Fixup basic oversight wrt 1st go	2025-12-04 19:41:39 +05:30
hanishkvc	73ef9f7d46	SimpleChatTC:SimpleProxy:implement handle_urlraw A basic go at it	2025-12-04 19:41:39 +05:30
hanishkvc	73054a5832	SimpleChatTC:SimpleProxy: Extract and check path, route to handlers	2025-12-04 19:41:39 +05:30
hanishkvc	c99788e290	SimpleChatTC:SimpleProxy: Cleanup for basic run	2025-12-04 19:41:39 +05:30
hanishkvc	80fd065993	SimpleChatTC:SimpleProxy: Start server, Show requested path	2025-12-04 19:41:39 +05:30
hanishkvc	05c0ade8be	SimpleChatTC:SimpleProxy:Process args --port	2025-12-04 19:41:39 +05:30
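
The UrlText commits above (45b05df21b through bf63b8f45a, plus 73a144c44d) describe extending Python's html.parser to capture body text while skipping script, style, header, footer and nav blocks, and then trimming blank lines. A minimal sketch of that idea follows. The class and function names are hypothetical and are not taken from simpleproxy.py, and collapsing runs of blank lines down to a single blank line is an assumption about what "strip out adjacent lines" means.

```python
import re
from html.parser import HTMLParser

# Tags whose contents are dropped, as listed in the commit messages.
SKIP_TAGS = {"script", "style", "header", "footer", "nav"}


class BodyTextExtractor(HTMLParser):
    """Hypothetical helper: collect text inside <body>, outside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0  # > 0 while inside any skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and self.skip_depth == 0:
            self.chunks.append(data)


def trim_blank_lines(text):
    # Step 1: lines holding only whitespace become empty lines.
    text = re.sub(r"^[^\S\n]+$", "", text, flags=re.MULTILINE)
    # Step 2 (assumption): runs of blank lines collapse to one blank line.
    return re.sub(r"\n{3,}", "\n\n", text)


def extract_text(html):
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return trim_blank_lines("".join(parser.chunks))
```

Because HTMLParser tolerates malformed markup, this approach avoids the failure mode noted in 45b05df21b for the xml ElementTree parser. An unclosed skipped tag would, in this sketch, suppress everything after it.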

39 Commits
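
Commit 84403973cd describes hashing the shared bearer token with the current year, with the server regenerating its expected token when the year changes. A rough sketch of that scheme is given below. The hash algorithm (SHA-256), the "year:token" layout, and all names here are assumptions for illustration; the commit message specifies none of them. As the commit itself notes, this is not adequate security while the handshake runs over plain http.

```python
import hashlib
import hmac
import time


def transformed_bearer(shared_token, year):
    # Assumption: SHA-256 over "<year>:<token>"; the commit names neither
    # the hash function nor how the year and token are combined.
    return hashlib.sha256(f"{year}:{shared_token}".encode("utf-8")).hexdigest()


class BearerChecker:
    """Hypothetical server-side check that refreshes on year change."""

    def __init__(self, shared_token):
        self.shared_token = shared_token
        self.year = None
        self.expected = None

    def _refresh(self, year):
        # Regenerate the expected token only when the year has changed.
        if year != self.year:
            self.year = year
            self.expected = transformed_bearer(self.shared_token, year)

    def is_authorized(self, auth_header, year=None):
        # year is injectable for testing; defaults to the current UTC year.
        self._refresh(time.gmtime().tm_year if year is None else year)
        if not auth_header or not auth_header.startswith("Bearer "):
            return False
        got = auth_header[len("Bearer "):]
        # Constant-time comparison on bytes.
        return hmac.compare_digest(got.encode("utf-8"),
                                   self.expected.encode("utf-8"))
```

The race mentioned in the commit's TODO shows up directly here: a token built by the client with one year is rejected if the server evaluates it after the year has rolled over.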