Make the description a bit more explicit about supporting local
file paths as part of the url scheme, as the AI model under test
was complaining about the file url scheme not being supported.
Need to check whether this new description makes things better.
Convert the text to bytes for writing to the http pipe.
Ensure CORS is kept happy by passing Access-Control-Allow-Origin
in the header.
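A minimal sketch of the two notes above, assuming a helper that prepares the response (the helper name and exact header set here are hypothetical, not the actual simpleproxy.py code):

```python
def build_response(text):
    """Encode response text to bytes and attach the CORS header.

    Hypothetical helper: str must become bytes before it can be
    written to the http pipe (wfile.write), and the CORS header
    keeps browser fetch() calls happy.
    """
    body = text.encode("utf-8")
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(body)),
        # allow any origin, so the browser doesn't block the response
        "Access-Control-Allow-Origin": "*",
    }
    return headers, body
```

The handler would then send_header() each entry and wfile.write() the body.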
Instead of using the shared bearer token as-is, hash it with the
current year and use the hash.
Keep the /aum path out of the auth check.
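The year-salted token and the /aum exemption might look like this (sha256 and the function names are assumptions; the actual transform isn't shown in these notes):

```python
import hashlib
import time

def transformed_token(shared_token, year=None):
    """Hash the shared bearer token with the given/current year,
    so the token on the wire changes every year (a sketch; the
    actual hash used is an assumption)."""
    if year is None:
        year = time.gmtime().tm_year
    data = "{}:{}".format(shared_token, year).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

def needs_auth(path):
    """/aum is the handshake path, kept out of the auth check."""
    return not path.startswith("/aum")
```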
In future the bearer token could be transformed more often, and
additionally combined with a nonce/dynamic token got from the
server during the initial /aum handshake, as well as a running
counter, and so on ...
NOTE: None of this circus is good enough, given that currently
the simpleproxy.py handshakes work over plain http. However,
these skeletons are put in place for the future, if needed.
TODO: There is a once-in-a-blue-moon race when the year
transitions between the client generating the request and the
server handling it. Otherwise year transitions don't matter,
because the client always creates a fresh token, and the server
checks for a year change to regenerate its token if required.
Add a config entry called bearer.insecure which will contain the
token used for bearer auth of http requests.
Make bearer.insecure and allowed.domains required configs, and
exit the program if they are not got through the cmdline or the
config file.
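A sketch of the required-configs check, assuming configs end up in a dict keyed by their names (names are hypothetical):

```python
import sys

# the configs that must be supplied via cmdline or config file
REQUIRED_CONFIGS = ["bearer.insecure", "allowed.domains"]

def check_required_configs(config):
    """Exit the program if any required config is missing."""
    for key in REQUIRED_CONFIGS:
        if key not in config:
            print("ERRR: missing required config [{}]".format(key),
                  file=sys.stderr)
            sys.exit(1)
```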
Ensure load_config gets called on encountering --config in the
cmdline, so that the user has control over whether the cmdline or
the config file decides the final value of any given parameter.
Ensure that str type values in the cmdline are picked up
directly, without running them through ast.literal_eval, because
otherwise one would have to ensure, through the cmdline arg
mechanism, that string quotes are retained for literal_eval.
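Both points could be sketched like this: --config triggers load_config at its position in the arg list, and str-typed values skip ast.literal_eval (function and table names are assumptions):

```python
import ast

def parse_value(known_types, key, raw):
    """str configs are taken as-is; others go via ast.literal_eval,
    so the cmdline needs no extra quoting for plain strings."""
    if known_types[key] is str:
        return raw
    return ast.literal_eval(raw)

def process_cmdline(args, config, known_types, load_config=None):
    """Walk --key value pairs in order; --config loads that file at
    this point, so whichever comes later wins for any parameter."""
    i = 0
    while i < len(args):
        key = args[i][2:]          # drop the leading --
        val = args[i + 1]
        if key == "config":
            if load_config is not None:
                load_config(val, config)
        else:
            config[key] = parse_value(known_types, key, val)
        i += 2
```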
Have the """ function note/description below def line immidiately
so that it is interpreted as a function description.
Now both follow a similar mechanism and do the following:
* exit on finding any issue, so that things are in a known
  state from a usage perspective, without any confusion/oversight
* check whether the cmdlineArgCmd/configCmd being processed is a
  known one or not
* check that the value of the cmd is of the expected type
* have a generic flow which can accommodate more cmds in future
  in a simple way
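A sketch of that generic flow, assuming a table of known cmds and their expected types (the table and names here are hypothetical):

```python
import sys

# adding a new cmd in future only needs a new entry here
KNOWN_CMDS = {
    "bearer.insecure": str,
    "allowed.domains": list,
    "debug": bool,
}

def validate_cmd(source, cmd, value):
    """Exit on an unknown cmd or a wrong value type, so things stay
    in a known state; 'source' says cmdline vs config file."""
    if cmd not in KNOWN_CMDS:
        print("ERRR: {}: unknown cmd [{}]".format(source, cmd),
              file=sys.stderr)
        sys.exit(1)
    if not isinstance(value, KNOWN_CMDS[cmd]):
        print("ERRR: {}: [{}] expects {}".format(
            source, cmd, KNOWN_CMDS[cmd].__name__), file=sys.stderr)
        sys.exit(1)
```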
Bing raised a challenge for Chrome-triggered search requests
after a few requests, which were spread a few minutes apart,
while still seemingly allowing wget-based searches to continue
(again spread a few minutes apart).
Added a simple helper to trace this; use --debug True to enable
it.
Tag messages wrt ValidateUrl and UrlReq.
Also dump the request.
Move the check for --allowed.domains into ValidateUrl.
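The ValidateUrl side of the --allowed.domains check might be sketched as follows (the function shape is an assumption):

```python
from urllib.parse import urlparse

def validate_url(url, allowed_domains):
    """Allow a url only if its host is one of the allowed domains
    or a subdomain of one of them."""
    host = urlparse(url).hostname or ""
    return any(host == dom or host.endswith("." + dom)
               for dom in allowed_domains)
```

Matching on the parsed hostname (rather than a substring of the url) avoids being fooled by an allowed domain appearing in the query string.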
NOTE: Also, with mimicking of the user agent et al from the got
request onto the generated request, yahoo search/news is
returning results now, instead of the bland error before.
Set allowed domains to a few sites in general to show its use;
this includes some sites which allow searches to be carried out
through them as well as providing news aggregation.
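The user-agent mimicking from the NOTE above could look like this (the exact list of headers copied is an assumption):

```python
def mimic_headers(got_headers, out_headers):
    """Copy browser-identifying headers from the got request onto
    the generated request; sites like yahoo then serve real
    results instead of a bland error."""
    for name in ("User-Agent", "Accept", "Accept-Language"):
        if name in got_headers:
            out_headers[name] = got_headers[name]
    return out_headers
```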
Had confused js and python behaviour wrt accessing dictionary
contents and the consequence on a non-existent key (js gives
undefined, python raises KeyError). Fixed it.
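The gotcha in short: js property access on a missing key yields undefined, while python dict indexing raises, so a fallback needs dict.get:

```python
d = {"status": 200}

# python raises for a missing key, unlike js's silent undefined
try:
    _ = d["missing"]
except KeyError:
    pass

# the python way to read with a default for absent keys
value = d.get("missing", None)
```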
Use different error ids to distinguish between a failure in the
common UrlReq and the specific UrlText and UrlRaw helpers.
Add a new common send-headers helper and use it in both the
overridden send_error and do_OPTIONS.
This ensures that if there is any error during proxy operations,
the send_error propagates properly to the fetch from any browser,
without the browser intercepting it with a CORS error.
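A sketch of that handler shape, assuming http.server is in use (method names other than the standard overrides are hypothetical):

```python
from http.server import BaseHTTPRequestHandler

class ProxyHandler(BaseHTTPRequestHandler):

    def send_common_headers(self):
        # sent on every response, including errors, so the browser
        # surfaces the real status instead of a CORS failure
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Headers",
                         "Authorization, Content-Type")

    def send_error(self, code, message=None, explain=None):
        # override so errors also carry the CORS headers
        self.send_response(code, message)
        self.send_common_headers()
        self.end_headers()

    def do_OPTIONS(self):
        # CORS preflight: no body needed, just the headers
        self.send_response(204)
        self.send_common_headers()
        self.end_headers()
```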
First identify lines which contain only whitespace and replace
them with lines containing only a newline char.
Next strip out adjacent lines, if they contain only newlines.
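Those two steps in a sketch (the function name is an assumption):

```python
def squeeze_blank_lines(text):
    """Step 1: whitespace-only lines become empty (newline-only)
    lines. Step 2: runs of adjacent blank lines collapse to one."""
    out = []
    for line in text.split("\n"):
        if line.strip() == "":
            line = ""                      # step 1
        if line == "" and out and out[-1] == "":
            continue                       # step 2
        out.append(line)
    return "\n".join(out)
```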
As html can be malformed, xml ElementTree's XMLParser can't
handle it properly, so switch to the HTMLParser helper class
provided by python and try extending it.
Currently a minimal skeleton just to start it out, which captures
only the body contents.
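That minimal skeleton, along the lines described (class and member names are assumptions):

```python
from html.parser import HTMLParser

class BodyTextParser(HTMLParser):
    """Capture only the text found inside <body>; HTMLParser keeps
    going even when the html is malformed (e.g. unclosed tags)."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        if self.in_body:
            self.parts.append(data)
```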
Declare the result of UrlReq as a dataclass, so that one doesn't
goof up wrt updating and accessing its members.
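For example (field names here are assumptions; a mistyped constructor argument now fails loudly instead of silently creating a new attribute):

```python
from dataclasses import dataclass

@dataclass
class UrlReqResult:
    """Sketch of the UrlReq result as a dataclass."""
    status: int = 0
    content_type: str = ""
    data: bytes = b""
```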
Duplicate UrlRaw into UrlText; need to add text extraction from
html next for UrlText.