Add a config entry called bearer.insecure, which holds the
token used for bearer auth of HTTP requests.
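
A minimal sketch of how such a token could be checked against an incoming request. The helper name bearer_ok and the plain mapping used for headers are assumptions for illustration, not the actual code.

```python
import hmac

def bearer_ok(headers, token):
    """Return True if the Authorization header carries the expected bearer token.

    headers is any mapping of header name to value (hypothetical shape);
    token is the value configured as bearer.insecure.
    """
    got = headers.get("Authorization", "")
    expected = f"Bearer {token}"
    # compare_digest avoids leaking where the two values first differ
    return hmac.compare_digest(got.encode("utf-8"), expected.encode("utf-8"))
```
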
Make bearer.insecure and allowed.domains required configs, and
exit the program if they aren't provided through cmdline or config file.
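
The required-config check could be sketched roughly as below. The dict holding the merged settings and the helper names are hypothetical; only the two key names come from the text above.

```python
import sys

# Keys that must be set by the time startup finishes
REQUIRED = ("bearer.insecure", "allowed.domains")

def missing_required(cfg):
    """Return the required keys that are absent or empty in cfg."""
    return [key for key in REQUIRED if not cfg.get(key)]

def ensure_required(cfg):
    """Exit the program if any required key was not provided."""
    missing = missing_required(cfg)
    if missing:
        sys.exit(f"ERRR: missing required config: {', '.join(missing)}")
```
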
Ensure load_config gets called on encountering --config in cmdline,
so that the user has control over whether cmdline or config file
will decide the final value of any given parameter.
Ensure that str type values in cmdline are picked up directly, without
running them through ast.literal_eval, because otherwise one would have
to ensure through the cmdline arg mechanism that the string quotes are
retained for literal_eval.
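
To illustrate why str values bypass literal_eval: ast.literal_eval("abc") raises ValueError, since a bare word is not a Python literal, so the user would have to pass "'abc'" with the quotes surviving the shell. A sketch, with parse_value being a hypothetical helper name:

```python
import ast

def parse_value(raw, expected_type):
    """Convert a raw cmdline string into expected_type."""
    if expected_type is str:
        # Taken as is, so no extra quoting is needed on the command line
        return raw
    value = ast.literal_eval(raw)
    if not isinstance(value, expected_type):
        raise TypeError(f"expected {expected_type.__name__}, got {type(value).__name__}")
    return value
```
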
Have the """ function note/description immediately below the def line
so that it is interpreted as a function description (docstring).
Now both cmdline and config processing follow a similar mechanism and
do the following
* exit on finding any issue, so that things are in a known
state from usage perspective, without any confusion/oversight
* check if the cmdlineArgCmd/configCmd being processed is a known
one or not.
* check that the value of the cmd is of the expected type
* have a generic flow which can accommodate more cmds in future
in a simple way
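
The flow listed above can be sketched as a table of known cmds and their expected types, walked by one generic loop. The table contents (including --port) and the function name are assumptions for illustration; adding a cmd then only needs a new table entry.

```python
import ast
import sys

# Hypothetical table of known cmds and the type each value must have
KNOWN = {
    "--port": int,
    "--debug": bool,
    "--bearer.insecure": str,
    "--allowed.domains": list,
}

def process_args(args, cfg):
    """Consume (name, value) pairs from args into cfg, exiting on any issue."""
    i = 0
    while i < len(args):
        name = args[i]
        if name not in KNOWN:
            sys.exit(f"ERRR: unknown arg {name}")
        if i + 1 >= len(args):
            sys.exit(f"ERRR: missing value for {name}")
        raw = args[i + 1]
        expected = KNOWN[name]
        try:
            # str values are taken directly, others go through literal_eval
            value = raw if expected is str else ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            sys.exit(f"ERRR: cannot parse value for {name}: {raw}")
        if type(value) is not expected:
            sys.exit(f"ERRR: {name} expects {expected.__name__}")
        cfg[name[2:]] = value  # store without the leading "--"
        i += 2
    return cfg
```
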
Bing raised a challenge for Chrome-triggered search requests after
a few requests, which were spread a few minutes apart, while still
seemingly allowing wget-based search to continue (again spread
a few minutes apart).
Added a simple helper to trace this; use --debug True to enable
it.
Tag the messages wrt ValidateUrl and UrlReq.
Also dump the req.
Move check for --allowed.domains to ValidateUrl
NOTE: Also, with the user agent et al. now mimicked from the received
request in the generated request, yahoo search/news is returning
results, instead of the bland error before.
With allowed domains set to a few sites in general, to show its use.
This includes some sites which allow search to be carried out
through them as well as provide news aggregation.
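
A rough sketch of checking a url against allowed.domains. The matching rule used here (exact host or a subdomain of a listed domain) and the function name are assumptions; the actual check in ValidateUrl may match differently.

```python
from urllib.parse import urlparse

def domain_allowed(url, allowed):
    """Return True if the url's host is a listed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed)
```
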
Had confused js and python wrt accessing dictionary contents
and the consequence for a non-existent key. Fixed it.
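
For reference, the difference in question: in js, obj["missing"] quietly gives undefined, while in python d["missing"] raises KeyError and d.get("missing") returns None. A small sketch with a hypothetical helper name:

```python
def lookup(d, key, default=None):
    """js-like lookup: return default instead of raising on a missing key."""
    return d.get(key, default)

headers = {"User-Agent": "demo"}
try:
    headers["Accept"]  # subscript access raises for a missing key
    raised = False
except KeyError:
    raised = True
```
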
Use different error ids to distinguish between failure in common
urlreq and the specific urltext and urlraw helpers.
Add a new common send-headers helper and use it in the
overridden send_error as well as do_OPTIONS.
This ensures that if there is any error during proxy operations,
the send_error propagates to the fetch from any browser properly,
without the browser intercepting it with a CORS error.
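
A minimal sketch of the idea on top of http.server, assuming a permissive Access-Control-Allow-Origin of "*". The class name, helper name and exact header set are illustrative, not the actual implementation.

```python
from http.server import BaseHTTPRequestHandler

class ProxyHandler(BaseHTTPRequestHandler):

    def send_headers_common(self):
        """Emit the CORS headers shared by all responses, then end the headers."""
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Headers", "*")
        self.end_headers()

    def send_error(self, code, message=None, explain=None):
        """Send the error along with CORS headers, so the browser's fetch sees it."""
        body = (message or "").encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.send_headers_common()
        self.wfile.write(body)

    def do_OPTIONS(self):
        """Answer CORS preflight requests."""
        self.send_response(200)
        self.send_headers_common()
```

Without the CORS headers on the error response, the browser reports a generic CORS failure to the page script and the real status and message are lost.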
First identify lines which have only whitespace and replace them
with lines with only newline char in them.
Next strip out adjacent lines, if they have only newlines
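
The two steps above could be sketched with two regex passes. This assumes a run of newline-only lines is collapsed down to a single empty line; the function name is hypothetical.

```python
import re

def squeeze_blank_lines(text):
    """Blank out whitespace-only lines, then collapse runs of empty lines."""
    # Step 1: a line holding only whitespace becomes a bare newline
    text = re.sub(r"^[^\S\n]+$", "", text, flags=re.MULTILINE)
    # Step 2: three or more newlines in a row shrink to one empty line
    return re.sub(r"\n{3,}", "\n\n", text)
```
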
As html can be malformed, the xml ElementTree XMLParser can't handle
it properly, so switch to the HTMLParser helper class that is
provided by python and try to extend it.
Currently a minimal skeleton to just start it out, which captures
only the body contents.
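
Such a skeleton might look like the sketch below, built on html.parser.HTMLParser from the standard library. The class and attribute names are assumptions; it only collects the text found between the body tags.

```python
from html.parser import HTMLParser

class BodyText(HTMLParser):
    """Collect text data seen inside <body>; tolerant of malformed html."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        if self.in_body:
            self.parts.append(data)

    def text(self):
        return "".join(self.parts)
```
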
Declare the result of UrlReq as a dataclass, so that one doesn't
goof up wrt updating and accessing members.
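
A sketch of what such a result type could look like. The class and field names here are hypothetical. Compared to a plain dict, a mistyped member name on access fails loudly with AttributeError, and the constructor rejects unknown fields.

```python
from dataclasses import dataclass

@dataclass
class UrlReqResp:
    """Result of a url request (field names are illustrative)."""
    ok: bool
    status: int
    msg: str
    data: str = ""

resp = UrlReqResp(ok=True, status=200, msg="OK", data="<html></html>")
```
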
Duplicate UrlRaw into UrlText; text extraction from html still
needs to be added to UrlText next.