If the tagged message will be of zero length, ensure that the passed
dest char* array still has its terminating null inserted appropriately.
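A minimal sketch of the zero-length case, assuming a hypothetical helper `copy_tagged` (a name made up for illustration, not the function in this code) that copies the tagged message into a caller-supplied buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy `tagged` into `dest` (capacity `destLength` bytes),
// always leaving `dest` null-terminated when there is room for at least one byte.
// Returns the full length of `tagged`, whether or not all of it fit.
inline std::size_t copy_tagged(const std::string &tagged, char *dest, std::size_t destLength) {
    if (dest != nullptr && destLength > 0) {
        std::size_t n = tagged.size() < destLength - 1 ? tagged.size() : destLength - 1;
        std::memcpy(dest, tagged.data(), n);
        dest[n] = '\0'; // also covers the empty-message case (n == 0)
    }
    return tagged.size();
}
```

With an empty tagged message, `dest[0]` ends up as the terminating null, so the caller never reads stale bytes from a reused buffer.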
Check that the user has passed a non-null pNumParts.
Don't hard-code the int32_t size; pick it using sizeof
so that the size of the elements is explicit and fixed, and in turn
stays in sync with the fixed int size specified by the C API, even
with C compilers that have a different idea about int.
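A small sketch of the idea, assuming a hypothetical `copy_part_lengths` helper (not taken from this code) that hands an array of lengths back across the C API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy part lengths into a caller-supplied int32_t array.
// The byte count comes from sizeof(int32_t), not a hard-coded 4 and not sizeof(int),
// so it follows the element type named in the C API signature.
inline std::size_t copy_part_lengths(const std::vector<int32_t> &lens, int32_t *dest, std::size_t destCount) {
    std::size_t n = lens.size() < destCount ? lens.size() : destCount;
    if (dest != nullptr && n > 0) {
        std::memcpy(dest, lens.data(), n * sizeof(int32_t));
    }
    return n; // number of elements actually copied
}
```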
Avoid some unused vars. Need to update the compile flags later to
enable the corresponding warnings.
With this and the past few commits, there is now simple yet sufficient
support to help move multi-level-hierarchy config files into
SimpCfg's simple, physically 1-level but (if required) logically
multi-level hierarchy flow.
Before this series of commits one could still have achieved this,
but it would have needed a bit more effort.
Use the commonality between Indian languages to show the mix-up issue
with the simple-minded trim_dumb logic and how trim_oversmart
could potentially avoid it.
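To make the mix-up concrete, here is a sketch (not the actual trim code in this series; `trim_units` is a made-up name) of one templated trim that works on whatever the string's native char size is:

```cpp
#include <cassert>
#include <string>

// Trim by code unit: each element of `chars` is an independent trim character.
// With std::string that means single bytes; with std::u32string, whole code points.
template <typename TString>
TString trim_units(const TString &s, const TString &chars) {
    typename TString::size_type b = s.find_first_not_of(chars);
    if (b == TString::npos) return TString();
    typename TString::size_type e = s.find_last_not_of(chars);
    return s.substr(b, e - b + 1);
}
```

Take the two Devanagari letters U+0906 U+0905 and ask to trim U+0906. In UTF-8 they are the bytes E0 A4 86 and E0 A4 85, so they share their first two bytes. On a `std::string` the trim set becomes the bytes {E0, A4, 86}, which also eats the E0 A4 of the second letter and leaves the lone byte 0x85, which is not valid UTF-8. On a `std::u32string` the same code compares whole code points and leaves U+0905 intact.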
Given that I am using valid strings to show the pitfalls of logic
driven by a fixed native char size, there is no need to keep the dumb
and oversmart flows separate, so merge them into a common loop.
Update the notes to match the templated flow now and some of the
nitty gritties involved.
Update DumpHexString to be templated.
Split the check-nonenglish flow wrt trim dumb and oversmart testing,
so that things which work with one, but not the other, can be
differentiated in the flow.
The constructor method doesn't convert wstring to string when it
involves non-English chars which will encode to multibyte chars
in UTF-8, even though it does work for the already-UTF-8 u8string.
wcstombs doesn't seem to work for non-English chars when the
locale is set to the default C locale; need to change it to something
like en_US.UTF-8 to allow it to do the conversion properly.
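A rough sketch of that conversion path using only standard library calls; the helper names `wstring_to_string` and `use_utf8_locale` are made up here, and whether the en_US.UTF-8 locale is installed depends on the system:

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Convert a wide string to a multibyte string using the current C locale.
// Returns false if some character cannot be represented in that locale.
inline bool wstring_to_string(const std::wstring &ws, std::string &out) {
    std::vector<char> buf(ws.size() * MB_CUR_MAX + 1);
    std::size_t n = std::wcstombs(buf.data(), ws.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) return false; // unconvertible character
    out.assign(buf.data(), n);
    return true;
}

// Try to switch LC_CTYPE to a UTF-8 locale. The locale name is platform
// dependent and may not be available, so the caller should check the result.
inline bool use_utf8_locale() {
    return std::setlocale(LC_CTYPE, "en_US.UTF-8") != nullptr;
}
```

Plain ASCII converts fine even in the default C locale. For non-English text, call `use_utf8_locale()` first and check that it succeeded before relying on `wstring_to_string`.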
Separate out the checks wrt different string types.
Add a wstring_basic, which verifies that the wstring iterator handles
non-English chars properly, or at least better.
Without using imbue, I couldn't get non-English wstrings to print
on Mac. Need to check on Linux also.
Also avoid the uint8_t typecasting, given that wchar isn't 8-bit.
Also the is-it-a-string warning now logs the line number, group
and key as well, to help the user identify the line better.
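Something along these lines, as a sketch only; `warn_if_unquoted` and the message wording are invented for illustration and are not the actual log format:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical check: returns an empty string when `value` starts with a double
// quote, otherwise a warning message that identifies where the value came from.
inline std::string warn_if_unquoted(const std::string &value, int lineNo,
                                    const std::string &group, const std::string &key) {
    if (!value.empty() && value.front() == '"') return "";
    std::ostringstream ss;
    ss << "WARN:line " << lineNo << ":" << group << ":" << key
       << ": treating [" << value << "] as string, though it does not start with a double quote";
    return ss.str();
}
```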
Misc: pastimes last week: Another life, Anchakkallakokkan, Deadloch
TODO: the string check wrt true/false doesn't seem to be working after
str_tolower was introduced. I seem to be making some silly mistake that
I am not able to make out; moving in and out of sleep, need to check
tomorrow.
string == string-literal failed
string == string-view failed
string.compare(string-literal) failed
Bit strange
test-chat-template-chaton now tries to check if meta-ok is ok wrt
the template-id being looked into.
Log template-id info also, where it was previously missed out.
Warn if something not starting with double quote is being treated
as a string.
Show some examples of invalid floating point values wrt this
logic's floating point determination code.
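For comparison, a sketch of one possible whole-string check built on std::strtod. This is an assumption for illustration, not the determination code used here, and `looks_like_float` is a made-up name; note that strtod also accepts things like leading whitespace, "inf" and hex floats, which a config parser may or may not want:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Treat `s` as a floating point value only if strtod consumes all of it.
inline bool looks_like_float(const std::string &s) {
    if (s.empty()) return false;
    char *end = nullptr;
    std::strtod(s.c_str(), &end);
    return end == s.c_str() + s.size(); // trailing junk means it is not a clean float
}
```

Values such as `1.2.3` or `12abc` are rejected because parsing stops before the end of the string.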
so that one can update the value-item's content, without needing
to explicitly update/store the value-item back into map after the
content has been updated.
This should make these setting operations/helpers more efficient.
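A minimal sketch of the difference, with a hypothetical `ValueItem` and `ItemMap` standing in for the real types:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical value-item and container, for illustration only.
struct ValueItem { std::string text; };
using ItemMap = std::map<std::string, ValueItem>;

// Hand back a reference into the map (creating the entry if it is missing).
inline ValueItem &item_ref(ItemMap &m, const std::string &key) {
    return m[key];
}

inline void set_text(ItemMap &m, const std::string &key, const std::string &text) {
    ValueItem &item = item_ref(m, key); // a reference, not a copy
    item.text = text;                   // updates the stored entry in place
}
```

Had `item` been a copy, the setter would need a second lookup to store it back into the map.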
As C doesn't have the concept of pass by reference, and in turn the
existing C API uses pointers wrt the llama chat message structure,
switch to the same wrt the chat_tmpl_apply logic.
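The shape this leads to is roughly the following; `chat_msg` and `total_content_len` are stand-ins invented for this sketch, not the real structure or function:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for a C-style chat message structure.
struct chat_msg {
    const char *role;
    const char *content;
};

// C-compatible signature: a pointer plus a count, instead of a C++ reference
// to a container. Null pointers have to be checked by hand.
inline std::size_t total_content_len(const chat_msg *msgs, std::size_t numMsgs) {
    if (msgs == nullptr) return 0;
    std::size_t total = 0;
    for (std::size_t i = 0; i < numMsgs; i++) {
        if (msgs[i].content != nullptr) total += std::strlen(msgs[i].content);
    }
    return total;
}
```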
Also fix an oversight in the previous commit and add the remaining logic.
Initial skeletons
Update the existing logic to help with the same. Also the in-between
helper had a bad signature wrt returning status and data; that's also
fixed.
While sending the current chat session along with a new user query
to the model, many models expect a tag to be added at the end
to indicate that the user is expecting the model to respond; this
flag allows for the same.
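A toy sketch of what such a flag does; the `<role>` tag format, `Msg` and `apply_tags` are all invented for illustration and do not match any particular model's template:

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Msg {
    std::string role;
    std::string content;
};

// Tag every message, then optionally append an opening assistant tag so the
// model sees that a response is expected next.
inline std::string apply_tags(const std::vector<Msg> &msgs, bool addAssistantTag) {
    std::string out;
    for (const Msg &m : msgs) {
        out += "<" + m.role + ">" + m.content + "\n";
    }
    if (addAssistantTag) out += "<assistant>";
    return out;
}
```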
Add a C API wrapper for the single message tagging scenario.
In turn, to match the convention followed by the existing chat_apply_template
code, make it return the size expected of the tagged message string
buffer. Update the internal single logic to help with the same.
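The calling convention could look like this sketch; `tag_single_c` and its tag format are hypothetical, only the "return the needed size" behaviour mirrors what is described above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical C-style wrapper: writes as much of the tagged message as fits
// into `dest` (always null-terminated) and returns the full length needed,
// excluding the terminating null. Returns -1 on bad arguments.
inline int32_t tag_single_c(const char *role, const char *content, char *dest, int32_t destLength) {
    if (role == nullptr || content == nullptr) return -1;
    std::string tagged = std::string("<") + role + ">" + content; // made-up tagging
    if (dest != nullptr && destLength > 0) {
        std::snprintf(dest, static_cast<std::size_t>(destLength), "%s", tagged.c_str());
    }
    return static_cast<int32_t>(tagged.size());
}
```

If the returned size is not smaller than the buffer length that was passed, the caller can allocate size + 1 bytes and call again.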
Explicitly check whether the tmpl specified is available in the loaded
json or not, and return an error if it is not found.
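In sketch form, with a plain std::map standing in for the loaded json and a made-up `tmpl_apply_checked`:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the loaded json: template-id -> a prefix string.
using TemplateMap = std::map<std::string, std::string>;

// Returns false (an error) when `tmpl` is not among the loaded templates,
// instead of silently going ahead with missing data.
inline bool tmpl_apply_checked(const TemplateMap &loaded, const std::string &tmpl,
                               const std::string &content, std::string &tagged) {
    TemplateMap::const_iterator it = loaded.find(tmpl);
    if (it == loaded.end()) return false;
    tagged = it->second + content;
    return true;
}
```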
Fix an oversight wrt key name.
Add an alert in case the passed meta json file contains begin (BoS)
wrt the assistant role, similar to the check for end (EoS) wrt the user
role. This is because normally both (i.e. EoS wrt user and BoS wrt
assistant) shouldn't be needed.
Update main wrt begin & prefix and suffix & end addition.
Move helpers to the beginning, so as to avoid adding prototype
declarations/function signatures at the beginning.
Get the char * wrt string data in the C++ string.
Also fix an oversight wrt begin, from when flag-based begin-adding
control was introduced.
NOTE: Currently the conditional adding of the system role suffix/end
is always triggered, whether the 1st system prompt or an additional
system prompt is seen.
Now there is a simple and an extended version of returning tagged
messages.
The extended version returns the tagged string, as well as the
details of the parts that make up that tagged message, in terms of
the types of the parts and the lengths of the parts.
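One way to picture the extended result, as a sketch; `TaggedParts`, `tag_single_ex` and the type letters are assumptions made for this example, not the actual signature:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result of the extended version.
struct TaggedParts {
    std::string tagged;        // the full tagged message
    std::string types;         // one char per part, e.g. 'p'refix, 'c'ontent, 's'uffix
    std::vector<int32_t> lens; // byte length of each part, in the same order
};

inline TaggedParts tag_single_ex(const std::string &prefix, const std::string &content,
                                 const std::string &suffix) {
    TaggedParts r;
    auto add = [&r](char type, const std::string &part) {
        r.tagged += part;
        r.types += type;
        r.lens.push_back(static_cast<int32_t>(part.size()));
    };
    add('p', prefix);
    add('c', content);
    add('s', suffix);
    return r;
}
```

The lengths add up to the size of the tagged string, so a caller can slice it back into its parts, for example to treat the tag parts differently from the user content.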