Commit Graph

379 Commits

Author SHA1 Message Date
HanishKVC 0342124946 GroupKV: Add to_str wrt vectors, help avoid compiler confusion 2024-05-11 12:27:42 +05:30
HanishKVC 7d7c59ec50 GroupKV:Simplify:P2: Rename tags, Make debug logs conditional
Rename all the log messages to have GKV and not SC.

The log messages in get_vector made conditional to GKV_DEBUG, this
was missed out earlier in simpcfg itself.
2024-05-11 11:57:27 +05:30
HanishKVC d764a9d395 GroupKV: Simplify code to the minimal needed for GroupKV - P1 2024-05-11 11:37:06 +05:30
HanishKVC 86b842b172 GroupKV: Duplicate SimpCfg to chop down into GroupKV
IE a minimal MapOfMapOfVariant, with some basic helpers.

This can be the basis of a ChatTemplates object, as well as
SimpCfg built on top of it.
2024-05-11 10:57:32 +05:30
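A minimal sketch of what a "MapOfMapOfVariant with some basic helpers" could look like. The type aliases, the `get_value` helper, and the sample group and keys are hypothetical names chosen for illustration; they are not taken from the GroupKV source.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <variant>

// Hypothetical aliases: group name -> key -> variant value.
using Value    = std::variant<std::string, bool, int64_t, double>;
using KeyMap   = std::map<std::string, Value>;
using GroupMap = std::map<std::string, KeyMap>;

// Return the value stored under group/key when it exists and holds a T,
// otherwise return the caller-supplied default.
template <typename T>
T get_value(const GroupMap &gm, const std::string &group,
            const std::string &key, const T &defaultValue) {
    auto g = gm.find(group);
    if (g == gm.end()) return defaultValue;
    auto k = g->second.find(key);
    if (k == g->second.end()) return defaultValue;
    if (const T *p = std::get_if<T>(&k->second)) return *p;
    return defaultValue;
}

// Build a small in-code dataset through nested initializer lists.
// String values are wrapped in std::string explicitly, because a bare
// string literal may be converted to bool by older variant implementations.
inline GroupMap make_sample() {
    return {
        {"demo", {{"greeting", std::string("hello")},
                  {"enabled", true},
                  {"count", int64_t{3}}}},
    };
}
```

With this shape, a lookup such as `get_value<bool>(gm, "demo", "enabled", false)` falls back to the default both when the key is missing and when the stored type differs.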
HanishKVC c0506f94bf SimpCfg: Allow for direct initialization lists based init
This should pave way for having a default chat templates dataset
in the code, without needing to load it from a config file, if
one doesn't want to.

TODO: allow for loading config from json into simpcfg, so that
a program which uses llama.cpp can decide, whether it is ok with
what is already there in the internal dataset, or allow for loading
template info at runtime using the simpcfg's simple text file or
additionally include the json code to load template info at runtime
from json file.
2024-05-11 00:33:31 +05:30
HanishKVC fe27902964 SimpCfg: Avoid iostream/cout and format for direct library use
It appears like std::format is not supported in older g++/lib still
in wide use like current debian stable, so avoiding same wrt direct
library use.

Allow for empty VAARGS

NOTE: However test program mode of the same uses cout and format
2024-05-10 22:27:07 +05:30
HanishKVC 1f9a0eb8ce ChatON: Remove unneeded iostream 2024-05-10 21:10:44 +05:30
Justine Tunney 4e3880978f
Fix memory bug in grammar parser (#7194)
The llama.cpp grammar parser had a bug where forgetting to add a closing
quotation mark to strings would cause parsing to crash. Anyone running a
server on a public endpoint is advised to upgrade. To reproduce this bug

    ./llamafile -m foo.gguf -p bar --grammar 'root::="'

Credit for discovering and reporting this issue goes to Eclypsium
Security Researcher Richard Johnson <Richard.johnson@eclypsium.com>.
2024-05-10 21:01:08 +10:00
HanishKVC f89fe2732c
Main+: optionally allow special tokens from user in interactive mode (#7097)
@hanishkvc added a new `--interactive-specials` flag which would allow for inserting special tokens from user side into the embedding stream.
2024-05-10 20:21:58 +10:00
HanishKVC abb406b888 Merge branch 'master' into hkvc_chaton_v3
Have merged master branch as of 20240510IST12XY with chaton_v3
branch.

As part of same had to update the flow in examples/main/main.cpp
wrt conversion related commit in master branch and my chaton related
commits in this branch.
2024-05-10 13:14:26 +05:30
Johannes Gäßler c12452c7ae
JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143) 2024-05-08 21:53:08 +02:00
Dawid Potocki 83330d8cd6
main : add --conversation / -cnv flag (#7108) 2024-05-08 17:32:32 +03:00
HanishKVC 8fe8231313 ChatON:SubPartsAwareTokenizePath: Allow extract subparts testing 2024-05-08 19:51:57 +05:30
HanishKVC a49697b488 ChatON: Keep compiler happy simply 2024-05-08 19:22:46 +05:30
HanishKVC 868ab608f0 ChatON: Add forceParseSpecial flag to subparts aware tokenizing 2024-05-08 18:42:22 +05:30
HanishKVC b6da7d9c9d ChatON: tokenize keeping in mind the taggedMessage subparts
Initial go
2024-05-08 18:38:07 +05:30
HanishKVC 8dfa31bb91 ChatON: Make c-api wrappers a bit robust incl some cross checks
If the tagged message will be of 0 length, ensure that the passed
dest char* array, has null inserted appropriately.

Check that user has passed a non-null pNumParts.

Don't hard-code int32_t size, pick using sizeof
2024-05-08 17:05:45 +05:30
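The three cross checks listed in this commit can be sketched roughly as below. This is an illustration under assumptions, not the actual ChatON wrapper: the function name, parameter order, and return convention are hypothetical; only the `pNumParts` name comes from the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical C-style wrapper tail: copies a tagged message and its
// per-part lengths into caller-provided buffers.
// Returns the full message length, or -1 when pNumParts is null.
inline int32_t tagged_to_capi_sketch(const std::string &tagged,
                                     const std::vector<int32_t> &partsLengths,
                                     char *dest, int32_t destLength,
                                     int32_t *pPartsLengths, int32_t *pNumParts) {
    // Cross check: the caller must pass a non-null pNumParts.
    if (pNumParts == nullptr) return -1;
    // Always null-terminate dest, including the zero-length message case.
    if (dest != nullptr && destLength > 0) {
        const std::size_t n = std::min(tagged.size(),
                                       static_cast<std::size_t>(destLength - 1));
        std::memcpy(dest, tagged.data(), n);
        dest[n] = '\0';
    }
    // On entry *pNumParts is treated as the capacity of pPartsLengths.
    const std::size_t capacity =
        *pNumParts > 0 ? static_cast<std::size_t>(*pNumParts) : 0;
    const std::size_t count = std::min(capacity, partsLengths.size());
    if (pPartsLengths != nullptr && count > 0) {
        // Element size comes from sizeof rather than a hard-coded 4.
        std::memcpy(pPartsLengths, partsLengths.data(), count * sizeof(int32_t));
    }
    *pNumParts = static_cast<int32_t>(partsLengths.size());
    return static_cast<int32_t>(tagged.size());
}
```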
Johannes Gäßler af0a5b6163
server: fix incorrectly reported token probabilities (#7125)
* server: normalize token probabilities

* fix temperature == 0.0f
2024-05-07 23:07:58 +02:00
HanishKVC 76791bad63 ChatON:Fix partsLengths to int32_t type, instead of int
so that the size of the elements is explicit and fixed, and in turn
stays in sync with the fixed int size specified in the c-api, even
with c compilers that have a different idea about int.

avoid some unused vars, need to update compile flags later to
enable corresponding warnings.
2024-05-07 12:40:49 +05:30
HanishKVC b3a56545d6 ChatON:Reposition alertAssistantAtEnd flag for consistency 2024-05-07 11:49:43 +05:30
HanishKVC 0852f3b7ec ChatON:ExCApi: Rename for consistency 2024-05-07 11:46:40 +05:30
HanishKVC 43a3a91b03 ChatON: Cleanup/Refine initial go at tmpl_apply_ex_capi 2024-05-07 11:44:25 +05:30
HanishKVC 7c288d3dfc ChatON: Rename to partstypes for consistency 2024-05-07 11:32:20 +05:30
HanishKVC 04b4a15177 ChatON: Initial go at chat-template-apply c-api with parts info 2024-05-07 11:08:47 +05:30
HanishKVC f6a86cd209 ChatON: Update the Note a bit 2024-05-07 10:29:16 +05:30
HanishKVC 989c6c4125 SimpCfg: Cleanup the Note a bit to avoid some ambiguities 2024-05-06 11:27:56 +05:30
HanishKVC 344c068d7b SimpCfg:MultiPart keys wrt get_vector
With this and past few commits, now there is simple yet sufficient
support to help move multi-level-hierarchy config files into the
SimpCfg's simple physically 1-level, but if reqd logically multi
level hierarchy flow.

B4 this series of commits also one could have still achieved this,
but there would have been bit more effort needed.
2024-05-06 11:27:56 +05:30
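One way to read "physically 1-level, logically multi-level" is that the key parts are joined into a single flat key before lookup. The sketch below assumes a "." separator and a hypothetical helper name; the separator SimpCfg actually uses is not stated in these commit messages.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Join logical key parts such as {"chat", "user", "prefix"} into one
// flat key ("chat.user.prefix") usable with a single-level map.
inline std::string join_keyparts(const std::vector<std::string> &keyParts,
                                 const std::string &sep = ".") {
    std::string key;
    for (std::size_t i = 0; i < keyParts.size(); ++i) {
        if (i != 0) key += sep;  // separator only between parts
        key += keyParts[i];
    }
    return key;
}
```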
HanishKVC 19d3c88e8a SimpCfg:MultiPart keys wrt get_value etal 2024-05-06 11:27:56 +05:30
HanishKVC 623d0b60da SimpCfg: General MultiPart support, KeyParts not Key wrt SetValue 2024-05-06 11:27:56 +05:30
HanishKVC c6ecd9316e SimpCfg: Use to_str instead of using stringstream directly 2024-05-06 11:27:56 +05:30
HanishKVC 2b14bcaddb SimpCfg:ChatON: add by Humans for All note 2024-05-06 11:27:56 +05:30
HanishKVC 20e5b383c5 SimpCfg:Trim DumpHexString only if SC_DEBUG_VERBOSE 2024-05-06 11:27:56 +05:30
HanishKVC f53c19baac SimpCfg: Update the notes wrt tolower and add test code 2024-05-06 11:27:56 +05:30
HanishKVC 3287fdba28 SimpCfg:Fix/cleanup trim related test samples and flow
Use the commonality between Indian languages to show mixup issue
with the simple minded trim_dump logic and how trim_oversmart
could potentially avoid that.

Given that I am using valid strings to show the pitfalls of fixed
native char size driven logic, so no need to keep the dump and
oversmart flows separate, so merge into a common loop.
2024-05-06 11:27:56 +05:30
HanishKVC 33619a3b92 SimpCfg: Templatize str_lower 2024-05-06 11:27:56 +05:30
HanishKVC 32ba195a83 SimpCfg: Templatize str_trim_single
Also use NativeCharSize and MultiNativeCharSize wording to make
the note more generic
2024-05-06 11:27:56 +05:30
HanishKVC 5b8bf849c0 SimpCfg: Fixed & ~Variable Length to Native & MultiNativeCharSize
So as to make the notes, more generic.
2024-05-06 11:27:56 +05:30
HanishKVC d030a26f3c SimpCfg:Update TrimOverSmart use templated TrimDumb after wstrconv 2024-05-06 11:27:56 +05:30
HanishKVC 97ac443bba SimpCfg:Cleanup, updated notes, templated code
Update the notes to match the templated flow now and some of the
nitty gritties involved.

Update DumpHexString to be templated.

Split check nonenglish flow wrt trim dumb and oversmart testing,
so that things which work with one, but not the other can be
differentiated in the flow.
2024-05-06 11:27:56 +05:30
HanishKVC bf111a83f1 SimpCfg:TemplatedDumbTrim; Test dumb and oversmart trim logics 2024-05-06 11:27:56 +05:30
HanishKVC 554b00f027 SimpCfg: Add some missing const refs 2024-05-06 11:27:56 +05:30
HanishKVC cae0fff715 SimpCfg: Update notes; Try add a better trimming logic 2024-05-06 11:27:56 +05:30
HanishKVC d1156cc055 SimpCfg: As locale manipulation reqd for better processing 2024-05-06 11:27:56 +05:30
HanishKVC 2325764180 SimpCfg:CheckStrings: Switch Mbs2Wcs to multithread safe calls 2024-05-06 11:27:56 +05:30
HanishKVC 23acf07bb2 SimpCfg:CheckStrings: Cleanup wstring flow to needed parts 2024-05-06 11:27:56 +05:30
HanishKVC 2cda78f1ad SimpCfg:CheckStrings: WString2String finally
The constructor method doesn't convert wstring to string, when it
involves non-english chars which will encode to multibyte chars
in utf8, even though it does work for the already utf8 u8string.

wcstombs doesn't seem to work for non english chars, when the
locale is set to the default c, need to change to something like
en_US.UTF-8, to allow it to do the conversion properly.
2024-05-06 11:27:56 +05:30
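A minimal sketch of a wcstombs-based conversion that reports failure instead of returning a damaged string. The helper name is hypothetical. It relies on the common POSIX extension where a null destination makes `std::wcstombs` return the required byte count, and, as the commit notes, non-english characters are only expected to convert after the program has selected a UTF-8 locale (for example via `std::setlocale(LC_ALL, "en_US.UTF-8")`, if that locale is installed).

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Convert a wide string to a multibyte string using the current C locale.
// Returns false when some character cannot be represented in that locale.
inline bool wstring_to_string_sketch(const std::wstring &ws, std::string &out) {
    // First pass: ask for the number of bytes needed (POSIX extension).
    const std::size_t need = std::wcstombs(nullptr, ws.c_str(), 0);
    if (need == static_cast<std::size_t>(-1)) return false;
    out.resize(need);
    // Second pass: write exactly `need` bytes into the sized buffer.
    if (need > 0) std::wcstombs(&out[0], ws.c_str(), need);
    return true;
}
```

This version still uses the non-reentrant `wcstombs`; the later commit about multithread safe calls suggests a restartable variant such as `wcsrtombs` would be the closer match for library use.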
HanishKVC 7607dbc8c7 SimpCfg:CheckStrings: Try fixup wstring handling 2024-05-06 11:27:56 +05:30
HanishKVC 1a618a42f8 SimpCfg: Update the func notes with alert 2024-05-06 11:27:56 +05:30
HanishKVC 66d6fa62b7 SimpCfg: C++ and strings is a mess even after decades
Separate out the checks wrt different string types.

Add a wstring_basic, which verifies that wstring iterator handles
non english chars properly or at least better.
2024-05-06 11:27:56 +05:30
HanishKVC 3ad5cec47e SimpCfg:CheckStrings:MacOS, wstring and wcout
Without using imbue, I couldn't get non-english wstrings to print
on mac. Need to check on linux also.

Also avoid the uint8_t typecasting, given that wchar isn't 8bit
2024-05-06 11:27:56 +05:30
HanishKVC a448fec486 SimpCfg:CheckString: organise and probe - p3 wstring
wcouts' involving 2nd wstring with non english char in it not
showing up?
2024-05-06 11:27:56 +05:30
HanishKVC 691d0d43b5 SimpCfg:CheckStrings: Organise and Probe - P2 - std::u8string 2024-05-06 11:27:56 +05:30
HanishKVC 713520caac SimpCfg:CheckStrings: Organise and Probe - p1 std::string 2024-05-06 11:27:56 +05:30
HanishKVC 56f19c7a68 SimpCfg: Test c++ string handling 2024-05-06 11:27:56 +05:30
HanishKVC 86e776c857 SimpCfg: Rename to get_vector, add some test code 2024-05-06 11:27:56 +05:30
HanishKVC 1e1f54ec1d SimpCfg:GetArray flesh out, helpers to convert to string
Good: Forgotten love, All the light we cant see, laapataa ladies
2024-05-06 11:27:56 +05:30
HanishKVC 561f50930e SimpCfg: initial go at adding support for spreadout arrays
By having a separate entry for each element of the array, the
existing logic itself can be repurposed with minimal additions.
2024-05-06 11:27:56 +05:30
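The "separate entry per element" idea can be sketched as follows. The `<key>-<index>` naming and the helper name are assumptions made for illustration; the commit does not spell out how SimpCfg names the per-element entries.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using FlatMap = std::map<std::string, std::string>;

// Collect elements stored as "<key>-0", "<key>-1", ... and stop at the
// first missing index, so the ordinary key-value store doubles as an array.
inline std::vector<std::string> get_vector_sketch(const FlatMap &m,
                                                  const std::string &key) {
    std::vector<std::string> out;
    for (std::size_t i = 0;; ++i) {
        auto it = m.find(key + "-" + std::to_string(i));
        if (it == m.end()) break;
        out.push_back(it->second);
    }
    return out;
}
```

Because the lookup stops at the first gap, an entry such as `names-3` is ignored when `names-2` is absent.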
HanishKVC 8fdc80533f SimpCfg:Cleanup:Cmdline Arg, GetValueCallerLogging, StringCmp
Bring the direct string comparison also back, as the issue with
cmp was more to do with string.transform.
2024-05-06 11:27:56 +05:30
HanishKVC 08b9711d68 SimpCfg:Remove dbug logs wrt str_tolower and set_bool
Also the warning wrt is it string, now also logs the line number,
group and key, to help user identify the line better.

Misc: pass time last week Another life, Anchakkallakokkan, Deadloch
2024-05-06 11:27:56 +05:30
HanishKVC 0e0d7da18f SimpCfg:Found issue with str_tolower, transform doesn't resize dst 2024-05-06 11:27:56 +05:30
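The pitfall named in this commit is that `std::transform` only writes through its output iterator and never grows the destination. A minimal sketch of a lowercase helper that avoids it is shown below; the function name is hypothetical and the code handles single-byte characters only.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Lowercase a byte string into a new string.
// Writing to dst.begin() of an empty string would be undefined behaviour,
// so the result is appended through std::back_inserter instead.
inline std::string str_tolower_sketch(const std::string &src) {
    std::string dst;
    dst.reserve(src.size());
    std::transform(src.begin(), src.end(), std::back_inserter(dst),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return dst;
}
```

An equivalent fix is to call `dst.resize(src.size())` before transforming into `dst.begin()`. Either way the destination ends up with the right length, which is what the later string comparisons against "true" and "false" depend on.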
HanishKVC eb56517f20 SimpCfg:Bools:Make lowercase b4 checking true/false for bool path
TODO: string check wrt true/false, doesn't seem to be working after
str_tolower was introduced. I seem to be doing some silly mistake
not able to make out, moving in and out of sleep, need to check
tomorrow.

string == string-literal failed
string == string-view failed
string.compare(string-literal) failed

Bit strange
2024-05-06 11:27:56 +05:30
HanishKVC ef5a2cf391 SimpCfg:Dbug why bool is not setting properly 2024-05-06 11:27:56 +05:30
HanishKVC 5aa1072aac SimpCfg: Move dump into its own func, Avoid KV iter wrt Get 2024-05-06 11:27:56 +05:30
HanishKVC 6b475e444f SimpCfg: Log Caller of Set/GetValue 2024-05-06 11:27:56 +05:30
HanishKVC f05f71bdc4 SimpCfg:SetBool string value, str_tolower, SetValueTypeLogging 2024-05-06 11:27:56 +05:30
HanishKVC 1dc7fd0e85 SimpCfg:WIP:Variant TypeDef, to_str and std::get
Cleanup the use of the variant

Initialize and << op stringstream separately.
2024-05-06 11:27:56 +05:30
HanishKVC ee1a62c876 SimpCfg:WIP:Switch to C++ Variant type - initial skeleton 2024-05-06 11:27:56 +05:30
HanishKVC 7302b3ab36 SimpCfg: Use stderr wrt internal Log messaging helpers 2024-05-06 11:27:56 +05:30
HanishKVC a09571318a ChatON: meta-dump returns flag in turn returned by meta-ok
test-chat-template-chaton now tries to check if meta-ok is ok wrt
the template-id being looked into.

Log template-id info also, where it was previously missed out.
2024-05-06 11:27:56 +05:30
HanishKVC 44c05305d0 SimpCfg: Add support for get_double 2024-05-06 11:27:56 +05:30
HanishKVC 8ad2c17e5d SimpCfg: get_int64 logic 2024-05-06 11:27:56 +05:30
HanishKVC 000245b8e8 SimpCfg:Warn possible nonstring strings, some invalid floats
Warn if something not starting with double quote is being treated
as a string.

Show some examples of invalid floating point values wrt this
logics floating point determination code
2024-05-06 11:27:56 +05:30
HanishKVC a6648b02f2 SimpCfg:Show floating point values in normal and exponential form 2024-05-06 11:27:56 +05:30
HanishKVC 4181164217 SimpCfg:Implement set_int64 and set_double
Also update the sample simpcfg file, to test for int and float
values.
2024-05-06 11:27:56 +05:30
HanishKVC fb9a7dc7fe SimpCfg:Initial skeleton towards supporting int and floating point 2024-05-06 11:27:56 +05:30
HanishKVC d0b3ebf32e SimpCfg: Use & wrt destination of [] operation
so that one can update the value-item's content, without needing
to explicitly update/store the value-item back into map after the
content has been updated.

This should make these setting operations/helpers more efficient.
2024-05-06 11:27:56 +05:30
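The difference this commit describes can be sketched with two setters over a nested map. Both function names and the map aliases are hypothetical; the point is only the `&` on the result of `operator[]`.

```cpp
#include <map>
#include <string>

using KeyMap   = std::map<std::string, std::string>;
using GroupMap = std::map<std::string, KeyMap>;

// Copy-then-store: the inner map is copied out, modified, and written back.
inline void set_value_copy(GroupMap &gm, const std::string &group,
                           const std::string &key, const std::string &value) {
    KeyMap km = gm[group];  // copies the whole inner map
    km[key] = value;
    gm[group] = km;         // second lookup plus another copy
}

// Reference: bind the operator[] result with & and update it in place.
inline void set_value_ref(GroupMap &gm, const std::string &group,
                          const std::string &key, const std::string &value) {
    KeyMap &km = gm[group];  // no copy; km aliases the stored inner map
    km[key] = value;
}
```

Both leave the map in the same state, but the reference version skips two copies of the inner map and one lookup per call.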
HanishKVC 0a534e6897 SimpCfg: Rename test program related #define 2024-05-06 11:27:56 +05:30
HanishKVC ca5a04d607 SimpCfg: Remove double quotes around group, key or string value 2024-05-06 11:27:56 +05:30
HanishKVC 82348e2840 SimpCfg: Put GroupMap back into Map, Iterate during get if DBUG
TODO: Have to look into C++ a bit later including these default
container types. Not sure my current flow is efficient.
2024-05-06 11:27:56 +05:30
HanishKVC 9940bd8ed7 SimpCfg: Allow default values wrt set string and set bool 2024-05-06 11:27:56 +05:30
HanishKVC 951fbc3396 SimpCfg: Change logging to LDBUG and LERRR helpers 2024-05-06 11:27:56 +05:30
HanishKVC d514c81829 SimpCfg: Add the const which I had forgotten wrt args 2024-05-06 11:27:56 +05:30
HanishKVC 6de8a14f32 SimpCfg: Rename member functions to avoid sc_ prefix
now that logic has been converted into a class, no need for this
prefix
2024-05-06 11:27:56 +05:30
HanishKVC 1ecca5a7ec SimpCfg: Convert to a class 2024-05-06 11:27:56 +05:30
HanishKVC 28ae0c5b02 SimpCfg:Make str_trim flexible, use to trim , wrt value
Now one can pass the list/string of chars to trim at either end.
2024-05-06 11:27:56 +05:30
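A trim helper that takes the set of characters to strip could look roughly like this. The function name and default character set are assumptions. The sketch works byte by byte, which is safe for ASCII trim characters in UTF-8 text but not for multibyte trim characters, the case the later TrimOverSmart commits address.

```cpp
#include <string>

// Strip any of the characters in trimChars from both ends of s.
inline std::string str_trim_sketch(const std::string &s,
                                   const std::string &trimChars = " \t\n") {
    const auto first = s.find_first_not_of(trimChars);
    if (first == std::string::npos) return "";  // nothing but trim chars
    const auto last = s.find_last_not_of(trimChars);
    return s.substr(first, last - first + 1);
}
```

Passing `"\","` as the second argument strips surrounding double quotes and a trailing comma from a value in a single call.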
HanishKVC 2cbb00c340 SimpCfg: Add support for boolean fields wrt key-value 2024-05-06 11:27:56 +05:30
HanishKVC aea6850131 SimpCfg: Keep compiler happy, also add newline wrt alt logging def 2024-05-06 11:27:56 +05:30
HanishKVC f4687fa5d4 SimpCfg:Parse config file and load string key-value fields 2024-05-06 11:27:56 +05:30
HanishKVC ce75d434dc SimpCfg: Initial skeleton : get and set string and bool values 2024-05-06 11:27:56 +05:30
HanishKVC af9a0a211b ChatON:ChatTmplApply: Avoid the stringstream 2024-05-06 11:27:56 +05:30
HanishKVC 889a45ff28 ChatON:ChatTmplApply:Update the function notes 2024-05-06 11:27:56 +05:30
HanishKVC ff5f68826b ChatON:ChatTmplApplySingle: Avoid stringstream, update func notes 2024-05-06 11:27:56 +05:30
HanishKVC 32e672c5dd ChatON: Dont log final tagged message string to screen 2024-05-06 11:27:56 +05:30
HanishKVC cad50c527e ChatON: Update the note to match current logic 2024-05-06 11:27:56 +05:30
HanishKVC a4b3285034 ChatON:Show Log on screen when template is applied 2024-05-06 11:27:56 +05:30
HanishKVC d61b071b8d Chaton:Common:Add missing newline wrt cmdline arg usage 2024-05-06 11:27:56 +05:30
HanishKVC fee887fe31 ChatON:Common:Update the cmdline argument name used
Had forgotten to update it before
2024-05-06 11:27:56 +05:30
HanishKVC 58e1ff16bc ChatON: switch to ordered_json from json library
to be in sync with the json namespace in server.
2024-05-06 11:27:56 +05:30
HanishKVC a630564c48 ChatON:ChatTemplateApplyCAPI remaining base logic
As c doesn't have the concept of pass by reference, and in turn the
existing c api uses pointers wrt llama chat message structure, so
switching to same wrt chat_tmpl_apply logics.

Also fix an oversight in previous commit and add the remaining logic.
2024-05-06 11:27:56 +05:30
HanishKVC 308d3bf3ff ChatON:WIP:Add c api wrapper for chat_template_apply
Initial skeletons

Update existing logics to help with same. Also the inbetween helper
was having a bad signature wrt returning status and data, that's also
fixed.
2024-05-06 11:27:56 +05:30