Commit Graph

92 Commits

Author SHA1 Message Date
HanishKVC 999bd396d0 ChatON: forgot to get c string format 2024-05-16 23:50:03 +05:30
HanishKVC 0cbfd40f18 ChatON: Option for a fallback tmpl to use wrt chat-tmpl-apply-ex 2024-05-16 23:27:34 +05:30
HanishKVC 239b5be219 ChatON+: Cleanup integration with CMake
Rename chaton-meta hpp to cpp and include this cpp file which brings
in the compile time built-in global chaton configurable template data
into the common library, and avoid the nop hpp file references.

Update chaton.hpp to not include the meta-cpp, instead just make a
reference to the global ChatTemplates instance, so that the hpp can
be used as a header file proper.

Avoid pragma once in chaton-meta.cpp, and also in the script which
helps create it.
2024-05-16 12:22:27 +05:30
HanishKVC 4a15989000 ChatON: Forgot this note earlier 2024-05-15 03:38:41 +05:30
HanishKVC a3d641b555 ChatON: Move loading from json file into its own file
Any program which wants to use a json file to update/extend
chaton's configurable template data can include this new file,
chaton_json.hpp, to get the required functionality.

Update chaton_meta_ok, _chaton_meta_validate_dump and
chaton_meta_load_json to either work with a passed ChatTemplates
instance, or fallback to the compiled-in global instance of same.
2024-05-15 03:00:25 +05:30
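The "passed instance, or fall back to the global instance" pattern mentioned in this commit can be sketched roughly as below. This is only an illustration under assumptions: TemplatesSketch, global_templates and tmpl_exists are hypothetical names and not the actual chaton API, and the template ids are invented.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for ChatTemplates: only tracks known template ids.
struct TemplatesSketch {
    std::set<std::string> ids;
    bool exists(const std::string &id) const { return ids.count(id) > 0; }
};

// Stand-in for the compiled-in global instance (the id is invented).
inline TemplatesSketch &global_templates() {
    static TemplatesSketch gCT = { std::set<std::string>{"demo-builtin"} };
    return gCT;
}

// Works with the passed instance, else falls back to the global one.
inline bool tmpl_exists(const std::string &id, const TemplatesSketch *ct = nullptr) {
    if (ct == nullptr) {
        ct = &global_templates();
    }
    return ct->exists(id);
}
```

Callers that only care about the built-in data can call tmpl_exists("demo-builtin"), while callers with their own instance pass its address as the second argument.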
HanishKVC 8975de996b ChatON: Update Notes to match the updated semantics and flows
The initial version was rooted around a json object, while the new
version is rooted around a MapOfMapOfVariant (GroupKV), which could
be preloaded with chat templates info at compile time itself and
used as is. Or optionally one could allow the configurable template
data to be extended/updated at runtime from a text(/SimpCfg)/json
file.
2024-05-14 21:54:52 +05:30
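As a rough picture of what a MapOfMapOfVariant rooted flow could look like, here is a minimal sketch. The alias names follow the commit message, but the variant's value types, the gkv_get helper, the "demo-model" group and its keys are all assumptions made for illustration, not the actual GroupKV code.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <variant>

// group (template id) -> key -> value; the value types are an assumption.
using Variant = std::variant<std::string, bool, int64_t, double>;
using MapOfMapOfVariant = std::map<std::string, std::map<std::string, Variant>>;

// Data that could be baked in at compile time; group and keys are invented.
inline MapOfMapOfVariant builtin_templates() {
    return {
        {"demo-model", {
            {"user-prefix", std::string("[USER] ")},
            {"user-suffix", std::string("\n")},
            {"user-has-prefix", true},
        }},
    };
}

// Hypothetical typed lookup: returns fallback if the group or key is
// missing, or if the stored value has a different type than T.
template <typename T>
T gkv_get(const MapOfMapOfVariant &gkv, const std::string &group,
          const std::string &key, const T &fallback) {
    auto g = gkv.find(group);
    if (g == gkv.end()) return fallback;
    auto k = g->second.find(key);
    if (k == g->second.end()) return fallback;
    const T *p = std::get_if<T>(&k->second);
    return p ? *p : fallback;
}
```

With such a layout, the preloaded map can be used as is, and a runtime loader (text or json) only needs to assign into the same map, for example gkv["demo-model"]["user-prefix"] = std::string("<u>").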
HanishKVC f8c0b474ec ChatON+:RenameTo chaton_meta_load_json to match semantic
Also add simple note wrt itself and its helper.
2024-05-14 21:37:05 +05:30
HanishKVC bd5c39e0f0 ChatOn+GroupKV: Cleanup a bit, including using debug logging 2024-05-14 21:22:48 +05:30
HanishKVC bb9ce52b11 ChatON+: ValidateDump dumps All, wrapped in optional LDBUG_LN
GroupKV dump adds the needed ":" separator on its own, so calling
functions can just pass the tag string they want in the log without
worrying about any demarcation.
2024-05-14 18:45:25 +05:30
HanishKVC 28ddd2c474 ChatON: ChatParts dump returns info str rather than direct logging 2024-05-14 02:21:16 +05:30
HanishKVC 4dfd10a40d ChatON: Move core templating/tagging code into ChatTemplates class
However still retain the wrappers, which work with a predefined
global instance of ChatTemplates.
2024-05-14 01:49:38 +05:30
HanishKVC 600653dae2 ChatON:Optional control of MsgCntBasedTagging
Use same to bypass any msg count based tagging behaviour for the
single message tagging through its helper wrapper.
2024-05-14 01:27:24 +05:30
HanishKVC 6e13c0c87e ChatON:Control SystemMsgSuffix+End tags only wrt 1st system msg
Make it similar to user-begin+prefix control, i.e. only wrt the 1st
msg of the respective type.
2024-05-14 01:19:04 +05:30
HanishKVC 3fcaf19967 ChatON+:Multi4Single: applyGlobalIfAny flag wrt templating api
Given that the multi chat templating logic itself is now used to
apply chat templating/tagging to a single chat message, give the
flexibility of deciding whether global tags, if any, should be
applied or not wrt the core tagging logic.

examples/main in turn updated to not apply global tags, if any, wrt
the system message. Also the user messages already don't apply
global tags, if any, as it's currently implemented to build on the
existing in-prefix/suffix and antiprompt flow.
2024-05-14 01:00:17 +05:30
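The idea of a single-message helper that simply wraps the multi-message tagging logic, with a flag deciding whether global tags get applied, might look like the sketch below. All type and function names here (TmplSketch, apply_multi, apply_single) and the tag strings used with them are hypothetical; only the applyGlobalIfAny flag name comes from the commit message, and the defaults chosen for it are an assumption.

```cpp
#include <map>
#include <string>
#include <vector>

struct Msg { std::string role; std::string content; };
struct RoleTags { std::string prefix; std::string suffix; };

// Hypothetical, much simplified template description.
struct TmplSketch {
    std::string globalBegin;
    std::string globalEnd;
    std::map<std::string, RoleTags> roles;
};

// Core multi-message tagging: the only place that knows the tagging rules.
inline std::string apply_multi(const TmplSketch &t, const std::vector<Msg> &msgs,
                               bool applyGlobalIfAny = true) {
    std::string out;
    if (applyGlobalIfAny) out += t.globalBegin;
    for (const auto &m : msgs) {
        auto it = t.roles.find(m.role);
        if (it == t.roles.end()) {
            out += m.content;  // unknown role: pass the content through untagged
            continue;
        }
        out += it->second.prefix + m.content + it->second.suffix;
    }
    if (applyGlobalIfAny) out += t.globalEnd;
    return out;
}

// Single-message wrapper: no tagging rules of its own, just delegates.
inline std::string apply_single(const TmplSketch &t, const std::string &role,
                                const std::string &content,
                                bool applyGlobalIfAny = false) {
    return apply_multi(t, {Msg{role, content}}, applyGlobalIfAny);
}
```

Because the wrapper holds no rules itself, supporting a new model or chat-template-standard would need changes in one place only.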
HanishKVC 8165bd4035 ChatON:WIP:chaton_tmpl_apply_single build on multi msg tagging
To avoid having to duplicate any hardcoding in future, wrt any new
model/chat-template-standard, at multiple locations, replace the
single message templating code with a wrapper which does the same
but using the multi-msg templating helper.
2024-05-14 00:44:47 +05:30
HanishKVC fe0c9ce646 ChatON:BasicCheck+:return a string with info, dont directly log 2024-05-14 00:25:00 +05:30
HanishKVC efbb87dba6 ChatON:ChatTemplates:TmplBasicCheck 2024-05-13 17:50:15 +05:30
HanishKVC 0cfe99076d ChatON:ChatTemplates: TmplExists, TmplGetKey, TmplRoleGetKeys
ChatTemplates directly supports these now, and the existing
global instance based corresponding helpers depend on the same.
2024-05-13 17:30:47 +05:30
HanishKVC 184ac322e3 ChatON: Make json_get efficient and flexible wrt its calling
Also explicitly indicate that we are looking at a chain of keys
2024-05-13 16:21:02 +05:30
HanishKVC eb7554ca3b ChatON: Avoid -> to match simpcfg as well as corresponding keys 2024-05-13 10:37:14 +05:30
HanishKVC db2ffabb18 ChatON: use templated json_get when loading bool key-value fields
With this now even loading chaton_meta.json file will generate
more informative exception, so that user can know which field
is missing, if any.
2024-05-12 18:26:58 +05:30
HanishKVC 470b8885f3 ChatON: Switch to templated json_get for str/bool/etal 2024-05-12 18:19:18 +05:30
HanishKVC 0249c07e6b ChatON:Switch to json_get_str to help identify missing keys better
The json library generates a less informative exception message,
which doesn't help one identify which key is missing, so switch to
the new json_get_str helper added in the last commit. It generates
a more informative exception message.
2024-05-12 17:44:13 +05:30
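To illustrate why a helper that names the missing key is useful, here is a small sketch of a lookup over a chain of keys which throws with the full key path in the message. It deliberately uses a tiny hand-rolled tree (Node) instead of the json library, so Node, get_str and the message format are assumptions and not the actual json_get_str code.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Minimal stand-in for a json object tree (not the json library's type).
struct Node {
    std::string value;                                  // used by leaf nodes
    std::map<std::string, std::shared_ptr<Node>> kids;  // used by object nodes
};

// Walk the chain of keys; on failure, throw naming the path walked so far,
// which ends at the key that could not be found.
inline const std::string &get_str(const Node &root, const std::vector<std::string> &keys) {
    const Node *cur = &root;
    std::string path;
    for (const auto &key : keys) {
        path += "[" + key + "]";
        auto it = cur->kids.find(key);
        if (it == cur->kids.end()) {
            throw std::runtime_error("get_str: missing key " + path);
        }
        cur = it->second.get();
    }
    return cur->value;
}
```

A caller looking up {"demo-model", "user", "suffix"} in a tree which lacks the last key would get an exception mentioning [demo-model][user][suffix], rather than a generic lookup failure.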
HanishKVC 4eae05a6b7 ChatON: json access helper which raises exception if key missing 2024-05-12 17:34:04 +05:30
HanishKVC f94fed92d3 ChatON+MetaHpp: Had forgotten to conv reverse-prompt
Also, as dump was using get_value calls with fallback to default,
it wasn't identifying the missed field.

Have fixed both of those. Also reconverted meta json file.

Misc: interesting avesham and aattam
2024-05-12 16:20:28 +05:30
HanishKVC a3285e8e25 ChatON:Include auto converted ChatONMeta.hpp chat template data
This should allow for using this generic chat templating code flow
along with the included chat template data, without needing to
load any json file at runtime.

However, if the user wants to change the already included chat
template data, or add new chat template standard/model related data,
one can explicitly load a json file.

TODO: Need to cross check this flow once, but logically should work
2024-05-12 14:08:09 +05:30
HanishKVC 1574201f71 ChatON:LoadJSon:ChatTemplates: revPrompt, system-user flags
WIP:NOTE:

Initial go converting from json driven flow to ChatTemplatesGroupKV
related flow done. Needs to be tested.

An optional helper added to load ChatTemplates from a specified
json file.

Need to add a compile time initialized MapOfMapOfVariants wrt
the chat template details of models/standards already known
to the program. So that one can use llama.cpp and this new
chat template logic, even without the json dependency, if one
doesn't want it.
2024-05-12 01:45:19 +05:30
HanishKVC 444d2ccf9c ChatON:LoadJSON: ChatTemplates - global/system/user/assistant
Manually iterate the json object items using begin-end explicitly,
because the implicit iteration for loop related helpers for the
used json lib gives only the values and not a key-value pair.
2024-05-12 01:35:31 +05:30
HanishKVC 2efc09f2d0 ChatON: Unnecessarily indirect nlohmann json
code used for exploring/testing, committed just for future reference
2024-05-12 00:42:17 +05:30
HanishKVC b944d04d08 ChatON: Add constructor for ChatTemplates which chains into GKV 2024-05-11 23:42:08 +05:30
HanishKVC 4a9a6ce256 ChatON: ChatONMetaDump switch to GKV/ChatTemplates based flow 2024-05-11 22:53:45 +05:30
HanishKVC e999934e91 ChatON:WIP: initial go at GroupKV based flow, instead of json 2024-05-11 19:41:58 +05:30
HanishKVC 1f9a0eb8ce ChatON: Remove unneeded iostream 2024-05-10 21:10:44 +05:30
HanishKVC 8fe8231313 ChatON:SubPartsAwareTokenizePath: Allow extract subparts testing 2024-05-08 19:51:57 +05:30
HanishKVC a49697b488 ChatON: Keep compiler happy simply 2024-05-08 19:22:46 +05:30
HanishKVC 868ab608f0 ChatON: Add forceParseSpecial flag to subparts aware tokenizing 2024-05-08 18:42:22 +05:30
HanishKVC b6da7d9c9d ChatON: tokenize keeping in mind the taggedMessage subparts
Initial go
2024-05-08 18:38:07 +05:30
HanishKVC 8dfa31bb91 ChatON: Make c-api wrappers a bit robust incl some cross checks
If the tagged message will be of 0 length, ensure that the passed
dest char* array has a null inserted appropriately.

Check that the user has passed a non-null pNumParts.

Don't hardcode the int32_t size, pick it using sizeof.
2024-05-08 17:05:45 +05:30
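The cross checks listed in this commit can be pictured with a sketch like the following. The function name copy_tagged_capi, its parameter list and its return convention are assumptions made for illustration and are not the actual c-api wrapper; only pNumParts and the int32_t element type are taken from the commit messages.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies a tagged message and its per-part lengths into
// caller provided buffers. On entry *pNumParts holds the capacity of
// partsLengths; on exit it holds the number of parts that were available.
// Returns the full length of the tagged message, or -1 on invalid arguments.
inline int32_t copy_tagged_capi(const std::string &tagged,
                                const std::vector<int32_t> &lens,
                                char *dest, int32_t destLength,
                                int32_t *partsLengths, int32_t *pNumParts) {
    if (pNumParts == nullptr) {
        return -1;  // caller must pass a non-null pNumParts
    }
    if (dest != nullptr && destLength > 0) {
        std::size_t n = std::min(tagged.size(), static_cast<std::size_t>(destLength - 1));
        std::memcpy(dest, tagged.data(), n);
        dest[n] = '\0';  // also covers a tagged message of 0 length
    }
    std::size_t cap = (*pNumParts > 0) ? static_cast<std::size_t>(*pNumParts) : 0;
    std::size_t count = std::min(lens.size(), cap);
    if (partsLengths != nullptr && count > 0) {
        // element size picked using sizeof, rather than a hardcoded 4
        std::memcpy(partsLengths, lens.data(), count * sizeof(int32_t));
    }
    *pNumParts = static_cast<int32_t>(lens.size());
    return static_cast<int32_t>(tagged.size());
}
```

Returning the full length even when the copy was truncated lets a caller detect a too-small buffer and retry with a bigger one; that convention is a design choice of this sketch.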
HanishKVC 76791bad63 ChatON:Fix partsLengths to int32_t type, instead of int
so that the size of the elements is explicit and fixed, and thus
in sync with the fixed int size specified wrt the c-api, even with
C compilers that have a different idea about int.

Avoid some unused vars; need to update compile flags later to
enable the corresponding warnings.
2024-05-07 12:40:49 +05:30
HanishKVC b3a56545d6 ChatON:Reposition alertAssistantAtEnd flag for consistency 2024-05-07 11:49:43 +05:30
HanishKVC 0852f3b7ec ChatON:ExCApi: Rename for consistency 2024-05-07 11:46:40 +05:30
HanishKVC 43a3a91b03 ChatON: Cleanup/Refine initial go at tmpl_apply_ex_capi 2024-05-07 11:44:25 +05:30
HanishKVC 7c288d3dfc ChatON: Rename to partstypes for consistency 2024-05-07 11:32:20 +05:30
HanishKVC 04b4a15177 ChatON: Initial go at chat-template-apply c-api with parts info 2024-05-07 11:08:47 +05:30
HanishKVC f6a86cd209 ChatON: Update the Note a bit 2024-05-07 10:29:16 +05:30
HanishKVC 2b14bcaddb SimpCfg:ChatON: add by Humans for All note 2024-05-06 11:27:56 +05:30
HanishKVC a09571318a ChatON: meta-dump returns flag in turn returned by meta-ok
test-chat-template-chaton now tries to check if meta-ok is ok wrt
the template-id being looked into.

Log template-id info also, where it was previously missed out.
2024-05-06 11:27:56 +05:30
HanishKVC af9a0a211b ChatON:ChatTmplApply: Avoid the stringstream 2024-05-06 11:27:56 +05:30
HanishKVC 889a45ff28 ChatON:ChatTmplApply:Update the function notes 2024-05-06 11:27:56 +05:30
HanishKVC ff5f68826b ChatON:ChatTmplApplySingle: Avoid stringstream, update func notes 2024-05-06 11:27:56 +05:30