Have looked at tokenizer_config.json, the jinja file, and the default
hardcoded template in llama.cpp.
This is also one of the models where a Global BoS is needed.
NOTE: Have taken the liberty to also add a SYSTEM: prefix to the
system message. Default vicuna doesn't seem to need it, but
vicuna-orca seems to, so both models can be driven from the
same chat template config. I am assuming the system prefix should
not create any problem even in default vicuna; however, if it does
create a problem, one can duplicate the existing vicuna block in
chaton_meta.json and make the system prefix empty in it.
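The fallback described above could look roughly like this. This is only an illustrative sketch: the "vicuna-plain" name and the exact field layout are my assumptions, not the actual chaton_meta.json schema.

```json
{
    "vicuna": {
        "system":    { "prefix": "SYSTEM: " },
        "user":      { "prefix": "USER: " },
        "assistant": { "prefix": "ASSISTANT: " }
    },
    "vicuna-plain": {
        "system":    { "prefix": "" },
        "user":      { "prefix": "USER: " },
        "assistant": { "prefix": "ASSISTANT: " }
    }
}
```

The duplicated block differs only in its empty system prefix, so default vicuna could be pointed at it if the prefix ever causes trouble.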
This is the first model seen, among the templates added so far to the
meta json file, that needs a Global Begin.
From the tokenizer_config json file, it appears that even the system role
should have an appropriate prefix, unlike what is seen in the hardcoded
default chat apply template of llama.cpp and the chat jinja template.
Add missing begin and end fields wrt deepseek-coder assistant in
chaton_meta.json.
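A hedged sketch of what the completed deepseek-coder assistant entry might look like, with the previously missing begin and end fields filled in as empty strings. The field names and the "### Response:" / "<|EOT|>" strings are illustrative assumptions based on the deepseek-coder style, not a copy of the actual chaton_meta.json contents.

```json
{
    "deepseek-coder": {
        "assistant": {
            "begin":  "",
            "prefix": "### Response:\n",
            "suffix": "\n<|EOT|>\n",
            "end":    ""
        }
    }
}
```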
The idea is to avoid the json library dependency by adding simple
text-based config file support.
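As a rough illustration, such a text-based config might express the same template data as flat key-value lines. The syntax below is entirely hypothetical; the actual format would be whatever the new parser defines.

```
# hypothetical text config sketch, one key-value pair per line
vicuna-system-prefix    "SYSTEM: "
vicuna-user-prefix      "USER: "
vicuna-assistant-prefix "ASSISTANT: "
```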
Wrt llama2
* add bos wrt llama2 system and user begins, but not assistant
* split system suffix into suffix and end, and add systemuser-system
flags so that end can be avoided wrt system+user message combo
* add eos wrt assistant end
* With these changes, this should potentially work with the main and server flows
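To make the begin/prefix/suffix/end split and the system+user combo handling concrete, here is a small Python sketch. The field names, token strings, and the flag name `systemuser-system-skip-end` are my assumptions based on the description above, not the actual chaton code or schema.

```python
# Hypothetical sketch: assemble a prompt from role-tagged messages using a
# per-role begin/prefix/suffix/end split plus a system+user combo flag.
META = {
    "system":    {"begin": "<s>", "prefix": "SYSTEM: ",    "suffix": "\n", "end": "<END>"},
    "user":      {"begin": "<s>", "prefix": "USER: ",      "suffix": "\n", "end": ""},
    "assistant": {"begin": "",    "prefix": "ASSISTANT: ", "suffix": "\n", "end": "</s>"},
    # If True, drop the system end when the system message is immediately
    # followed by a user message (the system+user combo described above).
    "systemuser-system-skip-end": True,
}

def apply_template(messages):
    """Concatenate begin + prefix + content + suffix + end for each message."""
    out = []
    for i, msg in enumerate(messages):
        part = META[msg["role"]]
        end = part["end"]
        nxt = messages[i + 1]["role"] if i + 1 < len(messages) else None
        if (META["systemuser-system-skip-end"]
                and msg["role"] == "system" and nxt == "user"):
            end = ""  # system+user combo: avoid emitting the system end
        out.append(part["begin"] + part["prefix"] + msg["content"]
                   + part["suffix"] + end)
    return "".join(out)

print(apply_template([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
]))
```

Note how the system and user begins carry the BoS while the assistant's does not, mirroring the bullets above.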
Wrt llama3
* add empty begin, end fields and systemuser-system flags
* This should potentially work with the main and server flows
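A hedged sketch of what the llama3 entry might look like with empty per-role begin and end fields and a global begin. The field layout is my assumption; the special-token strings follow the known llama3 prompt format.

```json
{
    "llama3": {
        "global": { "begin": "<|begin_of_text|>", "end": "" },
        "system": {
            "begin": "", "prefix": "<|start_header_id|>system<|end_header_id|>\n\n",
            "suffix": "<|eot_id|>", "end": ""
        },
        "user": {
            "begin": "", "prefix": "<|start_header_id|>user<|end_header_id|>\n\n",
            "suffix": "<|eot_id|>", "end": ""
        },
        "assistant": {
            "begin": "", "prefix": "<|start_header_id|>assistant<|end_header_id|>\n\n",
            "suffix": "<|eot_id|>", "end": ""
        }
    }
}
```

The systemuser-system flags mentioned above are omitted here since their exact names are not shown in this note.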
Was looking at the tokenized vector, and noticed that the EOS
used by the existing chat_apply_template of llama.cpp differs
from what I noticed in the tokenizer_config.json of deepseek llm, so
I have added two entries:
* "deepseek-alt", which matches llama.cpp's chat_apply_template, and
* "deepseek", which matches that in tokenizer_config.json.
This impacts the assistant suffix and reverse prompt entries.
Because of this, I need to look into the other entries which I added
previously, at a later time. However, as the default logic should be
picking the EOS from the model file, I assume the reverse-prompt being
out of sync may not matter beyond a limit, potentially.
Update the note
Rename global-prefix|suffix to global-begin|end.
Rename chat-apply-template to chat-apply-template-single, because it
handles only a single message.
Add some debug log messages to the helper functions.