#pragma once

/**
 *
 * Generic tagging logic + text config file based chat templates handling
 * by Humans for All
 *
 * ## Overview
 *
 * Helps chat with models, by tagging chat messages based on the specified
 * chat-handshake-template-standard. This uses generic tagging code driven
 * by a JSON meta-data file, which specifies the handshake template details.
 *
 * This can be used by
 *
 * * main, to build on its existing interactive flow and its in-prefix, in-suffix
 *   and antiprompt/reverse-prompt
 *
 * * server, by replacing its existing llama_chat_apply_template with the
 *   equivalent helper here.
 *
 * ## The common pattern
 *
 * As a convention, the tagging used by LLMs to differentiate between the
 * different parts when chatting with them normally follows a general pattern of
 *
 * * <BeginOfSentenceIfAny> <RolePrefixIfAny> <TheContent> <RoleSuffixIfAny> <EndOfSentenceIfAny>
 *
 * * The roles could include System, User and Assistant (i.e. the model)
 *
 * * A chat normally consists of
 *
 *   * a system message/prompt, followed by
 *
 *   * multiple user message/query - model message/response pairs
 *
 * Different models will normally have all or some subset of the tagging mentioned above.
 *
 * You may also notice some common patterns, like
 *
 * * Because a user message is normally followed by a model/assistant response, in most models
 *
 *   * user messages won't have an EndOfSentenceTag, and
 *
 *   * the following model response won't have a BeginOfSentenceTag
 *
 * * Because a system message will normally be immediately followed by a user query,
 *
 *   * in many models, there won't be an EndOfSentenceTag following the system message, nor a
 *     BeginOfSentenceTag wrt the 1st user message following the system message.
 *
 *   * in some models there won't even be a RoleSuffixTag following the system message,
 *     nor a RolePrefixTag wrt the 1st user message following the system message.
 *
 *   * however, in many of these models, the subsequent user messages will have the
 *     BeginOfSentenceTag and/or RolePrefixTag.
 *
 * * Some models may require a BoS for a group of messages, independent of the BoS (if any)
 *   wrt individual roles.
 *
 * ## The Strategy
 *
 * The template meta-data JSON file allows the user to specify the above mentioned tags wrt
 * each of the roles, as well as any global tag for a group of messages. Depending on whether
 * a given model uses/needs a given tag or not, you either specify the required tag or else
 * an empty string.
 *
 * A tag could be a single word or multiple words, and may include newline chars specified
 * using \n and so on. The tag is always demarcated using double quotes, which also allows
 * spaces at the beginning or end of the tag, if needed.
 *
 * In order to account for the conditionality of tags between the system message and the 1st
 * user message, flags are provided to explicitly control whether each of these possible tags
 * is used by a specific model or not, as part of its template info.
 *
 * The roles are identified in the JSON file using "system", "user" and "assistant". However
 * the model may use different words to identify these roles, in which case set up RolePrefix
 * and/or RoleSuffix appropriately.
 *
 * To identify that the model has finished generating its response to a user query, depending on
 * the model's handshake template standard, one will need to set the reverse-prompt to either
 * the assistant's suffix or end tag, or to the user's begin or prefix tag, depending on what
 * is generated by the model at the end of its response.
 *
 * Currently, flags for trimming user text (be it wrt the system or user role) are not provided.
 *
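 *
 * As an illustration of how these flags play out, consider a hypothetical template
 * where systemuser-system-has-end and systemuser-1st-user-has-begin are false, while
 * the other two flags are true; a system + user + assistant exchange would then be
 * tagged roughly as:
 *
```
global-begin
+ system-begin + system-prefix + SystemMsg + system-suffix           (system end skipped)
+ user-prefix + UserMsg1 + user-suffix + user-end                    (1st user begin skipped)
+ assistant-begin + assistant-prefix + AsstMsg1 + assistant-suffix + assistant-end
+ global-end
```
 *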
 * ## The JSON File
 *
 * Can contain the template info wrt multiple models/handshake-standards. In turn, each
 * unique template is identified by a unique template id string.
 *
 * The fields that make up a given chat-handshake-template-standard include
 *
 * * global -> begin & end
 *
 * * system -> begin, prefix, suffix & end
 *
 * * user -> begin, prefix, suffix & end
 *
 * * assistant -> begin, prefix, suffix & end
 *
 * * reverse-prompt
 *
 * * systemuser-system-has-suffix, systemuser-system-has-end,
 *   systemuser-1st-user-has-begin and systemuser-1st-user-has-prefix
 *
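 *
 * A hypothetical entry for a ChatML-style template could look as below. The tag
 * values shown here are purely illustrative; refer to the actual chaton_meta.json
 * for real entries.
 *
```json
{
    "chatml": {
        "global": { "begin": "", "end": "" },
        "system": { "begin": "", "prefix": "<|im_start|>system\n", "suffix": "<|im_end|>\n", "end": "" },
        "user": { "begin": "", "prefix": "<|im_start|>user\n", "suffix": "<|im_end|>\n", "end": "" },
        "assistant": { "begin": "", "prefix": "<|im_start|>assistant\n", "suffix": "<|im_end|>\n", "end": "" },
        "reverse-prompt": "<|im_end|>",
        "systemuser-system-has-suffix": true,
        "systemuser-system-has-end": true,
        "systemuser-1st-user-has-begin": true,
        "systemuser-1st-user-has-prefix": true
    }
}
```
 *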
 * ## Usage
 *
 * One needs to load the JSON file containing the template meta data, and in turn call the
 * other helper functions as needed.
 *
 * In turn one can use the helper functions to either extract a given tag, to apply all
 * tags specified wrt a given role to the passed message, or to apply tags as needed for
 * a bunch of messages in one go.
 *
 * The individual message tagging helper will apply all tags specified wrt that role.
 *
 * The multiple messages tagging helper, chaton-tmpl-apply, will look at the boolean flags
 * when tagging the passed messages. Here the system suffix, system end, user begin and
 * user prefix get included only if the corresponding flag is set.
 *
 * Both the single and multi message tagging helpers provide two versions:
 * * one which returns a single string containing the tagged message(s)
 * * one which returns
 *   * [tagged msg] the string containing the tagged message(s)
 *   * [parts lengths] an array of integers, which specifies the part lengths
 *     that divide the returned string into parts.
 *   * [parts types] a string where each character indicates whether the corresponding
 *     part is a normal part, which needs to be tokenized without parse_special,
 *     or a special part, which needs to be tokenized with parse_special.
 *
 * ## example/main
 *
 * The interactive commandline program under example/main uses
 *
 * * the system role related tags to tag the system prompt
 *   * the system prompt includes the contents of -p if any,
 *   * followed by the contents of the file specified using -f if any
 * * the user begin+prefix to map to in-prefix
 * * the user suffix+end to map to in-suffix
 * * the reverse-prompt to map to antiprompt
 * * wrt tokenization
 *   * the user specified system prompt is tokenized with the parse_special flag,
 *   * however the user messages are tokenized without the parse_special flag.
 *
 * Currently main doesn't use chaton-tmpl-apply, but only
 * * chaton-tmpl-apply-single (for the system prompt), and
 * * chaton-tmpl-role-kv, which maps the user prefix, suffix and reverse-prompt
 *   to in-prefix, in-suffix and antiprompt of main.
 * These always add any role specific begin+prefix and suffix+end around
 * the passed message.
 *
 * ## Other uses, be it wrt llama.cpp-as-library, examples/server or ...
 *
 * This module exposes a C API which is equivalent to the current hardcoded
 * templating logic's llama_chat_apply_template. So any program using llama.cpp's
 * chat templating logic can be easily migrated to make use of this generic code
 * with its text config file driven flow.
 *
 * If a program doesn't want to bring the json dependency into their project,
 * there is also common/simpcfg.hpp, which provides a simple text based config
 * file format, along with the corresponding parser. This module can be
 * modified to work with simpcfg easily, if needed.
 *
 * ## Adding support for a new model / chat-handshake-template-standard
 *
 * 1. Add suitable entries to the JSON file for that model/standard.
 *    This in itself should work for most of the models.
 *
 * 2. If some new model introduces a totally different kind of chat-templating
 *    tag inter/intra mixing, try to reuse and update the generic flow in
 *    chaton-tmpl-apply as much as possible, before trying to add any custom logic.
 *
 *    If you update the generic flow, cross check whether existing JSON files
 *    need to be updated or not.
 *
 * ## Notes
 *
 * Look at the sample chaton_meta.json in the examples folder for how the above may apply to
 * the different LLMs out there, like
 *
 * * llama2, llama3, gemma, zephyr, deepseek (normal and coder), monarch, mistral, phi3
 * * chatml, command-r, orion, openchat, vicuna
 *
 */

#include <string>
#include <vector>
#include <fstream>
#include <iostream>
#include <typeinfo>

#include <json.hpp>

#include "log.h"
#include "llama.h"

#define LOGXLN LOG_TEELN

const auto K_SYSTEM = "system";
const auto K_USER = "user";
const auto K_ASSISTANT = "assistant";
const auto K_PREFIX = "prefix";
const auto K_SUFFIX = "suffix";
const auto K_BEGIN = "begin";
const auto K_END = "end";
const auto K_GLOBAL = "global";
const auto K_SYSTEMUSER_SYSTEM_HAS_SUFFIX = "systemuser-system-has-suffix";
const auto K_SYSTEMUSER_SYSTEM_HAS_END = "systemuser-system-has-end";
const auto K_SYSTEMUSER_1ST_USER_HAS_BEGIN = "systemuser-1st-user-has-begin";
const auto K_SYSTEMUSER_1ST_USER_HAS_PREFIX = "systemuser-1st-user-has-prefix";
const auto K_REVERSE_PROMPT = "reverse-prompt";


using json = nlohmann::ordered_json;

json conMeta;

/**
 * Helps keep user prompt and chat-hs-template tag parts separate, but in sequence.
 * In turn gives the flexibility to tokenize with or without the parse_special flag, wrt the different parts of the chat msg(s).
 * One could use the triplet of str, get_partstypes and get_partslens to achieve the above mentioned flexibility.
 */
class ChatParts {

    std::vector<std::string> parts = {};
    std::string types = {""};

public:
    // Identifies a string with special tokens that need to be processed.
    static const auto S = 's';
    // Identifies a string which shouldn't have special token processing done.
    static const auto N = 'n';
    // Identifies a no-string condition and/or a string to ignore.
    static const auto X = '?';

    ChatParts() : parts{}, types{""} {}

    char last_type() {
        if (types.length() == 0) {
            return ChatParts::X;
        }
        return types[types.length()-1];
    }

    void add_part(char type, const std::string &part) {
        if (last_type() == type) {
            parts[parts.size()-1] += part;
        } else {
            parts.emplace_back(part);
            types += type;
        }
    }

    std::string str() {
        std::string allin = "";
        for(const auto &part: parts) {
            allin += part;
        }
        return allin;
    }

    std::string get_partstypes() {
        return types;
    }

    std::vector<int> get_partslens() {
        std::vector<int> lens = {};
        for(const auto &part: parts) {
            lens.push_back(part.length());
        }
        return lens;
    }

    std::string name() {
        return typeid(*this).name();
    }

    void dump() {
        std::string me = name() + ":" + __func__;
        LOGXLN("INFO:%s:NumTypes:%zu", me.c_str(), types.length());
        LOGXLN("INFO:%s:NumParts:%zu", me.c_str(), parts.size());
        LOGXLN("INFO:%s:StrLength:%zu", me.c_str(), str().length());
        if (parts.size() != types.length()) {
            LOG_TEELN("DBUG:%s:Mismatch between parts and types", me.c_str());
        }
        int i = 0;
        for(const auto &part: parts) {
            LOGXLN("INFO:%s:%c:%s", me.c_str(), types[i], part.c_str());
            i += 1;
        }
    }

};

inline bool chaton_meta_load(std::string &fname) {
    if (conMeta != nullptr) {
        LOGXLN("WARN:%s:ChatOn Meta: overwriting???", __func__);
    }
    std::ifstream f(fname);
    if (!f) {
        LOG_TEELN("ERRR:%s:ChatOn Meta: failed to open [%s]", __func__, fname.c_str());
        return false;
    }
    conMeta = json::parse(f);
    return true;
}

inline bool chaton_tmpl_exists(const std::string &tmpl) {
    if (conMeta == nullptr) {
        LOG_TEELN("ERRR:%s:ChatOnMeta: Not loaded yet...", __func__);
        return false;
    }
    try {
        // at() throws for a missing key; operator[] would silently insert a null entry
        conMeta.at(tmpl);
        return true;
    } catch (json::exception &err) {
        LOG_TEELN("WARN:%s:ChatOnMeta: tmpl[%s] not found...", __func__, tmpl.c_str());
        return false;
    }
}

inline std::string chaton_tmpl_role_kv(const std::string &tmpl, const std::string &role, const std::vector<std::string> &keys) {
    std::string got = "";
    std::string sKeys = "";
    for(const auto &key: keys) {
        try {
            got += conMeta.at(tmpl).at(role).at(key);
        } catch (json::exception &err) {
            // a missing key is treated as an empty tag
        }
        sKeys += "+";
        sKeys += key;
    }
    LOGLN("DBUG:%s:%s:%s:%s:%s", __func__, tmpl.c_str(), role.c_str(), sKeys.c_str(), got.c_str());
    return got;
}

inline std::string chaton_tmpl_kv(const std::string &tmpl, const std::string &key) {
    std::string got = conMeta.at(tmpl).at(key);
    LOGLN("DBUG:%s:%s:%s:%s", __func__, tmpl.c_str(), key.c_str(), got.c_str());
    return got;
}

inline bool chaton_tmpl_kv_bool(const std::string &tmpl, const std::string &key) {
    bool got = conMeta.at(tmpl).at(key);
    LOGLN("DBUG:%s:%s:%s:%d", __func__, tmpl.c_str(), key.c_str(), got);
    return got;
}

// Given the template standard, role and a message, this returns
// a tagged message, a types string and a lens vector wrt the parts that make up the returned string
//
// * a string containing the tagged message
//   * role-(begin+prefix) + msg + role-(suffix+end)
// * a string where the chars contain info about the
//   type of the sub-strings/parts that make up the tagged message.
// * a vector of ints, which give the length of each part in the tagged message.
inline bool chaton_tmpl_apply_single_ex(
    const std::string &tmpl,
    const std::string &role,
    const std::string &content,
    std::string &tagged,
    std::string &types,
    std::vector<int> &lens
) {
    if (!chaton_tmpl_exists(tmpl)) {
        return false;
    }
    ChatParts cp = {};
    std::string beginPrefix = chaton_tmpl_role_kv(tmpl, role, {K_BEGIN, K_PREFIX});
    std::string suffixEnd = chaton_tmpl_role_kv(tmpl, role, {K_SUFFIX, K_END});
    cp.add_part(ChatParts::S, beginPrefix);
    cp.add_part(ChatParts::N, content);
    cp.add_part(ChatParts::S, suffixEnd);
    cp.dump();
    tagged = cp.str();
    LOGLN("DBUG:%s:%s:%s:%s", __func__, tmpl.c_str(), role.c_str(), tagged.c_str());
    types = cp.get_partstypes();
    lens = cp.get_partslens();
    return true;
}

// Given the template standard, role and a message, this returns the tagged message.
//
// * a string containing the tagged message
//   * role-(begin+prefix) + msg + role-(suffix+end)
//
// NOTE: returns int32_t (not size_t), so that the -1 error value survives.
inline int32_t chaton_tmpl_apply_single(
    const std::string &tmpl,
    const std::string &role,
    const std::string &content,
    std::string &tagged
) {
    std::string types;
    std::vector<int> lens;
    if (!chaton_tmpl_apply_single_ex(tmpl, role, content, tagged, types, lens)) {
        return -1;
    }
    return tagged.size();
}

/**
 * Apply the chat-handshake-template for the specified template standard and role.
 * If the passed char array is smaller than that required for the tagged message,
 * * the part of the tagged message which fits within the dest buffer is copied
 * * the returned value indicates the size of the actual tagged message
 * NOTE:
 * * ideally the passed char array should be able to fit the tagged message + 0|null char.
 * * if the return value from this function is larger than or equal to destLength,
 *   then you will have to increase the size of the dest buffer, and call this
 *   function a second time, to ensure that one gets the full tagged message.
 */
inline int32_t chat_tmpl_apply_single_capi(
    const char *tmpl,
    const char *role,
    const char *content,
    char *dest,
    const size_t destLength
) {
    std::string tagged;
    int32_t taggedLength = chaton_tmpl_apply_single(tmpl, role, content, tagged);
    if (taggedLength <= 0) {
        return taggedLength;
    }
    if (dest && (destLength > 0)) {
        strlcpy(dest, tagged.c_str(), destLength);
    }
    return taggedLength;
}

// Given the template standard and a bunch of messages including their roles, this returns
// tagged messages, a types string and a lens vector. The returned types string and lens vector
// help identify the parts of the tagged msgs string which relate to the passed msgs and the added tags.
//
// * a string containing the tagged messages
//   * global-begin + 1 or more [[role-begin] + [role-prefix] + msg + [role-suffix] + [role-end]] + global-end
// * a string where the chars contain info about the
//   type of the sub-strings/parts that make up the tagged messages string.
// * a vector of ints, which give the length of each part in the tagged messages string.
//
// if a combination of system-user messages is passed, then the tags between the system
// and the 1st user message are based on the flags set wrt the corresponding template standard.
inline bool chaton_tmpl_apply_ex(
    const std::string &tmpl,
    const std::vector<const llama_chat_message *> &msgs,
    std::string &tagged,
    std::string &types,
    std::vector<int> &lens,
    bool alertAssistantAtEnd
) {
    if (!chaton_tmpl_exists(tmpl)) {
        return false;
    }
    ChatParts cp = {};
    std::string globalBegin = chaton_tmpl_role_kv(tmpl, K_GLOBAL, {K_BEGIN});
    cp.add_part(ChatParts::S, globalBegin);
    int cntSystem = 0;
    int cntUser = 0;
    int cntOthers = 0;
    for(const auto msg: msgs) {
        std::string role = msg->role;   // std::string, so comparisons below compare contents, not pointers
        auto content = msg->content;
        auto begin = chaton_tmpl_role_kv(tmpl, role, {K_BEGIN});
        auto prefix = chaton_tmpl_role_kv(tmpl, role, {K_PREFIX});
        auto suffix = chaton_tmpl_role_kv(tmpl, role, {K_SUFFIX});
        auto end = chaton_tmpl_role_kv(tmpl, role, {K_END});
        if (role == K_SYSTEM) {
            cntSystem += 1;
            cp.add_part(ChatParts::S, begin);
            cp.add_part(ChatParts::S, prefix);
        } else if (role == K_USER) {
            cntUser += 1;
            if ((cntSystem == 1) && (cntUser == 1)) {
                if (chaton_tmpl_kv_bool(tmpl, K_SYSTEMUSER_1ST_USER_HAS_BEGIN)) {
                    cp.add_part(ChatParts::S, begin);
                }
                if (chaton_tmpl_kv_bool(tmpl, K_SYSTEMUSER_1ST_USER_HAS_PREFIX)) {
                    cp.add_part(ChatParts::S, prefix);
                }
            } else {
                cp.add_part(ChatParts::S, begin);
                cp.add_part(ChatParts::S, prefix);
            }
        } else {
            cntOthers += 1;
            cp.add_part(ChatParts::S, begin);
            cp.add_part(ChatParts::S, prefix);
        }
        cp.add_part(ChatParts::N, content);
        if (role == K_SYSTEM) {
            if (chaton_tmpl_kv_bool(tmpl, K_SYSTEMUSER_SYSTEM_HAS_SUFFIX)) {
                cp.add_part(ChatParts::S, suffix);
            }
            if (chaton_tmpl_kv_bool(tmpl, K_SYSTEMUSER_SYSTEM_HAS_END)) {
                cp.add_part(ChatParts::S, end);
            }
        } else {
            cp.add_part(ChatParts::S, suffix);
            cp.add_part(ChatParts::S, end);
        }
    }
    if (alertAssistantAtEnd) {
        auto assistantBeginPrefix = chaton_tmpl_role_kv(tmpl, K_ASSISTANT, {K_BEGIN, K_PREFIX});
        cp.add_part(ChatParts::S, assistantBeginPrefix);
    }
    auto globalEnd = chaton_tmpl_role_kv(tmpl, K_GLOBAL, {K_END});
    cp.add_part(ChatParts::S, globalEnd);
    cp.dump();
    tagged = cp.str();
    LOGLN("DBUG:%s:%s:%s", __func__, tmpl.c_str(), tagged.c_str());
    LOGLN("DBUG:%s:%s:CntSys[%d]:CntUsr[%d]:CntOthers[%d]", __func__, tmpl.c_str(), cntSystem, cntUser, cntOthers);
    types = cp.get_partstypes();
    lens = cp.get_partslens();
    return true;
}

// Given the template standard and a bunch of messages including their roles, this returns
// the tagged messages as a string.
// global-begin + 1 or more [[role-begin] + [role-prefix] + msg + [role-suffix] + [role-end]] + global-end
inline int32_t chaton_tmpl_apply(
    const std::string &tmpl,
    const std::vector<const llama_chat_message *> &msgs,
    bool alertAssistantAtEnd,
    std::string &tagged
) {
    std::string types;
    std::vector<int> lens;
    if (!chaton_tmpl_apply_ex(tmpl, msgs, tagged, types, lens, alertAssistantAtEnd)) {
        return -1;
    }
    return tagged.size();
}

// Given the template standard and a bunch of messages including their roles, this returns
// the tagged messages as a string.
// global-begin + 1 or more [[role-begin] + [role-prefix] + msg + [role-suffix] + [role-end]] + global-end
//
// If the passed char array is smaller than that required for the tagged messages string,
// * the part of the tagged messages string which fits within the dest buffer is copied
// * the returned value indicates the size of the actual tagged messages string
//
// NOTE:
// * ideally the passed char array should be able to fit the tagged messages string + 0|null char.
// * if the return value from this function is larger than or equal to destLength,
//   then you will have to increase the size of the dest buffer, and call this
//   function a second time, to ensure that one gets the full tagged messages string.
inline int32_t chaton_tmpl_apply_capi(
    const char *tmpl,
    const struct llama_chat_message *msgs,
    const size_t numMsgs,
    bool alertAssistantAtEnd,
    char *dest,
    int32_t destLength
) {
    if ((tmpl == nullptr) || (dest == nullptr)) {
        return -1;
    }
    std::vector<const llama_chat_message *> vMsgs;
    for(size_t i=0; i<numMsgs; i++) {
        vMsgs.push_back(&msgs[i]);
    }
    std::string taggedMsgs;
    int32_t taggedLength = chaton_tmpl_apply(tmpl, vMsgs, alertAssistantAtEnd, taggedMsgs);
    if (taggedLength <= 0) {
        return taggedLength;
    }
    if (destLength > 0) {
        strlcpy(dest, taggedMsgs.c_str(), destLength);
    }
    return taggedLength;
}

//
// In addition to the semantics provided by chaton_tmpl_apply_capi,
// this additionally returns info about the parts that make up
// the returned tagged message.
//
// partTypes and partLengths should be arrays that can accommodate the
// same number of elements, each of its respective type.
// In turn, pNumParts should point to an int which specifies the
// number of elements in those arrays.
// If the generated tagged message has more parts than the specified
// *pNumParts, then the logic copies into partTypes and partLengths up to the
// specified length/NumOfParts only. In parallel it updates *pNumParts
// to the actual needed length (not including any terminating null char or so).
//
inline int32_t chaton_tmpl_apply_ex_capi(
    const char *tmpl,
    const struct llama_chat_message *msgs,
    const size_t numMsgs,
    bool alertAssistantAtEnd,
    char *dest,
    int32_t destLength,
    char *partTypes,
    int32_t *partLengths,
    int32_t *pNumParts
) {
    if ((tmpl == nullptr) || (dest == nullptr)) {
        return -1;
    }
    std::vector<const llama_chat_message *> vMsgs;
    for(size_t i=0; i<numMsgs; i++) {
        vMsgs.push_back(&msgs[i]);
    }
    std::string taggedMsgs;
    std::string types;
    std::vector<int> lens;
    if (!chaton_tmpl_apply_ex(tmpl, vMsgs, taggedMsgs, types, lens, alertAssistantAtEnd)) {
        return -1;
    }
    int32_t taggedLength = taggedMsgs.size();
    if (destLength > 0) {
        strlcpy(dest, taggedMsgs.c_str(), destLength);
    }
    if (*pNumParts > 0) {
        strlcpy(partTypes, types.c_str(), *pNumParts);
        // copy only as many part lengths as both the caller's array and lens can hold
        int32_t copyParts = (*pNumParts < (int32_t)lens.size()) ? *pNumParts : (int32_t)lens.size();
        for(int32_t i=0; i < copyParts; i++) {
            partLengths[i] = lens[i];
        }
    }
    *pNumParts = types.length();
    return taggedLength;
}

/**
 * If tmpl is
 * * an empty string, then dump the full loaded chaton-meta
 * * a chaton-template-id, then dump the contents related to that specific chat-handshake-template-standard
 */
inline bool _chaton_meta_dump(std::string &tmpl) {
    json theJson;
    if (tmpl.empty()) {
        theJson = conMeta;
    } else {
        if (!conMeta.contains(tmpl)) {
            LOGXLN("ERRR:%s:Specified template-id [%s] not found", __func__, tmpl.c_str());
            return false;
        }
        theJson = conMeta[tmpl];
    }
    LOGXLN("\n\nINFO:%s:ChatOn Meta:%s:\n%s", __func__, tmpl.c_str(), theJson.dump(4).c_str());
    if (!tmpl.empty()) {
        std::string globalBegin = conMeta[tmpl][K_GLOBAL][K_BEGIN];
        std::string globalEnd = conMeta[tmpl][K_GLOBAL][K_END];
        std::string systemBegin = conMeta[tmpl][K_SYSTEM][K_BEGIN];
        std::string systemPrefix = conMeta[tmpl][K_SYSTEM][K_PREFIX];
        std::string systemSuffix = conMeta[tmpl][K_SYSTEM][K_SUFFIX];
        std::string systemEnd = conMeta[tmpl][K_SYSTEM][K_END];
        std::string userBegin = conMeta[tmpl][K_USER][K_BEGIN];
        std::string userPrefix = conMeta[tmpl][K_USER][K_PREFIX];
        std::string userSuffix = conMeta[tmpl][K_USER][K_SUFFIX];
        std::string userEnd = conMeta[tmpl][K_USER][K_END];
        std::string assistantBegin = conMeta[tmpl][K_ASSISTANT][K_BEGIN];
        std::string assistantPrefix = conMeta[tmpl][K_ASSISTANT][K_PREFIX];
        std::string assistantSuffix = conMeta[tmpl][K_ASSISTANT][K_SUFFIX];
        std::string assistantEnd = conMeta[tmpl][K_ASSISTANT][K_END];

        LOGXLN("INFO:%s:%s:%s", __func__, "global->begin", globalBegin.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "global->end", globalEnd.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "system->begin", systemBegin.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "system->prefix", systemPrefix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "system->suffix", systemSuffix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "system->end", systemEnd.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "user->begin", userBegin.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "user->prefix", userPrefix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "user->suffix", userSuffix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "user->end", userEnd.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "assistant->begin", assistantBegin.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "assistant->prefix", assistantPrefix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "assistant->suffix", assistantSuffix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "assistant->end", assistantEnd.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, K_REVERSE_PROMPT, chaton_tmpl_kv(tmpl, K_REVERSE_PROMPT).c_str());
        LOGXLN("INFO:%s:%s:%d", __func__, K_SYSTEMUSER_SYSTEM_HAS_SUFFIX, chaton_tmpl_kv_bool(tmpl, K_SYSTEMUSER_SYSTEM_HAS_SUFFIX));
        LOGXLN("INFO:%s:%s:%d", __func__, K_SYSTEMUSER_SYSTEM_HAS_END, chaton_tmpl_kv_bool(tmpl, K_SYSTEMUSER_SYSTEM_HAS_END));
        LOGXLN("INFO:%s:%s:%d", __func__, K_SYSTEMUSER_1ST_USER_HAS_BEGIN, chaton_tmpl_kv_bool(tmpl, K_SYSTEMUSER_1ST_USER_HAS_BEGIN));
        LOGXLN("INFO:%s:%s:%d", __func__, K_SYSTEMUSER_1ST_USER_HAS_PREFIX, chaton_tmpl_kv_bool(tmpl, K_SYSTEMUSER_1ST_USER_HAS_PREFIX));

        if (!userEnd.empty()) {
            LOG_TEELN("WARN:%s:User->End seems to be set to [%s], do cross check if this is proper and needed", __func__, userEnd.c_str());
        }
        if (!assistantBegin.empty()) {
            LOG_TEELN("WARN:%s:Assistant->Begin seems to be set to [%s], do cross check if this is proper and needed", __func__, assistantBegin.c_str());
        }
    }
    return true;
}

/**
 * Check that a meta-json file has been loaded.
 * Verify that the specified chaton-template-id contains the required fields in the meta-json, using meta-dump.
 */
inline bool chaton_meta_ok(std::string &tmpl) {
    if (conMeta == nullptr) {
        LOG_TEELN("ERRR:%s:%s:ChatOn Meta: Not loaded yet...", __func__, tmpl.c_str());
        return false;
    }
    return _chaton_meta_dump(tmpl);
}