#pragma once

/**
 *
 * Generic tagging logic + text config file based chat templates handling
 * by Humans for All
 *
 * ## Overview
 *
 * Helps chat with models, by tagging chat messages based on the specified
 * chat-handshake-template-standard. This uses generic tagging code driven
 * by a json meta data file, which specifies the handshake template details.
 *
 * This can be used by
 *
 * * main, to build on its existing interactive flow and its in-prefix, in-suffix
 *   and antiprompt/reverse-prompt
 *
 * * server, by replacing its existing llama_chat_apply_template with the
 *   equivalent helper here.
 *
 *
 * ## The common pattern
 *
 * As a convention, the tagging used by LLMs to differentiate between the
 * different parts when chatting with them normally follows a general pattern of
 *
 * * <BeginOfSentenceTag> <RolePrefixTag> the-actual-message <RoleSuffixTag> <EndOfSentenceTag>
 *
 * * The Roles could include System, User and Assistant (i.e. the Model)
 *
 * * A chat normally consists of
 *
 *   * a System message/prompt followed by
 *
 *   * multiple user message/query - model message/response pairs
 *
 * The different models will normally have all or some subset of the tagging mentioned above.
 *
 * You may also notice some common patterns like
 *
 * * Because a user message is normally followed by a model/assistant response, in most models
 *
 *   * user messages won't have an EndOfSentenceTag and
 *
 *   * the following model response won't have a BeginOfSentenceTag
 *
 * * Because a system message will normally be immediately followed by a user query,
 *
 *   * in many models, there won't be an EndOfSentenceTag following the system message, nor a
 *     BeginOfSentenceTag wrt the 1st user message following the system message.
 *
 *   * in some models there won't even be a RoleSuffixTag following the system message,
 *     nor a RolePrefixTag wrt the 1st user message following the system message.
 *
 *   * however in many of these models, the subsequent user messages will have the
 *     BeginOfSentenceTag and/or RolePrefixTag.
 *
 * * Some models may require a BoS for a group of messages, independent of BoS (if any)
 *   wrt individual roles.
 *
 *
 * ## The Strategy
 *
 * The template meta data json file allows the user to specify the above mentioned tags wrt
 * each of the Roles, as well as any global tag for a group of messages. Depending on whether
 * a given model uses/needs a given tag or not, you either specify the required tag or else
 * you specify an empty string.
 *
 * A tag could be a single word or multiple words, and may include a newline char specified
 * using \n and so on. The tag is always demarcated using double quotes and thus also allows
 * spaces at the beginning or end of the tag, if needed.
 *
 * In order to account for the conditionality of tags between the system message and the 1st
 * user message, flags are provided to explicitly control whether each of these possible tags
 * is used by a specific model or not, as part of its template info.
 *
 * The Roles are identified in the json file using "system", "user" and "assistant". However
 * the model may use different words to identify these roles, in which case set up RolePrefix
 * and/or RoleSuffix appropriately.
 *
 * To identify that the model is finished with generating its response to a user query, depending
 * on the model's handshake template standard, one will need to set the reverse-prompt to either
 * the assistant's suffix or end tag, or to the user's begin or prefix tag, depending on what
 * is generated by the model at the end of its response.
 *
 * Currently, flags for trimming wrt user text (be it wrt the system or user role) are not provided.
 *
 *
 * ## The JSON File
 *
 * It can contain the template info wrt multiple models/handshake-standards. In turn, each
 * unique template is identified by a unique template id string.
 *
 * The fields that make up a given chat-handshake-template-standard include
 *
 * * global -> begin & end
 *
 * * system -> begin, prefix, suffix & end
 *
 * * user -> begin, prefix, suffix & end
 *
 * * assistant -> begin, prefix, suffix & end
 *
 * * reverse-prompt
 *
 * * systemuser-system-has-suffix, systemuser-system-has-end,
 *   systemuser-1st-user-has-begin and systemuser-1st-user-has-prefix
 *
 *
 * ## Usage
 *
 * One needs to load the json file containing the template meta data and in turn call the
 * other helper functions as needed.
 *
 * In turn one can use the helper functions to either extract a given tag, or to apply all
 * tags specified wrt a given role to the passed message, or to apply tags as needed for
 * a bunch of messages in one go.
 *
 * The individual message tagging helper will apply all tags specified wrt that role.
 *
 * The multiple messages tagging helper chaton-tmpl-apply will look at the boolean flags
 * when tagging the passed messages. Here the system suffix, system end, user begin and
 * user prefix get included only if the corresponding flag is set.
 *
 * Both the single and multi messages tagging helpers provide two versions:
 *
 * * one which returns a single string which contains the tagged message(s)
 *
 * * one which returns
 *   * [tagged msg] the string containing the tagged message(s)
 *   * [parts lengths] an array of integers, which specifies the part lengths
 *     that divide the returned string into parts.
 *   * [parts types] a string where each character indicates whether the corresponding
 *     part is a normal part which needs to be tokenized without parse_special,
 *     or is a special part which needs to be tokenized with parse_special.
 *
 *
 * ## example/main
 *
 * The interactive commandline program under example/main uses
 *
 * * the system role related tags to tag the system prompt
 *   * the system prompt includes contents of -p if any,
 *     followed by contents of the file specified using -f if any
 * * the user begin+prefix to map to in-prefix
 * * the user suffix+end to map to in-suffix
 * * the reverse-prompt to map to antiprompt
 * * wrt tokenization
 *   * the user specified system prompt is tokenized with the parse_special flag.
 *   * however the user messages are tokenized without the parse_special flag.
 *
 * Currently main doesn't use chaton-tmpl-apply, but only
 *
 * * chaton-tmpl-apply-single (for the system prompt) and
 *
 * * chaton-tmpl-role-kv, which maps the user prefix, suffix and reverse-prompt
 *   to in-prefix, in-suffix and antiprompt of main.
 *
 * These always add any role specific begin+prefix and suffix+end around
 * the passed message.
 *
 *
 * ## other uses, be it wrt llama.cpp-as-library or examples/server or ...
 *
 * This module exposes a c-api which is equivalent to the current hardcoded
 * templating logic's llama_chat_apply_template. So any program using llama.cpp's
 * chat templating logic can easily be migrated to make use of this generic code
 * with its text config file driven flow.
 *
 * If a program doesn't want to bring in a json dependency into its project,
 * there is also common/simpcfg.hpp, which provides a simple text based config
 * file format, along with the corresponding parser for the same. This module can be
 * modified to work with simpcfg easily, if needed.
 *
 *
 * ## Adding support for a new model / chat-handshake-template-standard
 *
 * 1. Add suitable entries in the json for that model/standard.
 *    This in itself should work for most of the models.
 *
 * 2. If some new model introduces a totally different kind of chat-templating
 *    tag inter/intra mixing, try to reuse and update the generic flow in
 *    chaton-tmpl-apply, as much as possible, before trying to add any custom logic.
 *
 *    If you update the generic flow, cross check whether existing json files will
 *    need to be updated or not.
 *
 *
 * ## Notes
 *
 * Look at the sample chaton_meta.json in the examples folder for how the above may apply to
 * the different llm's out there, like
 *
 * * llama2, llama3, gemma, zephyr, deepseek(normal and coder), monarch, mistral, phi3
 *
 * * chatml, command-r, orion, openchat, vicuna
 *
 */

#include <string>
#include <vector>
#include <sstream>
#include <fstream>
#include <cstring>
#include <algorithm>
#include <stdexcept>
#include <typeinfo>

#include "log.h"
#include "llama.h"
#include "groupkv.hpp"  // GroupKV / GroupKVMapMapVariant, used as the base of ChatTemplates

#define LOGXLN LOG_TEELN

const auto K_SYSTEM = "system";
const auto K_USER = "user";
const auto K_ASSISTANT = "assistant";
const auto K_PREFIX = "prefix";
const auto K_SUFFIX = "suffix";
const auto K_BEGIN = "begin";
const auto K_END = "end";
const auto K_GLOBAL = "global";
const auto K_SYSTEMUSER_SYSTEM_HAS_SUFFIX = "systemuser-system-has-suffix";
const auto K_SYSTEMUSER_SYSTEM_HAS_END = "systemuser-system-has-end";
const auto K_SYSTEMUSER_1ST_USER_HAS_BEGIN = "systemuser-1st-user-has-begin";
const auto K_SYSTEMUSER_1ST_USER_HAS_PREFIX = "systemuser-1st-user-has-prefix";
const auto K_REVERSE_PROMPT = "reverse-prompt";


#define CHATON_JSON

#ifdef CHATON_JSON
#include "json.hpp"
using json = nlohmann::ordered_json;
#endif


/**
 * Helps keep the user prompt and chat-hs-template tag parts separate, but in sequence.
 * In turn this gives the flexibility to tokenize with or without the parse_special flag,
 * wrt the different parts of the chat msg(s).
 * One could use the triplet of str, get_partstypes and get_partslens to achieve the
 * above mentioned flexibility.
 */
class ChatParts {

    std::vector<std::string> parts = {};
    std::string types = {""};

public:
    // Identify string with special tokens that need to be processed.
    static constexpr char S = 's';
    // Identify string which shouldn't have special token processing done.
    static constexpr char N = 'n';
    // Identify no string condition and or ignore string.
    static constexpr char X = '?';

    ChatParts() : parts{}, types{""} {}

    char last_type() {
        if (types.length() == 0) {
            return ChatParts::X;
        }
        return types[types.length()-1];
    }

    void add_part(char type, const std::string &part) {
        if (last_type() == type) {
            parts[parts.size()-1] += part;
        } else {
            parts.emplace_back(part);
            types += type;
        }
    }

    std::string str() {
        std::string allin = "";
        for(auto part: parts) {
            allin += part;
        }
        return allin;
    }

    std::string get_partstypes() {
        return types;
    }

    std::vector<int32_t> get_partslens() {
        std::vector<int32_t> lens = {};
        for(auto part: parts) {
            lens.push_back(part.length());
        }
        return lens;
    }

    std::string name() {
        return typeid(*this).name();
    }

    void dump() {
        std::string me = name() + ":" + __func__;
        LOGXLN("INFO:%s:NumTypes:%zu", me.c_str(), types.length());
        LOGXLN("INFO:%s:NumParts:%zu", me.c_str(), parts.size());
        LOGXLN("INFO:%s:StrLength:%zu", me.c_str(), str().length());
        if (parts.size() != types.length()) {
            LOG_TEELN("DBUG:%s:Mismatch between parts and types", me.c_str());
        }
        int i = 0;
        for(auto part: parts) {
            LOGXLN("INFO:%s:%c:%s", me.c_str(), types[i], part.c_str());
            i += 1;
        }
    }

};

class ChatTemplates : public GroupKV {

public:

    ChatTemplates(GroupKVMapMapVariant defaultMap) : GroupKV(defaultMap) {}

    /**
     * Check if the specified chat-template exists or not.
     * NOTE: This doesn't cross check whether the template in turn contains all the required fields or not.
     */
    bool tmpl_exists(const std::string &tmpl) {
        if (!group_exists(tmpl)) {
            LOG_TEELN("WARN:CT:%s: tmpl[%s] not found...", __func__, tmpl.c_str());
            return false;
        }
        return true;
    }

    /**
     * Check if all expected keys/fields are present wrt the specified chat-template.
     * If any key/field is missing, expect an exception.
     */
    bool tmpl_basiccheck(const std::string &tmpl) {

        std::string globalBegin = get_value<std::string>(tmpl, { K_GLOBAL, K_BEGIN });
        std::string globalEnd = get_value<std::string>(tmpl, { K_GLOBAL, K_END });

        std::string systemBegin = get_value<std::string>(tmpl, { K_SYSTEM, K_BEGIN });
        std::string systemPrefix = get_value<std::string>(tmpl, { K_SYSTEM, K_PREFIX });
        std::string systemSuffix = get_value<std::string>(tmpl, { K_SYSTEM, K_SUFFIX });
        std::string systemEnd = get_value<std::string>(tmpl, { K_SYSTEM, K_END });

        std::string userBegin = get_value<std::string>(tmpl, { K_USER, K_BEGIN });
        std::string userPrefix = get_value<std::string>(tmpl, { K_USER, K_PREFIX });
        std::string userSuffix = get_value<std::string>(tmpl, { K_USER, K_SUFFIX });
        std::string userEnd = get_value<std::string>(tmpl, { K_USER, K_END });

        std::string assistantBegin = get_value<std::string>(tmpl, { K_ASSISTANT, K_BEGIN });
        std::string assistantPrefix = get_value<std::string>(tmpl, { K_ASSISTANT, K_PREFIX });
        std::string assistantSuffix = get_value<std::string>(tmpl, { K_ASSISTANT, K_SUFFIX });
        std::string assistantEnd = get_value<std::string>(tmpl, { K_ASSISTANT, K_END });

        std::string reversePrompt = get_value<std::string>(tmpl, { K_REVERSE_PROMPT });

        bool systemHasSuffix = get_value<bool>(tmpl, { K_SYSTEMUSER_SYSTEM_HAS_SUFFIX });
        bool systemHasEnd = get_value<bool>(tmpl, { K_SYSTEMUSER_SYSTEM_HAS_END });
        bool userHasBegin = get_value<bool>(tmpl, { K_SYSTEMUSER_1ST_USER_HAS_BEGIN });
        bool userHasPrefix = get_value<bool>(tmpl, { K_SYSTEMUSER_1ST_USER_HAS_PREFIX });

        LOGXLN("INFO:%s:%s:%s", __func__, "global-begin", globalBegin.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "global-end", globalEnd.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "system-begin", systemBegin.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "system-prefix", systemPrefix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "system-suffix", systemSuffix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "system-end", systemEnd.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "user-begin", userBegin.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "user-prefix", userPrefix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "user-suffix",
                userSuffix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "user-end", userEnd.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "assistant-begin", assistantBegin.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "assistant-prefix", assistantPrefix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "assistant-suffix", assistantSuffix.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, "assistant-end", assistantEnd.c_str());
        LOGXLN("INFO:%s:%s:%s", __func__, K_REVERSE_PROMPT, reversePrompt.c_str());
        LOGXLN("INFO:%s:%s:%d", __func__, K_SYSTEMUSER_SYSTEM_HAS_SUFFIX, systemHasSuffix);
        LOGXLN("INFO:%s:%s:%d", __func__, K_SYSTEMUSER_SYSTEM_HAS_END, systemHasEnd);
        LOGXLN("INFO:%s:%s:%d", __func__, K_SYSTEMUSER_1ST_USER_HAS_BEGIN, userHasBegin);
        LOGXLN("INFO:%s:%s:%d", __func__, K_SYSTEMUSER_1ST_USER_HAS_PREFIX, userHasPrefix);

        if (!userEnd.empty()) {
            LOG_TEELN("WARN:%s:User-End seems to be set to [%s], do cross check if this is proper and needed", __func__, userEnd.c_str());
        }
        if (!assistantBegin.empty()) {
            LOG_TEELN("WARN:%s:Assistant-Begin seems to be set to [%s], do cross check if this is proper and needed", __func__, assistantBegin.c_str());
        }
        return true;
    }

    /**
     * For the specified chat-template, get the value associated with the specified key/field.
     */
    template<typename SupportedDataType>
    SupportedDataType tmpl_getkey(const std::string &tmpl, const std::string &key, const SupportedDataType &defaultValue) {
        return get_value(tmpl, {key}, defaultValue, "CTTmplGetKey");
    }

    /**
     * For the specified chat-template and the role within, cumulate the values of the specified keys/fields
     * and return the same.
     */
    std::string tmpl_role_getkeys(const std::string &tmpl, const std::string &role, const std::vector<std::string> &keys) {
        std::string got = "";
        std::string sKeys = "";
        for(auto key: keys) {
            got += get_value<std::string>(tmpl, {role, key}, "", "CTTmplRoleGetKeys");
            sKeys += "+";
            sKeys += key;
        }
        LDBUG_LN("DBUG:CT:%s:%s:%s:%s:%s", __func__, tmpl.c_str(), role.c_str(), sKeys.c_str(), got.c_str());
        return got;
    }

};

#include "chaton_meta.hpp"
//ChatTemplates gCT = {{}};

#ifdef CHATON_JSON

template<typename SupportedType>
inline SupportedType json_get(json &j, const std::vector<std::string> &keys, const std::string &msgTag) {
    json curJ = j;
    std::stringstream skey;
    int i = 0;
    for(auto key: keys) {
        if (i != 0) skey << "-";
        i += 1;
        skey << key;
        if (curJ.contains(key)) {
            curJ = curJ[key];
        } else {
            std::stringstream ss;
            ss << "ERRR:ChatON:" << __func__ << ":" << msgTag << ":KeyChain [" << skey.str() << "] is missing";
            throw std::runtime_error(ss.str());
        }
    }
    return curJ;
}

inline bool chaton_meta_load(const std::string &fname) {
    std::ifstream f(fname);
    json conMeta = json::parse(f);
    for(auto it=conMeta.begin(); it != conMeta.end(); ++it) {
        auto group = it.key();
        auto curTmpl = conMeta[group];

        std::string globalBegin = json_get<std::string>(curTmpl, { K_GLOBAL, K_BEGIN }, group);
        gCT.set_value(group, { K_GLOBAL, K_BEGIN }, globalBegin);
        std::string globalEnd = json_get<std::string>(curTmpl, { K_GLOBAL, K_END }, group);
        gCT.set_value(group, { K_GLOBAL, K_END }, globalEnd);

        std::string systemBegin = json_get<std::string>(curTmpl, { K_SYSTEM, K_BEGIN }, group);
        gCT.set_value(group, { K_SYSTEM, K_BEGIN }, systemBegin);
        std::string systemPrefix = json_get<std::string>(curTmpl, { K_SYSTEM, K_PREFIX }, group);
        gCT.set_value(group, { K_SYSTEM, K_PREFIX }, systemPrefix);
        std::string systemSuffix = json_get<std::string>(curTmpl, { K_SYSTEM, K_SUFFIX }, group);
        gCT.set_value(group, { K_SYSTEM, K_SUFFIX }, systemSuffix);
        std::string systemEnd = json_get<std::string>(curTmpl, { K_SYSTEM, K_END }, group);
        gCT.set_value(group, { K_SYSTEM, K_END }, systemEnd);

        std::string userBegin = json_get<std::string>(curTmpl, { K_USER, K_BEGIN }, group);
        gCT.set_value(group, { K_USER, K_BEGIN }, userBegin);
        std::string userPrefix = json_get<std::string>(curTmpl, { K_USER, K_PREFIX }, group);
        gCT.set_value(group, { K_USER, K_PREFIX }, userPrefix);
        std::string userSuffix = json_get<std::string>(curTmpl, { K_USER, K_SUFFIX }, group);
        gCT.set_value(group, { K_USER, K_SUFFIX }, userSuffix);
        std::string userEnd = json_get<std::string>(curTmpl, { K_USER, K_END }, group);
        gCT.set_value(group, { K_USER, K_END }, userEnd);

        std::string assistantBegin = json_get<std::string>(curTmpl, { K_ASSISTANT, K_BEGIN }, group);
        gCT.set_value(group, { K_ASSISTANT, K_BEGIN }, assistantBegin);
        std::string assistantPrefix = json_get<std::string>(curTmpl, { K_ASSISTANT, K_PREFIX }, group);
        gCT.set_value(group, { K_ASSISTANT, K_PREFIX }, assistantPrefix);
        std::string assistantSuffix = json_get<std::string>(curTmpl, { K_ASSISTANT, K_SUFFIX }, group);
        gCT.set_value(group, { K_ASSISTANT, K_SUFFIX }, assistantSuffix);
        std::string assistantEnd = json_get<std::string>(curTmpl, { K_ASSISTANT, K_END }, group);
        gCT.set_value(group, { K_ASSISTANT, K_END }, assistantEnd);

        std::string reversePrompt = json_get<std::string>(curTmpl, { K_REVERSE_PROMPT }, group);
        gCT.set_value(group, { K_REVERSE_PROMPT }, reversePrompt);

        bool systemHasSuffix = json_get<bool>(curTmpl, { K_SYSTEMUSER_SYSTEM_HAS_SUFFIX }, group);
        gCT.set_value(group, { K_SYSTEMUSER_SYSTEM_HAS_SUFFIX }, systemHasSuffix);
        bool systemHasEnd = json_get<bool>(curTmpl, { K_SYSTEMUSER_SYSTEM_HAS_END }, group);
        gCT.set_value(group, { K_SYSTEMUSER_SYSTEM_HAS_END }, systemHasEnd);
        bool userHasBegin = json_get<bool>(curTmpl, { K_SYSTEMUSER_1ST_USER_HAS_BEGIN }, group);
        gCT.set_value(group, { K_SYSTEMUSER_1ST_USER_HAS_BEGIN }, userHasBegin);
        bool userHasPrefix = json_get<bool>(curTmpl, { K_SYSTEMUSER_1ST_USER_HAS_PREFIX }, group);
        gCT.set_value(group, { K_SYSTEMUSER_1ST_USER_HAS_PREFIX }, userHasPrefix);
    }
    LOGXLN("%s", gCT.dump("", "DBUG:ChatONMetaLoad:ChatTemplates:").c_str());
    return true;
}

#endif

inline bool chaton_tmpl_exists(const std::string &tmpl) {
    return gCT.tmpl_exists(tmpl);
}

inline std::string
chaton_tmpl_role_getkeys(const std::string &tmpl, const std::string &role, const std::vector<std::string> &keys) {
    return gCT.tmpl_role_getkeys(tmpl, role, keys);
}

inline std::string chaton_tmpl_getkey_str(const std::string &tmpl, const std::string &key) {
    return gCT.tmpl_getkey<std::string>(tmpl, key, "");
}

inline bool chaton_tmpl_getkey_bool(const std::string &tmpl, const std::string &key) {
    return gCT.tmpl_getkey<bool>(tmpl, key, false);
}

// Given the template standard, role and a message, this returns
// a tagged message, types string and lens vector wrt the parts that make up the returned string
//
// * a string containing the tagged message
//   * role-(begin+prefix) + msg + role-(suffix+end)
// * a string where the chars contain info about the
//   type of sub-strings/parts that make up the tagged message.
// * a vector of ints, which give the length of each part in the tagged message.
inline bool chaton_tmpl_apply_single_ex(
        const std::string &tmpl,
        const std::string &role,
        const std::string &content,
        std::string &tagged,
        std::string &types,
        std::vector<int32_t> &lens
        ) {
    if (!chaton_tmpl_exists(tmpl)) {
        return false;
    }
    ChatParts cp = {};
    std::string beginPrefix = chaton_tmpl_role_getkeys(tmpl, role, {K_BEGIN, K_PREFIX});
    std::string suffixEnd = chaton_tmpl_role_getkeys(tmpl, role, {K_SUFFIX, K_END});
    cp.add_part(ChatParts::S, beginPrefix);
    cp.add_part(ChatParts::N, content);
    cp.add_part(ChatParts::S, suffixEnd);
    cp.dump();
    tagged = cp.str();
    LOGLN("DBUG:%s:%s:%s:%s", __func__, tmpl.c_str(), role.c_str(), tagged.c_str());
    types = cp.get_partstypes();
    lens = cp.get_partslens();
    return true;
}

// Given the template standard, role and a message, this returns the tagged message.
//
// * a string containing the tagged message
//   * role-(begin+prefix) + msg + role-(suffix+end)
inline int32_t chaton_tmpl_apply_single(
        const std::string &tmpl,
        const std::string &role,
        const std::string &content,
        std::string &tagged
        ) {
    std::string types;
    std::vector<int32_t> lens;
    if (!chaton_tmpl_apply_single_ex(tmpl, role, content, tagged, types, lens)) {
        return -1;
    }
    return tagged.size();
}

/**
 * Apply chat-handshake-template for the specified template standard and role.
 * If the passed char array is smaller than that required for the tagged message,
 * * the part of the tagged message which fits within the dest buffer is copied
 * * the returned value indicates the size of the actual tagged message
 * NOTE:
 * * ideally the passed char array should be able to fit the tagged message + 0|null char.
 * * if the return value from this function is larger than or equal to destLength,
 *   then you will have to increase the size of the dest buffer, and call this
 *   function a second time, to ensure that one gets the full tagged message.
 */
inline int32_t chat_tmpl_apply_single_capi(
        const char *tmpl,
        const char *role,
        const char *content,
        char *dest,
        const size_t destLength
        ) {
    std::string tagged;
    auto taggedLength = chaton_tmpl_apply_single(tmpl, role, content, tagged);
    if (taggedLength <= 0) {
        return taggedLength;
    }
    if (dest && (destLength > 0)) {
        strlcpy(dest, tagged.c_str(), destLength);
    }
    return taggedLength;
}

// Given the template standard and a bunch of messages including their roles, this returns
// tagged messages, a types string and a lens vector. The returned types string and lens vector
// help identify the parts of the tagged msgs string, which relate to the passed msgs and added tags.
//
// * a string containing the tagged messages
//   * global-begin + 1 or more [[role-begin] + [role-prefix] + msg + [role-suffix] + [role-end]] + global-end
// * a string where the chars contain info about the
//   type of sub-strings/parts that make up the tagged messages string.
// * a vector of ints, which give the length of each part in the tagged messages string.
//
// If a combination of system-user messages is passed, then the tags between the system
// and the 1st user message are based on the flags set wrt the corresponding template standard.
inline bool chaton_tmpl_apply_ex(
        const std::string &tmpl,
        const std::vector<const llama_chat_message *> &msgs,
        bool alertAssistantAtEnd,
        std::string &tagged,
        std::string &types,
        std::vector<int32_t> &lens
        ) {
    if (!chaton_tmpl_exists(tmpl)) {
        return false;
    }
    ChatParts cp = {};
    std::string globalBegin = chaton_tmpl_role_getkeys(tmpl, K_GLOBAL, {K_BEGIN});
    cp.add_part(ChatParts::S, globalBegin);
    int cntSystem = 0;
    int cntUser = 0;
    int cntOthers = 0;
    for(const auto msg: msgs) {
        std::string role = msg->role; // msg->role is a c-string; use std::string for the comparisons below
        auto content = msg->content;
        std::string begin = chaton_tmpl_role_getkeys(tmpl, role, {K_BEGIN});
        auto prefix = chaton_tmpl_role_getkeys(tmpl, role, {K_PREFIX});
        auto suffix = chaton_tmpl_role_getkeys(tmpl, role, {K_SUFFIX});
        auto end = chaton_tmpl_role_getkeys(tmpl, role, {K_END});
        if (role == K_SYSTEM) {
            cntSystem += 1;
            cp.add_part(ChatParts::S, begin);
            cp.add_part(ChatParts::S, prefix);
        } else if (role == K_USER) {
            cntUser += 1;
            if ((cntSystem == 1) && (cntUser == 1)) {
                if (chaton_tmpl_getkey_bool(tmpl, K_SYSTEMUSER_1ST_USER_HAS_BEGIN)) {
                    cp.add_part(ChatParts::S, begin);
                }
                if (chaton_tmpl_getkey_bool(tmpl, K_SYSTEMUSER_1ST_USER_HAS_PREFIX)) {
                    cp.add_part(ChatParts::S, prefix);
                }
            } else {
                cp.add_part(ChatParts::S, begin);
                cp.add_part(ChatParts::S, prefix);
            }
        } else {
            cntOthers += 1;
            cp.add_part(ChatParts::S, begin);
            cp.add_part(ChatParts::S, prefix);
        }
        cp.add_part(ChatParts::N, content);
        if (role == K_SYSTEM) {
            if (chaton_tmpl_getkey_bool(tmpl, K_SYSTEMUSER_SYSTEM_HAS_SUFFIX)) {
                cp.add_part(ChatParts::S, suffix);
            }
            if (chaton_tmpl_getkey_bool(tmpl, K_SYSTEMUSER_SYSTEM_HAS_END)) {
                cp.add_part(ChatParts::S, end);
            }
        } else {
            cp.add_part(ChatParts::S, suffix);
            cp.add_part(ChatParts::S, end);
        }
    }
    if (alertAssistantAtEnd) {
        auto
             assistantBeginPrefix = chaton_tmpl_role_getkeys(tmpl, K_ASSISTANT, {K_BEGIN, K_PREFIX});
        cp.add_part(ChatParts::S, assistantBeginPrefix);
    }
    auto globalEnd = chaton_tmpl_role_getkeys(tmpl, K_GLOBAL, {K_END});
    cp.add_part(ChatParts::S, globalEnd);
    cp.dump();
    tagged = cp.str();
    LOGLN("DBUG:%s:%s:%s", __func__, tmpl.c_str(), tagged.c_str());
    LOGLN("DBUG:%s:%s:CntSys[%d]:CntUsr[%d]:CntOthers[%d]", __func__, tmpl.c_str(), cntSystem, cntUser, cntOthers);
    types = cp.get_partstypes();
    lens = cp.get_partslens();
    return true;
}

// Given the template standard and a bunch of messages including their roles, this returns
// the tagged messages as a string.
// global-begin + 1 or more [[role-begin] + [role-prefix] + msg + [role-suffix] + [role-end]] + global-end
inline int32_t chaton_tmpl_apply(
        const std::string &tmpl,
        const std::vector<const llama_chat_message *> &msgs,
        bool alertAssistantAtEnd,
        std::string &tagged
        ) {
    std::string types;
    std::vector<int32_t> lens;
    if (!chaton_tmpl_apply_ex(tmpl, msgs, alertAssistantAtEnd, tagged, types, lens)) {
        return -1;
    }
    return tagged.size();
}

// Given the template standard and a bunch of messages including their roles, this returns
// the tagged messages as a string.
// global-begin + 1 or more [[role-begin] + [role-prefix] + msg + [role-suffix] + [role-end]] + global-end
//
// If the passed char array is smaller than that required for the tagged messages string,
// * the part of the tagged messages string which fits within the dest buffer is copied
// * the returned value indicates the size of the actual tagged messages string
//
// NOTE:
// * ideally the passed char array should be able to fit the tagged messages string + 0|null char.
// * if the return value from this function is larger than or equal to destLength,
//   then you will have to increase the size of the dest buffer, and call this
//   function a second time, to ensure that one gets the full tagged messages string.
inline int32_t chaton_tmpl_apply_capi(
        const char *tmpl,
        const struct llama_chat_message *msgs,
        const size_t numMsgs,
        bool alertAssistantAtEnd,
        char *dest,
        int32_t destLength
        ) {
    if ((tmpl == nullptr) || (dest == nullptr)) {
        return -1;
    }
    std::vector<const llama_chat_message *> vMsgs;
    for(size_t i=0; i < numMsgs; i++) {
        vMsgs.push_back(&msgs[i]);
    }
    std::string taggedMsgs;
    int32_t taggedLength = chaton_tmpl_apply(tmpl, vMsgs, alertAssistantAtEnd, taggedMsgs);
    if (taggedLength <= 0) {
        return taggedLength;
    }
    if (destLength > 0) {
        strlcpy(dest, taggedMsgs.c_str(), destLength);
    }
    return taggedLength;
}

//
// In addition to the semantic provided by chaton_tmpl_apply_capi,
// this additionally also returns info about the parts that make up
// the returned tagged message.
//
// partsTypes and partsLengths should be arrays that can accommodate the
// same number of elements belonging to their respective type.
// In turn, pNumParts should point to an int which specifies the
// number of elements.
// If the generated tagged message has more parts than the specified
// *pNumParts, then the logic copies partsTypes and partsLengths to the
// specified length/NumOfParts only. In parallel, it updates *pNumParts
// to the actual needed length (not including any terminating null char or so).
//
inline int32_t chaton_tmpl_apply_ex_capi(
        const char *tmpl,
        const struct llama_chat_message *msgs,
        const size_t numMsgs,
        bool alertAssistantAtEnd,
        char *dest,
        int32_t destLength,
        char *partsTypes,
        int32_t *partsLengths,
        int32_t *pNumParts
        ) {
    if ((tmpl == nullptr) || (dest == nullptr) || (pNumParts == nullptr)) {
        return -1;
    }
    std::vector<const llama_chat_message *> vMsgs;
    for(size_t i=0; i < numMsgs; i++) {
        vMsgs.push_back(&msgs[i]);
    }
    std::string taggedMsgs;
    std::string types;
    std::vector<int32_t> lens;
    if (!chaton_tmpl_apply_ex(tmpl, vMsgs, alertAssistantAtEnd, taggedMsgs, types, lens)) {
        return -1;
    }
    int32_t taggedLength = taggedMsgs.size();
    if (taggedLength < 0) {
        return taggedLength;
    }
    if (destLength > 0) {
        strlcpy(dest, taggedMsgs.c_str(), destLength);
    }
    if (*pNumParts > 0) {
        if (partsTypes != nullptr) {
            strlcpy(partsTypes, types.c_str(), *pNumParts);
        }
        if (partsLengths != nullptr) {
            // copy only as many lengths as both the dest array and lens actually have
            size_t cnt = std::min((size_t)*pNumParts, lens.size());
            memcpy(partsLengths, lens.data(), cnt*sizeof(int32_t));
        }
    }
    *pNumParts = types.length();
    return taggedLength;
}

// Copied from common.cpp
inline std::vector<llama_token> chaton_llama_tokenize(
        const struct llama_model * model,
        const std::string & text,
        bool add_special,
        bool parse_special) {
    LOGLN("DBUG:%s:%s:special[add:%d, parse:%d]", __func__, text.c_str(), add_special, parse_special);
    if (model == nullptr) {
        LOG_TEELN("ERRR:%s:Model NOT Provided:%s:special[add:%d, parse:%d]", __func__, text.c_str(), add_special, parse_special);
        return std::vector<llama_token>{};
    }
    // upper limit for the number of tokens
    int n_tokens = text.length() + 2 * add_special;
    std::vector<llama_token> result(n_tokens);
    n_tokens = llama_tokenize(model, text.data(), text.length(), result.data(), result.size(), add_special, parse_special);
    if (n_tokens < 0) {
        result.resize(-n_tokens);
        int check = llama_tokenize(model, text.data(), text.length(), result.data(), result.size(), add_special, parse_special);
        GGML_ASSERT(check == -n_tokens);
    } else {
        result.resize(n_tokens);
    }
    return result;
}

// Tokenize the passed taggedText, keeping in mind the subparts within and
// in turn whether to parse special tokens in them or not (partsTypes).
// If you want to parse special tokens in the taggedText, independent of what
// partsTypes specifies, then set forceParseSpecial to true.
inline std::vector<llama_token> chaton_llama_tokenize_ex(
        const struct llama_model *model,
        const std::string &taggedText,
        const std::string &partsTypes,
        const std::vector<int32_t> &partsLengths,
        bool addSpecial,
        bool forceParseSpecial
        ) {
    std::vector<llama_token> tokens;
    int iPart = 0;
    int iStart = 0;
    for(auto partLen: partsLengths) {
        auto partType = partsTypes[iPart];
        iPart += 1;
        auto msgPart = taggedText.substr(iStart, partLen);
        iStart += partLen;
        bool parseSpecial = (partType == ChatParts::S);
        parseSpecial |= forceParseSpecial;
        auto curTokens = chaton_llama_tokenize(model, msgPart, addSpecial, parseSpecial);
        tokens.insert(tokens.end(), curTokens.begin(), curTokens.end());
    }
    return tokens;
}

/**
 * If tmpl is
 * * an empty string, then dump the full loaded chaton-meta
 * * a chaton-template-id, then dump contents related to that specific chat-handshake-template-standard
 * NOTE: It uses the exception raising get_value to check if the tags related keys are present
 * wrt the specified template-standard/model-id or not.
 */
inline bool _chaton_meta_dump(std::string &tmpl) {
    if (!tmpl.empty()) {
        if (!gCT.tmpl_exists(tmpl)) {
            LOGXLN("ERRR:%s:Specified template-id [%s] not found", __func__, tmpl.c_str());
            return false;
        }
    }
    LOGXLN("\n\nINFO:%s:%s:\n%s", __func__, tmpl.c_str(), gCT.dump(tmpl, "INFO:ChatOnMetaDump:").c_str());
    if (!tmpl.empty()) {
        gCT.tmpl_basiccheck(tmpl);
    }
    return true;
}

/**
 * Verify that the specified chaton-template-id contains the required fields, using meta-dump.
 */
inline bool chaton_meta_ok(std::string &tmpl) {
    return _chaton_meta_dump(tmpl);
}