llama.cpp/examples
HanishKVC 3fcaf19967 ChatON+:Multi4Single: applyGlobalIfAny flag wrt templating api
Now that the multi-chat templating logic itself is used to
apply chat templating/tagging to a single chat message, give
callers the flexibility to decide whether global tags, if any,
should be applied by the core tagging logic.

examples/main is in turn updated to not apply global tags, if
any, to the system message. User messages already don't apply
global tags, since the current implementation builds on the
existing in-prefix/suffix and antiprompt flow.
2024-05-14 01:00:17 +05:30
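The flag described above can be sketched roughly as follows. This is a minimal illustration, not llama.cpp's actual API: the function name `apply_chat_template`, the `ChatMessage` struct, and the role/global tag strings are all hypothetical; only the idea of an `applyGlobalIfAny`-style boolean gating the global tags comes from the commit message.

```cpp
#include <string>
#include <vector>

// Hypothetical message type and per-role tags (placeholders, not the real ones).
struct ChatMessage { std::string role; std::string content; };

static std::string role_prefix(const std::string &role) { return "<|" + role + "|>"; }
static std::string role_suffix(const std::string &)     { return "<|end|>"; }

// Core tagging logic: each message gets its role prefix/suffix; the caller
// decides via apply_global_if_any whether the template's global begin/end
// tags (if the template defines any) wrap the result.
std::string apply_chat_template(const std::vector<ChatMessage> &msgs,
                                bool apply_global_if_any) {
    std::string out;
    for (const auto &m : msgs) {
        out += role_prefix(m.role) + m.content + role_suffix(m.role);
    }
    if (apply_global_if_any) {
        out = "<global_begin>" + out + "<global_end>";
    }
    return out;
}
```

With this shape, tagging a lone system message without the global wrapper (as examples/main now does) is just `apply_chat_template({{"system", text}}, false)`.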
baby-llama code : normalize enum names (#5697) 2024-02-25 12:09:09 +02:00
batched llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
batched-bench ggml : add Flash Attention (#5021) 2024-04-30 12:16:08 +03:00
batched.swift llama : add option to render special/control tokens (#6807) 2024-04-21 18:36:45 +03:00
beam-search llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
benchmark ggml : remove old quantization functions (#5942) 2024-03-09 15:53:59 +02:00
convert-llama2c-to-ggml TypoFix (#7162) 2024-05-09 10:16:45 +02:00
embedding BERT tokenizer fixes (#6498) 2024-04-09 13:44:08 -04:00
eval-callback eval-callback : fix conversion to float (#7184) 2024-05-10 01:04:12 +02:00
export-lora ci : add an option to fail on compile warning (#3952) 2024-02-17 23:03:14 +02:00
finetune ggml : introduce bfloat16 support (#6412) 2024-05-08 09:30:09 +03:00
gbnf-validator grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609) 2024-04-11 19:47:34 +01:00
gguf gguf : add option to not check tensor data (#6582) 2024-04-10 21:16:48 +03:00
gguf-split gguf-split: add --no-tensor-first-split (#7072) 2024-05-04 18:56:22 +02:00
gritlm gritlm : add --outdir option to hf.sh script (#6699) 2024-04-16 09:34:06 +03:00
imatrix Fixed save_imatrix to match old behaviour for MoE (#7099) 2024-05-08 02:24:16 +02:00
infill llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
jeopardy parallel : add option to load external prompt file (#3416) 2023-10-06 16:16:38 +03:00
llama-bench Adding support for the --numa argument for llama-bench. (#7080) 2024-05-05 14:17:47 +02:00
llama.android llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
llama.swiftui llama : add option to render special/control tokens (#6807) 2024-04-21 18:36:45 +03:00
llava llava : fix moondream support (#7163) 2024-05-10 09:41:10 +03:00
lookahead llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
lookup Server: fix seed for multiple slots (#6835) 2024-04-24 11:08:36 +02:00
main ChatON+:Multi4Single: applyGlobalIfAny flag wrt templating api 2024-05-14 01:00:17 +05:30
main-cmake-pkg build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964) 2024-04-29 17:02:45 +01:00
parallel llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
passkey llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
perplexity perplexity: more statistics, added documentation (#6936) 2024-04-30 23:36:27 +02:00
quantize ggml : introduce bfloat16 support (#6412) 2024-05-08 09:30:09 +03:00
quantize-stats Improve usability of --model-url & related flags (#6930) 2024-04-30 00:52:50 +01:00
retrieval examples : add "retrieval" (#6193) 2024-03-25 09:38:22 +02:00
save-load-state llama : save and restore kv cache for single seq id (#6341) 2024-04-08 15:43:30 +03:00
server convert-hf : save memory with lazy evaluation (#7075) 2024-05-08 18:16:38 -04:00
simple llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
speculative llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
sycl docs: fix typos (#7124) 2024-05-07 18:20:33 +03:00
tokenize BERT tokenizer fixes (#6498) 2024-04-09 13:44:08 -04:00
train-text-from-scratch train : add general name (#6752) 2024-04-19 10:16:45 +03:00
CMakeLists.txt eval-callback: Example how to use eval callback for debugging (#6576) 2024-04-11 14:51:07 +02:00
Miku.sh MIKU MAYHEM: Upgrading the Default Model for Maximum Fun 🎉 (#2287) 2023-07-21 11:13:18 +03:00
alpaca.sh alpaca.sh : update model file name (#2074) 2023-07-06 19:17:50 +03:00
base-translate.sh examples : improve base-translate.sh script (#4783) 2024-01-06 11:40:24 +02:00
chat-13B.bat Create chat-13B.bat (#592) 2023-03-29 20:21:09 +03:00
chat-13B.sh examples : read chat prompts from a template file (#1196) 2023-05-03 20:58:11 +03:00
chat-persistent.sh llama : fix session saving/loading (#3400) 2023-10-03 21:04:01 +03:00
chat-vicuna.sh examples : add chat-vicuna.sh (#1854) 2023-06-15 21:05:53 +03:00
chat.sh main : log file (#2748) 2023-08-30 09:29:32 +03:00
chaton_meta.json ChatON:Initial go at vicuna chat template in meta.json 2024-05-06 11:27:56 +05:30
chaton_meta.old_simple.json ChatON: Backup the current simple meta json file 2024-05-06 11:27:56 +05:30
chaton_meta.simpcfg SimpCfg:Dbug why bool is not setting properly 2024-05-06 11:27:56 +05:30
gpt4all.sh examples : add -n to alpaca and gpt4all scripts (#706) 2023-04-13 16:03:39 +03:00
json-schema-pydantic-example.py json-schema-to-grammar improvements (+ added to server) (#5978) 2024-03-21 11:50:43 +00:00
json_schema_to_grammar.py JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00
llama.vim llama.vim : added api key support (#5090) 2024-01-23 08:51:27 +02:00
llama2-13b.sh gitignore : changes for Poetry users + chat examples (#2284) 2023-07-21 13:53:27 +03:00
llama2.sh gitignore : changes for Poetry users + chat examples (#2284) 2023-07-21 13:53:27 +03:00
llm.vim llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) 2023-08-30 09:50:55 +03:00
make-ggml.py make-ggml.py : compatibility with more models and GGUF (#3290) 2023-09-27 19:25:12 +03:00
pydantic-models-to-grammar-examples.py examples : make pydantic scripts pass mypy and support py3.8 (#5099) 2024-01-25 14:51:24 -05:00
pydantic_models_to_grammar.py examples : make pydantic scripts pass mypy and support py3.8 (#5099) 2024-01-25 14:51:24 -05:00
reason-act.sh chmod : make scripts executable (#2675) 2023-08-23 17:29:09 +03:00
regex-to-grammar.py JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00
server-embd.py server : refactor (#5882) 2024-03-07 11:41:53 +02:00
server-llama2-13B.sh chmod : make scripts executable (#2675) 2023-08-23 17:29:09 +03:00
ts-type-to-grammar.sh JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00