llama.cpp/examples
HanishKVC 13857f29d6 ChatON+Main: Updates wrt detailed meta json
Fix an oversight wrt key name.

Add an alert in case the passed meta json file contains begin (BoS)
wrt assistant role, similar to the check for end (EoS) wrt user role,
because normally both (i.e. EoS wrt user and BoS wrt assistant)
shouldn't be needed.

Update main wrt begin & prefix and suffix & end addition.
2024-05-06 11:27:56 +05:30
..
baby-llama code : normalize enum names (#5697) 2024-02-25 12:09:09 +02:00
batched llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
batched-bench ggml : add Flash Attention (#5021) 2024-04-30 12:16:08 +03:00
batched.swift llama : add option to render special/control tokens (#6807) 2024-04-21 18:36:45 +03:00
beam-search llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
benchmark ggml : remove old quantization functions (#5942) 2024-03-09 15:53:59 +02:00
convert-llama2c-to-ggml llama2c : open file as binary (#6332) 2024-03-27 09:16:02 +02:00
embedding BERT tokenizer fixes (#6498) 2024-04-09 13:44:08 -04:00
eval-callback model: support arch `DbrxForCausalLM` (#6515) 2024-04-13 11:33:52 +02:00
export-lora ci : add an option to fail on compile warning (#3952) 2024-02-17 23:03:14 +02:00
finetune code : normalize enum names (#5697) 2024-02-25 12:09:09 +02:00
gbnf-validator grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609) 2024-04-11 19:47:34 +01:00
gguf gguf : add option to not check tensor data (#6582) 2024-04-10 21:16:48 +03:00
gguf-split gguf-split: add --no-tensor-first-split (#7072) 2024-05-04 18:56:22 +02:00
gritlm gritlm : add --outdir option to hf.sh script (#6699) 2024-04-16 09:34:06 +03:00
imatrix quantize: add imatrix and dataset metadata in GGUF (#6658) 2024-04-26 20:06:33 +02:00
infill llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
jeopardy parallel : add option to load external prompt file (#3416) 2023-10-06 16:16:38 +03:00
llama-bench Adding support for the --numa argument for llama-bench. (#7080) 2024-05-05 14:17:47 +02:00
llama.android llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
llama.swiftui llama : add option to render special/control tokens (#6807) 2024-04-21 18:36:45 +03:00
llava llava-cli : multiple images (#6969) 2024-04-29 17:34:24 +03:00
lookahead llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
lookup Server: fix seed for multiple slots (#6835) 2024-04-24 11:08:36 +02:00
main ChatON+Main: Updates wrt detailed meta json 2024-05-06 11:27:56 +05:30
main-cmake-pkg build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964) 2024-04-29 17:02:45 +01:00
parallel llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
passkey llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
perplexity perplexity: more statistics, added documentation (#6936) 2024-04-30 23:36:27 +02:00
quantize quantize: add imatrix and dataset metadata in GGUF (#6658) 2024-04-26 20:06:33 +02:00
quantize-stats Improve usability of --model-url & related flags (#6930) 2024-04-30 00:52:50 +01:00
retrieval examples : add "retrieval" (#6193) 2024-03-25 09:38:22 +02:00
save-load-state llama : save and restore kv cache for single seq id (#6341) 2024-04-08 15:43:30 +03:00
server If first token generated from the server is the stop word the server will crash (#7038) 2024-05-04 11:06:40 +02:00
simple llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
speculative llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
sycl fix memcpy() crash, add missed cmd in guide, fix softmax (#6622) 2024-04-14 10:42:29 +08:00
tokenize BERT tokenizer fixes (#6498) 2024-04-09 13:44:08 -04:00
train-text-from-scratch train : add general name (#6752) 2024-04-19 10:16:45 +03:00
CMakeLists.txt eval-callback: Example how to use eval callback for debugging (#6576) 2024-04-11 14:51:07 +02:00
Miku.sh MIKU MAYHEM: Upgrading the Default Model for Maximum Fun 🎉 (#2287) 2023-07-21 11:13:18 +03:00
alpaca.sh alpaca.sh : update model file name (#2074) 2023-07-06 19:17:50 +03:00
base-translate.sh examples : improve base-translate.sh script (#4783) 2024-01-06 11:40:24 +02:00
chat-13B.bat Create chat-13B.bat (#592) 2023-03-29 20:21:09 +03:00
chat-13B.sh examples : read chat prompts from a template file (#1196) 2023-05-03 20:58:11 +03:00
chat-persistent.sh llama : fix session saving/loading (#3400) 2023-10-03 21:04:01 +03:00
chat-vicuna.sh examples : add chat-vicuna.sh (#1854) 2023-06-15 21:05:53 +03:00
chat.sh main : log file (#2748) 2023-08-30 09:29:32 +03:00
chaton_meta.json ChatON: Update to new detailed format wrt llama2 and llama3 2024-05-06 11:27:56 +05:30
chaton_meta.old_simple.json ChatON: Backup the current simple meta json file 2024-05-06 11:27:56 +05:30
gpt4all.sh examples : add -n to alpaca and gpt4all scripts (#706) 2023-04-13 16:03:39 +03:00
json-schema-pydantic-example.py json-schema-to-grammar improvements (+ added to server) (#5978) 2024-03-21 11:50:43 +00:00
json_schema_to_grammar.py JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00
llama.vim llama.vim : added api key support (#5090) 2024-01-23 08:51:27 +02:00
llama2-13b.sh gitignore : changes for Poetry users + chat examples (#2284) 2023-07-21 13:53:27 +03:00
llama2.sh gitignore : changes for Poetry users + chat examples (#2284) 2023-07-21 13:53:27 +03:00
llm.vim llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) 2023-08-30 09:50:55 +03:00
make-ggml.py make-ggml.py : compatibility with more models and GGUF (#3290) 2023-09-27 19:25:12 +03:00
pydantic-models-to-grammar-examples.py examples : make pydantic scripts pass mypy and support py3.8 (#5099) 2024-01-25 14:51:24 -05:00
pydantic_models_to_grammar.py examples : make pydantic scripts pass mypy and support py3.8 (#5099) 2024-01-25 14:51:24 -05:00
reason-act.sh chmod : make scripts executable (#2675) 2023-08-23 17:29:09 +03:00
regex-to-grammar.py JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00
server-embd.py server : refactor (#5882) 2024-03-07 11:41:53 +02:00
server-llama2-13B.sh chmod : make scripts executable (#2675) 2023-08-23 17:29:09 +03:00
ts-type-to-grammar.sh JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00