llama.cpp/examples
Georgi Gerganov d48c88cbd5
ggml : remove ggml_flash_attn and ggml_flash_ff (#7463)
ggml-ci
2024-05-23 10:00:44 +03:00
baby-llama code : normalize enum names (#5697) 2024-02-25 12:09:09 +02:00
batched common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
batched-bench ggml : add Flash Attention (#5021) 2024-04-30 12:16:08 +03:00
batched.swift llama : add option to render special/control tokens (#6807) 2024-04-21 18:36:45 +03:00
beam-search llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
benchmark ggml : remove old quantization functions (#5942) 2024-03-09 15:53:59 +02:00
convert-llama2c-to-ggml TypoFix (#7162) 2024-05-09 10:16:45 +02:00
embedding common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
eval-callback common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
export-lora
finetune ggml : remove ggml_flash_attn and ggml_flash_ff (#7463) 2024-05-23 10:00:44 +03:00
gbnf-validator grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609) 2024-04-11 19:47:34 +01:00
gguf gguf : add option to not check tensor data (#6582) 2024-04-10 21:16:48 +03:00
gguf-split gguf-split: add --no-tensor-first-split (#7072) 2024-05-04 18:56:22 +02:00
gritlm gritlm : add --outdir option to hf.sh script (#6699) 2024-04-16 09:34:06 +03:00
imatrix common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
infill common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
jeopardy
llama-bench common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
llama.android cmake : update android comments (#7341) 2024-05-19 11:01:01 +03:00
llama.swiftui llama : add option to render special/control tokens (#6807) 2024-04-21 18:36:45 +03:00
llava common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
lookahead common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
lookup common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
main main : minor (#7462) 2024-05-23 09:43:49 +03:00
main-cmake-pkg build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964) 2024-04-29 17:02:45 +01:00
parallel common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
passkey llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
perplexity common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
quantize common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
quantize-stats Improve usability of --model-url & related flags (#6930) 2024-04-30 00:52:50 +01:00
retrieval common : normalize naming style (#7462) 2024-05-22 20:04:20 +03:00
rpc rpc : set SO_REUSEADDR for the server socket (#7320) 2024-05-17 17:25:44 +03:00
save-load-state llama : save and restore kv cache for single seq id (#6341) 2024-04-08 15:43:30 +03:00
server SimpleChat: a simple and dumb web front end for testing /chat/completions and /completions end points and try chat (#7350) 2024-05-23 03:53:21 +10:00
simple llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
speculative llama : support Llama 3 HF conversion (#6745) 2024-04-21 14:50:41 +03:00
sycl docs: fix typos (#7124) 2024-05-07 18:20:33 +03:00
tokenize BERT tokenizer fixes (#6498) 2024-04-09 13:44:08 -04:00
train-text-from-scratch ggml : remove ggml_flash_attn and ggml_flash_ff (#7463) 2024-05-23 10:00:44 +03:00
CMakeLists.txt ggml : add RPC backend (#6829) 2024-05-14 14:27:19 +03:00
Miku.sh
alpaca.sh
base-translate.sh
chat-13B.bat
chat-13B.sh
chat-persistent.sh
chat-vicuna.sh
chat.sh
gpt4all.sh
json-schema-pydantic-example.py json-schema-to-grammar improvements (+ added to server) (#5978) 2024-03-21 11:50:43 +00:00
json_schema_to_grammar.py JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00
llama.vim
llama2-13b.sh
llama2.sh
llm.vim
make-ggml.py
pydantic-models-to-grammar-examples.py
pydantic_models_to_grammar.py
reason-act.sh
regex-to-grammar.py JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00
server-embd.py server : refactor (#5882) 2024-03-07 11:41:53 +02:00
server-llama2-13B.sh
ts-type-to-grammar.sh JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00