llama.cpp/examples
Daniel Bevenius fd1085ffb7
model-conversion : use CONVERTED_MODEL value for converted model [no ci] (#17984)
* model-conversion : use CONVERTED_MODEL value for converted model [no ci]

This commit updates the model verification scripts to use the
CONVERTED_MODEL environment variable, rather than MODEL_PATH (the
original model path), as the basis for the converted model file name.

The motivation for this is that, if the converted model file name
differs from the original model directory/name, the verification scripts
will look for the wrong .bin files rather than the ones that were
generated when running the models.
For example, the following steps were not possible:
```console
(venv) $ huggingface-cli download google/gemma-3-270m-it --local-dir ggml-org/gemma-3-270m
(venv) $ python3 convert_hf_to_gguf.py ggml-org/gemma-3-270m --outfile test-bf16.gguf --outtype bf16
(venv) $ cd examples/model-conversion/
(venv) $ export MODEL_PATH=../../ggml-org/gemma-3-270m
(venv) $ export CONVERTED_MODEL=../../test-bf16.gguf
(venv) $ make causal-verify-logits
...
Data saved to data/llamacpp-test-bf16.bin
Data saved to data/llamacpp-test-bf16.txt
Error: llama.cpp logits file not found: data/llamacpp-gemma-3-270m.bin
Please run scripts/run-converted-model.sh first to generate this file.
make: *** [Makefile:62: causal-verify-logits] Error 1
```

With the changes in this commit, the above steps will now work as
expected.
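
As a minimal sketch of the idea (not the actual llama.cpp scripts; the variable handling below is illustrative, and the file layout is taken from the example above), the logits data file name is now derived from CONVERTED_MODEL rather than from MODEL_PATH, so the verify step looks for the same .bin file that the run step wrote:

```bash
# Minimal sketch (not the actual verification script): derive the logits
# file name from CONVERTED_MODEL so the verify step looks for the same
# file that the run step produced.
MODEL_PATH="${MODEL_PATH:-../../ggml-org/gemma-3-270m}"
CONVERTED_MODEL="${CONVERTED_MODEL:-../../test-bf16.gguf}"

# Old behaviour: name derived from the original model directory.
old_name="$(basename "$MODEL_PATH")"              # gemma-3-270m
# New behaviour: name derived from the converted GGUF file.
new_name="$(basename "$CONVERTED_MODEL" .gguf)"   # test-bf16

echo "old lookup: data/llamacpp-${old_name}.bin"  # file was never written
echo "new lookup: data/llamacpp-${new_name}.bin"  # matches the saved data
```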
2025-12-13 08:34:26 +01:00
| Name | Last commit | Last commit date |
| --- | --- | --- |
| batched | examples : add -kvu to batched usage example [no ci] (#17469) | 2025-11-24 15:38:45 +02:00 |
| batched.swift | examples : remove references to `make` in examples [no ci] (#15457) | 2025-08-21 06:12:28 +02:00 |
| convert-llama2c-to-ggml | gguf: gguf_writer refactor (#15691) | 2025-09-05 11:34:28 +02:00 |
| deprecation-warning | Update deprecation-warning.cpp (#10619) | 2024-12-04 23:19:20 +01:00 |
| diffusion | models : Added support for RND1 Diffusion Language Model (#17433) | 2025-11-24 14:16:56 +08:00 |
| embedding | ggml : add GGML_SCHED_NO_REALLOC option to disable reallocations in ggml_backend_sched (#17276) | 2025-11-28 17:33:23 +02:00 |
| eval-callback | common : more accurate sampling timing (#17382) | 2025-11-20 13:40:10 +02:00 |
| gen-docs | common: support negated args (#17919) | 2025-12-12 23:58:53 +01:00 |
| gguf | examples(gguf): GGUF example outputs (#17025) | 2025-11-05 19:58:16 +02:00 |
| gguf-hash | GGUF: C++ refactor, backend support, misc fixes (#11030) | 2025-01-07 18:01:58 +01:00 |
| idle | metal : add residency sets keep-alive heartbeat (#17766) | 2025-12-05 19:38:54 +02:00 |
| llama.android | llama : deprecate llama_kv_self_ API (#14030) | 2025-06-06 14:11:15 +03:00 |
| llama.swiftui | llama : deprecate llama_kv_self_ API (#14030) | 2025-06-06 14:11:15 +03:00 |
| lookahead | lookahead : add sample command to readme (#15447) | 2025-08-20 13:30:46 +03:00 |
| lookup | llama : deprecate llama_kv_self_ API (#14030) | 2025-06-06 14:11:15 +03:00 |
| model-conversion | model-conversion : use CONVERTED_MODEL value for converted model [no ci] (#17984) | 2025-12-13 08:34:26 +01:00 |
| parallel | parallel : add option for different RNG seeds (#14757) | 2025-07-18 17:33:41 +03:00 |
| passkey | examples : remove references to `make` in examples [no ci] (#15457) | 2025-08-21 06:12:28 +02:00 |
| retrieval | examples : remove references to `make` in examples [no ci] (#15457) | 2025-08-21 06:12:28 +02:00 |
| save-load-state | metal : fix build (#17799) | 2025-12-06 09:33:59 +02:00 |
| simple | examples : support encoder-decoder models in the simple example (#16002) | 2025-09-17 10:29:00 +03:00 |
| simple-chat | simple-chat : fix context-exceeded condition (#14494) | 2025-07-02 14:12:07 +03:00 |
| simple-cmake-pkg | examples : add missing code block end marker [no ci] (#17756) | 2025-12-04 14:17:30 +01:00 |
| speculative | sampling : optimize samplers by reusing bucket sort (#15665) | 2025-08-31 20:41:02 +03:00 |
| speculative-simple | common : add --override-tensor-draft, --cpu-moe-draft and --n-cpu-moe-draft parameters (#15191) | 2025-08-13 12:44:40 +02:00 |
| sycl | sycl : support to malloc memory on device more than 4GB, update the doc and script (#17566) | 2025-11-29 14:59:44 +02:00 |
| training | finetune: SGD optimizer, more CLI args (#13873) | 2025-08-14 12:03:57 +02:00 |
| CMakeLists.txt | metal : add residency sets keep-alive heartbeat (#17766) | 2025-12-05 19:38:54 +02:00 |
| convert_legacy_llama.py | metadata: Detailed Dataset Authorship Metadata (#8875) | 2024-11-13 21:10:38 +11:00 |
| json_schema_pydantic_example.py | py : type-check all Python scripts with Pyright (#8341) | 2024-07-07 15:04:39 -04:00 |
| json_schema_to_grammar.py | common : fix json schema with '\' in literals (#17307) | 2025-11-29 17:06:32 +01:00 |
| llama.vim | llama : remove KV cache defragmentation logic (#15473) | 2025-08-22 12:22:13 +03:00 |
| pydantic_models_to_grammar.py | pydantic : replace uses of __annotations__ with get_type_hints (#8474) | 2024-07-14 19:51:21 -04:00 |
| pydantic_models_to_grammar_examples.py | llama : move end-user examples to tools directory (#13249) | 2025-05-02 20:27:13 +02:00 |
| reason-act.sh | scripts : make the shell scripts cross-platform (#14341) | 2025-06-30 10:17:18 +02:00 |
| regex_to_grammar.py | py : switch to snake_case (#8305) | 2024-07-05 07:53:33 +03:00 |
| server-llama2-13B.sh | scripts : make the shell scripts cross-platform (#14341) | 2025-06-30 10:17:18 +02:00 |
| server_embd.py | llama : fix FA when KV cache is not used (i.e. embeddings) (#12825) | 2025-04-08 19:54:51 +03:00 |
| ts-type-to-grammar.sh | scripts : make the shell scripts cross-platform (#14341) | 2025-06-30 10:17:18 +02:00 |