llama.cpp

History

Adrien Gallouët 3cba8bba18 common : fix split model migration (#21019 ) Sadly the manifest does not list all required files, i honestly thought it was the case Without the files listed we don't have the sha256, so if the first file is valid, and all others have the correct size, then we can assume we are good and do the migration... Here my test: $ find /home/angt/.cache/llama.cpp /home/angt/.cache/llama.cpp /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf.etag /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf.etag /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json $ build/bin/llama-server ================================================================================ WARNING: Migrating cache to HuggingFace cache directory Old cache: /home/angt/.cache/llama.cpp/ New cache: /home/angt/.cache/huggingface/hub This one-time migration moves models previously downloaded with -hf from the legacy llama.cpp cache to the standard HuggingFace cache. Models downloaded with --model-url are not affected. ================================================================================ migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf migrate_old_cache_to_hf_cache: migration complete, deleting manifest: /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json $ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface /home/angt/.cache/huggingface/hub /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694 /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf Signed-off-by: Adrien Gallouët <angt@huggingface.co>		2026-03-26 12:04:37 +01:00
..
jinja	jinja: fix macro with kwargs (#20960 )	2026-03-25 12:22:48 +01:00
CMakeLists.txt	common : add standard Hugging Face cache support (#20775 )	2026-03-24 07:30:33 +01:00
arg.cpp	common : fix verbosity setup (#20989 )	2026-03-25 19:41:01 +01:00
arg.h	vendor : update cpp-httplib to 0.30.0 (#18660 )	2026-01-08 13:53:54 +01:00
base64.hpp	llava : expose as a shared library for downstream projects (#3613 )	2023-11-07 00:36:23 +03:00
build-info.cpp.in	cmake: Add ability to pass in LLAMA_BUILD_NUMBER/COMMIT (#14167 )	2025-06-13 10:38:52 +02:00
chat-auto-parser-generator.cpp	common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912 )	2026-03-23 22:21:47 -05:00
chat-auto-parser-helpers.cpp	common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912 )	2026-03-23 22:21:47 -05:00
chat-auto-parser-helpers.h	common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912 )	2026-03-23 22:21:47 -05:00
chat-auto-parser.h	common/parser: add proper reasoning tag prefill reading (#20424 )	2026-03-19 16:58:21 +01:00
chat-diff-analyzer.cpp	common/autoparser : detect reasoning markers when enable_thinking changes system prompt (#20859 )	2026-03-23 08:35:27 +01:00
chat-peg-parser.cpp	common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912 )	2026-03-23 22:21:47 -05:00
chat-peg-parser.h	common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912 )	2026-03-23 22:21:47 -05:00
chat.cpp	common : replace wrap_for_generation with a prefix convenience function and fix gpt-oss (#20912 )	2026-03-23 22:21:47 -05:00
chat.h	common/parser: add proper reasoning tag prefill reading (#20424 )	2026-03-19 16:58:21 +01:00
common.cpp	llama : re-enable manual LoRA adapter free (#19983 )	2026-03-18 12:03:26 +02:00
common.h	common/parser: add proper reasoning tag prefill reading (#20424 )	2026-03-19 16:58:21 +01:00
console.cpp	cli : add command and file auto-completion (#19985 )	2026-03-05 10:47:28 +01:00
console.h	cli : add command and file auto-completion (#19985 )	2026-03-05 10:47:28 +01:00
debug.cpp	debug: make common_debug_print_tensor readable (#19331 )	2026-02-04 17:55:31 +01:00
debug.h	chore : correct typos [no ci] (#20041 )	2026-03-05 08:50:21 +01:00
download.cpp	common : fix gguf selection in common_list_cached_models (#20996 )	2026-03-25 19:18:06 +01:00
download.h	common : add standard Hugging Face cache support (#20775 )	2026-03-24 07:30:33 +01:00
hf-cache.cpp	common : fix split model migration (#21019 )	2026-03-26 12:04:37 +01:00
hf-cache.h	common : fix split model migration (#21019 )	2026-03-26 12:04:37 +01:00
http.h	server: Parse port numbers from MCP server URLs in CORS proxy (#20208 )	2026-03-09 17:47:54 +01:00
json-partial.cpp	common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932 )	2025-11-18 18:54:15 +01:00
json-partial.h	cli : fix reasoning responses in CLI (#18961 )	2026-01-20 18:23:25 +01:00
json-schema-to-grammar.cpp	common : fix incorrect uses of stoul (#20313 )	2026-03-10 11:40:26 +01:00
json-schema-to-grammar.h	common : add nemotron 3 parsing (#18077 )	2025-12-16 04:05:23 -06:00
llguidance.cpp	sampling : add support for backend sampling (#17004 )	2026-01-04 22:22:16 +02:00
log.cpp	cli: new CLI experience (#17824 )	2025-12-10 15:28:59 +01:00
log.h	cli: new CLI experience (#17824 )	2025-12-10 15:28:59 +01:00
ngram-cache.cpp	spec : add self‑speculative decoding (no draft model required) + refactor (#18471 )	2026-01-28 19:42:42 +02:00
ngram-cache.h	spec : add self‑speculative decoding (no draft model required) + refactor (#18471 )	2026-01-28 19:42:42 +02:00
ngram-map.cpp	llama : correct typos 'occured' and 'occurences' (#19414 )	2026-02-11 07:05:31 +01:00
ngram-map.h	llama : correct typos 'occured' and 'occurences' (#19414 )	2026-02-11 07:05:31 +01:00
ngram-mod.cpp	spec : add ngram-mod (#19164 )	2026-01-30 18:21:48 +02:00
ngram-mod.h	ngram-mod : fix build [no ci] (#19216 )	2026-01-30 21:27:27 +02:00
peg-parser.cpp	common: consolidate PEG string parsers (#20263 )	2026-03-10 00:29:21 +01:00
peg-parser.h	common: consolidate PEG string parsers (#20263 )	2026-03-10 00:29:21 +01:00
preset.cpp	preset: allow named remote preset (#18728 )	2026-01-10 15:12:29 +01:00
preset.h	common: support remote preset (#18520 )	2026-01-08 22:35:40 +01:00
reasoning-budget.cpp	common/parser: add proper reasoning tag prefill reading (#20424 )	2026-03-19 16:58:21 +01:00
reasoning-budget.h	common/parser: add proper reasoning tag prefill reading (#20424 )	2026-03-19 16:58:21 +01:00
regex-partial.cpp	common : fix iterator::end() dereference (#20445 )	2026-03-16 08:50:38 +02:00
regex-partial.h	`common`: add partial regex support (#12808 )	2025-05-14 19:50:57 +01:00
sampling.cpp	common/parser: add proper reasoning tag prefill reading (#20424 )	2026-03-19 16:58:21 +01:00
sampling.h	sampling : add support for backend sampling (#17004 )	2026-01-04 22:22:16 +02:00
speculative.cpp	spec : remove check rate (#19377 )	2026-02-09 15:30:50 +02:00
speculative.h	common : add common_speculative_is_compat() (#19270 )	2026-02-06 16:47:22 +02:00
unicode.cpp	common/parser: handle reasoning budget (#20297 )	2026-03-11 10:26:12 +01:00
unicode.h	common/parser: handle reasoning budget (#20297 )	2026-03-11 10:26:12 +01:00