llama.cpp

Adrien Gallouët 8c7957ca33 common : add standard Hugging Face cache support (#20775)	2026-03-24 07:30:33 +01:00

* common : add standard Hugging Face cache support
  - Use HF API to find all files
  - Migrate all manifests to hugging face cache at startup
* Check with the quant tag
* Cleanup
* Improve error handling and report API errors
* Restore common_cached_model_info and align mmproj filtering
* Prefer main when getting cached ref
* Use cached files when HF API fails
* Use final_path..
* Check all inputs

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
test_basic.py	server : support multiple model aliases via comma-separated --alias (#19926)	2026-02-27 07:05:23 +01:00
test_chat_completion.py	server: Add cached_tokens info to oaicompat responses (#19361)	2026-03-19 19:09:33 +01:00
test_compat_anthropic.py	server: Add cached_tokens info to oaicompat responses (#19361)	2026-03-19 19:09:33 +01:00
test_compat_oai_responses.py	…
test_completion.py	server : fix wait in test_cancel_requests() test (#20601)	2026-03-15 20:54:37 +02:00
test_ctx_shift.py	…
test_embedding.py	llama : fix pooling assertion crash in chunked GDN detection path (#20468)	2026-03-13 20:53:42 +02:00
test_infill.py	…
test_lora.py	…
test_proxy.py	server: Parse port numbers from MCP server URLs in CORS proxy (#20208)	2026-03-09 17:47:54 +01:00
test_rerank.py	…
test_router.py	common : add standard Hugging Face cache support (#20775)	2026-03-24 07:30:33 +01:00
test_security.py	…
test_sleep.py	…
test_slot_save.py	…
test_speculative.py	…
test_template.py	tests : use `reasoning` instead of `reasoning_budget` in server tests (#20432)	2026-03-12 13:41:01 +01:00
test_tokenize.py	…
test_tool_call.py	ci : switch from pyright to ty (#20826)	2026-03-21 08:54:34 +01:00
test_vision_api.py	…