llama.cpp

History

Daniel Bevenius 6ab881b7c3 model-conversion : add tensor-info.py utility (#18954 ) This commit adds a new python script that can be used to print tensors information from a tensor in a safetensors model. The motivation for this is that during model conversion work it can sometimes be useful to verify the shape of tensors in the original model. While it is possible to print the tensors when loading the model this can be slow when working with larger models. With this script it is possible to quickly query tensor shapes. Example usage: ```console (venv) $ ./scripts/utils/tensor-info.py --help usage: tensor-info.py [-h] [-m MODEL_PATH] [-l] [tensor_name] Print tensor information from a safetensors model positional arguments: tensor_name Name of the tensor to inspect options: -h, --help show this help message and exit -m MODEL_PATH, --model-path MODEL_PATH Path to the model directory (default: MODEL_PATH environment variable) -l, --list List unique tensor patterns in the model (layer numbers replaced with #) ``` Listing tensor names: ```console (venv) $ ./scripts/utils/tensor-info.py -m ~/work/ai/models/google/embeddinggemma-300m -l embed_tokens.weight layers.#.input_layernorm.weight layers.#.mlp.down_proj.weight layers.#.mlp.gate_proj.weight layers.#.mlp.up_proj.weight layers.#.post_attention_layernorm.weight layers.#.post_feedforward_layernorm.weight layers.#.pre_feedforward_layernorm.weight layers.#.self_attn.k_norm.weight layers.#.self_attn.k_proj.weight layers.#.self_attn.o_proj.weight layers.#.self_attn.q_norm.weight layers.#.self_attn.q_proj.weight layers.#.self_attn.v_proj.weight norm.weight ``` Printing a specific tensor's information: ```console (venv) $ ./scripts/utils/tensor-info.py -m ~/work/ai/models/google/embeddinggemma-300m layers.0.input_layernorm.weight Tensor: layers.0.input_layernorm.weight File: model.safetensors Shape: [768] ```		2026-02-04 10:40:53 +01:00
..
batched	context : reserve new scheduler when graph topology changes (#18547 )	2026-01-15 16:39:17 +02:00
batched.swift	examples : remove references to `make` in examples [no ci] (#15457 )	2025-08-21 06:12:28 +02:00
convert-llama2c-to-ggml	gguf: gguf_writer refactor (#15691 )	2025-09-05 11:34:28 +02:00
debug	Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914 )	2026-01-14 20:29:35 +01:00
deprecation-warning	docs : Minor cleanups (#19252 )	2026-02-02 08:38:55 +02:00
diffusion	llama : add `use_direct_io` flag for model loading (#18166 )	2026-01-08 08:35:30 +02:00
embedding	model : add LFM2-ColBert-350M (#18607 )	2026-01-05 19:52:56 +01:00
eval-callback	tests : download models only when running ctest (#18843 )	2026-01-15 09:47:29 +01:00
gen-docs	gen-docs: automatically update markdown file (#18294 )	2025-12-22 19:30:19 +01:00
gguf	examples(gguf): GGUF example outputs (#17025 )	2025-11-05 19:58:16 +02:00
gguf-hash	GGUF: C++ refactor, backend support, misc fixes (#11030 )	2025-01-07 18:01:58 +01:00
idle	metal : add residency sets keep-alive heartbeat (#17766 )	2025-12-05 19:38:54 +02:00
llama.android	refactor : remove libcurl, use OpenSSL when available (#18828 )	2026-01-14 18:02:47 +01:00
llama.swiftui	llama : deprecate llama_kv_self_ API (#14030 )	2025-06-06 14:11:15 +03:00
lookahead	lookup, lookahead: fix crash when n_ctx not specified (#18729 )	2026-01-30 22:10:24 +02:00
lookup	lookup, lookahead: fix crash when n_ctx not specified (#18729 )	2026-01-30 22:10:24 +02:00
model-conversion	model-conversion : add tensor-info.py utility (#18954 )	2026-02-04 10:40:53 +01:00
parallel	common : refactor common_sampler + grammar logic changes (#17937 )	2025-12-14 10:11:13 +02:00
passkey	examples : remove references to `make` in examples [no ci] (#15457 )	2025-08-21 06:12:28 +02:00
retrieval	model : add LFM2-ColBert-350M (#18607 )	2026-01-05 19:52:56 +01:00
save-load-state	common : refactor common_sampler + grammar logic changes (#17937 )	2025-12-14 10:11:13 +02:00
simple	examples : support encoder-decoder models in the simple example (#16002 )	2025-09-17 10:29:00 +03:00
simple-chat	simple-chat : fix context-exceeded condition (#14494 )	2025-07-02 14:12:07 +03:00
simple-cmake-pkg	examples : add missing code block end marker [no ci] (#17756 )	2025-12-04 14:17:30 +01:00
speculative	spec : add self‑speculative decoding (no draft model required) + refactor (#18471 )	2026-01-28 19:42:42 +02:00
speculative-simple	spec : add self‑speculative decoding (no draft model required) + refactor (#18471 )	2026-01-28 19:42:42 +02:00
sycl	create test.sh to enhance the parameters for testing, update the guide, rm useless script (#19243 )	2026-02-01 18:24:00 +08:00
training	common : refactor common_sampler + grammar logic changes (#17937 )	2025-12-14 10:11:13 +02:00
CMakeLists.txt	examples : add debug utility/example (#18464 )	2026-01-07 10:42:19 +01:00
convert_legacy_llama.py	metadata: Detailed Dataset Authorship Metadata (#8875 )	2024-11-13 21:10:38 +11:00
json_schema_pydantic_example.py	py : type-check all Python scripts with Pyright (#8341 )	2024-07-07 15:04:39 -04:00
json_schema_to_grammar.py	docs : Minor cleanups (#19252 )	2026-02-02 08:38:55 +02:00
llama.vim	llama : remove KV cache defragmentation logic (#15473 )	2025-08-22 12:22:13 +03:00
pydantic_models_to_grammar.py	pydantic : replace uses of __annotations__ with get_type_hints (#8474 )	2024-07-14 19:51:21 -04:00
pydantic_models_to_grammar_examples.py	llama : move end-user examples to tools directory (#13249 )	2025-05-02 20:27:13 +02:00
reason-act.sh	scripts : make the shell scripts cross-platform (#14341 )	2025-06-30 10:17:18 +02:00
regex_to_grammar.py	py : switch to snake_case (#8305 )	2024-07-05 07:53:33 +03:00
server-llama2-13B.sh	scripts : make the shell scripts cross-platform (#14341 )	2025-06-30 10:17:18 +02:00
server_embd.py	llama : fix FA when KV cache is not used (i.e. embeddings) (#12825 )	2025-04-08 19:54:51 +03:00
ts-type-to-grammar.sh	scripts : make the shell scripts cross-platform (#14341 )	2025-06-30 10:17:18 +02:00