13 Commits

Author SHA1 Message Date
Francis Couture-Harpin 183eeb5518 imatrix : avoid loading model to convert or combine imatrix 2025-07-12 16:50:10 -04:00
Francis Couture-Harpin 50f53b3e40 imatrix : warn when writing partial data, to help guess dataset coverage
Also make the legacy format store partial data
by using neutral values for missing data.
This matches what is done at read-time for the new format,
and so should get the same quality in case the old format is still used.
2025-07-12 16:50:10 -04:00
Francis Couture-Harpin 42423ec4d3 imatrix : add warning when legacy format is written 2025-07-12 15:19:51 -04:00
Francis Couture-Harpin e33de128c7 common : move string_remove_suffix from quantize and imatrix
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-06-23 16:24:06 -04:00
Francis Couture-Harpin 43cd2b3eb5 imatrix : support 3d tensors with MUL_MAT 2025-06-23 12:20:55 -04:00
Francis Couture-Harpin 1a9454a3d2 imatrix : avoid returning from void function save_imatrix 2025-06-18 16:44:41 -04:00
Francis Couture-Harpin ba6f6be6ce imatrix : don't use FMA explicitly
This should make comparisons between the formats easier
because this matches the behavior of the previous version.
2025-06-18 16:33:37 -04:00
Francis Couture-Harpin 2c0945027a Merge branch 'master' into compilade/imatrix-batched-chunks 2025-06-18 16:32:35 -04:00
Georgi Gerganov 745aa5319b llama : deprecate llama_kv_self_ API (#14030)
* llama : deprecate llama_kv_self_ API

ggml-ci

* llama : allow llama_memory_(nullptr)

ggml-ci

* memory : add flag for optional data clear in llama_memory_clear

ggml-ci
2025-06-06 14:11:15 +03:00
Bartowski efb8b47eda imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
* Add --parse-special for enabling parsing of special tokens in imatrix calculation

* whitespace
2025-05-09 11:53:58 +02:00
Georgi Gerganov 51fb96b1ff context : remove logits_all flag (#13284)
* context : remove logits_all flag

ggml-ci

* llama : remove logits_all flag + reorder llama_context_params

ggml-ci
2025-05-08 14:26:50 +03:00
Johannes Gäßler 3e959f0976 imatrix: fix oob writes if src1 is not contiguous (#13286) 2025-05-04 00:50:37 +02:00
Diego Devesa 1d36b3670b llama : move end-user examples to tools directory (#13249)
* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-05-02 20:27:13 +02:00
Renamed from examples/imatrix/imatrix.cpp