Commit Graph

52 Commits

Author SHA1 Message Date
Ed Addario d19e6c9afa
Use { and } around the conditionally-executed statement
Co-authored-by: compilade <git@compilade.net>
2025-08-17 08:08:26 +01:00
Ed Addario 97d839c441
Using one line per variable definition
Co-authored-by: compilade <git@compilade.net>
2025-08-17 08:06:15 +01:00
Ed Addario 4a487ea7e4
Use { and } around the conditionally-executed statement
Co-authored-by: compilade <git@compilade.net>
2025-08-17 07:26:16 +01:00
Ed Addario e3149a2168
Use the corresponding size
Co-authored-by: compilade <git@compilade.net>
2025-08-17 07:24:27 +01:00
Ed Addario d4b0d89115
Fix return type bug
2025-08-16 11:00:43 +01:00
Ed Addario 030ec53d7a
Remove unnecessary include
2025-08-16 10:46:09 +01:00
Ed Addario 42bfe3b2a3
Update stats output sort based on imatrix type
2025-08-15 21:12:56 +01:00
Ed Addario dcac206f8e
Add --activation-statistics logic to avoid doubling the imatrix size by default
2025-08-09 14:49:25 +01:00
Ed Addario c5ecdaa1a1
Add Euclidean–Cosine Score (ECS)
2025-08-07 19:04:49 +01:00
Ed Addario 5bb2def02d
Add --activation-statistics parameter
2025-08-07 17:41:21 +01:00
Ed Addario dadd90ef73
Rename report heading
2025-08-07 14:07:48 +01:00
Ed Addario e0d6471340
Reverse conditional logic to match convention
2025-08-07 12:04:52 +01:00
Ed Addario 3e9d53c61e
Refactor variable names
2025-08-07 12:03:24 +01:00
Ed Addario 030ed3c909
Merge branch 'master' into imatrix
2025-08-05 21:58:00 +01:00
Ed Addario 88854c9179
Refactor legacy mode
2025-08-05 14:16:45 +01:00
Ed Addario 4c3fea89d6
Update report layout
2025-08-05 13:32:59 +01:00
Ed Addario 49996a19da
Refactor variable names
2025-08-05 13:32:46 +01:00
Ed Addario aea9b31db5
Make ZD Score two-tailed
2025-08-05 12:57:13 +01:00
Ed Addario 906548a00a
Update aggregated sum of squared activations per layer
2025-08-05 12:06:19 +01:00
Ed Addario b37393423d
Compute aggregated (per layer) l2 norm
2025-08-05 08:54:57 +01:00
Ed Addario 5e40cf4f1c
Do not resize if in_sum is null
2025-08-05 00:18:53 +01:00
compilade 19f68fa5a4
imatrix : warn when GGUF imatrix is saved without .gguf suffix (#15076)
* imatrix : add warning when suffix is not .gguf for GGUF imatrix

* imatrix : only warn about suffix when output format is unspecified
2025-08-04 23:26:52 +02:00
Ed Addario adbff66394
Merge branch 'master' into imatrix
2025-08-04 22:16:10 +01:00
Ed Addario c39c4e2a33
Refactor variable name
2025-08-04 22:15:50 +01:00
compilade d31192b4ee
imatrix : use GGUF by default (#14842)
* imatrix : use GGUF by default

* imatrix : use GGUF regardless of the output filename

The legacy format can only be produced with --output-format dat
2025-08-03 22:00:05 +02:00
compilade 0a2f5496be
imatrix : fix 3d activation handling for hybrid and recurrent models (#14994)
* imatrix : use a single count for dense 3d tensors

* imatrix : fix 3d activations when model tensor is 2d

* imatrix : fix 3d tensor counts
2025-08-03 21:49:13 +02:00
Ed Addario f1c2a4ca3f
Fix printing l2 norm when calc_mode = 1
2025-08-03 17:14:46 +01:00
Ed Addario 90cb1be99d
Minor cosmetic changes
2025-08-03 16:57:27 +01:00
Ed Addario 2117c4e54b
Update aggregated statistic report layout
2025-08-03 16:38:02 +01:00
Ed Addario a6155a8125
Add compute_layer_statistics() function
2025-08-03 16:35:03 +01:00
Ed Addario be60469f25
Refactor function names
2025-08-03 15:10:17 +01:00
Ed Addario fce05aac9e
Refactor lambda into compute_tensor_averages() function
2025-08-03 13:03:21 +01:00
Ed Addario 5324558132
Update table layout
2025-08-03 10:28:47 +01:00
Ed Addario 4d1325e1eb
Refactor variables
2025-08-03 10:28:23 +01:00
Ed Addario a32a2ecbed
Reformat report layout
2025-08-03 00:51:33 +01:00
Ed Addario 4c01f51ae1
Remove inactive
2025-08-03 00:51:12 +01:00
Ed Addario fc8f92596f
Update table display
2025-08-02 16:46:27 +01:00
Ed Addario ee2509f563
Adjust threshold
2025-08-02 16:45:56 +01:00
Ed Addario 9b841eb696
Compute l2 norm
2025-08-02 16:45:09 +01:00
Ed Addario b7fb362d8e
Compute cosine similarity based on activations
2025-08-02 16:43:49 +01:00
Ed Addario cce514a392
Compute entropy for activations
2025-08-02 16:40:40 +01:00
Ed Addario 9744a4a1c6
Determine calculation mode
2025-08-02 16:36:12 +01:00
Ed Addario 78ddb475de
Fix problem when GGUF does not have in_sum
2025-08-02 16:31:21 +01:00
Ed Addario 2097f038b0
Refactor variable names
2025-07-31 20:46:40 +01:00
Ed Addario 09bc7c24e7
Use activations to calculate the stats
2025-07-26 17:06:41 +01:00
Ed Addario d1aa0cc5d1
imatrix: add option to display importance score statistics for a given imatrix file (#12718)
* Add --show-statistics option

* Add --show-statistics logic

* Add tensor name parsing

* Tidy output format

* Fix typo in title

* Improve tensor influence ranking

* Add better statistics

* Change statistics' sort order

* Add Cosine Similarity

* Add header search path

* Change header search path to private

* Add weighted statistics per layer

* Update report title

* Refactor compute_statistics out of main

* Refactor compute_cossim out of load_imatrix

* Refactor compute_statistics out of load_imatrix

* Move imatrix statistics calculation into its own functions

* Add checks and validations

* Remove unnecessary include directory

* Rename labels

* Add m_stats getter and refactor compute_statistics out of load_imatrix

* Refactor variable names

* Minor cosmetic change

* Retrigger checks (empty commit)

* Rerun checks (empty commit)

* Fix unnecessary type promotion

Co-authored-by: compilade <git@compilade.net>

* Reverting change to improve code readability

* Rerun checks (empty commit)

* Rerun checks (empty commit)

* Rerun checks - third time's the Charm 🤞 (empty commit)

* Minor cosmetic change

* Update README

* Fix typo

* Update README

* Rerun checks (empty commit)

* Re-implement changes on top of #9400

* Update README.md

* Update README

* Update README.md

Co-authored-by: compilade <git@compilade.net>

* Update README.md

Co-authored-by: compilade <git@compilade.net>

* Update README.md

* Remove duplicate option in print_usage()

* Update README.md

* Update README.md

Co-authored-by: compilade <git@compilade.net>

* Update README.md

Co-authored-by: compilade <git@compilade.net>

* Remove input check

* Remove commented out code

---------

Co-authored-by: compilade <git@compilade.net>
2025-07-22 14:33:37 +02:00
compilade 90083283ec
imatrix : use GGUF to store importance matrices (#9400)
* imatrix : allow processing multiple chunks per batch

* perplexity : simplify filling the batch

* imatrix : fix segfault when using a single chunk per batch

* imatrix : use GGUF to store imatrix data

* imatrix : fix conversion problems

* imatrix : use FMA and sort tensor names

* py : add requirements for legacy imatrix convert script

* perplexity : revert changes

* py : include imatrix converter requirements in toplevel requirements

* imatrix : avoid using designated initializers in C++

* imatrix : remove unused n_entries

* imatrix : allow loading mis-ordered tensors

Sums and counts tensors no longer need to be consecutive.

* imatrix : more sanity checks when loading multiple imatrix files

* imatrix : use ggml_format_name instead of std::string concatenation

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

* quantize : use unused imatrix chunk_size with LLAMA_TRACE

* common : use GGUF for imatrix output by default

* imatrix : two-way conversion between old format and GGUF

* convert : remove imatrix to gguf python script

* imatrix : use the function name in more error messages

* imatrix : don't use FMA explicitly

This should make comparisons between the formats easier
because this matches the behavior of the previous version.

* imatrix : avoid returning from void function save_imatrix

* imatrix : support 3d tensors with MUL_MAT

* quantize : fix dataset name loading from gguf imatrix

* common : move string_remove_suffix from quantize and imatrix

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* imatrix : add warning when legacy format is written

* imatrix : warn when writing partial data, to help guess dataset coverage

Also make the legacy format store partial data
by using neutral values for missing data.
This matches what is done at read-time for the new format,
and so should get the same quality in case the old format is still used.

* imatrix : avoid loading model to convert or combine imatrix

* imatrix : avoid using imatrix.dat in README

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-07-19 12:51:22 -04:00
Georgi Gerganov 745aa5319b
llama : deprecate llama_kv_self_ API (#14030)
* llama : deprecate llama_kv_self_ API

ggml-ci

* llama : allow llama_memory_(nullptr)

ggml-ci

* memory : add flag for optional data clear in llama_memory_clear

ggml-ci
2025-06-06 14:11:15 +03:00
Bartowski efb8b47eda
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
* Add --parse-special for enabling parsing of special tokens in imatrix calculation

* whitespace
2025-05-09 11:53:58 +02:00
Georgi Gerganov 51fb96b1ff
context : remove logits_all flag (#13284)
* context : remove logits_all flag

ggml-ci

* llama : remove logits_all flag + reorder llama_context_params

ggml-ci
2025-05-08 14:26:50 +03:00