Ed Addario
97d839c441
Use one line per variable definition
Co-authored-by: compilade <git@compilade.net>
2025-08-17 08:06:15 +01:00
Ed Addario
4a487ea7e4
Use { and } around the conditionally-executed statement
Co-authored-by: compilade <git@compilade.net>
2025-08-17 07:26:16 +01:00
Ed Addario
e3149a2168
Use the corresponding size
Co-authored-by: compilade <git@compilade.net>
2025-08-17 07:24:27 +01:00
Ed Addario
d4b0d89115
Fix return type bug
2025-08-16 11:00:43 +01:00
Ed Addario
030ec53d7a
Remove unnecessary include
2025-08-16 10:46:09 +01:00
Ed Addario
42bfe3b2a3
Update stats output sort based on imatrix type
2025-08-15 21:12:56 +01:00
Ed Addario
dcac206f8e
Add --activation-statistics logic to avoid doubling the imatrix size by default
2025-08-09 14:49:25 +01:00
Ed Addario
c5ecdaa1a1
Add Euclidean–Cosine Score (ECS)
2025-08-07 19:04:49 +01:00
Ed Addario
5bb2def02d
Add --activation-statistics parameter
2025-08-07 17:41:21 +01:00
Ed Addario
dadd90ef73
Rename report heading
2025-08-07 14:07:48 +01:00
Ed Addario
e0d6471340
Reverse conditional logic to match convention
2025-08-07 12:04:52 +01:00
Ed Addario
3e9d53c61e
Refactor variable names
2025-08-07 12:03:24 +01:00
Ed Addario
030ed3c909
Merge branch 'master' into imatrix
2025-08-05 21:58:00 +01:00
Ed Addario
88854c9179
Refactor legacy mode
2025-08-05 14:16:45 +01:00
Ed Addario
4c3fea89d6
Update report layout
2025-08-05 13:32:59 +01:00
Ed Addario
49996a19da
Refactor variable names
2025-08-05 13:32:46 +01:00
Ed Addario
aea9b31db5
Make ZD Score two-tailed
2025-08-05 12:57:13 +01:00
Ed Addario
906548a00a
Update aggregated sum of squared activations per layer
2025-08-05 12:06:19 +01:00
Ed Addario
b37393423d
Compute aggregated (per layer) l2 norm
2025-08-05 08:54:57 +01:00
Ed Addario
5e40cf4f1c
Do not resize if in_sum is null
2025-08-05 00:18:53 +01:00
compilade
19f68fa5a4
imatrix : warn when GGUF imatrix is saved without .gguf suffix (#15076)
* imatrix : add warning when suffix is not .gguf for GGUF imatrix
* imatrix : only warn about suffix when output format is unspecified
2025-08-04 23:26:52 +02:00
Ed Addario
adbff66394
Merge branch 'master' into imatrix
2025-08-04 22:16:10 +01:00
Ed Addario
c39c4e2a33
Refactor variable name
2025-08-04 22:15:50 +01:00
compilade
d31192b4ee
imatrix : use GGUF by default (#14842)
* imatrix : use GGUF by default
* imatrix : use GGUF regardless of the output filename
The legacy format can only be produced with --output-format dat
2025-08-03 22:00:05 +02:00
compilade
0a2f5496be
imatrix : fix 3d activation handling for hybrid and recurrent models (#14994)
* imatrix : use a single count for dense 3d tensors
* imatrix : fix 3d activations when model tensor is 2d
* imatrix : fix 3d tensor counts
2025-08-03 21:49:13 +02:00
Ed Addario
f1c2a4ca3f
Fix printing l2 norm when calc_mode = 1
2025-08-03 17:14:46 +01:00
Ed Addario
90cb1be99d
Minor cosmetic changes
2025-08-03 16:57:27 +01:00
Ed Addario
2117c4e54b
Update aggregated statistic report layout
2025-08-03 16:38:02 +01:00
Ed Addario
a6155a8125
Add compute_layer_statistics() function
2025-08-03 16:35:03 +01:00
Ed Addario
be60469f25
Refactor function names
2025-08-03 15:10:17 +01:00
Ed Addario
fce05aac9e
Refactor lambda into compute_tensor_averages() function
2025-08-03 13:03:21 +01:00
Ed Addario
5324558132
Update table layout
2025-08-03 10:28:47 +01:00
Ed Addario
4d1325e1eb
Refactor variables
2025-08-03 10:28:23 +01:00
Ed Addario
a32a2ecbed
Reformat report layout
2025-08-03 00:51:33 +01:00
Ed Addario
4c01f51ae1
Remove inactive
2025-08-03 00:51:12 +01:00
Ed Addario
fc8f92596f
Update table display
2025-08-02 16:46:27 +01:00
Ed Addario
ee2509f563
Adjust threshold
2025-08-02 16:45:56 +01:00
Ed Addario
9b841eb696
Compute l2 norm
2025-08-02 16:45:09 +01:00
Ed Addario
b7fb362d8e
Compute cosine similarity based on activations
2025-08-02 16:43:49 +01:00
Ed Addario
cce514a392
Compute entropy for activations
2025-08-02 16:40:40 +01:00
Ed Addario
9744a4a1c6
Determine calculation mode
2025-08-02 16:36:12 +01:00
Ed Addario
78ddb475de
Fix problem when GGUF does not have in_sum
2025-08-02 16:31:21 +01:00
Ed Addario
2097f038b0
Refactor variable names
2025-07-31 20:46:40 +01:00
Ed Addario
09bc7c24e7
Use activations to calculate the stats
2025-07-26 17:06:41 +01:00
Ed Addario
d1aa0cc5d1
imatrix: add option to display importance score statistics for a given imatrix file (#12718)
* Add --show-statistics option
* Add --show-statistics logic
* Add tensor name parsing
* Tidy output format
* Fix typo in title
* Improve tensor influence ranking
* Add better statistics
* Change statistics' sort order
* Add Cosine Similarity
* Add header search path
* Change header search path to private
* Add weighted statistics per layer
* Update report title
* Refactor compute_statistics out of main
* Refactor compute_cossim out of load_imatrix
* Refactor compute_statistics out of load_imatrix
* Move imatrix statistics calculation into its own functions
* Add checks and validations
* Remove unnecessary include directory
* Rename labels
* Add m_stats getter and refactor compute_statistics out of load_imatrix
* Refactor variable names
* Minor cosmetic change
* Retrigger checks (empty commit)
* Rerun checks (empty commit)
* Fix unnecessary type promotion
Co-authored-by: compilade <git@compilade.net>
* Reverting change to improve code readability
* Rerun checks (empty commit)
* Rerun checks (empty commit)
* Rerun checks - third time's the Charm 🤞 (empty commit)
* Minor cosmetic change
* Update README
* Fix typo
* Update README
* Rerun checks (empty commit)
* Re-implement changes on top of #9400
* Update README.md
* Update README
* Update README.md
Co-authored-by: compilade <git@compilade.net>
* Update README.md
Co-authored-by: compilade <git@compilade.net>
* Update README.md
* Remove duplicate option in print_usage()
* Update README.md
* Update README.md
Co-authored-by: compilade <git@compilade.net>
* Update README.md
Co-authored-by: compilade <git@compilade.net>
* Remove input check
* Remove commented out code
---------
Co-authored-by: compilade <git@compilade.net>
2025-07-22 14:33:37 +02:00
compilade
90083283ec
imatrix : use GGUF to store importance matrices (#9400)
* imatrix : allow processing multiple chunks per batch
* perplexity : simplify filling the batch
* imatrix : fix segfault when using a single chunk per batch
* imatrix : use GGUF to store imatrix data
* imatrix : fix conversion problems
* imatrix : use FMA and sort tensor names
* py : add requirements for legacy imatrix convert script
* perplexity : revert changes
* py : include imatrix converter requirements in toplevel requirements
* imatrix : avoid using designated initializers in C++
* imatrix : remove unused n_entries
* imatrix : allow loading mis-ordered tensors
Sums and counts tensors no longer need to be consecutive.
* imatrix : more sanity checks when loading multiple imatrix files
* imatrix : use ggml_format_name instead of std::string concatenation
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
* quantize : use unused imatrix chunk_size with LLAMA_TRACE
* common : use GGUF for imatrix output by default
* imatrix : two-way conversion between old format and GGUF
* convert : remove imatrix to gguf python script
* imatrix : use the function name in more error messages
* imatrix : don't use FMA explicitly
This should make comparisons between the formats easier
because this matches the behavior of the previous version.
* imatrix : avoid returning from void function save_imatrix
* imatrix : support 3d tensors with MUL_MAT
* quantize : fix dataset name loading from gguf imatrix
* common : move string_remove_suffix from quantize and imatrix
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* imatrix : add warning when legacy format is written
* imatrix : warn when writing partial data, to help guess dataset coverage
Also make the legacy format store partial data
by using neutral values for missing data.
This matches what is done at read-time for the new format,
and so should get the same quality in case the old format is still used.
* imatrix : avoid loading model to convert or combine imatrix
* imatrix : avoid using imatrix.dat in README
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-07-19 12:51:22 -04:00
Georgi Gerganov
745aa5319b
llama : deprecate llama_kv_self_ API (#14030)
* llama : deprecate llama_kv_self_ API
ggml-ci
* llama : allow llama_memory_(nullptr)
ggml-ci
* memory : add flag for optional data clear in llama_memory_clear
ggml-ci
2025-06-06 14:11:15 +03:00
Bartowski
efb8b47eda
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
* Add --parse-special for enabling parsing of special tokens in imatrix calculation
* whitespace
2025-05-09 11:53:58 +02:00
Georgi Gerganov
51fb96b1ff
context : remove logits_all flag (#13284)
* context : remove logits_all flag
ggml-ci
* llama : remove logits_all flag + reorder llama_context_params
ggml-ci
2025-05-08 14:26:50 +03:00
Johannes Gäßler
3e959f0976
imatrix: fix oob writes if src1 is not contiguous (#13286)
2025-05-04 00:50:37 +02:00