llama.cpp

Commit Graph

Author	SHA1	Message	Date
Ed Addario	97d839c441	Using one line per variable definition Co-authored-by: compilade <git@compilade.net>	2025-08-17 08:06:15 +01:00
Ed Addario	4a487ea7e4	Use { and } around the conditionally-executed statement Co-authored-by: compilade <git@compilade.net>	2025-08-17 07:26:16 +01:00
Ed Addario	e3149a2168	Use the corresponding size Co-authored-by: compilade <git@compilade.net>	2025-08-17 07:24:27 +01:00
Ed Addario	d4b0d89115	Fix return type bug	2025-08-16 11:00:43 +01:00
Ed Addario	030ec53d7a	Remove unnecessary include	2025-08-16 10:46:09 +01:00
Ed Addario	42bfe3b2a3	Update stats output sort based on imatrix type	2025-08-15 21:12:56 +01:00
Ed Addario	dcac206f8e	Add --activation-statistics logic to avoid doubling the imatrix size by default	2025-08-09 14:49:25 +01:00
Ed Addario	c5ecdaa1a1	Add Euclidean–Cosine Score (ECS)	2025-08-07 19:04:49 +01:00
Ed Addario	5bb2def02d	Add --activation-statistics parameter	2025-08-07 17:41:21 +01:00
Ed Addario	dadd90ef73	Rename report heading	2025-08-07 14:07:48 +01:00
Ed Addario	e0d6471340	Reverse conditional logic to match convention	2025-08-07 12:04:52 +01:00
Ed Addario	3e9d53c61e	Refactor variable names	2025-08-07 12:03:24 +01:00
Ed Addario	030ed3c909	Merge branch 'master' into imatrix	2025-08-05 21:58:00 +01:00
Ed Addario	88854c9179	Refactor legacy mode	2025-08-05 14:16:45 +01:00
Ed Addario	4c3fea89d6	Update report layout	2025-08-05 13:32:59 +01:00
Ed Addario	49996a19da	Refactor variable names	2025-08-05 13:32:46 +01:00
Ed Addario	aea9b31db5	Make ZD Score two-tailed	2025-08-05 12:57:13 +01:00
Ed Addario	906548a00a	Update aggregated sum of squared activations per layer	2025-08-05 12:06:19 +01:00
Ed Addario	b37393423d	Compute aggregated (per layer) l2 norm	2025-08-05 08:54:57 +01:00
Ed Addario	5e40cf4f1c	Do not resize if in_sum is null	2025-08-05 00:18:53 +01:00
compilade	19f68fa5a4	imatrix : warn when GGUF imatrix is saved without .gguf suffix (#15076) * imatrix : add warning when suffix is not .gguf for GGUF imatrix * imatrix : only warn about suffix when output format is unspecified	2025-08-04 23:26:52 +02:00
Ed Addario	adbff66394	Merge branch 'master' into imatrix	2025-08-04 22:16:10 +01:00
Ed Addario	c39c4e2a33	Refactor variable name	2025-08-04 22:15:50 +01:00
compilade	d31192b4ee	imatrix : use GGUF by default (#14842) * imatrix : use GGUF by default * imatrix : use GGUF regardless of the output filename The legacy format can only be produced with --output-format dat	2025-08-03 22:00:05 +02:00
compilade	0a2f5496be	imatrix : fix 3d activation handling for hybrid and recurrent models (#14994) * imatrix : use a single count for dense 3d tensors * imatrix : fix 3d activations when model tensor is 2d * imatrix : fix 3d tensor counts	2025-08-03 21:49:13 +02:00
Ed Addario	f1c2a4ca3f	Fix printing l2 norm when calc_mode = 1	2025-08-03 17:14:46 +01:00
Ed Addario	90cb1be99d	Minor cosmetic changes	2025-08-03 16:57:27 +01:00
Ed Addario	2117c4e54b	Update aggregated statistic report layout	2025-08-03 16:38:02 +01:00
Ed Addario	a6155a8125	Add compute_layer_statistics() function	2025-08-03 16:35:03 +01:00
Ed Addario	be60469f25	Refactor function names	2025-08-03 15:10:17 +01:00
Ed Addario	fce05aac9e	Refactor lambda into compute_tensor_averages() function	2025-08-03 13:03:21 +01:00
Ed Addario	5324558132	Update table layout	2025-08-03 10:28:47 +01:00
Ed Addario	4d1325e1eb	Refactor variables	2025-08-03 10:28:23 +01:00
Ed Addario	a32a2ecbed	Reformat report layout	2025-08-03 00:51:33 +01:00
Ed Addario	4c01f51ae1	Remove inactive	2025-08-03 00:51:12 +01:00
Ed Addario	fc8f92596f	Update table display	2025-08-02 16:46:27 +01:00
Ed Addario	ee2509f563	Adjust threshold	2025-08-02 16:45:56 +01:00
Ed Addario	9b841eb696	Compute l2 norm	2025-08-02 16:45:09 +01:00
Ed Addario	b7fb362d8e	Compute cosine similarity based on activations	2025-08-02 16:43:49 +01:00
Ed Addario	cce514a392	Compute entropy for activations	2025-08-02 16:40:40 +01:00
Ed Addario	9744a4a1c6	Determine calculation mode	2025-08-02 16:36:12 +01:00
Ed Addario	78ddb475de	Fix problem up when GGUF does not have in_sum	2025-08-02 16:31:21 +01:00
Ed Addario	2097f038b0	Refactor variable names	2025-07-31 20:46:40 +01:00
Ed Addario	09bc7c24e7	Use activations to calculate the stats	2025-07-26 17:06:41 +01:00
Ed Addario	d1aa0cc5d1	imatrix: add option to display importance score statistics for a given imatrix file (#12718) * Add --show-statistics option * Add --show-statistics logic * Add tensor name parsing * Tidy output format * Fix typo in title * Improve tensor influence ranking * Add better statistics * Change statistics' sort order * Add Cosine Similarity * Add header search path * Change header search path to private * Add weighted statistics per layer * Update report title * Refactor compute_statistics out of main * Refactor compute_cossim out of load_imatrix * Refactor compute_statistics out of load_imatrix * Move imatrix statistics calculation into its own functions * Add checks and validations * Remove unnecessary include directory * Rename labels * Add m_stats getter and refactor compute_statistics out of load_imatrix * Refactor variable names * Minor cosmetic change * Retrigger checks (empty commit) * Rerun checks (empty commit) * Fix unnecessary type promotion Co-authored-by: compilade <git@compilade.net> * Reverting change to improve code readability * Rerun checks (empty commit) * Rerun checks (empty commit) * Rerun checks - third time's the Charm 🤞 (empty commit) * Minor cosmetic change * Update README * Fix typo * Update README * Rerun checks (empty commit) * Re-implement changes on top of #9400 * Update README.md * Update README * Update README.md Co-authored-by: compilade <git@compilade.net> * Update README.md Co-authored-by: compilade <git@compilade.net> * Update README.md * Remove duplicate option in print_usage() * Update README.md * Update README.md Co-authored-by: compilade <git@compilade.net> * Update README.md Co-authored-by: compilade <git@compilade.net> * Remove input check * Remove commented out code --------- Co-authored-by: compilade <git@compilade.net>	2025-07-22 14:33:37 +02:00
compilade	90083283ec	imatrix : use GGUF to store importance matrices (#9400) * imatrix : allow processing multiple chunks per batch * perplexity : simplify filling the batch * imatrix : fix segfault when using a single chunk per batch * imatrix : use GGUF to store imatrix data * imatrix : fix conversion problems * imatrix : use FMA and sort tensor names * py : add requirements for legacy imatrix convert script * perplexity : revert changes * py : include imatrix converter requirements in toplevel requirements * imatrix : avoid using designated initializers in C++ * imatrix : remove unused n_entries * imatrix : allow loading mis-ordered tensors Sums and counts tensors no longer need to be consecutive. * imatrix : more sanity checks when loading multiple imatrix files * imatrix : use ggml_format_name instead of std::string concatenation Co-authored-by: Xuan Son Nguyen <son@huggingface.co> * quantize : use unused imatrix chunk_size with LLAMA_TRACE * common : use GGUF for imatrix output by default * imatrix : two-way conversion between old format and GGUF * convert : remove imatrix to gguf python script * imatrix : use the function name in more error messages * imatrix : don't use FMA explicitly This should make comparisons between the formats easier because this matches the behavior of the previous version. * imatrix : avoid returning from void function save_imatrix * imatrix : support 3d tensors with MUL_MAT * quantize : fix dataset name loading from gguf imatrix * common : move string_remove_suffix from quantize and imatrix Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * imatrix : add warning when legacy format is written * imatrix : warn when writing partial data, to help guess dataset coverage Also make the legacy format store partial data by using neutral values for missing data. This matches what is done at read-time for the new format, and so should get the same quality in case the old format is still used. * imatrix : avoid loading model to convert or combine imatrix * imatrix : avoid using imatrix.dat in README --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co> Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-07-19 12:51:22 -04:00
Georgi Gerganov	745aa5319b	llama : deprecate llama_kv_self_ API (#14030) * llama : deprecate llama_kv_self_ API ggml-ci * llama : allow llama_memory_(nullptr) ggml-ci * memory : add flag for optional data clear in llama_memory_clear ggml-ci	2025-06-06 14:11:15 +03:00
Bartowski	efb8b47eda	imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389) * Add --parse-special for enabling parsing of special tokens in imatrix calculation * whitespace	2025-05-09 11:53:58 +02:00
Georgi Gerganov	51fb96b1ff	context : remove logits_all flag (#13284) * context : remove logits_all flag ggml-ci * llama : remove logits_all flag + reorder llama_context_params ggml-ci	2025-05-08 14:26:50 +03:00
Johannes Gäßler	3e959f0976	imatrix: fix oob writes if src1 is not contiguous (#13286)	2025-05-04 00:50:37 +02:00

51 Commits