Commit Graph

161 Commits

Author SHA1 Message Date
Ed Addario 229109f329
Increase importance boost for final pass 2025-11-29 10:31:39 +00:00
Ed Addario 5b557ca958
Minor refactoring 2025-11-29 10:30:20 +00:00
Ed Addario 6616008420
Use more descriptive option naming 2025-11-24 18:26:45 +00:00
Ed Addario 1c9993e131
Add --disable-tensor-importance option 2025-11-23 17:51:04 +00:00
Ed Addario 9ec3e6e262
Remove processing statistics_data 2025-11-23 17:49:53 +00:00
Ed Addario a0ba913613
Fix lambda capture bug in Windows and initialise candidate_types struct 2025-11-19 11:19:44 +00:00
Ed Addario ac8cfbdd12
Improved is_important() logic 2025-11-17 18:03:09 +00:00
Ed Addario b02b1b2304
Merge branch 'master' into quantize 2025-10-31 23:20:17 +00:00
Ed Addario c59bb6d49d
Add Euclidean-Cosine score to identify important tensors 2025-10-30 22:11:40 +00:00
Ed Addario 6e32244a06
Read statistics from imatrix 2025-10-30 21:53:07 +00:00
Jan Boon d7395115ba
llama : use std::abs instead of abs (#16853) 2025-10-30 08:30:58 +02:00
Ed Addario f8863b9a80
Minor refactoring 2025-10-28 15:22:32 +00:00
Ed Addario 5303212324
Simplify tensor selection 2025-10-26 17:40:52 +00:00
Ed Addario d6ccd5649a
Finetune heuristics 2025-10-25 12:09:20 +01:00
Ed Addario 04561d5782
Update epsilon specifier 2025-10-21 12:53:26 +01:00
Ed Addario 27bf25e93c
Fix lambda capture 2025-10-20 22:04:35 +01:00
Ed Addario 543b5a99db
Fix lambda capture 2025-10-20 21:57:03 +01:00
Ed Addario fa1df81d49
Finetune heuristics 2025-10-20 20:52:23 +01:00
Ed Addario 41a0069613
Merge branch 'master' into quantize 2025-10-16 22:20:04 +01:00
Ed Addario a5103933bb
Minor refactoring 2025-10-16 15:11:48 +01:00
Ed Addario 0b3e930d52
Add option to override bpw state file name 2025-10-16 11:41:26 +01:00
Ed Addario a6853ea2ae
Add tensor type and depth heuristics 2025-10-16 11:20:24 +01:00
Xuan-Son Nguyen 3e3cb19f64
llama-quant: add support for mmproj (#16592) 2025-10-15 14:48:08 +02:00
* llama-quant: add support for mmproj
* Update src/llama.cpp
* check prefix instead
* small fix
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Ed Addario b7911f1431
Minor refactoring 2025-10-13 17:46:45 +01:00
Ed Addario cd734b89ce
Update quant types 2025-10-13 15:15:23 +01:00
Ed Addario b1b58e67df
Refactor signal handlers 2025-10-13 14:54:32 +01:00
Ed Addario ca282302b5
Add --keep-bpw-state option 2025-10-12 18:23:23 +01:00
Ed Addario b6094a97bf
Add quant types 2025-10-12 16:30:35 +01:00
Ed Addario 12e0524f3a
Reduce compute time by parallelising tensor processing - courtesy of https://github.com/ddh0 2025-10-12 15:12:15 +01:00
Ed Addario 5b0d3f6d5a
Automatically determine if bias error is significant 2025-10-11 10:04:48 +01:00
Ed Addario c93131cef6
Remove --no-bias option 2025-10-10 13:26:51 +01:00
Ed Addario 3a3d807fc3
Remove bias mode computation 2025-10-10 13:10:42 +01:00
Ed Addario c11184a3c1
Generate model ID hash 2025-10-09 11:58:01 +01:00
Ed Addario 044fa783c7
Fix trimming logic 2025-10-06 21:40:37 +01:00
Ed Addario 84ada44894
Uninstall signal handler and cleanup 2025-10-05 20:20:56 +01:00
Ed Addario 46706cec28
Persist progress 2025-10-05 20:20:28 +01:00
Ed Addario 74c62ed4e6
Add delete_bpw_state() 2025-10-05 20:19:03 +01:00
Ed Addario 02c3073b81
Add load_bpw_state() 2025-10-05 20:18:36 +01:00
Ed Addario e48ca32f19
Add save_bpw_state() 2025-10-05 20:17:27 +01:00
Ed Addario 533cda3076
Add signal handler 2025-10-05 20:16:33 +01:00
Ed Addario 560e8c9d70
Relax lambda clamping 2025-10-05 14:41:42 +01:00
Ed Addario f5d8811ddd
Prioritise important tensors 2025-10-01 19:04:43 +01:00
Ed Addario b3b8a111a5
Compute rows based on tensor shape and slice count 2025-09-28 18:45:25 +01:00
Ed Addario e49e241d37
Calculate bpw over all tensors 2025-09-27 17:28:39 +01:00
Ed Addario 3d75b14c0f
Simplify dequantisation 2025-09-27 17:27:58 +01:00
Ed Addario 8a2c71f471
Check for direction reversal 2025-09-27 17:27:29 +01:00
Ed Addario 87cba65908
Tighten worker allocator 2025-09-27 17:26:30 +01:00
Ed Addario d16945730e
Refactor outlier trimming 2025-09-27 17:25:29 +01:00
Ed Addario dd4f4bd0b8
Reduce bpw range 2025-09-27 17:23:48 +01:00
Ed Addario dbdd179a92
Combine quant types 2025-09-25 19:50:20 +01:00