Ed Addario
229109f329
Increase importance boost for final pass
2025-11-29 10:31:39 +00:00
Ed Addario
5b557ca958
Minor refactoring
2025-11-29 10:30:20 +00:00
Ed Addario
6616008420
Use more descriptive option naming
2025-11-24 18:26:45 +00:00
Ed Addario
1c9993e131
Add --disable-tensor-importance option
2025-11-23 17:51:04 +00:00
Ed Addario
9ec3e6e262
Remove processing statistics_data
2025-11-23 17:49:53 +00:00
Ed Addario
a0ba913613
Fix lambda capture bug in Windows and initialise candidate_types struct
2025-11-19 11:19:44 +00:00
Ed Addario
ac8cfbdd12
Improved is_important() logic
2025-11-17 18:03:09 +00:00
Ed Addario
b02b1b2304
Merge branch 'master' into quantize
2025-10-31 23:20:17 +00:00
Ed Addario
c59bb6d49d
Add Euclidean-Cosine score to identify important tensors
2025-10-30 22:11:40 +00:00
Ed Addario
6e32244a06
Read statistics from imatrix
2025-10-30 21:53:07 +00:00
Jan Boon
d7395115ba
llama : use std::abs instead of abs (#16853)
2025-10-30 08:30:58 +02:00
Ed Addario
f8863b9a80
Minor refactoring
2025-10-28 15:22:32 +00:00
Ed Addario
5303212324
Simplify tensor selection
2025-10-26 17:40:52 +00:00
Ed Addario
d6ccd5649a
Finetune heuristics
2025-10-25 12:09:20 +01:00
Ed Addario
04561d5782
Update epsilon specifier
2025-10-21 12:53:26 +01:00
Ed Addario
27bf25e93c
Fix lambda capture
2025-10-20 22:04:35 +01:00
Ed Addario
543b5a99db
Fix lambda capture
2025-10-20 21:57:03 +01:00
Ed Addario
fa1df81d49
Finetune heuristics
2025-10-20 20:52:23 +01:00
Ed Addario
41a0069613
Merge branch 'master' into quantize
2025-10-16 22:20:04 +01:00
Ed Addario
a5103933bb
Minor refactoring
2025-10-16 15:11:48 +01:00
Ed Addario
0b3e930d52
Add option to override bpw state file name
2025-10-16 11:41:26 +01:00
Ed Addario
a6853ea2ae
Add tensor type and depth heuristics
2025-10-16 11:20:24 +01:00
Xuan-Son Nguyen
3e3cb19f64
llama-quant: add support for mmproj (#16592)
* llama-quant: add support for mmproj
* Update src/llama.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* check prefix instead
* small fix
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-10-15 14:48:08 +02:00
Ed Addario
b7911f1431
Minor refactoring
2025-10-13 17:46:45 +01:00
Ed Addario
cd734b89ce
Update quant types
2025-10-13 15:15:23 +01:00
Ed Addario
b1b58e67df
Refactor signal handlers
2025-10-13 14:54:32 +01:00
Ed Addario
ca282302b5
Add --keep-bpw-state option
2025-10-12 18:23:23 +01:00
Ed Addario
b6094a97bf
Add quant types
2025-10-12 16:30:35 +01:00
Ed Addario
12e0524f3a
Reduce compute time by parallelising tensor processing - courtesy of https://github.com/ddh0
2025-10-12 15:12:15 +01:00
Ed Addario
5b0d3f6d5a
Automatically determine if bias error is significant
2025-10-11 10:04:48 +01:00
Ed Addario
c93131cef6
Remove --no-bias option
2025-10-10 13:26:51 +01:00
Ed Addario
3a3d807fc3
Remove bias mode computation
2025-10-10 13:10:42 +01:00
Ed Addario
c11184a3c1
Generate model ID hash
2025-10-09 11:58:01 +01:00
Ed Addario
044fa783c7
Fix trimming logic
2025-10-06 21:40:37 +01:00
Ed Addario
84ada44894
Uninstall signal handler and cleanup
2025-10-05 20:20:56 +01:00
Ed Addario
46706cec28
Persist progress
2025-10-05 20:20:28 +01:00
Ed Addario
74c62ed4e6
Add delete_bpw_state()
2025-10-05 20:19:03 +01:00
Ed Addario
02c3073b81
Add load_bpw_state()
2025-10-05 20:18:36 +01:00
Ed Addario
e48ca32f19
Add save_bpw_state()
2025-10-05 20:17:27 +01:00
Ed Addario
533cda3076
Add signal handler
2025-10-05 20:16:33 +01:00
Ed Addario
560e8c9d70
Relax lambda clamping
2025-10-05 14:41:42 +01:00
Ed Addario
f5d8811ddd
Prioritise important tensors
2025-10-01 19:04:43 +01:00
Ed Addario
b3b8a111a5
Compute rows based on tensor shape and slice count
2025-09-28 18:45:25 +01:00
Ed Addario
e49e241d37
Calculate bpw over all tensors
2025-09-27 17:28:39 +01:00
Ed Addario
3d75b14c0f
Simplify dequantisation
2025-09-27 17:27:58 +01:00
Ed Addario
8a2c71f471
Check for direction reversal
2025-09-27 17:27:29 +01:00
Ed Addario
87cba65908
Tighten worker allocator
2025-09-27 17:26:30 +01:00
Ed Addario
d16945730e
Refactor outlier trimming
2025-09-27 17:25:29 +01:00
Ed Addario
dd4f4bd0b8
Reduce bpw range
2025-09-27 17:23:48 +01:00
Ed Addario
dbdd179a92
Combine quant types
2025-09-25 19:50:20 +01:00