gemma.cpp/compression
Jan Wassenberg 8532da47f7 Major refactor of allocator/args:
use new ThreadingContext2 instead of monostate/init in each frontend
Add ThreadingArgs (replaces AppArgs)

backprop: use Packed() accessor, MakePacked factory, and row-based access to allow for stride
compress_weights: remove, moving to py-only exporter instead

Move MatPtr to mat.h and revise interface:
- Generic MatOwner
- rename accessors to Packed*
- support stride/row accessors, fix RowPtr stride

Add TypeBits(Type)
Move GenerateMat to test_util-inl for sharing between matmul test/bench
Move internal init to gemma.cc to avoid duplication
Rename GemmaEnv model_ to gemma_ for disambiguating vs upcoming ModelStorage
Remove --compressed_weights, use --weights instead.
tensor_index: add ExtentsFromInfo and TensorIndexLLM/Img
Allocator: use normal unique_ptr for AllocBytes so users can call directly
threading: use -> because AlignedPtr no longer assumes arrays
PiperOrigin-RevId: 745918637
2025-04-10 01:29:54 -07:00
python Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00
BUILD.bazel Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00
analyze.h Major compression update, arbitrary-len unpack + new Dot 2024-09-10 02:22:19 -07:00
blob_compare.cc Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00
blob_store.cc Expose BlobReader::Keys() 2024-11-07 10:28:39 -08:00
blob_store.h Added the TensorInfo arg to the compressor so the shape and scale can be output correctly to the file in future. 2024-12-11 01:26:35 -08:00
blob_store_test.cc Expose BlobReader::Keys() 2024-11-07 10:28:39 -08:00
compress-inl.h Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00
compress.cc Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00
compress.h Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00
compress_test.cc Allow conversion, loading and inference with NUQ. 2025-02-05 07:45:54 -08:00
convert_weights.py Cleanup: move util/compress and convert_weights to compression/ 2024-07-05 04:16:52 -07:00
distortion.h Refactor/cleanup, remove even_odd 2024-09-04 09:25:13 -07:00
distortion_test.cc Major compression update, arbitrary-len unpack + new Dot 2024-09-10 02:22:19 -07:00
fields.cc Minor cleanup: enable 0,0 Extents2D, add SerializedSpan typedef, include fixes 2025-04-08 03:35:55 -07:00
fields.h Minor cleanup: enable 0,0 Extents2D, add SerializedSpan typedef, include fixes 2025-04-08 03:35:55 -07:00
fields_test.cc Added ability to load/save a complete model file, including tokenizer. 2024-12-19 07:59:41 -08:00
io.cc Further improve IO, enable multiple backends without -D. 2024-04-19 00:40:29 -07:00
io.h Major duplicated code reduction in test/benchmarks 2024-06-14 00:16:25 -07:00
io_win.cc Further improve IO, enable multiple backends without -D. 2024-04-19 00:40:29 -07:00
migrate_weights.cc Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00
nuq-inl.h Fix nuq Enc() to handle groups < kGroupSize. 2025-02-10 07:17:59 -08:00
nuq_test.cc Base interleaved handling for 4.5-bit NUQ, specifically Enc, DecompressAndZeroPad, and Dec2. Includes tests. 2025-01-31 10:35:32 -08:00
sfp-inl.h Major compression update, arbitrary-len unpack + new Dot 2024-09-10 02:22:19 -07:00
sfp_test.cc Major compression update, arbitrary-len unpack + new Dot 2024-09-10 02:22:19 -07:00
shared.h Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00
test_util-inl.h Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00