gemma.cpp/compression
Jan Wassenberg 8d0882b966 Huge refactor of weight handling and model loading.
Weight handling:
- new ModelStore2 supports both pre-2025 multi-file and single-file formats
- simpler ForEachTensor with TensorArgs
- tensors are constructed with their full suffixed name

I/O:
- support mmap and stride
- Simplify SbsWriter to a single insert(); add SbsReader

Misc:
- kMockTokenizer: allow creating with unavailable tokenizer
- configs.h: Simpler enum validity checks via kSentinel
- matmul.h: remove unused enable_bind (now in allocator.h)
- tensor_info: single TensorInfoRegistry class, rename from tensor_index.h
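
The kSentinel item above can be illustrated with a minimal sketch. This is an assumption about the general pattern, not the actual code in configs.h: the enum `ExampleType`, its values, and the helper `IsValid` are hypothetical names chosen for illustration. The idea is that a trailing `kSentinel` enumerator marks one past the last valid value, so a validity check becomes a single range comparison that does not need updating when enumerators are added.

```cpp
#include <cassert>

// Hypothetical enum for illustration; the real enums in configs.h may differ.
// kSentinel must remain the last enumerator.
enum class ExampleType : int { kFirst = 0, kSecond, kThird, kSentinel };

// Hypothetical helper: a value is valid iff it lies in [0, kSentinel).
constexpr bool IsValid(ExampleType t) {
  return static_cast<int>(t) >= 0 &&
         static_cast<int>(t) < static_cast<int>(ExampleType::kSentinel);
}

// Compile-time checks of the range test.
static_assert(IsValid(ExampleType::kFirst), "first value is valid");
static_assert(IsValid(ExampleType::kThird), "last real value is valid");
static_assert(!IsValid(ExampleType::kSentinel), "sentinel itself is invalid");
```

A value read from a file can be cast to the enum and passed through such a check before use, which rejects both negative and too-large values.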

Frontends:
- Replace Allocate/CreateGemma with ctor(LoaderArgs, MatMulEnv&)
- Deduce model/weight type, remove --model and parsing
- Replace most common.h includes with configs.h
- Remove --compressed_weights, use --weights instead
- Remove ModelInfo, replaced by ModelConfig

Backprop:
- Reduce max loss, remove backward_scalar_test (timeout)
- Update thresholds because the new RandInit changes RNG evaluation order and thus numerics

PiperOrigin-RevId: 755317484
2025-05-06 04:44:21 -07:00
python Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
BUILD.bazel Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
analyze.h Major compression update, arbitrary-len unpack + new Dot 2024-09-10 02:22:19 -07:00
blob_compare.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
blob_store.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
blob_store.h Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
blob_store_test.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
compress-inl.h Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
compress.cc Minor cleanup, on-demand NUQ buffer allocation 2025-04-16 10:49:43 -07:00
compress.h Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
compress_test.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
distortion.h Refactor/cleanup, remove even_odd 2024-09-04 09:25:13 -07:00
distortion_test.cc Major compression update, arbitrary-len unpack + new Dot 2024-09-10 02:22:19 -07:00
fields.cc Add mmap support (not yet used) 2025-04-10 10:03:40 -07:00
fields.h Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
fields_test.cc Added ability to load/save a complete model file, including tokenizer. 2024-12-19 07:59:41 -08:00
io.cc Add mmap support (not yet used) 2025-04-10 10:03:40 -07:00
io.h Add mmap support (not yet used) 2025-04-10 10:03:40 -07:00
io_win.cc Add mmap support (not yet used) 2025-04-10 10:03:40 -07:00
migrate_weights.cc Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
nuq-inl.h Fix nuq Enc() to handle groups < kGroupSize. 2025-02-10 07:17:59 -08:00
nuq_test.cc Base interleaved handling for 4.5-bit NUQ, specifically Enc, DecompressAndZeroPad, and Dec2. Includes tests. 2025-01-31 10:35:32 -08:00
sfp-inl.h Major compression update, arbitrary-len unpack + new Dot 2024-09-10 02:22:19 -07:00
sfp_test.cc Major compression update, arbitrary-len unpack + new Dot 2024-09-10 02:22:19 -07:00
shared.h Huge refactor of weight handling and model loading. 2025-05-06 04:44:21 -07:00
test_util-inl.h Major refactor of allocator/args: 2025-04-10 01:29:54 -07:00