Commit Graph

11 Commits

Author SHA1 Message Date
Jan Wassenberg 275135d7e8 Rename-only: remove Allocator2 etc suffixes now that refactoring is complete
PiperOrigin-RevId: 755397220
2025-05-06 09:12:43 -07:00
Jan Wassenberg 8d0882b966 Huge refactor of weight handling and model loading.
Weight handling:
- new ModelStore2 supports both pre-2025 multi-file and single-file formats
- simpler ForEachTensor with TensorArgs
- tensors are constructed with their full suffixed name

I/O:
- support mmap and stride
- Simplified SbsWriter, single insert(); add SbsReader

Misc:
- kMockTokenizer: allow creating with unavailable tokenizer
- configs.h: Simpler enum validity checks via kSentinel
- matmul.h: remove unused enable_bind (now in allocator.h)
- tensor_info: single TensorInfoRegistry class, rename from tensor_index.h

Frontends:
- Replace Allocate/CreateGemma with ctor(LoaderArgs, MatMulEnv&)
- Deduce model/weight type, remove --model and parsing
- Replace most common.h includes with configs.h
- Remove --compressed_weights, use --weights instead
- Remove ModelInfo, replaced by ModelConfig.

Backprop:
- Reduce max loss, remove backward_scalar_test (timeout)
- Update thresholds because new RandInit changes rng eval order and thus numerics
PiperOrigin-RevId: 755317484
2025-05-06 04:44:21 -07:00
Jan Wassenberg fe80f10ed7 Backprop test fixes and allocator cleanup
- Shorten backprop tests to prevent timeout
- Add line number of failing test
- matmul: remove unused enable_bind
- allocator: we will retain enable_bind there
- mat: disable cyclic padding optimization (broken)

PiperOrigin-RevId: 752656068
2025-04-29 03:01:10 -07:00
Jan Wassenberg 8532da47f7 Major refactor of allocator/args:
use new ThreadingContext2 instead of monostate/init in each frontend
Add ThreadingArgs(replaces AppArgs)

backprop: use Packed() accessor and MakePacked factory and row-based access to allow for stride
compress_weights: remove, moving to py-only exporter instead

Move MatPtr to mat.h and revise interface:
- Generic MatOwner
- rename accessors to Packed*
- support stride/row accessors, fix RowPtr stride

Add TypeBits(Type)
Move GenerateMat to test_util-inl for sharing between matmul test/bench
Move internal init to gemma.cc to avoid duplication
Rename GemmaEnv model_ to gemma_ for disambiguating vs upcoming ModelStorage
Remove --compressed_weights, use --weights instead.
tensor_index: add ExtentsFromInfo and TensorIndexLLM/Img
Allocator: use normal unique_ptr for AllocBytes so users can call directly
threading: use -> because AlignedPtr no longer assumes arrays
PiperOrigin-RevId: 745918637
2025-04-10 01:29:54 -07:00
Ray Smith 6254f2e5ca Removed duplicated tensor sizes from weights.h by changing the constructor used for MatPtrT
PiperOrigin-RevId: 705085054
2024-12-11 06:30:28 -08:00
Ray Smith 0d68555f87 Eliminated TConfig.
Changed CompressedLayer and CompressedWeights to be constructed with an instance of a LayerConfig and WeightsConfig respectively.
Added CompressedModel to remove ByteStorageT and get rid of most of the type casting, as well as allowing the default destructor to be used and work properly.
Adjusted WeightsWrapper and ForwardLayer etc to match.
The only remaining template arg is the weight type.
This enables all the instantiations to be deleted, apart from one per type.
It also enables (but not yet done) the config to be stored in the blob file instead of having to be specified separately.
Reduces the size of the gemma_lib and weights shared libraries by a factor of 4.3 and 3.2 respectively.

PiperOrigin-RevId: 686870060
2024-10-17 05:04:22 -07:00
Ray Smith 85958f5fd3 Added MatPtr/MatPtrT/MatStorageT/MatStorage as a dynamically-sized replacement for CompressedArray.
Definition of array size is moved to the constructor.
Allocation is separate and parallelized.
All users of weights_raw.h migrated to CompressedWeights and weights_raw.h deleted.
Replaced all previous ForEachTensor functions with a single unified function.

PiperOrigin-RevId: 684451604
2024-10-10 08:22:30 -07:00
Jan Wassenberg cbb67b4ee0 Move benchmark_helper to evals/, weights_raw to compression/.
PiperOrigin-RevId: 650155983
2024-07-08 01:13:23 -07:00
Jan Wassenberg e588a7f45d Add config for att/final cap, skip max-subtract. Fixes #278
Also update includes/deps for backprop/.

PiperOrigin-RevId: 648399222
2024-07-01 09:45:26 -07:00
Jan Wassenberg 7d0720675f Move raw_weights into separate header, used mainly by compress_weights.
Fix warnings in backprop/* (include)

PiperOrigin-RevId: 643983136
2024-06-17 06:17:02 -07:00
Zoltan Szabadka df01700b54 Move the backpropagation code to its own directory 2024-06-04 10:20:16 +00:00
Renamed from gemma/test_util.h (Browse further)