Commit Graph

10 Commits

Author SHA1 Message Date
Jan Wassenberg 2b4c16e243 Remove Griffin support
Also add IsObsolete helper

PiperOrigin-RevId: 803376921
2025-09-05 02:35:40 -07:00
Charles Zhao 50ee1a3e92 Write SBS progressively.
(1) Directly write to file in BlobWriter::Add and destruct the MatOwner to release the rams.

(2) Write a fake header to indicate this is V2, and write correct header and directory at the end of the file.

(3) Tested on loading sbs written the old way, and new way, both worked.

PiperOrigin-RevId: 789306837
2025-07-31 06:05:38 -07:00
Jan Wassenberg cf4d7ceb82 1.16x decode speedup: remove last MatVec in Attention
Precompute row pointers.
Remove no longer used MHA support; QStride -> qkv_dim.
Remove RowPtr from MatMul interface, use only MatPtrT.
Require opt-in define for NUQ to speed up builds.
Also fix io.cc on Windows.

PiperOrigin-RevId: 766228108
2025-06-02 09:40:29 -07:00
Jan Wassenberg 8d0882b966 Huge refactor of weight handling and model loading.
Weight handling:
- new ModelStore2 supports both pre-2025 multi-file and single-file formats
- simpler ForEachTensor with TensorArgs
- tensors are constructed with their full suffixed name

I/O:
- support mmap and stride
- Simplified SbsWriter, single insert(); add SbsReader

Misc:
- kMockTokenizer: allow creating with unavailable tokenizer
- configs.h: Simpler enum validity checks via kSentinel
- matmul.h: remove unused enable_bind (now in allocator.h)
- tensor_info: single TensorInfoRegistry class, rename from tensor_index.h

Frontends:
- Replace Allocate/CreateGemma with ctor(LoaderArgs, MatMulEnv&)
- Deduce model/weight type, remove --model and parsing
- Replace most common.h includes with configs.h
- Remove --compressed_weights, use --weights instead
- Remove ModelInfo, replaced by ModelConfig.

Backprop:
- Reduce max loss, remove backward_scalar_test (timeout)
- Update thresholds because new RandInit changes rng eval order and thus numerics
PiperOrigin-RevId: 755317484
2025-05-06 04:44:21 -07:00
Phil Culliton 7ccc6abe87 Allow conversion, loading and inference with NUQ.
PiperOrigin-RevId: 723507890
2025-02-05 07:45:54 -08:00
Daniel Keysers 7af2e70321 Add python wrappers for configs and inference.
Enable building compression/python/compression_test using bazel.
Add default image path for image_test and paligemma_test.

PiperOrigin-RevId: 720583438
2025-01-28 08:22:03 -08:00
Ray Smith e69bc3bc1c Added the TensorInfo arg to the compressor so the shape and scale can be output correctly to the file in future.
Corrected some errors in the TensorIndex.

PiperOrigin-RevId: 705014619
2024-12-11 01:26:35 -08:00
Ray Smith 3d1625d8c5 Improved consistency of compressor API, and added a universal method with a target type arg.
Moved configs pybind up to root level.

PiperOrigin-RevId: 698743417
2024-11-21 05:27:40 -08:00
Daniel Keysers 1c8ddcdffe Adds insert_float() to SbsWriter() to store a float array directly.
PiperOrigin-RevId: 673982528
2024-09-12 13:27:24 -07:00
Jan Wassenberg 41efec4dba Add Py bindings for weight compression
TODO: this uses clif instead of pybind11, and depends on absl.

PiperOrigin-RevId: 649575815
2024-07-05 01:06:00 -07:00