Commit Graph

23 Commits

Author SHA1 Message Date
Jan Wassenberg c8d92948f4 Move fields, io* and blob* from compression/ into io/
PiperOrigin-RevId: 755445712
2025-05-06 11:17:19 -07:00
Jan Wassenberg 275135d7e8 Rename-only: remove Allocator2 etc suffixes now that refactoring is complete
PiperOrigin-RevId: 755397220
2025-05-06 09:12:43 -07:00
Jan Wassenberg 8d0882b966 Huge refactor of weight handling and model loading.
Weight handling:
- new ModelStore2 supports both pre-2025 multi-file and single-file formats
- simpler ForEachTensor with TensorArgs
- tensors are constructed with their full suffixed name

I/O:
- support mmap and stride
- Simplified SbsWriter, single insert(); add SbsReader

Misc:
- kMockTokenizer: allow creating with unavailable tokenizer
- configs.h: Simpler enum validity checks via kSentinel
- matmul.h: remove unused enable_bind (now in allocator.h)
- tensor_info: single TensorInfoRegistry class, rename from tensor_index.h

Frontends:
- Replace Allocate/CreateGemma with ctor(LoaderArgs, MatMulEnv&)
- Deduce model/weight type, remove --model and parsing
- Replace most common.h includes with configs.h
- Remove --compressed_weights, use --weights instead
- Remove ModelInfo, replaced by ModelConfig.

Backprop:
- Reduce max loss, remove backward_scalar_test (timeout)
- Update thresholds because new RandInit changes rng eval order and thus numerics
PiperOrigin-RevId: 755317484
2025-05-06 04:44:21 -07:00
Paul Chang d4050a2917 Expose BlobReader::Keys()
PiperOrigin-RevId: 694166186
2024-11-07 10:28:39 -08:00
Jan Wassenberg 19cfe14c76 Warning fixes (casts) and fix Windows build for aligned_alloc
PiperOrigin-RevId: 689734618
2024-10-25 04:14:04 -07:00
Paul Chang 4197d69dfc New blob_store_test, ensure ReadOne checks actual size against requested size
PiperOrigin-RevId: 688974390
2024-10-23 08:30:46 -07:00
Ray Smith 0d68555f87 Eliminated TConfig.
Changed CompressedLayer and CompressedWeights to be constructed with an instance of a LayerConfig and WeightsConfig respectively.
Added CompressedModel to remove ByteStorageT and get rid of most of the type casting, as well as allowing the default destructor to be used and work properly.
Adjusted WeightsWrapper and ForwardLayer etc to match.
The only remaining template arg is the weight type.
This enables all the instantiations to be deleted, apart from one per type.
It also enables (but not yet done) the config to be stored in the blob file instead of having to be specified separately.
Reduces the size of the gemma_lib and weights shared libraries by a factor of 4.3 and 3.2 respectively.

PiperOrigin-RevId: 686870060
2024-10-17 05:04:22 -07:00
Ray Smith 85958f5fd3 Added MatPtr/MatPtrT/MatStorageT/MatStorage as a dynamically-sized replacement for CompressedArray.
Definition of array size is moved to the constructor.
Allocation is separate and parallelized.
All users of weights_raw.h migrated to CompressedWeights and weights_raw.h deleted.
Replaced all previous ForEachTensor functions with a single unified function.

PiperOrigin-RevId: 684451604
2024-10-10 08:22:30 -07:00
Jan Wassenberg 13a9f76f64 Fix mismatch between blob_store and compress interfaces (bytes)
PiperOrigin-RevId: 673027268
2024-09-10 10:59:17 -07:00
Paul Chang cfce314715 Make BlobWriter::Add() accept const void*
PiperOrigin-RevId: 634780483
2024-05-17 08:11:06 -07:00
Jan Wassenberg e9a0caed87 Further improve IO, enable multiple backends without -D.
Move Path into io.h and use for opening files.
Removes dependency of gemma_lib on args.
Separate Windows codepath instead of emulating POSIX functions.

Plus lint fixes.

PiperOrigin-RevId: 626279004
2024-04-19 00:40:29 -07:00
Jan Wassenberg a8ceb75f43 Improved IO abstraction layer
Move to unique_ptr-like File class.
Move `if OS_WIN` into wrapper functions.
exists -> Exists.

PiperOrigin-RevId: 625923056
2024-04-17 23:15:07 -07:00
Jan Wassenberg a982ec1287 Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.

PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
Luca Versari 4c23932289 Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros

Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Jan Wassenberg 7122afed5a Add note on weight update and improve error message
PiperOrigin-RevId: 621849989
2024-04-04 07:17:27 -07:00
Jan Wassenberg fce5c8c967 Avoid fadvise on older Android. Fixes #84
PiperOrigin-RevId: 613815953
2024-03-07 22:19:22 -08:00
Jan Wassenberg b6aaf6bbb8 Fix for Android's 32-bit off_t. Fixes #62
PiperOrigin-RevId: 611249534
2024-02-28 15:30:19 -08:00
Dan Zheng 4c155bd3df Restore reverted changes.
Sync to 84444c93a4.

PiperOrigin-RevId: 610263918
2024-02-25 19:32:07 -08:00
Silvio Traversaro 696597383c Copybara import of the project:
--
19694e1f2e by Silvio Traversaro <silvio@traversaro.it>:

Do not pass explicitly -O2 flag to compiler in Release build

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/3 from traversaro:patch-1 19694e1f2e
PiperOrigin-RevId: 610096914
2024-02-24 20:41:33 -08:00
Dan Zheng 84444c93a4 Revert "Copybara configuration update."
This reverts commit c03b5da542.

Restore lost changes due to improper Copybara syncing.
2024-02-24 15:15:14 -08:00
Dan Zheng c03b5da542 Copybara configuration update.
PiperOrigin-RevId: 609931218
2024-02-24 12:02:47 -08:00
David Coles 39e385782c Allow building on Windows using `clang-cl` toolchain
It's not possible to build `gemma.cpp` with the standard MSVC front-end
as it doesn't support arrays more than `0x7ffffffff` bytes (see Compiler Error C2148),
however this isn't a problem with the optional Visual Studio Clang/LLVM frontend.

This can be specified using the `-T` flag when running CMake:

```
$ cmake -B build -T ClangCL
$ cmake --build build --config Release
```

Windows doesn't provide `pread`/`pwrite` so this must be emulated using
the `ReadFile`/`WriteFile` Win32 APIs.

`_CRT_SECURE_NO_WARNINGS` is defined to prevent a large number of warnings
about using "depricated" function names (e.g. `close` instead of `_close`).

`NOMINMAX` is defined to prevent the `min`/`max` macros from `windows.h`
from conflicting with expressions like `std::min`. Generally libraries should
avoid including `windows.h` in their public headers or define `WIN32_LEAN_AND_MEAN`
before including the `windows.h` header, but this unfortunately isn't always the case.
2024-02-23 00:38:54 -08:00
Austin Huang e29cd566cf initial commit 2024-02-21 03:31:22 +00:00