gemma.cpp

Commit Graph

Author	SHA1	Message	Date
Jan Wassenberg	c8d92948f4	Move fields, io* and blob* from compression/ into io/ PiperOrigin-RevId: 755445712	2025-05-06 11:17:19 -07:00
Jan Wassenberg	275135d7e8	Rename-only: remove Allocator2 etc suffixes now that refactoring is complete PiperOrigin-RevId: 755397220	2025-05-06 09:12:43 -07:00
Jan Wassenberg	8d0882b966	Huge refactor of weight handling and model loading. Weight handling: - new ModelStore2 supports both pre-2025 multi-file and single-file formats - simpler ForEachTensor with TensorArgs - tensors are constructed with their full suffixed name I/O: - support mmap and stride - Simplified SbsWriter, single insert(); add SbsReader Misc: - kMockTokenizer: allow creating with unavailable tokenizer - configs.h: Simpler enum validity checks via kSentinel - matmul.h: remove unused enable_bind (now in allocator.h) - tensor_info: single TensorInfoRegistry class, rename from tensor_index.h Frontends: - Replace Allocate/CreateGemma with ctor(LoaderArgs, MatMulEnv&) - Deduce model/weight type, remove --model and parsing - Replace most common.h includes with configs.h - Remove --compressed_weights, use --weights instead - Remove ModelInfo, replaced by ModelConfig. Backprop: - Reduce max loss, remove backward_scalar_test (timeout) - Update thresholds because new RandInit changes rng eval order and thus numerics PiperOrigin-RevId: 755317484	2025-05-06 04:44:21 -07:00
Paul Chang	d4050a2917	Expose BlobReader::Keys() PiperOrigin-RevId: 694166186	2024-11-07 10:28:39 -08:00
Jan Wassenberg	19cfe14c76	Warning fixes (casts) and fix Windows build for aligned_alloc PiperOrigin-RevId: 689734618	2024-10-25 04:14:04 -07:00
Paul Chang	4197d69dfc	New blob_store_test, ensure ReadOne checks actual size against requested size PiperOrigin-RevId: 688974390	2024-10-23 08:30:46 -07:00
Ray Smith	0d68555f87	Eliminated TConfig. Changed CompressedLayer and CompressedWeights to be constructed with an instance of a LayerConfig and WeightsConfig respectively. Added CompressedModel to remove ByteStorageT and get rid of most of the type casting, as well as allowing the default destructor to be used and work properly. Adjusted WeightsWrapper and ForwardLayer etc to match. The only remaining template arg is the weight type. This enables all the instantiations to be deleted, apart from one per type. It also enables (but not yet done) the config to be stored in the blob file instead of having to be specified separately. Reduces the size of the gemma_lib and weights shared libraries by a factor of 4.3 and 3.2 respectively. PiperOrigin-RevId: 686870060	2024-10-17 05:04:22 -07:00
Ray Smith	85958f5fd3	Added MatPtr/MatPtrT/MatStorageT/MatStorage as a dynamically-sized replacement for CompressedArray. Definition of array size is moved to the constructor. Allocation is separate and parallelized. All users of weights_raw.h migrated to CompressedWeights and weights_raw.h deleted. Replaced all previous ForEachTensor functions with a single unified function. PiperOrigin-RevId: 684451604	2024-10-10 08:22:30 -07:00
Jan Wassenberg	13a9f76f64	Fix mismatch between blob_store and compress interfaces (bytes) PiperOrigin-RevId: 673027268	2024-09-10 10:59:17 -07:00
Paul Chang	cfce314715	Make BlobWriter::Add() accept const void* PiperOrigin-RevId: 634780483	2024-05-17 08:11:06 -07:00
Jan Wassenberg	e9a0caed87	Further improve IO, enable multiple backends without -D. Move Path into io.h and use for opening files. Removes dependency of gemma_lib on args. Separate Windows codepath instead of emulating POSIX functions. Plus lint fixes. PiperOrigin-RevId: 626279004	2024-04-19 00:40:29 -07:00
Jan Wassenberg	a8ceb75f43	Improved IO abstraction layer Move to unique_ptr-like File class. Move `if OS_WIN` into wrapper functions. exists -> Exists. PiperOrigin-RevId: 625923056	2024-04-17 23:15:07 -07:00
Jan Wassenberg	a982ec1287	Move code to gemma/ so we can remove error-prone copybara: comments. Also fix includes and Lint warnings. PiperOrigin-RevId: 623127487	2024-04-09 04:45:42 -07:00
Luca Versari	4c23932289	Improve weight handling. - Allow scaling of SFP weights - Allow using uncompressed weights - Do not try to compress weights in the main model calls - Reduce code duplication in weight handling with some macros Co-authored-by: Eugene Kliuchnikov <eustas@google.com> Co-authored-by: Thomas Fischbacher <tfish@google.com> Co-authored-by: Zoltan Szabadka <szabadka@google.com>	2024-04-06 11:08:47 +02:00
Jan Wassenberg	7122afed5a	Add note on weight update and improve error message PiperOrigin-RevId: 621849989	2024-04-04 07:17:27 -07:00
Jan Wassenberg	fce5c8c967	Avoid fadvise on older Android. Fixes #84 PiperOrigin-RevId: 613815953	2024-03-07 22:19:22 -08:00
Jan Wassenberg	b6aaf6bbb8	Fix for Android's 32-bit off_t. Fixes #62 PiperOrigin-RevId: 611249534	2024-02-28 15:30:19 -08:00
Dan Zheng	4c155bd3df	Restore reverted changes. Sync to `84444c93a4`. PiperOrigin-RevId: 610263918	2024-02-25 19:32:07 -08:00
Silvio Traversaro	696597383c	Copybara import of the project: -- `19694e1f2e` by Silvio Traversaro <silvio@traversaro.it>: Do not pass explicitly -O2 flag to compiler in Release build COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/3 from traversaro:patch-1 `19694e1f2e` PiperOrigin-RevId: 610096914	2024-02-24 20:41:33 -08:00
Dan Zheng	84444c93a4	Revert "Copybara configuration update." This reverts commit `c03b5da542`. Restore lost changes due to improper Copybara syncing.	2024-02-24 15:15:14 -08:00
Dan Zheng	c03b5da542	Copybara configuration update. PiperOrigin-RevId: 609931218	2024-02-24 12:02:47 -08:00
David Coles	39e385782c	Allow building on Windows using `clang-cl` toolchain It's not possible to build `gemma.cpp` with the standard MSVC front-end as it doesn't support arrays more than `0x7fffffff` bytes (see Compiler Error C2148), however this isn't a problem with the optional Visual Studio Clang/LLVM frontend. This can be specified using the `-T` flag when running CMake: ``` $ cmake -B build -T ClangCL $ cmake --build build --config Release ``` Windows doesn't provide `pread`/`pwrite` so this must be emulated using the `ReadFile`/`WriteFile` Win32 APIs. `_CRT_SECURE_NO_WARNINGS` is defined to prevent a large number of warnings about using "deprecated" function names (e.g. `close` instead of `_close`). `NOMINMAX` is defined to prevent the `min`/`max` macros from `windows.h` from conflicting with expressions like `std::min`. Generally libraries should avoid including `windows.h` in their public headers or define `WIN32_LEAN_AND_MEAN` before including the `windows.h` header, but this unfortunately isn't always the case.	2024-02-23 00:38:54 -08:00
Austin Huang	e29cd566cf	initial commit	2024-02-21 03:31:22 +00:00

23 Commits