Commit Graph

151 Commits

Author SHA1 Message Date
Jan Wassenberg cbb67b4ee0 Move benchmark_helper to evals/, weights_raw to compression/.
PiperOrigin-RevId: 650155983
2024-07-08 01:13:23 -07:00
Jan Wassenberg f823371691 Cleanup: move util/compress and convert_weights to compression/
Also remove unused models/, lint convert_weights

PiperOrigin-RevId: 649613088
2024-07-05 04:16:52 -07:00
Jan Wassenberg 41efec4dba Add Py bindings for weight compression
TODO: this uses clif instead of pybind11, and depends on absl.

PiperOrigin-RevId: 649575815
2024-07-05 01:06:00 -07:00
Jan Wassenberg d3c6a45b59 Major duplicated code reduction in test/benchmarks
Helper functions to tokenize/wrap
Move LayersOutputFunc into RuntimeConfig
AcceptFunc passes the probability
Implement StringFromType using the parser, and verify results match

PiperOrigin-RevId: 643255119
2024-06-14 00:16:25 -07:00
Jan Wassenberg a0e808e341 Add compression/ comments, especially on SFP range
PiperOrigin-RevId: 642238720
2024-06-11 05:47:49 -07:00
Jan Wassenberg 5c3e5f7038 Remove no longer required stats.h - use Highway version instead
PiperOrigin-RevId: 640440379
2024-06-05 01:37:48 -07:00
Paul Chang 175e389c3c revert back to HWY_ASSERT for lane constraints, qualify hn::Add
PiperOrigin-RevId: 640193239
2024-06-04 10:10:18 -07:00
Jan Wassenberg 4f9155d8c6 Add bf16 matmul support, update naming+test
Avoid int32, which can easily overflow for large matrices.
Also fix IDE warning in sfp-inl.

PiperOrigin-RevId: 640149845
2024-06-04 07:41:46 -07:00
Zoltan Szabadka 36e4d8bbfe Add first version of backpropagation support.
This is still in progress / experimental, currently it is only
implemented for normal gemma MQA attention layers, and no
parallelism is added yet for backward pass.

Since we need to remember all activations from all layers, the
forward pass was also reimplemented with a new activation data
structure.
2024-06-04 08:37:49 +00:00
Jan Wassenberg a44cbdadc2 Update to Highway 1.2 for topology/VQSelect
Also fix unused-warning in compress-inl.

PiperOrigin-RevId: 639116915
2024-05-31 12:29:10 -07:00
Paul Chang c0643577c3 Minor internal refactoring.
PiperOrigin-RevId: 635852078
2024-05-21 10:29:59 -07:00
Paul Chang cfce314715 Make BlobWriter::Add() accept const void*
PiperOrigin-RevId: 634780483
2024-05-17 08:11:06 -07:00
Jan Wassenberg 22fe9809ac Fix SVE build: add missing hn::
PiperOrigin-RevId: 632481097
2024-05-10 06:49:26 -07:00
Jan Wassenberg c5c9fc300c Enable even/odd for SFP. Refs #166
Disable it for float32 because there is not enough benefit.

PiperOrigin-RevId: 631788326
2024-05-08 07:09:06 -07:00
Jan Wassenberg f6d02b2870 Fix RecurrentGemma (refs #166) - one Dot was ignoring scale.
Remove extra Dot() overload
MatVecAdd always adds, use MatVecT<kAdd> if conditional.
Remove ununsed MatVecAddLoop and MatVecLoop
No longer tsan-verify even_odd

PiperOrigin-RevId: 631377279
2024-05-07 04:40:42 -07:00
Jan Wassenberg b5a9ade75f 2x speedup of SFP decode (1.4x overall) on AVX3_DL+.
Thanks @nzmichaelh for suggesting table lookups!

PiperOrigin-RevId: 631337524
2024-05-07 01:46:43 -07:00
Zoltan Szabadka 429eb78512 Remove unused vars. 2024-05-03 13:37:17 +00:00
Sam Kaufman f608337fef Remove Bf16ToF32EO and use PromoteEvenTo and PromoteOddTo. 2024-04-29 14:13:07 -07:00
Sam Kaufman 5cb63346aa supports_eo -> kSupportsEvenOdd 2024-04-29 12:51:35 -07:00
Sam Kaufman 0816a1070d Even-odd layout MatVecs for bf16 weights. 2024-04-28 20:09:25 -07:00
Paul Chang e8f59bb411 Fix underflow in NUQ ClusterCost()
PiperOrigin-RevId: 628137904
2024-04-25 11:28:51 -07:00
Jan Wassenberg e9a0caed87 Further improve IO, enable multiple backends without -D.
Move Path into io.h and use for opening files.
Removes dependency of gemma_lib on args.
Separate Windows codepath instead of emulating POSIX functions.

Plus lint fixes.

PiperOrigin-RevId: 626279004
2024-04-19 00:40:29 -07:00
Jan Wassenberg a8ceb75f43 Improved IO abstraction layer
Move to unique_ptr-like File class.
Move `if OS_WIN` into wrapper functions.
exists -> Exists.

PiperOrigin-RevId: 625923056
2024-04-17 23:15:07 -07:00
Jan Wassenberg a939b5fc9f Update distortion.h to weighted average, add distortion_test.
More thorough checks in sfp_test and nuq_test.
nuq_test: use deterministic input generator.

PiperOrigin-RevId: 625602019
2024-04-17 01:44:19 -07:00
Jan Wassenberg a982ec1287 Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.

PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
Luca Versari 4c23932289 Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros

Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Jan Wassenberg 7122afed5a Add note on weight update and improve error message
PiperOrigin-RevId: 621849989
2024-04-04 07:17:27 -07:00
Jan Wassenberg 61e031fe98 Towards building tests without GUnit Refs #29
PiperOrigin-RevId: 618032987
2024-03-21 19:28:02 -07:00
Jan Wassenberg 24add61dd9 Fix SFP/NUQ for bf16 rounding in Highway
SFP: Avoid rounding twice, and more robust TestDot.
NUQ: also more robust SNR, minor touchups to header.

PiperOrigin-RevId: 618030096
2024-03-21 19:06:19 -07:00
Jan Wassenberg ba86c8d590 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-21 04:19:02 +01:00
Eric Ye 89be4c3de8 No public description
PiperOrigin-RevId: 617315030
2024-03-21 04:18:36 +01:00
Jan Wassenberg 30b8a3c1ac Fix build for RPi, missing hn::. Refs #112, thanks long568
PiperOrigin-RevId: 617704418
2024-03-20 20:07:49 -07:00
Jan Wassenberg 06cea2bcdb Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 23:37:39 +01:00
Eric Ye ffd02c59ad No public description
PiperOrigin-RevId: 617315030
2024-03-20 23:37:12 +01:00
Jan Wassenberg 7d5364bb80 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 11:31:59 -07:00
Jan Wassenberg fce5c8c967 Avoid fadvise on older Android. Fixes #84
PiperOrigin-RevId: 613815953
2024-03-07 22:19:22 -08:00
Jan Wassenberg bb9b023502 Support Bazel builds. Fixes #16
Also fix nuq/sfp-inl: warning, cast, and disable SCALAR

PiperOrigin-RevId: 612704056
2024-03-04 22:07:25 -08:00
Copybara-Service cd7468199c Merge pull request #65 from enum-class:narrowing-issues
PiperOrigin-RevId: 612279564
2024-03-03 18:51:59 -08:00
Jan Wassenberg b6aaf6bbb8 Fix for Android's 32-bit off_t. Fixes #62
PiperOrigin-RevId: 611249534
2024-02-28 15:30:19 -08:00
Jan Wassenberg 272f17ddb3 Warning fixes: unused member, cast, unused function
PiperOrigin-RevId: 611074887
2024-02-28 05:54:22 -08:00
enum-class 06dd013397 Add clang-tidy, fix narrowing issues, fix constness 2024-02-28 20:04:09 +08:00
Jan Wassenberg b3fecef45d Warning fix: sign cast
PiperOrigin-RevId: 610635789
2024-02-26 22:31:39 -08:00
Dan Zheng 4c155bd3df Restore reverted changes.
Sync to 84444c93a4.

PiperOrigin-RevId: 610263918
2024-02-25 19:32:07 -08:00
Silvio Traversaro 696597383c Copybara import of the project:
--
19694e1f2e by Silvio Traversaro <silvio@traversaro.it>:

Do not pass explicitly -O2 flag to compiler in Release build

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/3 from traversaro:patch-1 19694e1f2e
PiperOrigin-RevId: 610096914
2024-02-24 20:41:33 -08:00
Dan Zheng 84444c93a4 Revert "Copybara configuration update."
This reverts commit c03b5da542.

Restore lost changes due to improper Copybara syncing.
2024-02-24 15:15:14 -08:00
Dan Zheng c03b5da542 Copybara configuration update.
PiperOrigin-RevId: 609931218
2024-02-24 12:02:47 -08:00
Austin Huang 34b22c56f5
Merge pull request #6 from dcoles/clang-cl
Allow building on Windows using `clang-cl` toolchain
2024-02-24 12:27:40 -05:00
Ikko Eltociear Ashimine e4e02a17d4 Copybara import of the project:
--
5c7dbc6599 by Ikko Eltociear Ashimine <eltociear@gmail.com>:

Update build.yml

dispath -> dispatch

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/22 from eltociear:patch-1 5c7dbc6599
PiperOrigin-RevId: 609827161
2024-02-23 22:32:51 -08:00
David Coles 39e385782c Allow building on Windows using `clang-cl` toolchain
It's not possible to build `gemma.cpp` with the standard MSVC front-end
as it doesn't support arrays more than `0x7ffffffff` bytes (see Compiler Error C2148),
however this isn't a problem with the optional Visual Studio Clang/LLVM frontend.

This can be specified using the `-T` flag when running CMake:

```
$ cmake -B build -T ClangCL
$ cmake --build build --config Release
```

Windows doesn't provide `pread`/`pwrite` so this must be emulated using
the `ReadFile`/`WriteFile` Win32 APIs.

`_CRT_SECURE_NO_WARNINGS` is defined to prevent a large number of warnings
about using "depricated" function names (e.g. `close` instead of `_close`).

`NOMINMAX` is defined to prevent the `min`/`max` macros from `windows.h`
from conflicting with expressions like `std::min`. Generally libraries should
avoid including `windows.h` in their public headers or define `WIN32_LEAN_AND_MEAN`
before including the `windows.h` header, but this unfortunately isn't always the case.
2024-02-23 00:38:54 -08:00
The gemma_cpp Authors 587e80f276 Code update
PiperOrigin-RevId: 609394329
2024-02-22 09:19:47 -08:00
Austin Huang e29cd566cf initial commit 2024-02-21 03:31:22 +00:00