Commit Graph

45 Commits

Author SHA1 Message Date
Jan Wassenberg a982ec1287 Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.

PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
Luca Versari 9c3f969405 Implement the Griffin model.
Also implement support for some model variations:

- Local attention.
- Add support for biases.
- Use RoPE only on half vectors.
- Support different order of QKV weights.

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-08 21:45:54 +02:00
Luca Versari 5862d1f995 Add a benchmark and additional tests.
Also add a script to help running sanitizer builds, and do some cleanup.

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Sami Boukortt <sboukortt@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 12:54:52 +02:00
Luca Versari 4c23932289 Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros

Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Zoltan Szabadka b670d43e4f Add standalone tool to compress weights.
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-04-03 14:54:08 +00:00
Jan Wassenberg ba86c8d590 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-21 04:19:02 +01:00
Jan Wassenberg f8baac80f9 Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.

Also fix includes.

PiperOrigin-RevId: 617386447
2024-03-21 04:18:55 +01:00
Eric Ye 52940d435f Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-21 04:18:48 +01:00
Eric Ye 89be4c3de8 No public description
PiperOrigin-RevId: 617315030
2024-03-21 04:18:36 +01:00
Jan Wassenberg 06cea2bcdb Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 23:37:39 +01:00
Jan Wassenberg edaafe335f Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.

Also fix includes.

PiperOrigin-RevId: 617386447
2024-03-20 23:37:32 +01:00
Eric Ye e2a04b79ed Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-20 23:37:25 +01:00
Eric Ye ffd02c59ad No public description
PiperOrigin-RevId: 617315030
2024-03-20 23:37:12 +01:00
Jan Wassenberg 7d5364bb80 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 11:31:59 -07:00
Jan Wassenberg 11d9c51473 Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.

Also fix includes.

PiperOrigin-RevId: 617386447
2024-03-20 12:13:13 +01:00
Eric Ye 6865819bb7 Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-20 12:13:06 +01:00
Eric Ye fdc3812446 No public description
PiperOrigin-RevId: 617315030
2024-03-20 12:12:54 +01:00
Jan Wassenberg 5e0cafbdc2 Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.

Also fix includes.

PiperOrigin-RevId: 617386447
2024-03-19 21:12:35 -07:00
Eric Ye fdb1091b9c Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-19 16:08:26 -07:00
austinvhuang 60d054e041 move arg definitions out of gemma.h to app.h 2024-03-10 23:49:25 -04:00
austinvhuang 03147effbd update loader arg names: cache -> compressed_weights, model -> weights 2024-03-08 17:32:36 -05:00
austinvhuang dfd2fdc1dd Decouple gemma constructor from loader args, update hello_world example, add convenience version of constructor (no uncompressed weights) 2024-03-08 17:26:03 -05:00
austinvhuang 42e53e2da8 [WIP] simplify hello world example, add convenience function. TODO: update git hash in CMakeLists.txt of hello world after push 2024-03-08 14:56:22 -05:00
austinvhuang b67e28d1a0 [WIP] remove args from GetWeights, GetCompressedWeights 2024-03-08 00:00:11 -05:00
RangerUFO 170a9b4690 Make `CreateKVCache` a free function rather than a method 2024-03-07 15:52:55 +08:00
RangerUFO b841612e8c Separate KV cache from GemmaImpl 2024-03-07 15:47:31 +08:00
austinvhuang 6c0388e049 [WIP] refine Runtime struct definition 2024-03-07 01:14:07 -05:00
austinvhuang e781007836 [WIP] Remove InferenceArgs from hello_world example, fix ordering of LoaderArgs validation, revert ReplGemma EOT token behavior 2024-03-06 23:21:13 -05:00
austinvhuang 7042316013 [WIP] update GemmaInterface, Gemma, and Generate input parameter specs to remove InferenceArgs. TODO: update hello_world example after git commit hash is available for fetching 2024-03-06 22:22:59 -05:00
austinvhuang 10f7a086aa [WIP] decouple GemmaImpl from CLI args 2024-03-06 15:06:41 -05:00
austinvhuang 0ea7b993de remove --log fixing https://github.com/google/gemma.cpp/issues/59, improve command line args help, add copybara #include sort guards in more source files, add README sections on running faster and related projects 2024-02-28 15:18:40 -05:00
austinvhuang d37f9c3604 re-enable SortIncludes to conform to vanilla Google style, add comment lines to #includes in gemma.h as barriers to block destructive sorting, update doc + remove shell script 2024-02-27 21:23:33 -05:00
austinvhuang 8f3bd63bf7 Fix copybara include path substitutions errors (which break the google3 build) arising from clang-format linter automation 2024-02-27 17:11:15 -05:00
austinvhuang f70d2de16f use `style=Google` - dumped for .clang-format, gemma.h updated 2024-02-27 15:44:03 -05:00
austinvhuang 9cdc9223bc clean up formatting after 129e66ada2, add .clang-format defaults, minor updates to DEVELOPERS doc 2024-02-27 14:22:02 -05:00
Dan Zheng afc354dcb1 Import from GitHub.
PiperOrigin-RevId: 610595796
2024-02-26 19:05:11 -08:00
Dan Zheng 8db89304bd No public description
PiperOrigin-RevId: 610498969
2024-02-26 19:03:48 -08:00
austinvhuang 129e66ada2 Reduce KV cache preallocation to 4096 and make it comptime configurable, add rm build note in readme, add note on comptime options in DEVELOPERS, make multiturn=0 the default 2024-02-26 17:05:32 -05:00
Dan Zheng d9e1af7551 Copybara fix.
PiperOrigin-RevId: 610032760
2024-02-24 12:02:59 -08:00
Dan Zheng c03b5da542 Copybara configuration update.
PiperOrigin-RevId: 609931218
2024-02-24 12:02:47 -08:00
Dan Zheng 4b1fa03e95
Fix build. (#35)
* Enable GitHub Actions CI for pull requests.
* Fix sentencepiece include directives.
2024-02-24 11:03:36 -08:00
Ikko Eltociear Ashimine e4e02a17d4 Copybara import of the project:
--
5c7dbc6599 by Ikko Eltociear Ashimine <eltociear@gmail.com>:

Update build.yml

dispath -> dispatch

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/22 from eltociear:patch-1 5c7dbc6599
PiperOrigin-RevId: 609827161
2024-02-23 22:32:51 -08:00
The gemma.cpp Authors a16df06cf2 Toward Bazel support: expose BUILD, add WORKSPACE/MODULE.bazel. Refs #16
PiperOrigin-RevId: 609734560
2024-02-23 08:23:18 -08:00
The gemma_cpp Authors 587e80f276 Code update
PiperOrigin-RevId: 609394329
2024-02-22 09:19:47 -08:00
Austin Huang e29cd566cf initial commit 2024-02-21 03:31:22 +00:00