Jan Wassenberg
a982ec1287
Move code to gemma/ so we can remove error-prone copybara: comments.
...
Also fix includes and Lint warnings.
PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
Luca Versari
9c3f969405
Implement the Griffin model.
...
Also implement support for some model variations:
- Local attention.
- Add support for biases.
- Use RoPE only on half vectors.
- Support different order of QKV weights.
Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-08 21:45:54 +02:00
Luca Versari
5862d1f995
Add a benchmark and additional tests.
...
Also add a script to help running sanitizer builds, and do some cleanup.
Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Sami Boukortt <sboukortt@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 12:54:52 +02:00
Luca Versari
4c23932289
Improve weight handling.
...
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Zoltan Szabadka
b670d43e4f
Add standalone tool to compress weights.
...
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-04-03 14:54:08 +00:00
Jan Wassenberg
ba86c8d590
Remove obsolete copybara tags, faster bazel builds (debug)
...
PiperOrigin-RevId: 617576799
2024-03-21 04:19:02 +01:00
Jan Wassenberg
f8baac80f9
Fix msan error, uninitialized model_training
...
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.
Also fix includes.
PiperOrigin-RevId: 617386447
2024-03-21 04:18:55 +01:00
Eric Ye
52940d435f
Connect "--weights" parameter to Gemma
...
PiperOrigin-RevId: 617323257
2024-03-21 04:18:48 +01:00
Eric Ye
89be4c3de8
No public description
...
PiperOrigin-RevId: 617315030
2024-03-21 04:18:36 +01:00
Jan Wassenberg
06cea2bcdb
Remove obsolete copybara tags, faster bazel builds (debug)
...
PiperOrigin-RevId: 617576799
2024-03-20 23:37:39 +01:00
Jan Wassenberg
edaafe335f
Fix msan error, uninitialized model_training
...
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.
Also fix includes.
PiperOrigin-RevId: 617386447
2024-03-20 23:37:32 +01:00
Eric Ye
e2a04b79ed
Connect "--weights" parameter to Gemma
...
PiperOrigin-RevId: 617323257
2024-03-20 23:37:25 +01:00
Eric Ye
ffd02c59ad
No public description
...
PiperOrigin-RevId: 617315030
2024-03-20 23:37:12 +01:00
Jan Wassenberg
7d5364bb80
Remove obsolete copybara tags, faster bazel builds (debug)
...
PiperOrigin-RevId: 617576799
2024-03-20 11:31:59 -07:00
Jan Wassenberg
11d9c51473
Fix msan error, uninitialized model_training
...
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.
Also fix includes.
PiperOrigin-RevId: 617386447
2024-03-20 12:13:13 +01:00
Eric Ye
6865819bb7
Connect "--weights" parameter to Gemma
...
PiperOrigin-RevId: 617323257
2024-03-20 12:13:06 +01:00
Eric Ye
fdc3812446
No public description
...
PiperOrigin-RevId: 617315030
2024-03-20 12:12:54 +01:00
Jan Wassenberg
5e0cafbdc2
Fix msan error, uninitialized model_training
...
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.
Also fix includes.
PiperOrigin-RevId: 617386447
2024-03-19 21:12:35 -07:00
Eric Ye
fdb1091b9c
Connect "--weights" parameter to Gemma
...
PiperOrigin-RevId: 617323257
2024-03-19 16:08:26 -07:00
austinvhuang
60d054e041
move arg definitions out of gemma.h to app.h
2024-03-10 23:49:25 -04:00
austinvhuang
03147effbd
update loader arg names: cache -> compressed_weights, model -> weights
2024-03-08 17:32:36 -05:00
austinvhuang
dfd2fdc1dd
Decouple gemma constructor from loader args, update hello_world example, add convenience version of constructor (no uncompressed weights)
2024-03-08 17:26:03 -05:00
austinvhuang
42e53e2da8
[WIP] simplify hello world example, add convenience function. TODO: update git hash in CMakeLists.txt of hello world after push
2024-03-08 14:56:22 -05:00
austinvhuang
b67e28d1a0
[WIP] remove args from GetWeights, GetCompressedWeights
2024-03-08 00:00:11 -05:00
RangerUFO
170a9b4690
Make `CreateKVCache` a free function rather than a method
2024-03-07 15:52:55 +08:00
RangerUFO
b841612e8c
Separate KV cache from GemmaImpl
2024-03-07 15:47:31 +08:00
austinvhuang
6c0388e049
[WIP] refine Runtime struct definition
2024-03-07 01:14:07 -05:00
austinvhuang
e781007836
[WIP] Remove InferenceArgs from hello_world example, fix ordering of LoaderArgs validation, revert ReplGemma EOT token behavior
2024-03-06 23:21:13 -05:00
austinvhuang
7042316013
[WIP] update GemmaInterface, Gemma, and Generate input parameter specs to remove InferenceArgs. TODO: update hello_world example after git commit hash is available for fetching
2024-03-06 22:22:59 -05:00
austinvhuang
10f7a086aa
[WIP] decouple GemmaImpl from CLI args
2024-03-06 15:06:41 -05:00
austinvhuang
0ea7b993de
remove --log fixing https://github.com/google/gemma.cpp/issues/59 , improve command line args help, add copybara #include sort guards in more source files, add README sections on running faster and related projects
2024-02-28 15:18:40 -05:00
austinvhuang
d37f9c3604
re-enable SortIncludes to conform to vanilla Google style, add comment lines to #includes in gemma.h as barriers to block destructive sorting, update doc + remove shell script
2024-02-27 21:23:33 -05:00
austinvhuang
8f3bd63bf7
Fix copybara include path substitutions errors (which break the google3 build) arising from clang-format linter automation
2024-02-27 17:11:15 -05:00
austinvhuang
f70d2de16f
use `style=Google` - dumped for .clang-format, gemma.h updated
2024-02-27 15:44:03 -05:00
austinvhuang
9cdc9223bc
clean up formatting after 129e66ada2, add .clang-format defaults, minor updates to DEVELOPERS doc
2024-02-27 14:22:02 -05:00
Dan Zheng
afc354dcb1
Import from GitHub.
...
PiperOrigin-RevId: 610595796
2024-02-26 19:05:11 -08:00
Dan Zheng
8db89304bd
No public description
...
PiperOrigin-RevId: 610498969
2024-02-26 19:03:48 -08:00
austinvhuang
129e66ada2
Reduce KV cache preallocation to 4096 and make it comptime configurable, add rm build note in readme, add note on comptime options in DEVELOPERS, make multiturn=0 the default
2024-02-26 17:05:32 -05:00
Dan Zheng
d9e1af7551
Copybara fix.
...
PiperOrigin-RevId: 610032760
2024-02-24 12:02:59 -08:00
Dan Zheng
c03b5da542
Copybara configuration update.
...
PiperOrigin-RevId: 609931218
2024-02-24 12:02:47 -08:00
Dan Zheng
4b1fa03e95
Fix build. (#35)
...
* Enable GitHub Actions CI for pull requests.
* Fix sentencepiece include directives.
2024-02-24 11:03:36 -08:00
Ikko Eltociear Ashimine
e4e02a17d4
Copybara import of the project:
...
--
5c7dbc6599 by Ikko Eltociear Ashimine <eltociear@gmail.com>:
Update build.yml
dispath -> dispatch
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/22 from eltociear:patch-1 5c7dbc6599
PiperOrigin-RevId: 609827161
2024-02-23 22:32:51 -08:00
The gemma.cpp Authors
a16df06cf2
Toward Bazel support: expose BUILD, add WORKSPACE/MODULE.bazel. Refs #16
...
PiperOrigin-RevId: 609734560
2024-02-23 08:23:18 -08:00
The gemma_cpp Authors
587e80f276
Code update
...
PiperOrigin-RevId: 609394329
2024-02-22 09:19:47 -08:00
Austin Huang
e29cd566cf
initial commit
2024-02-21 03:31:22 +00:00