Jan Wassenberg
ee6e017a77
Fix windows build: min conflict, unused VF
PiperOrigin-RevId: 650955138
2024-07-10 04:18:25 -07:00
Jan Wassenberg
6a3f7cf3ea
Lint fix - string append, remove stale TODO
PiperOrigin-RevId: 650197468
2024-07-08 04:11:21 -07:00
Jan Wassenberg
f823371691
Cleanup: move util/compress and convert_weights to compression/
Also remove unused models/, lint convert_weights
PiperOrigin-RevId: 649613088
2024-07-05 04:16:52 -07:00
Jan Wassenberg
85fcd3cd80
Cleanup: add ModelInfo struct, remove gcpp::
PiperOrigin-RevId: 648707763
2024-07-02 07:11:15 -07:00
Jan Wassenberg
b1c1ec1d59
Use benchmark_helper in py bindings (adds BOS)
Also remove thread clamp (OK to be zero or large).
PiperOrigin-RevId: 648657155
2024-07-02 03:27:15 -07:00
Jan Wassenberg
af8eb2fde3
Declutter gemma/ directory, move binaries to evals/ and util/.
PiperOrigin-RevId: 648400795
2024-07-01 09:51:04 -07:00
The gemma.cpp Authors
ef786f1bfc
Use hwy::ThreadPool::MaxThreads() to determine the number of threads to use.
PiperOrigin-RevId: 646117298
2024-06-24 09:16:04 -07:00
Daniel Keysers
0570972d43
Fixing two typos.
PiperOrigin-RevId: 645103198
2024-06-20 11:33:12 -07:00
Jan Wassenberg
d3c6a45b59
Major duplicated code reduction in test/benchmarks
Helper functions to tokenize/wrap
Move LayersOutputFunc into RuntimeConfig
AcceptFunc passes the probability
Implement StringFromType using the parser, and verify results match
PiperOrigin-RevId: 643255119
2024-06-14 00:16:25 -07:00
Jan Wassenberg
3e2396f98c
Use Loader/AppArgs to construct gemma_test model, simplify AcceptFunc
accept_token: allow default, check if empty when using
allow mixing sample_func and stream_func, call the latter after the former
Also fix missing includes/deps.
PiperOrigin-RevId: 642240012
2024-06-11 05:53:10 -07:00
Jan Wassenberg
f9b390b134
Support all weight types in a single binary.
This changes the command line flags, but the default value retains the previous behavior.
Also add a CreateGemma helper to enable extra args without interface changes.
PiperOrigin-RevId: 641266411
2024-06-07 09:04:45 -07:00
Zelalem Aweke
9e213b3d96
Use system topology to pin threads across clusters.
PiperOrigin-RevId: 640151974
2024-06-04 07:50:32 -07:00
Jan Wassenberg
12fb2f05cf
Add per-thread even_odd storage for #166.
Also inline ProjQ and ProjKV lambdas,
add missing includes/deps for ops_test.
PiperOrigin-RevId: 629460608
2024-04-30 10:42:23 -07:00
Jan Wassenberg
7a12e29027
Add error-checking for py binding, add missing include+hwasan check
PiperOrigin-RevId: 628453112
2024-04-26 10:59:41 -07:00
Phil Culliton
9e0ac5de34
Update Clif wrapper to work with latest gemma.cpp and add simple example
PiperOrigin-RevId: 628134201
2024-04-25 11:17:16 -07:00
Jan Wassenberg
e9a0caed87
Further improve IO, enable multiple backends without -D.
Move Path into io.h and use for opening files.
Removes dependency of gemma_lib on args.
Separate Windows codepath instead of emulating POSIX functions.
Plus lint fixes.
PiperOrigin-RevId: 626279004
2024-04-19 00:40:29 -07:00
Jan Wassenberg
a8ceb75f43
Improved IO abstraction layer
Move to unique_ptr-like File class.
Move `if OS_WIN` into wrapper functions.
exists -> Exists.
PiperOrigin-RevId: 625923056
2024-04-17 23:15:07 -07:00
Jan Wassenberg
a982ec1287
Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.
PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
Luca Versari
9c3f969405
Implement the Griffin model.
Also implement support for some model variations:
- Local attention.
- Add support for biases.
- Use RoPE only on half vectors.
- Support different order of QKV weights.
Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-08 21:45:54 +02:00
Luca Versari
5862d1f995
Add a benchmark and additional tests.
Also add a script to help running sanitizer builds, and do some cleanup.
Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Sami Boukortt <sboukortt@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 12:54:52 +02:00
Luca Versari
4c23932289
Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Zoltan Szabadka
b670d43e4f
Add standalone tool to compress weights.
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-04-03 14:54:08 +00:00
Copybara-Service
bbf4df4584
Merge pull request #115 from villesundell:patch-1
PiperOrigin-RevId: 619262700
2024-03-26 11:46:54 -07:00
Copybara-Service
fcf5c1af88
Merge pull request #114 from ufownl:experimental
PiperOrigin-RevId: 618148701
2024-03-22 05:36:07 -07:00
Jan Wassenberg
ba86c8d590
Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-21 04:19:02 +01:00
Eric Ye
89be4c3de8
No public description
PiperOrigin-RevId: 617315030
2024-03-21 04:18:36 +01:00
Ville Sundell
546519c855
Added a missing space in app.h
When the user runs "--help", they see the non-existent word
"compressingnew". This is because of a missing space, which
is now added, resulting in "compressing new".
2024-03-21 00:39:45 +02:00
Jan Wassenberg
06cea2bcdb
Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 23:37:39 +01:00
Eric Ye
ffd02c59ad
No public description
PiperOrigin-RevId: 617315030
2024-03-20 23:37:12 +01:00
Jan Wassenberg
7d5364bb80
Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 11:31:59 -07:00
RangerUFO
6923aec853
Add MQA support
2024-03-20 18:17:24 +08:00
RangerUFO
130e1f678f
Adjust vocab size to be the same as gemma_pytorch
2024-03-20 18:17:24 +08:00
Copybara-Service
a0f316d853
Merge pull request #95 from google:conversion
PiperOrigin-RevId: 615448039
2024-03-13 09:37:36 -07:00
pculliton
f520e5c25c
Remove WIP messages.
2024-03-13 11:36:19 -04:00
Copybara-Service
0221956b2e
Merge pull request #87 from google:refactor-tidy
PiperOrigin-RevId: 615204427
2024-03-12 16:10:47 -07:00
Phil Culliton
b6831a2256
Fixed 7B conversion.
2024-03-12 21:12:28 +00:00
austinvhuang
4aa8d0584e
Merge branch 'dev' into refactor-tidy
2024-03-12 15:01:46 -04:00
Copybara-Service
ccd055e06b
Merge pull request #82 from google:examples
PiperOrigin-RevId: 615066980
2024-03-12 09:24:24 -07:00
Jan Wassenberg
0d406061c0
Detect and print build type. Refs #88
PiperOrigin-RevId: 614906000
2024-03-11 21:58:10 -07:00
austinvhuang
60d054e041
move arg definitions out of gemma.h to app.h
2024-03-10 23:49:25 -04:00
Phil Culliton
2161908f50
Added 7B support and args parsing. Still todo: more testing of 7B conversion.
2024-03-07 22:34:14 +00:00
austinvhuang
10f7a086aa
[WIP] decouple GemmaImpl from CLI args
2024-03-06 15:06:41 -05:00
Phil Culliton
c93e1a1e4d
Resolved layer ordering, reshaping, MQA->MHA, and quantization. Works only for 2B.
2024-03-05 17:54:55 +00:00
austinvhuang
3c69695c1e
transformations and validations (wip)
2024-03-02 14:46:51 -05:00
austinvhuang
7d7d43e661
converter transformations (wip)
2024-03-02 08:11:55 -05:00
austinvhuang
5be9a2243f
initial (wip) convert_weights script from pytorch
2024-03-01 15:52:51 -05:00
austinvhuang
0ea7b993de
remove --log fixing https://github.com/google/gemma.cpp/issues/59 , improve command line args help, add copybara #include sort guards in more source files, add README sections on running faster and related projects
2024-02-28 15:18:40 -05:00
Jan Wassenberg
272f17ddb3
Warning fixes: unused member, cast, unused function
PiperOrigin-RevId: 611074887
2024-02-28 05:54:22 -08:00
Copybara-Service
1a1dd90287
Merge pull request #33 from shirayu:add_eot_option
PiperOrigin-RevId: 610838070
2024-02-27 12:32:01 -08:00
Jan Wassenberg
179ecf9e78
Warn instead of assert for setaffinity. Fixes #49
PiperOrigin-RevId: 610638517
2024-02-26 22:46:11 -08:00