Commit Graph

175 Commits

Author SHA1 Message Date
Copybara-Service 325ef06cf9 Merge pull request #130 from veluca93:weight-handling
PiperOrigin-RevId: 622405491
2024-04-06 02:22:00 -07:00
Luca Versari 4c23932289 Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros

Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Copybara-Service 280b8cb8a1 Merge pull request #129 from veluca93:more-ops
PiperOrigin-RevId: 622145499
2024-04-05 05:02:00 -07:00
Luca Versari 6cdb8a45a0 Add more ops: Sigmoid, (Two)MatVecAdd. Faster TwoMatVec.
drive-by: some build system simplifications

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Lode Vandevenne <lode@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-05 12:27:31 +02:00
Jan Wassenberg 7122afed5a Add note on weight update and improve error message
PiperOrigin-RevId: 621849989
2024-04-04 07:17:27 -07:00
Copybara-Service 08948f13ac Merge pull request #127 from szabadka:gemma3
PiperOrigin-RevId: 621815677
2024-04-04 04:32:03 -07:00
Jan Wassenberg 44e6274e99 1.07x speedup: merge MQA parallel sections as suggested by @veluca93
PiperOrigin-RevId: 621772392
2024-04-04 01:12:53 -07:00
Zoltan Szabadka 71ead04afb Fix off-by-one errors in generation code and token streaming callback.
In the generation code we were feeding the last token of the prompt
twice through the transformer. The new version fixes that and also
works in the case where Prefill is completely disabled.
2024-04-04 07:56:21 +00:00
Copybara-Service ede337f876 Merge pull request #125 from szabadka:gemma1
PiperOrigin-RevId: 621549709
2024-04-03 09:35:25 -07:00
Zoltan Szabadka b670d43e4f Add standalone tool to compress weights.
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-04-03 14:54:08 +00:00
Copybara-Service 93a648926c Merge pull request #122 from LINKIWI:bazelversion
PiperOrigin-RevId: 621148731
2024-04-02 06:02:42 -07:00
Kevin Lin 1845b19b47 .bazelversion: Bazel 7.1.1 2024-03-31 11:39:21 -07:00
Copybara-Service 7e0a6fcab1 Merge pull request #120 from ufownl:bugfix/gcc_compilation_error
PiperOrigin-RevId: 620016059
2024-03-28 12:10:08 -07:00
RangerUFO 1c03d7446d Fix compilation error when `HWY_COMPILER_GCC_ACTUAL < 1300` 2024-03-28 14:54:37 +08:00
Jan Wassenberg bb767d788d Bounds-checks for large prompts. Refs #99
Also remove init placeholder and move Sqrt to ops.h.

PiperOrigin-RevId: 619529202
2024-03-27 07:49:46 -07:00
Copybara-Service bbf4df4584 Merge pull request #115 from villesundell:patch-1
PiperOrigin-RevId: 619262700
2024-03-26 11:46:54 -07:00
Jan Wassenberg c1d3c3284c Add ops_test to BUILD, rename transformer_ops->ops, fix includes.
Also fix copybara. Refs #105

PiperOrigin-RevId: 619157071
2024-03-26 05:37:31 -07:00
Copybara-Service 9f1595c110 Merge pull request #105 from enum-class:improve_ops_utility
PiperOrigin-RevId: 618827910
2024-03-25 07:00:45 -07:00
Jan Wassenberg 3f2fabcfcb Update todo to mention PartialSort
PiperOrigin-RevId: 618685783
2024-03-24 17:14:31 -07:00
enum-class aa6e88e591 add unit tests for ops 2024-03-23 21:09:19 +08:00
enum-class d079c8f1ba Merge branch 'dev' into improve_ops_utility 2024-03-23 10:36:56 +08:00
Copybara-Service fcf5c1af88 Merge pull request #114 from ufownl:experimental
PiperOrigin-RevId: 618148701
2024-03-22 05:36:07 -07:00
Jan Wassenberg 61e031fe98 Towards building tests without GUnit Refs #29
PiperOrigin-RevId: 618032987
2024-03-21 19:28:02 -07:00
Jan Wassenberg 24add61dd9 Fix SFP/NUQ for bf16 rounding in Highway
SFP: Avoid rounding twice, and more robust TestDot.
NUQ: also more robust SNR, minor touchups to header.

PiperOrigin-RevId: 618030096
2024-03-21 19:06:19 -07:00
RangerUFO 90b0e9fd7a Refactor the implementation of `Attention` 2024-03-21 14:40:56 +08:00
Jan Wassenberg a135bc1e47 Fix build for RPi, missing hn::. Refs #112, thanks long568
PiperOrigin-RevId: 617704418
2024-03-21 04:19:09 +01:00
Jan Wassenberg ba86c8d590 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-21 04:19:02 +01:00
Jan Wassenberg f8baac80f9 Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.

Also fix includes.

PiperOrigin-RevId: 617386447
2024-03-21 04:18:55 +01:00
Eric Ye 52940d435f Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-21 04:18:48 +01:00
Eric Ye 89be4c3de8 No public description
PiperOrigin-RevId: 617315030
2024-03-21 04:18:36 +01:00
Jan Wassenberg 30b8a3c1ac Fix build for RPi, missing hn::. Refs #112, thanks long568
PiperOrigin-RevId: 617704418
2024-03-20 20:07:49 -07:00
Ville Sundell 546519c855 Added a missing space in app.h
When the user runs "--help", they see the non-existent word
"compressingnew". This is because of a missing space, which
is now added, resulting in "compressing new".
2024-03-21 00:39:45 +02:00
Jan Wassenberg 06cea2bcdb Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 23:37:39 +01:00
Jan Wassenberg edaafe335f Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.

Also fix includes.

PiperOrigin-RevId: 617386447
2024-03-20 23:37:32 +01:00
Eric Ye e2a04b79ed Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-20 23:37:25 +01:00
Eric Ye ffd02c59ad No public description
PiperOrigin-RevId: 617315030
2024-03-20 23:37:12 +01:00
Jan Wassenberg 7d5364bb80 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 11:31:59 -07:00
RangerUFO 8fc6959950 Move conditional branch out of `pos2` loop 2024-03-20 23:50:14 +08:00
RangerUFO c75d2eb635 Add the missing `HWY_ATTR` of `ProjKV` 2024-03-20 23:21:43 +08:00
RangerUFO ce32f4db81 Streamline the implementation 2024-03-20 22:39:31 +08:00
Jan Wassenberg 11d9c51473 Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.

Also fix includes.

PiperOrigin-RevId: 617386447
2024-03-20 12:13:13 +01:00
Eric Ye 6865819bb7 Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-20 12:13:06 +01:00
Eric Ye fdc3812446 No public description
PiperOrigin-RevId: 617315030
2024-03-20 12:12:54 +01:00
RangerUFO 6923aec853 Add MQA support 2024-03-20 18:17:24 +08:00
RangerUFO 130e1f678f Adjust vocab size to be the same as gemma_pytorch 2024-03-20 18:17:24 +08:00
Jan Wassenberg 5e0cafbdc2 Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.

Also fix includes.

PiperOrigin-RevId: 617386447
2024-03-19 21:12:35 -07:00
Eric Ye fdb1091b9c Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-19 16:08:26 -07:00
enum-class 4400842337 Minor refactor in Softmax 2024-03-20 00:20:14 +08:00
enum-class 858d5b08c2 Use highway in AddFrom, MulBy, MulByConst, MulByConstAndAdd, create_distribution 2024-03-19 08:38:09 +08:00
Copybara-Service 720f609d84 Merge pull request #102 from google:experimental
PiperOrigin-RevId: 616882521
2024-03-18 10:56:52 -07:00