Copybara-Service
325ef06cf9
Merge pull request #130 from veluca93:weight-handling
PiperOrigin-RevId: 622405491
2024-04-06 02:22:00 -07:00
Luca Versari
4c23932289
Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Copybara-Service
280b8cb8a1
Merge pull request #129 from veluca93:more-ops
PiperOrigin-RevId: 622145499
2024-04-05 05:02:00 -07:00
Luca Versari
6cdb8a45a0
Add more ops: Sigmoid, (Two)MatVecAdd. Faster TwoMatVec.
drive-by: some build system simplifications
Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Lode Vandevenne <lode@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-05 12:27:31 +02:00
Jan Wassenberg
7122afed5a
Add note on weight update and improve error message
PiperOrigin-RevId: 621849989
2024-04-04 07:17:27 -07:00
Copybara-Service
08948f13ac
Merge pull request #127 from szabadka:gemma3
PiperOrigin-RevId: 621815677
2024-04-04 04:32:03 -07:00
Jan Wassenberg
44e6274e99
1.07x speedup: merge MQA parallel sections as suggested by @veluca93
PiperOrigin-RevId: 621772392
2024-04-04 01:12:53 -07:00
Zoltan Szabadka
71ead04afb
Fix off-by-one errors in generation code and token streaming callback.
In the generation code we were feeding the last token of the prompt
twice through the transformer. The new version fixes that and also
works in the case where Prefill is completely disabled.
2024-04-04 07:56:21 +00:00
Copybara-Service
ede337f876
Merge pull request #125 from szabadka:gemma1
PiperOrigin-RevId: 621549709
2024-04-03 09:35:25 -07:00
Zoltan Szabadka
b670d43e4f
Add standalone tool to compress weights.
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-04-03 14:54:08 +00:00
Copybara-Service
93a648926c
Merge pull request #122 from LINKIWI:bazelversion
PiperOrigin-RevId: 621148731
2024-04-02 06:02:42 -07:00
Kevin Lin
1845b19b47
.bazelversion: Bazel 7.1.1
2024-03-31 11:39:21 -07:00
Copybara-Service
7e0a6fcab1
Merge pull request #120 from ufownl:bugfix/gcc_compilation_error
PiperOrigin-RevId: 620016059
2024-03-28 12:10:08 -07:00
RangerUFO
1c03d7446d
Fix compilation error when `HWY_COMPILER_GCC_ACTUAL < 1300`
2024-03-28 14:54:37 +08:00
Jan Wassenberg
bb767d788d
Bounds-checks for large prompts. Refs #99
Also remove init placeholder and move Sqrt to ops.h.
PiperOrigin-RevId: 619529202
2024-03-27 07:49:46 -07:00
Copybara-Service
bbf4df4584
Merge pull request #115 from villesundell:patch-1
PiperOrigin-RevId: 619262700
2024-03-26 11:46:54 -07:00
Jan Wassenberg
c1d3c3284c
Add ops_test to BUILD, rename transformer_ops->ops, fix includes.
Also fix copybara. Refs #105
PiperOrigin-RevId: 619157071
2024-03-26 05:37:31 -07:00
Copybara-Service
9f1595c110
Merge pull request #105 from enum-class:improve_ops_utility
PiperOrigin-RevId: 618827910
2024-03-25 07:00:45 -07:00
Jan Wassenberg
3f2fabcfcb
Update todo to mention PartialSort
PiperOrigin-RevId: 618685783
2024-03-24 17:14:31 -07:00
enum-class
aa6e88e591
add unit tests for ops
2024-03-23 21:09:19 +08:00
enum-class
d079c8f1ba
Merge branch 'dev' into improve_ops_utility
2024-03-23 10:36:56 +08:00
Copybara-Service
fcf5c1af88
Merge pull request #114 from ufownl:experimental
PiperOrigin-RevId: 618148701
2024-03-22 05:36:07 -07:00
Jan Wassenberg
61e031fe98
Towards building tests without GUnit Refs #29
PiperOrigin-RevId: 618032987
2024-03-21 19:28:02 -07:00
Jan Wassenberg
24add61dd9
Fix SFP/NUQ for bf16 rounding in Highway
SFP: Avoid rounding twice, and more robust TestDot.
NUQ: also more robust SNR, minor touchups to header.
PiperOrigin-RevId: 618030096
2024-03-21 19:06:19 -07:00
RangerUFO
90b0e9fd7a
Refactor the implementation of `Attention`
2024-03-21 14:40:56 +08:00
Jan Wassenberg
a135bc1e47
Fix build for RPi, missing hn::. Refs #112, thanks long568
PiperOrigin-RevId: 617704418
2024-03-21 04:19:09 +01:00
Jan Wassenberg
ba86c8d590
Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-21 04:19:02 +01:00
Jan Wassenberg
f8baac80f9
Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.
Also fix includes.
PiperOrigin-RevId: 617386447
2024-03-21 04:18:55 +01:00
Eric Ye
52940d435f
Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-21 04:18:48 +01:00
Eric Ye
89be4c3de8
No public description
PiperOrigin-RevId: 617315030
2024-03-21 04:18:36 +01:00
Jan Wassenberg
30b8a3c1ac
Fix build for RPi, missing hn::. Refs #112, thanks long568
PiperOrigin-RevId: 617704418
2024-03-20 20:07:49 -07:00
Ville Sundell
546519c855
Added a missing space in app.h
When the user runs "--help", they see the non-existent word
"compressingnew". This is because of a missing space, which
is now added, resulting in "compressing new".
2024-03-21 00:39:45 +02:00
Jan Wassenberg
06cea2bcdb
Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 23:37:39 +01:00
Jan Wassenberg
edaafe335f
Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.
Also fix includes.
PiperOrigin-RevId: 617386447
2024-03-20 23:37:32 +01:00
Eric Ye
e2a04b79ed
Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-20 23:37:25 +01:00
Eric Ye
ffd02c59ad
No public description
PiperOrigin-RevId: 617315030
2024-03-20 23:37:12 +01:00
Jan Wassenberg
7d5364bb80
Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 11:31:59 -07:00
RangerUFO
8fc6959950
Move conditional branch out of `pos2` loop
2024-03-20 23:50:14 +08:00
RangerUFO
c75d2eb635
Add the missing `HWY_ATTR` of `ProjKV`
2024-03-20 23:21:43 +08:00
RangerUFO
ce32f4db81
Streamline the implementation
2024-03-20 22:39:31 +08:00
Jan Wassenberg
11d9c51473
Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.
Also fix includes.
PiperOrigin-RevId: 617386447
2024-03-20 12:13:13 +01:00
Eric Ye
6865819bb7
Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-20 12:13:06 +01:00
Eric Ye
fdc3812446
No public description
PiperOrigin-RevId: 617315030
2024-03-20 12:12:54 +01:00
RangerUFO
6923aec853
Add MQA support
2024-03-20 18:17:24 +08:00
RangerUFO
130e1f678f
Adjust vocab size to be the same as gemma_pytorch
2024-03-20 18:17:24 +08:00
Jan Wassenberg
5e0cafbdc2
Fix msan error, uninitialized model_training
This arose during the unpacking of LoaderArgs into individual ctor args. Probably better to pass LoaderArgs in, and have only a single ctor to reduce confusion.
Also fix includes.
PiperOrigin-RevId: 617386447
2024-03-19 21:12:35 -07:00
Eric Ye
fdb1091b9c
Connect "--weights" parameter to Gemma
PiperOrigin-RevId: 617323257
2024-03-19 16:08:26 -07:00
enum-class
4400842337
Minor refactor in Softmax
2024-03-20 00:20:14 +08:00
enum-class
858d5b08c2
Use highway in AddFrom, MulBy, MulByConst, MulByConstAndAdd, create_distribution
2024-03-19 08:38:09 +08:00
Copybara-Service
720f609d84
Merge pull request #102 from google:experimental
PiperOrigin-RevId: 616882521
2024-03-18 10:56:52 -07:00