Jan Wassenberg
e9a0caed87
Further improve IO, enable multiple backends without -D.
Move Path into io.h and use for opening files.
Removes dependency of gemma_lib on args.
Separate Windows codepath instead of emulating POSIX functions.
Plus lint fixes.
PiperOrigin-RevId: 626279004
2024-04-19 00:40:29 -07:00
Paul Chang
38f1ea9b80
Eliminate redundant copies of TokenString()
Move this function outside of HWY_NAMESPACE since it doesn't need to be
optimized for any particular architecture.
PiperOrigin-RevId: 626098641
2024-04-18 11:31:50 -07:00
Jan Wassenberg
a8ceb75f43
Improved IO abstraction layer
Move to unique_ptr-like File class.
Move `if OS_WIN` into wrapper functions.
exists -> Exists.
PiperOrigin-RevId: 625923056
2024-04-17 23:15:07 -07:00
Jan Wassenberg
a939b5fc9f
Update distortion.h to weighted average, add distortion_test.
More thorough checks in sfp_test and nuq_test.
nuq_test: use deterministic input generator.
PiperOrigin-RevId: 625602019
2024-04-17 01:44:19 -07:00
Copybara-Service
05e7e2b2bb
Merge pull request #145 from atorero:dev
PiperOrigin-RevId: 624221085
2024-04-12 10:27:18 -07:00
Andrey Mikhaylov
4ef3da733a
Fixed minor things and added comments.
2024-04-12 15:39:16 +00:00
Andrey Mikhaylov
2c5706f159
Add comments regarding layers output usage.
2024-04-12 15:39:16 +00:00
Andrey Mikhaylov
03284d752e
Added layers output functionality to gemma and a binary debug_output to save the outputs to a JSON file.
2024-04-12 15:39:16 +00:00
Copybara-Service
342e998cb6
Merge pull request #142 from ufownl:refactor/data_structures
PiperOrigin-RevId: 623503486
2024-04-10 08:35:18 -07:00
RangerUFO
e541707caa
Rename the fields of Griffin weights
2024-04-10 21:04:31 +08:00
RangerUFO
4e960d67f6
Fix typos
2024-04-10 20:38:18 +08:00
RangerUFO
809bd0709d
Refactor data structures to reduce memory usage
2024-04-10 19:35:23 +08:00
Jan Wassenberg
54120a5571
Mention Makefile contributed by @jart
PiperOrigin-RevId: 623436818
2024-04-10 03:21:10 -07:00
Jan Wassenberg
881eeffe0a
Lint fixes: strcat, includes, arg naming
PiperOrigin-RevId: 623435210
2024-04-10 03:12:41 -07:00
Copybara-Service
da91f4c4be
Merge pull request #137 from zond:main
PiperOrigin-RevId: 623255639
2024-04-09 12:57:57 -07:00
Copybara-Service
827fec1904
Merge pull request #139 from ufownl:feature/public_layers
PiperOrigin-RevId: 623254705
2024-04-09 12:54:23 -07:00
RangerUFO
2099b37732
Change `NumGemmaLayers` and `NumGriffinLayers` to constants in configs
2024-04-09 20:44:41 +08:00
Jan Wassenberg
a982ec1287
Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.
PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
zond
9ca662dc14
Clarified README
Made it more visible that the recurrent weights are at a different Kaggle page.
2024-04-09 09:58:47 +02:00
Copybara-Service
83dd08ac87
Merge pull request #136 from pculliton:griffin
PiperOrigin-RevId: 623054233
2024-04-08 22:29:24 -07:00
Luca Versari
9c3f969405
Implement the Griffin model.
Also implement support for some model variations:
- Local attention.
- Add support for biases.
- Use RoPE only on half vectors.
- Support different order of QKV weights.
Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-08 21:45:54 +02:00
Jan Wassenberg
4326249d0a
Fix includes
PiperOrigin-RevId: 622456877
2024-04-06 09:27:09 -07:00
Jan Wassenberg
a3a0f78fda
Merge pull request #131 from veluca93:benchmark-and-test
PiperOrigin-RevId: 622452794
2024-04-06 18:06:03 +02:00
Jan Wassenberg
9e51a91cfc
Faster bazel builds by only building all local targets.
PiperOrigin-RevId: 622442126
2024-04-06 18:05:49 +02:00
Luca Versari
5862d1f995
Add a benchmark and additional tests.
Also add a script to help running sanitizer builds, and do some cleanup.
Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Sami Boukortt <sboukortt@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 12:54:52 +02:00
Jan Wassenberg
d852cf5089
Remove unused includes
PiperOrigin-RevId: 622412150
2024-04-06 03:13:43 -07:00
Copybara-Service
325ef06cf9
Merge pull request #130 from veluca93:weight-handling
PiperOrigin-RevId: 622405491
2024-04-06 02:22:00 -07:00
Luca Versari
4c23932289
Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Copybara-Service
280b8cb8a1
Merge pull request #129 from veluca93:more-ops
PiperOrigin-RevId: 622145499
2024-04-05 05:02:00 -07:00
Luca Versari
6cdb8a45a0
Add more ops: Sigmoid, (Two)MatVecAdd. Faster TwoMatVec.
drive-by: some build system simplifications
Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Lode Vandevenne <lode@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-05 12:27:31 +02:00
Jan Wassenberg
7122afed5a
Add note on weight update and improve error message
PiperOrigin-RevId: 621849989
2024-04-04 07:17:27 -07:00
Copybara-Service
08948f13ac
Merge pull request #127 from szabadka:gemma3
PiperOrigin-RevId: 621815677
2024-04-04 04:32:03 -07:00
Jan Wassenberg
44e6274e99
1.07x speedup: merge MQA parallel sections as suggested by @veluca93
PiperOrigin-RevId: 621772392
2024-04-04 01:12:53 -07:00
Zoltan Szabadka
71ead04afb
Fix off-by-one errors in generation code and token streaming callback.
In the generation code we were feeding the last token of the prompt
twice through the transformer. The new version fixes that and also
works in the case where Prefill is completely disabled.
2024-04-04 07:56:21 +00:00
Copybara-Service
ede337f876
Merge pull request #125 from szabadka:gemma1
PiperOrigin-RevId: 621549709
2024-04-03 09:35:25 -07:00
Zoltan Szabadka
b670d43e4f
Add standalone tool to compress weights.
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-04-03 14:54:08 +00:00
Copybara-Service
93a648926c
Merge pull request #122 from LINKIWI:bazelversion
PiperOrigin-RevId: 621148731
2024-04-02 06:02:42 -07:00
Kevin Lin
1845b19b47
.bazelversion: Bazel 7.1.1
2024-03-31 11:39:21 -07:00
Copybara-Service
7e0a6fcab1
Merge pull request #120 from ufownl:bugfix/gcc_compilation_error
PiperOrigin-RevId: 620016059
2024-03-28 12:10:08 -07:00
RangerUFO
1c03d7446d
Fix compilation error when `HWY_COMPILER_GCC_ACTUAL < 1300`
2024-03-28 14:54:37 +08:00
Jan Wassenberg
bb767d788d
Bounds-checks for large prompts. Refs #99
Also remove init placeholder and move Sqrt to ops.h.
PiperOrigin-RevId: 619529202
2024-03-27 07:49:46 -07:00
Copybara-Service
bbf4df4584
Merge pull request #115 from villesundell:patch-1
PiperOrigin-RevId: 619262700
2024-03-26 11:46:54 -07:00
Jan Wassenberg
c1d3c3284c
Add ops_test to BUILD, rename transformer_ops->ops, fix includes.
Also fix copybara. Refs #105
PiperOrigin-RevId: 619157071
2024-03-26 05:37:31 -07:00
Copybara-Service
9f1595c110
Merge pull request #105 from enum-class:improve_ops_utility
PiperOrigin-RevId: 618827910
2024-03-25 07:00:45 -07:00
Jan Wassenberg
3f2fabcfcb
Update todo to mention PartialSort
PiperOrigin-RevId: 618685783
2024-03-24 17:14:31 -07:00
enum-class
aa6e88e591
Add unit tests for ops
2024-03-23 21:09:19 +08:00
enum-class
d079c8f1ba
Merge branch 'dev' into improve_ops_utility
2024-03-23 10:36:56 +08:00
Copybara-Service
fcf5c1af88
Merge pull request #114 from ufownl:experimental
PiperOrigin-RevId: 618148701
2024-03-22 05:36:07 -07:00
Jan Wassenberg
61e031fe98
Towards building tests without GUnit Refs #29
PiperOrigin-RevId: 618032987
2024-03-21 19:28:02 -07:00
Jan Wassenberg
24add61dd9
Fix SFP/NUQ for bf16 rounding in Highway
SFP: Avoid rounding twice, and more robust TestDot.
NUQ: also more robust SNR, minor touchups to header.
PiperOrigin-RevId: 618030096
2024-03-21 19:06:19 -07:00