Commit Graph

851 Commits

Author SHA1 Message Date
Jan Wassenberg e9a0caed87 Further improve IO, enable multiple backends without -D.
Move Path into io.h and use for opening files.
Removes dependency of gemma_lib on args.
Separate Windows codepath instead of emulating POSIX functions.

Plus lint fixes.

PiperOrigin-RevId: 626279004
2024-04-19 00:40:29 -07:00
Paul Chang 38f1ea9b80 Eliminate redundant copies of TokenString()
Move this function outside of HWY_NAMESPACE since it doesn't need to be
optimized for any particular architecture.

PiperOrigin-RevId: 626098641
2024-04-18 11:31:50 -07:00
Jan Wassenberg a8ceb75f43 Improved IO abstraction layer
Move to unique_ptr-like File class.
Move `if OS_WIN` into wrapper functions.
exists -> Exists.

PiperOrigin-RevId: 625923056
2024-04-17 23:15:07 -07:00
Jan Wassenberg a939b5fc9f Update distortion.h to weighted average, add distortion_test.
More thorough checks in sfp_test and nuq_test.
nuq_test: use deterministic input generator.

PiperOrigin-RevId: 625602019
2024-04-17 01:44:19 -07:00
Copybara-Service 05e7e2b2bb Merge pull request #145 from atorero:dev
PiperOrigin-RevId: 624221085
2024-04-12 10:27:18 -07:00
Andrey Mikhaylov 4ef3da733a Fixed minor things and added comments. 2024-04-12 15:39:16 +00:00
Andrey Mikhaylov 2c5706f159 Add comments regarding layers output usage. 2024-04-12 15:39:16 +00:00
Andrey Mikhaylov 03284d752e Added layers output functionality to gemma and a binary debug_output to save the outputs to a json file. 2024-04-12 15:39:16 +00:00
Copybara-Service 342e998cb6 Merge pull request #142 from ufownl:refactor/data_structures
PiperOrigin-RevId: 623503486
2024-04-10 08:35:18 -07:00
RangerUFO e541707caa Rename the fields of Griffin weights 2024-04-10 21:04:31 +08:00
RangerUFO 4e960d67f6 Fix typos 2024-04-10 20:38:18 +08:00
RangerUFO 809bd0709d Refactor data structures to reduce memory usage 2024-04-10 19:35:23 +08:00
Jan Wassenberg 54120a5571 Mention Makefile contributed by @jart
PiperOrigin-RevId: 623436818
2024-04-10 03:21:10 -07:00
Jan Wassenberg 881eeffe0a Lint fixes: strcat, includes, arg naming
PiperOrigin-RevId: 623435210
2024-04-10 03:12:41 -07:00
Copybara-Service da91f4c4be Merge pull request #137 from zond:main
PiperOrigin-RevId: 623255639
2024-04-09 12:57:57 -07:00
Copybara-Service 827fec1904 Merge pull request #139 from ufownl:feature/public_layers
PiperOrigin-RevId: 623254705
2024-04-09 12:54:23 -07:00
RangerUFO 2099b37732 Change `NumGemmaLayers` and `NumGriffinLayers` to constants in configs 2024-04-09 20:44:41 +08:00
Jan Wassenberg a982ec1287 Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.

PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
zond 9ca662dc14 Clarified README
Made it more visible that the recurrent weights are on a different Kaggle page.
2024-04-09 09:58:47 +02:00
Copybara-Service 83dd08ac87 Merge pull request #136 from pculliton:griffin
PiperOrigin-RevId: 623054233
2024-04-08 22:29:24 -07:00
Luca Versari 9c3f969405 Implement the Griffin model.
Also implement support for some model variations:

- Local attention.
- Add support for biases.
- Use RoPE only on half vectors.
- Support different order of QKV weights.

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-08 21:45:54 +02:00
Jan Wassenberg 4326249d0a Fix includes
PiperOrigin-RevId: 622456877
2024-04-06 09:27:09 -07:00
Jan Wassenberg a3a0f78fda Merge pull request #131 from veluca93:benchmark-and-test
PiperOrigin-RevId: 622452794
2024-04-06 18:06:03 +02:00
Jan Wassenberg 9e51a91cfc Faster bazel builds by only building all local targets.
PiperOrigin-RevId: 622442126
2024-04-06 18:05:49 +02:00
Luca Versari 5862d1f995 Add a benchmark and additional tests.
Also add a script to help running sanitizer builds, and do some cleanup.

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Sami Boukortt <sboukortt@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 12:54:52 +02:00
Jan Wassenberg d852cf5089 Remove unused includes
PiperOrigin-RevId: 622412150
2024-04-06 03:13:43 -07:00
Copybara-Service 325ef06cf9 Merge pull request #130 from veluca93:weight-handling
PiperOrigin-RevId: 622405491
2024-04-06 02:22:00 -07:00
Luca Versari 4c23932289 Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros

Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Copybara-Service 280b8cb8a1 Merge pull request #129 from veluca93:more-ops
PiperOrigin-RevId: 622145499
2024-04-05 05:02:00 -07:00
Luca Versari 6cdb8a45a0 Add more ops: Sigmoid, (Two)MatVecAdd. Faster TwoMatVec.
drive-by: some build system simplifications

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Lode Vandevenne <lode@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-05 12:27:31 +02:00
Jan Wassenberg 7122afed5a Add note on weight update and improve error message
PiperOrigin-RevId: 621849989
2024-04-04 07:17:27 -07:00
Copybara-Service 08948f13ac Merge pull request #127 from szabadka:gemma3
PiperOrigin-RevId: 621815677
2024-04-04 04:32:03 -07:00
Jan Wassenberg 44e6274e99 1.07x speedup: merge MQA parallel sections as suggested by @veluca93
PiperOrigin-RevId: 621772392
2024-04-04 01:12:53 -07:00
Zoltan Szabadka 71ead04afb Fix off-by-one errors in generation code and token streaming callback.
In the generation code we were feeding the last token of the prompt
twice through the transformer. The new version fixes that and also
works in the case where Prefill is completely disabled.
2024-04-04 07:56:21 +00:00
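
The off-by-one described in the commit above (the last prompt token being fed through the transformer twice) can be illustrated with a minimal sketch. This is not the gemma.cpp code: `ToyModel`, `Step`, `Generate`, and the `prefill_count` parameter are hypothetical names invented for illustration, and the "model" simply returns its input plus one. The point is only that every prompt token is fed exactly once, whether or not a prefill phase runs.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a transformer (assumption, not the gemma.cpp API):
// records every token it is fed and returns a dummy next token (input + 1).
struct ToyModel {
  std::vector<int> fed;
  int Step(int token) {
    fed.push_back(token);
    return token + 1;
  }
};

// Feeds up to `prefill_count` prompt tokens in a prefill phase (outputs
// ignored), then the remaining prompt tokens one at a time, then generates.
// Each prompt token is fed exactly once; prefill_count == 0 disables prefill.
std::vector<int> Generate(ToyModel& model, const std::vector<int>& prompt,
                          size_t prefill_count, size_t max_new_tokens) {
  std::vector<int> generated;
  if (prompt.empty()) return generated;

  // Never prefill the last prompt token: its output selects the first
  // generated token, so it must go through the token-by-token path once.
  const size_t prefill_end = std::min(prefill_count, prompt.size() - 1);
  size_t pos = 0;
  for (; pos < prefill_end; ++pos) model.Step(prompt[pos]);

  // Remaining prompt tokens. After this loop, `next` is the prediction that
  // follows the last prompt token.
  int next = 0;
  for (; pos < prompt.size(); ++pos) next = model.Step(prompt[pos]);

  while (generated.size() < max_new_tokens) {
    generated.push_back(next);
    if (generated.size() == max_new_tokens) break;
    next = model.Step(next);  // Feed only newly generated tokens from here on.
  }
  return generated;
}
```

In this sketch, calling `Generate` on the same prompt with `prefill_count` set to 0 or to a value larger than the prompt length leaves the same sequence in `model.fed`, with the last prompt token appearing once.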
Copybara-Service ede337f876 Merge pull request #125 from szabadka:gemma1
PiperOrigin-RevId: 621549709
2024-04-03 09:35:25 -07:00
Zoltan Szabadka b670d43e4f Add standalone tool to compress weights.
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-04-03 14:54:08 +00:00
Copybara-Service 93a648926c Merge pull request #122 from LINKIWI:bazelversion
PiperOrigin-RevId: 621148731
2024-04-02 06:02:42 -07:00
Kevin Lin 1845b19b47 .bazelversion: Bazel 7.1.1 2024-03-31 11:39:21 -07:00
Copybara-Service 7e0a6fcab1 Merge pull request #120 from ufownl:bugfix/gcc_compilation_error
PiperOrigin-RevId: 620016059
2024-03-28 12:10:08 -07:00
RangerUFO 1c03d7446d Fix compilation error when `HWY_COMPILER_GCC_ACTUAL < 1300` 2024-03-28 14:54:37 +08:00
Jan Wassenberg bb767d788d Bounds-checks for large prompts. Refs #99
Also remove init placeholder and move Sqrt to ops.h.

PiperOrigin-RevId: 619529202
2024-03-27 07:49:46 -07:00
Copybara-Service bbf4df4584 Merge pull request #115 from villesundell:patch-1
PiperOrigin-RevId: 619262700
2024-03-26 11:46:54 -07:00
Jan Wassenberg c1d3c3284c Add ops_test to BUILD, rename transformer_ops->ops, fix includes.
Also fix copybara. Refs #105

PiperOrigin-RevId: 619157071
2024-03-26 05:37:31 -07:00
Copybara-Service 9f1595c110 Merge pull request #105 from enum-class:improve_ops_utility
PiperOrigin-RevId: 618827910
2024-03-25 07:00:45 -07:00
Jan Wassenberg 3f2fabcfcb Update todo to mention PartialSort
PiperOrigin-RevId: 618685783
2024-03-24 17:14:31 -07:00
enum-class aa6e88e591 add unit tests for ops 2024-03-23 21:09:19 +08:00
enum-class d079c8f1ba Merge branch 'dev' into improve_ops_utility 2024-03-23 10:36:56 +08:00
Copybara-Service fcf5c1af88 Merge pull request #114 from ufownl:experimental
PiperOrigin-RevId: 618148701
2024-03-22 05:36:07 -07:00
Jan Wassenberg 61e031fe98 Towards building tests without GUnit Refs #29
PiperOrigin-RevId: 618032987
2024-03-21 19:28:02 -07:00
Jan Wassenberg 24add61dd9 Fix SFP/NUQ for bf16 rounding in Highway
SFP: Avoid rounding twice, and more robust TestDot.
NUQ: also more robust SNR, minor touchups to header.

PiperOrigin-RevId: 618030096
2024-03-21 19:06:19 -07:00