Commit Graph

38 Commits

Author SHA1 Message Date
Jan Wassenberg 85fcd3cd80 Cleanup: add ModelInfo struct, remove gcpp::
PiperOrigin-RevId: 648707763
2024-07-02 07:11:15 -07:00
Jan Wassenberg b1c1ec1d59 Use benchmark_helper in py bindings (adds BOS)
Also remove thread clamp (OK to be zero or large).

PiperOrigin-RevId: 648657155
2024-07-02 03:27:15 -07:00
The gemma.cpp Authors ef786f1bfc Use hwy::ThreadPool::MaxThreads() to determine the number of threads to use.
PiperOrigin-RevId: 646117298
2024-06-24 09:16:04 -07:00
Daniel Keysers 0570972d43 Fixing two typos.
PiperOrigin-RevId: 645103198
2024-06-20 11:33:12 -07:00
Jan Wassenberg 3e2396f98c Use Loader/AppArgs to construct gemma_test model, simplify AcceptFunc
accept_token: allow default, check if empty when using
allow mixing sample_func and stream_func, call the latter after the former
Also fix missing includes/deps.
PiperOrigin-RevId: 642240012
2024-06-11 05:53:10 -07:00
Jan Wassenberg f9b390b134 Support all weight types in a single binary.
This changes the command line flags, but the default value retains the previous behavior.

Also add a CreateGemma helper to enable extra args without interface changes.

PiperOrigin-RevId: 641266411
2024-06-07 09:04:45 -07:00
Zelalem Aweke 9e213b3d96 Use system topology to pin threads across clusters.
PiperOrigin-RevId: 640151974
2024-06-04 07:50:32 -07:00
Jan Wassenberg 12fb2f05cf Add per-thread even_odd storage for #166.
Also inline ProjQ and ProjKV lambdas,
add missing includes/deps for ops_test.

PiperOrigin-RevId: 629460608
2024-04-30 10:42:23 -07:00
Jan Wassenberg 7a12e29027 Add error-checking for py binding, add missing include+hwasan check
PiperOrigin-RevId: 628453112
2024-04-26 10:59:41 -07:00
Phil Culliton 9e0ac5de34 Update Clif wrapper to work with latest gemma.cpp and add simple example
PiperOrigin-RevId: 628134201
2024-04-25 11:17:16 -07:00
Jan Wassenberg a8ceb75f43 Improved IO abstraction layer
Move to unique_ptr-like File class.
Move `if OS_WIN` into wrapper functions.
exists -> Exists.

PiperOrigin-RevId: 625923056
2024-04-17 23:15:07 -07:00
Jan Wassenberg a982ec1287 Move code to gemma/ so we can remove error-prone copybara: comments.
Also fix includes and Lint warnings.

PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00
Luca Versari 9c3f969405 Implement the Griffin model.
Also implement support for some model variations:

- Local attention.
- Add support for biases.
- Use RoPE only on half vectors.
- Support different order of QKV weights.

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-08 21:45:54 +02:00
Luca Versari 5862d1f995 Add a benchmark and additional tests.
Also add a script to help running sanitizer builds, and do some cleanup.

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Sami Boukortt <sboukortt@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 12:54:52 +02:00
Luca Versari 4c23932289 Improve weight handling.
- Allow scaling of SFP weights
- Allow using uncompressed weights
- Do not try to compress weights in the main model calls
- Reduce code duplication in weight handling with some macros

Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
Co-authored-by: Thomas Fischbacher <tfish@google.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-06 11:08:47 +02:00
Copybara-Service bbf4df4584 Merge pull request #115 from villesundell:patch-1
PiperOrigin-RevId: 619262700
2024-03-26 11:46:54 -07:00
Jan Wassenberg ba86c8d590 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-21 04:19:02 +01:00
Eric Ye 89be4c3de8 No public description
PiperOrigin-RevId: 617315030
2024-03-21 04:18:36 +01:00
Ville Sundell 546519c855 Added a missing space in app.h
When the user runs "--help", they see the non-existent word
"compressingnew". This is because of a missing space, which
is now added, resulting in "compressing new".
2024-03-21 00:39:45 +02:00
Jan Wassenberg 06cea2bcdb Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 23:37:39 +01:00
Eric Ye ffd02c59ad No public description
PiperOrigin-RevId: 617315030
2024-03-20 23:37:12 +01:00
Jan Wassenberg 7d5364bb80 Remove obsolete copybara tags, faster bazel builds (debug)
PiperOrigin-RevId: 617576799
2024-03-20 11:31:59 -07:00
Copybara-Service 0221956b2e Merge pull request #87 from google:refactor-tidy
PiperOrigin-RevId: 615204427
2024-03-12 16:10:47 -07:00
austinvhuang 4aa8d0584e Merge branch 'dev' into refactor-tidy
2024-03-12 15:01:46 -04:00
Copybara-Service ccd055e06b Merge pull request #82 from google:examples
PiperOrigin-RevId: 615066980
2024-03-12 09:24:24 -07:00
Jan Wassenberg 0d406061c0 Detect and print build type. Refs #88
PiperOrigin-RevId: 614906000
2024-03-11 21:58:10 -07:00
austinvhuang 60d054e041 move arg definitions out of gemma.h to app.h
2024-03-10 23:49:25 -04:00
austinvhuang 10f7a086aa [WIP] decouple GemmaImpl from CLI args
2024-03-06 15:06:41 -05:00
austinvhuang 0ea7b993de remove --log fixing https://github.com/google/gemma.cpp/issues/59, improve command line args help, add copybara #include sort guards in more source files, add README sections on running faster and related projects
2024-02-28 15:18:40 -05:00
Copybara-Service 1a1dd90287 Merge pull request #33 from shirayu:add_eot_option
PiperOrigin-RevId: 610838070
2024-02-27 12:32:01 -08:00
Jan Wassenberg 179ecf9e78 Warn instead of assert for setaffinity. Fixes #49
PiperOrigin-RevId: 610638517
2024-02-26 22:46:11 -08:00
Dan Zheng 4c155bd3df Restore reverted changes.
Sync to 84444c93a4.

PiperOrigin-RevId: 610263918
2024-02-25 19:32:07 -08:00
Silvio Traversaro 696597383c Copybara import of the project:
--
19694e1f2e by Silvio Traversaro <silvio@traversaro.it>:

Do not pass explicitly -O2 flag to compiler in Release build

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/3 from traversaro:patch-1 19694e1f2e
PiperOrigin-RevId: 610096914
2024-02-24 20:41:33 -08:00
Dan Zheng 84444c93a4 Revert "Copybara configuration update."
This reverts commit c03b5da542.

Restore lost changes due to improper Copybara syncing.
2024-02-24 15:15:14 -08:00
Dan Zheng c03b5da542 Copybara configuration update.
PiperOrigin-RevId: 609931218
2024-02-24 12:02:47 -08:00
Yuta Hayashibe 1a95cf3274 Add --eot_line option
2024-02-24 23:27:33 +09:00
David Coles 39e385782c Allow building on Windows using `clang-cl` toolchain
It's not possible to build `gemma.cpp` with the standard MSVC front-end
because it doesn't support arrays larger than `0x7fffffff` bytes (see Compiler Error C2148);
however, this isn't a problem with the optional Visual Studio Clang/LLVM front-end.

This can be specified using the `-T` flag when running CMake:

```
$ cmake -B build -T ClangCL
$ cmake --build build --config Release
```

Windows doesn't provide `pread`/`pwrite` so this must be emulated using
the `ReadFile`/`WriteFile` Win32 APIs.

`_CRT_SECURE_NO_WARNINGS` is defined to prevent a large number of warnings
about using "deprecated" function names (e.g. `close` instead of `_close`).

`NOMINMAX` is defined to prevent the `min`/`max` macros from `windows.h`
from conflicting with expressions like `std::min`. Generally libraries should
avoid including `windows.h` in their public headers or define `WIN32_LEAN_AND_MEAN`
before including the `windows.h` header, but this unfortunately isn't always the case.
2024-02-23 00:38:54 -08:00
Austin Huang e29cd566cf initial commit
2024-02-21 03:31:22 +00:00