Commit Graph

55 Commits

Author SHA1 Message Date
Phil Culliton ecab0cef3a Update README with Gemma 3 support and contributor acknowledgments
PiperOrigin-RevId: 825588241
2025-10-29 09:46:51 -07:00
Nitin Gangahar 085a34965a Update README since backprop and Adam optimizer have been deleted.
PiperOrigin-RevId: 823388833
2025-10-24 00:18:05 -07:00
Jan Wassenberg 3ed403e287 Major cleanup of profiler zones, add Caller annotation for all pool.Run
Pass ThreadingContext instead of Pools/Profiler individually, for access to Zones
Add GCPP_ZONE helper
Add Caller argument to pool.Run to enable new stats
Remove most direct dependencies on ThreadPool, prefer ParallelFor

PiperOrigin-RevId: 822934530
2025-10-23 01:54:24 -07:00
Jan Wassenberg 035273c184 tune pool kSpin mode in threading_context
Previously, this happened concurrently with the matmul autotune, which could lead to incorrect outcomes.

threading: de-singleton Pinning (no longer stores affinity); pass PoolWorkerMapping; fix Pool dtor order
Also enable SPR target (Zen4 is AMD-only),
update Highway version for renamed Thread()->GlobalIdx().
PiperOrigin-RevId: 816223017
2025-10-07 08:36:26 -07:00
Jan Wassenberg 2b4c16e243 Remove Griffin support
Also add IsObsolete helper

PiperOrigin-RevId: 803376921
2025-09-05 02:35:40 -07:00
Jan Wassenberg faa4102992 (Resubmit) Prepare profiler annotations for new API
Pass hwy::Profiler& to low-level functions.
Used ThreadingContext arg instead of NestedPools.
Use new PROFILER_ZONE3.

PiperOrigin-RevId: 794461159
2025-08-13 01:38:24 -07:00
The gemma.cpp Authors a2d9133f7d Prepare profiler annotations for new API
Pass hwy::Profiler& to low-level functions.
Used ThreadingContext arg instead of NestedPools.
Use new PROFILER_ZONE3.

PiperOrigin-RevId: 793865287
2025-08-11 17:51:38 -07:00
Jan Wassenberg 4cbf63e6f0 Prepare profiler annotations for new API
Pass hwy::Profiler& to low-level functions.
Used ThreadingContext arg instead of NestedPools.
Use new PROFILER_ZONE3.

PiperOrigin-RevId: 793821255
2025-08-11 15:34:52 -07:00
Jan Wassenberg 31d2b231af Update PaliGemma Kaggle link to point to v2
PiperOrigin-RevId: 772328912
2025-06-16 23:24:57 -07:00
Jan Wassenberg c027a45a2e MatPtr-ify KV, shared div_seq_len, --seq_len flag
PiperOrigin-RevId: 770194455
2025-06-11 09:49:38 -07:00
Jan Wassenberg 252a4e955e Remove support for Gemma 1 and PaliGemma 1 models, superseded by (Pali)Gemma 2.
PiperOrigin-RevId: 756671308
2025-05-09 02:17:27 -07:00
Biruk Mammo d9d1709df8 Updates stale references to `compression/migrate_weights`.
PiperOrigin-RevId: 755938143
2025-05-07 11:33:59 -07:00
Jan Wassenberg 8d0882b966 Huge refactor of weight handling and model loading.
Weight handling:
- new ModelStore2 supports both pre-2025 multi-file and single-file formats
- simpler ForEachTensor with TensorArgs
- tensors are constructed with their full suffixed name

I/O:
- support mmap and stride
- Simplified SbsWriter, single insert(); add SbsReader

Misc:
- kMockTokenizer: allow creating with unavailable tokenizer
- configs.h: Simpler enum validity checks via kSentinel
- matmul.h: remove unused enable_bind (now in allocator.h)
- tensor_info: single TensorInfoRegistry class, rename from tensor_index.h

Frontends:
- Replace Allocate/CreateGemma with ctor(LoaderArgs, MatMulEnv&)
- Deduce model/weight type, remove --model and parsing
- Replace most common.h includes with configs.h
- Remove --compressed_weights, use --weights instead
- Remove ModelInfo, replaced by ModelConfig.

Backprop:
- Reduce max loss, remove backward_scalar_test (timeout)
- Update thresholds because new RandInit changes rng eval order and thus numerics
PiperOrigin-RevId: 755317484
2025-05-06 04:44:21 -07:00
Jan Wassenberg a3caf6e5d2 Add summary of optimizations/infra present in the repository
PiperOrigin-RevId: 754838402
2025-05-05 01:46:01 -07:00
Jan Wassenberg 83219e3c68 Add note on attention length and SFP
PiperOrigin-RevId: 738698399
2025-03-20 00:39:06 -07:00
Quirin Niedernhuber 0ff6b3123a Point out Gemma 3 support in README.md
PiperOrigin-RevId: 736125794
2025-03-12 07:33:30 -07:00
Daniel Keysers f173aa776e Add conversion tool for HF safetensors to gemma.cpp for PaliGemma.
PiperOrigin-RevId: 725990158
2025-02-12 03:47:43 -08:00
Daniel Keysers 493688f6f1 Allow interactive use with new single-file weight format.
Add section about new weights format to README.md.
Remove model_type_required parameter.
Update error handling for flags.

PiperOrigin-RevId: 715788822
2025-01-15 07:22:33 -08:00
Daniel Keysers 73766e8ee3 Small updates to the README file.
PiperOrigin-RevId: 707036429
2024-12-17 04:09:55 -08:00
Daniel Keysers a4d6adbc43 Introduce QueryResult in GemmaEnv and add a shortcut for WrapAndTokenize.
Remove max_tokens (and rely on only max_generated_tokens).

PiperOrigin-RevId: 685662260
2024-10-14 04:45:21 -07:00
Daniel Keysers 71116daf64 Tiny update of the README formatting.
PiperOrigin-RevId: 679162673
2024-09-26 08:38:12 -07:00
Daniel Keysers 709143e9a6 Add download location of Pali Gemma weights to README.md.
PiperOrigin-RevId: 679127088
2024-09-26 06:38:11 -07:00
Daniel Keysers f8835fe4a4 Add support for PaliGemma Vision-LM (224x224) to gemma.cpp
See https://arxiv.org/abs/2407.07726 for a description of the model.
Because PaliGemma operates as a prefix-LM on the image+prompt, add support for that.

PiperOrigin-RevId: 677841119
2024-09-23 10:09:38 -07:00
Jan Wassenberg 4154f5a910 Document Gemma 2 model names
PiperOrigin-RevId: 659858832
2024-08-06 01:44:15 -07:00
Jan Wassenberg f9b390b134 Support all weight types in a single binary.
This changes the command line flags, but the default value retains the previous behavior.

Also add a CreateGemma helper to enable extra args without interface changes.

PiperOrigin-RevId: 641266411
2024-06-07 09:04:45 -07:00
Jan Wassenberg e3f4374e81 Fix fix for weight type define, refs #198
GEMMA_WEIGHT_T is indeed the correct flag for the C++ compiler,
but the readme references CMake, and there the correct flag name is WEIGHT_TYPE.

PiperOrigin-RevId: 641170380
2024-06-07 01:32:25 -07:00
Jan Wassenberg 8dc0e5ea83 Fix reference to GEMMA_WEIGHT_T. Refs #198
PiperOrigin-RevId: 641161403
2024-06-07 00:54:30 -07:00
Paul Chang 82623bdc7f Refer to --weights rather than --compressed_weights to simplify CLI docs
PiperOrigin-RevId: 634391135
2024-05-16 07:51:49 -07:00
Jan Wassenberg 54120a5571 Mention Makefile contributed by @jart
PiperOrigin-RevId: 623436818
2024-04-10 03:21:10 -07:00
zond 9ca662dc14 Clarified README
Made it more visible that the recurrent weights are at a different Kaggle page.
2024-04-09 09:58:47 +02:00
Luca Versari 9c3f969405 Implement the Griffin model.
Also implement support for some model variations:

- Local attention.
- Add support for biases.
- Use RoPE only on half vectors.
- Support different order of QKV weights.

Co-authored-by: Andrey Mikhaylov <amik@google.com>
Co-authored-by: Martin Bruse <zondolfin@gmail.com>
Co-authored-by: Zoltan Szabadka <szabadka@google.com>
2024-04-08 21:45:54 +02:00
Jan Wassenberg 7122afed5a Add note on weight update and improve error message
PiperOrigin-RevId: 621849989
2024-04-04 07:17:27 -07:00
austinvhuang 810b5a0cc2 Update README with more details on contributing code, add experimental/ directory, add READMEs for subdirectories, clean up DEVELOPER notes
2024-03-15 14:10:24 -04:00
Copybara-Service c8b9675898 Merge pull request #74 from osanseviero:patch-1
PiperOrigin-RevId: 612937722
2024-03-05 12:49:09 -08:00
Jan Wassenberg bb9b023502 Support Bazel builds. Fixes #16
Also fix nuq/sfp-inl: warning, cast, and disable SCALAR

PiperOrigin-RevId: 612704056
2024-03-04 22:07:25 -08:00
Omar Sanseviero 8c857b957e Update README.md
2024-03-04 12:58:49 +01:00
Omar Sanseviero 86761dc113 Update README.md
2024-03-01 23:44:38 +01:00
austinvhuang 0ea7b993de remove --log fixing https://github.com/google/gemma.cpp/issues/59, improve command line args help, add copybara #include sort guards in more source files, add README sections on running faster and related projects
2024-02-28 15:18:40 -05:00
Dan Zheng afc354dcb1 Import from GitHub.
PiperOrigin-RevId: 610595796
2024-02-26 19:05:11 -08:00
Dan Zheng 8db89304bd No public description
PiperOrigin-RevId: 610498969
2024-02-26 19:03:48 -08:00
austinvhuang 129e66ada2 Reduce KV cache preallocation to 4096 and make it comptime configurable, add rm build note in readme, add note on comptime options in DEVELOPERS, make multiturn=0 the default
2024-02-26 17:05:32 -05:00
Naoki Kishida 7ab968c957 Copybara import of the project:
--
26b541b666 by kishida <naokikishida@gmail.com>:

add information for the resetting conversation

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/40 from kishida:add_info_for_reset_conv 26b541b666
PiperOrigin-RevId: 610418671
2024-02-26 08:39:27 -08:00
Dan Zheng 4c155bd3df Restore reverted changes.
Sync to 84444c93a4.

PiperOrigin-RevId: 610263918
2024-02-25 19:32:07 -08:00
Silvio Traversaro 696597383c Copybara import of the project:
--
19694e1f2e by Silvio Traversaro <silvio@traversaro.it>:

Do not pass explicitly -O2 flag to compiler in Release build

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gemma.cpp/pull/3 from traversaro:patch-1 19694e1f2e
PiperOrigin-RevId: 610096914
2024-02-24 20:41:33 -08:00
Dan Zheng 84444c93a4 Revert "Copybara configuration update."
This reverts commit c03b5da542.

Restore lost changes due to improper Copybara syncing.
2024-02-24 15:15:14 -08:00
Dan Zheng c03b5da542 Copybara configuration update.
PiperOrigin-RevId: 609931218
2024-02-24 12:02:47 -08:00
Austin Huang 34b22c56f5 Merge pull request #6 from dcoles/clang-cl
Allow building on Windows using `clang-cl` toolchain
2024-02-24 12:27:40 -05:00
Jan Wassenberg af715d2436 Update readme to match code, see cl/609177092
PiperOrigin-RevId: 609912278
2024-02-23 22:34:08 -08:00
Jan Wassenberg 8f27580fb6 Merge branch 'dev' into clang-cl
2024-02-24 04:22:42 +01:00
The gemma.cpp Authors 7c9954dea5 Code update
PiperOrigin-RevId: 609719211
2024-02-23 07:13:10 -08:00