Apoorv Reddy
fd1b0743a7
Rename Gemma9B and Gemma27B to Gemma2_9B and Gemma2_27B.
...
This makes it clear that these models belong to the Gemma2 family.
PiperOrigin-RevId: 661181682
2024-08-09 02:09:06 -07:00
Phil Culliton
1982a6ba00
Internal change
...
PiperOrigin-RevId: 657831926
2024-07-30 20:24:54 -07:00
Daniel Keysers
5a751a9a44
Update gemma-27b to the correct query scaling.
...
PiperOrigin-RevId: 653201646
2024-07-17 05:43:52 -07:00
The gemma.cpp Authors
df3fb70802
Improve readability with RepeatedAttentionWindowSizes
...
PiperOrigin-RevId: 651431738
2024-07-11 09:11:46 -07:00
Kan Wu
f519ab6693
Refactor configurables.
...
PiperOrigin-RevId: 651259154
2024-07-10 21:30:58 -07:00
Kan Wu
7e4b20455e
Add sliding window attention for Gemma 2.
...
PiperOrigin-RevId: 648778253
2024-07-02 11:08:03 -07:00
Jan Wassenberg
e588a7f45d
Add config for att/final cap, skip max-subtract. Fixes #278
...
Also update includes/deps for backprop/.
PiperOrigin-RevId: 648399222
2024-07-01 09:45:26 -07:00
Paul Chang
8ac5d66575
Introduce new Gemma 9B and 27B configs
...
PiperOrigin-RevId: 647299080
2024-06-27 06:45:24 -07:00
The gemma.cpp Authors
a85725614a
Refactor kCachePosSize and kCacheLayerSize into separate functors.
...
PiperOrigin-RevId: 645048519
2024-06-20 08:52:08 -07:00
Paul Chang
d7d9d14f0e
Move kGriffinLayers into ConfigNoSSM, set kGemmaLayers directly
...
For regular (non-SSM) Gemma models, kGriffinLayers is by definition always zero
and kGemmaLayers is just the number of layers.
PiperOrigin-RevId: 644384531
2024-06-18 07:52:52 -07:00
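A minimal sketch of the relationship described in d7d9d14f0e, assuming a base struct that fixes kGriffinLayers at zero and a derived config that sets kGemmaLayers directly. Only the names ConfigNoSSM, kGriffinLayers and kGemmaLayers come from the commit message; ExampleGemmaConfig, kLayers, TotalLayers and the layer count of 18 are hypothetical and chosen for illustration, not taken from the repository.

```cpp
// Illustrative only: the real gemma.cpp configs may be laid out differently.

// Base for regular (non-SSM) models: there are no Griffin layers by definition.
struct ConfigNoSSM {
  static constexpr int kGriffinLayers = 0;
};

// Hypothetical config for a regular Gemma model.
struct ExampleGemmaConfig : ConfigNoSSM {
  static constexpr int kLayers = 18;            // assumed value, for illustration
  static constexpr int kGemmaLayers = kLayers;  // set directly, nothing to subtract
};

// Hypothetical helper: total layer count implied by a config.
template <class TConfig>
constexpr int TotalLayers() {
  return TConfig::kGemmaLayers + TConfig::kGriffinLayers;
}

// For a non-SSM config, the Gemma layer count is the whole model.
static_assert(TotalLayers<ExampleGemmaConfig>() == ExampleGemmaConfig::kLayers,
              "non-SSM configs have only Gemma layers");
```

With this layout a non-SSM config never has to state kGriffinLayers itself, which is the duplication the commit removes.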
Jan Wassenberg
29c0c574e6
Integrate matmul into FFW: 4.3x prefill speedup
...
```
before, bf16:
27.2929 prefill tokens / sec
17.2114 tokens / sec
after, bf16:
116.496 prefill tokens / sec
17.5391 tokens / sec
```
PiperOrigin-RevId: 643328437
2024-06-14 06:32:26 -07:00
Jan Wassenberg
c15ff9529c
Reduce duplication in Config* by inheriting no-SSM
...
PiperOrigin-RevId: 643030629
2024-06-13 09:48:56 -07:00
Jan Wassenberg
f9b390b134
Support all weight types in a single binary.
...
This changes the command line flags, but the default value retains the previous behavior.
Also add a CreateGemma helper to enable extra args without interface changes.
PiperOrigin-RevId: 641266411
2024-06-07 09:04:45 -07:00
Jan Wassenberg
57c2cd8b52
Simplifications: remove GemmaInterface and GemmaImpl
...
Split common and weights into separate lib
Remove common-inl (does not have to be SIMD code), activations.cc
Centralize switch(Model) to avoid duplication
Move CompressWeightsT to compress_weights.cc
Move LoadWeights to weights.cc
PiperOrigin-RevId: 640869202
2024-06-06 05:54:21 -07:00
Zoltan Szabadka
36e4d8bbfe
Add first version of backpropagation support.
...
This is still in progress and experimental. It is currently
implemented only for normal Gemma MQA attention layers, and the
backward pass is not yet parallelized.
Since we need to remember all activations from all layers, the
forward pass was also reimplemented with a new activation data
structure.
2024-06-04 08:37:49 +00:00
Paul Chang
bacba351d4
Support additional scaling
...
PiperOrigin-RevId: 631429113
2024-05-07 08:16:25 -07:00
Jan Wassenberg
12fb2f05cf
Add per-thread even_odd storage for #166.
...
Also inline ProjQ and ProjKV lambdas,
add missing includes/deps for ops_test.
PiperOrigin-RevId: 629460608
2024-04-30 10:42:23 -07:00
Paul Chang
2d4de6b08b
Support absolute positional embeddings from vanilla transformer
...
PiperOrigin-RevId: 628100831
2024-04-25 09:32:14 -07:00
RangerUFO
2099b37732
Change `NumGemmaLayers` and `NumGriffinLayers` to constants in configs
2024-04-09 20:44:41 +08:00
Jan Wassenberg
a982ec1287
Move code to gemma/ so we can remove error-prone copybara: comments.
...
Also fix includes and Lint warnings.
PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00