Jan Wassenberg
09a7e75ead
Prep for sharding gemma.cc: split into kv_cache, tokenizer.
...
Move activations.h to backprop/ to make space for another activations.h.
PiperOrigin-RevId: 648744500
2024-07-02 09:31:06 -07:00
Jan Wassenberg
e588a7f45d
Add config for att/final cap, skip max-subtract. Fixes #278
...
Also update includes/deps for backprop/.
PiperOrigin-RevId: 648399222
2024-07-01 09:45:26 -07:00
Jan Wassenberg
7d0720675f
Move raw_weights into separate header, used mainly by compress_weights.
...
Fix warnings in backprop/* (include)
PiperOrigin-RevId: 643983136
2024-06-17 06:17:02 -07:00
Zoltan Szabadka
a3a75b77f9
Use CompressedWeights<TConfig<float>> in backpropagation.
...
kWeightsAreCompressed are removed and LoadRawWeights is moved
to compress_weights.cc
2024-06-10 14:34:24 +00:00
Copybara-Service
f7ac7092d6
Merge pull request #212 from szabadka:adam2
...
PiperOrigin-RevId: 641182573
2024-06-07 02:25:18 -07:00
Jan Wassenberg
57c2cd8b52
Simplifications: remove GemmaInterface and GemmaImpl
...
Split common and weights into separate lib
Remove common-inl (does not have to be SIMD code), activations.cc
Centralize switch(Model) to avoid duplication
Move CompressWeightsT to compress_weights.cc
Move LoadWeights to weights.cc
PiperOrigin-RevId: 640869202
2024-06-06 05:54:21 -07:00
Zoltan Szabadka
df01700b54
Move the backpropagation code to its own directory
2024-06-04 10:20:16 +00:00