Jan Wassenberg
c5c9fc300c
Enable even/odd for SFP. Refs #166
...
Disable it for float32 because there is not enough benefit.
PiperOrigin-RevId: 631788326
2024-05-08 07:09:06 -07:00
Jan Wassenberg
f6d02b2870
Fix RecurrentGemma (refs #166) - one Dot was ignoring scale.
...
Remove extra Dot() overload
MatVecAdd always adds; use MatVecT<kAdd> if the add is conditional.
Remove unused MatVecAddLoop and MatVecLoop
No longer tsan-verify even_odd
PiperOrigin-RevId: 631377279
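The split between an always-adding MatVecAdd and a MatVecT<kAdd> template could look roughly like the scalar sketch below. The names ending in `Sketch`, the `std::vector` parameters, and the row-major layout are assumptions made for illustration; they are not the signatures used in the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar stand-in for a MatVecT<kAdd>-style template.
// mat is row-major with `rows` x `cols` elements; `add` is only read
// when kAdd is true.
template <bool kAdd>
void MatVecTSketch(const std::vector<float>& mat,
                   const std::vector<float>& vec,
                   const std::vector<float>& add, size_t rows, size_t cols,
                   std::vector<float>& out) {
  assert(mat.size() == rows * cols && vec.size() == cols);
  out.resize(rows);
  for (size_t r = 0; r < rows; ++r) {
    float sum = 0.0f;
    for (size_t c = 0; c < cols; ++c) sum += mat[r * cols + c] * vec[c];
    // kAdd is a compile-time constant, so the compiler can drop this branch.
    if (kAdd) sum += add[r];
    out[r] = sum;
  }
}

// Always adds the bias vector.
void MatVecAddSketch(const std::vector<float>& mat,
                     const std::vector<float>& vec,
                     const std::vector<float>& add, size_t rows, size_t cols,
                     std::vector<float>& out) {
  MatVecTSketch<true>(mat, vec, add, rows, cols, out);
}

// Never adds; passes an empty bias that is not read.
void MatVecNoAddSketch(const std::vector<float>& mat,
                       const std::vector<float>& vec, size_t rows,
                       size_t cols, std::vector<float>& out) {
  MatVecTSketch<false>(mat, vec, {}, rows, cols, out);
}
```

A caller whose bias is optional would pick the template argument itself, for example `MatVecTSketch<kHasBias>(...)` with a hypothetical compile-time flag `kHasBias`, rather than going through the always-adding wrapper.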
2024-05-07 04:40:42 -07:00
Phil Culliton
28ca001d5e
Matmul and test functions
...
PiperOrigin-RevId: 630373984
2024-05-03 06:39:36 -07:00
Copybara-Service
6eeef2e2d9
Merge pull request #166 from samkaufman:deinterleave-vecs
...
PiperOrigin-RevId: 630360778
2024-05-03 05:23:31 -07:00
Zoltan Szabadka
9a2682d544
Use more parallelism in the QKV projections of the MHA block.
...
We compute all three projections with one MatVec and then copy
the kv part to the cache.
Benchmark results for 7b-it model that uses MHA blocks (summarization with
1600 tokens for prefill and essay writing with 500 tokens for generation):
```
             Prefill speed          Generation speed
Num threads  BEFORE      AFTER      BEFORE      AFTER
32           13.75 t/s   14.80 t/s   9.22 t/s    9.77 t/s
64           19.89 t/s   24.83 t/s  12.46 t/s   13.66 t/s
```
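The idea of computing all three projections with one MatVec and then copying the kv part to the cache can be sketched in scalar C++ as follows. This is a minimal illustration under stated assumptions: the stacked weight layout [Wq; Wk; Wv], the cache layout (k followed by v for each position), and all function names are hypothetical and not taken from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: out[r] = dot(row r of w, x), w row-major rows x cols.
void MatVecRowsSketch(const std::vector<float>& w,
                      const std::vector<float>& x, size_t rows, size_t cols,
                      std::vector<float>& out) {
  assert(w.size() == rows * cols && x.size() == cols);
  out.assign(rows, 0.0f);
  for (size_t r = 0; r < rows; ++r) {
    float sum = 0.0f;
    for (size_t c = 0; c < cols; ++c) sum += w[r * cols + c] * x[c];
    out[r] = sum;
  }
}

// Computes q, k and v for one token with a single MatVec over the stacked
// weights [Wq; Wk; Wv] (3*dim rows), then copies k and v into the cache
// slot for `pos`. Assumed cache layout: 2*dim floats per position.
// Returns q.
std::vector<float> FusedQKVSketch(const std::vector<float>& w_qkv,
                                  const std::vector<float>& x, size_t dim,
                                  size_t pos, std::vector<float>& kv_cache) {
  assert(kv_cache.size() >= (pos + 1) * 2 * dim);
  std::vector<float> qkv;
  MatVecRowsSketch(w_qkv, x, 3 * dim, dim, qkv);
  // The kv part is the last 2*dim outputs of the fused MatVec.
  std::memcpy(kv_cache.data() + pos * 2 * dim, qkv.data() + dim,
              2 * dim * sizeof(float));
  qkv.resize(dim);  // keep only q
  return qkv;
}
```

One plausible reason this helps with many threads is that a single MatVec over 3*dim rows gives a thread pool more rows to distribute than three separate, smaller calls; the sketch itself is single-threaded and does not demonstrate that.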
2024-05-02 13:46:45 +00:00
Sam Kaufman
4a6173d929
Remove unused vars.
2024-05-02 00:41:44 -07:00
Sam Kaufman
564937ede6
Merge branch 'dev' into deinterleave-vecs
2024-04-30 16:23:04 -07:00
Sam Kaufman
2829ef17ad
Check for HWY_NATIVE_DOT_BF16.
2024-04-30 15:19:28 -07:00
Sam Kaufman
59ebecce22
Fix: specialized MatVecAdd was never called.
2024-04-30 15:17:27 -07:00
Jan Wassenberg
12fb2f05cf
Add per-thread even_odd storage for #166.
...
Also inline ProjQ and ProjKV lambdas,
add missing includes/deps for ops_test.
PiperOrigin-RevId: 629460608
2024-04-30 10:42:23 -07:00
Sam Kaufman
6a78a23f4c
Abstracted some duplicated MatVecAdd specializations.
2024-04-29 16:23:38 -07:00
Sam Kaufman
f608337fef
Remove Bf16ToF32EO and use PromoteEvenTo and PromoteOddTo.
2024-04-29 14:13:07 -07:00
Sam Kaufman
aa0b113214
(VecT*) to static_cast<VecT*>.
2024-04-29 12:53:47 -07:00
Sam Kaufman
5cb63346aa
supports_eo -> kSupportsEvenOdd
2024-04-29 12:51:35 -07:00
Sam Kaufman
0816a1070d
Even-odd layout MatVecs for bf16 weights.
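A scalar sketch of what an even-odd layout for a bf16 dot product might look like: the float vector is rearranged once so that even-indexed elements come first, and each bf16 weight row is then consumed as separate even and odd streams. The single global even/odd split, the truncating conversion, and all names here are assumptions for illustration; a SIMD implementation would more likely rearrange per vector-sized block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// bf16 holds the upper 16 bits of an IEEE-754 float32.
float Bf16ToF32Sketch(uint16_t bits) {
  const uint32_t u = static_cast<uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}

// Truncating conversion (no rounding), enough for this illustration.
uint16_t F32ToBf16Sketch(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  return static_cast<uint16_t>(u >> 16);
}

// Rearranges vec into [even-indexed elements..., odd-indexed elements...].
std::vector<float> ToEvenOddSketch(const std::vector<float>& vec) {
  assert(vec.size() % 2 == 0);
  const size_t half = vec.size() / 2;
  std::vector<float> eo(vec.size());
  for (size_t i = 0; i < half; ++i) {
    eo[i] = vec[2 * i];
    eo[half + i] = vec[2 * i + 1];
  }
  return eo;
}

// Dot product of one bf16 row (natural order) with an even-odd vector.
float DotEvenOddSketch(const std::vector<uint16_t>& row,
                       const std::vector<float>& eo) {
  assert(row.size() == eo.size() && eo.size() % 2 == 0);
  const size_t half = eo.size() / 2;
  float sum_even = 0.0f, sum_odd = 0.0f;
  for (size_t i = 0; i < half; ++i) {
    sum_even += Bf16ToF32Sketch(row[2 * i]) * eo[i];            // even lanes
    sum_odd += Bf16ToF32Sketch(row[2 * i + 1]) * eo[half + i];  // odd lanes
  }
  return sum_even + sum_odd;
}
```

The weights stay in their stored order; only the vector is permuted, and that cost is paid once per MatVec rather than once per row.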
2024-04-28 20:09:25 -07:00
Jan Wassenberg
a982ec1287
Move code to gemma/ so we can remove error-prone copybara: comments.
...
Also fix includes and Lint warnings.
PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00