Jan Wassenberg
|
4033ed9e78
|
Avoid duplication of RMSNorm, support all activation/weight types
Add test for RMSNorm
Rename VectorizedRopeAndMulBy -> RopeAndMulBy
Move test_util to util/
PiperOrigin-RevId: 668332927
|
2024-08-28 01:26:55 -07:00 |
Apoorv Reddy
|
c6eb3b6f0d
|
VectorizedRopeAndMulBy.
~8x reduction in Rope (tested on a few prompts).
~3.8% prefill latency improvement.
~2.6% decode latency improvement.
PiperOrigin-RevId: 664650108
|
2024-08-18 23:17:01 -07:00 |
Jan Wassenberg
|
85cac13fb1
|
Split up ops.h into ops/ops-inl and matmul-inl
PiperOrigin-RevId: 654068303
|
2024-07-19 11:21:48 -07:00 |