Sam Kaufman
6a78a23f4c
Abstracted some duplicated MatVecAdd specializations.
2024-04-29 16:23:38 -07:00
Sam Kaufman
f608337fef
Remove Bf16ToF32EO and use PromoteEvenTo and PromoteOddTo.
2024-04-29 14:13:07 -07:00
Sam Kaufman
aa0b113214
(VecT*) to static_cast<VecT*>.
2024-04-29 12:53:47 -07:00
Sam Kaufman
5cb63346aa
supports_eo -> kSupportsEvenOdd
2024-04-29 12:51:35 -07:00
Zoltan Szabadka
27117cc39f
Simplify threading: remove the use of inner_pool.
...
We only used inner_pool in the prefill FFW function, and there we
can achieve sufficient parallelism on the rows of the matrix-vector
multiplications.
Benchmark results on a 1600-token summarization task:
```
Prefill speed
Num threads   BEFORE       AFTER
4              9.24 t/s     9.76 t/s
18            31.41 t/s    31.16 t/s
32            31.41 t/s    45.13 t/s
64            31.03 t/s    57.85 t/s
```
2024-04-29 16:07:30 +00:00
Paul Chang
1d18c5a129
Improve documentation for compress_weights flags
...
PiperOrigin-RevId: 629053191
2024-04-29 06:49:50 -07:00
Sam Kaufman
0816a1070d
Even-odd layout MatVecs for bf16 weights.
2024-04-28 20:09:25 -07:00
Paul Chang
2d4de6b08b
Support absolute positional embeddings from vanilla transformer
...
PiperOrigin-RevId: 628100831
2024-04-25 09:32:14 -07:00
Paul Chang
75eca87039
Simplify prefill early-exit (originally Merge #156)
...
PiperOrigin-RevId: 627788524
2024-04-24 11:11:42 -07:00
Charles Chan
ea45d7c4d7
Use a lambda to split the function; stream_token can now break prefill, too
2024-04-23 22:55:01 +08:00
Paul Chang
e8d29792ac
New token validity assertions, improve prompt truncation warning
...
PiperOrigin-RevId: 627376194
2024-04-23 07:05:59 -07:00
Jan Wassenberg
3bf22abb22
Fix sign comparison warnings
...
PiperOrigin-RevId: 627299902
2024-04-23 01:16:51 -07:00
Jan Wassenberg
e9a0caed87
Further improve IO, enable multiple backends without -D.
...
Move Path into io.h and use for opening files.
Removes dependency of gemma_lib on args.
Separate Windows codepath instead of emulating POSIX functions.
Plus lint fixes.
PiperOrigin-RevId: 626279004
2024-04-19 00:40:29 -07:00
Paul Chang
38f1ea9b80
Eliminate redundant copies of TokenString()
...
Move this function outside of HWY_NAMESPACE since it doesn't need to be
optimized for any particular architecture.
PiperOrigin-RevId: 626098641
2024-04-18 11:31:50 -07:00
Jan Wassenberg
a8ceb75f43
Improved IO abstraction layer
...
Move to unique_ptr-like File class.
Move `if OS_WIN` into wrapper functions.
exists -> Exists.
PiperOrigin-RevId: 625923056
2024-04-17 23:15:07 -07:00
Andrey Mikhaylov
4ef3da733a
Fixed minor things and added comments.
2024-04-12 15:39:16 +00:00
Andrey Mikhaylov
2c5706f159
Add comments regarding layers output usage.
2024-04-12 15:39:16 +00:00
Andrey Mikhaylov
03284d752e
Added layer-output functionality to gemma and a debug_output binary that saves the outputs to a JSON file.
2024-04-12 15:39:16 +00:00
RangerUFO
e541707caa
Rename the fields of Griffin weights
2024-04-10 21:04:31 +08:00
RangerUFO
4e960d67f6
Fix typos
2024-04-10 20:38:18 +08:00
RangerUFO
809bd0709d
Refactor data structures to reduce memory usage
2024-04-10 19:35:23 +08:00
Jan Wassenberg
881eeffe0a
Lint fixes: strcat, includes, arg naming
...
PiperOrigin-RevId: 623435210
2024-04-10 03:12:41 -07:00
RangerUFO
2099b37732
Change `NumGemmaLayers` and `NumGriffinLayers` to constants in configs
2024-04-09 20:44:41 +08:00
Jan Wassenberg
a982ec1287
Move code to gemma/ so we can remove error-prone copybara: comments.
...
Also fix includes and Lint warnings.
PiperOrigin-RevId: 623127487
2024-04-09 04:45:42 -07:00