gemma.cpp

Jan Wassenberg aaf51898b6		2024-07-25 03:28:55 -07:00
Major revamp #2 of Prefill: fix token order, parallel for multi-query
- Allocate only the required KV caches and activation batch size
- Add flags for batch sizes
- Const-correct interface: Span of const int.
- Also clean up the KVCache arg to a span.
- Move kPrefillBatchSize into RuntimeConfig and remove related global constants.
PiperOrigin-RevId: 655893197
app.h	Major revamp #2 of Prefill: fix token order, parallel for multi-query	2024-07-25 03:28:55 -07:00
args.h	Lint fix - string append, remove stale TODO	2024-07-08 04:11:21 -07:00