Further 1.02x prefill speedup from raising the prefill batch size 64->512

Measured on SKX. Larger speedup expected for Zen4/SPR.
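
A rough sketch of why a larger prefill batch can help, assuming prefill consumes the prompt in chunks of at most kPrefillBatchSize tokens (an assumption for illustration, not taken from this change). Under that assumption, a larger batch means fewer passes over the weights for the same prompt. NumPrefillBatches is a hypothetical helper and is not part of the changed file.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not in the changed file): number of prefill batches
// needed to cover `num_tokens` prompt tokens, assuming each batch holds at
// most `batch_size` tokens. This is plain ceiling division.
inline std::size_t NumPrefillBatches(std::size_t num_tokens,
                                     std::size_t batch_size) {
  assert(batch_size != 0);  // a zero batch size would divide by zero
  return (num_tokens + batch_size - 1) / batch_size;
}
```

Under this assumption, a 2048-token prompt needs 32 batches at batch size 64 and 4 batches at batch size 512. The measured gain is small (1.02x), so the batch count is presumably only one factor among several.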

PiperOrigin-RevId: 652472928
Jan Wassenberg 2024-07-15 07:25:25 -07:00 committed by Copybara-Service
parent aaee666a1d
commit cd530374b3
1 changed file with 1 addition and 1 deletion

@@ -36,7 +36,7 @@ ByteStorageT AllocateSizeof() {
   return hwy::AllocateAligned<uint8_t>(sizeof(T));
 }
-constexpr size_t kPrefillBatchSize = 64;
+constexpr size_t kPrefillBatchSize = 512;
 constexpr size_t kDecodeBatchSize = 1;
 constexpr size_t kBatchedQueryBatchSize = 16;
 constexpr size_t kMinAdjustedPrefillBatchSize =