Daniel Keysers
437e0eb9af
Internal change. Slight restructuring of gemma_test.
...
PiperOrigin-RevId: 670529565
2024-09-03 06:16:09 -07:00
Daniel Keysers
a8e08778d4
Add an additional QueryModel() overload to GemmaEnv.
...
Use args only in GemmaEnv constructor, store everything else in RuntimeConfig.
Add runtime option to turn off thread spinning.
PiperOrigin-RevId: 670467320
2024-09-03 02:25:19 -07:00
Daniel Keysers
3c17911875
Make gemma_test slightly more lenient on MultiTurn.
...
PiperOrigin-RevId: 668097277
2024-08-27 12:40:16 -07:00
Jan Wassenberg
c4303cd89b
Fix test for 2b - update prompt
...
PiperOrigin-RevId: 667878053
2024-08-27 00:56:47 -07:00
Daniel Keysers
18e6012872
Fix prefill for batched queries.
...
This lets gemma_test/GeographyBatched now pass for gemma2-27B as well.
PiperOrigin-RevId: 664827485
2024-08-19 08:50:42 -07:00
Jan Wassenberg
22995c699d
Simplify pos handling, auto-increment output arg
...
- no longer multiply by num_queries
- remove unused interleaved prompts
- Rename to Queries*
- Rename batch_start/interleaved_pos/pos to queries_pos
PiperOrigin-RevId: 663331823
2024-08-15 09:25:26 -07:00
RangerUFO
730b6bfc94
Implement `start_pos` per query for batch interface
2024-08-12 18:50:23 +02:00
Daniel Keysers
7316ee8f96
Fix gemma_test GeographyBatched for 2b-it and add entropy expectations for gemma2-2b-it.
...
PiperOrigin-RevId: 662072395
2024-08-12 07:12:46 -07:00
Apoorv Reddy
fd1b0743a7
Rename Gemma9B and Gemma27B to Gemma2_9B and Gemma2_27B.
...
This is to make it clear that these models are part of the Gemma2 family of models.
PiperOrigin-RevId: 661181682
2024-08-09 02:09:06 -07:00
The gemma.cpp Authors
27258b03e6
Improve performance logging
...
PiperOrigin-RevId: 660534330
2024-08-07 14:15:43 -07:00
Jan Wassenberg
5e433e774a
1.1x prefill speedup, revamp threading in preparation for hierarchical parallelism.
...
Limit thread counts to detected. Add max_clusters arg.
Update detection logic to check for smt0 - previously we pinned to some siblings.
PiperOrigin-RevId: 659755311
2024-08-05 18:50:09 -07:00
Paul Chang
d37c088e44
Extend LayersOutputFunc to take query index and auxiliary int
...
PiperOrigin-RevId: 657574814
2024-07-30 06:53:56 -07:00
Jan Wassenberg
aaf51898b6
Major revamp #2 of Prefill: fix token order, parallel for multi-query
...
- Allocate only the required KV caches and activation batch size
- Add flags for batch sizes
- Const-correct interface: Span of const int.
- Also clean up the KVCache arg to a span.
- Move kPrefillBatchSize into RuntimeConfig and remove related global constants.
PiperOrigin-RevId: 655893197
2024-07-25 03:28:55 -07:00
Jan Wassenberg
12016d31c3
Major Prefill/Generate cleanup, 1.3x Prefill speedup
...
This fixes TTFT, which was not including prefill.
PiperOrigin-RevId: 653690626
2024-07-18 11:16:46 -07:00
Daniel Keysers
cf76f0a401
Update gemma_test to also pass for the v1.1 models.
...
Make it an error if the model cannot be loaded.
PiperOrigin-RevId: 650232602
2024-07-08 06:45:37 -07:00
Jan Wassenberg
cbb67b4ee0
Move benchmark_helper to evals/, weights_raw to compression/.
...
PiperOrigin-RevId: 650155983
2024-07-08 01:13:23 -07:00
Daniel Keysers
cdebcc3533
Update gemma_test with the expected entropy values for the IT models of size 2B/7B/9B/27B.
...
PiperOrigin-RevId: 649662047
2024-07-05 08:58:51 -07:00
Jan Wassenberg
118e802b00
Fix gemma_test - moved to evals/.
...
PiperOrigin-RevId: 649338633
2024-07-04 02:04:05 -07:00
Jan Wassenberg
85fcd3cd80
Cleanup: add ModelInfo struct, remove gcpp::
...
PiperOrigin-RevId: 648707763
2024-07-02 07:11:15 -07:00
Jan Wassenberg
af8eb2fde3
Declutter gemma/ directory, move binaries to evals/ and util/.
...
PiperOrigin-RevId: 648400795
2024-07-01 09:51:04 -07:00