Commit Graph

5 Commits

Author SHA1 Message Date
Jan Wassenberg 85fcd3cd80 Cleanup: add ModelInfo struct, remove gcpp::
PiperOrigin-RevId: 648707763
2024-07-02 07:11:15 -07:00
The gemma.cpp Authors da7507e6f0 Add prompt batching to Gemma.cpp.
This CL adds a new function to Gemma that allows for batching of multiple prompts. The function takes a vector of prompts and returns a vector of responses. The prompts are processed in parallel, and the responses are returned in the same order as the prompts.

PiperOrigin-RevId: 648367559
2024-07-01 07:51:31 -07:00
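
The batching behavior described in commit da7507e6f0 (a vector of prompts in, a vector of responses out, processed in parallel, order preserved) can be pictured with the sketch below. It is an illustration under stated assumptions, not the gemma.cpp API: `GenerateFn` and `GenerateBatch` are hypothetical names, and the per-prompt callback stands in for whatever the library actually does to produce one response. The commit message does not say how the parallelism is implemented, so `std::async` is used here only as one simple way to get it.

```cpp
#include <functional>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for a single-prompt generation call; not part of
// gemma.cpp.
using GenerateFn = std::function<std::string(const std::string&)>;

// Hypothetical helper: runs `generate` on every prompt concurrently and
// returns the responses in the same order as the prompts.
std::vector<std::string> GenerateBatch(const std::vector<std::string>& prompts,
                                       const GenerateFn& generate) {
  // Launch one asynchronous task per prompt. The futures are stored in
  // prompt order, which is what preserves the output order later.
  std::vector<std::future<std::string>> pending;
  pending.reserve(prompts.size());
  for (const std::string& prompt : prompts) {
    pending.push_back(std::async(
        std::launch::async,
        [&generate, &prompt] { return generate(prompt); }));
  }

  // Collect results in the order the tasks were launched, regardless of
  // which task finishes first.
  std::vector<std::string> responses;
  responses.reserve(prompts.size());
  for (std::future<std::string>& result : pending) {
    responses.push_back(result.get());
  }
  return responses;
}
```

Calling `GenerateBatch(prompts, fn)` with any callable that maps one prompt to one response yields `responses[i]` for `prompts[i]`. One task per prompt is the simplest design to read; a real implementation would more likely share a thread pool or batch tokens inside the model.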
Jan Wassenberg 2ac47e4a06 Fix Py binding/run_example: use GemmaEnv
PiperOrigin-RevId: 644318962
2024-06-18 03:20:22 -07:00
Jan Wassenberg d3c6a45b59 Major duplicated code reduction in test/benchmarks
Helper functions to tokenize/wrap
Move LayersOutputFunc into RuntimeConfig
AcceptFunc passes the probability
Implement StringFromType using the parser, and verify results match

PiperOrigin-RevId: 643255119
2024-06-14 00:16:25 -07:00
Ray Smith bdf33c7008 Updated benchmarks.cc to recent changes to Gemma API.
PiperOrigin-RevId: 642285902
2024-06-11 08:55:40 -07:00