## duo
A minimal example. The following are not implemented, but each could be added separately:
* tree-based speculation
* correct sampling
* support for more than 2 instances
* just one instance speculates