Add readme

This commit is contained in:
gatbontonpc 2026-01-12 13:53:39 -05:00
parent f3a5b4ea72
commit b0d50a5681
2 changed files with 21 additions and 1 deletions

@ -0,0 +1,20 @@
# llama.cpp/examples/llama-eval
The purpose of this example is to run evaluation metrics against an OpenAI-compatible LLM API over HTTP (llama-server).
```bash
./llama-server -m model.gguf --port 8033
```
```bash
python examples/llama-eval/llama-eval.py --path_server http://localhost:8033 --n_prompts 100 --prompt_source arc
```
## Supported tasks (MVP)
- **GSM8K** — grade-school math (final-answer only)
- **AIME** — competition math (final-answer only)
- **MMLU** — multi-domain knowledge (multiple choice)
- **HellaSwag** — commonsense reasoning (multiple choice)
- **ARC** — grade-school science reasoning (multiple choice)
- **WinoGrande** — commonsense coreference resolution (multiple choice)
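For the multiple-choice tasks above, an eval loop of this kind boils down to formatting each question as a prompt and sending it to the OpenAI-compatible endpoint that llama-server exposes. A minimal sketch, assuming a `/v1/chat/completions` endpoint at the port used above (the helper names `build_mc_prompt` and `query_server` are illustrative, not part of llama-eval.py):

```python
import json
import urllib.request


def build_mc_prompt(question: str, choices: list[str]) -> str:
    """Format a multiple-choice question as a single prompt string."""
    letters = "ABCDEFGH"
    lines = [question]
    lines += [f"{letters[i]}. {c}" for i, c in enumerate(choices)]
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def query_server(prompt: str, base_url: str = "http://localhost:8033") -> str:
    """POST a chat completion to an OpenAI-compatible server and
    return the model's text response."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,  # greedy decoding for reproducible scoring
    }
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Scoring is then a string comparison between the returned letter and the dataset's gold answer.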

View File

@ -576,7 +576,7 @@ if __name__ == "__main__":
"--prompt_source",
type=str,
default="mmlu",
help=f"Eval types supported: all,{TASK_DICT.keys()}",
help=f"Eval types supported: all,{list(TASK_DICT.keys())}",
)
parser.add_argument(
"--n_prompts", type=int, default=None, help="Number of prompts to evaluate"