# llama-server-simulator

A standalone Python script that simulates a llama-server HTTP endpoint for testing.

## Features
- HTTP Server with OpenAI-compatible `/v1/chat/completions` endpoint
- AIME Dataset Integration - Loads 90 questions from HuggingFace
- Intelligent Question Matching - Uses exact matching, LaTeX removal, and Levenshtein distance
- Configurable Success Rate - Control correct/wrong answer generation (0.0-1.0)
- Debug Logging - Troubleshoot matching issues
## Usage
```bash
python llama-server-simulator.py --success-rate 0.8
```
## Arguments
- `--success-rate`: Probability of returning correct answer (0.0-1.0, default: 0.8)
- `--port`: Server port (default: 8033)
- `--debug`: Enable debug logging (default: False)
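
With the simulator running, any OpenAI-compatible client can query it. A hypothetical request, assuming the default port; the `model` field is illustrative (the simulator matches on the question text in the user message), and the placeholder content must be replaced with a real AIME question:

```python
import json
from urllib import request

# Payload follows the standard OpenAI chat-completions schema.
payload = {
    "model": "simulator",  # illustrative; assumed to be ignored by the simulator
    "messages": [
        {"role": "user", "content": "AIME question text goes here"},
    ],
}
body = json.dumps(payload).encode("utf-8")

req = request.Request(
    "http://localhost:8033/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# response = request.urlopen(req)  # uncomment once the simulator is running
```
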
## Testing
```bash
./test-simulator.sh
```
## Implementation Details
- Uses Levenshtein distance for partial matching (threshold: 0.3)
- Automatic caching via HuggingFace datasets library
- Wrong answers generated by incrementing the expected answer
- Debug output written to stderr