# examples.agent: Interactive agent that can use Python tools!

Have any LLM use local (sandboxed) tools, with a simple CLI.

```bash
python -m examples.agent \
  --model ~/AI/Models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf \
  --tools examples/agent/tools/example_math_tools.py \
  --goal "What is the sum of 2535 squared and 32222000403 then multiplied by one and a half. What's a third of the result?" \
  --greedy
```

<details>
<summary>Show output</summary>

```bash
💭 First, I will calculate the square of 2535, then add it to 32222000403. After that, I will multiply the result by 1.5 and finally, I will divide the result by 3.
⚙️ pow(value=2535, power=2) -> 6426225.0
💭 Now that I have calculated the square of 2535, I will calculate the sum of 6426225 and 32222000403.
⚙️ add(a=6426225, b=32222000403) -> 32228426628
💭 Now that I have calculated the sum, I will multiply it by 1.5.
⚙️ multiply(a=32228426628, b=1.5) -> 48342639942.0
💭 Now that I have calculated the product, I will divide it by 3.
⚙️ divide(a=48342639942.0, b=3) -> 16114213314.0
➡️ "\nThe result of the calculation is 16114213314.0."
```

</details>

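The tool calls in the output above (`pow`, `add`, `multiply`, `divide`) map to plain Python functions. Below is a minimal sketch of what such a tools file could look like. The names and signatures are inferred from the trace and are assumptions, so the actual `example_math_tools.py` may differ.

```python
# Hypothetical tools module: names and signatures are inferred from the trace
# above, not copied from examples/agent/tools/example_math_tools.py.


def add(a: float, b: float) -> float:
    """Return a + b."""
    return a + b


def multiply(a: float, b: float) -> float:
    """Return a * b."""
    return a * b


def divide(a: float, b: float) -> float:
    """Return a / b (true division)."""
    return a / b


def pow(value: float, power: float) -> float:  # intentionally shadows the builtin
    """Return value raised to power, as a float."""
    return float(value) ** power


if __name__ == "__main__":
    # Replays the four steps of the trace above.
    total = add(int(pow(2535, 2)), 32222000403)
    print(divide(multiply(total, 1.5), 3))  # prints 16114213314.0
```

Type hints and docstrings are kept on every function because they are the natural place for a model-facing description of each tool to come from.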
```bash
python -m examples.agent \
  --tools examples/agent/tools/example_weather_tools.py \
  --goal "What is the weather going to be like in San Francisco and Glasgow over the next 4 days." \
  --greedy
```

<details>
<summary>Show output</summary>

```bash
💭 I will first get the current weather in San Francisco, then get the 4-day weather forecast for both San Francisco and Glasgow.
⚙️ get_current_weather(location=San Francisco, format=fahrenheit) -> ...
💭 I will first get the current weather in San Francisco, then get the 4-day weather forecast for both San Francisco and Glasgow.
⚙️ get_n_day_weather_forecast(location=San Francisco, format=fahrenheit, num_days=4) -> ...
💭 I will first get the current weather in San Francisco, then get the 4-day weather forecast for both San Francisco and Glasgow.
⚙️ get_n_day_weather_forecast(location=Glasgow, format=celsius, num_days=4) -> ...
The current weather in San Francisco is sunny and 87.8F. Here is the 4-day weather forecast:

For San Francisco:
- In 1 day: Cloudy, 60.8F
- In 2 days: Sunny, 73.4F
- In 3 days: Cloudy, 62.6F

For Glasgow:
- In 1 day: Cloudy, 16C
- In 2 days: Sunny, 23C
- In 3 days: Cloudy, 17C
```

</details>

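Each ⚙️ line in the outputs above corresponds to one tool call: a function name plus keyword arguments chosen by the model. The sketch below shows one way such a call could be dispatched to a plain Python function and rendered as a trace line. `dispatch_tool_call` and the shape of `call` are assumptions made for illustration, not the code used by `agent.py`.

```python
import json
from typing import Any, Callable, Dict


def dispatch_tool_call(tools: Dict[str, Callable[..., Any]], call: Dict[str, Any]) -> str:
    """Run one tool call and return a trace line like the ones shown above.

    `call` is assumed to look like {"name": "add", "arguments": {...}},
    a simplified form of an OpenAI-style tool call.
    """
    name = call["name"]
    arguments = call["arguments"]
    if isinstance(arguments, str):
        # OpenAI-style APIs transmit the arguments as a JSON-encoded string.
        arguments = json.loads(arguments)
    if name not in tools:
        raise KeyError(f"Unknown tool: {name}")
    result = tools[name](**arguments)
    args_repr = ", ".join(f"{key}={value}" for key, value in arguments.items())
    return f"⚙️ {name}({args_repr}) -> {result}"


def add(a: float, b: float) -> float:
    """Return a + b."""
    return a + b


if __name__ == "__main__":
    call = {"name": "add", "arguments": '{"a": 6426225, "b": 32222000403}'}
    print(dispatch_tool_call({"add": add}, call))
```

Looking tools up by name in a dictionary keeps the set of callable functions explicit, so the model cannot invoke anything that was not registered.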
```bash
python -m examples.agent \
  --model ~/AI/Models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf \
  --std_tools \
  --goal "Wait 10sec then say Hi out loud" \
  --greedy
```

<details>
<summary>Show output</summary>

```bash
```

</details>

## Prerequisites

Note: To get conda, just install Miniforge (it's OSS): https://github.com/conda-forge/miniforge

```bash
conda create -n agent python=3.11
conda activate agent
pip install -r examples/agent/requirements.txt
pip install -r examples/openai/requirements.txt
```

## Components

This example relies on the new [OpenAI compatibility server](../openai).

```
agent.py → examples.openai → server.cpp
         → safe_tools.py
         → ( run_sandboxed_tools.sh : Docker → fastify.py ) → unsafe_tools.py → code interpreter, etc...
```

The agent can use tools written in Python, or (soon) exposed under OpenAPI endpoints. It only has standard Python dependencies (e.g. no langchain).

- Can call into any OpenAI endpoint that supports tool calling, and spawns a local one if `--endpoint` isn't specified
  (can pass all llama.cpp params)

- [Standard tools](./tools/std.py) include "safe" TTS, wait for/until helpers, and *requesting user input*.

- Tools are often "unsafe" (e.g. [Python execution functions](./tools/unsafe_python_tools.py)),
  so we provide a script to run them in a Docker-sandboxed environment, exposed as an OpenAPI server:

  ```bash
  examples/agent/run_sandboxed_tools.sh \
    examples/agent/tools/unsafe_python_tools.py 6666 &

  python -m examples.agent \
    --model ~/AI/Models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf \
    --tools http://localhost:6666 \
    --goal "What's cos(123) / 23 * 12.6?"
  ```

  - [fastify.py](./fastify.py) turns a Python module into an OpenAPI endpoint using FastAPI

  - [run_sandboxed_tools.sh](./run_sandboxed_tools.sh) builds and runs a Docker environment with fastify inside it, and exposes its port locally

- Beyond just "tools", output format can be constrained using JSON schemas or Pydantic types

  ```bash
  python -m examples.agent \
    --model ~/AI/Models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf \
    --tools examples/agent/tools/example_summaries.py \
    --format PyramidalSummary \
    --goal "Create a pyramidal summary of Mankind's recent advancements"
  ```

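Both tool calling and OpenAPI exposure need a machine-readable description of each Python function. As a rough illustration, the sketch below derives an OpenAI-style function tool description from a function's signature using only the standard library. `tool_schema`, the type mapping, and the stubbed `get_current_weather` are assumptions for illustration; this is not how `agent.py` or `fastify.py` are known to do it.

```python
import inspect
import json
from typing import Any, Callable, Dict, get_type_hints

# Assumed mapping from Python annotations to JSON Schema types (not exhaustive).
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def tool_schema(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Describe `fn` as an OpenAI-style function tool, based on its signature."""
    hints = get_type_hints(fn)
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        json_type = _JSON_TYPES.get(hints.get(name))
        # Parameters with unknown or missing annotations get an unconstrained schema.
        properties[name] = {"type": json_type} if json_type else {}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def get_current_weather(location: str, format: str = "celsius") -> str:
    """Get the current weather for a location (stub for illustration)."""
    return f"sunny in {location} ({format})"


if __name__ == "__main__":
    print(json.dumps(tool_schema(get_current_weather), indent=2))
```

Parameters without a default value are listed as `required`, which mirrors how the function would fail if called without them.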
## Launch parts separately

If you'd like to debug each binary separately (rather than have an agent spawning an OpenAI compatibility proxy spawning a C++ server), you can run these commands:

```bash
# C++ server
make -j server
./server --model mixtral.gguf --port 8081

# OpenAI compatibility layer
python -m examples.openai \
  --port 8080 \
  --endpoint http://localhost:8081 \
  --template_hf_model_id_fallback mistralai/Mixtral-8x7B-Instruct-v0.1

# Or have the OpenAI compatibility layer spawn the C++ server under the hood:
#   python -m examples.openai --model mixtral.gguf

# Agent itself (append --tools / --goal as in the examples above):
python -m examples.agent --endpoint http://localhost:8080
```

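When the parts are launched by hand, each one should be listening before the next is started. The helper below is a minimal sketch of a readiness check that polls a TCP port. `wait_for_port` is a hypothetical name, and an open port is only a rough proxy for readiness, since the model may still be loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # Ports match the commands above: 8081 for the C++ server, 8080 for the OpenAI layer.
    for name, port in [("C++ server", 8081), ("OpenAI compatibility layer", 8080)]:
        ok = wait_for_port("localhost", port, timeout=1.0)
        print(f"{name} on port {port}: {'up' if ok else 'not reachable'}")
```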
## Use existing tools (WIP)

```bash
git clone https://github.com/NousResearch/Hermes-Function-Calling examples/agent/hermes_function_calling
```

Then edit `examples/agent/hermes_function_calling/utils.py`:

```py
# Let the LOG_FOLDER environment variable override the default log location
log_folder = os.environ.get('LOG_FOLDER', os.path.join(script_dir, "inference_logs"))
```

Then run tools in a sandbox:

```bash
REQUIREMENTS_FILE=<( grep -vE "bitsandbytes|flash-attn" examples/agent/hermes_function_calling/requirements.txt ) \
  examples/agent/run_sandboxed_tools.sh \
    examples/agent/hermes_function_calling/functions.py \
    -e LOG_FOLDER=/data/inference_logs
```

## TODO

- Wait for spawned servers to be healthy

- Add model URL / HF loading support

- Add Embedding endpoint + storage / retrieval tools (Faiss? ScaNN?), or spontaneous RAG

- Auto discover tools exposed by an OpenAPI endpoint

- Add a Python notebook tool example

- Update `run_sandboxed_tools.sh` to support dev mode (`uvicorn fastify:app --reload`)

- Follow-ups (depending on the vibe)

  - Remove OAI support from server

  - Remove non-Python json schema to grammar converters