Update build doc

This commit is contained in:
Yu, Zijun 2025-05-20 10:38:15 +08:00 committed by Mustafa Cavus
parent d7cc802292
commit fd324366d0
1 changed file with 21 additions and 26 deletions


@@ -683,33 +683,30 @@ To read documentation for how to build on IBM Z & LinuxONE, [click here](./build
 ## OPENVINO
-### Build openvino-llama
+### Build openvino
 ```bash
-git lfs install --skip-smudge
-git clone https://github.com/intel-sandbox/openvino-llama.git -b dev_ggml_frontend
-cd openvino-llama
+git clone https://github.com/openvinotoolkit/openvino.git
+cd openvino
 git submodule update --init --recursive
-export OPENVINO_LLAMA_PATH=$(pwd)
-```
-Before building, change "ENABLE_OV_GGML_FRONTEND" from true to false in the CMakePresets.json file since we already have the code from the ov side in this branch of llama.cpp (`full_backend`). You could also build the master branch of ov instead.
-```
-cmake --preset Release
-cmake --build build/Release
+export OPENVINO_DIR=$(pwd)
+sudo ./install_build_dependencies.sh
+mkdir -p build/Release && cd build/Release
+cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_DEBUG_CAPS=ON ../..
 ```
 ### Build llama.cpp-ov
 ```bash
-git clone https://github.com/intel-sandbox/llama.cpp-ov.git -b full_backend
+git clone https://github.com/intel-sandbox/llama.cpp-ov.git
 cd llama.cpp-ov
+git switch dev_backend_openvino
 cmake --preset ReleaseOV
 cmake --build build/ReleaseOV
 ```
 Download the test model file [Phi-3-mini-4k-instruct-fp16.gguf](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf) from hugging face website.
 ``` bash
@@ -717,12 +714,10 @@ Download the test model file [Phi-3-mini-4k-instruct-fp16.gguf](https://huggingf
 ```
 Execute the following command to test.
 ```bash
 export GGML_OPENVINO_CACHE_DIR=/tmp/ov_cache
-# Currently GGML_OPENVINO_WEIGHT_AS_INPUT has better performance
-export GGML_OPENVINO_WEIGHT_AS_INPUT=1
 ./build/ReleaseOV/bin/llama-simple -m ~/models/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-fp16.gguf -n 10 "Hello, my name is "
 ```
 Environment variables:
 - GGML_OPENVINO_WEIGHT_AS_INPUT:
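The commit replaces `git clone ... -b full_backend` with a plain clone followed by `git switch dev_backend_openvino`. The two patterns check out a branch the same way; here is a quick sketch demonstrating that on a throwaway local repository (the repository layout, commit, and directory names below are fabricated for illustration only):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Build a tiny local "origin" with a non-default branch.
git init -q -b main origin-repo
cd origin-repo
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"
git branch dev_backend_openvino
cd "$tmp"

# Approach 1: clone directly onto the branch with -b.
git clone -q -b dev_backend_openvino origin-repo via-clone-b

# Approach 2: clone, then switch; git switch creates a local branch
# tracking origin/dev_backend_openvino automatically.
git clone -q origin-repo via-switch
git -C via-switch switch -q dev_backend_openvino

echo "via-clone-b: $(git -C via-clone-b branch --show-current)"
echo "via-switch: $(git -C via-switch branch --show-current)"
```

Both clones end up on `dev_backend_openvino`; the clone-then-switch form in the updated doc simply makes the branch step explicit.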
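The test step exports `GGML_OPENVINO_CACHE_DIR` before running `llama-simple`. A minimal sketch of preparing that directory up front (creating it in advance is our own precaution and is not stated in the doc; the backend may create it itself):

```shell
# Point the OpenVINO backend at a cache directory; /tmp/ov_cache matches
# the path used in the doc's test command.
export GGML_OPENVINO_CACHE_DIR=/tmp/ov_cache

# Ensure the directory exists before the first run (assumption, see above).
mkdir -p "$GGML_OPENVINO_CACHE_DIR"
echo "cache dir ready: $GGML_OPENVINO_CACHE_DIR"
```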