From 262fef812a632121c6e37c8eb1729c1f790e20dd Mon Sep 17 00:00:00 2001
From: KokerZhou <111279477+KokerZhou@users.noreply.github.com>
Date: Fri, 20 Mar 2026 22:11:40 +0800
Subject: [PATCH] cann: update CANN.md
---
docs/backend/CANN.md | 131 +++++++++++++++++++++++++------------------
1 file changed, 77 insertions(+), 54 deletions(-)
diff --git a/docs/backend/CANN.md b/docs/backend/CANN.md
index 51adaaf95f..20a7e6fe07 100755
--- a/docs/backend/CANN.md
+++ b/docs/backend/CANN.md
@@ -42,12 +42,22 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
### Ascend NPU
-**Verified devices**
+You can retrieve your Ascend device IDs using the following command:
-| Ascend NPU | Status |
-|:-----------------------------:|:-------:|
-| Atlas 300T A2 | Support |
-| Atlas 300I Duo | Support |
+```sh
+lspci -n | grep -Eo '19e5:d[0-9a-f]{3}' | cut -d: -f2
+```
+
+**Devices**
+
+| Device ID | Product Series | Product Models | Chip Model | Verified Status |
+|:---------:|----------------|----------------|:----------:|:---------------:|
+| d803 | Atlas A3 Train | | 910C | |
+| d803 | Atlas A3 Infer | | 910C | |
+| d802 | Atlas A2 Train | | 910B | |
+| d802 | Atlas A2 Infer | Atlas 300I A2 | 910B | Supported |
+| d801 | Atlas Train | | 910 | |
+| d500 | Atlas Infer | Atlas 300I Duo | 310P | Supported |
*Notes:*
@@ -57,6 +67,9 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
## Model Supports
+
+Text-only
+
| Model Name | FP16 | Q4_0 | Q8_0 |
|:----------------------------|:-----:|:----:|:----:|
| Llama-2 | √ | √ | √ |
@@ -118,8 +131,11 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
| Trillion-7B-preview | √ | √ | √ |
| Ling models | √ | √ | √ |
+
+
+
+Multimodal
-**Multimodal**
| Model Name | FP16 | Q4_0 | Q8_0 |
|:----------------------------|:-----:|:----:|:----:|
| LLaVA 1.5 models, LLaVA 1.6 models | x | x | x |
@@ -134,15 +150,21 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
| GLM-EDGE | √ | √ | √ |
| Qwen2-VL | √ | √ | √ |
+
+
## DataType Supports
-| DataType | Status |
-|:----------------------:|:-------:|
-| FP16 | Support |
-| Q8_0 | Support |
-| Q4_0 | Support |
+| DataType | 910B | 310P |
+|:----------------------:|:---------:|:---------:|
+| FP16 | Supported | Supported |
+| Q8_0 | Supported | Partial |
+| Q4_0 | Supported | Partial |
+
+> **310P note**
+> - `Q8_0`: data transform / buffer path is implemented, and `GET_ROWS` is supported, but quantized `MUL_MAT` / `MUL_MAT_ID` are not supported.
+> - `Q4_0`: data transform / buffer path is implemented, but quantized `MUL_MAT` / `MUL_MAT_ID` are not supported.
## Docker
@@ -160,7 +182,20 @@ npu-smi info
# Select the cards that you want to use, make sure these cards are not used by someone.
# Following using cards of device0.
-docker run --name llamacpp --device /dev/davinci0 --device /dev/davinci_manager --device /dev/devmm_svm --device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info -v /PATH_TO_YOUR_MODELS/:/app/models -it llama-cpp-cann -m /app/models/MODEL_PATH -ngl 32 -p "Building a website can be done in 10 simple steps:"
+docker run --name llamacpp \
+ --device /dev/davinci0 \
+ --device /dev/davinci_manager \
+ --device /dev/devmm_svm \
+ --device /dev/hisi_hdc \
+ -v /usr/local/dcmi:/usr/local/dcmi \
+ -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+ -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
+ -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
+ -v /PATH_TO_YOUR_MODELS/:/app/models \
+ -it llama-cpp-cann \
+ -m /app/models/MODEL_PATH \
+ -ngl 32 \
+ -p "Building a website can be done in 10 simple steps:"
```
*Notes:*
@@ -171,69 +206,57 @@ docker run --name llamacpp --device /dev/davinci0 --device /dev/davinci_manager
### I. Setup Environment
-1. **Install Ascend Driver and firmware**
+1. **Configure Ascend user and group**
```sh
- # create driver running user.
- sudo groupadd -g HwHiAiUser
+ sudo groupadd HwHiAiUser
sudo useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash
sudo usermod -aG HwHiAiUser $USER
-
- # download driver from https://www.hiascend.com/hardware/firmware-drivers/community according to your system
- # and install driver.
- sudo sh Ascend-hdk-910b-npu-driver_x.x.x_linux-{arch}.run --full --install-for-all
```
- Once installed, run `npu-smi info` to check whether driver is installed successfully.
+2. **Install dependencies**
+
+ **Ubuntu/Debian:**
```sh
- +-------------------------------------------------------------------------------------------+
- | npu-smi 24.1.rc2 Version: 24.1.rc2 |
- +----------------------+---------------+----------------------------------------------------+
- | NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page)|
- | Chip | Bus-Id | AICore(%) Memory-Usage(MB) HBM-Usage(MB) |
- +======================+===============+====================================================+
- | 2 xxx | OK | 64.4 51 15 / 15 |
- | 0 | 0000:01:00.0 | 0 1873 / 15077 0 / 32768 |
- +======================+===============+====================================================+
- | 5 xxx | OK | 64.0 52 15 / 15 |
- | 0 | 0000:81:00.0 | 0 1874 / 15077 0 / 32768 |
- +======================+===============+====================================================+
- | No running processes found in NPU 2 |
- +======================+===============+====================================================+
- | No running processes found in NPU 5 |
- +======================+===============+====================================================+
+ sudo apt-get update
+ sudo apt-get install -y gcc python3 python3-pip linux-headers-$(uname -r)
```
-2. **Install Ascend Firmware**
+ **RHEL/CentOS:**
```sh
- # download driver from https://www.hiascend.com/hardware/firmware-drivers/community according to your system
- # and install driver.
- sudo sh Ascend-hdk-910b-npu-firmware_x.x.x.x.X.run --full
+ sudo yum makecache
+ sudo yum install -y gcc python3 python3-pip kernel-headers-$(uname -r) kernel-devel-$(uname -r)
```
- If the following message appears, firmware is installed successfully.
+
+3. **Install CANN (driver + toolkit)**
+
+ > The `Ascend-cann` package includes both the driver and toolkit.
+ > Set `$ARCH` to `x86_64` or `aarch64` and `$CHIP` to `910b` or `310p` before running the commands below.
+
```sh
- Firmware package installed successfully!
+ wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/CANN/CANN%208.5.T63/Ascend-cann_8.5.0_linux-$ARCH.run
+ sudo bash ./Ascend-cann_8.5.0_linux-$ARCH.run --install
+
+ wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/CANN/CANN%208.5.T63/Ascend-cann-$CHIP-ops_8.5.0_linux-$ARCH.run
+ sudo bash ./Ascend-cann-$CHIP-ops_8.5.0_linux-$ARCH.run --install
```
+4. **Verify installation**
-3. **Install CANN toolkit and kernels**
-
- CANN toolkit and kernels can be obtained from the official [CANN Toolkit](https://www.hiascend.com/zh/developer/download/community/result?module=cann) page.
-
- Please download the corresponding version that satified your system. The minimum version required is 8.0.RC2.alpha002 and here is the install command.
```sh
- pip3 install attrs numpy decorator sympy cffi pyyaml pathlib2 psutil protobuf scipy requests absl-py wheel typing_extensions
- sh Ascend-cann-toolkit_8.0.RC2.alpha002_linux-aarch64.run --install
- sh Ascend-cann-kernels-910b_8.0.RC2.alpha002_linux.run --install
+ npu-smi info
```
- Set Ascend Variables:
+ If device information is displayed correctly, the driver is functioning properly.
+
```sh
- echo "source ~/Ascend/ascend-toolkit/set_env.sh" >> ~/.bashrc
- source ~/.bashrc
+ # Set environment variables (adjust path if needed)
+ source /usr/local/Ascend/cann/set_env.sh
+
+ python3 -c "import acl; print(acl.get_soc_name())"
```
-Upon a successful installation, CANN is enabled for the available ascend devices.
+ If the command outputs the chip model, the installation was successful.
### II. Build llama.cpp