From 262fef812a632121c6e37c8eb1729c1f790e20dd Mon Sep 17 00:00:00 2001
From: KokerZhou <111279477+KokerZhou@users.noreply.github.com>
Date: Fri, 20 Mar 2026 22:11:40 +0800
Subject: [PATCH] cann: update CANN.md

---
 docs/backend/CANN.md | 131 +++++++++++++++++++++++++------------------
 1 file changed, 77 insertions(+), 54 deletions(-)

diff --git a/docs/backend/CANN.md b/docs/backend/CANN.md
index 51adaaf95f..20a7e6fe07 100755
--- a/docs/backend/CANN.md
+++ b/docs/backend/CANN.md
@@ -42,12 +42,22 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 
 ### Ascend NPU
 
-**Verified devices**
+You can retrieve your Ascend device IDs using the following command:
 
-| Ascend NPU | Status |
-|:-----------------------------:|:-------:|
-| Atlas 300T A2 | Support |
-| Atlas 300I Duo | Support |
+```sh
+lspci -n | grep -Eo '19e5:d[0-9a-f]{3}' | cut -d: -f2
+```
+
+**Devices**
+
+| Device Id | Product Series | Product Models | Chip Model | Verified Status |
+|:---------:|----------------|----------------|:----------:|:---------------:|
+| d803 | Atlas A3 Train | | 910C | |
+| d803 | Atlas A3 Infer | | 910C | |
+| d802 | Atlas A2 Train | | 910B | |
+| d802 | Atlas A2 Infer | Atlas 300I A2 | 910B | Support |
+| d801 | Atlas Train | | 910 | |
+| d500 | Atlas Infer | Atlas 300I Duo | 310P | Support |
 
 *Notes:*
 
@@ -57,6 +67,9 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 
 ## Model Supports
 
+
+Text-only
+
 | Model Name | FP16 | Q4_0 | Q8_0 |
 |:----------------------------|:-----:|:----:|:----:|
 | Llama-2 | √ | √ | √ |
@@ -118,8 +131,11 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
 | Trillion-7B-preview | √ | √ | √ |
 | Ling models | √ | √ | √ |
+
+ +
+Multimodal -**Multimodal** | Model Name | FP16 | Q4_0 | Q8_0 | |:----------------------------|:-----:|:----:|:----:| | LLaVA 1.5 models, LLaVA 1.6 models | x | x | x | @@ -134,15 +150,21 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi | GLM-EDGE | √ | √ | √ | | Qwen2-VL | √ | √ | √ | +
+
 ## DataType Supports
 
-| DataType | Status |
-|:----------------------:|:-------:|
-| FP16 | Support |
-| Q8_0 | Support |
-| Q4_0 | Support |
+| DataType | 910B | 310P |
+|:----------------------:|:-------:|:-------:|
+| FP16 | Support | Support |
+| Q8_0 | Support | Partial |
+| Q4_0 | Support | Partial |
+
+> **310P note**
+> - `Q8_0`: data transform / buffer path is implemented, and `GET_ROWS` is supported, but quantized `MUL_MAT` / `MUL_MAT_ID` are not supported.
+> - `Q4_0`: data transform / buffer path is implemented, but quantized `MUL_MAT` / `MUL_MAT_ID` are not supported.
 
 ## Docker
 
@@ -160,7 +182,20 @@ npu-smi info
 
 # Select the cards that you want to use, make sure these cards are not used by someone.
 # Following using cards of device0.
-docker run --name llamacpp --device /dev/davinci0 --device /dev/davinci_manager --device /dev/devmm_svm --device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info -v /PATH_TO_YOUR_MODELS/:/app/models -it llama-cpp-cann -m /app/models/MODEL_PATH -ngl 32 -p "Building a website can be done in 10 simple steps:"
+docker run --name llamacpp \
+ --device /dev/davinci0 \
+ --device /dev/davinci_manager \
+ --device /dev/devmm_svm \
+ --device /dev/hisi_hdc \
+ -v /usr/local/dcmi:/usr/local/dcmi \
+ -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+ -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
+ -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
+ -v /PATH_TO_YOUR_MODELS/:/app/models \
+ -it llama-cpp-cann \
+ -m /app/models/MODEL_PATH \
+ -ngl 32 \
+ -p "Building a website can be done in 10 simple steps:"
 ```
 
 *Notes:*
@@ -171,69 +206,57 @@ docker run --name llamacpp --device /dev/davinci0 --device /dev/davinci_manager
 
 ### I. Setup Environment
 
-1. **Install Ascend Driver and firmware**
+1. **Configure Ascend user and group**
  ```sh
- # create driver running user.
- sudo groupadd -g HwHiAiUser
+ sudo groupadd HwHiAiUser
  sudo useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash
  sudo usermod -aG HwHiAiUser $USER
-
- # download driver from https://www.hiascend.com/hardware/firmware-drivers/community according to your system
- # and install driver.
- sudo sh Ascend-hdk-910b-npu-driver_x.x.x_linux-{arch}.run --full --install-for-all
  ```
- Once installed, run `npu-smi info` to check whether driver is installed successfully.
+2. **Install dependencies**
+
+ **Ubuntu/Debian:**
  ```sh
- +-------------------------------------------------------------------------------------------+
- | npu-smi 24.1.rc2 Version: 24.1.rc2 |
- +----------------------+---------------+----------------------------------------------------+
- | NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page)|
- | Chip | Bus-Id | AICore(%) Memory-Usage(MB) HBM-Usage(MB) |
- +======================+===============+====================================================+
- | 2 xxx | OK | 64.4 51 15 / 15 |
- | 0 | 0000:01:00.0 | 0 1873 / 15077 0 / 32768 |
- +======================+===============+====================================================+
- | 5 xxx | OK | 64.0 52 15 / 15 |
- | 0 | 0000:81:00.0 | 0 1874 / 15077 0 / 32768 |
- +======================+===============+====================================================+
- | No running processes found in NPU 2 |
- +======================+===============+====================================================+
- | No running processes found in NPU 5 |
- +======================+===============+====================================================+
+ sudo apt-get update
+ sudo apt-get install -y gcc python3 python3-pip linux-headers-$(uname -r)
  ```
-2. **Install Ascend Firmware**
+ **RHEL/CentOS:**
  ```sh
- # download driver from https://www.hiascend.com/hardware/firmware-drivers/community according to your system
- # and install driver.
- sudo sh Ascend-hdk-910b-npu-firmware_x.x.x.x.X.run --full
+ sudo yum makecache
+ sudo yum install -y gcc python3 python3-pip kernel-headers-$(uname -r) kernel-devel-$(uname -r)
  ```
- If the following message appears, firmware is installed successfully.
+
+3. **Install CANN (driver + toolkit)**
+
+ > The `Ascend-cann` package includes both the driver and toolkit.
+ > `$ARCH` can be `x86_64` or `aarch64`, `$CHIP` can be `910b` or `310p`.
+
  ```sh
- Firmware package installed successfully!
+ wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/CANN/CANN%208.5.T63/Ascend-cann_8.5.0_linux-$ARCH.run
+ sudo bash ./Ascend-cann_8.5.0_linux-$ARCH.run --install
+
+ wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/CANN/CANN%208.5.T63/Ascend-cann-$CHIP-ops_8.5.0_linux-$ARCH.run
+ sudo bash ./Ascend-cann-$CHIP-ops_8.5.0_linux-$ARCH.run --install
  ```
+4. **Verify installation**
 
-3. **Install CANN toolkit and kernels**
-
- CANN toolkit and kernels can be obtained from the official [CANN Toolkit](https://www.hiascend.com/zh/developer/download/community/result?module=cann) page.
-
- Please download the corresponding version that satified your system. The minimum version required is 8.0.RC2.alpha002 and here is the install command.
  ```sh
- pip3 install attrs numpy decorator sympy cffi pyyaml pathlib2 psutil protobuf scipy requests absl-py wheel typing_extensions
- sh Ascend-cann-toolkit_8.0.RC2.alpha002_linux-aarch64.run --install
- sh Ascend-cann-kernels-910b_8.0.RC2.alpha002_linux.run --install
+ npu-smi info
  ```
- Set Ascend Variables:
+ If device information is displayed correctly, the driver is functioning properly.
+
  ```sh
- echo "source ~/Ascend/ascend-toolkit/set_env.sh" >> ~/.bashrc
- source ~/.bashrc
+ # Set environment variables (adjust path if needed)
+ source /usr/local/Ascend/cann/set_env.sh
+
+ python3 -c "import acl; print(acl.get_soc_name())"
  ```
-Upon a successful installation, CANN is enabled for the available ascend devices.
+ If the command outputs the chip model, the installation was successful.
 
 ### II. Build llama.cpp