PaddlePaddle
diff --git a/docs/guides/hardware_support/dcu/example_cn.md b/docs/guides/hardware_support/dcu/example_cn.md (+109)
diff --git a/docs/guides/hardware_support/dcu/index_cn.rst b/docs/guides/hardware_support/dcu/index_cn.rst (+20)
diff --git a/docs/guides/hardware_support/dcu/install_cn.md b/docs/guides/hardware_support/dcu/install_cn.md (+114)
diff --git a/docs/guides/hardware_support/rocm_docs/paddle_rocm_cn.md b/docs/guides/hardware_support/dcu/support_cn.md (renamed, +13 -14)
diff --git a/docs/guides/hardware_support/hardware_info_cn.md b/docs/guides/hardware_support/hardware_info_cn.md (+5 -5)
diff --git a/docs/guides/hardware_support/index_cn.rst b/docs/guides/hardware_support/index_cn.rst (+4 -4)
diff --git a/docs/guides/hardware_support/rocm_docs/index_cn.rst b/docs/guides/hardware_support/rocm_docs/index_cn.rst (-22)
@@ -0,0 +1,109 @@
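+# Hygon DCU Run Examples
+
+**Prerequisites**: First prepare the Hygon DCU runtime environment by following the [Hygon DCU Installation Guide](./install_cn.html). We recommend running all of the steps below inside a Docker container.
+
+## Training Example
+
+This example uses the [ResNet50_vd](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/quick_start/quick_start_classification_new_user.md) model to show how to train on Hygon DCU.
+
+### 1. Download the Suite Code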
+
+```bash
+# Download the suite source code
+git clone https://github.com/PaddlePaddle/PaddleClas.git
+cd PaddleClas/
+
+# Install the Python dependencies
+pip install -r requirements.txt
+
+# Build and install paddleclas
+python setup.py install
+```
+
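+### 2. Prepare the Training Data
+
+Go to the `PaddleClas/dataset` directory, then download and extract the `flowers102` dataset:
+
+```bash
+# Prepare the dataset: download it into the target directory and extract it
+cd PaddleClas/dataset
+wget https://paddle-imagenet-models-name.bj.bcebos.com/data/flowers102.zip
+unzip flowers102.zip
+
+# After downloading and extracting, the directory structure is as follows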
+PaddleClas/dataset/flowers102
+├── flowers102_label_list.txt
+├── jpg
+├── train_extra_list.txt
+├── train_list.txt
+└── val_list.txt
+```
+
+### 3. Run Four-Card Training
+
+```bash
+# Enter the suite directory
+cd PaddleClas/
+
+# Four-card training
+python -m paddle.distributed.launch --devices "0,1,2,3" \
+       tools/train.py -c ./ppcls/configs/quick_start/ResNet50_vd.yaml \
+       -o Arch.pretrained=True \
+       -o Global.device=gpu
+# After training completes, the expected output is as follows
+# ppcls INFO: [Eval][Epoch 20][best metric: 0.9245098829269409]
+# ppcls INFO: Already save model in ./output/ResNet50_vd/epoch_20
+# ppcls INFO: Already save model in ./output/ResNet50_vd/latest
+
+# Single-card evaluation: evaluate the model produced by the training step above
+python tools/eval.py -c ./ppcls/configs/quick_start/ResNet50_vd.yaml \
+       -o Arch.pretrained="output/ResNet50_vd/best_model" \
+       -o Global.device=gpu
+# After evaluation completes, the expected output is as follows
+# [Eval][Epoch 0][Avg]CELoss: 0.51397, loss: 0.51397, top1: 0.91569, top5: 0.98039
+```
+
+## Inference Example
+
+This example uses the [ResNet50](https://paddle-inference-dist.bj.bcebos.com/Paddle-Inference-Demo/resnet50.tgz) model to show how to run inference on Hygon DCU.
+
+### 1. Download the Inference Program
+
+```bash
+# Download the Paddle-Inference-Demo example code
+git clone https://github.com/PaddlePaddle/Paddle-Inference-Demo.git
+```
+
+### 2. Prepare the Inference Model
+
+```bash
+# Enter the Python GPU inference example directory
+cd Paddle-Inference-Demo/python/gpu/resnet50
+
+# Download and extract the inference model files
+wget https://paddle-inference-dist.bj.bcebos.com/Paddle-Inference-Demo/resnet50.tgz
+tar xzf resnet50.tgz
+
+# Prepare the sample image for prediction
+wget https://paddle-inference-dist.bj.bcebos.com/inference_demo/python/resnet50/ILSVRC2012_val_00000247.jpeg
+
+# After preparation, the model and image directory looks like this
+Paddle-Inference-Demo/python/gpu/resnet50
+├── ILSVRC2012_val_00000247.jpeg
+└── resnet50
+    ├── inference.pdiparams
+    ├── inference.pdiparams.info
+    └── inference.pdmodel
+```
+
+### 3. Run the Inference Program
+
+```bash
+# Run the Python inference program
+python infer_resnet.py \
+    --model_file=./resnet50/inference.pdmodel \
+    --params_file=./resnet50/inference.pdiparams
+
+# The expected output is as follows
+# class index:  13
+```
@@ -0,0 +1,20 @@
+.. _cn_rocm_information:
+
+####################
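+Hygon DCU Chips
+####################
+
+The Hygon Z100 series chips are designed on a general-purpose GPGPU architecture. They are a general-purpose GPU product from Hygon (HYGON) aimed at a wide range of application scenarios, and are particularly well suited to providing strong compute for AI workloads. For more details and technical specifications of Hygon DCU chips, `click here <https://www.hygon.cn/>`_ .
+
+The PaddlePaddle framework supports training and inference on Hygon DCU chips. See the following pages to get started quickly:
+
+- `Hygon DCU Installation Guide <./install_cn.html>`_ : how to install PaddlePaddle for Hygon DCU
+- `Hygon DCU Run Examples <./example_cn.html>`_ : training and inference examples on Hygon DCU
+- `Models Supported on Hygon DCU <./support_cn.html>`_ : models verified on Hygon DCU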
+
+..  toctree::
+    :hidden:
+
+    install_cn.md
+    example_cn.md
+    support_cn.md
@@ -0,0 +1,114 @@
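+# Hygon DCU Installation Guide
+
+The DCU build of the PaddlePaddle framework supports training and inference on Hygon DCU. Two installation methods are available:
+
+1. Install the wheel package published on the official PaddlePaddle website
+2. Build a wheel package from source and install it
+
+## Hygon DCU System Requirements
+
+| Requirement | Details |
+| --------- | -------- |
+| Chip model | Hygon Z100 series chips, including Z100 and Z100L |
+| Operating system | Linux, including CentOS and KylinV10 |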
+
+## Prepare the Runtime Environment
+
+We recommend the official Hygon DCU development image published by PaddlePaddle, which comes with the Hygon DCU base runtime libraries (DTK) preinstalled.
+
+```bash
+# Pull the image
+docker pull registry.baidubce.com/device/paddle-dcu:dtk23.10.1-kylinv10-gcc73-py310
+
+# Start the container with a command like the following
+docker run -it --name paddle-dcu-dev -v $(pwd):/work \
+  --workdir=/work --shm-size=128G --network=host  \
+  --device=/dev/kfd --device=/dev/dri --group-add video \
+  --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
+  registry.baidubce.com/device/paddle-dcu:dtk23.10.1-kylinv10-gcc73-py310 /bin/bash
+
+# Check that the Hygon DCU devices are visible inside the container
+rocm-smi
+
+# The expected output is as follows
+============System Management Interface ============
+====================================================
+DCU  Temp   AvgPwr  Fan   Perf  PwrCap  VRAM%  DCU%
+0    30.0c  38.0W   0.0%  auto  280.0W    0%   0%
+1    30.0c  41.0W   0.0%  auto  280.0W    0%   0%
+2    29.0c  38.0W   0.0%  auto  280.0W    0%   0%
+3    29.0c  39.0W   0.0%  auto  280.0W    0%   0%
+====================================================
+===================End of SMI Log===================
+```
+
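+## Install the PaddlePaddle Framework
+
+**Note**: The DCU build of the PaddlePaddle framework supports only the Hygon C86 architecture.
+
+### Method 1: Install from the Wheel Package
+
+Inside the running Docker container, download and install the wheel package published on the official PaddlePaddle website.
+
+```bash
+# Download the wheel package
+wget https://paddle-device.bj.bcebos.com/0.0.0/dcu/paddlepaddle_rocm-0.0.0-cp310-cp310-linux_x86_64.whl
+
+# Install the wheel package
+pip install -U paddlepaddle_rocm-0.0.0-cp310-cp310-linux_x86_64.whl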
+```
+
+### Method 2: Build and Install from Source
+
+Inside the running Docker container, download the Paddle source code and build it. For the meaning of the CMake build options, see the [build options table](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#Compile).
+
+```bash
+# Download the Paddle source code
+git clone https://github.com/PaddlePaddle/Paddle.git -b develop
+cd Paddle
+
+# Create the build directory
+mkdir build && cd build
+
+# CMake configure command
+cmake .. -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DCMAKE_CXX_FLAGS="-Wno-error -w" \
+  -DPY_VERSION=3.10 -DPYTHON_EXECUTABLE=`which python3` -DWITH_CUSTOM_DEVICE=OFF \
+  -DWITH_TESTING=OFF -DON_INFER=ON -DWITH_DISTRIBUTE=ON -DWITH_MKL=ON \
+  -DWITH_ROCM=ON -DWITH_RCCL=ON
+
+# Build with make
+make -j16
+
+# The build output is under build/python/dist/; install the wheel from that directory with pip
+pip install -U paddlepaddle_rocm-0.0.0-cp310-cp310-linux_x86_64.whl
+```
+
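+## Basic Function Check
+
+After installation, run the following commands inside the Docker container to perform a basic health check of PaddlePaddle.
+
+```bash
+# Check the installed version
+python -c "import paddle; paddle.version.show()"
+# The expected output is as follows
+commit: d37bd8bcf75cf51f6c1117526f3f67d04946ebb9
+cuda: False
+cudnn: False
+nccl: 0
+
+# PaddlePaddle basic health check
+python -c "import paddle; paddle.utils.run_check()"
+# The expected output is as follows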
+Running verify PaddlePaddle program ...
+PaddlePaddle works well on 1 GPU.
+PaddlePaddle works well on 8 GPUs.
+PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
+```
+
+## How to Uninstall
+
+Use the following command to uninstall:
+
+```bash
+pip uninstall paddlepaddle-rocm
+```
@@ -1,7 +1,6 @@
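+# Models Supported on Hygon DCU
 
-# Models Supported by the PaddlePaddle ROCm Build
-
-The Paddle ROCm build currently supports single-node single-card and single-node multi-card training and inference of the following models on Hygon CPU (x86) and DCU.
+The models verified with the PaddlePaddle framework on Hygon DCU are listed below: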
 
 ## Image Classification
 
@@ -123,14 +122,14 @@
 
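 The models live in the PaddlePaddle model suites. Each domain suite is a separate repo under github.com/PaddlePaddle; run git clone to get the model files you need:
 
-| Domain      | Suite           | Branch/Version   |
-| ----------- | --------------- | ---------------- |
-| Image classification | PaddleClas      | release/2.3      |
-| Object detection | PaddleDetection | release/2.2      |
-| Image segmentation | PaddleSeg       | release/v2.0     |
-| Natural language processing | PaddleNLP       | develop          |
-| Character recognition | PaddleOCR       | release/2.3      |
-| Recommender systems | PaddleRec       | release/2.1.0    |
-| Video classification | PaddleVideo     | develop          |
-| Speech synthesis | Parakeet        | develop          |
-| Generative adversarial networks | PaddleGAN       | develop          |
+| Domain      | Suite           |
+| ----------- | --------------- |
+| Image classification | PaddleClas      |
+| Object detection | PaddleDetection |
+| Image segmentation | PaddleSeg       |
+| Natural language processing | PaddleNLP       |
+| Character recognition | PaddleOCR       |
+| Recommender systems | PaddleRec       |
+| Video classification | PaddleVideo     |
+| Speech synthesis | Parakeet        |
+| Generative adversarial networks | PaddleGAN       |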
@@ -9,7 +9,7 @@
 | Server CPU | x86 | Intel | Common CPU models such as the full Xeon and Core lines | [Install](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html) | [Build from source](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/compile/linux-compile.html) | ✔️ |  |
 | Server GPU |  | NVIDIA | Ada Lovelace, Hopper, Ampere, Turing, and Volta architectures | [Install](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html) | [Build from source](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/compile/linux-compile.html) | ✔️ |  |
 | AI accelerator | Da Vinci | Huawei | Ascend 910 series | [Install](./npu/install_cn.html#wheel) | [Build from source](./npu/install_cn.html) | | ✔️ |
-| AI accelerator |  | Hygon | Hygon DCU | [Install](./rocm_docs/paddle_install_cn.html#wheel) | [Build from source](./rocm_docs/paddle_install_cn.html#anzhuangfangshier-tongguoyuanmabianyianzhuang) | | [Supported models](./rocm_docs/paddle_rocm_cn.html) |
+| AI accelerator | GPGPU | Hygon | Hygon Z100 series | [Install](./dcu/install_cn.html#wheel) | [Build from source](./dcu/install_cn.html) | | [Supported models](./dcu/support_cn.html) |
 | AI accelerator | XPU | Baidu | Kunlun R200, R300, etc. | [Install](./xpu/install_cn.html#wheel) | [Build from source](./xpu/install_cn.html#xpu) |  | [Supported models](./xpu/support_cn.html) |
 | AI accelerator | IPU | Graphcore | GC200 | | | | ✔️ |
 | AI accelerator | MLU | Cambricon | MLU370 series | [Install](./mlu/install_cn.html#wheel) | [Build from source](./mlu/install_cn.html) |  | [Supported models](./mlu/support_cn.html) |
@@ -25,11 +25,11 @@
 | Server CPU | x86 | Intel | Common CPU models such as the full Xeon and Core lines, plus NUC | [Prebuilt library](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html) | [Build from source](https://paddleinference.paddlepaddle.org.cn/user_guides/source_compile.html) | ✔️ |   |
 | Server GPU |  | NVIDIA | Ada Lovelace, Hopper, Ampere, Turing, and Volta architectures  | [Prebuilt library](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html) | [Build from source](https://paddleinference.paddlepaddle.org.cn/user_guides/source_compile.html) | ✔️ |   |
 | Mobile GPU |  | NVIDIA | Jetson series | [Prebuilt library](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html) | [Build from source](https://paddleinference.paddlepaddle.org.cn/user_guides/source_compile.html) | ✔️ |   |
-| AI accelerator | Da Vinci | Huawei | Ascend 910 series | [Prebuilt library](./npu/install_cn.html#wheel) | [Build from source](./npu/install_cn.html) |  | ✔️ |
-| AI accelerator | MLU | Cambricon | MLU370 series | [Prebuilt library](./mlu/install_cn.html#wheel) | [Build from source](./mlu/install_cn.html) |  | ✔️ |
+| AI accelerator | Da Vinci | Huawei | Ascend 910 series | | [Build from source](./npu/install_cn.html) |  | ✔️ |
+| AI accelerator | MLU | Cambricon | MLU370 series | | [Build from source](./mlu/install_cn.html) |  | ✔️ |
 | AI accelerator | MUSA | Moore Threads | MTT S series GPU |  |  |  |  |
-| AI accelerator |  | Hygon | Hygon DCU | [Prebuilt library](./rocm_docs/paddle_install_cn.html) | [Build from source](./rocm_docs/paddle_install_cn.html) | | [Supported models](./rocm_docs/paddle_rocm_cn.html) |
-| AI accelerator | XPU | Baidu | Kunlun R200, R300, etc. | [Prebuilt library](https://www.paddlepaddle.org.cn/inference/master/guides/hardware_support/xpu_kunlun_cn.html) | [Build from source](https://www.paddlepaddle.org.cn/inference/master/guides/hardware_support/xpu_kunlun_cn.html) |  | [Supported models](./xpu/support_cn.html) |
+| AI accelerator | GPGPU | Hygon | Hygon Z100 series | | [Build from source](https://www.paddlepaddle.org.cn/inference/master/guides/hardware_support/dcu_hygon_cn.html) | | [Supported models](./dcu/support_cn.html) |
+| AI accelerator | XPU | Baidu | Kunlun R200, R300, etc. | | [Build from source](https://www.paddlepaddle.org.cn/inference/master/guides/hardware_support/xpu_kunlun_cn.html) |  | [Supported models](./xpu/support_cn.html) |
 | Server CPU | ARM | Phytium | FT-2000+/64, S2500 |  |[Build from source](../../install/compile/arm-compile.html#anchor-1) |  |  |
 | Server CPU | ARM | Huawei | Kunpeng 920 2426SK |  |[Build from source](../../install/compile/arm-compile.html) |  |   |
 | Server CPU | MIPS | Loongson | Loongson 3A4000, 3A5000, 3C5000L |  |[Build from source](../../install/compile/mips-compile.html#anchor-0) |  |  |
 
@@ -8,15 +8,15 @@
 
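 - `PaddlePaddle Hardware Support <./hardware_info_cn.html>`_ : lists the hardware supported by PaddlePaddle products.
 - `Running PaddlePaddle on Kunlun XPU Chips <./xpu/index_cn.html>`_ : explains how to install and use PaddlePaddle in a Kunlun XPU chip environment.
-- `Running PaddlePaddle on Hygon DCU Chips <./rocm_docs/index_cn.html>`_ : explains how to install and use PaddlePaddle in a Hygon DCU chip environment.
-- `Running PaddlePaddle on Ascend NPU Chips <./npu/index_cn.html>`_ : explains how to install and use PaddlePaddle in an Ascend NPU environment.
-- `Running PaddlePaddle on Cambricon MLU Chips <./mlu/index_cn.html>`_ : explains how to install and use PaddlePaddle in a Cambricon MLU environment.
+- `Running PaddlePaddle on Hygon DCU Chips <./dcu/index_cn.html>`_ : explains how to install and use PaddlePaddle in a Hygon DCU chip environment.
+- `Running PaddlePaddle on Ascend NPU Chips <./npu/index_cn.html>`_ : explains how to install and use PaddlePaddle in an Ascend NPU chip environment.
+- `Running PaddlePaddle on Cambricon MLU Chips <./mlu/index_cn.html>`_ : explains how to install and use PaddlePaddle in a Cambricon MLU chip environment.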
 
 ..  toctree::
     :hidden:
 
     hardware_info_cn.md
     xpu/index_cn.rst
-    rocm_docs/index_cn.rst
+    dcu/index_cn.rst
     npu/index_cn.rst
     mlu/index_cn.rst