Commit 33e074d

Merge pull request #6742 from typhoonzero/fix_document_errors
fix some doc errors
2 parents de85470 + 4952597 commit 33e074d

4 files changed, +21 -22 lines changed

doc/howto/usage/cluster/cluster_train_cn.md

+6 -6
@@ -1,4 +1,4 @@
-# PaddlePaddle分布式训练
+# 分布式训练


 ## 概述
@@ -181,8 +181,8 @@ PaddlePaddle可以使用多种分布式计算平台构建分布式计算任务

 ## 在不同集群中运行

-- [fabric](fabric_cn.md)
-- [openmpi](openmpi_cn.md)
-- [kubernetes](k8s_cn.md)
-- [kubernetes distributed](k8s_distributed_cn.md)
-- [kubernetes on AWS](k8s_aws_cn.md)
+- [fabric集群](fabric_cn.md)
+- [openmpi集群](openmpi_cn.md)
+- [kubernetes单机](k8s_cn.md)
+- [kubernetes distributed分布式](k8s_distributed_cn.md)
+- [AWS上运行kubernetes集群训练](k8s_aws_cn.md)

doc/howto/usage/cluster/cluster_train_en.md

+1 -2
@@ -1,4 +1,4 @@
-# PaddlePaddle Distributed Training
+# Distributed Training

 ## Introduction

@@ -188,5 +188,4 @@ These cluster platforms provide API or environment variables for training proces
 - [fabric](fabric_en.md)
 - [openmpi](openmpi_en.md)
 - [kubernetes](k8s_en.md)
-- kubernetes distributed
 - [kubernetes on AWS](k8s_aws_en.md)

doc/howto/usage/cluster/k8s_cn.md

+7 -7
@@ -1,16 +1,16 @@
 # Kubernetes单机训练

-在这篇文档里,我们介绍如何在 Kubernetes 集群上启动一个单机使用CPU的Paddle训练作业。在下一篇中,我们将介绍如何启动分布式训练作业。
+在这篇文档里,我们介绍如何在 Kubernetes 集群上启动一个单机使用CPU的PaddlePaddle训练作业。在下一篇中,我们将介绍如何启动分布式训练作业。

 ## 制作Docker镜像

-在一个功能齐全的Kubernetes机群里,通常我们会安装Ceph等分布式文件系统来存储训练数据。这样的话,一个分布式Paddle训练任务中的每个进程都可以从Ceph读取数据。在这个例子里,我们只演示一个单机作业,所以可以简化对环境的要求,把训练数据直接放在
-Paddle的Docker image里。为此,我们需要制作一个包含训练数据的Paddle镜像
+在一个功能齐全的Kubernetes机群里,通常我们会安装Ceph等分布式文件系统来存储训练数据。这样的话,一个分布式PaddlePaddle训练任务中的每个进程都可以从Ceph读取数据。在这个例子里,我们只演示一个单机作业,所以可以简化对环境的要求,把训练数据直接放在
+PaddlePaddle的Docker image里。为此,我们需要制作一个包含训练数据的PaddlePaddle镜像

-Paddle 的 [Quick Start Tutorial](http://www.paddlepaddle.org/doc/demo/quick_start/index_en.html)
+Paddle 的 [Quick Start Tutorial](http://www.paddlepaddle.org/docs/develop/documentation/zh/getstarted/index_cn.html)
 里介绍了用Paddle源码中的脚本下载训练数据的过程。
-`paddledev/paddle:cpu-demo-latest` 镜像里有 Paddle 源码与demo,( 请注意,默认的
-Paddle镜像 `paddledev/paddle:cpu-latest` 是不包括源码的, Paddle的各版本镜像可以参考 [Docker installation guide](http://www.paddlepaddle.org/doc/build/docker_install.html) ),所以我们使用这个镜像来下载训练数据到Docker container中,然后把这个包含了训练数据的container保存为一个新的镜像。
+`paddledev/paddle:cpu-demo-latest` 镜像里有 PaddlePaddle 源码与demo,( 请注意,默认的
+PaddlePaddle镜像 `paddledev/paddle:cpu-latest` 是不包括源码的, PaddlePaddle的各版本镜像可以参考 [Docker installation guide](http://www.paddlepaddle.org/doc/build/docker_install.html) ),所以我们使用这个镜像来下载训练数据到Docker container中,然后把这个包含了训练数据的container保存为一个新的镜像。

 ### 运行容器

@@ -103,7 +103,7 @@ spec:
 restartPolicy: Never
 ```

-### 创建Paddle Job
+### 创建PaddlePaddle Job

 使用上文创建的yaml文件创建Kubernetes Job,命令为:

doc/howto/usage/cluster/k8s_en.md

+7 -7
@@ -1,13 +1,13 @@
-# Paddle On Kubernetes
+# PaddlePaddle On Kubernetes

->In this article, we will introduce how to run Paddle training job on single CPU machine using Kubernetes. In next article, we will introduce how to run Paddle training job on distributed cluster.
+In this article, we will introduce how to run PaddlePaddle training job on single CPU machine using Kubernetes. In next article, we will introduce how to run PaddlePaddle training job on distributed cluster.

 ## Build Docker Image

-In distributed Kubernetes cluster, we will use Ceph or other shared storage system for storing training related data so that all processes in Paddle training can retrieve data from Ceph. In this example, we will only demo training job on single machine. In order to simplify the requirement of the environment, we will directly put training data into Paddle's Docker Image, so we need to create a Paddle Docker image that already includes the training data.
+In distributed Kubernetes cluster, we will use Ceph or other shared storage system for storing training data so that all processes in the training job can retrieve data from Ceph. In this example, we will only demo training job on single machine. In order to simplify the requirement of the environment, we will directly put training data into PaddlePaddle's Docker Image, so we need to create a PaddlePaddle Docker image that already includes the training data.

-Paddle's [Quick Start Tutorial](http://www.paddlepaddle.org/doc/demo/quick_start/index_en.html) introduces how to download and train data by using script from Paddle's source code.
-And `paddledev/paddle:cpu-demo-latest` image has the Paddle source code and demo. (Caution: Default Paddle image `paddledev/paddle:cpu-latest` doesn't include the source code, Paddle's different versions of image can be referred here: [Docker installation guide](http://www.paddlepaddle.org/doc/build/docker_install.html)), so we run this container and download the training data, and then commit the whole container to be a new Docker image.
+PaddlePaddle's [Quick Start Tutorial](http://www.paddlepaddle.org/docs/develop/documentation/en/getstarted/index_en.html) introduces how to download and train data by using script from PaddlePaddle's source code.
+And `paddledev/paddle:cpu-demo-latest` image has the PaddlePaddle source code and demo. (Caution: Default PaddlePaddle image `paddledev/paddle:cpu-latest` doesn't include the source code, PaddlePaddle's different versions of image can be referred here: [Docker installation guide](http://www.paddlepaddle.org/doc/build/docker_install.html)), so we run this container and download the training data, and then commit the whole container to be a new Docker image.

 ### Run Docker Container

@@ -67,7 +67,7 @@ $ docker commit quick_start_data mypaddle/paddle:quickstart

 ## Use Kubernetes For Training

->We will use Kubernetes job for training process, following steps shows how to do the training with Kubernetes.
+We will use Kubernetes job for training process, following steps shows how to do the training with Kubernetes.

 ### Create Yaml Files

@@ -99,7 +99,7 @@ spec:
 restartPolicy: Never
 ```

-### Start Paddle Job
+### Start PaddlePaddle Job

 Using the above yaml file to start the Kubernetes job.
