PaddleRec Milestone #2

Merged 571 commits on May 19, 2020
4d0624b
debug
123malin May 8, 2020
ee5d44d
Merge branch 'infer_dssm_w2v' into 'develop'
123malin May 8, 2020
02e66e8
add rank readme
yaoxuefeng6 May 9, 2020
93b9050
Merge branch 'xuefeng' into 'develop'
yaoxuefeng6 May 9, 2020
fab38a4
modify rank readme
yaoxuefeng6 May 9, 2020
3cb4e47
Merge branch 'xuefeng' into 'develop'
yaoxuefeng6 May 9, 2020
a862f89
add gnn
123malin May 9, 2020
6dcb9ca
Merge branch 'gnn' into 'develop'
123malin May 9, 2020
dc2894c
modify rank readme
yaoxuefeng6 May 9, 2020
7a2666e
Merge branch 'xuefeng' into 'develop'
yaoxuefeng6 May 9, 2020
fbab6e1
modify rank readme
yaoxuefeng6 May 9, 2020
922dbcf
Merge branch 'xuefeng' into 'develop'
yaoxuefeng6 May 9, 2020
fd7df95
add simnet
123malin May 9, 2020
3b3637d
update doc
MrChengmo May 9, 2020
abde15c
Merge branch 'chengmo_dev' of ssh://gitlab.baidu.com:8022/chengmo/pad…
MrChengmo May 9, 2020
e957d7b
update readme
MrChengmo May 9, 2020
c30368e
debug
123malin May 9, 2020
decaa00
Merge branch 'simnet' into 'develop'
123malin May 9, 2020
7211a0b
bug fix for train.threads, cluster infer, add w2v prepare_data.sh
123malin May 9, 2020
308fe8c
add tdm infer
MrChengmo May 9, 2020
c554105
Merge branch 'bug_fix' into 'develop'
123malin May 9, 2020
d5b4ffd
Merge branch 'develop' into 'tdm_infer'
MrChengmo May 9, 2020
3c1f6d2
Merge branch 'tdm_infer' into 'develop'
MrChengmo May 9, 2020
05b60cc
add text_classification
xjqbest May 9, 2020
4cdbb3e
add multitask ssr gru4rec
frankwhzhang May 10, 2020
0f78b25
Merge branch 'develop' into 'develop'
frankwhzhang May 10, 2020
74bb8cf
fix
xjqbest May 11, 2020
1125b79
Merge branch 'develop' into 'develop'
xjqbest May 11, 2020
bce83d0
mv dssm to match/; mv tagspace;text_classification to contentundersta…
123malin May 11, 2020
e9011c0
add mpi cluster
seiriosPlus May 12, 2020
f18eea2
add mpi cluster
seiriosPlus May 12, 2020
531bf66
Merge branch 'my_develop' into 'develop'
123malin May 12, 2020
e1f9013
add mpi cluster
seiriosPlus May 12, 2020
b0873dd
add mpi cluster
seiriosPlus May 12, 2020
12f161d
add mpi cluster
seiriosPlus May 12, 2020
16966dc
add qsub submit
seiriosPlus May 12, 2020
7c10b48
add qsub submit
seiriosPlus May 12, 2020
77a2da6
add qsub submit
seiriosPlus May 12, 2020
e6f590c
add recall/readme
123malin May 12, 2020
b2a8248
update readme
123malin May 12, 2020
ad9349d
update readme
123malin May 12, 2020
8a96d14
Merge branch 'my_develop' into 'develop'
123malin May 12, 2020
9438cb3
bug fix
seiriosPlus May 13, 2020
1c8486f
Merge branch 'rec_mpi' into rec_develop
seiriosPlus May 13, 2020
06a44f6
add cluster run, rename fleet-rec to paddle-rec
seiriosPlus May 13, 2020
ee59aae
update doc
MrChengmo May 13, 2020
cd7cb08
fix - - -
MrChengmo May 13, 2020
3aec9f2
add cluster run, rename fleet-rec to paddle-rec
seiriosPlus May 13, 2020
fcbb1cc
for merge
MrChengmo May 13, 2020
89b366a
update doc
MrChengmo May 13, 2020
b576b45
update doc
MrChengmo May 13, 2020
07d791e
update log
MrChengmo May 13, 2020
1fb6396
Merge branch 'doc_update' into 'develop'
MrChengmo May 13, 2020
7d585ea
add cluster run
seiriosPlus May 13, 2020
21ddc26
add cluster run
seiriosPlus May 13, 2020
58a74dc
add reader debug
MrChengmo May 13, 2020
013c12f
fix bug
MrChengmo May 13, 2020
797b515
add paddle cloud run
seiriosPlus May 13, 2020
33fc175
add reader debug mode
MrChengmo May 13, 2020
5a0982f
add reader doc
MrChengmo May 13, 2020
6e3933d
update doc
MrChengmo May 13, 2020
a940d88
fix
MrChengmo May 13, 2020
b357847
replace log
MrChengmo May 13, 2020
b52f68d
fix
MrChengmo May 13, 2020
685a856
update
MrChengmo May 13, 2020
075eb86
Merge branch 'doc_update' into 'develop'
MrChengmo May 13, 2020
41c7764
bug fix
seiriosPlus May 14, 2020
5b3e120
Merge branch 'develop' of ssh://gitlab.baidu.com:8022/tangwei12/paddl…
seiriosPlus May 14, 2020
f265d5e
add paddle cloud run
seiriosPlus May 14, 2020
1bdfd06
add paddle cloud run
seiriosPlus May 14, 2020
032f657
add paddle cloud run
seiriosPlus May 14, 2020
24d5c80
add paddle cloud run
seiriosPlus May 14, 2020
07dbb36
fix multi-task bug
frankwhzhang May 14, 2020
67869ed
bug fix
seiriosPlus May 14, 2020
fb7fbb8
Merge branch 'develop' into 'develop'
frankwhzhang May 14, 2020
8dae87e
add paddle cloud run
seiriosPlus May 14, 2020
00d1d6d
add paddle cloud run
seiriosPlus May 14, 2020
5adee37
add paddle cloud run
seiriosPlus May 14, 2020
9b96156
fix
MrChengmo May 14, 2020
2a5b531
add gpu config
frankwhzhang May 14, 2020
75b8b0e
Merge branch 'develop' into 'develop'
frankwhzhang May 14, 2020
70d1dc5
add paddle cloud run
seiriosPlus May 14, 2020
b51cbb3
add paddle cloud run
seiriosPlus May 14, 2020
42bdd47
Merge branch 'develop' of ssh://gitlab.baidu.com:8022/tangwei12/paddl…
seiriosPlus May 14, 2020
70f1778
add multitask readme
frankwhzhang May 14, 2020
c2b6fc4
Merge branch 'develop' into 'develop'
frankwhzhang May 14, 2020
7a7843c
add evaluate only choice
MrChengmo May 14, 2020
e2d2db2
add paddle cloud run
seiriosPlus May 14, 2020
120b2e4
Merge branch 'develop' of ssh://gitlab.baidu.com:8022/tangwei12/paddl…
seiriosPlus May 14, 2020
e1911c9
upadet
MrChengmo May 14, 2020
91bc9c9
fix
MrChengmo May 14, 2020
4f59a38
fix
MrChengmo May 14, 2020
1fb9c55
fix
MrChengmo May 14, 2020
47672b3
fix
MrChengmo May 14, 2020
f1ef14d
fix
MrChengmo May 14, 2020
0a0fb30
fix
MrChengmo May 14, 2020
307afea
fix
MrChengmo May 14, 2020
e97bc2c
fix
MrChengmo May 14, 2020
3a3a235
Merge branch 'doc_update' into 'develop'
MrChengmo May 14, 2020
67997ab
fix
MrChengmo May 14, 2020
240a3e6
Merge branch 'doc_update' into 'develop'
MrChengmo May 14, 2020
6fa3457
add gru4rec infer
frankwhzhang May 14, 2020
f5626ba
Merge branch 'develop' into 'develop'
frankwhzhang May 14, 2020
41c8007
update ctr dnn benchmark
seiriosPlus May 14, 2020
6574436
Merge branch 'develop' of ssh://gitlab.baidu.com:8022/tangwei12/paddl…
seiriosPlus May 14, 2020
7d00be9
update ctr dnn benchmark
seiriosPlus May 14, 2020
4590256
update ctr dnn benchmark
seiriosPlus May 14, 2020
6275765
update
MrChengmo May 14, 2020
7a3a1dd
fix
MrChengmo May 14, 2020
4f2fe5a
Merge branch 'ai_studio_beta' into 'develop'
MrChengmo May 14, 2020
1e0195c
fix
MrChengmo May 14, 2020
65fe4a7
Merge branch 'ai_studio_beta' into 'develop'
MrChengmo May 14, 2020
f27e824
fix readme
frankwhzhang May 14, 2020
26a8fe3
Merge branch 'develop' into 'develop'
frankwhzhang May 14, 2020
2047af5
add readme
xjqbest May 14, 2020
0ab7d17
Update readme.md
xjqbest May 14, 2020
946d2a5
Merge branch 'develop' into 'develop'
xjqbest May 14, 2020
32028a9
Update readme.md
xjqbest May 15, 2020
183ff0a
Merge branch 'develop' into 'develop'
xjqbest May 15, 2020
07456ac
refine readme for cu
May 15, 2020
ecdec09
refine readme for cu
May 15, 2020
417be99
refine readme for cu
May 15, 2020
c528127
refine readme for cu
May 15, 2020
524a796
refine readme for cu
May 15, 2020
3b59e3c
add ssr infer
frankwhzhang May 15, 2020
c051fb3
Merge branch 'develop' into 'develop'
frankwhzhang May 15, 2020
abb7933
refine readme for rank and cu
May 15, 2020
388df4f
refine readme for rank and cu
May 15, 2020
3c132e9
refine readme for rank and cu
May 15, 2020
5e300b8
Merge branch 'develop' into 'develop'
fuyinno4 May 15, 2020
f1a9503
add mmoe infer
frankwhzhang May 15, 2020
52d3c0a
add share-bottom infer
frankwhzhang May 15, 2020
ad5022d
Merge branch 'develop' into 'develop'
frankwhzhang May 15, 2020
729d5e3
add esmm infer
frankwhzhang May 15, 2020
c241990
Merge branch 'develop' into 'develop'
frankwhzhang May 15, 2020
aa858fe
finish design
MrChengmo May 15, 2020
542838d
fix multitask readme
frankwhzhang May 15, 2020
4c8c32a
Merge branch 'develop' into 'develop'
frankwhzhang May 15, 2020
6f1df8f
fix readme
frankwhzhang May 15, 2020
8ca357d
Merge branch 'develop' into 'develop'
frankwhzhang May 15, 2020
726c61c
fix
xjqbest May 15, 2020
3e0c301
Merge branch 'develop' into 'develop'
xjqbest May 15, 2020
76ee787
update reader
MrChengmo May 15, 2020
8c0b1d1
update recall/match readme
123malin May 15, 2020
c624dd5
update recall readme
123malin May 15, 2020
531bf6f
fix reader
MrChengmo May 15, 2020
7aef156
Merge branch 'doc' into 'develop'
123malin May 15, 2020
1b2a89f
Merge branch 'ai_studio_beta' into 'develop'
MrChengmo May 15, 2020
e8b832c
update readne
MrChengmo May 15, 2020
d1c7476
fix
MrChengmo May 15, 2020
69dae72
fix
MrChengmo May 15, 2020
50eeed6
fix
MrChengmo May 15, 2020
39e76f3
Merge branch 'ai_studio_beta' into 'develop'
MrChengmo May 15, 2020
98a47ba
add rec-overview
MrChengmo May 16, 2020
dbf4b13
Merge branch 'ai_studio_beta' into 'develop'
MrChengmo May 16, 2020
13e62b7
fix
xjqbest May 18, 2020
d7171ec
Merge branch 'develop' into 'develop'
xjqbest May 18, 2020
0e9c4fd
refine readme for cu
fuyinno4 May 15, 2020
077e682
release 0.1
seiriosPlus May 14, 2020
4cdab80
remove cluster example
seiriosPlus May 14, 2020
63dd4d0
refine readme for cu
fuyinno4 May 15, 2020
4f0c5de
refine readme for cu
fuyinno4 May 15, 2020
b9be302
refine readme for cu
fuyinno4 May 15, 2020
07f946a
refine readme for cu
fuyinno4 May 15, 2020
3865e85
add ssr infer
frankwhzhang May 15, 2020
c167bf3
refine readme for rank and cu
fuyinno4 May 15, 2020
dbf19ca
refine readme for rank and cu
fuyinno4 May 15, 2020
cb01de6
refine readme for rank and cu
fuyinno4 May 15, 2020
ce6b34a
add mmoe infer
frankwhzhang May 15, 2020
5d701fe
add share-bottom infer
frankwhzhang May 15, 2020
3214549
add esmm infer
frankwhzhang May 15, 2020
61a7361
finish design
MrChengmo May 15, 2020
0cab80c
add license and bug fix
seiriosPlus May 15, 2020
e48639c
add license and bug fix
seiriosPlus May 15, 2020
bb21535
fix multitask readme
frankwhzhang May 15, 2020
c17f9ae
fix readme
frankwhzhang May 15, 2020
ded2280
rebase user
seiriosPlus May 18, 2020
3874172
rebase user
seiriosPlus May 18, 2020
ece6c4b
update recall/match readme
123malin May 15, 2020
d610967
update recall readme
123malin May 15, 2020
0ae5bca
fix reader
MrChengmo May 15, 2020
79252c7
update readne
MrChengmo May 15, 2020
a929a99
fix
MrChengmo May 15, 2020
9ebd08a
fix
MrChengmo May 15, 2020
975245e
fix
MrChengmo May 15, 2020
7dfe461
add rec-overview
MrChengmo May 16, 2020
d12cf52
merge readme
seiriosPlus May 18, 2020
1fdcf2e
merge develop
seiriosPlus May 18, 2020
fa509de
remove readme
seiriosPlus May 18, 2020
5461670
Merge branch 'master' of https://github.com/PaddlePaddle/PaddleRec in…
seiriosPlus May 18, 2020
81992dc
add readme
seiriosPlus May 18, 2020
a66a717
add readme
seiriosPlus May 18, 2020
3cae956
remove unused flag -d -e
seiriosPlus May 18, 2020
d536f99
remove unused flag -d -e
seiriosPlus May 18, 2020
6378742
fix readme
frankwhzhang May 18, 2020
7b457e4
Merge branch 'develop' into 'develop'
frankwhzhang May 18, 2020
2f710e5
tmp modify rank readme
yaoxuefeng6 May 18, 2020
1431b66
Merge branch 'xuefeng' into 'develop'
yaoxuefeng6 May 18, 2020
a14ff89
fix readme
frankwhzhang May 18, 2020
77c3e44
Merge branch 'develop' into 'develop'
frankwhzhang May 18, 2020
4699640
fix readme
frankwhzhang May 18, 2020
7e1f327
fix xdeepfm metric
yaoxuefeng6 May 18, 2020
1b1cfee
Merge branch 'xuefeng' into 'develop'
yaoxuefeng6 May 18, 2020
1c41e7b
fix ssr readme
frankwhzhang May 18, 2020
17420a0
Merge branch 'develop' into 'develop'
frankwhzhang May 18, 2020
e9853a9
update recall/match readme
123malin May 18, 2020
7ab5d82
Merge branch 'my_develop' into 'develop'
123malin May 18, 2020
741fbbd
windows path adapt
seiriosPlus May 18, 2020
89799b5
Merge branch 'develop' of ssh://gitlab.baidu.com:8022/tangwei12/paddl…
seiriosPlus May 18, 2020
72e086b
windows path adapt
seiriosPlus May 18, 2020
3c915be
windows path adapt
seiriosPlus May 18, 2020
91dad67
add slot reader
xjqbest May 18, 2020
e57ed51
add online trainning trainer
seiriosPlus May 18, 2020
c67be2a
update doc
MrChengmo May 18, 2020
6361f96
update doc
MrChengmo May 18, 2020
473b742
update doc
MrChengmo May 18, 2020
f2ed7fd
delete readme
MrChengmo May 18, 2020
c685a49
Merge branch 'doc_v1' into 'develop'
MrChengmo May 18, 2020
0f88900
update doc
MrChengmo May 19, 2020
7cd6838
fix
MrChengmo May 19, 2020
8d23379
fix
MrChengmo May 19, 2020
8f5d002
fix
MrChengmo May 19, 2020
bef91cb
Merge branch 'doc_v2' into 'develop'
MrChengmo May 19, 2020
8d82a06
remove unused import
seiriosPlus May 19, 2020
ba023df
fix import order
seiriosPlus May 19, 2020
801dfd3
rename get_cost_op to avg_cost
seiriosPlus May 19, 2020
7a3ec4e
for mat
seiriosPlus May 19, 2020
b11b882
rename Layer
seiriosPlus May 19, 2020
522d4d4
slot reader for rank
xjqbest May 19, 2020
c3b2851
rm slot data
xjqbest May 19, 2020
5892301
fix
xjqbest May 19, 2020
25fbe51
fix
xjqbest May 19, 2020
c6fc563
add custom_dataset_reader.md
xjqbest May 19, 2020
3e50521
Merge remote-tracking branch 'upstream/develop' into develop
xjqbest May 19, 2020
a01831d
add ncf youtube
frankwhzhang May 19, 2020
c782ddb
Merge branch 'develop' into 'develop'
frankwhzhang May 19, 2020
b111a04
Merge branch 'develop' into 'develop'
frankwhzhang May 19, 2020
01d8b29
add readme
frankwhzhang May 19, 2020
76471fc
Merge branch 'develop' of http://gitlab.baidu.com/tangwei12/paddlerec…
frankwhzhang May 19, 2020
d770eec
Merge branch 'develop' into 'develop'
frankwhzhang May 19, 2020
af164aa
fix
xjqbest May 19, 2020
491c036
Merge remote-tracking branch 'upstream/develop' into develop
xjqbest May 19, 2020
43701d7
fix
xjqbest May 19, 2020
2d22c88
fix
xjqbest May 19, 2020
3882868
Merge branch 'develop' into 'develop'
xjqbest May 19, 2020
3b3ae94
update
MrChengmo May 19, 2020
17f07fe
Merge branch 'doc_v3' into 'develop'
MrChengmo May 19, 2020
b95b2fe
fix
xjqbest May 19, 2020
641a7ef
Merge branch 'develop' into 'develop'
xjqbest May 19, 2020
caf0501
fix Normalization
seiriosPlus May 19, 2020
6 changes: 6 additions & 0 deletions .gitignore
@@ -0,0 +1,6 @@
*.o
output
.idea/
paddlerec.egg-info/
*~
*.pyc
221 changes: 219 additions & 2 deletions README.md
@@ -1,2 +1,219 @@
# PaddleRec
Recommendation algorithms with support for large-scale parallel training
<p align="center">
<img align="center" src="doc/imgs/logo.png">
<p>

<p align="center">
<br>
<img alt="Release" src="https://img.shields.io/badge/Release-0.1.0-yellowgreen">
<img alt="License" src="https://img.shields.io/github/license/PaddlePaddle/Serving">
<img alt="Slack" src="https://img.shields.io/badge/Join-Slack-green">
<br>
<p>


<h2 align="center">What is PaddleRec</h2>

<p align="center">
<img align="center" src="doc/imgs/structure.png">
<p>

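- A **one-stop, out-of-the-box toolkit** for search and recommendation models, built on the PaddlePaddle ecosystem
- An end-to-end solution for beginners, developers, and researchers, covering survey, training, inference, and deployment
- A library of recommendation and search algorithms covering semantic understanding, recall, pre-ranking, ranking, multi-task learning, fusion, and other tasks
- Customize options in a **yaml** config to get started quickly with single-machine training, large-scale distributed training, offline inference, and online deployment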


<h2 align="center">PaddleRec Overview</h2>

<p align="center">
<img align="center" src="doc/imgs/overview.png">
<p>


<h2 align="center">Recommender System: Pipeline Overview</h2>

<p align="center">
<img align="center" src="doc/imgs/rec-overview.png">
<p>

<h2 align="center">Installation</h2>

### Requirements
* Python 2.7/ 3.5 / 3.6 / 3.7
* PaddlePaddle >= 1.7.2
* Operating system: Windows/Mac/Linux

Review comment (Member): Are Mac and Windows supported?

Reply (Collaborator Author, @seiriosPlus, May 19, 2020): Mac and Windows are supported. On Windows only single-machine training is currently available; the documentation will state this.


> On Windows only single-machine training is currently available; Linux is recommended.

### Installation commands

- Method 1: install directly from PyPI with pip:
```bash
python -m pip install paddle-rec
```

- Method 2

Build and install from source
1. Install PaddlePaddle **Note: a PaddlePaddle version > 1.7.2 is required**

```shell
python -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```

2. Install PaddleRec from source

```
git clone https://github.com/PaddlePaddle/PaddleRec/
cd PaddleRec
python setup.py install
```


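<h2 align="center">Quick Start</h2>

### Run a built-in model with its default configuration

The framework currently ships with several built-in models. A single command starts single-machine training or locally simulated distributed training with a built-in model.
> Locally simulated distributed training (`local_cluster`) runs in parameter-server mode with `1 server + 1 trainer`.


We use the `dnn` ranking model as an example of basic PaddleRec usage. The training data comes from the [Criteo dataset](https://www.kaggle.com/c/criteo-display-ad-challenge/); we extracted 100 samples from it so that you can quickly try the complete PaddleRec workflow.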

```bash
# Single-machine training on CPU
python -m paddlerec.run -m paddlerec.models.rank.dnn
```

Review comment (Member): Should the training data used here be explained?

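### Run a built-in model with a custom configuration

If you reuse a built-in model and modify its **yaml** configuration file, for example to change hyperparameters or to point to different data, you can run that yaml file directly with paddlerec.

Taking the dnn model as an example, from the paddlerec source directory: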
```bash
cd paddlerec
```

Edit the dnn model's [hyperparameter configuration](./models/rank/dnn/config.yaml), for example changing the number of training epochs from 10 to 5:
```yaml
train:
  # epochs: 10
  epochs: 5
```

On Linux you can edit the yaml file with a text editor such as `vim`:

```bash
vim ./models/rank/dnn/config.yaml
# Press i to enter insert mode
# Edit the yaml configuration
# When done, press Esc to leave insert mode
# Type :wq to save the file and quit
```
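
The same change can also be made without an interactive editor. The sketch below is a minimal illustration in Python using only the standard library. `set_epochs` is a hypothetical helper (it is not part of PaddleRec), and it assumes the configuration contains an uncommented `epochs: <integer>` line like the fragment shown above.

```python
import re


def set_epochs(yaml_text, epochs):
    """Return yaml_text with the first uncommented `epochs:` value replaced.

    Hypothetical helper for illustration; a commented-out line such as
    `# epochs: 10` is left untouched because the pattern only allows
    whitespace before the key.
    """
    pattern = re.compile(r"^(\s*epochs:\s*)\d+", flags=re.MULTILINE)
    new_text, count = pattern.subn(
        lambda match: match.group(1) + str(epochs), yaml_text, count=1
    )
    if count == 0:
        raise ValueError("no uncommented 'epochs:' key found")
    return new_text


if __name__ == "__main__":
    sample = "train:\n  # epochs: 10\n  epochs: 10\n"
    print(set_epochs(sample, 5))
```

To apply it to the real file, read `./models/rank/dnn/config.yaml`, pass its text through `set_epochs`, and write the result back. A regular expression is used here rather than a YAML parser so that comments and formatting in the file stay as they are.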

After editing `models/rank/dnn/config.yaml`, run the `dnn` model:
```bash
# Train with the custom configuration
python -m paddlerec.run -m ./models/rank/dnn/config.yaml
```

### Distributed training

Distributed training requires editing `config.yaml`: add the `engine` option, or change it, to `cluster` for distributed training or to `local_cluster` for locally simulated distributed training.

Review comment (Member): If local_cluster is the 1*1 simulated training mentioned above, should the wording be unified so that "local cluster" is used consistently?


#### Locally simulated distributed training

Taking the dnn model as an example, from the paddlerec source directory, edit the dnn model's `config.yaml`:

```yaml
train:
  #engine: single
  engine: local_cluster
```
Then start paddlerec training:

```bash
# Run locally simulated distributed training
python -m paddlerec.run -m ./models/rank/dnn/config.yaml
```

#### Cluster distributed training

Taking the dnn model as an example, from the paddlerec source directory, first edit the dnn model's `config.yaml`:

```yaml
train:
  #engine: single
  engine: cluster
```
Then add a distributed launch configuration file, `backend.yaml`. Its configuration rules are described in the [Distributed training](doc/distributed_train.md) tutorial. Finally, start paddlerec training:

Review comment (Member): If this yaml usage is the one we mainly promote, it would be best to show it in the README.


```bash
# After the mpi/k8s/paddlecloud cluster environment has been configured
python -m paddlerec.run -m ./models/rank/dnn/config.yaml -b backend.yaml
```
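
The launch commands in this README differ only in the value passed to `-m` and in the optional `-b` flag. A minimal sketch that assembles them is given here. `build_run_command` is a hypothetical helper written for illustration; only the `python -m paddlerec.run`, `-m`, and `-b` pieces are taken from the README itself.

```python
import sys


def build_run_command(model, backend=None, python=sys.executable):
    """Build the argument list for `python -m paddlerec.run`.

    model   -- a built-in model name or a path to a config.yaml
    backend -- optional path to backend.yaml (cluster training only)
    python  -- interpreter to use; defaults to the current one
    """
    cmd = [python, "-m", "paddlerec.run", "-m", model]
    if backend is not None:
        cmd += ["-b", backend]
    return cmd


if __name__ == "__main__":
    # Built-in model with its default configuration.
    print(build_run_command("paddlerec.models.rank.dnn"))
    # Custom configuration submitted to a cluster.
    print(build_run_command("./models/rank/dnn/config.yaml", backend="backend.yaml"))
```

Once PaddleRec is installed, the returned list can be passed to `subprocess.run(cmd, check=True)`.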


<h2 align="center">Supported models</h2>


| Category | Model | Single-machine CPU training | Single-machine GPU training | Distributed CPU training |
| :------: | :---: | :---: | :---: | :---: |
| Content understanding | [Text-Classification](models/contentunderstanding/classification/model.py) | ✓ | x | ✓ |
| Content understanding | [TagSpace](models/contentunderstanding/tagspace/model.py) | ✓ | x | ✓ |
| Recall | [DSSM](models/match/dssm/model.py) | ✓ | x | ✓ |
| Recall | [MultiView-Simnet](models/match/multiview-simnet/model.py) | ✓ | x | ✓ |
| Recall | [TDM](models/treebased/tdm/model.py) | ✓ | x | ✓ |
| Recall | [Word2Vec](models/recall/word2vec/model.py) | ✓ | x | ✓ |
| Recall | [SSR](models/recall/ssr/model.py) | ✓ | ✓ | ✓ |
| Recall | [Gru4Rec](models/recall/gru4rec/model.py) | ✓ | ✓ | ✓ |
| Recall | [Youtube_dnn](models/recall/youtube_dnn/model.py) | ✓ | ✓ | ✓ |
| Recall | [NCF](models/recall/ncf/model.py) | ✓ | ✓ | ✓ |
| Rank | [Dnn](models/rank/dnn/model.py) | ✓ | x | ✓ |
| Rank | [DeepFM](models/rank/deepfm/model.py) | ✓ | x | ✓ |
| Rank | [xDeepFM](models/rank/xdeepfm/model.py) | ✓ | x | ✓ |
| Rank | [DIN](models/rank/din/model.py) | ✓ | x | ✓ |
| Rank | [Wide&Deep](models/rank/wide_deep/model.py) | ✓ | x | ✓ |
| Multi-task | [ESMM](models/multitask/esmm/model.py) | ✓ | ✓ | ✓ |
| Multi-task | [MMOE](models/multitask/mmoe/model.py) | ✓ | ✓ | ✓ |
| Multi-task | [ShareBottom](models/multitask/share-bottom/model.py) | ✓ | ✓ | ✓ |




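<h2 align="center">Documentation</h2>

### Background
* [Introduction to recommender systems](doc/rec_background.md)
* [Introduction to distributed deep learning](doc/ps_background.md)

### Beginner tutorials
* [Requirements](#环境要求)
* [Installation commands](#安装命令)
* [Quick start](#启动内置模型的默认配置)

### Advanced tutorials
* [Custom datasets and Reader](doc/custom_dataset_reader.md)
* [Distributed training](doc/distributed_train.md)

### Developer tutorials
* [PaddleRec design document](doc/design.md)

### PaddleRec performance
* [Benchmark](doc/benchmark.md)

### FAQ
* [Frequently asked questions](doc/faq.md)


<h2 align="center">Community</h2>

### Feedback
Comments, suggestions, and bug reports are welcome via `GitHub Issue`.

### Release history
- 2020.5.14 - PaddleRec v0.1

### License
This project is released under the [Apache 2.0 license](LICENSE).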

13 changes: 13 additions & 0 deletions __init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Empty file added core/__init__.py
Empty file.
Empty file added core/engine/__init__.py
Empty file.
Empty file added core/engine/cluster/__init__.py
Empty file.
Empty file.
95 changes: 95 additions & 0 deletions core/engine/cluster/cloud/cluster.sh
@@ -0,0 +1,95 @@
#!/bin/bash
Review comment (Member): license

# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


###################################################
# Usage: submit.sh
# Description: run mpi submit client implement
###################################################

# ---------------------------------------------------------------------------- #
# variable define #
# ---------------------------------------------------------------------------- #

#-----------------------------------------------------------------------------------------------------------------
#fun : package
#param : N/A
#return : 0 -- success; not 0 -- failure
#-----------------------------------------------------------------------------------------------------------------
function package_hook() {
    g_run_stage="package"
    package
}

#-----------------------------------------------------------------------------------------------------------------
#fun : before hook submit to cluster
#param : N/A
#return : 0 -- success; not 0 -- failure
#-----------------------------------------------------------------------------------------------------------------
function _before_submit() {
    echo "before_submit"
    before_submit_hook
}

#-----------------------------------------------------------------------------------------------------------------
#fun : after hook submit to cluster
#param : N/A
#return : 0 -- success; not 0 -- failure
#-----------------------------------------------------------------------------------------------------------------
function _after_submit() {
    echo "after_submit"
    after_submit_hook
}

#-----------------------------------------------------------------------------------------------------------------
#fun : submit to cluster
#param : N/A
#return : 0 -- success; not 0 -- failure
#-----------------------------------------------------------------------------------------------------------------
function _submit() {
    g_run_stage="submit"

    cd ${engine_temp_path}

    paddlecloud job --ak ${engine_submit_ak} --sk ${engine_submit_sk} train --cluster-name ${engine_submit_cluster} \
        --job-version ${engine_submit_version} \
        --mpi-priority ${engine_submit_priority} \
        --mpi-wall-time 300:59:00 \
        --mpi-nodes ${engine_submit_nodes} --is-standalone 0 \
        --mpi-memory 110Gi \
        --job-name ${engine_submit_jobname} \
        --start-cmd "${g_run_cmd}" \
        --group-name ${engine_submit_group} \
        --job-conf ${engine_submit_config} \
        --files ${g_submitfiles} \
        --json

    cd -
}

function submit_hook() {
    _before_submit
    _submit
    _after_submit
}

function main() {
    source ${engine_submit_scrpit}

    package_hook
    submit_hook
}

main
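
For readers who want to check the submit step outside of bash, the `paddlecloud` invocation in `_submit` can be sketched as a Python argument list. This is an illustration, not part of the PR: `build_paddlecloud_args` and `REQUIRED_SETTINGS` are hypothetical names, the flags and the `engine_submit_*` variable names are copied from the script above, and the up-front check for missing settings is an addition that the script itself does not perform.

```python
# Names of the settings that the shell script reads from its environment.
REQUIRED_SETTINGS = (
    "engine_submit_ak",
    "engine_submit_sk",
    "engine_submit_cluster",
    "engine_submit_version",
    "engine_submit_priority",
    "engine_submit_nodes",
    "engine_submit_jobname",
    "engine_submit_group",
    "engine_submit_config",
)


def build_paddlecloud_args(settings, run_cmd, files):
    """Assemble the argument list that mirrors `_submit` in cluster.sh.

    settings -- mapping holding the engine_submit_* values
    run_cmd  -- start command (g_run_cmd in the script)
    files    -- list of files to upload (g_submitfiles in the script)
    """
    # Assumption: fail early when a setting is empty or absent.
    missing = [name for name in REQUIRED_SETTINGS if not settings.get(name)]
    if missing:
        raise KeyError("missing submit settings: " + ", ".join(missing))

    args = [
        "paddlecloud", "job",
        "--ak", settings["engine_submit_ak"],
        "--sk", settings["engine_submit_sk"],
        "train",
        "--cluster-name", settings["engine_submit_cluster"],
        "--job-version", settings["engine_submit_version"],
        "--mpi-priority", settings["engine_submit_priority"],
        "--mpi-wall-time", "300:59:00",
        "--mpi-nodes", str(settings["engine_submit_nodes"]),
        "--is-standalone", "0",
        "--mpi-memory", "110Gi",
        "--job-name", settings["engine_submit_jobname"],
        "--start-cmd", run_cmd,
        "--group-name", settings["engine_submit_group"],
        "--job-conf", settings["engine_submit_config"],
        "--files",
    ]
    args.extend(files)  # the script passes the file list unquoted, one word per file
    args.append("--json")
    return args


if __name__ == "__main__":
    # Placeholder values only; real values come from the user's backend settings.
    demo = {name: "placeholder" for name in REQUIRED_SETTINGS}
    print(build_paddlecloud_args(demo, "python run.py", ["a.yaml", "b.py"]))
```

Printing the list before submitting makes it easy to see which setting is empty, which is harder to spot when an unquoted shell variable silently expands to nothing.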
60 changes: 60 additions & 0 deletions core/engine/cluster/cluster.py
@@ -0,0 +1,60 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from __future__ import print_function
from __future__ import unicode_literals

import copy

Review comment (Member): Please sort the imports alphabetically.

Reply (Collaborator Author): done

import os
import subprocess

from paddlerec.core.engine.engine import Engine
from paddlerec.core.factory import TrainerFactory
from paddlerec.core.utils import envs


class ClusterEngine(Engine):
    def __init_impl__(self):
        abs_dir = os.path.dirname(os.path.abspath(__file__))
        backend = envs.get_runtime_environ("engine_backend")
        if backend == "PaddleCloud":
            self.submit_script = os.path.join(abs_dir, "cloud/cluster.sh")
        else:
            raise ValueError("{} can not be supported now".format(backend))

    def start_worker_procs(self):
        trainer = TrainerFactory.create(self.trainer)
        trainer.run()

    def start_master_procs(self):
        default_env = os.environ.copy()
        current_env = copy.copy(default_env)
        current_env.pop("http_proxy", None)
        current_env.pop("https_proxy", None)

        cmd = ("bash {}".format(self.submit_script)).split(" ")
        proc = subprocess.Popen(cmd, env=current_env, cwd=os.getcwd())
        proc.wait()

    def run(self):
        role = envs.get_runtime_environ("engine_role")

        if role == "MASTER":
            self.start_master_procs()

        elif role == "WORKER":
            self.start_worker_procs()

        else:
            raise ValueError("role {} error, must in MASTER/WORKER".format(role))
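
Two details of `start_master_procs` can be isolated in a small standalone sketch: removing the proxy variables from the child environment, and building the command as a list so that a script path containing spaces is not split into several arguments. `scrubbed_env` and `submit_command` are hypothetical helpers for illustration. The list form is a suggested variation on the string `format` and `split` call in the diff, not what the PR does.

```python
import os


def scrubbed_env(environ=None, drop=("http_proxy", "https_proxy")):
    """Return a copy of the environment without the given variables."""
    env = dict(os.environ if environ is None else environ)
    for name in drop:
        env.pop(name, None)  # ignore variables that are not set
    return env


def submit_command(script_path):
    """Build the bash command as a list so the path stays one argument."""
    return ["bash", script_path]


if __name__ == "__main__":
    env = scrubbed_env({"http_proxy": "http://proxy:8080", "PATH": "/usr/bin"})
    print(sorted(env))
    print(submit_command("/tmp/my dir/cluster.sh"))
```

Both values can be handed to `subprocess.Popen(cmd, env=env, cwd=os.getcwd())` in the same way as in the diff.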