
Commit f9fa2b9

fix link wrong
1 parent ab0d859 commit f9fa2b9

File tree: 5 files changed, +38 -22 lines changed


README.md

Lines changed: 2 additions & 2 deletions
@@ -58,7 +58,7 @@
| Rank | [NFM](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/nfm/) || x || x | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [SIGIR 2017][Neural Factorization Machines for Sparse Predictive Analytics](https://dl.acm.org/doi/pdf/10.1145/3077136.3080777) |
| Rank | [AFM](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/afm/) || x || x | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [IJCAI 2017][Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks](https://arxiv.org/pdf/1708.04617.pdf) |
| Rank | [DeepFM](models/rank/deepfm/) || x || x | 2.0 | [IJCAI 2017][DeepFM: A Factorization-Machine based Neural Network for CTR Prediction](https://arxiv.org/pdf/1703.04247.pdf) |
- | Rank | [xDeepFM](models/rank/xdeepfm/) || x || x | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [KDD 2018][xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems](https://dl.acm.org/doi/pdf/10.1145/3219819.3220023) |
+ | Rank | [xDeepFM](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/xdeepfm) || x || x | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [KDD 2018][xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems](https://dl.acm.org/doi/pdf/10.1145/3219819.3220023) |
| Rank | [DIN](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/din/) || x || x | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [KDD 2018][Deep Interest Network for Click-Through Rate Prediction](https://dl.acm.org/doi/pdf/10.1145/3219819.3219823) |
| Rank | [DIEN](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/dien/) || x || x | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [AAAI 2019][Deep Interest Evolution Network for Click-Through Rate Prediction](https://www.aaai.org/ojs/index.php/AAAI/article/view/4545/4423) |
| Rank | [BST](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/BST/) || x || x | [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [DLP_KDD 2019][Behavior Sequence Transformer for E-commerce Recommendation in Alibaba](https://arxiv.org/pdf/1905.06874v1.pdf) |
@@ -67,7 +67,7 @@
| Rank | [FGCNN](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/fgcnn/) ||||| [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [WWW 2019][Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction](https://arxiv.org/pdf/1904.04447.pdf) |
| Rank | [Fibinet](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/fibinet/) ||||| [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [RecSys19][FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction]( https://arxiv.org/pdf/1905.09433.pdf) |
| Rank | [Flen](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/rank/flen/) ||||| [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [2019][FLEN: Leveraging Field for Scalable CTR Prediction]( https://arxiv.org/pdf/1911.04690.pdf) |
- | Multi-task | [PLE](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/multitask/ple/) ||||| [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236) |
+ | Multi-task | PLE ||||| 1.8.5 | [RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236) |
| Multi-task | [ESMM](models/multitask/esmm/) ||||| 2.0 | [SIGIR 2018][Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate](https://arxiv.org/abs/1804.07931) |
| Multi-task | [MMOE](models/multitask/mmoe/) ||||| 2.0 | [KDD 2018][Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts](https://dl.acm.org/doi/abs/10.1145/3219819.3220007) |
| Multi-task | [ShareBottom](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5/models/multitask/share-bottom/) ||||| [1.8.5](https://github.com/PaddlePaddle/PaddleRec/tree/release/1.8.5) | [1998][Multitask learning](http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-203.pdf) |

datasets/readme.md

Lines changed: 2 additions & 3 deletions
@@ -17,12 +17,11 @@ sh data_process.sh
| Dataset | Description | Reference |
| :----------------------------------------------: | :------------------------------------------------------------------------------------------: | :-------------------------------: |
|[ag_news](https://paddle-tagspace.bj.bcebos.com/data.tar)|496,835 news articles from more than 2,000 news sources in the 4 major categories of the AG news corpus; the dataset uses only the title and description fields. Each category has 30,000 training samples and 1,900 test samples.|[ComeToMyHead](http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html)|
- |[Ali-CCP:Alibaba Click and Conversion Prediction]( https://tianchi.aliyun.com/datalab/dataSet.html?dataId=408 )|A dataset collected from real traffic logs of the Taobao recommender system.|[SIGIR(2018)]( https://tianchi.aliyun.com/datalab/dataSet.html?dataId=408)|
+ |[Ali-CCP:Alibaba Click and Conversion Prediction](https://tianchi.aliyun.com/datalab/dataSet.html?dataId=408)|A dataset collected from real traffic logs of the Taobao recommender system.|[SIGIR(2018)]( https://tianchi.aliyun.com/datalab/dataSet.html?dataId=408)|
|[BQ](https://paddlerec.bj.bcebos.com/dssm%2Fbq.tar.gz)|BQ is a Chinese question-matching dataset for intelligent customer service, built from automatic question-answering corpora. It contains 120,000 sentence pairs annotated with similarity scores. The data contains typos and non-standard grammar, but is closer to industrial scenarios.|--|
|[Census-income Data](https://archive.ics.uci.edu/ml/machine-learning-databases/census-income-mld/census.tar.gz )|Weighted census data extracted from the 1994 and 1995 Current Population Surveys conducted by the U.S. Census Bureau, containing demographic and employment-related variables.|[Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid](http://robotics.stanford.edu/~ronnyk/nbtree.pdf)|
|[Criteo](https://fleet.bj.bcebos.com/ctr_data.tar.gz)|The dataset has two parts: a training set and a test set. The training set contains a portion of Criteo traffic over a period of time, and the test set corresponds to the ad click traffic on the day after the training data.|[kaggle](https://www.kaggle.com/c/criteo-display-ad-challenge/)|
|[letor07](https://paddlerec.bj.bcebos.com/match_pyramid/match_pyramid_data.tar.gz)|LETOR is a benchmark collection for learning-to-rank research, containing standard features, relevance judgments, data partitions, evaluation tools, and several baselines.|[LETOR: Learning to Rank for Information Retrieval](https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval/?from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fum%2Fbeijing%2Fprojects%2Fletor%2F)|
|[senti_clas](https://baidu-nlp.bj.bcebos.com/sentiment_classification-dataset-1.0.0.tar.gz)|Sentiment classification (Senta) automatically determines the sentiment polarity (positive or negative) of Chinese text containing subjective descriptions and gives a confidence score. It helps businesses understand consumer habits, analyze trending topics, and monitor public opinion during crises, providing useful decision support.|--|
|[one_billion](http://www.statmt.org/lm-benchmark/)|A one-billion-word benchmark that provides standard training and test sets for language-modeling experiments.|[One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling](https://arxiv.org/abs/1312.3005)|
- |[MIND](https://paddlerec.bj.bcebos.com/datasets/MIND/bigdata.zip)|MIND (MIcrosoft News Dataset) is built from the behavior logs of Microsoft News users.
- The dataset contains 1,000,000 users and their interactions with 160,000 news articles.|[Microsoft(2020)](https://msnews.github.io)|
+ |[MIND](https://paddlerec.bj.bcebos.com/datasets/MIND/bigdata.zip)|MIND (MIcrosoft News Dataset) is built from the behavior logs of Microsoft News users. The dataset contains 1,000,000 users and their interactions with 160,000 news articles.|[Microsoft(2020)](https://msnews.github.io)|

models/rank/readme.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
# Ranking Model Library

## Introduction
- We provide PaddleRec implementations of the model algorithms commonly used in ranking tasks, including single-machine training & prediction quality metrics for both dynamic and static graphs. The implemented ranking models include [logistic regression](logistic_regression), [multilayer neural network](dnn), [FM](fm), [gatednn](gatednn), [DeepFM](deepfm), [Wide&Deep](wide_deep), and [naml](naml).
+ We provide PaddleRec implementations of the model algorithms commonly used in ranking tasks, including single-machine training & prediction quality metrics for both dynamic and static graphs. The implemented ranking models include [logistic regression](logistic_regression), [multilayer neural network](dnn), [FM](fm), [gateDnn](gateDnn), [DeepFM](deepfm), [Wide&Deep](wide_deep), and [naml](naml).

More model algorithms are being added continuously; stay tuned.

models/recall/readme.md

Lines changed: 6 additions & 0 deletions
@@ -27,6 +27,8 @@
<img align="center" src="../../doc/imgs/word2vec.png">
<p>

+ ## Usage Tutorial
+
### Quick Start
```bash
# Enter the model directory
@@ -47,3 +49,7 @@ python -u ../../../tools/static_infer.py -m config.yaml
| Dataset | Model | acc |
| :------------------: | :--------------------: | :---------: |
| 1 Billion Word Language Model Benchmark | Word2Vec | 0.579 |
+
+ ### Reproducing Results
+ You need to go into the corresponding dataset directory under PaddleRec/datasets and run its script to obtain the full dataset, then run the model in its directory with the full-data configuration.
+ Each model's readme contains a detailed tutorial on reproducing its results; see the model's directory for details.
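For concreteness, a minimal sketch of this reproduction flow for the Word2Vec recall model might look like the commands below. The dataset folder name (one_billion), the download script name (run.sh), and the model directory (models/recall/word2vec) are assumptions for illustration; check the dataset's and model's own readmes for the exact names.

```bash
# Hypothetical reproduction sketch -- folder and script names are assumptions.
cd PaddleRec/datasets/one_billion   # assumed dataset directory
sh run.sh                           # assumed script that fetches the full dataset

cd ../../models/recall/word2vec     # assumed model directory
# Train and evaluate with the full-data configuration (dynamic graph).
python -u ../../../tools/trainer.py -m config_bigdata.yaml
python -u ../../../tools/infer.py -m config_bigdata.yaml
```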

models/rerank/readme.md

Lines changed: 27 additions & 16 deletions
@@ -1,39 +1,50 @@
# Re-ranking Model Library

## Introduction
- We provide PaddleRec implementations of the model algorithms commonly used for re-ranking, with single-machine training & prediction quality metrics as well as distributed training & prediction performance metrics. The model currently implemented is [Listwise](listwise).
+ We provide PaddleRec implementations of the model algorithms commonly used for re-ranking, with single-machine training & prediction quality metrics as well as distributed training & prediction performance metrics. The models currently implemented are.

More model algorithms are being added continuously; stay tuned.

## Contents
* [Overview](#整体介绍)
- * [Re-ranking Model List](#重排序模型列表)
+ * [Model List](#模型列表)
* [Usage Tutorial](#使用教程)
+ * [Quick Start](#快速开始)
+ * [Model Performance](#模型效果)
+ * [Reproducing Results](#效果复现)

## Overview
- ### Fusion Model List
+ ### Model List

| Model | Description | Paper |
| :------------------: | :--------------------: | :---------: |
- | Listwise | Listwise | [2019][Sequential Evaluation and Generation Framework for Combinatorial Recommender System](https://arxiv.org/pdf/1902.00245.pdf) |

Below is a brief introduction to each model (note: figures are taken from the papers linked).


- [Listwise](https://arxiv.org/pdf/1902.00245.pdf):
- <p align="center">
- <img align="center" src="../../doc/imgs/listwise.png">
- <p>
+ ## Usage Tutorial

+ ### Quick Start
+ ```bash
+ # Enter the model directory
+ cd models/rerank/xxx # xxx is any model directory under rerank
+ # Dynamic-graph training
+ python -u ../../../tools/trainer.py -m config.yaml # run config_bigdata.yaml for the full dataset
+ # Dynamic-graph inference
+ python -u ../../../tools/infer.py -m config.yaml

- ## Usage Tutorial (Quick Start)
- ```shell
- git clone https://github.com/PaddlePaddle/PaddleRec.git paddle-rec
- cd paddle-rec
+ # Static-graph training
+ python -u ../../../tools/static_trainer.py -m config.yaml # run config_bigdata.yaml for the full dataset
+ # Static-graph inference
+ python -u ../../../tools/static_infer.py -m config.yaml
+ ```

- python -m paddlerec.run -m models/rerank/listwise/config.yaml # listwise
- ```
+ ### Model Performance
+
+ | Dataset | Model | acc |
+ | :------------------: | :--------------------: | :---------: |

-

- The original Listwise paper does not provide training data, so we used random data; see Quick Start.
+ ### Reproducing Results
+ You need to go into the corresponding dataset directory under PaddleRec/datasets and run its script to obtain the full dataset, then run the model in its directory with the full-data configuration.
+ Each model's readme contains a detailed tutorial on reproducing its results; see the model's directory for details.
