PaddlePaddle · jinghu23 · May 12, 2021 · May 13, 2021
diff --git a/apps/drug_drug_synergy/README.md b/apps/drug_drug_synergy/README.md
@@ -1,58 +1,7 @@
-# DDs(Drug Drug synergy)
+# drug_drug_synergy
 
 [中文版本](./README_cn.md) [English Version](./README.md)
 
-* [Background](#background)
-* [Datasets](#datasets)
-    * [ddi](#ddi)
-    * [dti](#dti)
-    * [ppi](#ppi)
-    * [features](#features)
-* [Instructions](#instructions)
-    * [Training and Evaluation](#train-and-evaluation)
-* [Reference](#reference)
+We provide the following drug-drug synergy prediction models.
 
-## Background
-
-Drug combinations, also known as combinatorial therapy, are frequently prescribed to treat patients with complex disease. Graph convolutional network(GCN) can be used to predict drug-drug synergy by intergrating multiple biological networks.
-
-## Datasets
-Drug-drug synergy information and drug physi-chemical features can be put under `data` folder. Then let us create `ddi`, `dti` and `ppi` folder under `data` folder.
-### ddi
-
-```sh
-cd data/ddi && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
-```
-
-### dti
-
-### ppi 
-
-
-## Instructions
-For illustration, we provide a python script `train.py`.
-Its usage is:
-```
-CUDA_VISIBLE_DEVICES=0 python3 train.py 
-                         --ddi ./data/ddi/DDs.csv
-                         --dti ./data/dti/drug_protein_links.tsv
-                         --ppi ./data/ppi/protein_protein_links.txt
-                         --d_feat ./data/all_drugs_name.fet
-                         --epochs 10
-                         --num_graph 10
-                         --sub_neighbours 10 10
-                         --cuda   
-```
-Notice that if you only have CPU machine, just remove `--cuda`. 
-
-## Reference
-**RGCN**
-> @article{jiang2020deep,
-  title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
-  author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
-  journal={Computational and structural biotechnology journal},
-  volume={18},
-  pages={427--438},
-  year={2020},
-  publisher={Elsevier}
-}
+* [RGCN](./RGCN/README.md)
diff --git a/apps/drug_drug_synergy/README_cn.md b/apps/drug_drug_synergy/README_cn.md
@@ -1,51 +1,7 @@
-# DDs(Drug Drug synergy)
+# 药物协同预测模型
 
 [中文版本](./README_cn.md) [English Version](./README.md)
 
-* [背景介绍](#背景介绍)
-* [数据集](#数据集)
-    * [ddi](#ddi)
-    * [dti](#dti)
-    * [ppi](#ppi)
-    * [特征集](#特征集)
-* [使用说明](#使用说明)
-    * [训练与评估](#训练与评估)
-* [引用](#引用)
+我们提供以下双药联用协同性预测模型。
 
-## 背景
-药物联用，也叫做协同治疗，通常在应对复杂疾病时使用。而图神经网络能够结合多种生物学网络从而来预测药物的协同作用。
-## 数据集
-药物协同的分值文件和药物的理化特征信息文件在 `data` 文件夹下. 首先在`data` 文件夹下创建`ddi`, `dti`和`ppi`文件夹。
-### ddi
-```sh
-cd data/ddi && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
-```
-### dti
-### ppi
-
-## 使用说明
-为了方便展示，我们构建了一个脚本， `train.py`.
-用法如下:
-```
-CUDA_VISIBLE_DEVICES=0 python3 train.py 
-                         --ddi ./data/ddi/DDs.csv
-                         --dti ./data/dti/drug_protein_links.tsv
-                         --ppi ./data/ppi/protein_protein_links.txt
-                         --d_feat ./data/all_drugs_name.fet
-                         --epochs 10
-                         --num_graph 10
-                         --sub_neighbours 10 10
-                         --cuda   
-```
-请注意，如果训练环境没有GPU，去掉`--cuda`即可。 
-## 引用
-**RGCN**
-> @article{jiang2020deep,
-  title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
-  author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
-  journal={Computational and structural biotechnology journal},
-  volume={18},
-  pages={427--438},
-  year={2020},
-  publisher={Elsevier}
-}
+* [RGCN](./RGCN/README_cn.md)
diff --git a/apps/drug_drug_synergy/RGCN/README.md b/apps/drug_drug_synergy/RGCN/README.md
@@ -0,0 +1,68 @@
+# DDs(Drug Drug synergy)
+
+[中文版本](./README_cn.md) [English Version](./README.md)
+
+* [Background](#background)
+* [Datasets](#datasets)
+    * [ddi](#ddi)
+    * [dti](#dti)
+    * [ppi](#ppi)
+    * [drug features](#drug-features)
+* [Instructions](#instructions)
+    * [Training and Evaluation](#train-and-evaluation)
+* [Reference](#reference)
+
+## Background
+
+Drug combinations, also known as combinatorial therapy, are frequently prescribed to treat patients with complex diseases. Graph convolutional networks (GCNs) can be used to predict drug-drug synergy by integrating multiple biological networks.
+
+## Datasets
+Drug-drug synergy information and drug physicochemical features can be put under the `data` folder.
+### ddi
+
+```sh
+cd data && mkdir -p DDI && cd DDI && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
+```
+
+### dti
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/dti.tgz" && tar xzvf dti.tgz
+```
+
+### ppi 
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/ppi.tgz" && tar xzvf ppi.tgz
+```
+
+### drug features
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/drug_feat.tgz" && tar xzvf drug_feat.tgz
+```
+
+## Instructions
+For illustration, we provide a Python script, `train.py`.
+Its usage is:
+```
+CUDA_VISIBLE_DEVICES=0 python3 train.py \
+                         --ddi ./data/DDI/DDs.csv \
+                         --dti ./data/DTI/drug_protein_links.tsv \
+                         --ppi ./data/PPI/protein_protein_links.txt \
+                         --d_feat ./data/all_drugs_name.fet \
+                         --epochs 10 \
+                         --num_graph 10 \
+                         --sub_neighbours 10 10 \
+                         --cuda
+```
+Note that if you only have a CPU machine, just remove `--cuda`.
+
+## Reference
+**RGCN**
+> @article{jiang2020deep,
+  title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
+  author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
+  journal={Computational and structural biotechnology journal},
+  volume={18},
+  pages={427--438},
+  year={2020},
+  publisher={Elsevier}
+}
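Review note: the new README downloads four inputs and then passes their paths to `train.py`. A small pre-flight check could catch a missing or misplaced file before training starts. The sketch below is only an illustration: the relative paths are copied from the README command, while `EXPECTED_FILES` and `missing_inputs` are hypothetical names that do not exist in the repository.

```python
from pathlib import Path

# Relative paths taken from the train.py command in the README.
# The constant and the helper below are hypothetical, not part of the repo.
EXPECTED_FILES = (
    "DDI/DDs.csv",
    "DTI/drug_protein_links.tsv",
    "PPI/protein_protein_links.txt",
    "all_drugs_name.fet",
)


def missing_inputs(data_dir="data", expected=EXPECTED_FILES):
    """Return the expected input files that are not present under data_dir."""
    root = Path(data_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```

Running it from the directory that contains `data` lists whichever inputs still need to be downloaded or moved.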
diff --git a/apps/drug_drug_synergy/RGCN/README_cn.md b/apps/drug_drug_synergy/RGCN/README_cn.md
@@ -0,0 +1,63 @@
+# DDs(Drug Drug synergy)
+
+[中文版本](./README_cn.md) [English Version](./README.md)
+
+* [背景介绍](#背景介绍)
+* [数据集](#数据集)
+    * [ddi](#ddi)
+    * [dti](#dti)
+    * [ppi](#ppi)
+    * [特征集](#特征集)
+* [使用说明](#使用说明)
+    * [训练与评估](#训练与评估)
+* [引用](#引用)
+
+## 背景
+药物联用，也叫做协同治疗，通常在应对复杂疾病时使用。而图神经网络能够结合多种生物学网络从而来预测药物的协同作用。
+## 数据集
+药物协同的分值文件和药物的理化特征信息文件在 `data` 文件夹下。
+### ddi
+```sh
+cd data && mkdir -p DDI && cd DDI && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
+```
+### dti
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/dti.tgz" && tar xzvf dti.tgz
+```
+
+### ppi 
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/ppi.tgz" && tar xzvf ppi.tgz
+```
+
+### drug features
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/drug_feat.tgz" && tar xzvf drug_feat.tgz
+```
+
+## 使用说明
+为了方便展示，我们构建了一个脚本， `train.py`.
+用法如下:
+```
+CUDA_VISIBLE_DEVICES=0 python3 train.py \
+                         --ddi ./data/DDI/DDs.csv \
+                         --dti ./data/DTI/drug_protein_links.tsv \
+                         --ppi ./data/PPI/protein_protein_links.txt \
+                         --d_feat ./data/all_drugs_name.fet \
+                         --epochs 10 \
+                         --num_graph 10 \
+                         --sub_neighbours 10 10 \
+                         --cuda
+```
+请注意，如果训练环境没有GPU，去掉`--cuda`即可。 
+## 引用
+**RGCN**
+> @article{jiang2020deep,
+  title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
+  author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
+  journal={Computational and structural biotechnology journal},
+  volume={18},
+  pages={427--438},
+  year={2020},
+  publisher={Elsevier}
+}
diff --git a/apps/drug_drug_synergy/R_model.py → apps/drug_drug_synergy/RGCN/R_model.py b/apps/drug_drug_synergy/R_model.py → apps/drug_drug_synergy/RGCN/R_model.py
@@ -32,7 +32,7 @@
 import pandas as pd
 import numpy as np
 
-def Decagon_norm(graph, feature, edges):
+def decagon_norm(graph, feature, edges):
     """
     Relation Graph Neural Network degree normalization method
     """
@@ -96,6 +96,7 @@ def __init__(self, in_dim, out_dim, etypes, num_bases=0, act='relu', norm=True):
             )
         self.act = act
         self.norm = norm
+
     def forward(self, graph, feat):
         """Forward
         Args:
@@ -221,4 +222,4 @@ def negative_Sampling(label):
     val_neg_idx = np.random.choice(len(neg_pos[0]), num_neg)
     valid[(neg_pos[0][val_neg_idx], neg_pos[1][val_neg_idx])] = -1
 
-    return paddle.to_tensor(valid.astype('float32'))
+    return valid.astype('float32')
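Review note: with this change `negative_Sampling` returns a NumPy `float32` array instead of a Paddle tensor, so any caller that needs a tensor has to wrap the result itself (for example with `paddle.to_tensor`, the call the old line used). Only the last lines of the function are visible in this hunk, so the following is a guess at the general idea rather than the author's implementation: build a mask that marks positives with `1` and an equally sized random sample of negatives with `-1`. The function name, the `rng` parameter, and sampling without replacement are all assumptions.

```python
import numpy as np


def sample_negative_mask(label, rng=None):
    """Hypothetical sketch: 1 marks positives, -1 marks sampled negatives, 0 is ignored."""
    rng = np.random.default_rng() if rng is None else rng
    label = np.asarray(label)
    mask = np.zeros(label.shape, dtype=np.float32)

    pos = np.nonzero(label > 0)   # index arrays of positive entries
    neg = np.nonzero(label == 0)  # index arrays of candidate negatives
    mask[pos] = 1.0

    # Sample as many negatives as there are positives (assumption),
    # without replacement so every sampled entry is distinct.
    num_neg = min(len(pos[0]), len(neg[0]))
    idx = rng.choice(len(neg[0]), size=num_neg, replace=False)
    mask[(neg[0][idx], neg[1][idx])] = -1.0
    return mask
```

Returning a plain array keeps the sampling step independent of the deep learning framework; the conversion to a tensor can then happen once, at the call site.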
diff --git a/apps/drug_drug_synergy/graphsage_sampling.py → ...g_drug_synergy/RGCN/graphsage_sampling.py b/apps/drug_drug_synergy/graphsage_sampling.py → ...g_drug_synergy/RGCN/graphsage_sampling.py
@@ -53,6 +53,7 @@ def graphsage_sampling(hg, start_nodes, num_neighbours=10, etype='dti'):
 
     return qualified_neighs, qualified_eids
 
+
 def subgraph_gen(hg, label_idx, neighbours=[10, 10]):
     """
     Subgraph sampling by graphsage_sampling
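Review note: for readers unfamiliar with GraphSAGE-style sampling, the idea behind `graphsage_sampling` (which returns `qualified_neighs, qualified_eids`) is to keep at most a fixed number of neighbours per start node. The sketch below shows that idea on a plain edge list with NumPy. It does not use the heterogeneous graph object `hg` or the `etype` argument from the real function, and `sample_neighbours` and its parameters are hypothetical.

```python
import numpy as np


def sample_neighbours(edges, start_nodes, num_neighbours=10, rng=None):
    """Hypothetical sketch: keep at most num_neighbours outgoing edges per start node.

    edges is an (E, 2) array-like of (source, destination) pairs.
    Returns the sampled neighbour ids and the ids of the edges that reach them.
    """
    rng = np.random.default_rng() if rng is None else rng
    edges = np.asarray(edges)
    neighs, eids = [], []
    for node in start_nodes:
        # Ids of all edges whose source is the current node.
        candidate = np.flatnonzero(edges[:, 0] == node)
        if len(candidate) > num_neighbours:
            candidate = rng.choice(candidate, size=num_neighbours, replace=False)
        neighs.extend(edges[candidate, 1].tolist())
        eids.extend(candidate.tolist())
    return neighs, eids
```

Calling it once per layer, with the neighbours of one call as the start nodes of the next, gives the two-hop `[10, 10]` pattern that the `neighbours` default suggests.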

diff --git a/apps/drug_drug_synergy/train.py → apps/drug_drug_synergy/RGCN/train.py b/apps/drug_drug_synergy/train.py → apps/drug_drug_synergy/RGCN/train.py
@@ -41,8 +41,7 @@
 from pahelix.featurizers import het_gnn_featurizer
 
 
-
-def Train(num_subgraph, graph, label_idx, epochs, sub_neighbours=[10, 10], init=True):
+def train(num_subgraph, graph, label_idx, epochs, sub_neighbours=[10, 10], init=True):
     """
     Model training for one epoch and return training loss and validation loss.
     """
@@ -105,6 +104,7 @@ def eval(model, graph, label, sub_neighbours, criterion):
     loss = criterion(pred, label)
     return pred, loss
 
+
 def train_val_plot(training_loss, val_loss, figure_name='loss_figure.pdf'):
     """
     Plot the training loss figure.
@@ -115,6 +115,7 @@ def train_val_plot(training_loss, val_loss, figure_name='loss_figure.pdf'):
     axx.legend(['training loss', 'val loss']) 
     fig.savefig(figure_name)
 
+
 def main(ddi, dti, ppi, d_feat, epochs=10, num_subgraph=20, sub_neighbours=[10, 10], cuda=False):
     """
     Args:
@@ -141,7 +142,7 @@ def main(ddi, dti, ppi, d_feat, epochs=10, num_subgraph=20, sub_neighbours=[10,
     value = drug_feat.collate_fn(ddi, dti, ppi, d_feat)
     hg, nodes_dict, label, label_idx = value['rt'] 
 
-    trained_model = Train(num_subgraph, hg, label_idx, epochs, [25, 25])
+    trained_model = train(num_subgraph, hg, label_idx, epochs, sub_neighbours)
 
     return trained_model
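Review note: the hard-coded `[25, 25]` is replaced by the value supplied on the command line, so it may help to see how `--sub_neighbours 10 10` can be parsed into a list and handed to `main` through its own parameter. The argument parsing in `train.py` is not part of this diff; the parser below is a hypothetical reconstruction from the flags in the README, and `main` here is a stand-in that only echoes its settings.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the README command."""
    parser = argparse.ArgumentParser(description="RGCN drug-drug synergy training (sketch)")
    for name in ("ddi", "dti", "ppi", "d_feat"):
        parser.add_argument(f"--{name}", required=True, help=f"path to the {name} input file")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--num_graph", type=int, default=20)
    # nargs="+" collects `--sub_neighbours 10 10` into the list [10, 10].
    parser.add_argument("--sub_neighbours", type=int, nargs="+", default=[10, 10])
    parser.add_argument("--cuda", action="store_true")
    return parser


def main(ddi, dti, ppi, d_feat, epochs=10, num_subgraph=20, sub_neighbours=(10, 10), cuda=False):
    """Stand-in for the real main(): it only returns the resolved settings."""
    return {
        "inputs": (ddi, dti, ppi, d_feat),
        "epochs": epochs,
        "num_subgraph": num_subgraph,
        "sub_neighbours": list(sub_neighbours),
        "cuda": cuda,
    }


def run(argv=None):
    """Parse argv and forward every value explicitly, so main() never reads a global."""
    args = build_parser().parse_args(argv)
    return main(args.ddi, args.dti, args.ppi, args.d_feat,
                epochs=args.epochs, num_subgraph=args.num_graph,
                sub_neighbours=args.sub_neighbours, cuda=args.cuda)
```

Passing the values as arguments keeps `main` importable and testable without a command line, and the tuple default avoids sharing one mutable list between calls.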
 

diff --git a/pahelix/datasets/ddi_dataset.py b/pahelix/datasets/ddi_dataset.py
@@ -28,6 +28,8 @@
 def get_default_ddi_task_names():
     """Get that default ddi task names and return class label"""
     return ['drug_a_name', 'drug_b_name', 'cell_line', 'synergy']
+
+
 def load_ddi_dataset(data_path, task_names=None, cellline=None, featurizer=None):
     """Load ddi dataset,process the input information and the featurizer.
     Description:

diff --git a/pahelix/datasets/dti_dataset.py b/pahelix/datasets/dti_dataset.py
@@ -29,6 +29,7 @@ def get_default_dti_task_names():
     """Get that default dti task names"""
     return ['chemical', 'protein']
 
+
 def load_dti_dataset(data_path, task_names=None, featurizer=None):
     """Load dti dataset,process the input information and the featurizer.
     Description:

diff --git a/pahelix/datasets/ppi_dataset.py b/pahelix/datasets/ppi_dataset.py
@@ -27,7 +27,9 @@
 __all__ = ['get_default_ppi_task_names', 'load_ppi_dataset']
 def get_default_ppi_task_names():
     """Get that default ppi task names"""
-    return ['protein1', 'protein2']
+    return ['protein1', 'protein2'] 
+
+
 def load_ppi_dataset(data_path, task_names=None, featurizer=None):
     """Load ppi dataset,process the input information and the featurizer.
     Description: