diff --git a/apps/drug_drug_synergy/README.md b/apps/drug_drug_synergy/README.md
index c773eb77..7a967350 100644
--- a/apps/drug_drug_synergy/README.md
+++ b/apps/drug_drug_synergy/README.md
@@ -1,58 +1,7 @@
-# DDs(Drug Drug synergy)
+# drug_drug_synergy
 
 [中文版本](./README_cn.md) [English Version](./README.md)
 
-* [Background](#background)
-* [Datasets](#datasets)
-    * [ddi](#ddi)
-    * [dti](#dti)
-    * [ppi](#ppi)
-    * [features](#features)
-* [Instructions](#instructions)
-    * [Training and Evaluation](#train-and-evaluation)
-* [Reference](#reference)
+We provide the following pretrained protein methods.
 
-## Background
-
-Drug combinations, also known as combinatorial therapy, are frequently prescribed to treat patients with complex disease. Graph convolutional network(GCN) can be used to predict drug-drug synergy by intergrating multiple biological networks.
-
-## Datasets
-Drug-drug synergy information and drug physi-chemical features can be put under `data` folder. Then let us create `ddi`, `dti` and `ppi` folder under `data` folder.
-### ddi
-
-```sh
-cd data/ddi && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
-```
-
-### dti
-
-### ppi
-
-
-## Instructions
-For illustration, we provide a python script `train.py`.
-Its usage is:
-```
-CUDA_VISIBLE_DEVICES=0 python3 train.py
-    --ddi ./data/ddi/DDs.csv
-    --dti ./data/dti/drug_protein_links.tsv
-    --ppi ./data/ppi/protein_protein_links.txt
-    --d_feat ./data/all_drugs_name.fet
-    --epochs 10
-    --num_graph 10
-    --sub_neighbours 10 10
-    --cuda
-```
-Notice that if you only have CPU machine, just remove `--cuda`.
-
-## Reference
-**RGCN**
-> @article{jiang2020deep,
-  title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
-  author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
-  journal={Computational and structural biotechnology journal},
-  volume={18},
-  pages={427--438},
-  year={2020},
-  publisher={Elsevier}
-}
\ No newline at end of file
+* [RGCN](./RGCN/README.md)
\ No newline at end of file
diff --git a/apps/drug_drug_synergy/README_cn.md b/apps/drug_drug_synergy/README_cn.md
index e0c526de..a3d371ca 100644
--- a/apps/drug_drug_synergy/README_cn.md
+++ b/apps/drug_drug_synergy/README_cn.md
@@ -1,51 +1,7 @@
-# DDs(Drug Drug synergy)
+# 药物协同预测模型
 
 [中文版本](./README_cn.md) [English Version](./README.md)
 
-* [背景介绍](#背景介绍)
-* [数据集](#数据集)
-    * [ddi](#ddi)
-    * [dti](#dti)
-    * [ppi](#ppi)
-    * [特征集](#特征集)
-* [使用说明](#使用说明)
-    * [训练与评估](#训练与评估)
-* [引用](#引用)
+我们提供以下双摇联用协调性预测模型。
 
-## 背景
-药物联用,也叫做协同治疗,通常在应对复杂疾病时使用。而图神经网络能够结合多种生物学网络从而来预测药物的协同作用。
-## 数据集
-药物协同的分值文件和药物的理化特征信息文件在 `data` 文件夹下. 首先在`data` 文件夹下创建`ddi`, `dti`和`ppi`文件夹。
-### ddi
-```sh
-cd data/ddi && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
-```
-### dti
-### ppi
-
-## 使用说明
-为了方便展示,我们构建了一个脚本, `train.py`.
-用法如下:
-```
-CUDA_VISIBLE_DEVICES=0 python3 train.py
-    --ddi ./data/ddi/DDs.csv
-    --dti ./data/dti/drug_protein_links.tsv
-    --ppi ./data/ppi/protein_protein_links.txt
-    --d_feat ./data/all_drugs_name.fet
-    --epochs 10
-    --num_graph 10
-    --sub_neighbours 10 10
-    --cuda
-```
-请注意,如果训练环境没有GPU,去掉`--cuda`即可。
-## 引用
-**RGCN**
-> @article{jiang2020deep,
-  title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
-  author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
-  journal={Computational and structural biotechnology journal},
-  volume={18},
-  pages={427--438},
-  year={2020},
-  publisher={Elsevier}
-}
\ No newline at end of file
+* [RGCN](./RGCN/README_cn.md)
\ No newline at end of file
diff --git a/apps/drug_drug_synergy/RGCN/README.md b/apps/drug_drug_synergy/RGCN/README.md
new file mode 100644
index 00000000..6967a664
--- /dev/null
+++ b/apps/drug_drug_synergy/RGCN/README.md
@@ -0,0 +1,68 @@
+# DDs(Drug Drug synergy)
+
+[中文版本](./README_cn.md) [English Version](./README.md)
+
+* [Background](#background)
+* [Datasets](#datasets)
+    * [ddi](#ddi)
+    * [dti](#dti)
+    * [ppi](#ppi)
+    * [features](#features)
+* [Instructions](#instructions)
+    * [Training and Evaluation](#train-and-evaluation)
+* [Reference](#reference)
+
+## Background
+
+Drug combinations, also known as combinatorial therapy, are frequently prescribed to treat patients with complex disease. Graph convolutional network(GCN) can be used to predict drug-drug synergy by intergrating multiple biological networks.
+
+## Datasets
+Drug-drug synergy information and drug physi-chemical features can be put under `data` folder.
+### ddi
+
+```sh
+cd data && mkdir DDI && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
+```
+
+### dti
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/dti.tgz" && tar xzvf dti.tgz
+```
+
+### ppi
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/ppi.tgz" && tar xzvf ppi.tgz
+```
+
+### drug features
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/drug_feat.tgz" && tar xzvf drug_feat.tgz
+```
+
+## Instructions
+For illustration, we provide a python script `train.py`.
+Its usage is:
+```
+CUDA_VISIBLE_DEVICES=0 python3 train.py
+    --ddi ./data/DDI/DDs.csv
+    --dti ./data/DTI/drug_protein_links.tsv
+    --ppi ./data/PPI/protein_protein_links.txt
+    --d_feat ./data/all_drugs_name.fet
+    --epochs 10
+    --num_graph 10
+    --sub_neighbours 10 10
+    --cuda
+```
+Notice that if you only have CPU machine, just remove `--cuda`.
+
+## Reference
+**RGCN**
+> @article{jiang2020deep,
+  title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
+  author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
+  journal={Computational and structural biotechnology journal},
+  volume={18},
+  pages={427--438},
+  year={2020},
+  publisher={Elsevier}
+}
\ No newline at end of file
diff --git a/apps/drug_drug_synergy/RGCN/README_cn.md b/apps/drug_drug_synergy/RGCN/README_cn.md
new file mode 100644
index 00000000..a40ff673
--- /dev/null
+++ b/apps/drug_drug_synergy/RGCN/README_cn.md
@@ -0,0 +1,63 @@
+# DDs(Drug Drug synergy)
+
+[中文版本](./README_cn.md) [English Version](./README.md)
+
+* [背景介绍](#背景介绍)
+* [数据集](#数据集)
+    * [ddi](#ddi)
+    * [dti](#dti)
+    * [ppi](#ppi)
+    * [特征集](#特征集)
+* [使用说明](#使用说明)
+    * [训练与评估](#训练与评估)
+* [引用](#引用)
+
+## 背景
+药物联用,也叫做协同治疗,通常在应对复杂疾病时使用。而图神经网络能够结合多种生物学网络从而来预测药物的协同作用。
+## 数据集
+药物协同的分值文件和药物的理化特征信息文件在 `data` 文件夹下。
+### ddi
+```sh
+cd data/ddi && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
+```
+### dti
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/dti.tgz" && tar xzvf dti.tgz
+```
+
+### ppi
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/ppi.tgz" && tar xzvf ppi.tgz
+```
+
+### drug features
+```sh
+cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/drug_feat.tgz" && tar xzvf drug_feat.tgz
+```
+
+## 使用说明
+为了方便展示,我们构建了一个脚本, `train.py`.
+用法如下:
+```
+CUDA_VISIBLE_DEVICES=0 python3 train.py
+    --ddi ./data/DDI/DDs.csv
+    --dti ./data/DTI/drug_protein_links.tsv
+    --ppi ./data/PPI/protein_protein_links.txt
+    --d_feat ./data/all_drugs_name.fet
+    --epochs 10
+    --num_graph 10
+    --sub_neighbours 10 10
+    --cuda
+```
+请注意,如果训练环境没有GPU,去掉`--cuda`即可。
+## 引用
+**RGCN**
+> @article{jiang2020deep,
+  title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
+  author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
+  journal={Computational and structural biotechnology journal},
+  volume={18},
+  pages={427--438},
+  year={2020},
+  publisher={Elsevier}
+}
\ No newline at end of file
diff --git a/apps/drug_drug_synergy/R_model.py b/apps/drug_drug_synergy/RGCN/R_model.py
similarity index 98%
rename from apps/drug_drug_synergy/R_model.py
rename to apps/drug_drug_synergy/RGCN/R_model.py
index e0df48ee..a32bd21c 100644
--- a/apps/drug_drug_synergy/R_model.py
+++ b/apps/drug_drug_synergy/RGCN/R_model.py
@@ -32,7 +32,7 @@
 import pandas as pd
 import numpy as np
 
-def Decagon_norm(graph, feature, edges):
+def decagon_norm(graph, feature, edges):
     """
     Relation Graph Neural Network degree normalization method
     """
@@ -96,6 +96,7 @@ def __init__(self, in_dim, out_dim, etypes, num_bases=0, act='relu', norm=True):
         )
         self.act = act
         self.norm = norm
+
     def forward(self, graph, feat):
"""Forward Args: @@ -221,4 +222,4 @@ def negative_Sampling(label): val_neg_idx = np.random.choice(len(neg_pos[0]), num_neg) valid[(neg_pos[0][val_neg_idx], neg_pos[1][val_neg_idx])] = -1 - return paddle.to_tensor(valid.astype('float32')) \ No newline at end of file + return valid.astype('float32') \ No newline at end of file diff --git a/apps/drug_drug_synergy/graphsage_sampling.py b/apps/drug_drug_synergy/RGCN/graphsage_sampling.py similarity index 99% rename from apps/drug_drug_synergy/graphsage_sampling.py rename to apps/drug_drug_synergy/RGCN/graphsage_sampling.py index 16b47082..2dee62e9 100644 --- a/apps/drug_drug_synergy/graphsage_sampling.py +++ b/apps/drug_drug_synergy/RGCN/graphsage_sampling.py @@ -53,6 +53,7 @@ def graphsage_sampling(hg, start_nodes, num_neighbours=10, etype='dti'): return qualified_neighs, qualified_eids + def subgraph_gen(hg, label_idx, neighbours=[10, 10]): """ Subgraph sampling by graphsage_sampling diff --git a/apps/drug_drug_synergy/train.py b/apps/drug_drug_synergy/RGCN/train.py similarity index 97% rename from apps/drug_drug_synergy/train.py rename to apps/drug_drug_synergy/RGCN/train.py index f1c41d0b..e02b6228 100644 --- a/apps/drug_drug_synergy/train.py +++ b/apps/drug_drug_synergy/RGCN/train.py @@ -41,8 +41,7 @@ from pahelix.featurizers import het_gnn_featurizer - -def Train(num_subgraph, graph, label_idx, epochs, sub_neighbours=[10, 10], init=True): +def train(num_subgraph, graph, label_idx, epochs, sub_neighbours=[10, 10], init=True): """ Model training for one epoch and return training loss and validation loss. """ @@ -105,6 +104,7 @@ def eval(model, graph, label, sub_neighbours, criterion): loss = criterion(pred, label) return pred, loss + def train_val_plot(training_loss, val_loss, figure_name='loss_figure.pdf'): """ Plot the training loss figure. 
@@ -115,6 +115,7 @@ def train_val_plot(training_loss, val_loss, figure_name='loss_figure.pdf'):
     axx.legend(['training loss', 'val loss'])
     fig.savefig(figure_name)
 
+
 def main(ddi, dti, ppi, d_feat, epochs=10, num_subgraph=20, sub_neighbours=[10, 10], cuda=False):
     """
     Args:
@@ -141,7 +142,7 @@ def main(ddi, dti, ppi, d_feat, epochs=10, num_subgraph=20, sub_neighbours=[10,
     value = drug_feat.collate_fn(ddi, dti, ppi, d_feat)
     hg, nodes_dict, label, label_idx = value['rt']
 
-    trained_model = Train(num_subgraph, hg, label_idx, epochs, [25, 25])
+    trained_model = train(num_subgraph, hg, label_idx, epochs, args.sub_neighbours)
 
     return trained_model
 
diff --git a/pahelix/datasets/ddi_dataset.py b/pahelix/datasets/ddi_dataset.py
index 46df6a2d..050ff559 100644
--- a/pahelix/datasets/ddi_dataset.py
+++ b/pahelix/datasets/ddi_dataset.py
@@ -28,6 +28,8 @@
 def get_default_ddi_task_names():
     """Get that default ddi task names and return class label"""
     return ['drug_a_name', 'drug_b_name', 'cell_line', 'synergy']
+
+
 def load_ddi_dataset(data_path, task_names=None, cellline=None, featurizer=None):
     """Load ddi dataset,process the input information and the featurizer.
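     Description:
diff --git a/pahelix/datasets/dti_dataset.py b/pahelix/datasets/dti_dataset.py
index 1f7c5a83..5a8982cc 100644
--- a/pahelix/datasets/dti_dataset.py
+++ b/pahelix/datasets/dti_dataset.py
@@ -29,6 +29,7 @@ def get_default_dti_task_names():
     """Get that default dti task names"""
     return ['chemical', 'protein']
 
+
 def load_dti_dataset(data_path, task_names=None, featurizer=None):
     """Load dti dataset,process the input information and the featurizer.
     Description:
diff --git a/pahelix/datasets/ppi_dataset.py b/pahelix/datasets/ppi_dataset.py
index eb3544a4..de3c1b25 100644
--- a/pahelix/datasets/ppi_dataset.py
+++ b/pahelix/datasets/ppi_dataset.py
@@ -27,7 +27,9 @@
 __all__ = ['get_default_ppi_task_names', 'load_ppi_dataset']
 def get_default_ppi_task_names():
     """Get that default ppi task names"""
-    return ['protein1', 'protein2']
+    return ['protein1', 'protein2']
+
+
 def load_ppi_dataset(data_path, task_names=None, featurizer=None):
     """Load ppi dataset,process the input information and the featurizer.
     Description: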