DDs #90

Merged
merged 3 commits into from
May 13, 2021
57 changes: 3 additions & 54 deletions apps/drug_drug_synergy/README.md
@@ -1,58 +1,7 @@
# DDs(Drug Drug synergy)
# drug_drug_synergy

[Chinese Version](./README_cn.md) [English Version](./README.md)

* [Background](#background)
* [Datasets](#datasets)
* [ddi](#ddi)
* [dti](#dti)
* [ppi](#ppi)
* [features](#features)
* [Instructions](#instructions)
* [Training and Evaluation](#train-and-evaluation)
* [Reference](#reference)
We provide the following drug-drug synergy prediction methods.

## Background

Drug combinations, also known as combinatorial therapy, are frequently prescribed to treat patients with complex diseases. A graph convolutional network (GCN) can predict drug-drug synergy by integrating multiple biological networks.

## Datasets
Drug-drug synergy information and drug physicochemical features go under the `data` folder. First, create the `ddi`, `dti`, and `ppi` folders under `data`.
### ddi

```sh
cd data/ddi && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
```

### dti

### ppi


## Instructions
For illustration, we provide a Python script, `train.py`.
Its usage is:
```
CUDA_VISIBLE_DEVICES=0 python3 train.py \
    --ddi ./data/ddi/DDs.csv \
    --dti ./data/dti/drug_protein_links.tsv \
    --ppi ./data/ppi/protein_protein_links.txt \
    --d_feat ./data/all_drugs_name.fet \
    --epochs 10 \
    --num_graph 10 \
    --sub_neighbours 10 10 \
    --cuda
```
On a CPU-only machine, simply remove `--cuda`.

## Reference
**RGCN**
> @article{jiang2020deep,
title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
journal={Computational and structural biotechnology journal},
volume={18},
pages={427--438},
year={2020},
publisher={Elsevier}
}
* [RGCN](./RGCN/README.md)
50 changes: 3 additions & 47 deletions apps/drug_drug_synergy/README_cn.md
@@ -1,51 +1,7 @@
# DDs(Drug Drug synergy)
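# Drug synergy prediction models

[Chinese Version](./README_cn.md) [English Version](./README.md)

* [Background](#背景介绍)
* [Datasets](#数据集)
* [ddi](#ddi)
* [dti](#dti)
* [ppi](#ppi)
* [Features](#特征集)
* [Instructions](#使用说明)
* [Training and Evaluation](#训练与评估)
* [Reference](#引用)
We provide the following models for predicting the synergy of two-drug combinations.

## Background
Drug combination, also known as combinatorial therapy, is commonly used to treat complex diseases. Graph neural networks can integrate multiple biological networks to predict drug synergy.
## Datasets
The drug synergy score file and the drug physicochemical feature file are placed under the `data` folder. First, create the `ddi`, `dti`, and `ppi` folders under `data`.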
### ddi
```sh
cd data/ddi && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
```
### dti
### ppi

## 使用说明
For illustration, we provide a script, `train.py`.
Its usage is:
```
CUDA_VISIBLE_DEVICES=0 python3 train.py \
    --ddi ./data/ddi/DDs.csv \
    --dti ./data/dti/drug_protein_links.tsv \
    --ppi ./data/ppi/protein_protein_links.txt \
    --d_feat ./data/all_drugs_name.fet \
    --epochs 10 \
    --num_graph 10 \
    --sub_neighbours 10 10 \
    --cuda
```
Note that if the training environment has no GPU, simply remove `--cuda`.
## Reference
**RGCN**
> @article{jiang2020deep,
title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
journal={Computational and structural biotechnology journal},
volume={18},
pages={427--438},
year={2020},
publisher={Elsevier}
}
* [RGCN](./RGCN/README_cn.md)
68 changes: 68 additions & 0 deletions apps/drug_drug_synergy/RGCN/README.md
@@ -0,0 +1,68 @@
# DDs(Drug Drug synergy)

[Chinese Version](./README_cn.md) [English Version](./README.md)

* [Background](#background)
* [Datasets](#datasets)
  * [ddi](#ddi)
  * [dti](#dti)
  * [ppi](#ppi)
  * [drug features](#drug-features)
* [Instructions](#instructions)
* [Reference](#reference)

## Background

Drug combinations, also known as combinatorial therapy, are frequently prescribed to treat patients with complex diseases. A graph convolutional network (GCN) can predict drug-drug synergy by integrating multiple biological networks.

## Datasets
Drug-drug synergy information and drug physicochemical features go under the `data` folder.
### ddi

```sh
cd data && mkdir -p DDI && cd DDI && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
```

### dti
```sh
cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/dti.tgz" && tar xzvf dti.tgz
```

### ppi
```sh
cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/ppi.tgz" && tar xzvf ppi.tgz
```

### drug features
```sh
cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/drug_feat.tgz" && tar xzvf drug_feat.tgz
```
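As a rough illustration of how the downloaded DDI labels might be read, here is a minimal Python sketch. It assumes the CSV carries the four columns returned by `get_default_ddi_task_names` in `pahelix/datasets/ddi_dataset.py` (`drug_a_name`, `drug_b_name`, `cell_line`, `synergy`). The helper `read_ddi_rows` and the sample values are hypothetical and not part of PaddleHelix.

```python
import csv
import io

# Column names as returned by get_default_ddi_task_names() in this PR.
DDI_COLUMNS = ['drug_a_name', 'drug_b_name', 'cell_line', 'synergy']


def read_ddi_rows(handle, cell_line=None):
    """Yield (drug_a, drug_b, cell_line, synergy) tuples from a CSV handle.

    Hypothetical helper: rows for other cell lines are skipped when
    `cell_line` is given.
    """
    reader = csv.DictReader(handle)
    missing = [c for c in DDI_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError('missing columns: %s' % ', '.join(missing))
    for row in reader:
        if cell_line is not None and row['cell_line'] != cell_line:
            continue
        yield (row['drug_a_name'], row['drug_b_name'],
               row['cell_line'], float(row['synergy']))


# Tiny in-memory stand-in for the label file (made-up values).
sample = io.StringIO(
    "drug_a_name,drug_b_name,cell_line,synergy\n"
    "drugA,drugB,CELL1,12.5\n"
    "drugA,drugC,CELL2,-3.0\n"
)
rows = list(read_ddi_rows(sample, cell_line='CELL1'))
print(rows)
```

To read a real file, pass `open(path, newline='')` instead of the `io.StringIO` stand-in.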

## Instructions
For illustration, we provide a Python script, `train.py`.
Its usage is:
```
CUDA_VISIBLE_DEVICES=0 python3 train.py \
    --ddi ./data/DDI/DDs.csv \
    --dti ./data/DTI/drug_protein_links.tsv \
    --ppi ./data/PPI/protein_protein_links.txt \
    --d_feat ./data/all_drugs_name.fet \
    --epochs 10 \
    --num_graph 10 \
    --sub_neighbours 10 10 \
    --cuda
```
On a CPU-only machine, simply remove `--cuda`.
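The flags above could be parsed roughly as in the following sketch. This is not the parser in `train.py`, which is not shown here. The flag names come from the command above, while the types, defaults, and help texts are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags of the training command."""
    parser = argparse.ArgumentParser(
        description='Drug-drug synergy training (illustrative sketch)')
    parser.add_argument('--ddi', required=True, help='drug-drug synergy labels')
    parser.add_argument('--dti', required=True, help='drug-protein links')
    parser.add_argument('--ppi', required=True, help='protein-protein links')
    parser.add_argument('--d_feat', required=True, help='drug feature file')
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--num_graph', type=int, default=10)
    # One integer per sampling hop, e.g. "--sub_neighbours 10 10".
    parser.add_argument('--sub_neighbours', type=int, nargs='+',
                        default=[10, 10])
    # Boolean switch: present means GPU, absent means CPU.
    parser.add_argument('--cuda', action='store_true')
    return parser


# CPU-only invocation: the same command without --cuda.
args = build_parser().parse_args([
    '--ddi', './data/DDI/DDs.csv',
    '--dti', './data/DTI/drug_protein_links.tsv',
    '--ppi', './data/PPI/protein_protein_links.txt',
    '--d_feat', './data/all_drugs_name.fet',
    '--sub_neighbours', '10', '10',
])
print(args.sub_neighbours, args.cuda)
```

Declaring `--cuda` with `action='store_true'` is what makes dropping the flag enough to fall back to CPU in this sketch.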

## Reference
**RGCN**
> @article{jiang2020deep,
title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
journal={Computational and structural biotechnology journal},
volume={18},
pages={427--438},
year={2020},
publisher={Elsevier}
}
63 changes: 63 additions & 0 deletions apps/drug_drug_synergy/RGCN/README_cn.md
@@ -0,0 +1,63 @@
# DDs(Drug Drug synergy)

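[Chinese Version](./README_cn.md) [English Version](./README.md)

* [Background](#背景介绍)
* [Datasets](#数据集)
* [ddi](#ddi)
* [dti](#dti)
* [ppi](#ppi)
* [Features](#特征集)
* [Instructions](#使用说明)
* [Training and Evaluation](#训练与评估)
* [Reference](#引用)

## Background
Drug combination, also known as combinatorial therapy, is commonly used to treat complex diseases. Graph neural networks can integrate multiple biological networks to predict drug synergy.
## Datasets
The drug synergy score file and the drug physicochemical feature file are placed under the `data` folder.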
### ddi
```sh
cd data/ddi && wget "http://www.bioinf.jku.at/software/DeepSynergy/labels.csv"
```
### dti
```sh
cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/dti.tgz" && tar xzvf dti.tgz
```

### ppi
```sh
cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/ppi.tgz" && tar xzvf ppi.tgz
```

### drug features
```sh
cd data && wget "https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/drug_synergy_datasets/drug_feat.tgz" && tar xzvf drug_feat.tgz
```

## 使用说明
For illustration, we provide a script, `train.py`.
Its usage is:
```
CUDA_VISIBLE_DEVICES=0 python3 train.py \
    --ddi ./data/DDI/DDs.csv \
    --dti ./data/DTI/drug_protein_links.tsv \
    --ppi ./data/PPI/protein_protein_links.txt \
    --d_feat ./data/all_drugs_name.fet \
    --epochs 10 \
    --num_graph 10 \
    --sub_neighbours 10 10 \
    --cuda
```
Note that if the training environment has no GPU, simply remove `--cuda`.
## Reference
**RGCN**
> @article{jiang2020deep,
title={Deep graph embedding for prioritizing synergistic anticancer drug combinations},
author={Jiang, Peiran and Huang, Shujun and Fu, Zhenyuan and Sun, Zexuan and Lakowski, Ted M and Hu, Pingzhao},
journal={Computational and structural biotechnology journal},
volume={18},
pages={427--438},
year={2020},
publisher={Elsevier}
}
Original file line number Diff line number Diff line change
@@ -32,7 +32,7 @@
import pandas as pd
import numpy as np

-def Decagon_norm(graph, feature, edges):
+def decagon_norm(graph, feature, edges):
"""
Relation Graph Neural Network degree normalization method
"""
@@ -96,6 +96,7 @@ def __init__(self, in_dim, out_dim, etypes, num_bases=0, act='relu', norm=True):
)
self.act = act
self.norm = norm

def forward(self, graph, feat):
"""Forward
Args:
@@ -221,4 +222,4 @@ def negative_Sampling(label):
val_neg_idx = np.random.choice(len(neg_pos[0]), num_neg)
valid[(neg_pos[0][val_neg_idx], neg_pos[1][val_neg_idx])] = -1

-return paddle.to_tensor(valid.astype('float32'))
+return valid.astype('float32')
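Only the tail of `negative_Sampling` is visible in this hunk, so the following is a plain-Python sketch of the general idea rather than the function itself. It assumes a 0/1 label matrix, marks positives with `1`, and marks an equal number of randomly chosen zero entries with `-1`. The name `sample_negative_mask` is hypothetical, and unlike `np.random.choice` with its default settings, `random.sample` draws without replacement.

```python
import random


def sample_negative_mask(label, seed=None):
    """Return a mask with 1.0 at positives, -1.0 at sampled negatives, 0.0 elsewhere.

    `label` is a list of rows holding 0/1 entries (assumed layout).
    """
    rng = random.Random(seed)
    positives = [(i, j) for i, row in enumerate(label)
                 for j, value in enumerate(row) if value == 1]
    negatives = [(i, j) for i, row in enumerate(label)
                 for j, value in enumerate(row) if value == 0]
    # Balance the classes: draw as many negatives as there are positives.
    num_neg = min(len(positives), len(negatives))
    mask = [[0.0] * len(row) for row in label]
    for i, j in positives:
        mask[i][j] = 1.0
    for i, j in rng.sample(negatives, num_neg):
        mask[i][j] = -1.0
    return mask


label = [[1, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
mask = sample_negative_mask(label, seed=0)
flat = [value for row in mask for value in row]
print(flat.count(1.0), flat.count(-1.0))
```

Entries left at `0.0` would simply be ignored by a loss that only looks at the non-zero positions of the mask.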
Original file line number Diff line number Diff line change
@@ -53,6 +53,7 @@ def graphsage_sampling(hg, start_nodes, num_neighbours=10, etype='dti'):

return qualified_neighs, qualified_eids


def subgraph_gen(hg, label_idx, neighbours=[10, 10]):
"""
Subgraph sampling by graphsage_sampling
Original file line number Diff line number Diff line change
@@ -41,8 +41,7 @@
from pahelix.featurizers import het_gnn_featurizer



-def Train(num_subgraph, graph, label_idx, epochs, sub_neighbours=[10, 10], init=True):
+def train(num_subgraph, graph, label_idx, epochs, sub_neighbours=[10, 10], init=True):
"""
Model training for one epoch and return training loss and validation loss.
"""
@@ -105,6 +104,7 @@ def eval(model, graph, label, sub_neighbours, criterion):
loss = criterion(pred, label)
return pred, loss


def train_val_plot(training_loss, val_loss, figure_name='loss_figure.pdf'):
"""
Plot the training loss figure.
@@ -115,6 +115,7 @@ def train_val_plot(training_loss, val_loss, figure_name='loss_figure.pdf'):
axx.legend(['training loss', 'val loss'])
fig.savefig(figure_name)


def main(ddi, dti, ppi, d_feat, epochs=10, num_subgraph=20, sub_neighbours=[10, 10], cuda=False):
"""
Args:
@@ -141,7 +142,7 @@ def main(ddi, dti, ppi, d_feat, epochs=10, num_subgraph=20, sub_neighbours=[10,
value = drug_feat.collate_fn(ddi, dti, ppi, d_feat)
hg, nodes_dict, label, label_idx = value['rt']

-trained_model = Train(num_subgraph, hg, label_idx, epochs, [25, 25])
+trained_model = train(num_subgraph, hg, label_idx, epochs, args.sub_neighbours)

return trained_model
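The neighbour list passed to `train`, such as `[10, 10]`, reads as one fan-out per sampling hop. The sketch below illustrates that idea on a plain adjacency dictionary. It is not the `graphsage_sampling` or `subgraph_gen` code from this PR; the function name, the dictionary layout, and the toy graph are all assumptions.

```python
import random


def sample_neighbourhood(adjacency, start_nodes, neighbours=(10, 10), seed=None):
    """Sample up to `neighbours[k]` neighbours per frontier node at hop k.

    `adjacency` maps a node to a list of its neighbours. Returns the set of
    all nodes reached, including the start nodes.
    """
    rng = random.Random(seed)
    visited = set(start_nodes)
    frontier = list(start_nodes)
    for fanout in neighbours:
        next_frontier = []
        for node in frontier:
            candidates = adjacency.get(node, [])
            # Never ask for more neighbours than the node actually has.
            picked = rng.sample(candidates, min(fanout, len(candidates)))
            for neigh in picked:
                if neigh not in visited:
                    visited.add(neigh)
                    next_frontier.append(neigh)
        frontier = next_frontier
    return visited


# Toy graph: one drug linked to three proteins, which link to two more.
adjacency = {
    'd1': ['p1', 'p2', 'p3'],
    'p1': ['p4'],
    'p2': ['p4', 'p5'],
    'p3': [],
}
small = sample_neighbourhood(adjacency, ['d1'], neighbours=(2, 1), seed=0)
full = sample_neighbourhood(adjacency, ['d1'], neighbours=(10, 10), seed=0)
print(len(small), len(full))
```

Smaller fan-outs bound the size of each sampled subgraph, which is presumably why the values are exposed on the command line as `--sub_neighbours`.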

2 changes: 2 additions & 0 deletions pahelix/datasets/ddi_dataset.py
@@ -28,6 +28,8 @@
def get_default_ddi_task_names():
"""Get the default ddi task names and return class label"""
return ['drug_a_name', 'drug_b_name', 'cell_line', 'synergy']


def load_ddi_dataset(data_path, task_names=None, cellline=None, featurizer=None):
"""Load ddi dataset, process the input information and the featurizer.
Description:
1 change: 1 addition & 0 deletions pahelix/datasets/dti_dataset.py
@@ -29,6 +29,7 @@ def get_default_dti_task_names():
"""Get the default dti task names"""
return ['chemical', 'protein']


def load_dti_dataset(data_path, task_names=None, featurizer=None):
"""Load dti dataset, process the input information and the featurizer.
Description:
4 changes: 3 additions & 1 deletion pahelix/datasets/ppi_dataset.py
@@ -27,7 +27,9 @@
__all__ = ['get_default_ppi_task_names', 'load_ppi_dataset']
def get_default_ppi_task_names():
"""Get the default ppi task names"""
-return ['protein1', 'protein2']
+return ['protein1', 'protein2']


def load_ppi_dataset(data_path, task_names=None, featurizer=None):
"""Load ppi dataset, process the input information and the featurizer.
Description: