Commit 030188e

Merge pull request #128 from PaddlePaddle/update_introduction
Update introduction
2 parents dd1859e + 70ec47e commit 030188e

8 files changed

+149
-138
lines changed

README.md

Lines changed: 41 additions & 32 deletions
@@ -9,51 +9,60 @@ English | [简体中文](README_cn.md)
99
![python version](https://img.shields.io/badge/python-3.6+-orange.svg)
1010
![support os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-yellow.svg)
1111

12-
PaddleHelix is a machine-learning-based bio-computing framework aiming at facilitating the development of the following areas:
13-
> * Vaccine design
14-
> * Drug discovery
15-
> * Precision medicine
12+
## Latest News
13+
`2021.06.17` The PaddleHelix team won 2nd place in the [OGB-LSC KDD Cup 2021 PCQM4M-LSC track](https://ogb.stanford.edu/kddcup2021/results/), which predicts the DFT-calculated HOMO-LUMO energy gap of molecules. Please refer to [the solution](./competition/kddcup2021-PCQM4M-LSC) for more details.
1614

17-
## Features
18-
* High Efficiency: We provide LinearRNA, a highly efficient toolkit for RNA structure prediction and analysis. LinearFold & LinearPartition achieve O(n) complexity in RNA-folding prediction, which is hundreds of times faster than traditional folding techniques.
19-
<p align="center">
20-
<img src="./.github/LinearRNA.jpg" align="middle" />
21-
</p>
22-
23-
* Large-scale Representation Learning: Self-supervised learning for molecule representations offers prospects of a breakthrough in tasks with limited annotation, including drug profiling, drug-target interaction, protein-protein interaction, RNA-RNA interaction, protein folding, RNA folding, and molecule design. PaddleHelix implements various representation learning algorithms and state-of-the-art large-scale pre-trained models to help developers start from "the shoulders of giants" quickly.
24-
<p align="center">
25-
<img src="./.github/paddlehelix_features.jpg" align="middle" />
26-
</p>
15+
`2021.05.20` PaddleHelix v1.0 released. 1) Updated from the static graph framework to the dynamic graph framework; 2) Added new applications: molecular generation and drug-drug synergy.
2716

28-
* Rich examples and applications: PaddleHelix provides frequently used components such as networks, datasets, and pre-trained models. Users can easily use those components to build up their models and systems. PaddleHelix also provides multiple applications, such as compound property prediction, drug-target interaction, and so on.
17+
`2021.03.15` The PaddleHelix team ranked 1st on the ogbg-molhiv and ogbg-molpcba tasks of [OGB](https://ogb.stanford.edu/docs/leader_graphprop/), both of which predict molecular properties.
2918

30-
----
19+
---
3120

32-
## Installation
21+
## Introduction
22+
PaddleHelix is a bio-computing toolkit that uses machine learning, especially deep neural networks, to facilitate development in the following areas:
23+
* **Drug Discovery**. Provides 1) large-scale pre-trained models for compounds and proteins; 2) various applications: molecular property prediction, drug-target affinity prediction, and molecular generation.
24+
* **Vaccine Design**. Provides RNA design algorithms, including LinearFold and LinearPartition.
25+
* **Precision Medicine**. Provides a drug-drug synergy application.
3326

34-
The installation prerequisites and guide can be found [here](./installation_guide.md).
27+
<p align="center">
28+
<img src=".github/PaddleHelix_Structure.png" align="middle" height="80%" width="80%" />
29+
</p>
3530

36-
----
31+
## Resources
32+
### Application Platform
33+
The [PaddleHelix platform](https://paddlehelix.baidu.com/) provides AI + biochemistry capabilities for drug discovery, vaccine design, and precision medicine.
3734

38-
## Documentation
35+
### Installation Guide
36+
PaddleHelix is a bio-computing repository based on [PaddlePaddle](https://github.com/paddlepaddle/paddle), a high-performance parallelized deep learning platform. The installation prerequisites and guide can be found [here](./installation_guide.md).
3937

4038
### Tutorials
41-
* We provide abundant [tutorials](./tutorials) to help you navigate the repository and start quickly.
42-
* PaddleHelix is based on [PaddlePaddle](https://github.com/paddlepaddle/paddle), a high-performance Parallelized Deep Learning Platform.
39+
We provide abundant [tutorials](./tutorials) to help you navigate the repository and start quickly.
40+
* **Drug Discovery**
41+
- [Compound Representation Learning and Property Prediction](./tutorials/compound_property_prediction_tutorial.ipynb)
42+
- [Protein Representation Learning and Property Prediction](./tutorials/protein_pretrain_and_property_prediction_tutorial.ipynb)
43+
- Predicting Drug-Target Interaction: [GraphDTA](./tutorials/drug_target_interaction_graphdta_tutorial.ipynb), [MolTrans](./tutorials/drug_target_interaction_moltrans_tutorial.ipynb)
44+
- [Molecular Generation](./tutorials/molecular_generation_tutorial.ipynb)
45+
* **Vaccine Design**
46+
- [Predicting RNA Secondary Structure](./tutorials/linearrna_tutorial.ipynb)
4347

4448
### Examples
45-
* [Representation Learning - Compounds](./apps/pretrained_compound)
46-
* [Representation Learning - Proteins](./apps/pretrained_protein)
47-
* [Drug-Target Interaction](./apps/drug_target_interaction)
48-
* [Molecular Generation](./apps/molecular_generation)
49-
* [Drug Drug Synergy](./apps/drug_drug_synergy)
50-
* [LinearRNA](./c/pahelix/toolkit/linear_rna)
51-
52-
### The API reference
53-
* Detailed API reference of PaddleHelix can be found [here](https://paddlehelix.readthedocs.io/en/dev/).
49+
We also provide [examples](./apps) that implement various algorithms and show how to run them:
50+
* **Pretraining**
51+
- [Representation Learning - Compounds](./apps/pretrained_compound)
52+
- [Representation Learning - Proteins](./apps/pretrained_protein)
53+
* **Drug Discovery and Precision Medicine**
54+
- [Drug-Target Interaction](./apps/drug_target_interaction)
55+
- [Molecular Generation](./apps/molecular_generation)
56+
- [Drug Drug Synergy](./apps/drug_drug_synergy)
57+
* **Vaccine Design**
58+
- [LinearRNA](./c/pahelix/toolkit/linear_rna)
59+
60+
### Competition Solutions
61+
The PaddleHelix team has participated in multiple bio-computing competitions. The solutions can be found [here](./competition).
5462

5563
### Guide for developers
56-
* If you need help in modifying the source code of PaddleHelix, please see our [Guide for developers](./developer_guide.md).
64+
* To develop new features based on the source code of PaddleHelix, please refer to the [guide for developers](./developer_guide.md).
65+
* For more details on the APIs, please refer to the [API documentation](https://paddlehelix.readthedocs.io/en/dev/).
5766

5867
### Welcome to join us
5968
We are looking for machine learning researchers / engineers or bioinformatics / computational chemistry researchers interested in AI-driven drug design.

README_cn.md

Lines changed: 43 additions & 30 deletions
@@ -9,49 +9,62 @@
99
![python version](https://img.shields.io/badge/python-3.6+-orange.svg)
1010
![support os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-yellow.svg)
1111

12-
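PaddleHelix is a machine-learning-based bio-computing toolkit that aims to accelerate progress in the following areas:
13-
> * Vaccine design
14-
> * Drug discovery
15-
> * Precision medicine
12+
## Latest News
13+
`2021.06.17` The PaddleHelix team won 2nd place in the [OGB-LSC KDD Cup 2021 PCQM4M-LSC track](https://ogb.stanford.edu/kddcup2021/results/). The task is to predict the DFT-calculated HOMO-LUMO energy gap of molecules. Please refer to [the solution](./competition/kddcup2021-PCQM4M-LSC) for more details.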
1614

17-
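## Features
15+
`2021.05.20` PaddleHelix v1.0 officially released. 1) All models upgraded from static graph to dynamic graph; 2) New applications added: molecular generation and drug-drug synergy.
1816

19-
* **High performance**: We provide the LinearRNA family of high-performance algorithms for RNA structure prediction and analysis. For example, LinearFold and LinearPartition can quickly and accurately locate low-energy RNA secondary structures, running hundreds or even thousands of times faster than traditional methods.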
20-
<p align="center">
21-
<img src="./.github/LinearRNA.jpg" align="middle" />
22-
</p>
17+
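`2021.03.15` The PaddleHelix team ranked 1st on the ogbg-molhiv and ogbg-molpcba tasks of the authoritative graph benchmark leaderboard [OGB](https://ogb.stanford.edu/docs/leader_graphprop/). Both tasks predict the properties of small molecules.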
18+
19+
---
20+
21+
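## Introduction
22+
PaddleHelix is a bio-computing toolkit that uses machine learning, especially deep neural networks, to facilitate development in the following areas:
23+
24+
* **Drug Discovery**. Provides 1) large-scale pre-trained models for compounds and proteins; 2) various applications: molecular property prediction, drug-target affinity prediction, and molecular generation.
25+
* **Vaccine Design**. Provides RNA design algorithms, including LinearFold and LinearPartition.
26+
* **Precision Medicine**. Provides a drug-drug synergy application.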
2327

24-
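* Bio-computing tools backed by large-scale **representation pre-training**: Advances in self-supervised learning for molecular representations have brought new breakthroughs to many bio-computing tasks with very few labeled samples, including molecular property prediction, drug-target interaction, protein-protein interaction, RNA-RNA interaction, protein folding, and RNA folding. PaddleHelix provides a broad set of leading representation learning methods and models, so developers can start from large-scale models and move quickly to the task they need, standing on the shoulders of giants.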
2528
<p align="center">
26-
<img src="./.github/paddlehelix_features.jpg" align="middle" />
29+
<img src=".github/PaddleHelix_Structure.png" align="middle" height="80%" width="80%" />
2730
</p>
2831

29-
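* Rich examples and applications: PaddleHelix provides modules commonly used in bio-computing, such as model architectures, datasets, and pre-trained models. Users can quickly assemble their own networks and systems through simple interface modules. PaddleHelix also provides multiple applications, such as compound property prediction and drug-target affinity prediction.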
30-
----
31-
32-
## Installation
33-
The detailed installation guide and environment setup can be found [here](./installation_guide_cn.md).
32+
---
33+
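## Resources
34+
### Computing Platform
35+
The [PaddleHelix platform](https://paddlehelix.baidu.com/) provides AI + bio-computing capabilities to meet the AI needs of drug discovery, vaccine design, and precision medicine scenarios.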
3436

35-
----
36-
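## Documentation
37+
### Installation Guide
38+
PaddleHelix is an open-source bio-computing library built on [PaddlePaddle](https://github.com/paddlepaddle/paddle), a high-performance machine learning framework. The detailed installation and environment setup guide can be found [here](./installation_guide_cn.md).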
3739

38-
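### Tutorials
39-
* We provide abundant [tutorial examples](./tutorials) to help developers quickly understand and use the framework.
40-
* PaddleHelix is built on the open-source deep learning framework [PaddlePaddle](https://github.com/paddlepaddle/paddle), which is known for its strong performance.
40+
### Tutorial Examples
41+
We provide abundant [tutorial examples](./tutorials) to help developers quickly understand and use the framework: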
42+
* **Drug Discovery**
43+
- [Compound Representation and Property Prediction](./tutorials/compound_property_prediction_tutorial_cn.ipynb)
44+
- [Protein Representation and Property Prediction](./tutorials/protein_pretrain_and_property_prediction_tutorial_cn.ipynb)
45+
- Predicting Drug-Target Interaction: [GraphDTA](./tutorials/drug_target_interaction_graphdta_tutorial_cn.ipynb), [MolTrans](./tutorials/drug_target_interaction_moltrans_tutorial_cn.ipynb)
46+
- [Molecular Generation](./tutorials/molecular_generation_tutorial_cn.ipynb)
47+
* **Vaccine Design**
48+
- [RNA Structure Prediction](./tutorials/linearrna_tutorial_cn.ipynb)
4149

4250
### Examples
43-
* [Representation Learning - Compounds](./apps/pretrained_compound/README_cn.md)
44-
* [Representation Learning - Proteins](./apps/pretrained_protein/README_cn.md)
45-
* [Drug-Target Interaction Prediction](./apps/drug_target_interaction/README_cn.md)
46-
* [Molecular Generation](./apps/molecular_generation/README_cn.md)
47-
* [Drug-Drug Synergy](./apps/drug_drug_synergy/README_cn.md)
48-
* [LinearRNA](./c/pahelix/toolkit/linear_rna/README_cn.md)
51+
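We also provide [code and usage examples](./apps) for multiple algorithms:
52+
* **Pretraining**
53+
- [Representation Learning - Compounds](./apps/pretrained_compound)
54+
- [Representation Learning - Proteins](./apps/pretrained_protein)
55+
* **Drug Discovery and Precision Medicine**
56+
- [Drug-Target Interaction Prediction](./apps/drug_target_interaction)
57+
- [Molecular Generation](./apps/molecular_generation)
58+
- [Drug-Drug Synergy](./apps/drug_drug_synergy)
59+
* **Vaccine Design**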
60+
- [LinearRNA](./c/pahelix/toolkit/linear_rna)
4961

50-
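### API Documentation
51-
* If you are interested in the detailed interfaces of PaddleHelix, please see the [API documentation](https://paddlehelix.readthedocs.io/en/dev/).
62+
### Competition Solutions
63+
The PaddleHelix team has participated in multiple bio-computing competitions. The solutions can be found [here](./competition).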
5264

5365
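### Guide for developers
54-
* If you need to modify the source code of PaddleHelix, please see our [guide for developers](./developer_guide_cn.md).
66+
* To develop new features based on the source code of PaddleHelix, please see our [guide for developers](./developer_guide_cn.md).
67+
* For details on the PaddleHelix APIs, please see the [API documentation](https://paddlehelix.readthedocs.io/en/dev/).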
5568

5669
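### Welcome to join us
5770
We are looking for machine learning researchers / engineers or bioinformatics / computational chemistry researchers interested in AI-driven drug design.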

apps/README.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
English | [简体中文](README_cn.md)
2+
3+
We provide examples that implement various algorithms and show how to run them:
4+
* **Pretraining**
5+
- [Representation Learning - Compounds](./pretrained_compound)
6+
- [Representation Learning - Proteins](./pretrained_protein)
7+
* **Drug Discovery and Precision Medicine**
8+
- [Drug-Target Interaction](./drug_target_interaction)
9+
- [Molecular Generation](./molecular_generation)
10+
- [Drug Drug Synergy](./drug_drug_synergy)
11+
* **Vaccine Design**
12+
- [LinearRNA](../c/pahelix/toolkit/linear_rna)

apps/README_cn.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
[English](README.md) | 简体中文
2+
3+
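We provide code and usage examples for multiple algorithms:
4+
* **Pretraining**
5+
- [Representation Learning - Compounds](./pretrained_compound)
6+
- [Representation Learning - Proteins](./pretrained_protein)
7+
* **Drug Discovery and Precision Medicine**
8+
- [Drug-Target Interaction Prediction](./drug_target_interaction)
9+
- [Molecular Generation](./molecular_generation)
10+
- [Drug-Drug Synergy](./drug_drug_synergy)
11+
* **Vaccine Design**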
12+
- [LinearRNA](../c/pahelix/toolkit/linear_rna)

competition/README.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
The PaddleHelix team has participated in multiple bio-computing competitions.
2+
3+
- `2021.06.17` We won 2nd place in the [OGB-LSC KDD Cup 2021 PCQM4M-LSC track](https://ogb.stanford.edu/kddcup2021/results/), which predicts the DFT-calculated HOMO-LUMO energy gap of molecules. Please refer to [the solution](./kddcup2021-PCQM4M-LSC) for more details.
4+
5+
- `2021.03.15` We ranked 1st on the ogbg-molhiv and ogbg-molpcba tasks of [OGB](https://ogb.stanford.edu/docs/leader_graphprop/), both of which predict molecular properties. Please refer to [the solution](./ogbg_molhiv) for more details.

docs/readme.rst

Lines changed: 0 additions & 21 deletions
@@ -17,27 +17,6 @@ Welcome to PaddleHelix Helper
1717
:alt: Documentation Status
1818

1919

20-
PaddleHelix is a machine-learning-based bio-computing framework aiming at facilitating the development of the following areas:
21-
* Vaccine design
22-
* Drug discovery
23-
* Precision medicine
24-
25-
Features
26-
========
27-
28-
- High Efficiency: We provide LinearRNA, a highly efficient toolkit for RNA structure prediction and analysis. LinearFold & LinearPartition achieve O(n) complexity in RNA-folding prediction, which is hundreds of times faster than traditional folding techniques.
29-
30-
.. image:: ../.github/LinearRNA.jpg
31-
:align: center
32-
33-
- Large-scale Representation Learning: Self-supervised learning for molecule representations offers prospects of a breakthrough in tasks with limited annotation, including drug profiling, drug-target interaction, protein-protein interaction, RNA-RNA interaction, protein folding, RNA folding, and molecule design. PaddleHelix implements various representation learning algorithms and state-of-the-art large-scale pre-trained models to help developers start from "the shoulders of giants" quickly.
34-
35-
.. image:: ../.github/paddlehelix_features.jpg
36-
:align: center
37-
38-
- Rich examples and applications: PaddleHelix provides frequently used components such as networks, datasets, and pre-trained models. Users can easily use those components to build up their models and systems. PaddleHelix also provides multiple applications, such as compound property prediction, drug-target interaction, and so on.
39-
40-
4120
Installation
4221
============
4322
