You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+2
Original file line number
Diff line number
Diff line change
@@ -12,6 +12,8 @@ English | [简体中文](README_cn.md)
12
12
13
13
14
14
## Latest News
15
+
`2022.08.11` PaddleHelix released the codes of HelixGEM-2, a novel Molecular Property Prediction Network that models full-range many-body interactions. And it ranked 1st in the OGB [PCQM4Mv2](https://ogb.stanford.edu/docs/lsc/leaderboards/) leaderboard. Please refer to [paper](https://arxiv.org/abs/2208.05863) and [codes](./apps/pretrained_compound/ChemRL/GEM-2) for more details.
16
+
15
17
`2022.07.29` PaddleHelix released the codes of HelixFold-Single, an **MSA-free** protein structure prediction pipeline relying on only the primary sequences, which can **predict the protein structures within seconds**. Please refer to [paper](https://arxiv.org/abs/2207.13921) and [codes](./apps/protein_folding/helixfold-single) for more details. Welcome to [PaddleHelix website](https://paddlehelix.baidu.com/app/drug/protein-single/forecast
16
18
) to try out the structure prediction online service.
# GEM-2: Next Generation Molecular Property Prediction Network with Many-body and Full-range Interaction Modeling
2
-
Molecular property prediction is a fundamental task in the drug and material industries. Physically, the properties of a molecule are determined by its own electronic structure, which can be exactly described by the Schrödinger equation. However, solving the Schrödinger equation for most molecules is extremely challenging due to long-range interactions in the behavior of a quantum many-body system. While deep learning methods have proven to be effective in molecular property prediction, we design a novel method, namely GEM-2, which comprehensively considers both the long-range and many-body interactions in molecules. GEM-2 consists of two interacted tracks: an atom-level track modeling both the local and global correlation between any two atoms, and a pair-level track modeling the correlation between all atom pairs, which embed information between any 3 or 4 atoms. Extensive experiments demonstrated the superiority of GEM-2 over multiple baseline methods in quantum chemistry and drug discovery tasks.
3
-
1
+
# GEM-2: Next Generation Molecular Property Prediction Network by Modeling Full-range Many-body Interactions
2
+
GEM-2 is a molecular modeling framework which comprehensively considers full-range many-body interactions in molecules. Multiple tracks are utilized to model the full-range interactions between the many-bodies with different orders, and a novel axial attention mechanism is designed to approximate the full-range interaction modeling with much lower computational cost.
4
3
A preprint version of our work can be found [here](https://arxiv.org/abs/2208.05863).
The overall framework of GEM-2. First, a molecule is described by the representations of many-bodies of multiple orders. Then, Optimus blocks are designed to update the representations. Each Optimus block contains $M$ tracks, and the $m$-th track contains a stack of many-body axial attentions to model the full-range interactions between the $m$-bodies. The many-body axial attentions and the Low2High module also play the roles of exchanging messages across the tracks. Finally, the molecular property prediction is made by pooling over the $1$-body representations.
11
+
## Result
12
+
### PCQM4Mv2
13
+
PCQM4Mv2 is a large-scale quantum chemistry dataset containing the DFT-calculated HOMO-LUMO
14
+
energy gaps. The OGB leaderboard for PCQM4Mv2 can be found [here](https://ogb.stanford.edu/docs/lsc/leaderboards/#pcqm4mv2).
LIT-PCBA is a virtual screening dataset containing protein targets with their corresponding active and inactive compounds selected from high-confidence PubChem Bioassay data
You can also download the processed PCQM4Mv2 dataset with rdkit generated 3d information from [here](https://baidu-nlp.bj.bcebos.com/PaddleHelix/datasets/compound_datasets/pcqm4mv2_gem2.tgz).
Copy file name to clipboardExpand all lines: apps/protein_folding/helixfold-single/README.md
+2-2
Original file line number
Diff line number
Diff line change
@@ -16,8 +16,7 @@ To reproduce the results reported in our paper, specific environment settings ar
16
16
17
17
## Installation
18
18
Except those listed in the `requirements.txt`, PaddlePaddle `dev` package is required to run HelixFold.
19
-
Visit [here](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html) to
20
-
install PaddlePaddle `dev`. Also, we provide a package here if your machine environment is Nvidia A100 with
19
+
Visit [here](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html) to install PaddlePaddle `dev`. Also, we provide a package here if your machine environment is Nvidia A100 with
0 commit comments