Welcome to the chemical reaction compound atom-to-atom mapping research project!
A chemical reaction can be defined as the transformation of one set of chemical compounds into another. Accompanied by a change in energy, the atoms of the reactant compounds are rearranged to form the product compounds, with or without the assistance of spectator compounds. Correctly mapping this rearrangement of atoms is paramount for capturing the essence of the chemical reaction. This task, commonly referred to as atom-to-atom mapping or atom mapping, has proven challenging because it is a generalization of the well-known subgraph isomorphism problem. Consequently, the primary objective of the Atom-to-atom Mapping research project is to systematically curate and facilitate access to relevant chemical reaction compound atom-to-atom mapping resources.
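To make the idea of an atom mapping concrete, the following minimal sketch reads the atom map numbers from an already mapped reaction SMILES string and pairs each reactant atom with its product counterpart. It uses only the Python standard library and is not part of the atom_to_atom_mapping package. The function name `extract_atom_maps` and the example reaction (methanol and hydrogen chloride forming chloromethane and water) are illustrative assumptions.

```python
import re

# Matches a bracket atom that carries an atom map number, e.g. "[CH3:1]".
MAPPED_ATOM_PATTERN = re.compile(r"\[([^\[\]:]+):(\d+)\]")


def extract_atom_maps(reaction_smiles: str):
    """Return two {map number: atom label} dictionaries: reactants, products.

    Hypothetical helper for illustration only. Unmapped atoms are ignored.
    """
    # A reaction SMILES string has the form "reactants>agents>products".
    reactants, _agents, products = reaction_smiles.split(">")

    def side_to_map(side: str):
        return {
            int(number): label
            for label, number in MAPPED_ATOM_PATTERN.findall(side)
        }

    return side_to_map(reactants), side_to_map(products)


# Illustrative mapped reaction: CH3OH + HCl -> CH3Cl + H2O.
mapped_reaction_smiles = "[CH3:1][OH:2].[ClH:3]>>[CH3:1][Cl:3].[OH2:2]"

reactant_atoms, product_atoms = extract_atom_maps(mapped_reaction_smiles)

# Print which product atom each reactant atom corresponds to.
for map_number in sorted(reactant_atoms):
    print(map_number, reactant_atoms[map_number], "->", product_atoms.get(map_number))
```

Atoms that share a map number on both sides of the reaction are the same atom before and after the transformation, which is exactly the correspondence that an atom-to-atom mapping approach has to infer for an unmapped reaction.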
An environment can be created using the git and conda commands as follows:
git clone https://github.com/neo-chem-synth-wave/atom-to-atom-mapping.git
cd atom-to-atom-mapping
conda env create -f environment.yaml
conda activate atom-to-atom-mapping-env
The atom_to_atom_mapping package can be installed using the pip command as follows:
pip install .
According to GitHub Issue 4 and GitHub Issue 5 on the LocalMapper repository, potential conflicts between the PyTorch, CUDA, and DGL libraries may arise. To resolve the conflicts, the appropriate version of the DGL library can be re-installed as follows:
# Re-install the DGL library for the PyTorch and CUDA library versions 2.4 and 12.1, respectively.
pip uninstall dgl
pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/cu121/repo.html
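Before choosing a DGL wheel, it can help to check which versions are currently installed. The following minimal sketch does so with the Python standard library only. The helper name `installed_versions` is an illustrative assumption, and `torch` and `dgl` are assumed to be the distribution names under which the libraries were installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Return {package name: version string, or None if not installed}.

    Hypothetical helper for illustration only.
    """
    versions = {}
    for name in package_names:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names for the PyTorch and DGL libraries.
    for name, found in installed_versions(["torch", "dgl"]).items():
        print(f"{name}: {found or 'not installed'}")
```

Depending on where the PyTorch wheel was obtained, the reported version string may carry a CUDA suffix, which can be compared against the `torch-2.4/cu121` segment of the wheel URL above.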
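The purpose of the scripts directory is to illustrate how to map chemical reaction compounds using the following approaches: Indigo [1], RXNMapper [2], the bidirectional Graphormer model of Nugmanov et al. [3], and LocalMapper [4].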
The map_reaction_smiles_strings script can be utilized as follows:
# Map a chemical reaction SMILES string.
python scripts/map_reaction_smiles_strings.py \
--atom_to_atom_mapping_approach "indigo" \
--reaction_smiles "OCN1C(=O)Cc2ccccc12.c1nc2ccccc2[nH]1>>O=C1Cc2ccccc2N1Cn1cnc2ccccc12"
# Map the chemical reaction SMILES strings from a .csv file.
python scripts/map_reaction_smiles_strings.py \
--atom_to_atom_mapping_approach "rxnmapper" \
--input_csv_file_path "/path/to/the/input/file.csv" \
--reaction_smiles_column_name "name_of_the_reaction_smiles_column" \
--output_csv_file_path "/path/to/the/output/file.csv"
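An input .csv file for the second command can be prepared with a few lines of Python. The following is a minimal sketch that assumes the script expects a header row containing the column passed through `--reaction_smiles_column_name`. The column name `reaction_smiles`, the file name `input.csv`, and the helper `write_reaction_smiles_csv` are illustrative assumptions.

```python
import csv

# Hypothetical column name, to be passed via --reaction_smiles_column_name.
REACTION_SMILES_COLUMN_NAME = "reaction_smiles"


def write_reaction_smiles_csv(file_path, reaction_smiles_strings):
    """Write one reaction SMILES string per row under a single header."""
    with open(file_path, "w", newline="", encoding="utf-8") as file:
        writer = csv.DictWriter(file, fieldnames=[REACTION_SMILES_COLUMN_NAME])
        writer.writeheader()
        for reaction_smiles in reaction_smiles_strings:
            writer.writerow({REACTION_SMILES_COLUMN_NAME: reaction_smiles})


if __name__ == "__main__":
    write_reaction_smiles_csv(
        "input.csv",
        [
            # The unmapped reaction SMILES string from the first command.
            "OCN1C(=O)Cc2ccccc12.c1nc2ccccc2[nH]1>>O=C1Cc2ccccc2N1Cn1cnc2ccccc12",
        ],
    )
```

The `csv` module quotes fields automatically when required, so reaction SMILES strings can be written without manual escaping.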
The contents of this repository are published under the MIT license. Please refer to the individual references for more details regarding the license information of external resources utilized within the repository.
If you are interested in contributing to this research project by reporting bugs, suggesting improvements, or submitting feedback, feel free to do so using GitHub Issues.
Marvin was used for drawing, displaying and characterizing chemical structures, substructures and reactions. [5]
[1] EPAM Indigo: https://lifescience.opensource.epam.com/indigo/index.html. Accessed on: 2025/05/04.
[2] Schwaller, P., Hoover, B., Reymond, J., Strobelt, H., and Laino, T. Extraction of Organic Chemistry Grammar from Unsupervised Learning of Chemical Reactions. Sci. Adv., 7, eabe4166, 2021.
[3] Nugmanov, R., Dyubankova, N., Gedich, A., and Wegner, J.K. Bidirectional Graphormer for Reactivity Understanding: Neural Network Trained to Reaction Atom-to-atom Mapping Task. J. Chem. Inf. Model., 62, 14, 3307–3315, 2022.
[4] Chen, S., An, S., Babazade, R., and Jung, Y. Precise Atom-to-atom Mapping for Organic Reactions via Human-in-the-loop Machine Learning. Nat. Commun., 15, 2250, 2024.
[5] Marvin 24.3.1, 2024, ChemAxon: https://chemaxon.com. Accessed on: 2025/05/04.