A curated list of awesome papers, tools, authors, books, blogs and other resources related to AI4Chemistry or AI4Materials.
Inspired by awesome-python and awesome-python-chemistry.
- Cross-disciplinary perspectives on the potential for artificial intelligence across chemistry [2025]
- A Perspective on Foundation Models in Chemistry [2025]
- A Review of Large Language Models and Autonomous Agents in Chemistry [2024]
- Applications of Transformers in Computational Chemistry: Recent Progress and Prospects [2024] - Transformer
- Materials science in the era of large language models: a perspective [2024]
- Generative Models as an Emerging Paradigm in the Chemical Sciences [2023]
- A Survey on ensemble learning under the era of deep learning [2022]
- Recent advances and applications of deep learning methods in materials science [2022]
- Machine Learning for Molecular Simulation [2020]
- Data Reduction Techniques for Simulation, Visualization and Data Analysis [2018]
- Machine learning for molecular and materials science [2018] - Nature
- Materials discovery and design using machine learning [2017]
- Element similarity in high-dimensional materials representations [2023]
- A review of molecular representation in the age of machine learning [2022] - Review
- Geometrically Equivariant Graph Neural Networks: A Survey [2022] - Geometrical learning, Survey
- Benchmarking graph neural networks for materials chemistry [2021] - Benchmarking
- A Comprehensive Survey on Graph Neural Networks [2021] - Survey
- E(n) Equivariant Graph Neural Networks [2021] - EGNN, pioneering work
- Atomistic Line Graph Neural Network for improved materials property predictions [2021]
- Directional Message Passing for Molecular Graphs [2020]
- Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals [2019] - Review
- Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds [2018] - Tensor field network, pioneering work
- Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties [2018] - Pioneering work in solid state materials
- Neural Message Passing for Quantum Chemistry [2017] - MPNN, pioneering work
- A Diffusion-Based Pre-training Framework for Crystal Property Prediction [2024]
- Boost Your Crystal Model with Denoising Pre-training [2024]
- Self-supervised learning for crystal property prediction via denoising [2024]
- May the force be with you: unified force-centric pre-training for 3D molecular conformations [2023]
- Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs [2023]
- Denoise Pretraining on Nonequilibrium Molecules for Accurate and Transferable Neural Potentials [2023]
- Pre-training via Denoising for Molecular Property Prediction [2022]
- Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance Matching [2022]
- The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules [2020]
- Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information [2021] - Dataset
- ILThermo - Dataset
- Vaporization enthalpy prediction of ionic liquids based on back-propagation artificial neural network [2025] - 3150 data points, Vaporization enthalpy prediction
- Large-Scale Screening for High Conductivity Ionic Liquids via Machine Learning Algorithm Utilizing Graph Neural Network-Based Features [2024] - 5700 data points, conductivity, GNN+XGBoost
- Rapid and Accurate Prediction of the Melting Point for Imidazolium-Based Ionic Liquids by Artificial Neural Network [2024] - 280 data points, melting point
- Machine learning coupled with group contribution for predicting the electrical conductivity of ionic liquids with experimental accuracy [2024] - 7598 data points, electrical conductivity
- Generalizing property prediction of ionic liquids from limited labeled data: a one-stop framework empowered by transfer learning [2023] - Transformer-CNN, diverse ILs properties
- ILTox: A Curated Toxicity Database for Machine Learning and Design of Environmentally Friendly Ionic Liquids [2023] - 6700 data points, Toxicity
- Benchmarking machine learning methods for modeling physical properties of ionic liquids [2022]
- A review of group contribution models to calculate thermodynamic properties of ionic liquids for process systems engineering [2022] - Review
- Predictive molecular thermodynamic models for ionic liquids [2022] - Review
- Comparison of molecular and structural features towards prediction of ionic liquid ionic conductivity for electrochemical applications [2022] - Benchmark
- Predicting CO2 Absorption in Ionic Liquids with Molecular Descriptors and Explainable Graph Neural Networks [2022] - CO2 Absorption in Ionic Liquids, GNN
- A review on machine learning algorithms for the ionic liquid chemical space [2021] - Review
- Application of Artificial Intelligence-based predictive methods in Ionic liquid studies: A review [2021] - Review
- Beware of proper validation of models for ionic Liquids [2021] - Melting point temperature, Transformer-CNN
- Neural recommender system for the activity coefficient prediction and UNIFAC model extension of ionic liquid-solute systems [2021] - 41553 data points, activity coefficient prediction, GNN
- Viscosity of Ionic Liquids: An Extensive Database and a New Group Contribution Model Based on a Feed-Forward Artificial Neural Network [2014] - Group contribution method
- Molecular contrastive learning of representations via graph neural networks [2022] - MolCLR
- Mol-BERT: An Effective Molecular Representation with BERT for Molecular Property Prediction [2021] - BERT
- SMILES-BERT: Large Scale Unsupervised Pre-Training for Molecular Property Prediction [2019] - BERT
- Ab-initio variational wave functions for the time-dependent many-electron Schrödinger equation [2024] - Nature Communications
- Accurate computation of quantum excited states with neural networks [2024] - Science, Excited State
- A deep equivariant neural network approach for efficient hybrid density functional calculations [2024]
- Ab initio quantum chemistry with neural-network wavefunctions [2023] - Nature Reviews Chemistry
- Toward Orbital-Free Density Functional Theory with Small Data Sets and Deep Learning [2022] - JCTC, orbital-free DFT
- Pushing the frontiers of density functionals by solving the fractional electron problem [2021] - Science, DFT, DM21
- Deep-neural-network solution of the electronic Schrödinger equation [2020] - Nature Chemistry
- Backflow Transformations via Neural Networks for Quantum Many-Body Wave Functions [2019] - PRL
- Solving the quantum many-body problem with artificial neural networks [2017] - Science, the first application of NN to represent many-body wavefunctions
- Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces [2007] - Pioneering work
- Fast and flexible long-range models for atomistic machine learning [2025]
- Learning local equivariant representations for large-scale atomistic dynamics [2023] - Nature Communications
- E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials [2022] - Nature Communications
- A universal graph deep learning interatomic potential for the periodic table [2022]
- Physics-Inspired Structural Representations for Molecules and Materials [2021] - Review
- Gaussian Process Regression for Materials and Molecules [2021] - Review
- Machine Learning Force Fields [2021] - Review
- Machine Learning Force Fields: Recent Advances and Remaining Challenges [2021] - Perspective
- Has generative artificial intelligence solved inverse materials design? [2024]
- Machine learning-aided generative molecular design [2024]
- MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials [2024]
- Scientific discovery in the age of artificial intelligence [2023] - Nature
- An invertible, invariant crystal representation for inverse design of solid-state materials using generative deep learning [2023]
- Art and the science of generative AI [2023] - Science
- Discovering and understanding materials through computation [2021]
- Structure prediction drives materials discovery [2019]
- Inverse design in search of materials with target functionalities [2018]
- Generative Adversarial Networks for Crystal Structure Prediction [2020]
- MolGAN: An implicit generative model for small molecular graphs [2018]
- Reinforced Adversarial Neural Computer for de Novo Molecular Design [2018]
- Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models [2017] - Pioneering work on using generative models in chemistry
- Auto-Encoding Variational Bayes [2013] - Pioneering work
- WyCryst: Wyckoff inorganic crystal generator framework [2024]
- Deep generative design of porous organic cages via a variational autoencoder [2023]
- An invertible crystallographic representation for general inverse design of inorganic crystals with targeted properties [2022]
- The Usual Suspects? Reassessing Blame for VAE Posterior Collapse [2019]
- Inverse Design of Solid-State Materials via a Continuous Representation [2019]
- Constrained Graph Variational Autoencoders for Molecule Design [2018]
- Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules [2018] - Pioneering work in molecule generation
- Unified Generative Modeling of 3D Molecules via Bayesian Flow Networks [2024]
- FlowLLM: Flow Matching for Material Generation with Large Language Models as Base Distributions [2024]
- Normalizing Flows for Probabilistic Modeling and Inference [2019] - Pioneering work
- A generative model for inorganic materials design [2025] - Nature
- Con-CDVAE: A method for the conditional generation of crystal structures [2024]
- Deep learning generative model for crystal structure prediction [2024]
- Space Group Constrained Crystal Generation [2024]
- Equivariant 3D-conditional diffusion model for molecular linker design [2024]
- Guided diffusion for inverse molecular design [2023]
- Scalable Diffusion for Materials Generation [2023]
- Crystal Diffusion Variational Autoencoder for Periodic Material Generation [2021] - Pioneering work in crystal materials generation
- Denoising Diffusion Probabilistic Models [2020] - Pioneering work
- MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials [2024]
- AtomGPT: Atomistic Generative Pretrained Transformer for Forward and Inverse Materials Design [2024]
- Language models can generate molecules, materials, and protein binding sites directly in three dimensions as XYZ, CIF, and PDB files [2024]
- Crystal structure generation with autoregressive large language modeling [2024]
- GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation [2020]
- It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design [2024]
- Molecular Generative Model via Retrosynthetically Prepared Chemical Building Block Assembly [2023]
- Chemistry-intuitive explanation of graph neural networks for molecular property prediction with substructure masking [2023]
- From Black Boxes to Actionable Insights: A Perspective on Explainable Artificial Intelligence for Scientific Discovery [2023]
- Computer-aided multi-objective optimization in small molecule discovery [2023]
- Interpretable machine learning for knowledge generation in heterogeneous catalysis [2022]
- Explainable graph neural networks for organic cages [2022]
- Materials Precursor Score: Modeling Chemists’ Intuition for the Synthetic Accessibility of Porous Organic Cage Precursors [2021]
- Drug discovery with explainable artificial intelligence [2020]
- Leveraging Prompt Engineering in Large Language Models for Accelerating Chemical Research [2025]
- Fine-Tuned Language Models Generate Stable Inorganic Materials as Text [2024]
- Exploration of crystal chemical space using text-guided generative artificial intelligence [2024]
- Closing the Execution Gap in Generative AI for Chemicals and Materials: Freeways or Safeguards [2024] - Blog
- Generative Hierarchical Materials Search [2024]
- Generative Models as an Emerging Paradigm in the Chemical Sciences [2023] - Review
- Dismai-Bench: benchmarking and designing generative models using disordered materials and interfaces [2024] - Benchmarking
- matbench-genmetrics: A Python library for benchmarking crystal structure generative models using time-based splits of Materials Project structures [2024] - Benchmarking
- Fine-Tuned Language Models Generate Stable Inorganic Materials as Text [2024]
- Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models [2020] - Benchmarking
- System-Aware Neural ODE Processes for Few-Shot Bayesian Optimization [2024]
- A survey and benchmark of high-dimensional Bayesian optimization of discrete sequences [2024]
- Race to the bottom: Bayesian optimisation for chemical problems [2024]
- Bayesian reaction optimization as a tool for chemical synthesis [2021] - Nature
- Practical Bayesian Optimization of Machine Learning Algorithms [2012]
- A framework for evaluating the chemical knowledge and reasoning abilities of large language models against the expertise of chemists [2025] - Nature Chemistry
- A review of large language models and autonomous agents in chemistry [2025] - Review
- Language models for materials discovery and sustainability: Progress, challenges, and opportunities [2025] - Review
- 34 Examples of LLM Applications in Materials Science and Chemistry: Towards Automation, Assistants, Agents, and Accelerated Scientific Discovery [2025]
- AlchemBERT: Exploring Lightweight Language Models for Materials Informatics [2025]
- Foundation models for materials discovery – current state and future directions [2025]
- LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction [2024] - Benchmark
- MaScQA: investigating materials science knowledge of large language models [2024] - Question Answering
- Large Language Models as Molecular Design Engines [2024]
- Leveraging large language models for predictive chemistry [2024]
- Augmenting large language models with chemistry tools [2024]
- Foundational Large Language Models for Materials Research [2024]
- Scientific Large Language Models: A Survey on Biological & Chemical Domains [2024]
- A survey on multimodal large language models [2024]
- Chemical language modeling with structured state space sequence models [2024]
- Fine-tuning GPT-3 for machine learning electronic and functional properties of organic molecules [2024] - GPT-3
- Beyond designer's knowledge: Generating materials design hypotheses via large language models [2024]
- Structured information extraction from scientific text with large language models [2024]
- Regression with Large Language Models for Materials and Molecular Property Prediction [2024]
- Large Language Models for Material Property Predictions: elastic constant tensor prediction and materials design [2024]
- BartSmiles: Generative Masked Language Models for Molecular Representations [2024]
- Can large language models understand molecules? [2024] - Finetuning
- Multimodal language and graph learning of adsorption configuration in catalysis [2024] - Finetuning
- Catalyst Energy Prediction with CatBERTa: Unveiling Feature Exploration Strategies through Large Language Models [2023] - Finetuning
- LLM-Prop: Predicting Physical And Electronic Properties Of Crystalline Solids From Their Text Descriptions [2023]
- Large Language Models as Master Key: Unlocking the Secrets of Materials Science with GPT [2023]
- What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks [2023] - Benchmark
- ChemBERTa-2: Towards Chemical Foundation Models [2022]
- Chemformer: a pre-trained transformer for computational chemistry [2022]
- Large-scale chemical language representations capture molecular structure and properties [2022]
- ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction [2020]
- SMILES-BERT: Large Scale Unsupervised Pre-Training for Molecular Property Prediction [2019]
- Self-Driving Laboratories for Chemistry and Materials Science [2024] - Review
- Foundation Model for Material Science [2024]
- Multi-modal molecule structure–text model for text-based retrieval and editing [2023]
- Evaluating Self-Supervised Learning for Molecular Graph Embeddings [2023] - GNN embedding
- Benchmarking Large Language Models for Math Reasoning Tasks [2024] - Benchmark
- Machine Learning Methods for Small Data Challenges in Molecular Science [2023] - Small dataset
- Science-Driven Atomistic Machine Learning [2023]
- AI4Mat-ICLR 2025 [2025]
- AI4Mat-2023 workshop at NeurIPS 2023 [2023]
- RDKit - free
- Open Babel - free
- CODESSA - commercial
- DRAGON - commercial