Open-Awesome

TargetDiff

Python

Official implementation of a 3D equivariant diffusion model for generating drug-like molecules that bind to specific protein targets and predicting their binding affinity.

338 stars · 53 forks · 0 contributors

What is TargetDiff?

TargetDiff is an open-source machine learning framework for structure-based drug discovery. It generates novel 3D molecular structures that are likely to bind to a specific protein target using an equivariant diffusion model and predicts their binding affinity. It addresses the challenge of designing drug-like molecules with desired binding properties from scratch.

Target Audience

Computational chemists, bioinformaticians, and machine learning researchers working on AI-driven drug discovery, molecular generation, and protein-ligand interaction prediction.

Value Proposition

It provides a unified, geometry-aware pipeline for both target-conditioned molecule generation and affinity prediction, leveraging SE(3)-equivariant networks for accurate 3D modeling and integrating with established docking tools for validation.

Overview

The official implementation of "3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction" (ICLR 2023).

Use Cases

Best For

  • Generating novel drug candidates for a specific protein target
  • Predicting binding affinity of protein-ligand complexes
  • Academic research in AI for drug discovery
  • Benchmarking molecular generation models
  • Exploring structure-activity relationships in silico
  • Teaching concepts of equivariant neural networks in chemistry

Not Ideal For

  • Projects requiring quick, out-of-the-box deployment without extensive environment setup and dependency management
  • Teams without access to GPU resources for training or sampling with the diffusion models
  • Applications focused solely on 2D molecular properties or ligand-based design without protein structure data
  • Production environments needing robust, commercial-grade support and frequent model updates

Pros & Cons

Pros

Target-Conditioned Generation

Generates 3D molecular structures specifically for protein binding pockets using data from CrossDocked2020, with scripts for pocket extraction and sampling from PDB files, enabling precise drug design.
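
As a rough illustration of what pocket extraction involves, the sketch below keeps every residue that has at least one atom within a cutoff distance of any ligand atom. It is not the repository's script: the function name `extract_pocket_residues`, the tuple-based data layout, and the 10 Å default cutoff are all assumptions made for this example.

```python
import math


def extract_pocket_residues(protein_atoms, ligand_coords, cutoff=10.0):
    """Return sorted IDs of residues with any atom within `cutoff` of the ligand.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples (hypothetical layout).
    ligand_coords: iterable of (x, y, z) tuples.
    cutoff: distance threshold in angstroms (10.0 is an assumed default).
    """
    ligand_coords = list(ligand_coords)
    pocket = set()
    for residue_id, coord in protein_atoms:
        if residue_id in pocket:
            continue  # residue already selected by one of its earlier atoms
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_coords):
            pocket.add(residue_id)
    return sorted(pocket)


# Toy data: residues 1 and 2 sit near the ligand, residue 3 is far away.
protein = [
    (1, (0.0, 0.0, 0.0)),
    (1, (1.5, 0.0, 0.0)),
    (2, (8.0, 0.0, 0.0)),
    (3, (30.0, 0.0, 0.0)),
]
ligand = [(2.0, 0.0, 0.0), (3.0, 1.0, 0.0)]
pocket_ids = extract_pocket_residues(protein, ligand)
```

Selecting whole residues rather than individual atoms keeps the pocket chemically complete, at the cost of including a few atoms that lie beyond the cutoff.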

Equivariant Neural Networks

Utilizes SE(3)-equivariant diffusion models to respect 3D symmetries, improving geometric accuracy as validated in the ICLR 2023 paper and benchmark comparisons against baselines like Pocket2Mol.
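
To make the symmetry property concrete, the sketch below checks equivariance numerically for a toy coordinate update: applying the update and then a rotation plus translation should give the same result as transforming first and updating afterwards. The toy update (pulling points toward their centroid) is only a stand-in chosen for this example and is not TargetDiff's network; every function name here is hypothetical.

```python
import math


def transform(points, angle, shift):
    """Rotate each point about the z-axis by `angle`, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


def toy_update(points, step=0.1):
    """Equivariant toy update: move every point a fraction of the way to the centroid."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return [tuple(p[i] + step * (centroid[i] - p[i]) for i in range(3)) for p in points]


def biased_update(points):
    """Non-equivariant counterexample: always shift along the fixed x-axis."""
    return [(x + 1.0, y, z) for x, y, z in points]


def max_abs_diff(a, b):
    """Largest coordinate-wise difference between two point lists."""
    return max(abs(u - v) for p, q in zip(a, b) for u, v in zip(p, q))


coords = [(0.0, 0.0, 0.0), (1.2, 0.3, -0.5), (-0.7, 2.0, 1.1)]
angle, shift = 0.9, (3.0, -1.0, 0.5)

# Equivariant case: both orders of operations agree up to rounding error.
equivariance_error = max_abs_diff(
    transform(toy_update(coords), angle, shift),
    toy_update(transform(coords, angle, shift)),
)

# Counterexample: the fixed-axis shift does not commute with rotation.
biased_error = max_abs_diff(
    transform(biased_update(coords), angle, shift),
    biased_update(transform(coords, angle, shift)),
)
```

The same commute-with-the-transform check can be applied to any coordinate-updating layer as a quick sanity test.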

Integrated Affinity Prediction

Predicts binding affinity via supervised learning on PDBBind data, with inference scripts for real complexes, achieving an RMSE of 1.316 on test sets and leveraging generative features for enhanced performance.
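
For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between predicted and measured values, so lower is better. The sketch below shows the calculation on made-up numbers; the values and the function name `rmse` are illustrative assumptions and are not results or code from the repository.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up affinity values for four hypothetical complexes.
predicted_affinity = [6.1, 7.4, 5.0, 8.2]
observed_affinity = [6.5, 7.0, 5.5, 9.0]
score = rmse(predicted_affinity, observed_affinity)
```

Because errors are squared before averaging, a single badly mispredicted complex raises RMSE more than several small misses of the same total size.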

Docking Validation

Integrates with AutoDock Vina for in silico evaluation, providing metafiles for benchmarking and reproducibility, as detailed in the evaluation section with multiple docking modes.

Cons

Complex Setup and Dependencies

Requires specific versions of PyTorch, CUDA, and other packages via Conda and Pip, with additional tools like AutoDockTools_py3, making installation error-prone and time-consuming.

Incomplete or Outdated Models

The README states that the supervised learning checkpoints for PDBBind v2020 were lost and that only v2016 models are available, which limits accuracy on newer data and reflects maintenance gaps.

High Computational Demands

Training and sampling rely heavily on GPUs, and docking evaluation is time-consuming, as noted in the evaluation scripts, which restricts use in resource-constrained settings.

Cumbersome Data Handling

Training requires downloading large datasets from Google Drive and running multiple preprocessing scripts, which can be daunting and prone to failure without careful manual intervention.

Quick Stats

Stars: 338
Forks: 53
Contributors: 0
Open Issues: 13
Last commit: 2 years ago
Created: 2023

Tags

#diffusion-models #drug-discovery #ai-for-science #structural-biology #computational-chemistry #machine-learning #drug-design #pytorch

Built With

  • PyTorch Geometric
  • RDKit
  • CUDA
  • Python
  • PyTorch

Included in

Computational Biology (122)

Related Projects

DiffDock

Implementation of DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking

Stars: 1,511
Forks: 349
Last commit: 1 year ago

JTVAE

Junction Tree Variational Autoencoder for Molecular Graph Generation (ICML 2018)

Stars: 556
Forks: 197
Last commit: 3 years ago

Molecular Transformer

Molecular Transformer is a neural machine translation model adapted for chemistry that predicts chemical reaction outcomes and retrosynthetic pathways. It translates between molecular representations (SMILES strings) to forecast how molecules react or how target molecules can be synthesized, accelerating discovery in organic chemistry and drug development.

Key Features

  • Retrosynthesis Prediction: Predicts reactant molecules needed to synthesize a target product molecule.
  • Uncertainty Calibration: Provides confidence estimates for predictions, helping chemists assess reliability.
  • SMILES Tokenization: Uses custom tokenization of SMILES strings to treat molecules as sequences for transformer models.
  • Data Augmentation: Doubles training data by generating random equivalent SMILES representations via RDKit.
  • Pre-trained Models: Includes models trained on public datasets (USPTO_MIT, USPTO_STEREO) with mixed or separated reactant/reagent formats.

Philosophy

Molecular Transformer aims to make AI-assisted chemical reaction prediction accessible to organic chemists, with the goal of integrating these models into daily laboratory workflows to accelerate molecular discovery.

Stars: 424
Forks: 82
Last commit: 4 years ago

REINVENT

REINVENT is a reinforcement learning framework specifically designed for de novo drug design, enabling the generation of novel molecular structures with optimized properties. It addresses the challenge of discovering new chemical entities by combining generative models with property prediction to explore chemical space efficiently.

Key Features

  • Reinforcement Learning Pipeline: Uses RL to optimize molecular structures toward desired chemical properties and biological activities.
  • De Novo Molecular Generation: Creates entirely new molecular entities rather than modifying existing compounds.
  • Property Optimization: Incorporates scoring functions to guide generation toward molecules with specific target properties.
  • Template-Based Execution: Provides configurable JSON templates for different running modes and experiments.
  • TensorBoard Integration: Enables real-time monitoring and visualization of training logs and progress.

Philosophy

REINVENT applies reinforcement learning principles to drug discovery, treating molecular generation as an optimization problem where the agent learns to propose molecules that maximize desired chemical and biological properties.

Stars: 373
Forks: 113
Last commit: 11 months ago