Open-Awesome

DiffDock

MIT · Python · v1.1.3

A state-of-the-art diffusion model for predicting how small molecules (ligands) bind to proteins.

1.5k stars · 352 forks · 0 contributors

What is DiffDock?

DiffDock is an open-source implementation of a state-of-the-art diffusion model for molecular docking. It predicts the 3D binding pose of a small molecule (ligand) within a protein's active site, which is a critical step in computational drug discovery and structural biology. The method outputs both the predicted structure and a confidence score to help researchers assess the prediction's reliability.

Target Audience

Computational chemists, structural biologists, and drug discovery researchers who need to predict or analyze how potential drug molecules interact with protein targets. It is also relevant for machine learning practitioners interested in AI applications for science.

Value Proposition

Developers choose DiffDock for its accuracy on molecular docking benchmarks and for its diffusion-based approach, which returns a confidence estimate alongside each predicted pose. The project offers several interfaces (web, command line, Docker, and a graphical UI) and is built on an open-source codebase released under the MIT license.

Overview

Implementation of DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking

Use Cases

Best For

  • Predicting binding poses for novel small molecule candidates in early-stage drug discovery.
  • Benchmarking new docking algorithms against a state-of-the-art diffusion model.
  • Teaching concepts of molecular docking and AI in structural biology.
  • Rapidly screening ligand poses when experimental structures are unavailable.
  • Integrating a docking component into a larger computational drug design pipeline.
  • Research into protein-ligand interactions where confidence estimation is important.

Not Ideal For

  • Projects requiring quantitative binding affinity predictions, as DiffDock only outputs structural poses and confidence scores, not direct affinity measures.
  • Docking protein-protein or protein-nucleic acid complexes, since the model is specifically trained and tested only for small molecule ligands.
  • Environments without GPU acceleration, because inference runs significantly slower on CPU, making large-scale screenings impractical.
  • Users seeking a zero-configuration cloud API, as local setup requires conda or Docker environment management.

Pros & Cons

Pros

State-of-the-Art Accuracy

DiffDock-L achieves high performance on benchmarks like PDBBind and DockGen, as evidenced by the updated paper and provided evaluation scripts for replication.
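
Docking benchmarks of this kind are commonly summarized as the share of top-ranked poses that fall within 2 Å RMSD of the experimental ligand pose. The sketch below illustrates that metric only; it is not the project's evaluation script. It assumes both poses list the same atoms in the same order and ignores molecular symmetry, which a real evaluation would need to handle. The function names and toy coordinates are hypothetical.

```python
import math


def ligand_rmsd(pred, ref):
    """RMSD in angstroms between two poses given as lists of (x, y, z) tuples.

    Assumes identical atom ordering and applies no symmetry correction.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum((p[i] - r[i]) ** 2 for p, r in zip(pred, ref) for i in range(3))
    return math.sqrt(squared / len(pred))


def success_rate(rmsds, threshold=2.0):
    """Fraction of complexes whose top-ranked pose has RMSD below `threshold`."""
    if not rmsds:
        raise ValueError("need at least one RMSD value")
    return sum(r < threshold for r in rmsds) / len(rmsds)


# Toy two-atom ligand (hypothetical): the prediction is shifted 1 A along x.
ref = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pred = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
print(ligand_rmsd(pred, ref))               # 1.0
print(success_rate([0.8, 1.9, 3.4, 6.0]))   # 0.5
```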

Integrated Confidence Scoring

Outputs a confidence score for each predicted pose, with guidelines in the FAQ to help assess reliability, aiding in decision-making without external tools.
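
As an illustration of how the confidence score might be used downstream, the sketch below sorts predicted poses into rough reliability bands. Two things here are assumptions to verify against the current README: the thresholds (above 0 as high, between -1.5 and 0 as moderate, below -1.5 as low) and the output file naming pattern `rank<N>_confidence<score>.sdf`. The helper names are hypothetical.

```python
import re

# Assumed output naming: rank<N>_confidence<score>.sdf (verify against your own output).
_POSE_NAME = re.compile(r"rank(\d+)_confidence(-?\d+(?:\.\d+)?)\.sdf$")


def parse_pose_name(filename):
    """Return (rank, confidence) parsed from an assumed DiffDock-style file name."""
    match = _POSE_NAME.search(filename)
    if match is None:
        raise ValueError(f"unrecognized pose file name: {filename!r}")
    return int(match.group(1)), float(match.group(2))


def confidence_band(score):
    """Map a confidence score to a rough band (thresholds are an assumption)."""
    if score > 0:
        return "high"
    if score > -1.5:
        return "moderate"
    return "low"


# Hypothetical file names for one complex.
names = [
    "rank1_confidence0.43.sdf",
    "rank2_confidence-0.87.sdf",
    "rank3_confidence-2.10.sdf",
]
for name in names:
    rank, score = parse_pose_name(name)
    print(rank, score, confidence_band(score))
```

The bands are a triage aid, not a substitute for inspecting the pose; the score is not a binding affinity.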

Flexible Input Handling

Accepts proteins as PDB files or sequences (folded with ESMFold) and ligands as SMILES or various file formats, supporting diverse data sources from experiments or databases.
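
For batch runs, these inputs are typically collected in a CSV file with one row per complex. The sketch below builds such a file in memory. The column names (`complex_name`, `protein_path`, `ligand_description`, `protein_sequence`) are an assumption based on the project's example input and should be checked against the README; the paths, the sequence, and the helper names are hypothetical.

```python
import csv
import io

# Assumed column layout; confirm against the project's example CSV.
FIELDS = ["complex_name", "protein_path", "ligand_description", "protein_sequence"]


def make_row(name, ligand, protein_path="", protein_sequence=""):
    """Build one input row; the protein comes from a PDB path or a sequence, not both."""
    if bool(protein_path) == bool(protein_sequence):
        raise ValueError("give exactly one of protein_path or protein_sequence")
    return {
        "complex_name": name,
        "protein_path": protein_path,
        "ligand_description": ligand,  # a SMILES string or a path to a ligand file
        "protein_sequence": protein_sequence,
    }


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    # Protein from a PDB file, ligand as SMILES (aspirin).
    make_row("complex_a", "CC(=O)OC1=CC=CC=C1C(=O)O", protein_path="proteins/target_a.pdb"),
    # Protein from a sequence (hypothetical fragment), ligand from a file.
    make_row("complex_b", "ligands/lig_b.sdf", protein_sequence="MKTAYIAKQR"),
]
print(to_csv(rows))
```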

Multiple Deployment Options

Offers a web interface via Hugging Face Spaces, local CLI, Docker container, and a graphical UI, making it accessible for different user preferences and setups.

Cons

No Affinity Prediction

Explicitly does not predict binding affinity; users must integrate with other tools like GNINA or free energy calculations, adding complexity to workflows.

Small Molecule Limitation

Designed only for ligand-protein docking, not suitable for larger biomolecules like proteins or nucleic acids, requiring alternative methods for such interactions.

GPU Dependency for Performance

While CPU is supported, inference is significantly slower without a GPU, as noted in the README, which can limit throughput for large batches.
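
Before committing to a large batch, it can help to check which device is available and to project total runtime from a timing you measure yourself on a few complexes. The sketch below does both. It uses PyTorch's `torch.cuda.is_available()` when PyTorch is installed and falls back to CPU otherwise; the function names and the example timings are hypothetical, not measurements of DiffDock.

```python
def pick_device():
    """Return "cuda" if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed we cannot probe for a GPU; assume CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def estimate_hours(n_complexes, seconds_per_complex):
    """Project wall-clock hours from a per-complex timing you measured yourself."""
    if n_complexes < 0 or seconds_per_complex < 0:
        raise ValueError("inputs must be non-negative")
    return n_complexes * seconds_per_complex / 3600.0


device = pick_device()
print("running on", device)

# Hypothetical timings, for illustration only: measure your own hardware first.
measured_seconds = {"cuda": 10.0, "cpu": 300.0}
print(estimate_hours(10_000, measured_seconds[device]), "hours for 10,000 complexes")
```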

Quick Stats

Stars: 1,524
Forks: 352
Contributors: 0
Open Issues: 129
Last commit: 1 year ago
Created: 2022

Tags

#molecular-docking #diffusion-models #computational-biology #drug-discovery #binding #ai-for-science #docking #machine-learning #structural-bioinformatics

Built With

  • RDKit
  • Python
  • Docker
  • PyTorch

Included in

Computational Biology (122)

Related Projects

JTVAE

Junction Tree Variational Autoencoder for Molecular Graph Generation (ICML 2018)

Stars: 560
Forks: 196
Last commit: 3 years ago

Molecular Transformer

Molecular Transformer is a neural machine translation model adapted for chemistry that predicts chemical reaction outcomes and retrosynthetic pathways. It translates between molecular representations (SMILES strings) to forecast how molecules react or how target molecules can be synthesized, accelerating discovery in organic chemistry and drug development.

Key Features

  • Retrosynthesis Prediction: Predicts reactant molecules needed to synthesize a target product molecule.
  • Uncertainty Calibration: Provides confidence estimates for predictions, helping chemists assess reliability.
  • SMILES Tokenization: Uses custom tokenization of SMILES strings to treat molecules as sequences for transformer models.
  • Data Augmentation: Doubles training data by generating random equivalent SMILES representations via RDKit.
  • Pre-trained Models: Includes models trained on public datasets (USPTO_MIT, USPTO_STEREO) with mixed or separated reactant/reagent formats.

Philosophy

Molecular Transformer aims to make AI-assisted chemical reaction prediction accessible to organic chemists, with the goal of integrating these models into daily laboratory workflows to accelerate molecular discovery.

Stars: 425
Forks: 82
Last commit: 4 years ago

REINVENT

REINVENT is a reinforcement learning framework specifically designed for de novo drug design, enabling the generation of novel molecular structures with optimized properties. It addresses the challenge of discovering new chemical entities by combining generative models with property prediction to explore chemical space efficiently.

Key Features

  • Reinforcement Learning Pipeline: Uses RL to optimize molecular structures toward desired chemical properties and biological activities.
  • De Novo Molecular Generation: Creates entirely new molecular entities rather than modifying existing compounds.
  • Property Optimization: Incorporates scoring functions to guide generation toward molecules with specific target properties.
  • Template-Based Execution: Provides configurable JSON templates for different running modes and experiments.
  • TensorBoard Integration: Enables real-time monitoring and visualization of training logs and progress.

Philosophy

REINVENT applies reinforcement learning principles to drug discovery, treating molecular generation as an optimization problem where the agent learns to propose molecules that maximize desired chemical and biological properties.

Stars: 374
Forks: 113
Last commit: 1 year ago

TargetDiff

The official implementation of 3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction (ICLR 2023)

Stars: 341
Forks: 53
Last commit: 2 years ago