A state-of-the-art diffusion model for predicting how small molecules (ligands) bind to proteins.
DiffDock is an open-source implementation of a diffusion model for molecular docking. It predicts the 3D pose in which a small molecule (ligand) binds to a protein, a critical step in computational drug discovery and structural biology. Each predicted pose comes with a confidence score to help researchers assess its reliability.
DiffDock is aimed at computational chemists, structural biologists, and drug discovery researchers who need to predict or analyze how candidate drug molecules interact with protein targets. It is also relevant to machine learning practitioners interested in AI applications for science.
Developers choose DiffDock for its accuracy on molecular docking benchmarks, where it is reported as state-of-the-art, and for its diffusion-based approach, which attaches a confidence estimate to each prediction. The project is actively maintained, offers several interfaces, and is built on an open-source codebase.
Implementation of DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking
DiffDock-L achieves high performance on benchmarks like PDBBind and DockGen, as evidenced by the updated paper and provided evaluation scripts for replication.
Outputs a confidence score for each predicted pose, with guidelines in the FAQ to help assess reliability, aiding in decision-making without external tools.
Accepts proteins as PDB files or sequences (folded with ESMFold) and ligands as SMILES or various file formats, supporting diverse data sources from experiments or databases.
Offers a web interface via Hugging Face Spaces, local CLI, Docker container, and a graphical UI, making it accessible for different user preferences and setups.
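The per-pose confidence score mentioned above lends itself to a simple ranking step. The following is a minimal sketch, not DiffDock's own code: the `Pose` type, the pose names, and the cutoff values `0.0` and `-1.5` are illustrative assumptions, and the actual interpretation guidelines should be taken from the project's FAQ.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    name: str          # hypothetical identifier, e.g. an output file name
    confidence: float  # confidence score; higher means more reliable


def bucket(confidence: float, high: float = 0.0, low: float = -1.5) -> str:
    """Map a score to a coarse label. The default cutoffs are assumptions."""
    if confidence > high:
        return "high"
    if confidence > low:
        return "moderate"
    return "low"


def rank_poses(poses: list[Pose]) -> list[tuple[Pose, str]]:
    """Sort poses from most to least confident and attach a label to each."""
    ordered = sorted(poses, key=lambda p: p.confidence, reverse=True)
    return [(p, bucket(p.confidence)) for p in ordered]


# Made-up scores, purely for illustration.
poses = [Pose("pose_a", -2.1), Pose("pose_b", 0.4), Pose("pose_c", -0.7)]
ranked = rank_poses(poses)
for pose, label in ranked:
    print(f"{pose.name}: {pose.confidence:+.2f} ({label})")
```

Keeping the cutoffs as parameters makes it easy to tighten or relax them for a particular target class once the FAQ guidance has been checked.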
Explicitly does not predict binding affinity; users must integrate with other tools like GNINA or free energy calculations, adding complexity to workflows.
Designed only for docking small-molecule ligands to proteins; it is not suitable when the binding partner is itself a larger biomolecule, such as another protein or a nucleic acid, and those interactions require alternative methods.
While CPU is supported, inference is significantly slower without a GPU, as noted in the README, which can limit throughput for large batches.
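For the large-batch case, inputs are usually collected in a single table so that one inference run can process many complexes. The sketch below builds such a table with Python's standard `csv` module. The column names in `FIELDS` are an assumption about the batch input layout and should be verified against the README; the file name `receptor.pdb`, the example SMILES strings, and the short sequence are hypothetical placeholders.

```python
import csv
import io

# Assumed column layout for a batch input table; verify against the README.
FIELDS = ["complex_name", "protein_path", "ligand_description", "protein_sequence"]


def make_row(name: str, ligand: str, protein_path: str = "",
             protein_sequence: str = "") -> dict[str, str]:
    """Build one row; the protein is given as a PDB path or as a sequence."""
    if not protein_path and not protein_sequence:
        raise ValueError("provide protein_path or protein_sequence")
    return {
        "complex_name": name,
        "protein_path": protein_path,
        "ligand_description": ligand,  # a SMILES string or a ligand file path
        "protein_sequence": protein_sequence,
    }


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = [
    make_row("ex1", "CCO", protein_path="receptor.pdb"),    # hypothetical file
    make_row("ex2", "c1ccccc1", protein_sequence="MKTAYIAKQR"),  # placeholder sequence
]
csv_text = to_csv(rows)
print(csv_text)
```

Writing the table once and running a single batch job avoids reloading the model for every complex, which matters most when inference runs on CPU.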