A collection of cheminformatics and machine-learning software for molecular informatics, written in C++ with Python wrappers.
RDKit is an open-source cheminformatics and machine-learning library written primarily in C++ with Python bindings. It provides tools for molecular informatics, including 2D/3D molecular operations, chemical descriptor calculation, fingerprint generation, and database integration. The project aims to be a performant, extensible toolkit for both research and production work in chemical data analysis.
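As a rough illustration of the descriptor and fingerprint features mentioned above, here is a minimal Python sketch. It assumes the `rdkit` package is installed (for example via conda). The helper names `featurize` and `tanimoto`, the chosen descriptors, and the fingerprint settings (radius 2, 2048 bits) are illustrative choices for this example, not defaults recommended by the project.

```python
def featurize(smiles, radius=2, n_bits=2048):
    """Parse a SMILES string; return (mol, descriptor dict, Morgan fingerprint).

    `featurize` is a hypothetical helper for this sketch, not an RDKit API.
    RDKit is imported inside the function so the sketch can be loaded even
    where the package is not installed.
    """
    from rdkit import Chem
    from rdkit.Chem import Descriptors, rdFingerprintGenerator

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # MolFromSmiles returns None when parsing fails
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # A few commonly used descriptors; many more live in rdkit.Chem.Descriptors.
    descriptors = {
        "mol_wt": Descriptors.MolWt(mol),
        "logp": Descriptors.MolLogP(mol),
        "h_donors": Descriptors.NumHDonors(mol),
    }

    # Fixed-length Morgan (circular) bit-vector fingerprint.
    generator = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=n_bits)
    fingerprint = generator.GetFingerprint(mol)
    return mol, descriptors, fingerprint


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two RDKit bit-vector fingerprints."""
    from rdkit import DataStructs

    return DataStructs.TanimotoSimilarity(fp_a, fp_b)


if __name__ == "__main__":
    _, ethanol_desc, ethanol_fp = featurize("CCO")   # ethanol
    _, _, propanol_fp = featurize("CCCO")            # 1-propanol
    print(ethanol_desc)
    print("similarity:", tanimoto(ethanol_fp, propanol_fp))
```

The descriptor dictionary can be fed directly into a machine-learning pipeline as numeric features, while the fingerprints are the usual input for similarity searches.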
RDKit is aimed at cheminformatics researchers, computational chemists, bioinformatics developers, and data scientists who work with molecular data and need robust libraries for chemical structure manipulation and machine-learning feature generation.
Developers choose RDKit for its high-performance C++ core, comprehensive feature set for cheminformatics, business-friendly BSD license, and extensive language bindings that enable integration into diverse scientific workflows. Its active community and regular user group meetings provide strong support and continuous development.
The official sources for the RDKit library
Efficient C++ implementations of the molecular data structures keep operations fast, which is crucial for large-scale cheminformatics tasks.
With Python, Java, C#, JavaScript, and CFFI interfaces, RDKit integrates into diverse tech stacks, making it accessible for different development environments.
Includes 2D/3D operations, descriptor and fingerprint generation, and a PostgreSQL cartridge for database searches, covering a wide range of research and production needs.
Yearly user group meetings and a Contrib folder for community extensions keep development active and give users additional resources.
While Python installation is easy via conda, setting up for Java, C#, or JavaScript requires additional dependencies and compilation, which can be cumbersome.
The JavaScript wrapper is generated with emscripten and may not have full feature parity or optimal performance for real-time web applications.
The breadth of the toolkit and its reliance on cheminformatics concepts can be overwhelming for developers without a chemistry background, even with the available documentation.