A conversational AI framework for editing small molecules, peptides, and proteins using retrieval-augmented generation and domain feedback.
ChatDrug is a research framework that uses large language models (LLMs) for conversational editing of drug molecules, including small molecules, peptides, and proteins. It combines retrieval-augmented generation with domain-specific feedback to iteratively refine molecular structures based on natural language instructions and biochemical property evaluations.
It is aimed at computational chemists, bioinformaticians, and AI researchers working on drug discovery and protein engineering who need interactive tools for molecular design and optimization.
It integrates conversational AI with domain-aware feedback mechanisms, making drug editing more intuitive and guided than traditional computational methods or standalone LLMs that lack biochemical grounding.
LLM for Drug Editing, ICLR 2024
Uses a knowledge base to augment LLM responses, improving the relevance of molecular edits, as implemented in the retrieval-augmented generation module that pulls from curated datasets.
Incorporates domain feedback such as binding affinity and solubility evaluations to direct edits toward desired therapeutic profiles, as evidenced by its integration with MHCflurry for peptides and ProteinDT for proteins.
Handles small molecules, peptides, and proteins in a single framework, allowing versatile drug editing tasks without switching tools, as shown in the supported task types and evaluation metrics.
Offers a fast mode for protein editing to reduce computational overhead, addressing performance bottlenecks with the --fast_protein flag that accelerates retrieval and evaluation steps.
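The retrieval and feedback points above can be pictured as a short loop: propose an edit, score it with a domain evaluator, and if it falls short, retrieve a similar known-good example to guide the next round. The following is a minimal sketch under stated assumptions, not ChatDrug's actual code. Every name in it (`conversational_edit`, `toy_llm`, `toy_score`, `KNOWLEDGE_BASE`) is hypothetical, the "LLM" is a stub function, and the score is a toy character count rather than a real biochemical property.

```python
import difflib
from typing import Callable, Optional

# Hypothetical stand-in for a curated retrieval database.
KNOWLEDGE_BASE = ["CCO", "CCOCO", "OCCOCO"]


def toy_score(molecule: str) -> float:
    """Toy property: number of 'O' characters (NOT real chemistry)."""
    return float(molecule.count("O"))


def toy_llm(start: str, hint: Optional[str]) -> str:
    """Stub for an LLM call: follows the retrieved hint if one is given."""
    return hint if hint is not None else start + "C"


def make_retriever(baseline: float, threshold: float) -> Callable[[str], Optional[str]]:
    """Build a retriever that returns the passing entry most similar to a failed candidate."""
    def retrieve(candidate: str) -> Optional[str]:
        passing = [m for m in KNOWLEDGE_BASE if toy_score(m) - baseline >= threshold]
        if not passing:
            return None
        return max(
            passing,
            key=lambda m: difflib.SequenceMatcher(None, candidate, m).ratio(),
        )
    return retrieve


def conversational_edit(
    start: str,
    propose: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str], float],
    retrieve: Callable[[str], Optional[str]],
    threshold: float,
    max_rounds: int = 3,
) -> Optional[str]:
    """Return the first candidate that improves the score by `threshold`, else None."""
    baseline = evaluate(start)
    hint: Optional[str] = None
    for _ in range(max_rounds):
        candidate = propose(start, hint)
        if evaluate(candidate) - baseline >= threshold:
            return candidate  # domain feedback accepts the edit
        hint = retrieve(candidate)  # feedback failed: fetch a guiding example
    return None


start = "CC"
threshold = 2.0
result = conversational_edit(
    start,
    propose=toy_llm,
    evaluate=toy_score,
    retrieve=make_retriever(toy_score(start), threshold),
    threshold=threshold,
)
```

In this toy run the first proposal fails the check, so the retriever supplies a similar passing example and the second proposal is accepted. In a real setting the evaluator would be a property predictor and the proposer an LLM call.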
Requires extensive environment configuration with conda, multiple pip installs, and manual downloads of datasets and models from Hugging Face, making deployment non-trivial for new users.
Relies on the OpenAI API for conversational LLMs, which introduces usage costs and exposure to service downtime; the usage instructions require setting an API key in the utility file.
Evaluation is tied to specific tools such as RDKit and MHCflurry, which may not cover every biochemical property needed for comprehensive drug design and so limit flexibility in assessment.
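On the API key requirement noted above, one common alternative to hard-coding a key in a source file is to read it from the environment and fail early when it is missing. The sketch below is an assumption about how a user might arrange this, not part of ChatDrug's documented setup. The helper `load_openai_api_key` is hypothetical; `OPENAI_API_KEY` is the variable name conventionally used by OpenAI client libraries.

```python
import os


def load_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration only.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "export it before running any LLM-backed editing task."
        )
    return key
```

Failing at startup avoids discovering a missing key only after datasets and models have already been loaded.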