A transformer-based model for predicting drug-target interactions using substructural pattern mining and augmented transformer encoders.
MolTrans is a molecular interaction transformer model that predicts drug-target interactions using deep learning. It addresses key challenges in computational drug discovery by incorporating substructural pattern mining and leveraging unlabeled biomedical data to improve prediction accuracy and interpretability.
Bioinformatics researchers, computational chemists, and drug discovery scientists working on in-silico drug development who need accurate and interpretable drug-target interaction predictions.
MolTrans offers more accurate and interpretable predictions than existing methods by explicitly modeling substructural interactions and effectively utilizing both labeled and unlabeled molecular data through augmented transformer encoders.
MolTrans: Molecular Interaction Transformer for Drug Target Interaction Prediction (Bioinformatics)
MolTrans uses knowledge-inspired substructural pattern mining to identify the molecular substructures relevant to an interaction, which makes its predictions easier to explain than those of black-box models.
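To make the idea of splitting a molecule into substructures concrete, here is a minimal sketch, not the MolTrans implementation. It assumes a vocabulary of frequent substructures has already been mined, and it splits a SMILES string by greedy longest match against that vocabulary. The function name `decompose`, the vocabulary `HYPOTHETICAL_VOCAB`, and the example string are all invented for illustration.

```python
def decompose(sequence, vocab):
    """Split a sequence into substructures by greedy longest match.

    Assumption: `vocab` is a pre-mined set of frequent substrings.
    Characters that match no vocabulary entry are emitted one at a time.
    """
    max_len = max(len(token) for token in vocab)
    pieces = []
    i = 0
    while i < len(sequence):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(sequence) - i), 0, -1):
            candidate = sequence[i:i + size]
            if size == 1 or candidate in vocab:
                pieces.append(candidate)
                i += size
                break
    return pieces


# Hypothetical vocabulary; a real one would be mined from a large corpus.
HYPOTHETICAL_VOCAB = {"CC", "(=O)", "N", "c1ccccc1", "O"}

smiles = "CC(=O)Nc1ccccc1O"
pieces = decompose(smiles, HYPOTHETICAL_VOCAB)
print(pieces)  # ['CC', '(=O)', 'N', 'c1ccccc1', 'O']
```

Each piece can then be mapped to an index and embedded, so that a downstream model can attribute a prediction to specific substructures rather than to the molecule as a whole.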
It incorporates augmented transformer encoders that capture semantic relations among substructures from massive unlabeled biomedical data, a resource that existing methods commonly leave unused, with the aim of improving accuracy.
The repository includes processed datasets like BindingDB, DAVIS, and BIOSNAP with various experimental configurations, saving significant data preparation time for researchers.
The repository provides an example Jupyter notebook and a train.py script for running experiments, allowing quick testing and benchmarking on the supported datasets.
The README states that more code and tests will be added, which suggests the current version may lack comprehensive testing or features and could be risky for stable deployment.
As a deep learning model built on transformer encoders, it is likely to need GPU resources and considerable training time, which may not be feasible for users with limited hardware.
Users are directed to external pages for dataset details, and the setup instructions are minimal, which can hinder customization and troubleshooting for new users.