A curated list of deep learning implementations and resources for biological research, with a focus on genomics.
deeplearning-biology is a curated, community-maintained list of deep learning implementations, tools, and resources specifically for biological research. It aggregates code repositories, model zoos, and key papers to help researchers and developers apply state-of-the-art ML techniques to problems in genomics, protein biology, drug discovery, and more. The project solves the problem of fragmented information by providing a centralized, categorized reference.
It is aimed at bioinformaticians, computational biologists, and machine learning researchers or engineers who want to apply or understand deep learning in biological contexts. It is especially useful for those entering the field or looking for practical implementations beyond theoretical papers.
Unlike generic ML resource lists, it is domain-specific, implementation-focused, and community-driven. It saves significant time by filtering for biological relevance and code availability, and it emphasizes real-world tools over purely academic descriptions.
The list is organized by domain, such as genomics, protein biology, and chemoinformatics, with subcategories like variant calling and single-cell applications, so specific research areas are easy to find from the README's detailed table of contents.
It prioritizes projects with available code, such as the GitHub repositories for AlphaFold, DeepVariant, and ESM, so users get practical tools rather than only theoretical papers.
It encourages contributions to expand coverage, especially in underrepresented subfields, which helps keep the list growing and responsive to new developments; the README includes a call for contributions.
It includes landmark implementations such as AlphaFold for protein structure prediction and DNABERT for sequence modeling, along with model zoos such as Kipoi, giving direct access to cutting-edge research.
The README explicitly admits a bias towards genomics, so other areas like metabolomics or systems biology may have less comprehensive or up-to-date entries, limiting utility for those fields.
It functions as a static list without built-in tools for testing, comparing, or evaluating models; users must navigate to external repositories, which can involve complex setup and dependency management.
Updates and accuracy rely on voluntary contributions, so new resources may be added late and outdated entries may linger, as there is no automated or guaranteed review process.