Open-Awesome

ProteinGym

MIT · HTML · PG_v1.3

A comprehensive benchmark suite for evaluating protein fitness prediction models using deep mutational scanning and clinical variant data.

442 stars · 58 forks · 0 contributors

What is ProteinGym?

ProteinGym is a large-scale benchmark suite for evaluating protein fitness prediction models. It provides curated datasets from deep mutational scanning experiments and annotated human clinical variants to enable standardized comparisons of computational methods that predict how mutations affect protein function. The project addresses the need for rigorous, reproducible evaluation in protein engineering and variant interpretation.

Target Audience

Computational biologists, bioinformaticians, and machine learning researchers developing or applying models for protein fitness prediction, variant effect analysis, and protein design.

Value Proposition

ProteinGym offers a community-maintained benchmark with diverse datasets, standardized metrics, and a leaderboard of state-of-the-art baselines, so new protein fitness prediction models can be compared against existing ones on equal terms.

Overview

Official repository for the ProteinGym benchmarks

Use Cases

Best For

  • Benchmarking new protein fitness prediction models against established baselines
  • Evaluating model performance on specific mutation types like substitutions or indels
  • Assessing predictive accuracy across different protein families and functional categories
  • Reproducing published results in protein variant effect prediction
  • Curating standardized datasets for training supervised protein fitness models
  • Comparing MSA-based, single-sequence, and structure-based prediction approaches
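The benchmarking use cases above usually come down to scoring each assay separately and then averaging the per-assay scores within groups such as functional categories. The following is a minimal sketch of that idea in plain Python, not ProteinGym's own evaluation code. The row layout, the assay identifiers, the category names, and the helper names (`average_ranks`, `spearman`, `benchmark_summary`) are all hypothetical, and the toy numbers are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def benchmark_summary(rows):
    """rows: iterable of (assay_id, category, measured_fitness, predicted_score)."""
    per_assay = defaultdict(lambda: ([], []))
    category_of = {}
    for assay_id, category, measured, predicted in rows:
        per_assay[assay_id][0].append(measured)
        per_assay[assay_id][1].append(predicted)
        category_of[assay_id] = category

    # One Spearman score per assay, so large assays do not dominate small ones.
    assay_scores = {a: spearman(m, p) for a, (m, p) in per_assay.items()}

    # Average assay scores within each category, then average the categories.
    by_category = defaultdict(list)
    for assay_id, score in assay_scores.items():
        by_category[category_of[assay_id]].append(score)
    category_scores = {c: mean(s) for c, s in by_category.items()}

    return assay_scores, category_scores, mean(list(category_scores.values()))


# Toy data (hypothetical assays and categories, invented values).
rows = [
    ("assay_A", "Stability", 0.1, 1.0),
    ("assay_A", "Stability", 0.5, 2.0),
    ("assay_A", "Stability", 0.9, 3.0),
    ("assay_B", "Activity", 0.2, 3.0),
    ("assay_B", "Activity", 0.4, 2.0),
    ("assay_B", "Activity", 0.8, 1.0),
]
assay_scores, category_scores, overall = benchmark_summary(rows)
print(assay_scores, category_scores, overall)
```

Averaging per assay first and per group second keeps one very large assay from dominating the summary; whether that matches the grouping used on the public leaderboard should be checked against the project's own documentation.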

Not Ideal For

  • Teams developing proprietary protein design tools that cannot share model scores publicly
  • Researchers with limited computational resources or storage for multi-gigabyte datasets
  • Projects requiring real-time mutation effect prediction in production environments

Pros & Cons

Pros

Extensive Benchmark Coverage

Includes ~2.7M missense variants across 217 DMS assays in the substitution benchmark and ~300k mutants in the indel benchmark, giving a large and diverse evaluation set, as detailed in the Overview section of the project README.

Comprehensive Performance Metrics

Reports Spearman correlation, AUC, MCC, NDCG, and Top-K recall across zero-shot and supervised settings, so models are assessed in more than one regime, as specified in the Results section of the project README.
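For readers less familiar with the classification and ranking metrics named above, here is a small sketch using generic textbook definitions in plain Python. It is not the benchmark's implementation: the binarization thresholds, the choice of K, the use of raw measured fitness as NDCG gain, and all function names are assumptions made for illustration, and the toy values are invented.

```python
import math


def binarize(values, threshold):
    """Label a value 1 if it exceeds the (assumed) threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0.0 if undefined)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k_recall(measured, scores, k):
    """Fraction of the k best measured variants found among the k top-scored ones."""
    idx = range(len(measured))
    true_top = set(sorted(idx, key=lambda i: measured[i], reverse=True)[:k])
    pred_top = set(sorted(idx, key=lambda i: scores[i], reverse=True)[:k])
    return len(true_top & pred_top) / k


def ndcg(gains, scores, k):
    """NDCG@k for non-negative gains, ranking variants by predicted score."""
    def dcg(ordered):
        return sum(g / math.log2(pos + 2) for pos, g in enumerate(ordered[:k]))

    by_score = [g for _, g in sorted(zip(scores, gains), key=lambda t: t[0], reverse=True)]
    ideal = sorted(gains, reverse=True)
    return dcg(by_score) / dcg(ideal)


# Toy data: four variants of one hypothetical assay.
measured = [0.9, 0.8, 0.3, 0.1]     # experimental fitness
predicted = [2.0, 0.5, 1.0, -1.0]   # model scores

true_labels = binarize(measured, 0.5)    # assumed fitness cutoff
pred_labels = binarize(predicted, 0.75)  # assumed score cutoff

print("MCC:", mcc(true_labels, pred_labels))
print("AUC:", auc(true_labels, predicted))
print("Top-2 recall:", top_k_recall(measured, predicted, 2))
print("NDCG@4:", ndcg(measured, predicted, 4))
```

AUC and MCC need binary labels and so depend on how fitness is thresholded, while Top-K recall and NDCG focus on whether the best variants are ranked first, which matters most for design applications.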

Public Leaderboard Transparency

Hosts an interactive website with performance rankings and detailed result files, which makes comparison and reproduction easier, as highlighted under Key Features and Results.

Community-Driven Contributions

Openly accepts new assays and baselines through GitHub issues and PRs with clear criteria, fostering collaborative development as described in the 'How to contribute?' section.

Cons

Complex Initial Setup

Requires downloading multiple large files (e.g., 17.8 GB for clinical MSAs), configuring paths in scripts, and running command-line tools, as outlined in the Usage and reproducibility section of the project README. This can be daunting for new users.

Limited Model Inclusion

Accepts only open-source models that can score every mutant in the benchmarks, as stated in the 'New baselines' criteria. Proprietary methods are therefore excluded, which may narrow the benchmark's scope.

Incomplete Code Integration

Supervised model training code lives in a separate repository (ProteinNPT) and is not fully integrated into this project, as noted in the 'Notes' section under contributions.

Quick Stats

Stars: 442
Forks: 58
Contributors: 0
Open Issues: 19
Last commit: 3 months ago
Created: 2022

Tags

#variant-effect-prediction #protein #computational-biology #protein-design #bioinformatics #open-data #protein-engineering #machine-learning #benchmark

Built With

Python

Included in

Computational Biology (122)

Related Projects

MOSES

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Stars: 979
Forks: 280
Last commit: 2 years ago
TAPE (Tasks Assessing Protein Embeddings)

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.

Stars: 740
Forks: 135
Last commit: 3 years ago
GuacaMol

Benchmarks for generative chemistry

Stars: 525
Forks: 99
Last commit: 2 years ago
scIB (Single-cell Integration Benchmarks)

Benchmarking analysis of data integration tools

Stars: 423
Forks: 76
Last commit: 2 months ago