Open-Awesome

Cromwell

BSD-3-Clause · Scala · 92

An open-source workflow management system for bioinformatics that scales from one-off use cases to massive production environments.

1.1k stars · 381 forks · 0 contributors

What is Cromwell?

Cromwell is an open-source workflow management system specifically designed for bioinformatics that executes workflows written in the Workflow Description Language (WDL). It solves the problem of orchestrating complex computational pipelines in scientific research, enabling seamless scaling from small experiments to massive production environments. The system provides a robust engine that handles distributed execution across various cloud platforms and on-premise infrastructure.
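
To give a sense of what a WDL workflow looks like, here is a minimal sketch in Python that writes a small WDL 1.0 workflow and a matching inputs file to disk. The workflow name `hello_world`, the task `say_hello`, the file names, and the helper `write_hello_workflow` are illustrative assumptions; none of them come from Cromwell or from this listing.

```python
import json
from pathlib import Path

# A minimal WDL 1.0 workflow: one task that echoes a greeting.
# All names here (hello_world, say_hello, name, greeting) are hypothetical.
HELLO_WDL = """\
version 1.0

workflow hello_world {
  input {
    String name
  }
  call say_hello { input: name = name }
  output {
    String greeting = say_hello.greeting
  }
}

task say_hello {
  input {
    String name
  }
  command <<<
    echo "Hello, ~{name}!"
  >>>
  output {
    String greeting = read_string(stdout())
  }
}
"""


def write_hello_workflow(directory):
    """Write the example workflow and its inputs JSON; return both paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    wdl_path = directory / "hello.wdl"
    inputs_path = directory / "hello.inputs.json"

    wdl_path.write_text(HELLO_WDL)
    # Workflow-level inputs are keyed as "<workflow name>.<input name>".
    inputs_path.write_text(json.dumps({"hello_world.name": "Cromwell"}, indent=2))
    return wdl_path, inputs_path
```

Calling `write_hello_workflow("demo")` would leave `demo/hello.wdl` and `demo/hello.inputs.json` on disk, ready to hand to a WDL engine such as Cromwell.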

Target Audience

Bioinformatics researchers, computational biologists, and data scientists who need to orchestrate and scale complex computational workflows for genomic analysis and other life science applications.

Value Proposition

Developers choose Cromwell because it offers a specialized, scalable solution for bioinformatics workflow management with strong WDL language support, modular backend architecture for cloud flexibility, and the ability to self-host for specialized requirements. It bridges the gap between academic research and production-scale bioinformatics pipelines.

Overview

Scientific workflow engine designed for simplicity and scalability. Transition trivially from one-off use cases to massive-scale production environments.

Use Cases

Best For

  • Orchestrating genomic analysis pipelines in research environments
  • Scaling bioinformatics workflows from development to production
  • Managing complex computational workflows across cloud platforms
  • Self-hosting workflow management systems for specialized bioinformatics needs
  • Executing WDL-based workflows in academic or institutional settings
  • Building reproducible scientific pipelines for life sciences research

Not Ideal For

  • Projects relying on Common Workflow Language (CWL) for workflow definitions
  • Teams needing hands-on, vendor-provided support for on-premise installations
  • General-purpose data engineering outside of life sciences or bioinformatics
  • Real-time or interactive data processing requiring low-latency execution

Pros & Cons

Pros

Specialized WDL Execution

Executes workflows written in WDL, a domain-specific workflow language widely used in bioinformatics.

Seamless Production Scaling

Transitions trivially from small-scale experiments to massive production runs with the same workflow definitions, ensuring consistency and reliability.

Flexible Cloud Integration

Supports modular backends for AWS Batch and GCP Batch, allowing deployment across major cloud providers with pluggable architecture.

Managed and Self-Hosted Deployment

Offers options via Terra for a managed platform or self-hosting via JAR/Docker images, catering to different institutional needs.
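
For the self-hosted JAR route, a minimal sketch of launching a single workflow from Python might look like the following. It assumes a Cromwell JAR has already been downloaded and that `java` is on the PATH; the helper names `cromwell_run_command` and `run_workflow` and all file paths are hypothetical. It relies on Cromwell's `run` subcommand with its `--inputs` option.

```python
import shutil
import subprocess


def cromwell_run_command(jar_path, wdl_path, inputs_path=None):
    """Build the argument list for a one-off `java -jar <jar> run <wdl>` call."""
    command = ["java", "-jar", str(jar_path), "run", str(wdl_path)]
    if inputs_path is not None:
        # --inputs points at a JSON file holding the workflow inputs.
        command += ["--inputs", str(inputs_path)]
    return command


def run_workflow(jar_path, wdl_path, inputs_path=None):
    """Run the workflow locally; raises if java is missing or the run fails."""
    if shutil.which("java") is None:
        raise RuntimeError("java was not found on PATH")
    command = cromwell_run_command(jar_path, wdl_path, inputs_path)
    return subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it possible to inspect or log the exact invocation before anything is launched, for example `cromwell_run_command("cromwell.jar", "hello.wdl", "hello.inputs.json")`.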

Cons

Dropped CWL Support

Cromwell versions 80 and above no longer support CWL, forcing users to migrate workflows or seek alternatives for existing CWL pipelines.

Limited Self-Hosted Support

Self-managed instances only receive bug report support from the core team, with no direct assistance for setup, configuration, or operations.

Community-Maintained Backends

Backends other than AWS and GCP are community-based, potentially leading to inconsistent updates, slower fixes, and reliability concerns.

Bioinformatics-Specific Learning Curve

Requires familiarity with WDL and bioinformatics concepts, making it less accessible for developers outside the life sciences domain.

Quick Stats

Stars: 1,065
Forks: 381
Contributors: 0
Open Issues: 764
Last commit: 1 day ago
Created: 2015

Tags

#workflow-management #scientific-computing #workflow #aws-batch #executor #cloud-computing #docker #scala #cloud #data-pipelines #bioinformatics #containers #hpc

Built With

Docker
