A multimodal protein language model for generative protein design and engineering that jointly reasons over sequence, structure, and function.
ESM is a family of protein language models developed by EvolutionaryScale for simulating and designing proteins. It includes ESM3, a multimodal generative model that reasons over protein sequence, structure, and function, and ESM C, a representation learning model for creating protein embeddings. These tools enable researchers to generate novel proteins, predict properties, and analyze biological data at scale.
Computational biologists, bioinformatics researchers, and AI scientists working on protein engineering, drug discovery, and biological sequence analysis.
Developers choose ESM for its state-of-the-art multimodal reasoning capabilities, scalable architecture, and flexible deployment options, including local inference, cloud API access, and commercial licensing via AWS SageMaker.
ESM (Evolutionary Scale Modeling) is a family of protein language models developed by EvolutionaryScale. It includes ESM3, a frontier generative model for biology, and ESM C, a representation learning model that produces protein embeddings. With these models, researchers and developers can simulate protein evolution, design novel proteins, and analyze biological sequences at scale.
ESM is developed with a mission to understand biology for human benefit through open, safe, and responsible AI research, guided by a framework that emphasizes risk evaluation and stakeholder collaboration.
ESM3 jointly reasons across sequence, structure, and function tracks, enabling controlled protein design from partial prompts, as shown in the project's diagram and GFP generation tutorial.
ESM3 is offered in sizes from 1.4B to 98B parameters, trained on billions of proteins, so users can match model size to their compute budget; the sizes are listed in the project's available models table.
Supports local inference via Hugging Face, cloud API through Forge, and commercial licensing on AWS SageMaker, allowing seamless transition from research to production.
ESM C delivers high-performance protein embeddings with reduced memory and faster inference than ESM2, acting as a drop-in replacement for sequence analysis tasks.
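The idea of a partial prompt mentioned above can be sketched in plain Python: some residues are kept fixed and the rest are replaced by a mask character for the model to fill in. This is a minimal illustration only. It does not call the ESM library, the helper names `mask_positions` and `masked_fraction` are hypothetical, and the use of `_` as the mask character is an assumption about the prompt format.

```python
MASK = "_"  # assumed mask character; check the project's docs for the real prompt format


def mask_positions(sequence: str, start: int, end: int, mask: str = MASK) -> str:
    """Return `sequence` with residues in the half-open range [start, end) masked."""
    if not 0 <= start <= end <= len(sequence):
        raise ValueError("start and end must satisfy 0 <= start <= end <= len(sequence)")
    return sequence[:start] + mask * (end - start) + sequence[end:]


def masked_fraction(prompt: str, mask: str = MASK) -> float:
    """Fraction of prompt positions left for the model to generate."""
    return prompt.count(mask) / len(prompt) if prompt else 0.0


# Short example sequence, used purely for illustration.
sequence = "MSKGEELFTGVVPILVELDG"
prompt = mask_positions(sequence, 5, 12)

print(prompt)
print(f"{masked_fraction(prompt):.0%} of positions are masked")
```

A prompt built this way keeps its original length, so the unmasked residues stay at their original positions while the masked span is left open for generation.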
Running large models like the 98B ESM3 locally requires substantial GPU memory and power, which may be inaccessible without high-end hardware or cloud credits.
Deploying on AWS SageMaker involves multi-step CloudFormation templates and AWS account management, adding significant setup time and operational overhead.
Access to advanced features through the Forge API ties users to EvolutionaryScale's infrastructure, with potential costs, added latency, and dependence on an external service's availability.
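The hardware concern for large local models comes down to simple arithmetic: the memory needed just to hold the weights is roughly the parameter count times the bytes per parameter. The sketch below applies that rule to the 1.4B and 98B sizes mentioned above. The precision options and the helper name `weight_memory_gb` are assumptions for illustration, and the estimate covers weights only; activations and framework overhead add more.

```python
# Bytes needed to store one parameter at common numeric precisions (assumed options).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str = "bf16") -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Parameter counts taken from the size range quoted above.
model_sizes = {"1.4B": 1_400_000_000, "98B": 98_000_000_000}

for name, params in model_sizes.items():
    for dtype in BYTES_PER_PARAM:
        print(f"{name} @ {dtype}: ~{weight_memory_gb(params, dtype):.1f} GB")
```

Under these assumptions, the 98B model needs about 196 GB for weights in bf16, which is more than a single typical GPU holds, while the 1.4B model needs about 2.8 GB.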