Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.

Scenic

Apache-2.0 · Python

A JAX library for rapid prototyping of large-scale attention-based vision models across images, video, audio, and multimodal data.

3.8k stars · 478 forks · 0 contributors

What is Scenic?

Scenic is a JAX library for research and development of large-scale, attention-based models for computer vision and beyond. It speeds up prototyping of vision models by providing shared, optimized libraries for common tasks and by hosting complete project implementations. It is used extensively to build models that handle images, video, audio, and multimodal data.

Target Audience

Researchers and engineers in computer vision and multimodal AI who need a flexible, high-performance codebase for developing and experimenting with large-scale attention-based models using JAX.

Value Proposition

Developers choose Scenic for its clean design philosophy favoring simplicity, its comprehensive set of optimized libraries for large-scale training, and its extensive collection of state-of-the-art model implementations and baselines, all built on the efficient JAX/Flax stack.

Overview

Scenic: A JAX Library for Computer Vision Research and Beyond

Use Cases

Best For

  • Rapid prototyping of novel vision transformer architectures
  • Reproducing and benchmarking state-of-the-art computer vision papers
  • Training large-scale models on multi-host, multi-device setups
  • Developing multimodal models that combine vision, video, and audio
  • Experimenting with attention-based models for segmentation and object detection
  • Building upon strong, optimized baselines like ViT, DETR, and CLIP

Not Ideal For

  • Production teams needing stable, versioned APIs with long-term support and enterprise tooling
  • Developers entrenched in PyTorch or TensorFlow ecosystems who prefer their established toolchains and communities
  • Projects requiring out-of-the-box, pre-trained models for inference without custom training loops
  • Applications where minimal code modification is desired, as Scenic's philosophy encourages forking over abstraction

Pros & Cons

Pros

Optimized Multi-Device Training

Provides scalable input pipelines and training loops designed for multi-host setups, efficiently handling data division, caching, and prefetching as outlined in the dataset_lib and train_lib modules.
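The data division and prefetching mentioned above can be illustrated with a small, library-free sketch. It is not Scenic's dataset_lib API: the function names shard_for_host, batched, and prefetch are hypothetical, the examples are plain integers, and the prefetch buffer is a synchronous stand-in for the background loading a real input pipeline would use.

```python
from collections import deque
from typing import Iterable, Iterator, List, Sequence


def shard_for_host(examples: Sequence[int], host_id: int, host_count: int) -> List[int]:
    """Return the strided slice of examples that one host reads (hypothetical helper)."""
    if not 0 <= host_id < host_count:
        raise ValueError("host_id must be in [0, host_count)")
    # Host k takes examples k, k + host_count, k + 2 * host_count, ...
    return list(examples[host_id::host_count])


def batched(items: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield full batches only, dropping a trailing partial batch."""
    for start in range(0, len(items) - batch_size + 1, batch_size):
        yield list(items[start:start + batch_size])


def prefetch(batches: Iterable[List[int]], buffer_size: int = 2) -> Iterator[List[int]]:
    """Keep up to buffer_size batches queued ahead of the consumer.

    This version is synchronous; it only shows the buffering order, not
    the overlap of loading and compute that a real pipeline provides.
    """
    queue = deque()
    for batch in batches:
        queue.append(batch)
        if len(queue) > buffer_size:
            yield queue.popleft()
    # Drain whatever is still buffered once the source is exhausted.
    while queue:
        yield queue.popleft()


if __name__ == "__main__":
    dataset = list(range(10))
    for host in range(2):
        local = shard_for_host(dataset, host_id=host, host_count=2)
        print(host, list(prefetch(batched(local, batch_size=2))))
```

Each host reads a disjoint slice of the dataset, so no example is processed twice in a step. In a JAX program the host index and host count would typically come from the runtime rather than being passed by hand.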

Rich Baseline Implementations

Hosts fully fleshed-out projects for state-of-the-art models like ViT, DETR, and CLIP, offering reproducible baselines for experimentation and benchmarking, as detailed in the projects directory.

Multi-Modal Flexibility

Supports model development across images, video, audio, and their combinations, enabling advanced multimodal research with shared libraries, evidenced by projects like PolyViT and AVATAR.

Simplicity-First Design

Emphasizes forking and copy-pasting over unnecessary abstraction, making code straightforward to understand and modify for rapid prototyping, as per the project's stated philosophy.

Cons

Code Duplication Risk

The forking-centric approach can lead to maintenance challenges and duplicated efforts across projects, since functionality is only upstreamed to shared libraries after proving widely useful.

JAX Ecosystem Dependency

Built entirely on JAX and Flax, which locks users into that ecosystem. Developers coming from PyTorch may find the learning curve steeper and the tooling less mature.

Research-Focused Limitations

Lacks polish for production deployment: documentation and features center on experimental use, and enterprise needs such as model serving and detailed deployment guides are not covered.

Quick Stats

Stars: 3,793
Forks: 478
Contributors: 0
Open Issues: 161
Last commit: 3 days ago
Created: 2021

Tags

#model-training #jax #vision-transformer #deep-learning #flax #multimodal-ai #transformers #research #computer-vision

Built With

JAX
Flax

Included in

JAX · 2.1k
Auto-fetched 1 day ago

Related Projects

HuggingFace Transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Stars: 159,772
Forks: 32,981
Last commit: 1 day ago

Trax

Trax — Deep Learning with Clear Code and Speed

Stars: 8,303
Forks: 830
Last commit: 7 months ago

Flax

Flax is a neural network library for JAX that is designed for flexibility.

Stars: 7,172
Forks: 799
Last commit: 2 days ago

Flax NNX

Flax is a neural network library for JAX that is designed for flexibility.

Stars: 7,172
Forks: 799
Last commit: 2 days ago