A comprehensive collection of PyTorch image models, layers, utilities, and training scripts for computer vision research and applications.
PyTorch Image Models (timm) is a comprehensive library for computer vision that provides a large collection of state-of-the-art image classification models implemented in PyTorch. It includes pretrained weights, training and validation scripts, and a suite of utilities for model development and experimentation. The library solves the problem of fragmented model implementations by offering a unified, reproducible, and well-documented repository for researchers and engineers.
Computer vision researchers, machine learning engineers, and data scientists who need access to a wide range of pretrained image models, reproducible training pipelines, and flexible tools for model experimentation and deployment.
Developers choose timm for its extensive and curated model zoo, high-performance training scripts, and consistent APIs that simplify working with diverse architectures. Its focus on reproducibility, regular updates with the latest models, and comprehensive documentation make it a reliable foundation for vision projects.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Includes hundreds of pretrained models across diverse architectures like Vision Transformers, ConvNeXt, and EfficientNet, all with reproducible ImageNet training results as documented in the Models section.
Provides reference scripts for training, validation, and inference that support multiple GPU modes and advanced augmentations like MixUp and RandAugment, enabling efficient experimentation.
Supports multi-scale feature map extraction via the `features_only=True` model creation argument and the `forward_features()` method, making it easy to adapt models for downstream tasks like detection or segmentation.
Offers a wide range of optimizers, schedulers, and attention modules through common APIs, such as the Muon optimizer and SplitBatchNorm regularization, as listed in the Features section.
The README documents occasional compatibility breaks, such as a January 2026 fix to QKV bias handling, which can disrupt workflows and make version pinning necessary for stable projects.
While useful for feature extraction, timm has no built-in heads for other vision tasks such as detection or segmentation, so those require integration with libraries like Detectron2.
The library's breadth and advanced features, such as NaFlexViT for variable-resolution input, can demand configuration and scripting well beyond the provided examples, increasing initial setup time.