A PyTorch-based platform for state-of-the-art object detection, segmentation, and visual recognition tasks.
Detectron2 is a PyTorch-based library developed by Facebook AI Research for state-of-the-art computer vision tasks, primarily object detection and image segmentation. It provides a modular platform with advanced algorithms like Cascade R-CNN, PointRend, and panoptic segmentation, enabling both research experimentation and production deployment. The library serves as the successor to Detectron and maskrcnn-benchmark, offering improved training speed and extensive model support.
Computer vision researchers, AI engineers, and developers working on visual recognition projects who need a flexible, production-ready toolkit for object detection and segmentation tasks.
Developers choose Detectron2 for its comprehensive model zoo, modular architecture that supports custom research projects, and seamless export capabilities for deployment. It combines cutting-edge algorithms with practical production features, backed by Facebook AI Research's ongoing development and community support.
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Supports advanced techniques like panoptic segmentation, DensePose, and transformer-based models such as ViTDet, providing state-of-the-art performance for visual recognition tasks.
Designed as a library for building custom projects, with a flexible architecture that facilitates experimentation and extension; the projects/ directory holds research projects built on top of it.
Models can be exported to TorchScript or Caffe2 format for deployment in production environments.
Offers a wide range of pre-trained models and benchmarks in the Model Zoo, allowing for quick start and comparison without training from scratch.
Optimized implementations train faster than the predecessors Detectron and maskrcnn-benchmark, according to the benchmarks linked in the README.
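As a rough illustration of the Model Zoo quick start mentioned above, the sketch below loads a pre-trained model for inference. It assumes detectron2 and a compatible PyTorch build are installed. The config name is one Model Zoo entry chosen for illustration, and `build_predictor` with its parameters is a hypothetical helper, not part of the library.

```python
# Sketch only: assumes detectron2 and a compatible PyTorch build are installed.
# CONFIG_PATH is one Model Zoo entry, picked here purely for illustration.
CONFIG_PATH = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"


def build_predictor(config_path=CONFIG_PATH, score_threshold=0.5, device="cpu"):
    """Hypothetical helper: return a DefaultPredictor with pre-trained weights."""
    # Imported lazily so this file can be imported without detectron2 present.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    # Load the architecture and hyperparameters that ship with the library.
    cfg.merge_from_file(model_zoo.get_config_file(config_path))
    # Point the weights at the matching pre-trained checkpoint URL.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_path)
    # Discard detections scoring below this confidence at inference time.
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold
    cfg.MODEL.DEVICE = device
    return DefaultPredictor(cfg)
```

Calling `build_predictor()` downloads the checkpoint on first use. The returned predictor takes a BGR image array, such as one read with `cv2.imread`, and returns a dict whose `"instances"` entry holds the predicted boxes, classes, and scores.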
Requires specific PyTorch and CUDA versions with non-trivial setup, often needing careful configuration or Docker, which can be challenging for newcomers.
Training and running models generally requires powerful GPUs with substantial memory, which makes the library a poor fit for resource-constrained setups without access to high-end hardware.
Tightly integrated with PyTorch, which limits adoption by teams using other frameworks such as TensorFlow and ties production pipelines to a single framework.
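Because installation depends on matching PyTorch and CUDA versions, it can help to record what the current environment provides before attempting a setup. The following is a minimal sketch using only the standard library plus PyTorch if it happens to be present. The function name `environment_report` and its dictionary keys are hypothetical, chosen for this example.

```python
import importlib.util
import platform


def environment_report():
    """Hypothetical helper: collect version details relevant to installation."""
    report = {
        "python": platform.python_version(),
        "torch": None,
        "cuda_build": None,
        "cuda_available": False,
        "detectron2_installed": importlib.util.find_spec("detectron2") is not None,
    }
    # Only import torch if it can be found, so the check never raises.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = torch.__version__
        # CUDA version PyTorch was built against; None for CPU-only builds.
        report["cuda_build"] = torch.version.cuda
        # Whether a usable GPU and driver are visible at runtime.
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Once Detectron2 itself is installed, running `python -m detectron2.utils.collect_env` should print a fuller report of the same kind.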
The world's simplest facial recognition API for Python and the command line
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.