A cutting-edge framework for training and deploying state-of-the-art YOLO models for object detection, segmentation, classification, and pose estimation.
Ultralytics YOLO is an open-source framework for training, validating, and deploying YOLO-based computer vision models. It provides a unified interface for multiple vision tasks including object detection, segmentation, classification, and pose estimation, solving the problem of needing separate tools for each task. The framework is designed to be fast, accurate, and easy to use, enabling rapid development of vision AI applications.
AI researchers, computer vision engineers, data scientists, and developers building real-time object detection and image analysis systems, from prototyping to production deployment.
Developers choose Ultralytics YOLO for its comprehensive multi-task support within a single framework, state-of-the-art model performance, extensive integration ecosystem, and the flexibility of both CLI and Python APIs that streamline the entire model lifecycle.
Ultralytics YOLO 🚀
Supports object detection, instance segmentation, image classification, pose estimation, and oriented bounding boxes in a single codebase, eliminating the need for a separate tool per vision task.
Ships the latest YOLO26 models alongside legacy versions back to YOLOv3, all pre-trained on major datasets such as COCO and ImageNet, so strong baseline accuracy is available out of the box.
Exports models to formats like ONNX and TensorRT for optimized inference on different hardware, from cloud GPUs to edge devices.
Seamlessly connects with platforms like Weights & Biases, Comet ML, and Roboflow for enhanced workflows in labeling, training, and visualization, extending functionality beyond core training.
Uses the AGPL-3.0 license, which requires derivative works to be open-source; this complicates commercial deployment unless an Ultralytics enterprise license is purchased.
Focused exclusively on YOLO models, limiting its utility for projects that require or prefer other object detection or vision architectures like Mask R-CNN or DETR.
Optimal inference speeds require GPUs with TensorRT export; CPU inference is significantly slower, as the per-image latency figures in the model tables show a gap of several milliseconds or more between CPU and GPU.