An industry-leading open-source data engine for interactive video and image annotation to power machine learning.
CVAT (Computer Vision Annotation Tool) is an open-source, web-based tool for annotating images and videos to create training data for computer vision models. It addresses the problem of labeling large datasets efficiently and accurately, a critical bottleneck in machine learning pipelines. It supports both manual and AI-assisted annotation across a wide range of formats and use cases.
CVAT is aimed at computer vision engineers, ML researchers, data annotation teams, and organizations building custom AI models who need a powerful, scalable, and flexible tool to create and manage labeled datasets.
Developers choose CVAT for its industry-leading feature set, extensive format support, strong open-source community, and flexibility to be used as a free cloud service or a fully controlled self-hosted deployment. Its integration with automatic labeling models significantly reduces manual work.
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
Supports import and export in over 20 industry-standard formats, including COCO, YOLO, and Cityscapes (listed in the README's format table), which keeps it compatible with most ML pipelines.
Integrates serverless functions with models like Segment Anything and YOLO. According to the key features, this speeds up annotation by up to 10x and significantly reduces manual workload.
Offers a free cloud service at cvat.ai and self-hosted solutions with Docker and Kubernetes, providing scalability for both small teams and large enterprises, as highlighted in the deployment section.
Includes a Python SDK, CLI tool, and REST API, enabling automation and integration into ML workflows, which is emphasized in the SDK and API documentation links.
Connects with platforms like Roboflow, Hugging Face, and FiftyOne for enhanced dataset curation and model analysis, as noted in the partners and integrations sections.
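To illustrate why the import and export format support noted above matters, here is a minimal sketch of how the same bounding box is written in two of the listed formats. It assumes the commonly documented conventions: COCO stores a box as [x_min, y_min, width, height] in pixels, while YOLO stores (x_center, y_center, width, height) normalized by the image size. The function name `coco_to_yolo` is hypothetical and is not part of CVAT; CVAT performs such conversions itself on export.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO-style box to a YOLO-style box.

    bbox: [x_min, y_min, width, height] in pixels (COCO convention).
    Returns (x_center, y_center, width, height), each normalized to [0, 1].
    Hypothetical helper for illustration only; not a CVAT API.
    """
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,  # box center, as a fraction of image width
        (y_min + h / 2) / img_h,  # box center, as a fraction of image height
        w / img_w,                # box width, as a fraction of image width
        h / img_h,                # box height, as a fraction of image height
    )


# A 200x100 px box whose top-left corner is at (100, 50) in a 400x200 px image.
print(coco_to_yolo([100, 50, 200, 100], img_w=400, img_h=200))
# → (0.5, 0.5, 0.5, 0.5)
```

A tool that exports both formats directly removes the need to maintain conversion helpers like this one in each training pipeline.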
The free online version at cvat.ai restricts users to 10 tasks and 500 MB of data, lacks analytics features, and allows exporting annotations only, not images, according to the cloud service description.
Self-hosting requires a Docker or Kubernetes deployment, which can be challenging for teams without DevOps expertise, as indicated by the installation guide and its reliance on prebuilt images.
Some automatic labeling functions are limited to CPU or to specific frameworks like OpenVINO, so not all models benefit from GPU acceleration, as shown by the CPU/GPU column in the serverless functions table.
Despite comprehensive documentation, the extensive feature set and multiple interfaces (web, SDK, CLI) can overwhelm new users and make initial adoption slower than with simpler annotation tools.