An open-source solution for continuous validation of machine learning models and data, from research to production.
Deepchecks is an open-source Python library and platform for continuous validation of machine learning models and data. It provides a comprehensive suite of tests to detect issues like data drift, performance degradation, and integrity problems across tabular, NLP, and computer vision models. The framework helps teams ensure model reliability from research through production deployment and monitoring.
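A minimal sketch of what a validation run looks like with the built-in tabular data-integrity suite (the file path, label, and column names are hypothetical):

```python
import pandas as pd
from deepchecks.tabular import Dataset
from deepchecks.tabular.suites import data_integrity

# Wrap a raw DataFrame in a Deepchecks Dataset, declaring the label and
# categorical features so checks can interpret the columns correctly.
df = pd.read_csv("train.csv")  # hypothetical path
dataset = Dataset(df, label="target", cat_features=["plan", "region"])

# Run the built-in data-integrity suite and save an HTML report.
result = data_integrity().run(dataset)
result.save_as_html("integrity_report.html")
```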
Machine learning engineers, data scientists, and MLOps practitioners who need to validate, test, and monitor ML models in production environments. It's particularly valuable for teams building and deploying models at scale who require automated quality assurance.
Deepchecks offers a holistic, open-source validation solution that covers the entire ML lifecycle with specialized checks for different data types. Its unique selling point is combining testing, CI integration, and production monitoring into a single framework with extensive customization and multiple output formats for different stakeholders.
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
Includes built-in checks for tabular, NLP, and computer vision data to detect issues such as model performance degradation and data drift.
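Individual checks can also be run on their own; a rough sketch comparing a training set to a production sample (paths and the label column are hypothetical, and the check was named TrainTestFeatureDrift in older releases):

```python
import pandas as pd
from deepchecks.tabular import Dataset
from deepchecks.tabular.checks import FeatureDrift  # TrainTestFeatureDrift in older releases

train = Dataset(pd.read_csv("train.csv"), label="target")         # hypothetical paths
current = Dataset(pd.read_csv("prod_sample.csv"), label="target")

# Compare feature distributions between the two datasets and report
# a per-feature drift score.
result = FeatureDrift().run(train_dataset=train, test_dataset=current)
result.save_as_html("feature_drift.html")
```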
Allows creating ordered lists of checks with configurable parameters and conditions, enabling automated pass/fail validation tailored to specific needs.
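A sketch of such a custom suite, reusing the train/current Datasets from the previous snippet; the condition method names follow recent Deepchecks releases and may differ in older ones:

```python
from deepchecks.tabular import Suite
from deepchecks.tabular.checks import FeatureDrift, LabelDrift

# An ordered suite of parameterized checks; each condition turns a check
# into an explicit pass/fail gate.
custom_suite = Suite(
    "Pre-deployment validation",
    FeatureDrift(columns=["age", "income"]).add_condition_drift_score_less_than(),  # default thresholds
    LabelDrift().add_condition_drift_score_less_than(0.1),
)
result = custom_suite.run(train_dataset=train, test_dataset=current)
```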
Provides a dedicated monitoring component with alerts and dynamic visualization for tracking model behavior in production, supporting continuous validation.
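Production data is typically streamed to the monitoring service through the separate deepchecks-client SDK. The sketch below assumes its DeepchecksClient and log_batch interfaces; the host, token, model names, and batch layout are all hypothetical, and exact signatures vary by SDK version:

```python
import pandas as pd
from deepchecks_client import DeepchecksClient

dc_client = DeepchecksClient(
    host="https://your-deepchecks-instance.example.com",  # hypothetical host
    token="YOUR_API_TOKEN",
)

# Fetch a previously configured model version and stream a production batch;
# the service then tracks drift and performance and raises alerts.
model_version = dc_client.get_model_version(model_name="churn_model", version_name="v1")

batch = pd.read_csv("prod_batch.csv")  # hypothetical batch with a 'prediction' column
model_version.log_batch(
    sample_ids=batch.index.astype(str),
    data=batch.drop(columns=["prediction"]),
    predictions=batch["prediction"],
)
```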
Easily integrates into continuous integration pipelines to automatically assess model readiness for deployment, facilitating MLOps practices.
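In a CI job this usually reduces to running a suite and failing the build when its conditions don't pass; a sketch assuming the SuiteResult.passed() helper available in recent releases (paths and the label column are hypothetical):

```python
import sys

import pandas as pd
from deepchecks.tabular import Dataset
from deepchecks.tabular.suites import train_test_validation

train = Dataset(pd.read_csv("train.csv"), label="target")  # hypothetical paths
test = Dataset(pd.read_csv("test.csv"), label="target")

result = train_test_validation().run(train_dataset=train, test_dataset=test)

# Block deployment when any check condition fails; keep the report as a CI artifact.
if not result.passed():  # passed() assumed from recent Deepchecks releases
    result.save_as_html("validation_report.html")
    sys.exit(1)
```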
The production monitoring installation via Docker is still a work in progress on Windows, limiting out-of-the-box usability for teams on that platform.
The open-source monitoring deployment supports only one model per instance, which can be inefficient for organizations with multiple models to monitor.
Premium features of the monitoring component fall under a commercial license, so full functionality requires a paid plan, which may not suit all budgets.