A unified open-source metadata platform for data discovery, observability, and governance with column-level lineage and team collaboration.
Generate comprehensive data quality profiles and exploratory data analysis reports for Pandas and Spark DataFrames with a single line of code.
Generate comprehensive data quality profiling and exploratory data analysis reports for Pandas and Spark DataFrames with a single line of code.
A Python library for data quality testing and validation using expressive, extensible Expectations.
An open-source data-centric AI library for automatically detecting and fixing data quality issues in machine learning datasets.
A library built on Apache Spark for defining unit tests to measure data quality in large datasets.
A Python library for automated exploratory data analysis (EDA) with high-density visualizations and target analysis in two lines of code.
A lightweight MongoDB schema analyzer that reveals document structure, field frequencies, and data outliers.
A Python library that automatically extracts schema, statistics, and sensitive entities (PII/NPI) from datasets.
A Python API for Deequ, enabling data quality testing and validation on large datasets using Apache Spark.
An R package that automates exploratory data analysis and data treatment with one-line reports and visualizations.
An engine for ML/data tracking, visualization, explainability, drift detection, and dashboards, integrated with Polyaxon.
A lightweight Python tool for generating rich summary statistics of pandas and Polars dataframes directly in the console.
A high-performance data profiler for discovering and validating complex patterns like functional dependencies, inclusion dependencies, and association rules.
A high-performance data profiler for discovering and validating complex patterns in datasets, enabling data cleaning and quality analysis.
A DataOps-friendly data quality monitoring platform with customizable checks, dashboards, and incident management for multiple data sources.
A fast schema and data analyzer for MongoDB that provides detailed insights into database structure and content.