Generate comprehensive data quality profiles and exploratory data analysis reports for Pandas and Spark DataFrames with a single line of code.
Generate comprehensive data quality profiling and exploratory data analysis reports for Pandas and Spark DataFrames with a single line of code.
A unified open-source metadata platform for data discovery, observability, and governance with column-level lineage and team collaboration.
An open-source data-centric AI library for automatically detecting and fixing data quality issues in machine learning datasets.
A Python library for data quality testing and validation using expressive, extensible Expectations.
A library built on Apache Spark for defining unit tests to measure data quality in large datasets.
A Python library for automated exploratory data analysis (EDA) with high-density visualizations and target analysis in two lines of code.
A lightweight MongoDB schema analyzer that reveals document structure, field frequencies, and data outliers.
A Python library that automatically extracts schema, statistics, and sensitive entities (PII/NPI) from datasets.