A unified open-source metadata platform for data discovery, observability, and governance with column-level lineage and team collaboration.
An open-source metadata platform for data discovery, governance, and observability across your entire data and AI stack.
A Python library for data quality testing and validation using expressive, extensible Expectations.
A Python library for performing data science and machine learning on data you cannot access directly, using remote datasites.
Open-source customer data infrastructure that collects, validates, and enriches behavioral event data for AI and analytics.
An open-source tool that transforms object storage into a Git-like repository for versioned, atomic, and repeatable data lake operations.
A metadata-driven data discovery and catalog platform that helps data teams find, understand, and trust their data resources.
A high-performance, geo-distributed, and federated open data catalog for unified metadata management across diverse data and AI assets.
A RESTful service for storing, retrieving, and managing Avro, JSON Schema, and Protobuf schemas in Apache Kafka ecosystems.
An open-source metadata service for collecting, aggregating, and visualizing data lineage and ecosystem metadata.
A graph database framework for storing and querying large-scale graphs with rich properties and in-database aggregation.
A Python-powered SQL lineage analysis tool that extracts source and target tables from SQL commands without deep parser knowledge.
A Python library that automatically extracts schema, statistics, and sensitive entities (PII/NPI) from datasets.