Data Governance

21 projects


A unified open-source metadata platform for data discovery, observability, and governance with column-level lineage and team collaboration.

#data-collaboration #data-lineage #open-source

An open-source metadata platform for data discovery, governance, and observability across your entire data and AI stack.

#hacktoberfest #data-lineage #ai-integration

Stars: 12.3k

Forks: 3.6k

Last commit: 1 day ago

Great Expectations (Python)

A Python library for data quality testing and validation using expressive, extensible Expectations.

#data-testing #datacleaning #open-source

A Python library for performing data science and machine learning on data without direct access, using remote datasites.

#federated-learning #hacktoberfest #python-library

Open-source customer data infrastructure that collects, validates, and enriches behavioral event data for AI and analytics.

#snowplow-events #data-warehouse-integration #event-tracking

Stars: 7.0k

Forks: 1.2k

Last commit: 22 days ago

lakeFS (Go)

An open-source tool that transforms object storage into a Git-like repository for versioned, atomic, and repeatable data lake operations.

#multi-cloud #data-versioning #azure-blob-storage

A lightweight, fast, and beautiful database management tool with AI-powered chat interface for PostgreSQL, MySQL, SQLite, MongoDB, Redis, and more.

#data-lineage #database #explorer

A metadata-driven data discovery and catalog platform that helps data teams find, understand, and trust their data resources.

#data-lineage #data-catalog #data-engineering

Stars: 4.8k

Forks: 965

Last commit: 18 days ago

Gravitino (Java)

A high-performance, geo-distributed, and federated open data catalog for unified metadata management across diverse data and AI assets.

#trino-connector #federated-metadata #skycomputing

Stars: 3.1k

Forks: 888

Last commit: 1 day ago

Confluent Schema Registry for Kafka (Java)

A RESTful service for storing, retrieving, and managing Avro, JSON Schema, and Protobuf schemas in Apache Kafka ecosystems.

#schemas #data-serialization #avro-schema

An open-source metadata service for collecting, aggregating, and visualizing data lineage and ecosystem metadata.

#data-lineage #helm #data-dictionary

Stars: 2.2k

Forks: 405

Last commit: 12 days ago

Gaffer (Java)

A graph database framework for storing and querying large-scale graphs with rich properties and in-database aggregation.

#apache-spark #parquet #entity-relation

Stars: 1.8k

Forks: 363

Last commit: 1 year ago

SQLLineage (Python)

A Python-powered SQL lineage analysis tool that extracts source and target tables from SQL commands without deep parser knowledge.

#ast-analysis #data-lineage #data-engineering

A Python library that automatically extracts schema, statistics, and sensitive entities (PII/NPI) from datasets.

#sensitive-data-detection #data-labels #python-library

Stars: 1.6k

Forks: 188

Last commit: 18 days ago

Schema Registry UI (JavaScript)

A web UI for managing Avro schemas in Confluent Schema Registry, enabling creation, viewing, searching, evolution, and configuration.

#confluent-platform #kafka #docker

Stars: 425

Forks: 112

Last commit: 2 years ago

Inspektor (Rust)

A protocol-aware proxy that enforces fine-grained access policies for databases using Open Policy Agent (OPA).

#openpolicyagent #postgres #acl

Stars: 285

Forks: 13

Last commit: 4 years ago

DQOps (Java)

A DataOps-friendly data quality monitoring platform with customizable checks, dashboards, and incident management for multiple data sources.

#data-quality-report #data-observability #data-quality-checks

Stars: 194

Forks: 37

Last commit: 6 months ago

PACE (Kotlin)

An open-source Policy As Code Engine that programmatically creates and applies data policies to platforms like Snowflake, Databricks, and BigQuery.

#data-catalog #policy-as-code #data-policy

Stars: 40

Forks: 1

Last commit: 5 days ago

Consume Power BI (PowerShell)

PowerShell and Power Automate solutions to consume Power BI's Asynchronous Unified Scanning API for workspace metadata retrieval.

#microsoft-flow #rest-api #asynchronous-processing

Stars: 19

Forks: 7

Last commit

Hortonworks Schema Registry (Java)

A framework for building metadata repositories, currently featuring a SchemaRegistry implementation.

#kinesis #schemas #flink

Stars: 18

Forks: 10

Last commit: 2 years ago

Atlas BI Library (TSQL)

A unified report library that plugs into existing BI platforms to extract, search, document, and launch reports.

#search #reporting #library

Stars: 17

Forks: 6

Last commit: 1 day ago
