A curated list of awesome ETL frameworks, libraries, and software for data integration and pipeline development.
A high-performance Go driver for ClickHouse offering both native and standard database/sql interfaces.
A high-performance datastore optimized for storing and retrieving time series and tick data.
A fast tool for comparing datasets within or across SQL databases to identify differences.
A curated list of awesome streaming frameworks, applications, readings, and resources for stream processing.
A declarative code-first data integration engine that unlocks 600+ APIs and databases, eliminating the need to write and maintain custom API integrations.
A lightweight Python library for creating portable, expressive, and testable data transformation DAGs with built-in lineage and metadata.
A Python library for defining portable, modular, and testable data transformation DAGs with built-in lineage and metadata.
A Python framework and Rust-based distributed processing engine for stateful event and stream processing.
A curated list of awesome Apache Spark packages, libraries, and resources for data engineers and scientists.
A Ruby framework for writing reliable, concise, and maintainable ETL (Extract-Transform-Load) data processing jobs.
An open-source Reverse ETL platform for syncing data from warehouses to business tools like Salesforce, HubSpot, and Slack.
A Python-powered SQL lineage analysis tool that extracts source and target tables from SQL commands without requiring deep knowledge of SQL parsers.
Base classes for writing Apache Spark tests in Scala and Python, simplifying test setup and teardown.
A Python framework for building real-time data pipelines and event-driven microservices on Apache Kafka using a Streaming DataFrame API.