Showing 33 of 33 projects
An open-source data platform that integrates proprietary, licensed, and public financial data sources for analysts, quants, and AI agents.
A system for building agents that perform automated tasks online, like a self-hosted IFTTT or Zapier.
A platform to programmatically author, schedule, and monitor workflows as code.
A platform to programmatically author, schedule, and monitor workflows as code.
An open-source query engine for AI analytics that builds self-reasoning agents across live data sources without ETL.
A distributed event streaming platform for building high-performance data pipelines, streaming analytics, and data integration.
Open-source data integration platform for building ELT pipelines from APIs, databases, and files to data warehouses, lakes, and lakehouses.
A highly scalable and reliable MQTT broker platform for AI, IoT, IIoT, and connected vehicles, supporting multiple protocols and real-time data integration.
A low-latency platform for change data capture (CDC) that streams row-level changes from databases to applications.
A high-performance, resilient stream processor that connects various sources and sinks, performs data transformations, and guarantees at-least-once delivery.
A high-performance, declarative stream processor that connects various sources and sinks with built-in data transformation capabilities.
An open-source ETL (Extract, Transform, Load) tool for data integration and migration.
Open-source data pipelines for cloud asset inventory, CSPM, FinOps, and vulnerability management across AWS, Azure, GCP, and 70+ sources.
Open-source data pipelines to sync cloud infrastructure metadata from AWS, Azure, GCP, and 70+ sources into your data warehouse.
A real-time data integration platform that creates and continually updates consistent views of transactional data using SQL.
An open data lakehouse platform for incremental data processing with upserts, deletes, and time-travel queries.
An easy-to-use, powerful, and reliable system to process and distribute data across cybersecurity, observability, and AI pipelines.
A CLI tool and dataflow engine that lets you query and join data from multiple databases and file formats using SQL.
A distributed data streaming engine with stateful stream processing for building responsive data-intensive applications.
A lean distributed data streaming engine and stream processing framework written in Rust for building responsive data-intensive applications.
A Python library using machine learning for accurate and scalable fuzzy matching, record deduplication, and entity resolution on structured data.
An open-source, privacy-focused customer data platform (CDP) that collects, processes, and routes event data to warehouses and tools.
A curated list of awesome ETL frameworks, libraries, and software for data integration and pipeline development.
A CLI tool to copy data between any databases and platforms with a single command, no code required.
An open-source data IDE for developers to query, script, and visualize data from databases, files, and APIs.
A declarative code-first data integration engine that unlocks 600+ APIs and databases, eliminating the need to write and maintain custom API integrations.
A distributed data integration framework for big data ecosystems, handling ingestion, replication, organization, and lifecycle management for both streaming and batch data.
A distributed data integration framework for big data ecosystems, handling ingestion, replication, organization, and lifecycle management for both streaming and batch data.
Native integration library for using Elasticsearch with Hadoop, Spark, and Hive for real-time search and analytics on big data.
A parallel bulk data loader that transfers data between various storages, databases, NoSQL, and cloud services via plugins.
An open-source Reverse ETL platform for syncing data from warehouses to business tools like Salesforce, HubSpot, and Slack.
A Python library for deep probabilistic modeling and analysis of single-cell and spatial omics data.
An open-source, multi-tenant platform for self-building knowledge graphs and simulation.