MinIO is a high-performance, S3-compatible object storage solution for AI/ML, analytics, and data-intensive workloads.
A high-performance, S3-compatible distributed object storage system built in Rust, optimized for data lakes and AI workloads.
Open-source data integration platform for building ELT pipelines from APIs, databases, and files to data warehouses, lakes, and lakehouses.
A fast distributed SQL query engine for big data analytics, enabling interactive queries across diverse data sources.
Deep Lake is a multimodal data lake and vector store optimized for AI, enabling scalable data management, retrieval, and training for LLM and deep learning applications.
A high-performance table format for huge analytic datasets, enabling multiple engines to safely work with the same tables simultaneously.
An open-source storage framework that enables building a Lakehouse architecture with ACID transactions and scalable metadata handling.
An open-source tool that transforms object storage into a Git-like repository for versioned, atomic, and repeatable data lake operations.
A Python library that simplifies data integration between pandas and AWS services like Athena, S3, Redshift, and more.
A Python library that simplifies data integration between pandas and AWS services like Athena, S3, Redshift, and more.
A high-performance, geo-distributed, and federated open data catalog for unified metadata management across diverse data and AI assets.
A distributed, multi-tenant gateway providing serverless SQL on data warehouses and lakehouses.
A distributed data integration framework for big data ecosystems, handling ingestion, replication, organization, and lifecycle management for both streaming and batch data.
A distributed data integration framework for big data ecosystems, handling ingestion, replication, organization, and lifecycle management for both streaming and batch data.