An open-source storage framework that enables building a Lakehouse architecture with ACID transactions and scalable metadata handling.
Delta Lake is an open-source storage framework that enables building a Lakehouse architecture. It provides ACID transactions, scalable metadata handling, and data versioning on top of existing data lakes, solving data reliability issues in big data environments. It unifies batch and streaming data processing with strong consistency guarantees.
Data engineers and platform teams building reliable data lakes and Lakehouse architectures, especially those using Apache Spark, PrestoDB, Flink, Trino, or Hive for large-scale data processing.
Developers choose Delta Lake for its robust ACID guarantees, open format compatibility, and seamless integration with popular compute engines. Its unique selling point is bringing database-like reliability to data lakes without vendor lock-in.
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive, as well as APIs
Provides serializable isolation for concurrent reads and writes, ensuring data integrity in data lakes, as highlighted in the key features.
Serves as a single sink for both batch and streaming workloads, enabling consistent pipelines without duplication, per the documentation.
Uses Parquet and an open transaction log protocol, ensuring broad ecosystem support and avoiding vendor lock-in.
Maintains a transaction log for data versioning and rollbacks, useful for audits and debugging, as noted in the features.
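The transaction-log idea behind these points can be illustrated with a toy model. The sketch below is not Delta Lake's implementation or API. It assumes a heavily simplified log in which each commit is a zero-padded JSON file under `_delta_log` containing only `add` and `remove` actions, and the helper names `commit`, `files_at`, and `demo` are hypothetical. It shows how exclusive file creation lets only one writer claim a version, and how replaying the log up to a given version yields an older snapshot.

```python
import json
import os
import tempfile

LOG_DIR = "_delta_log"  # log directory name used by Delta tables


def commit(table_path, version, actions):
    """Write one commit file; fail if that version already exists."""
    log_dir = os.path.join(table_path, LOG_DIR)
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.join(log_dir, f"{version:020d}.json")
    # O_EXCL makes creation fail if the file exists, so two writers
    # cannot both claim the same version number.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    with os.fdopen(fd, "w") as f:
        for action in actions:
            f.write(json.dumps(action) + "\n")


def files_at(table_path, version=None):
    """Replay commits up to `version` and return the live data files."""
    log_dir = os.path.join(table_path, LOG_DIR)
    live = set()
    for name in sorted(os.listdir(log_dir)):
        if version is not None and int(name.split(".")[0]) > version:
            break
        with open(os.path.join(log_dir, name)) as f:
            for line in f:
                action = json.loads(line)
                if "add" in action:
                    live.add(action["add"]["path"])
                elif "remove" in action:
                    live.discard(action["remove"]["path"])
    return sorted(live)


def demo():
    """Run three commits, one conflicting commit, and two reads."""
    with tempfile.TemporaryDirectory() as table:
        commit(table, 0, [{"add": {"path": "part-0.parquet"}}])
        commit(table, 1, [{"add": {"path": "part-1.parquet"}}])
        commit(table, 2, [{"remove": {"path": "part-0.parquet"}},
                          {"add": {"path": "part-2.parquet"}}])
        try:
            # A second writer trying to reuse version 2 must be rejected.
            commit(table, 2, [{"add": {"path": "conflict.parquet"}}])
            conflict_rejected = False
        except FileExistsError:
            conflict_rejected = True
        return files_at(table, 1), files_at(table), conflict_rejected
```

Calling `demo()` returns the file list as of version 1, the file list at the latest version, and a flag showing that the conflicting commit was refused. A real table's log carries far more metadata per action, and the actual commit mechanism depends on the storage system in use.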
Some connectors, such as the one for Apache Flink, are still in preview, limiting production readiness for certain compute engines, as mentioned in the README.
Relies on underlying storage providing atomic visibility and consistent listing, which may not be supported by all cloud or on-prem systems, per the requirements section.
Protocol upgrades may break forward compatibility, meaning older Delta Lake versions might not be able to read tables written by newer ones, as stated in the compatibility notes.
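The forward-compatibility caveat can be pictured as a version gate. The sketch below is a simplified illustration, not Delta Lake's code. It assumes a table records a minimum reader and writer protocol version (modeled on the `minReaderVersion` and `minWriterVersion` fields of the Delta protocol action), and the names `Protocol`, `UnsupportedProtocolError`, and `check_readable` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    """Minimum protocol versions a client needs to use a table."""
    min_reader_version: int
    min_writer_version: int


class UnsupportedProtocolError(Exception):
    """Raised when a client is too old for the table's protocol."""


def check_readable(table_protocol, client_reader_version):
    """Return True if the client may read the table, else raise."""
    if table_protocol.min_reader_version > client_reader_version:
        raise UnsupportedProtocolError(
            f"table needs reader version "
            f"{table_protocol.min_reader_version}, "
            f"client supports {client_reader_version}"
        )
    return True


def demo():
    """Check an up-to-date client and an outdated one against one table."""
    table = Protocol(min_reader_version=2, min_writer_version=5)
    new_client_ok = check_readable(table, client_reader_version=2)
    try:
        check_readable(table, client_reader_version=1)
        old_client_ok = True
    except UnsupportedProtocolError:
        old_client_ok = False
    return new_client_ok, old_client_ok
```

Under these assumptions, a client refuses outright rather than misreading data it does not understand, which is why upgrading a table's protocol can lock out older readers.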