A curated list of essential academic papers for understanding database fundamentals and building modern data systems.
Readings in Databases is a curated collection of essential academic papers that explain the foundational concepts and evolution of database systems. It provides a structured learning path through seminal works in relational databases, distributed systems, columnar storage, and consensus algorithms. The list helps developers and researchers understand the theoretical underpinnings of modern data technologies.
Database engineers, data systems researchers, graduate students in computer science, and software developers who want to build a deep understanding of database internals beyond surface-level API knowledge.
It offers an organized, expert-curated reading list, saving learners time in identifying the most impactful papers. Unlike scattered resources, it provides context and commentary that help bridge academic research and practical system building.
Readings in Databases
Curated by Reynold Xin, a database systems expert, who selects high-impact papers and adds brief commentary explaining their practical relevance.
Papers are organized into logical sections like Basics and Columnar Databases, providing a systematic way to study database evolution from theory to practice.
Spans decades from Codd's 1970 paper to modern systems like Spark, showing how foundational principles persist despite technological changes.
Includes links to reading lists from top universities like Berkeley and Stanford, offering pathways for further in-depth study.
Some papers, such as ARIES, are acknowledged in the list to be 'very hard to read' because they mix in low-level details, so readers need prior textbook knowledge to follow them.
The list is updated only via pull requests and may lag behind newer seminal papers; for example, post-2010s trends like cloud computing are not extensively covered.
Purely a reading list without built-in tools for discussion, annotations, or code examples, limiting engagement for hands-on learners.