A fast distributed SQL query engine for big data analytics, enabling interactive queries across diverse data sources.
Trino is a distributed SQL query engine designed for big data analytics, enabling fast, interactive queries across diverse data sources such as data lakes, relational databases, and NoSQL stores. It addresses data silos by supporting federated queries that read data where it already lives, which makes it well suited to large-scale analytical workloads. Formerly known as PrestoSQL, Trino provides a unified SQL interface for scalable data exploration.
Trino is aimed at data engineers, data analysts, and platform teams who need to run interactive SQL queries on petabyte-scale datasets across multiple data sources. It is particularly useful for organizations with data lakes or heterogeneous data environments.
Developers choose Trino for its high performance on interactive queries, federated querying capabilities, and ANSI SQL compliance. Its pluggable connector architecture and ability to query data in-place without ETL provide flexibility and reduce data movement overhead compared to traditional data warehouses.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Executes SQL queries in parallel across a cluster of machines, enabling high-speed analytics on petabyte-scale datasets.
Connects to multiple data sources like HDFS, S3, and relational databases simultaneously, allowing queries across heterogeneous systems without moving data, solving data silo issues.
Supports standard SQL syntax, making it accessible to analysts and engineers who already know SQL.
Pluggable connector system enables integration with new data sources, providing flexibility as data environments evolve.
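The federated querying described above relies on Trino addressing tables by a three-part name, catalog.schema.table, so a single statement can reference tables from more than one catalog. The following is a minimal sketch of what such a query might look like, built as a plain string in Python. The catalog names (hive, postgresql), the schemas, the tables, and the columns are all hypothetical; real names depend on how a given cluster's catalogs are configured.

```python
# Illustrative sketch only: every catalog, schema, table, and column
# name below is a hypothetical placeholder, not taken from a real cluster.


def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified table name in catalog.schema.table form."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query() -> str:
    """Build one SQL statement that joins tables from two different catalogs."""
    # Assumed: an "orders" table stored as files in a data lake.
    orders = qualified("hive", "sales", "orders")
    # Assumed: a "customers" table living in a relational database.
    customers = qualified("postgresql", "public", "customers")
    return (
        "SELECT c.region, count(*) AS order_count\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.region\n"
        "ORDER BY order_count DESC"
    )


if __name__ == "__main__":
    print(build_federated_query())
```

The resulting statement could be submitted through any Trino client. The point of the sketch is that the join spans two catalogs in one query, with no copy of either table made beforehand.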
Building from source requires a specific Java version (25.0.1+) and Docker, and comes with platform limitations such as needing Rosetta 2 on Apple Silicon, all of which add to the initial setup effort.
As a distributed system, a Trino cluster demands significant DevOps expertise for scaling, monitoring, and tuning; a cloud-based managed service would handle much of that work for you.
Optimized for analytical queries (OLAP), it may not efficiently handle high-volume transactional processing or real-time updates, which can be a drawback for mixed workloads.