Our Presto Credo for the Truly Open Source SQL Query Engine

We believe that data analytics should be democratized, which is why we innovate Presto with state-of-the-art database technology. Trusted governance is important to us, which is why we model our project governance and bylaws after the Linux Foundation. TO OUR FELLOW DATA ENGINEERS, SOFTWARE DEVELOPERS, AND DATA PLATFORM ENTHUSIASTS: As the use of data analytics and…

Faster Presto Queries with Parquet Page Index

Introduction: Today’s data is growing very fast, which creates challenges for query engines like Presto. Presto is a popular interactive query engine because of its scalability, high performance, and smooth integration with Hadoop. As the volume of data grows, Presto needs to read larger chunks of data and load them into memory, which causes higher…

Building a high-performance platform on AWS to support real-time gaming services using Presto and Alluxio

Electronic Arts (EA) is a leading company in the gaming industry, providing dozens of games to serve billions of users worldwide each year. Making near real-time decisions for EA’s online services is critical for our business. This blog describes a data platform on AWS based on Presto and Alluxio to support online services with instantaneous…

Running Presto in a Hybrid Cloud Architecture

Migrating SQL workloads from a fully on-premises environment to cloud infrastructure has numerous benefits, including alleviating resource contention and reducing costs by paying for compute resources on demand. In the case of Presto running on data stored in HDFS, the separation of compute in the cloud and storage on-premises is apparent since Presto’s…

Improving Presto Latencies with Alluxio Data Caching

The Facebook Presto team has been collaborating with Alluxio on an open source data caching solution for Presto. Multiple Facebook use cases require this caching to improve latency for queries that scan data from remote sources such as HDFS. We have observed significant improvements in query latencies and IO scans in our experiments. We…