IBM watsonx.data – a modern open data lakehouse architecture, built on Presto!

Today we are happy to share that IBM watsonx.data, a Presto-based Open Data Lakehouse architecture, is now generally available. Back in April we shared that IBM had joined the Presto Foundation through the acquisition of Ahana. To reiterate what we talked about then, we believe that this is an exciting time for the Presto open…

IBM joins the Presto Foundation through acquisition of Ahana

Today we’re thrilled to share that IBM has acquired Ahana, the venture-backed startup offering SaaS for Presto, and we want to write more about our belief in open source and why IBM and Ahana are joining forces for the benefit of Presto. We believe that this is an exciting time for the Presto project. We’re…

Faster Presto Queries with Parquet Page Index

Data volumes are growing rapidly, which creates challenges for query engines like Presto. Presto is a popular interactive query engine because of its scalability, high performance, and smooth integration with Hadoop. As the volume of data grows, Presto needs to read larger chunks of data and load them into memory, which causes higher…

Building a high-performance platform on AWS to support real-time gaming services using Presto and Alluxio

Electronic Arts (EA) is a leading company in the gaming industry, providing dozens of games to serve billions of users worldwide each year. Making near real-time decisions for EA’s online services is critical for our business. This blog describes a data platform on AWS based on Presto and Alluxio to support online services with instantaneous…

Running Presto in a Hybrid Cloud Architecture

Migrating SQL workloads from a fully on-premises environment to cloud infrastructure has numerous benefits, including alleviating resource contention and reducing costs by paying for compute resources on demand. In the case of Presto running on data stored in HDFS, the separation of compute in the cloud and storage on-premises is apparent since Presto’s…

Improving Presto Latencies with Alluxio Data Caching

The Facebook Presto team has been collaborating with Alluxio on an open source data caching solution for Presto. Multiple Facebook use cases need this caching to reduce latency for queries that scan data from remote sources such as HDFS. We have observed significant improvements in query latencies and IO scans in our experiments. We…