Scaling Cache for Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang

Scaling Cache for Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang

While using the Presto Iceberg connector, the in-heap cache in Presto is likely overloaded. In this talk, Beinan and Chunxu will share the design, implementation, and optimization of the off-heap cache to address the scalability challenges. You will learn how to cache Iceberg data and metadata for the Presto Iceberg connector, followed by future work on improving table scans using Apache Arrow.

Presto Query Analysis for Data Layout Formatting and Query Result Caching – Gurmeet Singh, Uber

Presto Query Analysis for Data Layout Formatting and Query Result Caching – Gurmeet Singh, Uber

In this talk, I will be talking about a microservice that we have built at Uber to be able to analyze Presto queries. The Presto Query Engine does not provide endpoints for query analysis purposes. One has to either execute the query or gather insights from the query explain plan. In this talk, I will talk about 1. The work that we had to do to do the query analysis in a microservice using Presto as a library. 2. Doing predicate analysis on the queries to come up with data formatting recommendations in order to improve query performance. 3. Using the analysis service for query result cache invalidation. The analysis figures out whether the results from a previous run of the query are still valid and can be reused.

A Tour of Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang, Twitter

A Tour of Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang, Twitter

Apache Iceberg is an open table format for huge analytic datasets. The Presto Iceberg connector consolidates the SQL engine and the table format, to empower high-performant data analytics. Here, Beinan and Chunxu would like to discuss and share the architectural design of the Presto Iceberg connector, advanced Iceberg feature support (such as native iceberg connector, row-level deletion, and iceberg v2 support), and the future roadmap.

Presto and Apache Iceberg – Chunxu Tang, Twitter

Presto and Apache Iceberg – Chunxu Tang, Twitter

Apache Iceberg is an open table format for huge analytic datasets. At Twitter, engineers are working on the Presto-Iceberg connector, aiming to bring high-performance data analytics on Iceberg to the Presto ecosystem. Here, Chunxu would like to share what they have learned during the development, hoping to shed light on the future work of interactive queries.