
Agenda
All times are in Pacific Daylight Time (PDT)


Welcome to PrestoCon Day! Join us for a day of all things open-source Presto. You’ll hear from Presto Foundation Chairs Curt and Ali as they share the latest updates from the community and what to expect for the day.

Tim shares the state of the project and what to look forward to next. The talk covers advancements in native processing and improvements to the C++ experience, new plugins and extension points for customizing Presto, support for table formats such as Iceberg and Delta Lake, and more.

In this session, we will delve into how Presto is empowering various use cases for both internal and external interactive analytics. We will explore the unique challenges that come with operating at Meta’s scale and discuss our strategies for overcoming them.




This panel brings together data leaders from Intuit, Meta, and IBM to share how large-scale organizations are architecting modern data platforms for speed, scale, and flexibility. The discussion spans real-world B2C challenges, internal innovations at hyperscale, and perspectives on where the stack is headed, especially in the age of AI.


At Uber, Presto is a critical engine for interactive analytics, processing hundreds of thousands of queries and scanning hundreds of petabytes of data daily. To meet the immense demands for low-latency queries and high reliability, Uber advanced its Alluxio deployment by engineering key architectural enhancements for greater scalability and reliability. This customized Alluxio system forms the backbone of our distributed remote caching layer, managing a cache that has scaled from 3 to 4 petabytes. This talk will delve into Uber’s strategies for achieving 99.99% cache reliability with this enhanced system, featuring robust client fallback mechanisms and the use of consistent hashing to maintain efficiency during cluster scaling.
A significant outcome of this implementation is substantial egress-bandwidth savings from the underlying storage, which is particularly crucial for performance and cost efficiency during peak hours. We will share insights into managing these large-scale cache clusters, highlighting our adaptive cache filter, which has been instrumental in achieving over 80% cache hit rates and optimizing resource utilization. Attendees will learn about the tangible benefits, best practices for leveraging Alluxio with Presto in high-throughput environments, and key takeaways for deploying a similar high-performance caching solution.
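The consistent-hashing technique mentioned above keeps cache placement stable as nodes join or leave: only keys owned by the departing node move, so hit rates survive cluster scaling. A minimal illustrative sketch (not Uber's or Alluxio's actual implementation; node names, hash choice, and replica count are assumptions):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps cache keys to cache nodes via a hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        # replicas = virtual nodes per physical node; more replicas
        # smooths the key distribution across nodes.
        self.replicas = replicas
        self._keys = []   # sorted hash positions on the ring
        self._ring = []   # (hash, node) pairs, parallel to _keys
        for node in nodes:
            self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            idx = bisect.bisect(self._keys, h)
            self._keys.insert(idx, h)
            self._ring.insert(idx, (h, node))

    def remove_node(self, node):
        # Only keys that mapped to this node are reassigned;
        # every other key keeps its current owner.
        kept = [(h, n) for h, n in self._ring if n != node]
        self._ring = kept
        self._keys = [h for h, _ in kept]

    def node_for(self, key):
        if not self._ring:
            raise RuntimeError("empty ring")
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

For example, removing one node from a three-node ring leaves every key that was owned by the other two nodes exactly where it was, which is the property that makes scaling events cheap for a cache.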


This talk will go over the current status of the Fusionnext project, focusing on what the native sidecar and sidecar plugins support, and how they should be configured in a Presto C++ deployment.


For an open-source project to thrive, it’s crucial to simplify the onboarding process for new contributors. This talk will guide you through setting up a development environment for Presto C++ projects using dev containers. By leveraging dev containers, new contributors can quickly start working on these projects, ensuring consistency and enhancing productivity across various operating systems.



In this talk, we present a recent extension to Prestissimo to support AI training data normalization at Meta. We describe the AI Data Storage system built at Meta to deduplicate user sequence data and enable fast retrieval of aggregated data across different dimensions. We then take a deep dive into the changes made to Prestissimo to allow user sequence data exploration through Presto SQL, including the introduction of a new Index Join Operator and an AI Data Storage Connector. Our extension enables optimized index join query plan generation and end-to-end query execution optimization.

When Presto queries fail—due to memory limits, skewed joins, or connector issues—engineers often scramble to diagnose and fix them manually. What if Presto could help fix itself? In this talk, I’ll present a prototype system that captures failed queries, analyzes failure patterns using LLMs, and automatically suggests (or retries) mitigated versions—e.g., adding session configs, breaking queries into smaller parts, or rewriting joins. We’ll cover how it works with query logs, the EXPLAIN plan, retry logic, and Presto’s session properties. This approach offers a glimpse into a future where Presto is not just fast, but smart and resilient.
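The retry flow described above, matching a failure message against known patterns and re-running the query with adjusted session properties, can be sketched in a few lines. This is a hedged illustration, not the presenter's prototype: the patterns, property values, and the `run_query` callable are assumptions, though `query_max_memory_per_node` and `query_max_execution_time` are real Presto session-property names.

```python
import re

# (failure-message pattern, session properties to set on retry) — assumed
# examples; real mitigations depend on the Presto version and deployment.
MITIGATIONS = [
    (re.compile(r"exceeded.*memory limit", re.I),
     {"query_max_memory_per_node": "8GB"}),
    (re.compile(r"exceeded.*time limit", re.I),
     {"query_max_execution_time": "2h"}),
]

def run_with_mitigation(run_query, sql, max_retries=2):
    """Run `sql`, retrying with mitigating session properties on failure.

    `run_query(sql, session)` is any callable that executes the query with
    the given session-property dict and raises RuntimeError carrying the
    engine's failure message on error.
    """
    session = {}
    for attempt in range(max_retries + 1):
        try:
            return run_query(sql, dict(session))
        except RuntimeError as err:
            fix = next((props for pat, props in MITIGATIONS
                        if pat.search(str(err))), None)
            if fix is None or attempt == max_retries:
                raise  # unrecognized failure, or out of retries
            session.update(fix)  # apply mitigation and retry
```

A fuller system would consult an LLM to classify novel failure messages or rewrite the query itself; the loop structure (capture failure, propose mitigation, retry) stays the same.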

In this session, we will cover how to get started with creating dynamically loaded user-defined functions in Presto C++. We will introduce the pros and cons of this new functionality, walk through the development process, and finish with a live demo showing these functions being loaded.

This talk introduces a lightweight developer playground that demonstrates how to ingest change data from a transactional database (like Postgres or MySQL), register it via an open-source REST catalog (e.g., Polaris or LakeKeeper), and instantly make it queryable in Presto. The demo will walk through the setup, tools, and real-time experience of how quickly one can go from source data to interactive Presto queries using open standards and pluggable components. Ideal for developers and data engineers exploring modern lakehouse and federated query patterns.


Presto has a TPCDS connector that lets users generate TPCDS tables at different scale factors. Recently we worked on adding a TPCDS connector to Presto C++, building on DuckDB’s TPCDS extension. DuckDB’s TPCDS extension provides C++ files that wrap the dsdgen data generator, which is implemented in C and provided by the TPC organization. We initially added the TPCDS connector in Presto C++; subsequently, the data-generation parts, including the dsdgen source files, were moved to Velox. The TPCDS connector lets us generate TPCDS data on the fly at different scale factors in Presto C++ and write microbenchmarks in Velox for various TPCDS queries. In this talk, we will provide an overview of the implementation, look at the challenges faced in ensuring correctness, and compare the performance of the connector in Presto and Presto C++.

PrestoCon 2025 closing remarks.