Keynote Panel: Presto at Scale – Shradha Ambekar, Gurmeet Singh, Neerad Somanchi & Rupa Gangatirkar

Over the last decade, Presto has become one of the most widely adopted open source SQL query engines. It is in use at companies large and small, and its performance, reliability, and efficiency at scale have become critical to many organizations’ data infrastructure. In this panel we’ll hear from three of the largest companies running Presto at scale: Meta, Uber, and Intuit. They’ll share what they have learned, some of their impressive performance metrics with Presto, and what they envision for Presto at their respective companies going forward.

PrestoDB in HPE Ezmeral Unified Analytics – Milind Bhandarkar, HPE

HPE Ezmeral Unified Analytics is an end-to-end data and AI/ML platform that brings together several popular open-source frameworks for data engineering, data analytics, data science, and ML engineering in a well-integrated package. These open-source frameworks include Apache Spark, Apache Airflow, Apache Superset, PrestoDB, MLflow, Kubeflow, and Feast. The platform is built atop Kubernetes and provides built-in security. In this talk we will focus on the role of PrestoDB in Unified Analytics as a fast SQL query engine and as a secure data access layer. We will discuss some of our value-additions to PrestoDB, such as a distributed, memory-centric columnar caching layer that provides both explicit and transparent caching for dataset fragments, often improving query performance by 3x to 4x. We will conclude by proposing to make caching pluggable in PrestoDB and by discussing future directions.

Presto On Spark: Scaling not Failing with Spark – Ariel Weisberg, Meta & Shradha Ambekar, Intuit

Presto on Spark is an integration between Presto and Spark that leverages Presto’s compiler/evaluation as a library and Spark’s large-scale processing capabilities. It enables a unified SQL experience across interactive and batch use cases. A single option for both batch data processing and ad hoc querying is important for creating an experience in which queries scale instead of fail, without requiring rewrites between different SQL dialects. In this session, we’ll talk about the Presto on Spark architecture, why it matters, and its implementation and usage at Intuit.