Keynote Panel: Presto at Scale – Shradha Ambekar, Gurmeet Singh, Neerad Somanchi & Rupa Gangatirkar

Keynote Panel: Presto at Scale – Shradha Ambekar, Gurmeet Singh, Neerad Somanchi & Rupa Gangatirkar

Over the last decade Presto has become one of the most widely adopted open source SQL query engines. In use at companies large and small, Presto’s performance, reliability, and efficiency at scale have become critical to many companies’ data infrastructures. In this panel we’ll hear from three of the largest companies running Presto at scale – Meta, Uber, and Intuit. They’ll share more about their learnings, some of their impressive performance metrics with Presto, and what they envision going forward for Presto at their respective companies.

Presto On Spark: Scaling not Failing with Spark – Ariel Weisberg, Meta & Shradha Ambekar, Intuit

Presto On Spark: Scaling not Failing with Spark – Ariel Weisberg, Meta & Shradha Ambekar, Intuit

Presto on Spark is an integration between Presto and Spark that leverages Presto’s compiler/evaluation as a library and Spark’s large scale processing capabilities. It enables a unified SQL experience between interactive and batch use cases. A unified option for batch data processing and ad hoc is very important for creating the experience of queries that scale instead of fail without requiring rewrites between different SQL dialects. In this session, we’ll talk about Presto On Spark architecture, why it matters and its implementation/usage at Intuit.

Disaggregated Coordinator – Swapnil Tailor, Facebook

Disaggregated Coordinator – Swapnil Tailor, Facebook

In the existing Presto architecture, single coordinator has become a bottleneck in a number of ways for cluster scalability. – With an increasing number of workers, the coordinator has the potential of slow down due to a high number of tasks. – In high QPS use cases, we have found workers can become starved of splits by excessive CPU being spend on task updates in coordinator. – Also with single coordinator, we have an upper limit on the worker pool because of above-mentioned reasons. To overcome with this challenges, we are coming up with a new architecture which supports multiple coordinators in a single cluster.