How Blinkit is Building an Open Data Lakehouse with Presto on AWS – Satyam Krishna & Akshay Agarwal

Blinkit, India’s leading instant delivery service, uses Presto on AWS to help deliver on its promise of “everything delivered in 10 minutes”. In this session, Satyam and Akshay will discuss why they moved from their cloud data warehouse to Presto on S3 for more flexibility and better price performance. They’ll also describe their open data lakehouse architecture, which includes Presto as the SQL engine for ad hoc reporting, Ahana as SaaS for Presto, Apache Hudi and Iceberg to help manage transactions, and AWS S3 as the data lake.

Disaggregated Coordinator – Swapnil Tailor, Facebook

In the existing Presto architecture, the single coordinator has become a bottleneck for cluster scalability in several ways. First, as the number of workers grows, the coordinator can slow down because of the high number of tasks it must manage. Second, in high-QPS use cases, we have found that workers can become starved of splits because the coordinator spends excessive CPU on task updates. Third, for these reasons, a single coordinator places an upper limit on the size of the worker pool. To overcome these challenges, we are introducing a new architecture that supports multiple coordinators in a single cluster.