Discovering Data with Presto and Amundsen at Lyft

Amundsen is an open-source data discovery and metadata platform that is part of the LF AI & Data Foundation. In this talk, we will dive deep into Amundsen’s architecture and show how we integrate Amundsen with Presto to power data preview and data exploration.

Disaggregated Coordinator – Swapnil Tailor, Facebook

In the existing Presto architecture, the single coordinator has become a bottleneck for cluster scalability in several ways. With an increasing number of workers, the coordinator can slow down because of the high number of tasks it must manage. In high-QPS use cases, we have found that workers can become starved of splits because excessive CPU is spent on task updates in the coordinator. For these reasons, a single coordinator also imposes an upper limit on the size of the worker pool. To overcome these challenges, we are introducing a new architecture that supports multiple coordinators in a single cluster.