Implementing Lakehouse Architecture with Presto at Bolt – Kostiantyn Tsykulenko, Bolt.eu

Bolt.eu is the first European mobility super-app. We have over 100M users across Europe and Africa and handle data at large scale every day (over 100k queries daily). We previously used a traditional data warehouse based on Redshift, but we ran into scalability issues that were hard to overcome, and after researching alternatives we chose Presto. In just a single year we migrated to the Lakehouse architecture using AWS, Presto, Spark, and Delta Lake. We would like to talk about our journey, some of the challenges we encountered, and how we solved them.

Delta Lake Connector for Presto – Denny Lee, Databricks

Delta Lake is an open-source project that enables building a lakehouse architecture on top of existing storage systems such as S3, ADLS, GCS, and HDFS. We, the Presto and Delta Lake communities, have come together to make it easier for Presto to leverage the reliability of data lakes by integrating with Delta Lake. In this session, we would like to share the design decisions and internals of the Presto/Delta connector.

Prism: Presto Gateway Service at Uber – Hitarth Trivedi, Uber

Prism is a gateway service for all Presto queries at Uber. It addresses Uber-specific needs in four main areas: resource management, query gating, monitoring, and security. It proxies over three million queries a week from more than 6,000 weekly active users across all of Uber. Presto has variable execution times due to high multi-tenancy at Uber. Prism helps overcome those challenges with features such as query routing, load balancing, query gating, session parameter checks, and failover clusters, which help maintain a 99.9% availability and reliability SLA for Presto at Uber.

Functionality:

Query execution:
1. Async execution API that returns a data stream
2. Async execution API that returns a file descriptor

Routing: Prism can route queries to different clusters based on client sources.

Other functionality: load balancing, query gating, failover, session properties, security.
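To make the routing idea concrete, here is a minimal Python sketch of a gateway that picks a cluster based on the client source, rotates across healthy clusters, and falls back to a failover cluster. It is not Prism's implementation: the `Cluster` and `Gateway` classes, the `pick_cluster` method, the round-robin policy, and the cluster and source names are all hypothetical and chosen only for illustration. Query gating and session parameter checks are left out.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """A Presto cluster as seen by the gateway (hypothetical model)."""
    name: str
    healthy: bool = True


class Gateway:
    """Routes queries by client source, with round-robin and failover."""

    def __init__(self, routes, failover):
        # routes maps a client source (e.g. "dashboard") to its clusters.
        self._routes = routes
        self._failover = failover
        # One rotation counter per client source.
        self._counters = {source: 0 for source in routes}

    def pick_cluster(self, source):
        # Only consider clusters that are currently marked healthy.
        candidates = [c for c in self._routes.get(source, []) if c.healthy]
        if not candidates:
            # Unknown source or no healthy cluster: use the failover cluster.
            if self._failover.healthy:
                return self._failover
            raise RuntimeError("no healthy cluster available")
        # Round-robin across the healthy clusters for this source.
        index = self._counters[source]
        self._counters[source] = index + 1
        return candidates[index % len(candidates)]


if __name__ == "__main__":
    adhoc = [Cluster("adhoc-1"), Cluster("adhoc-2")]
    gateway = Gateway({"notebook": adhoc}, failover=Cluster("failover-1"))
    for _ in range(3):
        print(gateway.pick_cluster("notebook").name)
    # Simulate an outage of both ad hoc clusters.
    for cluster in adhoc:
        cluster.healthy = False
    print(gateway.pick_cluster("notebook").name)
```

In this sketch, consecutive queries from the same source alternate between that source's healthy clusters, and a source with no healthy cluster (or no configured route) is sent to the failover cluster. A production gateway would presumably base these decisions on live cluster load and health checks rather than a static flag.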