Query Execution Optimization for Broadcast Join using Replicated-Reads Strategy – George Wang, Ahana

Query Execution Optimization for Broadcast Join using Replicated-Reads Strategy – George Wang, Ahana

Today presto supports broadcast join by having a worker to fetch data from a small data source to build a hash table and then sending the entire data over the network to all other workers for hash lookup probed by large data source. This can be optimized by a new query execution strategy as source data from small tables is pulled directly by all workers which is known as replicated reads from dimension tables. This feature comes with a nice caching property given that all worker nodes N are now participating in scanning the data from remote sources. The table scan operation for dimension tables is cacheable per all worker nodes. In addition, there will be better resource utilization because the presto scheduler can now reduce the number plan fragment to execute as the same workers run tasks in parallel within a single stage to reduce data shuffles.

Prism: Presto Gateway Service at Uber – Hitarth Trivedi, Uber

Prism: Presto Gateway Service at Uber – Hitarth Trivedi, Uber

Prism is a gateway service for all Presto queries at Uber. It addresses Uber specific needs in four main areas – resource management, query gating, monitoring, and security. It is responsible for proxying over three million weekly queries from 6000+ weekly active users across all of Uber. Presto has variable execution times due to high multi-tenancy at Uber. Prism helps in overcoming those challenges using features like query routing, load balancing, query gating, session parameter checks, failover clusters which helps in maintaining a 99.9% availability and reliability SLA for Presto at Uber. Functionality – Query Execution: 1. Async execution API returns data stream 2. Async execution API returns File Descriptor – Routing – Prism can route queries to different clusters based on client sources. Other functionalities: Load Balancing, Query Gating, Failover, Session Properties, Security