Getting started with the new Redis HBO for Presto (Aug 30, 2023)

Learn more about the new open-source Redis-based Historical Statistics Provider for Presto from Jay Narale, the software engineer at Uber who built it. Redis is an open-source in-memory database that integrates with Presto through a dedicated connector. With the new Redis history-based optimizer (HBO), Presto can use statistics from past query runs to generate better plans, improving the speed and efficiency of query execution. Jay will cover how the Redis HBO uses the in-memory capabilities of Redis to store and analyze historical query execution data. That data lets the optimizer base its query planning and resource allocation decisions on historical query patterns, leading to improved execution times and resource utilization.
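To make the idea concrete, here is a minimal sketch of how a history-based statistics lookup could work. It is not the Presto or Redis HBO implementation: the names `RunStats`, `HistoryStore`, `plan_key`, and `estimate_rows` are hypothetical, and a plain Python dict stands in for Redis so the example runs without a server.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RunStats:
    """Statistics observed when a plan fragment last ran (hypothetical shape)."""
    output_rows: int
    output_bytes: int


class HistoryStore:
    """Dict-backed stand-in for a Redis key-value store (assumption)."""

    def __init__(self) -> None:
        self._data: Dict[str, RunStats] = {}

    def put(self, key: str, stats: RunStats) -> None:
        # Record the stats observed after a query finishes.
        self._data[key] = stats

    def get(self, key: str) -> Optional[RunStats]:
        # Return None when this plan has never been seen before.
        return self._data.get(key)


def plan_key(canonical_plan: str) -> str:
    """Hash a canonical plan string so similar queries share one key."""
    return hashlib.sha256(canonical_plan.encode("utf-8")).hexdigest()


def estimate_rows(store: HistoryStore, canonical_plan: str, fallback: int) -> int:
    """Prefer the historical row count; otherwise use the fallback estimate."""
    stats = store.get(plan_key(canonical_plan))
    return stats.output_rows if stats is not None else fallback


store = HistoryStore()
plan = "scan(orders) -> filter(status = ?)"

# First run: no history yet, so the optimizer's own estimate is used.
first_estimate = estimate_rows(store, plan, fallback=1000)

# After execution, the observed stats are written back to the store.
store.put(plan_key(plan), RunStats(output_rows=42, output_bytes=4200))

# Later runs of the same plan shape pick up the observed row count.
second_estimate = estimate_rows(store, plan, fallback=1000)
print(first_estimate, second_estimate)
```

In a real deployment the store would be a Redis instance shared by the cluster, so statistics written after one query are available when planning the next.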

Presto at Adobe: How Adobe Advertising uses Presto for Adhoc Query, Custom Reporting, and Internal Pipelines

Rajmani Arya, Varun Senthilnathan & Manoj Kumar Dhakad, Adobe Advertising: We are from the Product Engineering team in Adobe Advertising (https://business.adobe.com/in/product…). Adobe Advertising is a digital advertising platform. We are responsible for accumulating all data, providing platform intelligence, building and maintaining machine learning capabilities, and building and maintaining the internal pipelines that produce derived data for other teams. The volume of incoming raw data ranges from 8 to 10 TB/day, spread across 7 regions. The total data in the system is currently about 7 PB. This data is largely stored in Hive tables with a central metastore. We use Presto in three ways:
1. Data studio – an internal tool that lets data analysts, sales, marketing, and other teams run ad hoc queries. Data engineers also use it for ad hoc querying during engineering tasks.
2. Custom Reports – We create reports that give customers performance insights on their campaigns. We have hundreds of reports that run daily.
3. Internal Pipelines – Presto retrieves the data that powers hundreds of pipelines, which run daily to generate derived data.