Presto at Adobe: How Adobe Advertising uses Presto for Adhoc Query, Custom Reporting, and Internal Pipelines

Presto at Adobe: How Adobe Advertising uses Presto for Adhoc Query, Custom Reporting, and Internal Pipelines

Rajmani Arya, Varun Senthilnathan & Manoj Kumar Dhakad, Adobe Advertising: We are from the Product Engineering team in Adobe Advertising (https://business.adobe.com/in/product…. Adobe Advertising is a digital advertisement platform. We take care of accumulating all data, providing platform intelligence, building and maintaining machine leaning capabilities, building and maintaining internal pipelines that form derived data to be used by other teams. The volume of total incoming raw data ranges between 8 to 10 tb/ day spread across 7 regions. The total data in the system currently is about 7pb. This data is largely stored in Hive tables with a central metastore. We use Presto in three ways: 1. Data studio – an internal tool to enable data analysts, sales, marketing and other teams to do adhoc querying. This is also used by data engineers to do adhoc querying for engineering tasks. 2. Custom Reports – We create reports for customers to get performance insights on their campaigns. We have 100s of reports that are run on a daily basis. 3. Internal Pipelines – Presto is used to retrieve data to power 100s of pipelines run daily to generate derived data.

Scaling Cache for Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang

Scaling Cache for Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang

While using the Presto Iceberg connector, the in-heap cache in Presto is likely overloaded. In this talk, Beinan and Chunxu will share the design, implementation, and optimization of the off-heap cache to address the scalability challenges. You will learn how to cache Iceberg data and metadata for the Presto Iceberg connector, followed by future work on improving table scans using Apache Arrow.

A Tour of Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang, Twitter

A Tour of Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang, Twitter

Apache Iceberg is an open table format for huge analytic datasets. The Presto Iceberg connector consolidates the SQL engine and the table format, to empower high-performant data analytics. Here, Beinan and Chunxu would like to discuss and share the architectural design of the Presto Iceberg connector, advanced Iceberg feature support (such as native iceberg connector, row-level deletion, and iceberg v2 support), and the future roadmap.

Presto and Apache Iceberg – Chunxu Tang, Twitter

Presto and Apache Iceberg – Chunxu Tang, Twitter

Apache Iceberg is an open table format for huge analytic datasets. At Twitter, engineers are working on the Presto-Iceberg connector, aiming to bring high-performance data analytics on Iceberg to the Presto ecosystem. Here, Chunxu would like to share what they have learned during the development, hoping to shed light on the future work of interactive queries.