Ending DAG Distress: Building Self-Orchestrating Pipelines for Presto – Roy Hasson, Upsolver

Ending DAG Distress: Building Self-Orchestrating Pipelines for Presto – Roy Hasson, Upsolver

Ending DAG Distress: Building Self-Orchestrating Pipelines for Presto – Roy Hasson, Upsolver dbt and Airflow is a popular combination for creating and scheduling batch data modeling and transformation jobs that execute in a data warehouse like Snowflake. Presto users querying the data lake need a similar solution that is simple to use and makes it easy to ingest, model, transform and maintain datasets, without having to write or manage complex DAGs. In this session you will learn how Upsolver built a tool that allows engineers, developers and analysts to write data pipelines using SQL. Pipelines are automatically orchestrated, are data-aware and maintain a consistent data contract between each stage of the pipeline. You will also learn how to introduce the idea of data products into your company to enable more self-service for your Presto users.

Querying streaming data with Presto, Amazon Athena and Upsolver

Querying streaming data with Presto, Amazon Athena and Upsolver

In this session, Yoni will present on querying streaming data with Presto and Amazon Athena including performance, data partitioning and compaction. In addition, we will demo using the Upsolver platform with Amazon Athena. In addition, he will share what they are working on with Prestodb.

Predicting Resource Usages of Future Queries Based on 10M Presto Queries at Twitter

Predicting Resource Usages of Future Queries Based on 10M Presto Queries at Twitter

Here, Chunxu and Beinan would like to share what they have learned in developing a highly-scalable query predictor service through applying machine learning algorithms to ~10 million historical Presto queries to classify queries based on their CPU times and peak memory bytes. At Twitter, this service is helping to improve the performance of Presto clusters and provide expected execution statistics on Business Intelligence dashboards.

A Tour of Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang, Twitter

A Tour of Presto Iceberg Connector – Beinan Wang, Alluxio & Chunxu Tang, Twitter

Apache Iceberg is an open table format for huge analytic datasets. The Presto Iceberg connector consolidates the SQL engine and the table format, to empower high-performant data analytics. Here, Beinan and Chunxu would like to discuss and share the architectural design of the Presto Iceberg connector, advanced Iceberg feature support (such as native iceberg connector, row-level deletion, and iceberg v2 support), and the future roadmap.

Presto and Apache Iceberg – Chunxu Tang, Twitter

Presto and Apache Iceberg – Chunxu Tang, Twitter

Apache Iceberg is an open table format for huge analytic datasets. At Twitter, engineers are working on the Presto-Iceberg connector, aiming to bring high-performance data analytics on Iceberg to the Presto ecosystem. Here, Chunxu would like to share what they have learned during the development, hoping to shed light on the future work of interactive queries.

Level 101 for Presto: What is PrestoDB?

Level 101 for Presto: What is PrestoDB?

In Level 101, you’ll get an overview of Presto, including: A high level overview of Presto & most common use cases The problems it solves and why you should use it A live, hands-on demo on getting Presto running on Docker Real world example: How Twitter uses Presto at scale