Hands-on Virtual Workshop:
    Building an Open Data Lakehouse with Presto and Apache Iceberg

    February 28, 2024 | 10am PT on Zoom

    You may be familiar with the Data Lakehouse, an emerging architecture that brings the flexibility, scale and cost management benefits of the data lake together with the data management capabilities of the data warehouse. In this workshop, we’ll get hands-on building an Open Data Lakehouse – an approach that brings open technologies and formats to your lakehouse.

    This is a beginner-level workshop for software developers and engineers who are building data platforms. We’ll use Presto as the open-source SQL query engine, Apache Iceberg as the table format that enables ACID transactions, and MinIO S3-compatible object storage for the data lake.

    You’ll get hands-on with Presto and Iceberg. We’ll show you how to set up and connect these technologies, how to run queries on your data, and how to access and interpret Iceberg metadata. By the end, you should be well-versed in Presto and Iceberg and have the building blocks to create your own Open Data Lakehouse.

    Course outline:

    • Introduction to the Open Data Lakehouse and the Presto query engine
    • Introduction to Apache Iceberg and common use cases
    • Querying S3 data with Presto
    • Integrating Iceberg with Presto
    • Working with Iceberg data and metadata tables
    • Future roadmap – additional Iceberg support coming to Presto, such as time travel and merge-on-read

    This event has ended

    Lab Instructors

    Kiersten Stokes
    Software Developer

    Yihong Wang
    Software Developer