Virtual Workshop
    Build an Open Data Lakehouse in a Day

    Hands-on with Presto, Iceberg, and Hudi

    Wednesday, Sept. 17 | 9:30am PT on Zoom

    The Data Lakehouse combines the flexibility, scale, and cost-efficiency of a data lake with the data management features of a warehouse. In this beginner-friendly workshop, we’ll take that concept a step further by building an Open Data Lakehouse using fully open-source technologies and formats.

    You’ll get hands-on experience with:

    • Presto as the open-source SQL query engine
    • MinIO for S3-compatible object storage
    • Two open table formats, Apache Iceberg and Apache Hudi, so you can compare their capabilities side by side

    We’ll guide you through setting up and connecting these tools, running queries, and exploring the rich metadata and features that Iceberg and Hudi provide. By the end, you’ll understand how these technologies fit together and have the skills to start building your own Open Data Lakehouse.

    Course outline:

    • Introduction to the Open Data Lakehouse and the Presto query engine
    • Introduction to table formats and common use cases, with a quick overview of both Iceberg and Hudi
    • Querying S3 data with Presto
    • Integrating Iceberg and Hudi with Presto
    • Working with data and metadata tables

    Register to reserve your workshop seat

    Lab Instructors

    Kiersten Stokes
    Software Developer

    Yihong Wang
    Software Developer