Hands-on workshop
    Building an Open Lakehouse with Apache Hudi™ & Presto

    Thursday, July 31
    2pm – 4pm PDT
    IBM Silicon Valley Lab Office

    We’re hosting an in-person workshop in partnership with IBM and Onehouse! Join us for two hours of hands-on exercises and get to know the Presto and Apache Hudi communities.

    The lakehouse architecture brings together the scalability and flexibility of data lakes with the transactional capabilities traditionally found in data warehouses. This workshop is designed to equip data engineers and architects with the skills to build an open lakehouse using Apache Hudi on Amazon S3, with Presto as the engine for fast, interactive querying.

    Course outline:

    • Open Lakehouse architecture stack with Hudi as the lakehouse platform and Presto as the compute engine.
    • Practical exercises on:
      • Creating different Hudi table types (Copy-On-Write and Merge-On-Read)
      • Ingesting data
      • Syncing with catalogs such as Hive Metastore
      • Various ways of querying data using Presto, including snapshot and read-optimized queries
    • A look at upcoming features in the Hudi-Presto connector, including support for Hudi 1.0 and new indexing capabilities

    Prerequisites

    • Programming knowledge – Python, SQL
    • Basic understanding of data lake file/table formats

    Location

    IBM Silicon Valley Lab
    555 Bailey Ave.
    San Jose, CA 95141

    Event has passed

    Lab Instructors

    Kiersten Stokes
    Open Source Software Developer

    Sivabalan Narayanan
    Software Developer