What’s New in the Presto-Iceberg Connector
The last several years have seen the adoption of Apache Iceberg gaining significant momentum. As Apache Iceberg has grown, so has Presto’s support for the table format in its Presto-Iceberg connector. In this blog, we will briefly look at what Apache Iceberg is and why it’s important before diving into a summary of some of the important enhancements to Presto’s Iceberg connector in the last several months. Some of these features are not yet released, but will be a part of Presto version 0.288!
Apache Iceberg Background & Future Roadmap
If you’re not familiar with Apache Iceberg, it is an open source table format for storing petabyte-scale data sets. Using a host of metadata kept on each table, Iceberg provides functionality that is not traditionally available in a data lakehouse. This includes schema evolution, partition evolution, table version rollback, and plenty more. Iceberg manages the tracking of the relevant metadata using its concept of a catalog. Catalogs are implemented for several backend systems and are what allow Presto to load the tracked tables. Among the metadata that a catalog keeps track of include a list of table snapshots. A new snapshot is created every time the data changes.
Iceberg defines what features and functionality are supported in its table specification, or spec. The spec has gone through a few iterations: versions 1 and 2 are widely adopted by the community, with a third version in development right now. Version 1 is the first version and defines how Iceberg manages huge tables made up of Parquet, ORC or Avro files. Version 2 adds to the spec the ability to perform row-level operations like update and delete.
We’ll see more specifics about these Iceberg concepts as we go through the new enhancements to Presto below.
Presto-Iceberg Connector Enhancements
Time Travel Support
Time travel, in this case, refers to the ability to query historical data rather than limiting queries to only the current data version. This is a valuable feature for analyzing trends over time, reconstructing datasets, debugging and more. As of Presto version 0.286, users can query historical data by providing either a snapshot ID or a timestamp, along with the AS OF predicate. The ability to submit these queries using the BEFORE predicate was also recently added to the Presto codebase, set to be available in the 0.288 release. This blog provides a more in-depth look at time travel queries in Presto if you want to try an example for yourself.
Row-Level Deletes & Other Delete Enhancements
The concept of deleting or updating a row of data from a table is easy to understand logically and easy for a traditional OLTP database to perform, but is actually a very difficult concept in a data lakehouse. Data is stored in a columnar basis in Iceberg, which makes accessing and manipulating data faster but has the downside that a single-row operation has to touch several different files. Presto has added several delete-specific enhancements in the last few releases. The first came in Presto version 0.285, which supports deletion of one or more whole partitions in Iceberg tables. For example, a SQL statement such as DELETE FROM lineitem WHERE linenumber = 1; will succeed assuming the table lineitem is partitioned by linenumber.
Iceberg’s version 2 spec adds support for row-level deletion by encoding the operations in specific delete files. Recent Presto releases add support for these more fine-grained delete operations. Version 0.286 added read support for row-level deletes to the Iceberg connector, and version 0.287 implements full write support for row-level deletes on merge-on-read tables. Row-level deletes result in the creation of positional delete files that store the exact location of deleted records and are referenced on subsequent reads in order to display the most up-to-date information.
There is one additional recent delete enhancement for Iceberg tables available as of version 0.286, and that is the ability to apply equality deletes as a join operation. Consider a scenario in which log transactions are being streamed from another source and transformed into operations in an Iceberg table. The transactions enter at a rate of hundreds per second, and some of the operations are equality delete operations. This would be a very expensive operation, because each table split required a separate read of the equality delete file. By applying these deletes as a join operation instead, we reduce the number of reads of the equality delete file to a single occurrence, potentially saving an exponential amount of time. We can also take advantage of the built-in optimizations available for joins. This feature can be enabled with the new session property iceberg.delete_as_join_pushdown_enabled.
Partition Transform Enhancements
The concept of partitioning long predates Iceberg, but Iceberg uses what it calls hidden partitioning to make things even more seamless. For example, when partitioning in Hive, the partition column must be an explicit column of the table and of a particular data type. The user must keep track of how to insert new data into a partition column and how to query it. In Iceberg, partitions are logically separate from the table itself and can be optionally transformed for even more efficient organization. Partitioning can be done by hashing into a series of buckets, truncating values to a specified length, and for time-series data, chunking by year, month, day and hour. Bucketing, truncating, and identity transforms are called categorical transforms. The Presto-Iceberg connector has added several enhancements related to partition transforms over the last several releases.
While Presto has supported Iceberg bucket and truncate transforms for quite some time, version 0.285 added support not only for specifying a new table column as a partition column, but also for indicating a categorical transform when adding that column as well. This schema evolution operation would now look something like this: ALTER TABLE iceberg.web.page_views ADD COLUMN location VARCHAR WITH (partitioning = 'truncate(2)'). Presto version 0.286 goes on to add support for time-series partition transforms on columns of both date and timestamp types. It also supports indicating the time-series transform during schema evolution as well.
Iceberg REST Catalog Support
As mentioned above, Iceberg catalogs are the logical concept by which tables are accessed and tracked. Iceberg supports a handful of different catalog implementations, and Presto further supports most of these. Most recently, the Iceberg REST catalog has gained quick favor due to its flexibility. In the case of a REST catalog, the server-side catalog logic can be customized and implemented in any language as long as it follows the Iceberg REST Open API specification. Iceberg has prioritized work on the REST catalog, and new features will be available to REST catalog implementations first and foremost. As of version 0.288, Presto’s Iceberg connector will support the REST catalog, including basic OAUTH2 support and session support.
Concurrent Insertion Support
One of the often-touted features of Apache Iceberg is that it provides ACID compliance for table transactions, even when using eventually consistent data lakehouse stores like S3. This is possible once again due to Iceberg’s unique metadata storage model. All changes to a table’s data are captured – either directly or indirectly – in a single metadata file that replaces the previous version in an atomic operation. This provides the ability to do time travel operations, concurrent insertions, and other reliability guarantees. Presto version 0.287 enhanced the Iceberg connector by providing a new configurable table property, iceberg.commit_retries, that determines the number of attempts to commit the new metadata in case of concurrent requests. The default is 4.
View Support
Views are, as the name implies, a logical view of data that references data in one or multiple stored tables. A view is essentially a virtual table that is represented by a query, and this query is executed whenever the view is invoked. All processing engines store view and view metadata in a different way that cannot necessarily be accessed by other engines. Iceberg views seek to fix this issue and make views available for cross-engine access by standardizing the metadata stored for a view, similar to how Iceberg standardizes general table metadata storage. As of Presto version 0.285, the Iceberg connector has view support for Hive and Glue metastore Iceberg catalogs.
Table Register & Unregister Support
Version 0.286 of the Presto-Iceberg connector also adds the ability to register Iceberg tables with Presto. This means that Iceberg tables with data and metadata already existing in the file system can be added to a particular Presto catalog using the procedure <presto_catalog_name>.system.register_table. This can be useful in the context of migrating catalogs. The new procedure takes the target schema, desired table name, and the location of the table metadata as arguments, with the option to specify a specific metadata file location as a fourth argument. Likewise, a procedure for unregistering an Iceberg table is also available: <presto_catalog_name>.system.unregister_table('<schema_name>', '<table_name>'). This procedure removes a table from the catalog and, in the case of native Iceberg catalogs, removes its data and metadata from the underlying file system as well.
Presto C++ Enhancements
Lastly, there has been a great deal of work on the Iceberg connector for the Presto’s C++-based execution engine currently in development. Code-named Prestissimo, it will leverage the Velox library data processing primitives to vastly improve CPU efficiency while also eliminating unpredictable Java memory accounting issues. The Iceberg connector combined with the Prestissimo engine is poised to deliver impressive speed at impressive scale. Iceberg support for Prestissimo in the last few Presto releases has seen a lot of added support. As of Presto version 0.287, the native Iceberg connector can support reading Iceberg version 2 tables with positional deletes and has full support for Iceberg version 1 tables. Version 0.286 includes enhancements for filter pushdown, including a new optimizer rule for Velox execution and a configurable property to control the pushdown behavior for Java versus native engines.
Conclusion
The last 6 months have seen a great deal of work dedicated to Presto’s Iceberg connector, and momentum is building. As Iceberg continues to grow in popularity, we look forward to continuing the work to support all its great features.
As a true open source project, Presto welcomes community and user involvement of any kind. Find information about how you can reach out with questions and suggestions by joining our Slack channel, or learn more about contributing here. If you’re new to the Presto-Iceberg connector and are interested in learning more, register for our upcoming Building an Open Data Lakehouse with Presto & Apache Iceberg workshop. We look forward to your participation!