Improving Schema Management in Presto: Passing Catalog Names to the Metastore
Managing schemas in Presto just got a lot smarter. Thanks to a new enhancement, Presto can now pass catalog names directly to the metastore, enabling better logical organization, filtering, and schema isolation across multiple catalogs. This improvement significantly enhances the experience for users working with Hive, Hudi, Delta, and Iceberg catalogs.
The Problem Before
Historically, Presto interacted with the metastore assuming all schemas lived under a default catalog, typically "hive". This caused several challenges:
- Lack of catalog-awareness: The metastore treated all schemas as part of a single catalog namespace ("hive").
- No schema isolation: Users couldn't create the same schema name under different catalogs.
- Inefficient schema filtering: There was no way to filter schemas based on the catalog association at the metastore level.
This limitation led to confusion, cluttered namespace management, and potential naming conflicts for users operating multi-catalog environments.
The Solution: Catalog-Aware Metastore Integration
With the introduction of a new configuration property, Presto now supports passing the catalog name to the metastore:
hive.metastore.catalog.name=<catalog-name>
This update applies across Hive, Hudi, Delta, and Iceberg catalogs, helping Presto users better manage metadata in modern, multi-catalog setups. You can view the full implementation details in the PR.
What This Changes
Catalog + Schema = Unique Key
The metastore can now treat the combination of catalog and schema as a unique identifier.
Example: You can have the same schema name under different catalogs, such as sales.analytics and customer.analytics.
Schema Isolation Across Catalogs
Logical separation of schemas across data sources or domains is now possible, reducing naming collisions and enabling multi-tenant designs.
Efficient Schema Filtering
Schema queries can be filtered at the source by catalog, improving query performance and making results more accurate.
Simplified Storage-Catalog Mapping
Passing the catalog name simplifies the relationship between physical storage and catalogs in the metastore.
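The behaviors described in this section can be illustrated with a small toy model. This is only a sketch: `ToyMetastore` and its methods are hypothetical names invented for illustration, and they do not correspond to any Presto or Hive Metastore API. It shows a schema keyed by the (catalog, schema) pair and a schema listing that is filtered by catalog inside the store.

```python
# Toy in-memory model of a catalog-aware metastore (illustrative only;
# this is not Presto or Hive Metastore code).

class ToyMetastore:
    def __init__(self):
        # Each schema is identified by the pair (catalog, schema).
        self._schemas = set()

    def create_schema(self, catalog, schema):
        key = (catalog, schema)
        if key in self._schemas:
            raise ValueError(f"schema already exists: {catalog}.{schema}")
        self._schemas.add(key)

    def list_schemas(self, catalog):
        # Filtering happens inside the store, so callers only receive
        # the schemas that belong to the requested catalog.
        return sorted(s for (c, s) in self._schemas if c == catalog)


metastore = ToyMetastore()
metastore.create_schema("sales", "analytics")
metastore.create_schema("customer", "analytics")  # same name, different catalog
metastore.create_schema("sales", "reporting")

print(metastore.list_schemas("sales"))     # ['analytics', 'reporting']
print(metastore.list_schemas("customer"))  # ['analytics']
```

Creating sales.analytics a second time raises an error in this model, while customer.analytics coexists with it because the catalog is part of the key.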
Why It Matters
This update fixes a long-standing limitation in Presto and aligns with how some metastores, such as IBM Metastore and Hive Metastore, already handle catalogs internally. By fully supporting catalog-aware schema grouping, Presto now:
- Removes the assumption of a single, default catalog
- Unlocks flexible data modeling patterns
- Makes metadata queries more efficient and semantically meaningful
It's a big step forward for users managing complex or multi-tenant data lake architectures.
How to Use It
To enable catalog-aware schema management, set the configuration property in your catalog properties file (hive.properties, delta.properties, etc.):
hive.metastore.catalog.name=<catalog-name>
You'll need to ensure your metastore (Hive or IBM Metastore) already has the catalog name registered. You can verify this by checking the CTLGS table in your metastore.
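As a rough sketch of the CTLGS check, the snippet below looks up a catalog name in a stand-in table. Several things here are assumptions: an in-memory SQLite database stands in for the metastore's real backing database, the NAME and LOCATION_URI columns are assumed to follow the Hive Metastore 3.x schema, the sample rows are made up, and `catalog_is_registered` is a hypothetical helper. Against a real metastore database you would use that database's own driver, parameter style, and identifier quoting.

```python
import sqlite3

# In-memory SQLite stands in for the metastore's backing database.
# The table layout is an assumption modeled on the Hive Metastore CTLGS table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CTLGS ("
    "CTLG_ID INTEGER PRIMARY KEY, "
    "NAME TEXT UNIQUE, "
    "LOCATION_URI TEXT NOT NULL)"
)
# Made-up sample rows for illustration.
conn.executemany(
    "INSERT INTO CTLGS (NAME, LOCATION_URI) VALUES (?, ?)",
    [("hive", "file:/warehouse"), ("foo", "file:/warehouse/foo")],
)


def catalog_is_registered(connection, catalog_name):
    """Return True if catalog_name has a row in CTLGS."""
    row = connection.execute(
        "SELECT 1 FROM CTLGS WHERE NAME = ?", (catalog_name,)
    ).fetchone()
    return row is not None


print(catalog_is_registered(conn, "foo"))  # True
print(catalog_is_registered(conn, "bar"))  # False
```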
Real-World Example
Suppose you need to connect to two different metastores, each with the same catalog name (foo) already registered. You can create two catalog property files in Presto, each configured for a different metastore:
Configuration
foo-a-metastore.properties:
hive.metastore.catalog.name=foo
hive.metastore.uri=thrift://metastore-a:9083
foo-b-metastore.properties:
hive.metastore.catalog.name=foo
hive.metastore.uri=thrift://metastore-b:9083
With this setup, both Presto catalogs pass the catalog name foo, while each one talks to its own metastore independently. This makes schema organization and access control simpler and more reliable across different environments.
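If you manage many such files, the two configurations above could be generated from a small template. The sketch below is one possible way to do that, not part of Presto itself: `render_catalog_properties` is a hypothetical helper, and the `connector.name=hive-hadoop2` line is an assumption about a Hive-connector catalog that does not appear in the example above.

```python
def render_catalog_properties(metastore_uri, metastore_catalog="foo",
                              connector="hive-hadoop2"):
    """Build the text of a catalog properties file (hypothetical helper)."""
    lines = [
        # Assumed connector line; adjust for Hudi, Delta, or Iceberg catalogs.
        f"connector.name={connector}",
        f"hive.metastore.uri={metastore_uri}",
        f"hive.metastore.catalog.name={metastore_catalog}",
    ]
    return "\n".join(lines) + "\n"


# Same metastore catalog name (foo), two different metastore endpoints.
catalog_files = {
    "foo-a-metastore.properties": render_catalog_properties("thrift://metastore-a:9083"),
    "foo-b-metastore.properties": render_catalog_properties("thrift://metastore-b:9083"),
}

for filename, content in catalog_files.items():
    print(f"# {filename}")
    print(content)
```

Note that Presto names each catalog after its properties file, so in SQL these two would appear as foo-a-metastore and foo-b-metastore, even though both send foo to their metastore.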
Final Thoughts
This enhancement brings schema management in Presto into better alignment with the needs of modern open-source data lake architectures. Whether you’re running multiple storage backends or building a multi-tenant platform, this update offers the structure and flexibility to grow.
Ready to bring more order to your schemas? Start using hive.metastore.catalog.name today!