Release 0.254
Warning
There is a backward compatibility issue in the DWRF writer that might cause other engines to be unable to read files written by this release.
Details
General Changes
Fix a bug where queries that have both remote functions and a local function with only constant arguments could throw an IndexOutOfBoundException during planning. The bug was introduced by #16039.
Fix a CPU regression for queries using element_at() for MAP. Introduced by #16027.
Add fragment result caching support for UNNEST queries.
Add poisson_cdf() and inverse_poisson_cdf() functions.
Add memory tracking in TableFinishOperator which can be enabled by setting the table-finish-operator-memory-tracking-enabled configuration property to true. Enabling this property can help with investigating GC issues on the coordinator by allowing us to debug whether stats collection uses a lot of memory.
Remove spilling strategy PER_QUERY_MEMORY_LIMIT and add configuration property experimental.query-limit-spill-enabled and session property query_limit_spill_enabled. When query_limit_spill_enabled is set to true and the spill strategy is not PER_TASK_MEMORY_THRESHOLD, then we will spill whenever a query uses more than the per-node total memory limit in combined revocable and non-revocable memory.
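The spill condition described for query_limit_spill_enabled can be read as a simple predicate. The following is a minimal Python sketch of that reading, not the engine's implementation: the function name, its parameters, and the placeholder strategy name SOME_OTHER_STRATEGY are hypothetical and exist only for illustration.

```python
def should_spill_for_query_limit(
    query_limit_spill_enabled: bool,
    spill_strategy: str,
    revocable_bytes: int,
    non_revocable_bytes: int,
    per_node_total_memory_limit_bytes: int,
) -> bool:
    """Return True when the query-limit spill rule from the release note applies.

    All names here are hypothetical; they mirror the wording of the note only.
    """
    # The rule is off unless the property is set to true.
    if not query_limit_spill_enabled:
        return False
    # The rule does not apply under the PER_TASK_MEMORY_THRESHOLD strategy.
    if spill_strategy == "PER_TASK_MEMORY_THRESHOLD":
        return False
    # Spill once combined revocable and non-revocable memory exceeds the limit.
    combined = revocable_bytes + non_revocable_bytes
    return combined > per_node_total_memory_limit_bytes


if __name__ == "__main__":
    limit = 10 * 1024**3  # illustrative 10 GiB per-node total memory limit
    print(should_spill_for_query_limit(True, "SOME_OTHER_STRATEGY", 6 * 1024**3, 5 * 1024**3, limit))
```

Note that the comparison is on the sum of both memory kinds, so a query can trigger spilling even when neither its revocable nor its non-revocable memory exceeds the limit on its own.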
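For readers unfamiliar with the quantities behind poisson_cdf() and inverse_poisson_cdf(), the sketch below computes the standard Poisson cumulative distribution, P(N <= n) = sum over k from 0 to n of e^(-lambda) * lambda^k / k!, and its inverse in plain Python. It assumes the new functions follow this textbook definition. The Python argument order, the max_n guard, and the error handling are choices of this sketch, not a statement about the engine's signatures.

```python
import math


def poisson_cdf(lam: float, value: int) -> float:
    """P(N <= value) for a Poisson distribution with mean lam (textbook definition)."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if value < 0:
        raise ValueError("value must be non-negative")
    term = math.exp(-lam)  # probability of exactly 0 events
    total = term
    for k in range(1, value + 1):
        term *= lam / k  # P(N = k) derived from P(N = k - 1)
        total += term
    return min(total, 1.0)  # clamp floating-point overshoot


def inverse_poisson_cdf(lam: float, p: float, max_n: int = 10_000) -> int:
    """Smallest n with P(N <= n) >= p; max_n is a guard added by this sketch."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    n = 0
    term = math.exp(-lam)
    total = term
    while total < p and n < max_n:
        n += 1
        term *= lam / n
        total += term
    return n


if __name__ == "__main__":
    print(poisson_cdf(2.0, 2))
    print(inverse_poisson_cdf(2.0, 0.5))
```

The running-term recurrence avoids computing factorials directly, which keeps the sketch accurate for small means. For very large means, math.exp(-lam) underflows and a library routine would be the better choice.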
Hive Changes
Fix a bug where the files would not be sorted when inserting into bucketed sorted tables with Glue.
Add support for validating the values returned from the partition cache against the actual values from the Metastore. This can be enabled by setting the configuration property hive.partition-cache-validation-percentage.
Add support for matching columns between table and partition schemas by name when the configuration property hive.parquet.use-column-names or the hive catalog session property parquet_use_column_names is set to true. By default they are mapped by index.
Add support for configuring the Glue endpoint URL. See Hive Connector.
Add support for accessing tables in Glue metastore that do not have a table type.
Add support for writing data with the S3 Intelligent-Tiering storage class. This can be enabled by setting the configuration property hive.s3.storage-class to INTELLIGENT_TIERING.
Add configuration property hive.metastore.glue.max-error-retries for the maximum number of retries for Glue client connections. The default value is 10. See Hive Connector.
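The difference between index-based and name-based column matching, as controlled by hive.parquet.use-column-names above, shows up when a partition's schema orders its columns differently from the table. The Python sketch below illustrates the two modes with hypothetical names (map_partition_columns, the sample schemas). It models the idea only and is not the connector's code.

```python
from typing import Dict, List, Optional


def map_partition_columns(
    table_columns: List[str],
    partition_columns: List[str],
    use_column_names: bool,
) -> Dict[str, Optional[int]]:
    """Map each table column to the partition column position it would read.

    A value of None means the partition has no matching column.
    """
    if use_column_names:
        # Name-based: look up each table column by its name in the partition schema.
        positions = {name: i for i, name in enumerate(partition_columns)}
        return {col: positions.get(col) for col in table_columns}
    # Index-based (the default per the note): the i-th table column reads the
    # i-th partition column, regardless of what that column is called.
    return {
        col: (i if i < len(partition_columns) else None)
        for i, col in enumerate(table_columns)
    }


if __name__ == "__main__":
    table = ["id", "name", "price"]
    partition = ["name", "id"]  # older partition with a different column order
    print(map_partition_columns(table, partition, use_column_names=False))
    print(map_partition_columns(table, partition, use_column_names=True))
```

In the index-based result, the table column id reads position 0, which the partition calls name. The name-based result resolves id to position 1 instead. In both modes price has no counterpart in the partition.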
Presto On Spark Changes
Optimize Driver commit memory footprint.
Add session property spark_memory_revoking_threshold and configuration property spark.memory-revoking-threshold. Spilling is triggered when total memory is beyond this threshold.
SPI Changes
Add support for custom query prerequisites to be checked and satisfied through the QueryPrerequisites interface. See #16073.
Contributors
Abhisek Gautam Saikia, Akhil Umesh Mehendale, Andrii Rosa, Arjun Gupta, Beinan, Bhavani Hari, Chunxu Tang, Jalpreet Singh Nanda (:imjalpreet), James Petty, James Sun, Ke Wang, Maria Basmanova, Mayank Garg, Nikhil Collooru, Rebecca Schlussel, Rohit Jain, Rongrong Zhong, Sergey Pershin, Sergii Druzkin, Shixuan Fan, Tal Galili, Tim Meehan, Vic Zhang, Zhenxiao Luo, guhanjie, linjunhua, v-jizhang