Fix lock contention in the Hive connector that slowed query execution.
Fix a bug where a query with an order by clause and an unpartitioned window function sometimes returned unordered results.
Improve Graphviz output for JOIN nodes and estimate stats.
Reduce the number of hdfsConfiguration copies in the worker. This feature is enabled by default and can be controlled by setting the hive.copy-on-first-write-configuration configuration property appropriately.
Add support for complex JsonPath expressions in json_size() using Jayway JsonPath.
Add support for nested SQL functions with lambdas when inline_sql_functions = false.
Add the GeoPlugin by default. Previously it was an optional plugin.
Add documentation for T-Digest Functions.
Add the ConnectorPlanRewriter utility to simplify writing connector-specific optimizer rules.
query.max-total-memory-per-node property optional with default value dependent on
Add support for function registration for the interface org.apache.hadoop.hive.ql.exec.UDF.
Add size-based split weights for the Hudi connector to improve query performance. This is enabled by default and is controlled by the configuration property hudi.size-based-split-weights-enabled and the session property hudi.size_based_split_weights_enabled. Two more configuration properties are added to adjust the weight: hudi.standard-split-weight-size to configure the split size corresponding to the standard split weight, and hudi.minimum-assigned-split-weight to configure the minimum split weight.
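As a rough illustration of how the two weight-tuning properties above could interact, the following Python sketch computes a weight for a split from its size. The formula (size divided by the standard size, clamped between the minimum weight and 1.0), the function name, and the sample values are assumptions made for illustration; they are not taken from the connector source.

```python
def split_weight(split_size_bytes: int,
                 standard_split_weight_size_bytes: int,
                 minimum_assigned_split_weight: float) -> float:
    """Hypothetical size-based weight: proportional to split size, clamped.

    Assumption: a split whose size equals the standard size gets weight 1.0,
    smaller splits get proportionally less, and no split gets less than the
    configured minimum weight.
    """
    if standard_split_weight_size_bytes <= 0:
        raise ValueError("standard split weight size must be positive")
    if not 0.0 < minimum_assigned_split_weight <= 1.0:
        raise ValueError("minimum weight must be in (0, 1]")

    proportion = split_size_bytes / standard_split_weight_size_bytes
    # Clamp into [minimum_assigned_split_weight, 1.0].
    return min(max(proportion, minimum_assigned_split_weight), 1.0)


MB = 1024 * 1024
# Illustrative settings only; these are not claimed to be the defaults.
STANDARD_SIZE = 128 * MB
MIN_WEIGHT = 0.05

half_size_weight = split_weight(64 * MB, STANDARD_SIZE, MIN_WEIGHT)
tiny_split_weight = split_weight(1 * MB, STANDARD_SIZE, MIN_WEIGHT)
oversized_weight = split_weight(256 * MB, STANDARD_SIZE, MIN_WEIGHT)
```

Under these assumptions, many small splits each count for less than a full split, so a scheduler that limits work by total weight can assign more of them to a worker at once.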
Update Iceberg to 0.14.0.
Fix an incorrect result issue caused by cross-join query pushdown by throwing errors instead of providing the wrong answer.
Add a new configuration property pinot.attempt-broker-queries to attempt to push down queries to brokers.
Add support for Pinot controller and broker authentication with a username and password.
Add pushdown support for
Add a round-robin scheduler.
Add a weighted random choice scheduler.
Add an optional configuration property for the query predictor URI.
Add documentation for the Presto router.
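The two scheduling strategies named in the router items above can be sketched generically in Python. This is a minimal illustration of round-robin and weighted random selection, not the router's implementation: the class names, the next_destination method, and the cluster names are all hypothetical.

```python
import itertools
import random
from typing import Dict, Optional, Sequence


class RoundRobinScheduler:
    """Hand out candidates in a fixed rotating order."""

    def __init__(self, candidates: Sequence[str]) -> None:
        if not candidates:
            raise ValueError("at least one candidate is required")
        self._cycle = itertools.cycle(list(candidates))

    def next_destination(self) -> str:
        return next(self._cycle)


class WeightedRandomChoiceScheduler:
    """Pick a candidate at random, with probability proportional to its weight."""

    def __init__(self, weights: Dict[str, int],
                 rng: Optional[random.Random] = None) -> None:
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be non-empty and positive")
        self._candidates = list(weights)
        self._weights = [weights[c] for c in self._candidates]
        # An injectable generator keeps the sketch reproducible in tests.
        self._rng = rng if rng is not None else random.Random()

    def next_destination(self) -> str:
        return self._rng.choices(self._candidates, weights=self._weights, k=1)[0]


# Hypothetical cluster names, for illustration only.
round_robin = RoundRobinScheduler(["cluster-a", "cluster-b", "cluster-c"])
rotation = [round_robin.next_destination() for _ in range(5)]

weighted = WeightedRandomChoiceScheduler({"cluster-a": 3, "cluster-b": 1},
                                         rng=random.Random(0))
draws = [weighted.next_destination() for _ in range(10000)]
share_a = draws.count("cluster-a") / len(draws)
```

Round-robin spreads queries evenly regardless of capacity, while the weighted variant sends a larger share of queries to candidates with larger weights; in this sketch cluster-a should receive about three quarters of the draws.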
Add a new configuration property spark.retry-on-out-of-memory-with-increased-memory-settings-enabled to enable picking up Presto and Spark memory settings and retrying the query within the same Spark session with the new settings applied. This can be overridden by
Add new configuration properties spark.retry-spark-configs to alter the Presto session properties and Spark settings during retry respectively. They can be overridden by
Add a new configuration property spark.resource-allocation-strategy-enabled and its session property spark_resource_allocation_strategy_enabled to allow an optimized resource allocation strategy. This allows the executor count and hash partition count to be estimated automatically at planning time. The estimates can be bounded by configuration properties including spark.min-hash-partition-count. There are also corresponding session properties for these estimates.
Amit Pandey, Anant Aneja, Andy Li, Arjun Gupta, Arunachalam Thirupathi, Beinan Wang, Chunxu Tang, Daniel Izcovich, Eduard Tudenhoefner, Feilong Liu, George Wang, Harsh Kevadia, James Sun, James Turner, Maria Basmanova, Michael Shang, Nikhil Collooru, Onder Kaya, Patrick Sullivan, Pranjal Shankhdhar, Rebecca Schlussel, Reetika Agrawal, Rongrong Zhong, Sacha Viscaino, Sergey Pershin, Sergii Druzkin, Sourav Pal, Swapnil Tailor, Timothy Meehan, Todd Gao, Vivek, Xiang Fu, Y Ethan Guo, abhiseksaikia, dnskr, ericyuliu, pratyakshsharma, shidayang, v-jizhang