Common Sub-Expression optimization

The problem One common pattern we see in some analytical workloads is the repeated use of the same, often times expensive expression. Look at the following query plan for example: The expression JSON_PARSE(features) is used 6 times, and casted to different ROW structures for further processing. Traditionally, Presto would just execute the expression 6 times,…

Using OptimizedTypedSet to Improve Map and Array Functions

Function evaluation is a big part of projection CPU cost. Recently we optimized a set of functions that use TypedSet, e.g. map_concat, array_union, array_intersect, and array_except. By introducing a new OptimizedTypeSet, the above functions saw improvements in several dimensions: Furthermore, OptimizedTypeSet resolves the long standing issue of throwing EXCEEDED_FUNCTION_MEMORY_LIMIT for large incoming blocks: “The input…