Avoid Data Silos in Presto in Meta: the journey from Raptor to RaptorX

    Raptor is a Presto connector (presto-raptor) that powers some critical interactive query workloads at Meta (previously Facebook). Though it is mentioned in the ICDE 2019 paper "Presto: SQL on Everything", it remains somewhat mysterious to many Presto users because no documentation is available for this feature. This article will shed some light…

    Common Sub-Expression optimization

    The problem: One common pattern we see in some analytical workloads is the repeated use of the same, often expensive, expression. Look at the following query plan, for example: the expression JSON_PARSE(features) is used 6 times and cast to different ROW structures for further processing. Traditionally, Presto would just execute the expression 6 times,…

    Improving the Presto planner for better push down and data federation

    Presto defines a connector API that allows Presto to query any data source that has a connector implementation. The existing connector API provides basic predicate pushdown functionality, allowing connectors to perform filtering at the underlying data source. However, the existing predicate pushdown functionality has certain limitations that restrict what connectors can do. The…