Presto serves a variety of use cases, but is most often used for large-scale analytical queries. At Rippling, we have been transitioning to Presto to power our data platform and our customer-facing scripting language, RQL (Rippling Query Language), running arbitrary customer queries that power core products. Presto enables diverse, federated querying at scale. In this talk, Andy will cover where Presto sits in Rippling’s ecosystem as a core query layer, our collaboration on and contributions toward closer integration with Apache Pinot, and lessons learned from using Presto to handle a wide variety of query patterns.
Today, most analytics products focus either on ad-hoc analytics, which requires query flexibility without guaranteed latency, or on low-latency analytics with limited query capability. In this talk, we will explore how to get the best of both worlds using Apache Pinot and Presto: 1. How people do analytics today and trade off latency against flexibility: a comparison of analytics on raw data versus pre-joined/pre-cubed datasets. 2. An introduction to Apache Pinot as a column store for fast real-time analytics, and to the Presto Pinot connector, which together cover the entire landscape. 3. A deep dive into the Presto Pinot connector to see how it pushes down predicates and aggregations. 4. Benchmark results for the Presto Pinot connector.
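
To make the idea of predicate and aggregation pushdown concrete, here is a toy sketch in Python. It is not the Presto Pinot connector's code: the in-memory `ROWS` table, `store_scan`, and both `engine_*` functions are hypothetical names invented for illustration. It only shows why pushing the filter and the aggregation down to the store reduces the number of rows the query engine has to pull back.

```python
from collections import defaultdict

# Hypothetical rows held by the underlying store (stand-in for a real table).
ROWS = [
    {"country": "US", "status": "ok", "latency_ms": 120},
    {"country": "US", "status": "error", "latency_ms": 300},
    {"country": "DE", "status": "ok", "latency_ms": 80},
    {"country": "DE", "status": "ok", "latency_ms": 100},
    {"country": "US", "status": "ok", "latency_ms": 140},
]


def store_scan(predicate=None, group_key=None, sum_column=None):
    """Hypothetical storage-side scan: applies whatever was pushed down."""
    rows = [r for r in ROWS if predicate is None or predicate(r)]
    if group_key is None:
        return rows  # no aggregation pushed down: return matching rows as-is
    totals = defaultdict(int)
    for r in rows:
        totals[r[group_key]] += r[sum_column]
    return [{group_key: k, "sum": v} for k, v in totals.items()]


def engine_without_pushdown():
    """Engine pulls every row, then filters and aggregates itself."""
    transferred = store_scan()
    totals = defaultdict(int)
    for r in transferred:
        if r["status"] == "ok":
            totals[r["country"]] += r["latency_ms"]
    return dict(totals), len(transferred)


def engine_with_pushdown():
    """Engine hands the filter and the SUM ... GROUP BY to the store."""
    transferred = store_scan(
        predicate=lambda r: r["status"] == "ok",
        group_key="country",
        sum_column="latency_ms",
    )
    result = {r["country"]: r["sum"] for r in transferred}
    return result, len(transferred)


if __name__ == "__main__":
    print(engine_without_pushdown())
    print(engine_with_pushdown())
```

Both functions compute the same per-country totals, but the second returns a count of transferred rows equal to the number of groups rather than the size of the table. In a real deployment the store would be a remote system, so that difference becomes network and serialization cost.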