PrestoDB in HPE Ezmeral Unified Analytics – Milind Bhandarkar, HPE
HPE Ezmeral Unified Analytics is an end-to-end data and AI/ML platform that packages several popular open-source frameworks for data engineering, data analytics, data science, and ML engineering into a well-integrated whole. These open-source frameworks include Apache Spark, Apache Airflow, Apache Superset, PrestoDB, MLflow, Kubeflow, and Feast. The platform is built atop Kubernetes and provides built-in security. In this talk we will focus on the role of PrestoDB in Unified Analytics as a fast SQL query engine and as a secure data access layer. We will discuss some of our value-additions to PrestoDB, such as a distributed, memory-centric columnar caching layer that provides both explicit and transparent caching for dataset fragments, often leading to a 3x to 4x improvement in query performance. We will conclude by proposing to make caching pluggable in PrestoDB and discussing future directions.
Running PrestoDB on Kubernetes with Ahana Cloud and AWS EKS
PrestoDB is built to be cloud agnostic and container-friendly, but getting it to run on Kubernetes in the cloud can be challenging. In this talk, Gary Stafford (AWS) and Dipti Borkar (Ahana) will discuss, among other topics, why to use the in-VPC deployment model with AWS, and will demo deploying PrestoDB on AWS EKS using the Ahana Cloud managed service within the user’s AWS account.
Presto for Real Time Analytics at Uber – Ankit Sultana, Uber
The Real Time Analytics Platform at Uber serves 100M+ queries daily and is used for several critical features, from end-user app features to radius selection for Uber Eats. All these queries are proxied via a custom internal fork of Presto (named Neutrino) that is optimized for low latency and high throughput (50 ms latency at thousands of RPS). In this talk we will share what we have learned over the last 6 months and how we run Presto reliably at this scale for real-time analytics.
Building a Modern Data Platform with Presto – Denis Krivenko, Platform24
The Hadoop era is gone. Cloud computing is today’s reality. But… What if you cannot use public clouds? What if your cloud does not provide data platform capabilities? What if you want your solution to be cloud agnostic? In this case you create your own cloud-native data platform on Kubernetes. In this session Denis will talk about the reasons for building an analytics data platform at Platform24, the architecture principles of a cloud-native data platform, the data stack they use, and why Presto plays a key role in it.
Realtime Analytics with Presto and Apache Pinot – Xiang Fu
Most analytics products focus either on ad hoc analytics, which requires query flexibility without guaranteed latency, or on low-latency analytics with limited query capability. In this talk, we will explore how to get the best of both worlds using Apache Pinot and Presto: 1. How people do analytics today and trade off latency against flexibility: a comparison of analytics on raw data vs. pre-joined/pre-cubed datasets. 2. An introduction to Apache Pinot as a column store for fast real-time data analytics, and to the Presto Pinot connector to cover the entire landscape. 3. A deep dive into the Presto Pinot connector to see how it performs predicate and aggregation pushdown. 4. Benchmark results for the Presto Pinot connector.