Presto Blog - Page 2 of 6

Setting Up Presto with Apache Superset: Hands-On Guide
By Saurabh Mahawar August 7, 2025
PrestoDB, an open-source distributed SQL query engine, allows you to query data from multiple disparate sources. When combined with Apache Superset, an open-source data visualization and exploration platform, it forms a powerful and flexible analytics solution. This guide provides a step-by-step approach to deploying these components within a Dockerized environment, simplifying setup and management. Pre-Requisites:…
Build Your Open Data Lakehouse: A Step-by-Step ETL Guide with MySQL, OLake, and PrestoDB
By Saurabh Mahawar July 29, 2025 (updated August 19, 2025)
This tutorial provides a comprehensive guide to building an Open Data Lakehouse from scratch, a modern and flexible data architecture solution. Open Data Lakehouses offer a powerful and scalable method for storing, managing, and querying both structured and semi-structured data, leveraging a suite of robust open-source tools for enhanced control and flexibility. Pre-Requisites: Before commencing…
Leading by Contribution: IBM’s Ongoing Investment in Open-Source Presto
By Anant Aneja, Yabin Ma, Ali LeClerc & Ethan Zhang July 15, 2025 (updated July 16, 2025)
Note: This is a cross-post from https://community.ibm.com/community/user/blogs/ali-leclerc/2025/07/15/ibms-ongoing-investment-in-presto At IBM, we believe open source is the engine of innovation. Presto, as a fast and flexible SQL engine for interactive analytics, continues to evolve rapidly thanks to community contributions. Over the past year, IBM engineers have focused on driving Presto forward across security, performance, native execution, and…
Presto Installation: A Step by Step Guide to Run SQL Queries
By Saurabh Mahawar July 15, 2025 (updated February 12, 2026)
Install and run PrestoDB (0.296+) on your local machine in under 10 minutes. Follow this hands-on tutorial to deploy a high-performance SQL query engine for your Data Lakehouse architecture. Prerequisites Before getting started, ensure that the following are installed: Download and Extract Run these commands in your terminal to download the latest server package. Configure…
How Twilio Scales Presto with Odin: A New Query Gateway
By Ali LeClerc May 20, 2025
One of my favorite parts of working with the Presto community is seeing how different companies push the project forward in creative ways. Recently, Aakash Pradeep from Twilio shared a great example of this with their development of Odin, a new modular query gateway they built to help scale Presto usage across their organization. I…
Fueling Presto’s Momentum and IBM’s Growing Role in the Open-Source SQL Engine
By Ethan Zhang May 13, 2025 (updated June 11, 2025)
I’ve now been a part of IBM for 2 years and I’m pretty encouraged by the work this team has put into open-source Presto. So, I wanted to take some time to share in a blog what we’ve been up to for the last 2 years and the growth we’ve seen collectively in the community…
Safeguarding Presto C++ Memory Usage with LinuxMemoryChecker
By Minhan Cao & Christian Zentgraf May 6, 2025
Problem: Running the Presto C++ worker stably in a production environment relies on proper configuration that maximizes stability without sacrificing performance. Presto C++ designed a LinuxMemoryChecker to achieve this goal. Velox, the evaluation engine used in Presto C++, implements a MemoryManager that provides several advanced features…
Improving Schema Management in Presto: Passing Catalog Names to the Metastore
By Anurag Dwivedi April 28, 2025 (updated May 6, 2025)
Managing schemas in Presto just got a lot smarter. Thanks to a new enhancement, Presto can now pass catalog names directly to the metastore, enabling better logical organization, filtering, and schema isolation across multiple catalogs. This improvement significantly enhances the experience for users working with Hive, Hudi, Delta, and Iceberg catalogs. 🔍 The Problem Before …
How Jio Platforms Leverages Presto for Large-Scale Analytics
By Ali LeClerc March 6, 2025
In our latest community spotlight, we sat down with Sonal Holankar, Associate Data Engineer at Jio Platforms, to talk about how they use Presto to power analytics at scale. Jio Platforms, a subsidiary of Reliance Industries, is one of India’s leading digital service providers, with a suite of applications and services, including JioMart, JioMoney, and JioGames. Managing…
Optimizing Our Data Lakehouse with Presto: A Strategic Transformation Project
By Yahya Elemam February 25, 2025 (updated March 6, 2025)
This is a guest post from Yahya Elhag Elemam, Director of Big Data and Analytics at Zain Sudan. Our data journey, from Cloudera to Data Lakehouse: Zain Sudan is a mobile phone operator in Sudan (and was the first mobile phone operator in Sudan when it started in 1996!), and part of the Zain…
Presto Native Engine in 2025
By Aditi Pandit February 10, 2025
Introduction: The Presto Native Worker is the latest innovation in the Presto SQL engine. Its optimizations reduce the CPU and memory footprint of Presto production clusters, leading to great price performance. We are already seeing the benefits at Meta, Uber, and IBM watsonx.data. Hardware acceleration startups such as Neuroblade also show great benchmarking results with Presto. To get a…
PrestoCon 2024 Recap: Celebrating Community and Showcasing Innovation
By Ali LeClerc December 12, 2024
PrestoCon 2024 wrapped up last week, and it was truly inspiring to see so many members of our vibrant community come together in person. There’s something special about connecting face-to-face, exchanging ideas, and sharing a collective passion for Presto. Events like this strengthen the bonds within our community and remind us why open source thrives…
Updating the Presto Helm Chart to Support Presto C++
By Michael Ceruzzi December 10, 2024 (updated December 11, 2024)
While not mandatory for deploying Presto, Helm offers a streamlined and customizable method of deploying Presto in Kubernetes, simplifying what could otherwise be a complicated process. With just a few properties specified in a values.yaml file, the user is able to configure and deploy a fully fledged version of Presto at the sizing of their…
CTE Materialization Framework in Presto
By Jay Narale October 1, 2024
In this blog, we’ll dive deep into the Common Table Expression (CTE) Materialization Framework in Presto, a framework open-sourced by Uber and Meta. We will also showcase some actual production gains observed at Uber. The goal of CTE Materialization is to minimize redundant computation within a query, conserving system resources…
Query Optimization with Historical-Based Optimization Framework in Presto
By Jay Narale September 26, 2024 (updated October 29, 2024)
In this blog I’ll discuss historical-based optimization (HBO), a framework open-sourced by Meta (see their presentation from PrestoCon) and used in Presto. The HBO framework enables advanced query optimization techniques by leveraging historical execution statistics. This approach offers a more efficient query execution strategy through its unique cost estimation, plan transformations, and the incorporation…
An update on Presto C++ (and more about Presto sidecar)
By Tim Meehan September 11, 2024
Presto is transforming its evaluation engine to use Velox, a highly performant modular execution engine. As anyone who’s worked on databases knows, they are very large and complicated, and making systems behave in identical ways is nearly impossible. To address the subtle differences between how Velox executes queries and Presto Java used to, and to…
Capturing Worker Runtime Metrics with Prometheus Reporter in Presto C++
By Karteek Murthy September 3, 2024 (updated September 5, 2024)
In this blog we will look at the Presto C++ worker’s ability to report worker-level metrics through the Presto CPP BaseStatsReporter interface, and at how this interface is implemented and integrated with Prometheus, a time series database. Background on Presto Architecture: Presto, an open-source distributed SQL query engine, operates with a coordinator and worker nodes. In this…
Presto Console – SQL Client on Web UI
By Yi-hong Wang August 29, 2024
The Presto command line interface (CLI) is one of the de facto tools for interacting with the Presto SQL engine. To use it, you need to download the jar file and a Java runtime. You can find detailed information here. You’re probably wondering: “Is there an easier way to run SQL queries?”. The answer is “YES”!…