Elevating Presto Query Optimization: Leveraging State-of-the-Art Techniques for Improved Performance 

Presto, a prominent open-source distributed SQL query engine, has been at the leading edge of high-performance data analytics for over a decade. In analytical data processing, the effectiveness of query optimization is paramount. Over the last half-century, optimizing SQL queries has been a hotbed of research and development, resulting in groundbreaking innovations. This blog post…

Our Presto Credo for the Truly Open Source SQL Query Engine

We believe that data analytics should be democratized, which is why we innovate Presto with state-of-the-art database technology. Trusted governance is important to us, which is why we model our project governance and bylaws after the Linux Foundation. TO OUR FELLOW DATA ENGINEERS, SOFTWARE DEVELOPERS, AND DATA PLATFORM ENTHUSIASTS: As the use of data analytics and…

Is PrestoDB the most popular Open Source Data Analytics project?

The Presto Foundation is thrilled to announce that today Presto has been awarded “2022 Editors Choice for Top 3 Data and AI Open Source Projects to Watch” from BigDATAwire. Past winners are a true who’s who in the data world, including Apache Spark (2020), Apache Kafka (2018), MongoDB (2019), Apache Cassandra, Elasticsearch and Redis (2021)….

Avoid Data Silos in Presto in Meta: the journey from Raptor to RaptorX

Raptor is a Presto connector (presto-raptor) that powers some critical interactive query workloads at Meta (previously Facebook). Though it is mentioned in the ICDE 2019 paper Presto: SQL on Everything, it remains somewhat mysterious to many Presto users because there is no available documentation for this feature. This article will shed some light…

Presto Foundation and PrestoDB: Our Commitment to the Presto Open Source Community

We recently wrapped up an amazing PrestoCon Day attended by over 600 people from across the globe. The technical discussions and the panel were a clear indication of the growing community. We showcased a number of features contributed by various companies that continue to advance the mission of Presto open source, reiterating our commitment to…

RaptorX: Building a 10X Faster Presto

RaptorX is the internal name of a project that aims to reduce query latency significantly below what vanilla Presto can achieve. This blog post introduces the hierarchical cache work, which is the key building block for RaptorX. With the support of the cache, we are able to boost query performance by 10X. This new architecture can beat…

Everything You Always Wanted To Do in Table Scan

Table scan, on the face of it, sounds trivial and boring. What’s there in just reading a long bunch of records from first to last? Aren’t indexing and other kinds of physical design more interesting? As data has gotten bigger, the columnar table scan has only gotten more prominent. The columnar scan is a fairly…

Introducing the Presto blog 

Presto is a key piece of data infrastructure at many companies. The community has many ongoing projects for taking it to new levels of performance and functionality, along with unique experience and insight into the challenges of scale. We are opening this blog as an informal channel for discussing our work as well as technology trends and…