Distributed SQL Query Engine for Big Data

What is Presto?

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

The community-owned and community-driven Presto project is supported by the Presto Foundation, an independent nonprofit organization with open and neutral governance, hosted under the Linux Foundation®.

Learn more about Presto's move to the Linux Foundation, and learn how to become a member of the Presto Foundation today.

What can it do?

Presto allows querying data where it lives, including Hive, Cassandra, relational databases or even proprietary data stores. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.
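As a rough illustration of what "a single query across multiple sources" can look like, the sketch below assembles a query that joins a Hive table with a relational table. Presto addresses tables as catalog.schema.table; the catalog names (hive, mysql), the schemas, the tables, and the columns used here are all hypothetical and depend entirely on how a given deployment is configured. The helper functions are illustrative and not part of any Presto client library.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified table name in the form catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query() -> str:
    """Build one SQL statement that joins tables from two different catalogs."""
    # Hypothetical names, for illustration only.
    clicks = qualified("hive", "web", "clicks")   # e.g. click logs stored in Hive
    users = qualified("mysql", "crm", "users")    # e.g. user records in a relational database
    return (
        "SELECT u.country, count(*) AS click_count\n"
        f"FROM {clicks} AS c\n"
        f"JOIN {users} AS u ON c.user_id = u.id\n"
        "GROUP BY u.country\n"
        "ORDER BY click_count DESC"
    )


if __name__ == "__main__":
    # Print the statement; it could then be submitted through any Presto client.
    print(build_federated_query())
```

The only thing that marks this as a cross-source query is the catalog prefix on each table name; the rest is ordinary SQL.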

Presto is targeted at analysts who expect response times ranging from sub-second to minutes. Presto breaks the false choice between having fast analytics using an expensive commercial solution or using a slow "free" solution that requires excessive hardware.

Who uses it?

Facebook uses Presto for interactive queries against several internal data stores, including its 300PB data warehouse. Over 1,000 Facebook employees use Presto daily to run more than 30,000 queries that in total scan over a petabyte per day.

Leading internet companies including Airbnb and Dropbox are using Presto.

Presto is amazing. Lead engineer Andy Kramolisch got it into production in just a few days. It's an order of magnitude faster than Hive in most our use cases. It reads directly from HDFS, so unlike Redshift, there isn't a lot of ETL before you can use it. It just works.

Christopher Gutierrez, Manager of Online Analytics, Airbnb

We're really excited about Presto. We're planning on using it to quickly gain insight about the different ways our users use Dropbox, as well as diagnosing problems they encounter along the way. In our tests so far it's been rock solid and extremely fast when applied to some of our most important ad hoc use cases.

Fred Wulff, Software Engineer, Dropbox

What are the latest innovations?

Project Aria – PrestoDB can now push down entire expressions to the data source for some file formats like ORC. Blog Design

Project Presto Unlimited – Introduced exchange materialization to create temporary in-memory bucketed tables to use significantly less memory. PR Blog

User Defined Functions – Support for dynamic SQL functions is now available in experimental mode. Docs

Apache Pinot and Druid Connectors – Docs

RaptorX – Disaggregates the storage from compute for low latency to provide a unified, cheap, fast, and scalable solution to OLAP and interactive use cases. Issue

Presto-on-Spark – Runs Presto code as a library within the Spark executor. Design Docs

Disaggregated Coordinator (a.k.a. Fireball) – Scale out the coordinator horizontally and revamp the RPC stack. Beta in Q4 2020. Issues

What is the Presto Foundation?

The Presto Foundation is the nonprofit organization established to support the developer and community processes for the Presto open source project. Hosted under the auspices of the Linux Foundation, the Presto Foundation is governed openly and transparently.

If you share our vision for Presto and are ready to provide financial support for the community development process, please join us!

Release Git Stats

Current Version: 0.222
Date: July 02, 2019
Commits: 250
Authors: 25
Committers: 18