Presto FAQ

Running Presto

How much memory should I give a worker node?

The right amount depends on the size of your data sets and the nature of your queries. As a reference point, Facebook typically runs Presto with a 16 GB heap, which is also the amount specified by the example JVM config file in the deployment instructions.
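As a rough illustration, a 16 GB heap corresponds to the following standard JVM option in the JVM config file. This is only a sketch: the file location (commonly etc/jvm.config) is an assumption here, and the other options a real file contains are omitted.

```text
-Xmx16G
```

Treat the value as a starting point and adjust it to your workload and the memory available on each worker.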

Compatibility and Support

What versions of Hadoop does Presto support?

The Hive Connector supports all popular versions of Hadoop.

Does Presto connect to Cassandra?

Yes, via the Cassandra Connector.

Does Presto connect to MySQL or PostgreSQL?

Yes, via the MySQL Connector or PostgreSQL Connector. Both connectors build on a base JDBC connector that is easy to extend to other databases. Separately, Presto includes a JDBC Driver that allows Java applications to connect to Presto.

Common Errors and Troubleshooting

Why can I run SHOW TABLES but I can't SELECT from any of them?

If metadata commands like SHOW TABLES work but you can't read from the tables themselves, Presto is able to access your Hive metastore but not your HDFS cluster. You might see one of these error messages:

There is probably a mismatch between your Hadoop version and the Hive Connector version you have selected. Make sure that you set the catalog property appropriately for your version of Hadoop.
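For illustration, a Hive catalog file might look like the sketch below. The file name, the property name connector.name, the value hive-hadoop2, and the metastore address are all assumptions that the text above does not state. Check the Hive Connector documentation for the values that apply to your Presto release and Hadoop distribution.

```properties
# etc/catalog/hive.properties (file name is an assumption)
# Illustrative value only: pick the connector variant that matches your Hadoop version.
connector.name=hive-hadoop2
# Hypothetical metastore address; replace with your own.
hive.metastore.uri=thrift://example.net:9083
```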

Why do I see a "Cannot connect to discovery server for refresh" error on startup?

This is usually not a problem. The error appears because the discovery client starts before the embedded discovery server is ready. Shortly after the error, a "succeeded for refresh" message in the logs shows that everything is working. We will fix the log message eventually, but it is purely a cosmetic issue.

Queries are running slower than expected. What are the factors that influence Presto performance?

The first things to check are the basic machine stats for your workers and coordinators. Measure the load, network, and disk utilization over time to understand where Presto is running out of resources.
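One way to collect two of these numbers on a Linux worker is sketched below, using only the Python standard library. It is a minimal sketch and not part of Presto. The function names are hypothetical, it assumes a Linux host that exposes /proc/net/dev, and it leaves out disk utilization, which is usually easier to read from a tool such as iostat.

```python
import os
import time


def parse_net_dev(text):
    """Parse /proc/net/dev content into {interface: (rx_bytes, tx_bytes)}."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is bytes received, field 8 is bytes transmitted.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def network_rates(before, after, seconds):
    """Return {interface: (rx_bytes_per_s, tx_bytes_per_s)} between two samples."""
    rates = {}
    for name, (rx1, tx1) in after.items():
        rx0, tx0 = before.get(name, (rx1, tx1))
        rates[name] = ((rx1 - rx0) / seconds, (tx1 - tx0) / seconds)
    return rates


def load_per_core():
    """Return the 1-minute load average divided by the number of CPUs."""
    one_minute, _, _ = os.getloadavg()
    return one_minute / (os.cpu_count() or 1)


def sample_once(interval=5.0):
    """Take two network samples `interval` seconds apart and report rates and load."""
    with open("/proc/net/dev") as f:
        before = parse_net_dev(f.read())
    time.sleep(interval)
    with open("/proc/net/dev") as f:
        after = parse_net_dev(f.read())
    return {"load_per_core": load_per_core(),
            "network": network_rates(before, after, interval)}
```

Calling sample_once() periodically on each worker and on the coordinator gives a simple time series. A load per core that stays well above 1, or network rates close to the interface limit, points to the resource that is running out.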

Running Presto Queries

Is there a user-friendly interface for running Presto queries?

The resources page lists several external projects that provide a user-friendly GUI for running Presto queries.

