Securonix uses Ahana Cloud for Presto to enable extremely fast SQL queries on AWS S3 for ‘Threat Hunting’. “Before Presto we were using a Hadoop cluster, and the challenge was on scale…not only was it expensive but the scaling factors were not linear. The Presto engine was designed for scale, and it’s feature-built just for a query engine. Ahana Cloud made it easy for us to use Presto in the cloud.” – Derrick Harcey, Chief Architect at Securonix
HermesDB is the next-generation OLAP engine at Tencent, with an architecture that separates storage and compute. HermesDB stores data with efficient index files and uses a customized Presto as its core query engine. Through the Presto connector, HermesDB not only supports full ANSI SQL syntax but also utilizes Apache Lucene as its underlying compute core.
Twilio uses Presto on AWS. Approximately 80% of Twilio’s data comes from product teams that use Kafka or MySQL databases. In addition to this, the company receives data from external sources such as Salesforce, Zendesk, and Marketo, as well as internal CSV files generated by accounting and finance teams. This data is loaded into the S3 data lake using config-driven Python and Spark-based loaders. With Presto, they can decouple the storage and compute layers and scale without affecting performance. In addition to data exploration and ad-hoc analysis by data analysts, Presto has also been used as a data source for real-time dashboards and machine learning models.
At Twitter, to overcome performance issues that arise when developing and maintaining SQL systems with increasing volumes of data, we designed a large-scale SQL federation system across on-premises and cloud Hadoop and Google Cloud Storage (GCS) clusters, leveraging Presto as the core of the SQL engine clusters and pursuing high scalability and availability to meet the growing need for data analytics at petabyte (PB) scale. Altogether, the Presto clusters have run more than 10 million Presto SQL queries.
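As a sketch of what federation like this looks like in practice (the catalog, schema, and table names below are hypothetical illustrations, not Twitter's actual configuration), Presto addresses every table as `catalog.schema.table`, so a single query can join data from an on-premises Hive catalog with data in a GCS-backed catalog:

```sql
-- Hypothetical catalogs: hive_onprem (on-premises HDFS) and hive_gcs (GCS-backed).
-- Presto resolves each fully-qualified name to its own connector, so one
-- statement can read from both storage systems without copying data first.
SELECT e.user_id,
       count(*) AS event_count
FROM hive_onprem.logs.click_events AS e
JOIN hive_gcs.warehouse.users AS u
  ON e.user_id = u.user_id
WHERE e.event_date = DATE '2021-06-01'
GROUP BY e.user_id
ORDER BY event_count DESC
LIMIT 100;
```

The coordinator plans the query once and pushes scans down to each catalog's connector, which is what lets a federation layer like this grow to new storage systems by adding catalogs rather than migrating data.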
Uber uses Presto for its SQL data lakehouse, where over 7,000 weekly active users run 500K queries per day, scanning 59 PB of HDFS data per day.