Presto is widely used at ByteDance, for example in data warehousing, BI tools, and ads. ByteDance's OLAP platform migrated its ad-hoc workloads from Apache Hive and Apache Spark to Presto, which quickly became popular and expanded fast. Today, the Presto clusters at ByteDance have tens of thousands of compute cores and serve about 1 million queries per day, covering more than 90 percent of interactive queries. The migration dramatically reduced query latency and saved a lot of compute resources.
Twilio uses Presto on AWS. Approximately 80% of Twilio’s data comes from product teams that use Kafka or MySQL databases. In addition, the company receives data from external sources such as Salesforce, Zendesk, and Marketo, as well as internal CSV files generated by the accounting and finance teams. This data is loaded into an S3 data lake using config-driven Python and Spark-based loaders. With Presto, Twilio can decouple the storage and compute layers and scale without affecting performance. Besides data exploration and ad-hoc analysis by data analysts, Presto also serves as a data source for real-time dashboards and machine learning models.