Videos Archive - PrestoDB

PrestoDB TSC Chair Keynote – Tim Meehan, Ahana/IBM

Hear more about the future of Presto, including the roadmap and other open source initiatives, in Tim Meehan’s technical keynote.

Keynote: Presto – Swiss Army Knife for the Lakehouse – Tim Meehan

Presto has long been known for its ability to bridge together multiple sources of data using a single consistent language. Learn about the future direction and initiatives that will make Presto even more convenient, reliable and efficient for the Lakehouse.

The Past, Present, and Future of Presto – Philip Bell, Meta

PrestoDB recently underwent major architectural updates as the Presto Foundation grows its membership and looks to greatly increase the number of new commits and forks. Reaching that goal required refactoring Presto and improving its already impressive speed, efficiency, reliability, and extensibility. Establishing PrestoDB as a premier open source project required a major commitment of time and resources from Meta, both to ensure the community can benefit from the project for years to come and to position PrestoDB to evolve beyond what Meta alone could create. The members of the Presto Foundation need more of you to get involved in this major evolution of Presto’s history and core components, and to bring your own inventive ideas to the mix.

Query Execution Optimization for Broadcast Join using Replicated-Reads Strategy – George Wang, Ahana

Today, Presto supports broadcast joins by having one worker fetch data from the small data source, build a hash table, and send the entire data set over the network to all other workers, where it is probed by rows from the large data source. This can be optimized with a new query execution strategy in which every worker pulls the source data for small tables directly, known as replicated reads from dimension tables. The feature has a useful caching property: because all N worker nodes now participate in scanning data from the remote source, the table scan for dimension tables can be cached on each worker node. Resource utilization also improves, because the Presto scheduler can reduce the number of plan fragments to execute: the same workers run tasks in parallel within a single stage, which reduces data shuffles.
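
The difference between the two strategies can be illustrated with a toy model. The sketch below is not Presto code: the function names (`broadcast_join`, `replicated_reads_join`), the row layout, and the counters are all hypothetical, and each "worker" is simply one partition of the fact table. It only shows that both strategies produce the same join result while trading worker-to-worker shipping of the dimension table for one source scan per worker.

```python
from collections import defaultdict


def build_hash_table(rows, key):
    """Build side: index the small (dimension) table by join key."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(row)
    return table


def probe(hash_table, fact_rows, key):
    """Probe side: look up each fact row in the hash table."""
    joined = []
    for fact in fact_rows:
        for dim in hash_table.get(fact[key], []):
            joined.append({**dim, **fact})
    return joined


def broadcast_join(dim_source, fact_partitions, key):
    """One worker scans the dimension table and ships it to the others."""
    dim_rows = list(dim_source())  # single scan of the remote source
    source_scans = 1
    # Every other worker receives a full copy over the network.
    rows_shipped = len(dim_rows) * (len(fact_partitions) - 1)
    results = []
    for partition in fact_partitions:
        results.extend(probe(build_hash_table(dim_rows, key), partition, key))
    return results, source_scans, rows_shipped


def replicated_reads_join(dim_source, fact_partitions, key):
    """Every worker scans the dimension table itself (assumed model)."""
    source_scans = 0
    results = []
    for partition in fact_partitions:
        dim_rows = list(dim_source())  # each worker reads the source directly
        source_scans += 1
        results.extend(probe(build_hash_table(dim_rows, key), partition, key))
    return results, source_scans, 0  # nothing shipped between workers


# Hypothetical sample data: a dimension table and three fact partitions.
regions = [{"region_id": 1, "region": "EU"}, {"region_id": 2, "region": "US"}]
orders = [
    [{"order_id": 10, "region_id": 1}],
    [{"order_id": 11, "region_id": 2}, {"order_id": 12, "region_id": 1}],
    [{"order_id": 13, "region_id": 3}],
]

b_rows, b_scans, b_shipped = broadcast_join(lambda: regions, orders, "region_id")
r_rows, r_scans, r_shipped = replicated_reads_join(lambda: regions, orders, "region_id")
print(b_rows == r_rows, b_scans, b_shipped, r_scans, r_shipped)
```

In this model the replicated-reads variant performs one source scan per worker instead of one in total, which is why the per-worker caching of dimension table scans mentioned above matters.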

Presto, Today & Beyond – Dipti Borkar, David Simmen, Girish Baliga & Biswapesh Chattopadhyay

In this PrestoCon talk, a team of community experts shares Presto’s evolution: today and beyond.