NeuroBlade Joins Presto Foundation: Pioneering Hardware Acceleration for Faster Data Analytics

NeuroBlade, known for its SQL Processing Unit (SPU) designed to accelerate data analytics, has joined the Presto Foundation. This partnership underscores NeuroBlade’s commitment to pushing the boundaries of open-source support for hardware acceleration, thereby enhancing Presto’s robust capabilities. By aligning with the Presto community’s drive to transcend traditional hardware limitations through sophisticated software optimizations, NeuroBlade…

Denodo Joins the Presto Foundation

We are pleased to announce that Denodo Technologies has joined the Presto Foundation. The Denodo Platform is a popular data management platform based on the concept of data virtualization and logical data models, which includes capabilities for data integration, privacy, governance, and data cataloging. Denodo is often used to implement logical and distributed data architectures…

IBM joins the Presto Foundation through acquisition of Ahana

Today we’re thrilled to share that IBM has acquired Ahana, the venture-backed SaaS for Presto startup company, and we want to write more about our belief in Open Source and why IBM and Ahana are joining forces for the benefit of Presto. We believe that this is an exciting time for the Presto project. We’re…

Common Sub-Expression optimization

The problem: One common pattern we see in some analytical workloads is the repeated use of the same, often expensive, expression. Look at the following query plan for example: The expression JSON_PARSE(features) is used 6 times, and cast to different ROW structures for further processing. Traditionally, Presto would just execute the expression 6 times,…
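As a rough illustration of the idea in this excerpt (not Presto's actual implementation), the sketch below contrasts re-evaluating a shared sub-expression for every projection with evaluating it once and reusing the result. The `json_parse` helper, the `calls` counter, and the three projections are hypothetical stand-ins invented for this example.

```python
import json

# Counts evaluations of the "expensive" expression (illustration only).
calls = {"json_parse": 0}


def json_parse(text):
    """Hypothetical stand-in for an expensive expression such as JSON_PARSE."""
    calls["json_parse"] += 1
    return json.loads(text)


# Hypothetical projections that all consume the same parsed value.
PROJECTIONS = [
    lambda parsed: parsed["a"],
    lambda parsed: parsed["b"],
    lambda parsed: parsed["a"] + parsed["b"],
]


def project_naive(features):
    # The shared sub-expression is evaluated once per projection.
    return [projection(json_parse(features)) for projection in PROJECTIONS]


def project_shared(features):
    # The shared sub-expression is evaluated once and its result reused.
    parsed = json_parse(features)
    return [projection(parsed) for projection in PROJECTIONS]
```

Both functions return the same values for a row; only the number of `json_parse` evaluations differs (three versus one in this sketch).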

Using OptimizedTypedSet to Improve Map and Array Functions

Function evaluation is a big part of projection CPU cost. Recently we optimized a set of functions that use TypedSet, e.g. map_concat, array_union, array_intersect, and array_except. By introducing a new OptimizedTypedSet, the above functions saw improvements in several dimensions: Furthermore, OptimizedTypedSet resolves the long-standing issue of throwing EXCEEDED_FUNCTION_MEMORY_LIMIT for large incoming blocks: “The input…

PrestoCon and Growing Industry Consortium – Intel and Upsolver Join Presto Foundation

Presto Foundation joined the Linux Foundation over a year ago, and has been focused on growing the Presto open source project and community. We encourage industry involvement with an open charter, clear guiding principles, and community-oriented goals. We recently hosted PrestoCon 2020, our first annual community conference, which was widely attended and well represented by…

Even Faster Unnest

Unnest is a common operation in Facebook’s daily Presto workload. It converts an ARRAY, MAP, or ROW into a flat relation. Its original implementation used deep copy all the time and was very inefficient. In Unnest Operator Performance Enhancement with Dictionary Blocks, the author improved the Unnest operator by up to 10x in CPU and…

Getting Started with PrestoDB and Aria Scan Optimizations

This article was originally published by Adam on June 15th, 2020 over at his blog at datacatessen.com. PrestoDB recently released a set of experimental features under their Aria project in order to increase table scan performance of data stored in ORC files via the Hive Connector. In this post, we’ll check out these new features…

PrestoDB and Apache Hudi

Apache Hudi is a fast-growing data lake storage system that helps organizations build and manage petabyte-scale data lakes. Hudi brings stream-style processing to batch-like big data by introducing primitives such as upserts, deletes and incremental queries. These features help surface faster, fresher data on a unified serving layer. Hudi tables can be stored…

Spatial Joins 1: Local Spatial Joins

A common type of spatial query involves relating one table of geometric objects (e.g., a table population_centers with columns population, latitude, longitude) with another such table (e.g., a table counties with columns county_name, boundary_wkt), such as calculating for each county the population sum of all population centers contained within it. These kinds of calculations are…
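The calculation described in this excerpt can be sketched as a naive nested-loop join. This is only an illustration under simplifying assumptions, not how Presto executes spatial joins: each county boundary is assumed to be a plain list of (longitude, latitude) vertices rather than WKT, and the helper names `point_in_polygon` and `population_by_county` are invented for this example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting containment test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def population_by_county(population_centers, counties):
    """Sum the population of all centers contained in each county.

    population_centers: iterable of (population, latitude, longitude)
    counties: iterable of (county_name, boundary), where boundary is an
              assumed list of (longitude, latitude) vertices.
    """
    counties = list(counties)
    totals = {name: 0 for name, _ in counties}
    for population, latitude, longitude in population_centers:
        for name, boundary in counties:
            if point_in_polygon(longitude, latitude, boundary):
                totals[name] += population
    return totals
```

The nested loop compares every population center with every county, so the cost grows with the product of the two table sizes.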

Engineering SQL Support on Apache Pinot at Uber

The article, Engineering SQL Support on Apache Pinot at Uber, was originally published by Uber on the Uber Engineering Blog on January 15, 2020. Check out eng.uber.com for more articles about Uber’s engineering work and follow Uber Engineering at @UberEng and Uber Open Source at @UberOpenSouce on Twitter for updates from our teams. Uber leverages…

Improving the Presto planner for better push down and data federation

Presto defines a connector API that allows Presto to query any data source that has a connector implementation. The existing connector API provides basic predicate pushdown functionality allowing connectors to perform filtering at the underlying data source. However, there are certain limitations with the existing predicate pushdown functionality that limit what connectors can do. The…

5 design choices—and 1 weird trick — to get 2x efficiency gains in Presto repartitioning

We like Presto. We like it a lot — so much we want to make it better in every way. Here’s an example: we just optimized the PartitionedOutputOperator. It’s now 2-3x more CPU efficient, which, when measured against Facebook’s production workload, translates to 6% gains overall. That’s huge. The optimized repartitioning is in use on…

Join Us! Growing the Presto Foundation in 2020 and Beyond

The Presto Foundation (PF) was established in September 2019 as an openly governed and vendor-neutral body dedicated to scaling and diversifying the Presto community. Hosted by the Linux Foundation, PF and its Governing Board are in a unique position to make Presto the fastest and the most reliable SQL engine for massively distributed data processing…

Presto now hosted under the Linux Foundation

We are excited to announce today, in partnership with Alibaba, Facebook, Twitter, and Uber, the launch of the Presto Foundation, a non-profit organization under the umbrella of the Linux Foundation. Hosting by the Linux Foundation opens up the Presto community to a broader ecosystem of users and contributors. The Presto Foundation’s open and neutral governance…