Capturing Worker Runtime Metrics with Prometheus Reporter in Presto C++

In this blog we will look at the Presto C++ worker’s ability to report worker level metrics through Presto CPP BaseStatsReporter interface and how this interface is implemented and integrated with Prometheus, a time series database.

Background on Presto Architecture

Presto, an open-source distributed SQL query engine, operates with a coordinator and worker nodes. In this context, Presto C++ acts as an HTTP server implementing the worker’s control and data plane REST APIs. It uses the Velox library to convert incoming queries from the coordinator into Velox query plans. The Presto C++ worker is a thin wrapper designed to integrate seamlessly with the Presto ecosystem.

You can learn more about Presto C++ in this blog post from IBM and Meta.

Capturing Metrics in Presto C++

During query execution, multiple operator events are triggered that impact the worker’s performance. Tracking these events is crucial for understanding the operator’s performance and the overall worker’s resource utilization. Presto C++ captures these metrics through the PrestoCPP BaseStatReporter using a set of predefined macros.

There are two levels of stats collected:

1. Operator Stats: which measure operator performance, and. These stats are available through Presto coordinator UI.

2. Worker Stats, which track worker resources use at any instant of time and includes aggregated operator stats at the worker level.

The BaseStatReporter class interface in Velox is central to defining and recording metrics. Two key macros are defined for convenience which we call BaseStatsReporter methods. They are:

Define Metric: This macro defines the metric with a key (name of the metric) and a stat type (how the metric was captured and supported).

Record Metric Value: This macro records the value against the metric name.

For instance, a recently added metric, PrestoCPPNumTasksBytesProcessed, tracks the total number of bytes processed by tasks on a single worker. The process involves defining the metric name, specifying the stat type (e.g., average), and recording the metric value. The value is typically retrieved from the task manager, which iterates over tasks to count the total processed input data size.

Metric Types in Presto C++

Presto C++ defines several metric types to capture various aspects of worker performance:

Counter: Tracks metrics that grow over time, resetting only on application restart (e.g., total number of HTTP requests).

Sum: Tracks the cumulative sum of values over time, reporting the difference (delta) between the new and earlier sums.

Average: Captures metrics that grow or decay over time (e.g., CPU utilization).

Rate: Tracks the sum of values per second.

Histogram: Summarizes an event over time, dividing it into buckets and counting occurrences within each range.

Integration with Prometheus

The implementation of the BaseStatReporter interface in Presto C++ is based on Prometheus, a time series database optimized for storing and querying time series events. The Prometheus C++ library simplifies the process of recording metrics by providing the necessary data structures and interfaces.

Data Format and Integration Model

Metrics are reported in Prometheus’ text-based data format, which includes the metric name, type, labels (e.g., cluster and worker ID), and value. The Presto C++ worker exposes a REST API endpoint (/v1/info/metrics) that allows Prometheus to pull metrics directly from the worker. This integration model avoids the need for a separate sidecar process, simplifying deployment and reducing resource overhead.

Setting Up Prometheus Stats Reporter

To enable Prometheus-based metrics collection in Presto C++, the following steps are necessary:

Install Prometheus C++ Library: Use setup scripts during the pre-build phase to install the Prometheus library.

Configure Build System: Modify the build system (e.g., CMake) to include the Prometheus folder and related test files.

Enable Runtime Metrics Collection: Configure the runtime environment to enable metric collection.

In a recent meetup we demo’d how to collect metrics in real-time from a Presto C++ cluster using Grafana dashboards. The demo illustrated how metrics like cache hits, SST statistics, and others could be visualized, providing valuable insights into the worker’s performance.

This integration of Prometheus Reporter with Presto C++ not only enhances observability but also simplifies the process of capturing and reporting runtime metrics, making it a robust solution for monitoring and optimizing distributed query execution in production environments.

If you’re interested in learning more about Presto C++ or want to get involved with the OS community, join us in the Presto community slack channel #presto-cpp.