8.23. Quantile Digest Functions
Presto implements the approx_percentile function with the quantile digest data structure. The underlying data structure, qdigest, is exposed as a data type in Presto, and can be created, queried and stored separately from approx_percentile.
Data Structures
A quantile digest is a data sketch which stores approximate percentile information. The Presto type for this data structure is called qdigest, and it takes a parameter which must be one of bigint, double or real, which represents the set of numbers that may be ingested by the qdigest. Digests may be merged without losing precision, and for storage and retrieval they may be cast to/from VARBINARY.
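As a minimal sketch of the storage round trip described above, the query below builds a digest, casts it to VARBINARY, and casts it back before querying it. The inline values and the names samples, stored and digest_bytes are hypothetical and exist only for illustration.

```sql
-- Hypothetical inline data; BIGINT literals keep the digest typed as qdigest(bigint).
WITH samples(x) AS (
    VALUES BIGINT '10', BIGINT '20', BIGINT '30', BIGINT '40', BIGINT '50'
),
stored AS (
    -- Serialize the digest, as one would before writing it to a table column.
    SELECT CAST(qdigest_agg(x) AS varbinary) AS digest_bytes
    FROM samples
)
-- Deserialize and query the restored digest.
SELECT value_at_quantile(CAST(digest_bytes AS qdigest(bigint)), 0.5) AS approx_median
FROM stored;
```

The cast back must name the same type parameter (here bigint) that the digest was built with.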
Functions

merge(qdigest) → qdigest
Merges all input qdigests into a single qdigest.
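One plausible use of merge is to keep a digest per group and combine the groups later. The sketch below assumes a hypothetical data set named latencies with columns event_date and latency_ms, builds one digest per day, then merges the daily digests to estimate a percentile over the whole period.

```sql
-- Hypothetical inline data standing in for a real table.
WITH latencies(event_date, latency_ms) AS (
    VALUES
        (DATE '2024-01-01', BIGINT '120'),
        (DATE '2024-01-01', BIGINT '340'),
        (DATE '2024-01-02', BIGINT '95'),
        (DATE '2024-01-02', BIGINT '410')
),
daily AS (
    -- One digest per day.
    SELECT event_date, qdigest_agg(latency_ms) AS digest
    FROM latencies
    GROUP BY event_date
)
-- merge is an aggregate: it combines all daily digests into one.
SELECT value_at_quantile(merge(digest), 0.9) AS approx_p90
FROM daily;
```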

value_at_quantile(qdigest(T), quantile) → T
Returns the approximate percentile value from the quantile digest given the number quantile between 0 and 1.

quantile_at_value(qdigest(T), T) → quantile
Returns the approximate quantile number between 0 and 1 from the quantile digest given an input value. Null is returned if the quantile digest is empty or the input value is outside of the range of the quantile digest.
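value_at_quantile and quantile_at_value answer inverse questions, which the following sketch illustrates on the same digest. The inline values and the names samples and digest are hypothetical; results are approximate, so no expected output is shown.

```sql
-- Hypothetical inline data; the digest is typed qdigest(bigint).
WITH samples(x) AS (
    VALUES BIGINT '10', BIGINT '20', BIGINT '30', BIGINT '40', BIGINT '50'
),
d AS (
    SELECT qdigest_agg(x) AS digest
    FROM samples
)
SELECT
    -- Which value sits at roughly the 75th percentile?
    value_at_quantile(digest, 0.75) AS value_at_p75,
    -- Roughly which quantile does the value 30 correspond to?
    quantile_at_value(digest, BIGINT '30') AS quantile_of_30,
    -- 1000 lies outside the digest's range, so this is expected to be NULL.
    quantile_at_value(digest, BIGINT '1000') AS quantile_out_of_range
FROM d;
```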

scale_qdigest(qdigest(T), scale_factor) → qdigest(T)
Returns a qdigest whose distribution has been scaled by a factor specified by scale_factor.

values_at_quantiles(qdigest(T), quantiles) → array(T)
Returns the approximate percentile values as an array given the input quantile digest and an array of values between 0 and 1 which represent the quantiles to return.
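When several percentiles are needed from one digest, values_at_quantiles can return them in a single call. The sketch below uses hypothetical inline data and casts the quantile array to array(double) explicitly, on the assumption that this avoids depending on how decimal literals are parsed.

```sql
-- Hypothetical inline data; the digest is typed qdigest(bigint).
WITH samples(x) AS (
    VALUES BIGINT '10', BIGINT '20', BIGINT '30', BIGINT '40', BIGINT '50'
)
SELECT values_at_quantiles(
           qdigest_agg(x),
           -- Quartiles, passed as an array of doubles between 0 and 1.
           CAST(ARRAY[0.25, 0.5, 0.75] AS array(double))
       ) AS approx_quartiles
FROM samples;
```

The result is an array with one element per requested quantile, in the same order as the input array.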

qdigest_agg(x) → qdigest<[same as x]>
Returns the qdigest which is composed of all input values of x.

qdigest_agg(x, w) → qdigest<[same as x]>
Returns the qdigest which is composed of all input values of x using the per-item weight w.

qdigest_agg(x, w, accuracy) → qdigest<[same as x]>
Returns the qdigest which is composed of all input values of x using the per-item weight w and a maximum error of accuracy. accuracy must be a value greater than zero and less than one, and it must be constant for all input rows.
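The three-argument form can be sketched as follows. The data set orders with columns price and quantity is hypothetical, and the sketch assumes the weight is a bigint and the accuracy a double constant; each price is weighted by the number of units sold at that price.

```sql
-- Hypothetical inline data: unit price and number of units sold at that price.
WITH orders(price, quantity) AS (
    VALUES
        (BIGINT '5',  BIGINT '100'),
        (BIGINT '20', BIGINT '10'),
        (BIGINT '75', BIGINT '1')
)
SELECT value_at_quantile(
           -- Weight each price by quantity; request a maximum error of 0.01.
           -- The accuracy argument is the same constant for every row.
           qdigest_agg(price, quantity, DOUBLE '0.01'),
           0.5
       ) AS approx_median_unit_price
FROM orders;
```

With these weights the result reflects the median price per unit sold rather than per order row.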