Post-aggregations
Apache Druid supports two query languages: Druid SQL and native queries. This document describes the native language. For information about functions available in SQL, refer to the SQL documentation.
Post-aggregations are specifications of processing that should happen on aggregated values as they come out of Apache Druid. If you include a post aggregation as part of a query, make sure to include all aggregators the post-aggregator requires.
There are several post-aggregators available.
Arithmetic post-aggregator
The arithmetic post-aggregator applies the provided function to the given fields from left to right. The fields can be aggregators or other post aggregators.
Property | Description | Required |
---|---|---|
type | Must be "arithmetic" . | Yes |
name | Output name of the post-aggregation | Yes |
fn | Supported functions are + , - , * , / , pow and quotient | Yes |
fields | List of post-aggregator specs which define inputs to the fn | Yes |
ordering | If no ordering (or null ) is specified, the default floating point ordering is used. numericFirst ordering always returns finite values first, followed by NaN , and infinite values last. | No |
Note:
/
division always returns0
if dividing by0
, regardless of the numerator.quotient
division behaves like regular floating point division- Arithmetic post-aggregators always use floating point arithmetic.
Example:
{
"type" : "arithmetic",
"name" : "mult",
"fn" : "*",
"fields": [
{"type": "fieldAccess", "fieldName": "someAgg"},
{"type": "fieldAccess", "fieldName": "someOtherAgg"}
]
}
Field accessor post-aggregators
These post-aggregators return the value produced by the specified dimension or aggregator.
Property | Description | Required |
---|---|---|
type | Must be "fieldAccess" or "finalizingFieldAccess" . Use type "fieldAccess" to return the raw aggregation object, or use type "finalizingFieldAccess" to return a finalized value, such as an estimated cardinality. | Yes |
name | Output name of the post-aggregation | Yes if defined as a standalone post-aggregation, but may be omitted if used inline to some other post-aggregator in a fields list |
fieldName | The output name of the dimension or aggregator to reference | Yes |
Example:
{ "type" : "fieldAccess", "name": "someField", "fieldName" : "someAggregator" }
or
{ "type" : "finalizingFieldAccess", "name": "someFinalizedField", "fieldName" : "someAggregator" }
Constant post-aggregator
The constant post-aggregator always returns the specified value.
Property | Description | Required |
---|---|---|
type | Must be "constant" | Yes |
name | Output name of the post-aggregation | Yes |
value | The constant value | Yes |
Example:
{ "type" : "constant", "name" : "someConstant", "value" : 1234 }
Expression post-aggregator
The expression post-aggregator is defined using a Druid expression.
Property | Description | Required |
---|---|---|
type | Must be "expression" | Yes |
name | Output name of the post-aggregation | Yes |
expression | Native Druid expression to compute, may refer to any dimension or aggregator output names | Yes |
ordering | If no ordering (or null ) is specified, the "natural" ordering is used. numericFirst ordering always returns finite values first, followed by NaN , and infinite values last. If the expression produces array or complex types, specify ordering as null and use outputType instead to use the correct type native ordering. | No |
outputType | Output type is optional, and can be any native Druid type: LONG , FLOAT , DOUBLE , STRING , ARRAY types (e.g. ARRAY<LONG> ), or COMPLEX types (e.g. COMPLEX<json> ). If not specified, the output type will be inferred from the expression . If specified and ordering is null, the type native ordering will be used for sorting values. If the expression produces array or complex types, this value must be non-null to ensure the correct ordering is used. If outputType does not match the actual output type of the expression , the value will be attempted to coerced to the specified type, possibly failing if coercion is not possible. | No |
Example:
{
"type": "expression",
"name": "someExpression",
"expression": "someAgg + someOtherAgg",
"ordering": null,
"outputType": "LONG"
}
Greatest / Least post-aggregators
doubleGreatest
and longGreatest
computes the maximum of all fields and Double.NEGATIVE_INFINITY.
doubleLeast
and longLeast
computes the minimum of all fields and Double.POSITIVE_INFINITY.
Property | Description | Required |
---|---|---|
type | Must be "doubleGreatest" , "doubleLeast" , "longGreatest" , or "longLeast" . | Yes |
name | Output name of the post-aggregation | Yes |
fields | List of post-aggregator specs which define inputs to the greatest or least function | Yes |
The difference between the doubleMax
aggregator and the doubleGreatest
post-aggregator is that doubleMax
returns the highest value of
all rows for one specific column while doubleGreatest
returns the highest value of multiple columns in one row. These are similar to the
SQL MAX
and GREATEST
functions.
Example:
{
"type" : "doubleGreatest",
"name" : "theGreatest",
"fields": [
{ "type": "fieldAccess", "fieldName": "someAgg" },
{ "type": "fieldAccess", "fieldName": "someOtherAgg" }
]
}
JavaScript post-aggregator
Applies the provided JavaScript function to the given fields. Fields are passed as arguments to the JavaScript function in the given order.
Property | Description | Required |
---|---|---|
type | Must be "javascript" | Yes |
name | Output name of the post-aggregation | Yes |
fieldNames | List of input dimension or aggregator output names | Yes |
function | String javascript function which accepts fieldNames as arguments | Yes |
Example:
{
"type": "javascript",
"name": "someJavascript",
"fieldNames" : ["someAgg", "someOtherAgg"],
"function": "function(someAgg, someOtherAgg) { return 100 * Math.abs(someAgg) / someOtherAgg;"
}
JavaScript-based functionality is disabled by default. Please refer to the Druid JavaScript programming guide for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it.
HyperUnique Cardinality post-aggregator
The hyperUniqueCardinality post aggregator is used to wrap a hyperUnique object such that it can be used in post aggregations.
Property | Description | Required |
---|---|---|
type | Must be "hyperUniqueCardinality" | Yes |
name | Output name of the post-aggregation | Yes |
fieldName | The output name of a hyperUnique aggregator | Yes |
{
"type" : "hyperUniqueCardinality",
"name": "someCardinality",
"fieldName" : "someHyperunique"
}
It can be used in a sample calculation as so:
{
...
"aggregations" : [{
{"type" : "count", "name" : "rows"},
{"type" : "hyperUnique", "name" : "unique_users", "fieldName" : "uniques"}
}],
"postAggregations" : [{
"type" : "arithmetic",
"name" : "average_users_per_row",
"fn" : "/",
"fields" : [
{ "type" : "hyperUniqueCardinality", "fieldName" : "unique_users" },
{ "type" : "fieldAccess", "name" : "rows", "fieldName" : "rows" }
]
}]
...
This post-aggregator will inherit the rounding behavior of the aggregator it references. Note that this inheritance is only effective if you directly reference an aggregator. Going through another post-aggregator, for example, will cause the user-specified rounding behavior to get lost and default to "no rounding".
Example Usage
In this example, let’s calculate a simple percentage using post aggregators. Let’s imagine our data set has a metric called "total".
The format of the query JSON is as follows:
{
...
"aggregations" : [
{ "type" : "count", "name" : "rows" },
{ "type" : "doubleSum", "name" : "tot", "fieldName" : "total" }
],
"postAggregations" : [{
"type" : "arithmetic",
"name" : "average",
"fn" : "/",
"fields" : [
{ "type" : "fieldAccess", "name" : "tot", "fieldName" : "tot" },
{ "type" : "fieldAccess", "name" : "rows", "fieldName" : "rows" }
]
}]
...
}
{
...
"aggregations" : [
{ "type" : "doubleSum", "name" : "tot", "fieldName" : "total" },
{ "type" : "doubleSum", "name" : "part", "fieldName" : "part" }
],
"postAggregations" : [{
"type" : "arithmetic",
"name" : "part_percentage",
"fn" : "*",
"fields" : [
{ "type" : "arithmetic",
"name" : "ratio",
"fn" : "/",
"fields" : [
{ "type" : "fieldAccess", "name" : "part", "fieldName" : "part" },
{ "type" : "fieldAccess", "name" : "tot", "fieldName" : "tot" }
]
},
{ "type" : "constant", "name": "const", "value" : 100 }
]
}]
...
}
The same could be computed using an expression post-aggregator:
{
...
"aggregations" : [
{ "type" : "doubleSum", "name" : "tot", "fieldName" : "total" },
{ "type" : "doubleSum", "name" : "part", "fieldName" : "part" }
],
"postAggregations" : [{
"type" : "expression",
"name" : "part_percentage",
"expression" : "100 * (part / tot)"
}]
...
}