mirror of
https://github.com/scylladb/scylladb.git
synced 2026-05-13 03:12:13 +00:00
We define the "aggregation depth" of an expression as the number of nested aggregation functions applied. In CQL/SQL, the legal values are 0 and 1, but for generality we handle any aggregation depth.

The first helper measures the maximum aggregation depth along any path in the expression graph. If it's 2 or greater, we have something like `max(max(x))` and we should reject it (though these helpers don't). If it's 1, it's a simple aggregation. If it's 0, we're not aggregating (though CQL may decide to aggregate anyway if GROUP BY is used).

The second helper edits an expression to make sure the aggregation depth along any path that reaches a column is the same. Logically, `SELECT x, max(y)` does not make sense, as one is a vector of values and the other is a scalar. CQL resolves the problem by defining x as "the first value seen". We apply this resolution by converting the query to `SELECT first(x), max(y)` (where `first()` is an internal aggregate function), so both selectors refer to scalars that consume vectors.

When a scalar is consumed by an aggregate function (for example, `SELECT max(x), min(17)`), we don't have to bother, since a scalar is implicitly promoted to a vector by evaluating it on every row. There is some ambiguity if the scalar is a non-pure function (e.g. `SELECT max(x), min(random())`), but it's not worth pursuing.

A small unit test is added.
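
For illustration only, the two helpers can be sketched on a toy expression tree. This is not the code from this change: the Python types `Column`, `Constant` and `Call`, and the function names `aggregation_depth` and `levellize`, are hypothetical stand-ins, and the sketch assumes the target depth passed to the second helper is the maximum depth measured across all selectors.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical toy expression nodes; not the real expression types.
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Constant:
    value: object


@dataclass(frozen=True)
class Call:
    func: str
    args: Tuple["Expr", ...]
    is_aggregate: bool = False


Expr = Union[Column, Constant, Call]


def aggregation_depth(e: Expr) -> int:
    """Maximum number of nested aggregates along any path in the tree."""
    if isinstance(e, Call):
        inner = max((aggregation_depth(a) for a in e.args), default=0)
        return inner + (1 if e.is_aggregate else 0)
    return 0  # columns and constants do not aggregate


def levellize(e: Expr, depth: int) -> Expr:
    """Wrap columns in first() so that every path reaching a column
    passes through exactly `depth` aggregates."""
    if isinstance(e, Column):
        for _ in range(depth):
            e = Call("first", (e,), is_aggregate=True)
        return e
    if isinstance(e, Call):
        # An aggregate on the path uses up one level of the budget.
        remaining = depth - 1 if e.is_aggregate else depth
        new_args = tuple(levellize(a, remaining) for a in e.args)
        return Call(e.func, new_args, e.is_aggregate)
    return e  # scalars are left alone; they are evaluated on every row


# SELECT x, max(y)  becomes  SELECT first(x), max(y)
selectors = [Column("x"), Call("max", (Column("y"),), is_aggregate=True)]
target = max(aggregation_depth(s) for s in selectors)
levelled = [levellize(s, target) for s in selectors]
```

In this sketch `min(17)` comes back unchanged, matching the rule that a scalar consumed by an aggregate needs no wrapping, and `max(max(x))` measures as depth 2 so a caller could reject it.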