SELECT clause components (selectors) are currently evaluated during query execution
using a stateful class hierarchy. The state holds intermediate results while
aggregating over multiple rows. Because the selectors are stateful, we must re-create
them for each query using a selector_factory hierarchy.
We'd like to convert all of this to the unified expression evaluation machinery, so we can
have just one grammar for expressions, and just one way to evaluate expressions, but
the statefulness makes this complex.
In commit 59ab9aac44 ("Merge 'functions: reframe aggregate functions in terms
of scalar functions' from Avi Kivity"), we made aggregate functions stateless, moving
their state to aggregate_function_selector::_accumulator, and therefore into the
class hierarchy we're addressing now. Another reason for keeping state is that selectors
that aren't aggregated capture the first value they see in a GROUP BY group.
Since expressions can't contain state directly, we break apart expressions that contain
aggregate functions into two: an inner expression that processes incoming rows within
a group, and an outer expression that generates the group's output. The two expressions
communicate via a newly introduced expression element: a temporary.
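
As a rough illustration of the split (a toy model, not the code in this series), consider
a selector such as `max(c0) - min(c0)`. The names `split_selector`, `temporaries`,
`make_range_selector` and `evaluate_group` below are hypothetical and do not correspond
to ScyllaDB's `expr::` types; the sketch only shows how per-group state can live in
temporaries while both expressions stay stateless.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical toy model; these are NOT ScyllaDB's expr:: types.
using row = std::vector<std::int64_t>;          // one input row (column values)
using temporaries = std::vector<std::int64_t>;  // per-group scratch slots

struct split_selector {
    temporaries initial;                                    // slot values at group start
    std::function<void(temporaries&, const row&)> inner;    // evaluated once per row
    std::function<std::int64_t(const temporaries&)> outer;  // evaluated once per group
};

// Split form of the selector `max(c0) - min(c0)`:
// slot 0 accumulates the maximum, slot 1 the minimum.
split_selector make_range_selector() {
    return split_selector{
        {std::numeric_limits<std::int64_t>::min(),
         std::numeric_limits<std::int64_t>::max()},
        [](temporaries& t, const row& r) {
            t[0] = std::max(t[0], r[0]);
            t[1] = std::min(t[1], r[0]);
        },
        [](const temporaries& t) { return t[0] - t[1]; },
    };
}

// All state lives in `t`, outside both expressions.
std::int64_t evaluate_group(const split_selector& s, const std::vector<row>& group) {
    assert(!group.empty());  // the toy model does not define a result for empty groups
    temporaries t = s.initial;
    for (const row& r : group) {
        s.inner(t, r);
    }
    return s.outer(t);
}
```

In this sketch the inner expression only writes temporaries and the outer expression only
reads them, which is the communication pattern described above.
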
The problem of non-aggregated columns requiring state is solved by encapsulating
those columns in an internal aggregate function, called the "first" function.
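
A minimal sketch of the idea behind "first", assuming it simply keeps the first value it
is given within a group and ignores the rest. `first_state`, `first_accumulate` and
`first_of_group` are hypothetical names for illustration and are not the function added
by this series; NULL handling and typing are left out.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the state of a "first"-style aggregate.
struct first_state {
    std::optional<std::string> value;  // empty until the group's first row arrives
};

// Per-row step: remember the value only if nothing has been captured yet.
void first_accumulate(first_state& s, const std::string& v) {
    if (!s.value) {
        s.value = v;
    }
}

// Runs the aggregate over one GROUP BY group and returns the captured value.
std::string first_of_group(const std::vector<std::string>& group) {
    assert(!group.empty());  // the sketch does not model empty groups
    first_state s;
    for (const std::string& v : group) {
        first_accumulate(s, v);
    }
    return *s.value;
}
```

Wrapping a plain column reference this way lets it be treated like any other aggregate,
so it needs no special-case state of its own.
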
In terms of performance, this series has little effect, since the common case of selectors
that only contain direct column references without transformations is evaluated via a fast
path (`simple_selection`). This fast path is preserved with almost no changes.
While the series makes it possible to start to extend the grammar and unify expression
syntaxes, it does not do so. The grammar is unchanged. There is just one breaking change:
the `SELECT JSON` statement generates JSON object field names based on the input selectors.
In one case the name of the field has changed, but it is an esoteric case (where a function call
is selected as part of `SELECT JSON`), and the new behavior is compatible with Cassandra.
Closes #14467
* github.com:scylladb/scylladb:
cql3: selection: drop selector_factories, selectables, and selectors
cql3: select_statement: stop using selector_factories in SELECT JSON
cql3: selection: don't create selector_factories any more
cql3: selection: collect column_definitions using expressions
cql3: selection: reimplement selection::is_aggregate()
cql3: selection: evaluate aggregation queries via expr::evaluate()
cql3: selection, select_statement: fine tune add_column_for_post_processing() usage
cql3: selection: evaluate non-aggregating complex selections using expr::evaluate()
cql3: selection: store primary key in result_set_builder
cql3: expression: fix field_selection::type interpretation by evaluate()
cql3: selection: make result_set_builder::current non-optional<>
cql3: selection: simplify row/group processing
cql3: selection: convert requires_thread to expressions
cql: selection: convert used_functions() to expressions
cql3: selection: convert is_reducible/get_reductions to expressions
cql3: selection: convert is_count() to expressions
cql3: selection convert contains_ttl/contains_writetime to work on expressions
cql3: selection: make simple_selectors stateless
cql3: expression: add helper to split expressions with aggregate functions
cql3: selection: short-circuit non-aggregations
cql3: selection: drop validate_selectors
cql3: select_statement: force aggregation if GROUP BY is used
cql3: select_statement: levellize aggregation depth
cql3: selection: skip first_function when collecting metadata
cql3: select_statement: explicitly disable automatic parallelization with no aggregates
cql3: expression: introduce temporaries
cql3: select_statement: use prepared selectors
cql3: selection: avoid selector_factories in collect_metadata()
cql3: expressions: add "metadata mode" formatter for expressions
cql3: selection: convert collect_metadata() to the prepared expression domain
cql3: selection: convert processes_selection to work on prepared expressions
cql3: selection: prepare selectors earlier
cql3: raw_selector: deinline
cql3: expression: reimplement verify_no_aggregate_functions()
cql3: expression: add helpers to manage an expression's aggregation depth
cql3: expression: improve printing of prepared function calls
cql3: functions: add "first" aggregate function