scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 13:06:57 +00:00

Author	SHA1	Message	Date
Avi Kivity	b9bc783418	cql3: selection: don't ignore regular column restriction if a regular row is not present If a regular row isn't present, no regular column restriction (say, r=3) can pass since all regular columns are presented as NULL, and we don't have an IS NULL predicate. Yet we just ignore it. Handle the restriction on a missing column by returning false, signifying the row was filtered out. We have to move the check after the conditional checking whether there's any restriction at all, otherwise we exit early with a false failure. Unit tests marked xfail for this issue are now unmarked. A subtest of test_tombstone_limit is adjusted since it depended on this bug. It tested a regular column which wasn't there, and this bug caused the filter to be ignored. Change to test a static column that is there. A test for a bug found while developing the patch is also added. It is also tested by test_tombstone_limit, but better to have a dedicated test. Fixes #10357 Closes scylladb/scylladb#20486	2024-09-15 13:44:16 +03:00
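The ordering described in b9bc783418 (first check whether any restriction exists, then reject a missing regular row) can be illustrated with a minimal sketch. The types `regular_row` and `restriction` and the function `passes_regular_restrictions` are hypothetical stand-ins invented for this illustration; they are not the ScyllaDB filter code.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-ins for illustration; not ScyllaDB types.
struct regular_row {
    int r;  // a single regular column
};

struct restriction {
    int expected_r;  // models the predicate "r = expected_r"
};

// Returns true if the row passes the filter, false if it is filtered out.
bool passes_regular_restrictions(const std::optional<regular_row>& row,
                                 const std::vector<restriction>& restrictions) {
    // Check first whether there is anything to enforce. Doing the
    // missing-row check before this would reject rows of unrestricted queries.
    if (restrictions.empty()) {
        return true;
    }
    // A missing regular row presents every regular column as NULL, and an
    // equality restriction can never match NULL.
    if (!row) {
        return false;
    }
    for (const auto& rs : restrictions) {
        if (row->r != rs.expected_r) {
            return false;
        }
    }
    return true;
}
```

With no restrictions a missing row still passes; with `r = 3` it is filtered out instead of being silently accepted.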
Avi Kivity	3de4e8f91b	Merge 'cql: process LIMIT for GROUP BY select queries' from Paweł Zakrzewski This change fixes #17237, fixes #5361 and fixes #5362 by passing the limit value down the call chain in cql3. A test is also added. fixes #17237 fixes #5361 fixes #5362 The regression happened in 5.4 as we changed the way GROUP BY is processed in `432cb02` - to force aggregation when it is used. The LIMIT value was not passed to aggregations and thus we failed to adhere to it. We want to backport this fix to 5.4 and 6.0 to have continuous correct results for the test case from #17237 This patch consists of 4 commits: - fa4225ea0fac2057b7a9976f57dc06bcbd900cd4 - cql3: respect the user-defined page size in aggregate queries - a precondition for this patch to be implementable - 8fbe69e74dca16ed8832d9a90489ca47ba271d0b - cql3/select_statement: simplify the get_limit function - the `do_get_limit()` function did a lot of legwork that should not be associated with it. This change makes it trivial and makes its callers do additional checks (for unset guards, or for an aggregate query) - 162828194a2b88c22fbee335894ff045dcc943c9 - cql3: process LIMIT for GROUP BY queries - pass the limit value down the chain and make use of it. This is the actual fix to #17237 - b3dc6de6d6cda8f5c09b01463bb52f827a6a00b4 - test/cql-pytest: Add test for GROUP BY queries with LIMIT - tests Closes scylladb/scylladb#18842 * github.com:scylladb/scylladb: test/cql-pytest: Add test for GROUP BY queries with LIMIT cql3: process LIMIT for GROUP BY queries cql3/select_statement: simplify the get_limit function cql3: respect the user-defined page size in aggregate queries	2024-08-14 17:54:59 +03:00
Paweł Zakrzewski	e7ae7f3662	cql3: process LIMIT for GROUP BY queries Currently LIMIT is not passed to the query executor at all and it was just an accident that it worked for the case referenced in #17237. This change passes the limit value down the chain.	2024-08-11 09:08:43 +02:00
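As a rough illustration of why the limit has to reach the aggregation step, the sketch below counts rows per partition and stops once the requested number of groups has been produced. It assumes that LIMIT counts output groups rather than input rows, and that rows arrive sorted by partition key; the function `count_per_partition` is hypothetical and does not mirror the cql3 call chain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: count(*) ... GROUP BY p LIMIT limit.
// partition_keys holds the partition key of each input row, sorted so that
// rows of the same partition are adjacent.
std::vector<std::size_t> count_per_partition(const std::vector<int>& partition_keys,
                                             std::size_t limit) {
    std::vector<std::size_t> counts;
    int current = 0;  // partition key of the group being accumulated
    for (int p : partition_keys) {
        const bool new_group = counts.empty() || p != current;
        if (new_group) {
            // The limit applies to groups: stop before opening one too many.
            if (counts.size() == limit) {
                break;
            }
            counts.push_back(0);
            current = p;
        }
        ++counts.back();
    }
    return counts;
}
```

If the limit were applied to input rows instead, the last group could be truncated and report a wrong count.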
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Kefu Chai	ee80742c39	cql3: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19906	2024-07-28 17:29:07 +03:00
Avi Kivity	3fc4e23a36	forward_service: rename to mapreduce_service forward_service is nondescriptive and misnamed, as it does more than forward requests. It's a classic map/reduce algorithm (and in fact one of its parameters is "reducer"), so name it accordingly. The name "forward" leaked into the wire protocol for the messaging service RPC isolation cookie, so it's kept there. It's also maintained in the name of the logger (for "nodetool setlogginglevel") for compatibility with tests. Closes scylladb/scylladb#19444	2024-07-03 19:29:47 +03:00
Nadav Har'El	1aea2136c8	cql: fix regression in SELECT * GROUP BY Recently, the expression-rewrite effort changed the way that GROUP BY is implemented. Usually GROUP BY involves an aggregation function (e.g., if you want a separate SUM per partition). But there's also a query like SELECT p, c1, c2, v FROM tbl GROUP BY p This query is supposed to return one row - the first row in clustering order - per group (in this case, partition). The expression rewrite re-implemented this feature by introducing a new internal aggregator, first(), which returns the first aggregated value. The above query is rewritten into: SELECT first(p), first(c1), first(c2), first(v) FROM tbl GROUP BY p This case works correctly, and we even have a regression test for it. But unfortunately the rewrite broke the following query: SELECT * FROM tbl GROUP BY p Note the "*" instead of the explicit list of columns. In our implementation, a selection of "*" looks like an empty selection, and it didn't get the "first()" treatment and it remained a "SELECT *" - and wrongly returned all rows instead of just the first one in each partition. This was a regression - it worked correctly in Scylla 5.2 (and also in Cassandra) - see the next patch for a regression test. In this patch we fix this regression. When there is a GROUP BY, the "*" is rewritten to the appropriate list of all visible columns and then gets the first() treatment, so it will return only the first row as expected. The next patch will be a test that confirms the bug and its fix. Fixes #16531 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-12-25 17:52:57 +02:00
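The expected result of a GROUP BY without aggregation functions, as described in 1aea2136c8, is one row per group: the first in clustering order. A minimal sketch of that behaviour follows; the `row` struct and `first_row_per_partition` are hypothetical names for illustration, and the input is assumed to arrive sorted by partition and then by clustering key.

```cpp
#include <cassert>
#include <vector>

// Hypothetical row: partition key, clustering key, regular value.
struct row {
    int p;
    int c;
    int v;
};

// Models "SELECT * FROM tbl GROUP BY p": keep only the first row of each
// partition, which is what applying first() to every column amounts to.
std::vector<row> first_row_per_partition(const std::vector<row>& rows) {
    std::vector<row> out;
    for (const auto& r : rows) {
        // A row opens a new group when its partition key differs from the
        // partition key of the last row kept.
        if (out.empty() || out.back().p != r.p) {
            out.push_back(r);
        }
    }
    return out;
}
```

The regression returned every input row, which corresponds to skipping the partition-key comparison above.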
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Avi Kivity	66c47d40e6	cql3: selection: drop selector_factories, selectables, and selectors The whole class hierarchy is no longer used by anything and we can just delete it.	2023-07-03 19:45:17 +03:00
Avi Kivity	039472ffb9	cql3: selection: don't create selector_factories any more We no longer use selector_factories for anything, so we can drop them.	2023-07-03 19:45:17 +03:00
Avi Kivity	e521557ce5	cql3: selection: collect column_definitions using expressions The replica needs to know which columns we're interested in. Iterate and recurse into all selector expressions to collect all mentioned columns. We use the same algorithm that create_factories_and_collect_column_definitions() uses, even though it is quadratic, to avoid causing surprises.	2023-07-03 19:45:17 +03:00
Avi Kivity	7bd317ace4	cql3: selection: reimplement selection::is_aggregate() We can get rid of the last use of selector_factories by reimplementing is_aggregate(). It's simple - if we have an inner loop, we're aggregating.	2023-07-03 19:45:17 +03:00
Avi Kivity	91cdaa72bd	cql3: selection: evaluate aggregation queries via expr::evaluate() When constructing a selection_with_processing, split the selectors into an inner loop and an outer loop with split_aggregation(). We can then reimplement add_input_row() and get_output_row() as follows: - add_input_row(): evaluate the inner loop expressions and store the results in temporaries - get_output_row(): evaluate the outer loop expressions, pulling in values from those temporaries. reset(), which is called between groups, simply copies the initial values gathered by split_aggregation() into the temporaries. The only complexity comes from add_column_for_post_query_processing(), which essentially re-does the work of split_aggregation(). It would be much better if we added the column before split_aggregation() was called, but some refactoring has to take place before that happens.	2023-07-03 19:45:17 +03:00
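The inner-loop/outer-loop split in 91cdaa72bd can be pictured with a small model. The class below is a hypothetical sketch that borrows the method names from the commit message (`add_input_row()`, `get_output_row()`, `reset()`) but hard-codes a single query shape, an integer average, instead of evaluating expressions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a grouped query computing sum(v) / count(v).
class grouped_average {
    // Initial accumulator values, as they would be gathered when the
    // selectors are split: [0] = sum, [1] = count.
    std::vector<long> _initial{0, 0};
    // Temporaries shared between the inner and outer loops.
    std::vector<long> _temporaries = _initial;
public:
    // Inner loop: fold one input row into the temporaries.
    void add_input_row(long v) {
        _temporaries[0] += v;
        _temporaries[1] += 1;
    }
    // Outer loop: compute the output row from the temporaries only.
    long get_output_row() const {
        assert(_temporaries[1] > 0);  // this sketch requires a non-empty group
        return _temporaries[0] / _temporaries[1];
    }
    // Called between groups: copy the initial values back into the temporaries.
    void reset() {
        _temporaries = _initial;
    }
};
```

Because all per-group state lives in the temporaries, the same object can serve every group of a query.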
Avi Kivity	27254c4f50	cql3: selection, select_statement: fine tune add_column_for_post_processing() usage In three cases we need to consult a column that's possibly not explicitly selected: - for the WHERE clause - for GROUP BY - for ORDER BY The return value of the function is the index where the newly-added column can be found. Currently, the index is correct for both the internal column vector and the result set, but soon it won't be. In the first two cases (WHERE clause and GROUP BY), we're interested in the column before grouping, in the last case (ORDER BY) we're interested in the column after grouping, so we need to distinguish between the two. Since we already have selection::index_of() that returns the pre-grouping index, choose the post-grouping index for the return value of selection::add_column_for_post_processing(), and change the GROUP BY code to use index_of(). Comments are added.	2023-07-03 19:45:17 +03:00
Avi Kivity	6bf1bd7130	cql3: selection: evaluate non-aggregating complex selections using expr::evaluate() Now that everything is in place, implement the fast-path transform_input_row() for selection_with_processing. It's a straightforward call to evaluate() in a loop. We adjust add_column_for_post_processing() to also update _selectors, otherwise ORDER BY clauses that require an additional column will not see that column. Since every sub-class implements transform_input_row(), mark the base class declaration as pure virtual.	2023-07-03 19:45:17 +03:00
Avi Kivity	f5eb7fd6dc	cql3: selection: store primary key in result_set_builder expr::evaluate() expects an exploded primary key in its evaluation_inputs structure (this dates back from the conversion of filtering to expressions). But right now, the exploded primary key is only available in the filter. That's easy to fix however: move the primary key containers to result_set_builder and just keep references in the filter. After this, we can evaluate column_value expressions that reference the primary key.	2023-07-03 19:45:17 +03:00
Avi Kivity	aed01018a3	cql3: selection: make result_set_builder::current non-optional<> Previously, we used the engagedness of result_set_builder::current as a flag, but the previous patch eliminated that and it's always engaged. Remove the optional wrapper to reduce noise.	2023-07-03 19:45:17 +03:00
Avi Kivity	44c8507075	cql3: selection: simplify row/group processing Processing a result set relies on calling result_set_builder::new_row(). This function is quite complex as it has several roles: - complete processing of the previously computed row, if any - determine if GROUP BY grouping has changed, and flush the previous group if so - flush the last group if that's the case This works now, but won't work with expr::evaluate. The reason is that new_row() is called after the partition key and clustering key of the new row have been evaluated, so processing of the previous row will see incorrect data. It works today because we copy the partition key and clustering key into result_set_builder::current, but expr::evaluate uses the exploded partition key and clustering key, which have been clobbered. The solution is to separate the roles. Instead of new_row() that's responsible for completing the previous row and starting a new one, we have start_new_row() that's responsible for what its name says, and complete_row() that's responsible for completing the row and checking for group change. The responsibility for flushing the final group is moved to result_set_builder::build(). This removes the awkward "more_rows_coming" parameter that makes everything more complicated. result_set_builder::current is still optional, but it's always engaged. The next patch will clean that up.	2023-07-03 19:45:17 +03:00
Avi Kivity	877f4f86d2	cql3: selection: convert requires_thread to expressions If any function requires a thread to execute (due to running in Lua or wasm), then the entire selection needs to run in a thread.	2023-07-03 19:45:17 +03:00
Avi Kivity	cbd68abde8	cql: selection: convert used_functions() to expressions used_functions() is used to check whether prepared statements need to be invalidated when user-defined functions change. We need to skip over empty scalar components of aggregates, since these can be defined by users (with the same meaning as if the identity function was used).	2023-07-03 19:45:17 +03:00
Avi Kivity	bfb1acc6d3	cql3: selection: convert is_reducible/get_reductions to expressions The current version of automatic query parallelization works when all selectors are reducible (e.g. have a state_reduction_function member), and all the inputs to the aggregates are direct column selectors without further transformation. The actual column names and reductions need to be packed up for forward_service to be used. Convert is_reducible()/get_reductions() to the expression world. The conversion is fairly straightforward.	2023-07-03 19:45:17 +03:00
Avi Kivity	d99fc29e2d	cql3: selection: convert is_count() to expressions Early versions of automatic query parallelization only supported `SELECT count(*)` with one selector. Convert the check to expressions.	2023-07-03 19:45:17 +03:00
Avi Kivity	d36eb8cea6	cql3: selection convert contains_ttl/contains_writetime to work on expressions contains_ttl/contains_writetime are two attributes of a selection. If a selection contains them, we must ask the replica to send them over; otherwise we don't have data to process. Not sending ttl/writetime saves some effort. The implementation is a straightforward recursive descent using expr::find_in_expression.	2023-07-03 19:45:17 +03:00
Avi Kivity	6c2bb5e1ed	cql3: selection: make simple_selectors stateless Now that we push all GROUP BY queries to selection_with_processing, we always process rows via transform_input_row() and there's no reason to keep any state in simple_selectors. Drop the state and raise an internal error if we're ever called for aggregation.	2023-07-03 19:45:17 +03:00
Avi Kivity	f48ecb5049	cql3: selection: short-circuit non-aggregations Currently, selector evaluation assumes the most complex case where we aggregate, so multiple input rows combine into one output row. In effect the query either specifies an outer loop (for the group) and an inner loop (for input rows), or it only specifies the inner loop; but we always perform the outer and inner loop. Prepare to have a separate path for the non-aggregation case by introducing transform_input_row().	2023-07-03 19:45:17 +03:00
Avi Kivity	4a2428e4ec	cql3: selection: drop validate_selectors It's unused. It dates from the (perhaps better) time when regularity of aggregation across selectors was enforced.	2023-07-03 19:45:17 +03:00
Avi Kivity	ecdded90cd	cql3: selection: skip first_function when collecting metadata We plan to rewrite aggregation queries that have a non-aggregating selector using the first function, so that all selectors are aggregates (or none are). Prevent the first function from affecting metadata (the auto-generated column names), by skipping over the first function if detected. The input and output types are unchanged so this only affects the name.	2023-07-03 19:45:17 +03:00
Avi Kivity	778ae2b461	cql3: expression: introduce temporaries Temporaries are similar to bind variables - they are values provided from outside the expression. While bind variables are provided by the user, temporaries are generated internally. The intended use is for aggregate accumulator storage. Currently aggregates store the accumulator in aggregate_function_selector::_accumulator, which means the entire selector hierarchy must be cloned for every query. With expressions, we can have a single expression object reused for many computations, but we need a way to inject the accumulator into an aggregation, which this new expression element provides.	2023-07-03 19:45:17 +03:00
Avi Kivity	7c3ceb6473	cql3: select_statement: use prepared selectors Change one more layer of processing to work on prepared rather than raw selectors. This moves the call to prepare the selectors early in select_statement processing. In turn this changes maybe_jsonize_select_clause() and forward_service's mock_selection() to work in the prepared realm as well. This moves us one step closer to using evaluate() to process the select clause, as the prepared selectors are now available in select_statement. We can't use them yet since we can't evaluate aggregations.	2023-07-03 19:45:17 +03:00
Avi Kivity	a338d0455d	cql3: selection: avoid selector_factories in collect_metadata() Generate the column headings in the result set metadata using the newly introduced result_set_metadata mode of the expression printer.	2023-07-03 19:45:17 +03:00
Avi Kivity	a1f4abb753	cql3: selection: convert collect_metadata() to the prepared expression domain Simplifies refactoring later on.	2023-07-03 19:45:17 +03:00
Avi Kivity	91b251f6b4	cql3: selection: convert processes_selection to work on prepared expressions processes_selection() checks whether a selector passes-through a column or applies some form of processing (like a case or function application). It's more sensible to do this in the prepared domain as we have more information about the expression. It doesn't really help here, but it does help the refactoring later in the series.	2023-07-03 19:45:17 +03:00
Avi Kivity	4fb797303f	cql3: selection: prepare selectors earlier Currently, each selector expression is individually prepared, then converted into a selector object that is later executed. This is done (on a vector of raw selectors) by cql3::selection::raw_selector::to_selectables(). Split that into two phases. The first phase converts raw_selector into a new struct prepared_selector (a better name would be plain 'selector', but it's taken for now). The second phase continues the process and converts prepared_selector into selectables. This gives us a full view of the prepared expressions while we're preparing the select clause of the select statement.	2023-07-03 19:45:17 +03:00
Avi Kivity	70b246eaaf	cql3: raw_selector: deinline It's easier to refactor things if they don't cause the entire universe to recompile, plus adding new headers is less painful.	2023-07-03 19:45:17 +03:00
Avi Kivity	f6f974cdeb	cql3: selection: fix GROUP BY, empty groups, and aggregations A GROUP BY combined with aggregation should produce a single row per group, except for empty groups. This is in contrast to an aggregation without GROUP BY, which produces a single row no matter what. The existing code only considered the case of no grouping and forced a row into the result, but this caused an unwanted row if grouping was used. Fix by refining the check to also consider GROUP BY. XFAIL tests are relaxed. Fixes #12477. Note, forward_service requires that aggregation produce exactly one row, but since it can't work with grouping, it isn't affected. Closes #14399	2023-06-28 18:56:22 +03:00
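The rule stated in f6f974cdeb (an aggregation without GROUP BY always yields exactly one row, while a grouped aggregation yields one row per non-empty group and therefore nothing for empty input) can be sketched as follows. `count_rows` and its `row` type are hypothetical, and the input is assumed to be sorted by partition key.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row carrying only its partition key.
struct row {
    int p;
};

// Models count(*) with or without "GROUP BY p".
std::vector<std::size_t> count_rows(const std::vector<row>& rows, bool group_by_p) {
    std::vector<std::size_t> out;
    int current = 0;  // partition key of the group being accumulated
    for (const auto& r : rows) {
        // Open a group for the first row, and for every partition change
        // when grouping is requested.
        if (out.empty() || (group_by_p && r.p != current)) {
            out.push_back(0);
            current = r.p;
        }
        ++out.back();
    }
    // Force a result row only when there is no grouping; with GROUP BY an
    // empty input has no groups and so produces no rows.
    if (out.empty() && !group_by_p) {
        out.push_back(0);
    }
    return out;
}
```

The bug fixed by the commit corresponds to dropping the `!group_by_p` condition, which emits an unwanted row for grouped queries over empty input.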
Avi Kivity	b858a4669d	cql3: expr: break up expression.hh header Adding a function declaration to expression.hh causes many recompilations. Reduce that by: - moving some restrictions-related definitions to the existing expr/restrictions.hh - moving evaluation related names to a new header expr/evaluate.hh - move utilities to a new header expr/expr-utilities.hh expression.hh contains only expression definitions and the most basic and common helpers, like printing.	2023-06-22 14:21:03 +03:00
Avi Kivity	32b27d6a08	cql3: expr: change evaluation_input vector components to take spans Spans are slightly cleaner, slightly faster (as they avoid an indirection), and allow for replacing some of the arguments with small_vector:s. Closes #14313	2023-06-22 11:28:01 +02:00
Avi Kivity	190d1b20bf	cql3: selector: drop inheritance from assignment_testable Since all function overload selection is done by prepare_expression(), we no longer need to implement the assignment_testable interface, so drop it. Since there's now just one implementation of assignment_testable, we can drop it and replace it by the implementation (expressions), but that is left for later.	2023-06-13 21:04:49 +03:00
Avi Kivity	f438b9b044	cql3: selection: rely on prepared expressions Now that selector expressions are prepared, we can avoid doing the work ourselves: - function_name:s are resolved into functions, so we can error out if we see a function_name (and drop the with_function class) - casts are converted to anonymous functions, so we can error out if we see them (and drop the with_cast class) - field_selection:s can rely on the prepared field_idx	2023-06-13 21:04:49 +03:00
Avi Kivity	1040589828	cql3: selection: prepare selector expressions Call prepare_expression() on selector expressions to resolve types. This leaves us with just one way to move from the unprepared domain to the prepared domain. The change is somewhat awkward since do_prepare_selectable() is re-doing work that is done by prepare_expression(), but somehow it all works. The next patch will tear down the unnecessary double-preparation.	2023-06-13 21:04:49 +03:00
Avi Kivity	c0f59f0789	cql3: eliminate dynamic_cast<selector> from functions::get() Type inference for function calls is a bit complicated: - a function argument can be inferred from the signature: a call to my_func(:arg) will infer :arg's type from the function signature - a function signature can be inferred from its argument types: a call to max(my_column) will select the correct max() signature (as max is generic) from my_column's type Currently, functions::get() implements this by invoking dynamic_cast<selector*> on the argument. If the caller of functions::get() is the SELECT clause preparation, then the cast will succeed and we'll be able to find the type. If not, we fail (and fall back to inferring the argument types from a non-generic function signature). Since we're about to move selectors to expressions, the dynamic_cast will fail, so we must replace it with a less fragile approach. The fix is to augment assignment_testable (the interface representing a function argument) with an intentionally-awkwardly-named assignment_testable_type_opt(), that sees whether we happen to know the type for the argument in order to implement signature-from-argument inference. A note about assignment_testable: this is a bridge interface that is the least common denominator of anything that calls functions. Since we're moving towards expressions, there are fewer implementations of the interface as the code evolves.	2023-06-13 21:04:49 +03:00
Avi Kivity	5983e9e7b2	cql3: test_assignment: pass optional schema everywhere test_assignment() and related functions check for type compatibility between a right-hand-side and a left-hand-side. It started its life with a limited functionality for INSERT and UPDATE, but now it's about to be used for cast expression in selectors, which can cast a column_value. A column_value is still an unresolved_identifier during the prepare phase, and cannot be resolved without a schema. To prepare for this, pass an optional schema everywhere. Ultimately, test_assignment likely needs to be folded into prepare_expr(), but before that prepare_expr() has to be used everywhere.	2023-06-13 21:04:49 +03:00
Nadav Har'El	3b2c87a82b	cql: fix column name in writetime() error message Found and fixed yet another place where an error message prints a column name as "bytes" type which causes it to be printed as hexadecimal codes instead of the actual characters of the name. The specific error message fixed here is "Cannot use selection function writeTime on PRIMARY KEY part k" which happens when you try to use writetime() or ttl() on a key column (which isn't allowed today - see issue #14019). Before this patch we got "6b" in the error message instead of "k". The patch also includes a regression test that verifies that this error condition is recognized and the real name of the column is printed. This test fails before this patch, and passes after it. As usual, the test also passes on Cassandra. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14021	2023-05-24 19:28:44 +03:00
Avi Kivity	42a1ced73b	cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt The expression system uses managed_bytes_opt for values, but result_set uses bytes_opt. This means that processing values from the result set in expressions requires a copy. Out of the two, managed_bytes_opt is the better choice, since it prevents large contiguous allocations for large blobs. So we switch result_set to use managed_bytes_opt. Users of the result_set API are adjusted. The db::function interface is not modified to limit churn; instead we convert the types on entry and exit. This will be adjusted in a following patch.	2023-05-07 17:17:36 +03:00
Jan Ciolek	be8ef63bf5	cql3: remove expr::token Let's remove expr::token and replace all of its functionality with expr::function_call. expr::token is a struct whose job is to represent a partition key token. The idea is that when the user types in `token(p1, p2) < 1234`, this will be internally represented as an expression which uses expr::token to represent the `token(p1, p2)` part. The situation with expr::token is a bit complicated. On the one hand it's supposed to represent the partition token, but sometimes it's also assumed that it can represent a generic call to the token() function, for example `token(1, 2, 3)` could be a function_call, but it could also be expr::token. The query planning code assumes that each occurrence of expr::token represents the partition token without checking the arguments. Because of this, allowing `token(1, 2, 3)` to be represented as expr::token is dangerous - the query planning might think that it is `token(p1, p2, p3)` and plan the query based on this, which would be wrong. Currently expr::token is created only in one specific case. When the parser detects that the user typed in a restriction which has a call to `token` on the LHS it generates expr::token. In all other cases it generates an `expr::function_call`. Even when the `function_call` represents a valid partition token, it stays a `function_call`. During preparation there is no check to see if a `function_call` to `token` could be turned into `expr::token`. This is a bit inconsistent - sometimes `token(p1, p2, p3)` is represented as `expr::token` and the query planner handles that, but sometimes it might be represented as `function_call`, which the query planner doesn't handle. There is also a problem because there's a lot of duplication between a `function_call` and `expr::token`. All of the evaluation and preparation is the same for `expr::token` as it is for a `function_call` to the token function.
Currently it's impossible to evaluate `expr::token` and preparation has some flaws, but implementing it would basically consist of copy-pasting the corresponding code from token `function_call`. One more aspect is multi-table queries. With `expr::token` we turn a call to the `token()` function into a struct that is schema-specific. What happens when a single expression is used to make queries to multiple tables? The schema is different, so something that is represented as `expr::token` for one schema would be represented as `function_call` in the context of a different schema. Translating expressions to different tables would require careful manipulation to convert `expr::token` to `function_call` and vice versa. This could cause trouble for index queries. Overall I think it would be best to remove expr::token. Although having a clear marker for the partition token is sometimes nice for query planning, in my opinion the pros are outweighed by the cons. I'm a big fan of having a single way to represent things; having two separate representations of the same thing without clear boundaries between them causes trouble. Instead of having expr::token and function_call we can just have the function_call and check if it represents a partition token when needed. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-04-29 13:11:31 +02:00
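The closing idea of be8ef63bf5, keeping only `function_call` and checking on demand whether it denotes the partition token, might look roughly like the sketch below. The simplified `function_call` and `schema` structs and the helper `is_partition_token` are assumptions made for this illustration; the real expression and schema types are considerably richer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for illustration.
struct function_call {
    std::string name;
    std::vector<std::string> args;  // column names or literals, as text
};

struct schema {
    std::vector<std::string> partition_key_columns;
};

// True only when the call is token() applied to exactly the partition key
// columns of this schema, in order. token(1, 2, 3) is a valid function call
// but does not represent the partition token.
bool is_partition_token(const function_call& fc, const schema& s) {
    return fc.name == "token" && fc.args == s.partition_key_columns;
}
```

Because the check takes the schema as a parameter, the same expression can be classified differently for different tables, which is the multi-table concern raised in the message.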
Avi Kivity	9fb5443f87	cql3: abstract_function_selector: use small_vector for argument buffer abstract_function_selector uses a preallocated vector to store the arguments to aggregate functions, to prevent an allocation for every row. Use small_vector to prevent an allocation per query, if the number of arguments happens to be small. This isn't expected to make a significant performance difference.	2023-04-19 20:42:25 +03:00
Avi Kivity	3e0aacc8b5	db, cql3: functions: pass function parameters as a span instead of a vector Spans are more flexible and can be constructed from any contiguous container (such as small_vector), or a subrange of such a container. This can save allocations, so change the signature to accept a span. Spans cannot be constructed from std::initializer_list, so one such call site is changed to construct a span directly from the single argument.	2023-04-19 20:38:55 +03:00
Nadav Har'El	59ab9aac44	Merge 'functions: reframe aggregate functions in terms of scalar functions' from Avi Kivity Currently, aggregate functions are implemented in a stateful manner. The accumulator is stored internally in an aggregate_function::aggregate, requiring each query to instantiate new instances (see aggregate_function_selector's constructor, and note how it's called from selector::new_instance()). This makes aggregates hard to use in expressions, since expressions are stateless (with state only provided to evaluate()). To facilitate migration towards stateless expressions, we define a stateless_aggregate_function (modeled after user-defined aggregates, which are already stateless). This new struct defines the aggregate in terms of three scalar functions: one to aggregate a new input into an accumulator (provided in the first parameter), one to finalize an accumulator into a result, and one to reduce two accumulators for parallelized aggregation. All existing native aggregate functions are converted to the new model, and the old interface is removed. This series does not yet convert selectors to expressions, but it does remove one of the obstacles. Performance evaluation: I created a table with a million ints on a single-node cluster, and ran the avg() function on them. I measured the number of instructions executed with `perf stat -p $(pgrep scylla) -e instructions` while the query was running. The query executed from cache, memtables were flushed beforehand. The instruction count per row increased from roughly 49k to roughly 52k, indicating 3k extra instructions per row. While 3k instructions to execute a function is huge, it is currently dwarfed by other overhead (and will be even less important in a cluster where CL>1 will cause non-coordinator code to run multiple times).
Closes #13105 * github.com:scylladb/scylladb: cql3/selection, forward_service: use stateless_aggregate_function directly db: functions: fold stateless_aggregate_function_adapter into aggregate_function cql3: functions: simplify accumulator_for template cql3: functions: base user-defined aggregates on stateless aggregates cql3: functions: drop native_aggregate_function cql3: functions: reimplement count(column) statelessly cql3: functions: reimplement avg() statelessly cql3: functions: reimplement sum() statelessly cql3: functions: change wide accumulator type to varint cql3: functions: unreverse types for min/max cql3: functions: rename make_{min,max}_dynamic_function cql3: functions: reimplement min/max statelessly cql3: functions: reimplement count(*) statelessly cql3: functions: simplify creating native functions even more cql3: functions: add helpers for automating marshalling for scalar functions types: fix big_decimal constructor from literal 0 cql3: functions: add helper class for internal scalar functions db: functions: add stateless aggregate functions db, cql3: move scalar_function from cql3/functions to db/functions	2023-03-30 13:58:47 +03:00
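The three-function model described in 59ab9aac44 (aggregate an input into an accumulator, finalize an accumulator into a result, reduce two accumulators) can be sketched with a toy integer avg(). The `stateless_aggregate` template and `make_avg` below are hypothetical and only borrow the idea; they are not the ScyllaDB `stateless_aggregate_function` struct, whose fields and value types are not shown in this log.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical model: the aggregate owns no accumulator; the caller holds
// the state and passes it through these functions.
template <typename Acc, typename In, typename Out>
struct stateless_aggregate {
    Acc initial;                             // starting accumulator value
    std::function<Acc(Acc, In)> aggregate;   // fold one input into an accumulator
    std::function<Out(Acc)> finalize;        // turn an accumulator into a result
    std::function<Acc(Acc, Acc)> reduce;     // merge two partial accumulators
};

using avg_acc = std::pair<long, long>;  // (sum, count)

// Integer average expressed in the three-function form.
stateless_aggregate<avg_acc, long, long> make_avg() {
    return {
        avg_acc{0, 0},
        [](avg_acc acc, long v) { return avg_acc{acc.first + v, acc.second + 1}; },
        [](avg_acc acc) { return acc.second ? acc.first / acc.second : 0L; },
        [](avg_acc a, avg_acc b) { return avg_acc{a.first + b.first, a.second + b.second}; },
    };
}
```

Two partial accumulators, for example from two shards, can be combined with `reduce` before `finalize` is applied once, which is what makes parallelized aggregation possible in this model.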
Avi Kivity	6977df5539	cql3/selection, forward_service: use stateless_aggregate_function directly Now that stateless_aggregate_function is directly exposed by aggregate_function, we can use it directly, avoiding the intermediary aggregate_function::aggregate, which is removed.	2023-03-28 23:49:34 +03:00
Kefu Chai	c37f4e5252	treewide: use fmt::join() when appropriate now that fmtlib provides fmt::join(). see https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view there is no need to reinvent the wheel. so in this change, the homebrew join() is replaced with fmt::join(). as fmt::join() returns a join_view, this could improve the performance under certain circumstances where the fully materialized string is not needed. please note, the goal of this change is to use fmt::join(), and this change does not intend to improve the performance of existing implementation based on "operator<<" unless the new implementation is much more complicated. we will address the unnecessarily materialized strings in a follow-up commit. some noteworthy things related to this change: * unlike the existing `join()`, `fmt::join()` returns a view. so we have to materialize the view if what we expect is a `sstring` * `fmt::format()` does not accept a view, so we cannot pass the return value of `fmt::join()` to `fmt::format()` * fmtlib does not format a typed pointer, i.e., it does not format, for instance, a `const std::string*`. but operator<<() always prints a typed pointer. so if we want to format a typed pointer, we either need to cast the pointer to `void*` or use `fmt::ptr()`. * fmtlib is not able to pick up the overload of `operator<<(std::ostream& os, const column_definition* cd)`, so we have to use a wrapper class of `maybe_column_definition` for printing a pointer to `column_definition`. since the overload is only used by the two overloads of `statement_restrictions::add_single_column_parition_key_restriction()`, the operator<< for `const column_definition*` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 20:34:18 +08:00


238 Commits