scylladb

Author	SHA1	Message	Date
Kefu Chai	3e84d43f93	treewide: use seastar::format() or fmt::format() explicitly before this change, we rely on `using namespace seastar` to use `seastar::format()` without qualifying the `format()` with its namespace. this works fine until we changed the parameter type of format string `seastar::format()` from `const char*` to `fmt::format_string<...>`. this change practically invited `seastar::format()` to the club of `std::format()` and `fmt::format()`, where all members accept a templated parameter as its `fmt` parameter. and `seastar::format()` is not the best candidate anymore. although argument-dependent lookup (ADL for short) favors the function which is in the same namespace as its parameter, `using namespace` makes `seastar::format()` more competitive, so both `std::format()` and `seastar::format()` are considered as the candidates. that is what is happening in scylladb at quite a few call sites of `format()`, hence ADL is not able to tell which function is the winner in the name lookup: ``` /__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous 265 \| return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id()); \| ^~~~~~ /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 4290 \| format(format_string<_Args...> __fmt, _Args&&... __args) \| ^ /__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 143 \| format(fmt::format_string<A...> fmt, A&&... a) { \| ^ ``` in this change, we change all `format()` to either `fmt::format()` or `seastar::format()` with following rules: - if the caller expects an `sstring` or `std::string_view`, change to `seastar::format()` - if the caller expects an `std::string`, change to `fmt::format()`. because, `sstring::operator std::basic_string` would incur a deep copy. we will need another change to enable scylladb to compile with the latest seastar. namely, to pass the format string as a templated parameter down to helper functions which format their parameters. to minimize the scope of this change, let's include that change when bumping up the seastar submodule. as that change will depend on the seastar change. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-09-11 23:21:40 +03:00
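The ambiguity described in the commit above can be reproduced without Seastar. The following is a minimal sketch with two hypothetical namespaces, `lib_a` and `lib_b` (neither is from the commit): one `format()` is reached through a using-directive, the other through argument-dependent lookup, and qualifying the call resolves the conflict.

```cpp
#include <string>

namespace lib_a {
// A type whose namespace also happens to provide a format() function.
struct name { std::string value; };

template <typename T>
std::string format(const char* prefix, const T& arg) {
    return std::string("lib_a:") + prefix + arg.value;
}
} // namespace lib_a

namespace lib_b {
template <typename T>
std::string format(const char* prefix, const T& arg) {
    return std::string("lib_b:") + prefix + arg.value;
}
} // namespace lib_b

// The using-directive makes lib_b::format visible to unqualified lookup.
using namespace lib_b;

inline std::string describe(const lib_a::name& n) {
    // return format("id=", n);  // error: call to 'format' is ambiguous
    //   lib_b::format is found through the using-directive,
    //   lib_a::format is found through argument-dependent lookup,
    //   and neither template is a better match than the other.
    return lib_b::format("id=", n);  // qualifying the name picks one explicitly
}
```

Qualifying every call site, as the commit does, keeps the choice of function independent of which headers and using-directives happen to be in scope.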
Avi Kivity	d69bf4f010	cql3: introduce dialect infrastructure A dialect is a different way to interpret the same CQL statement. Examples: - how duplicate bind variable names are handled (later in this series) - whether `column = NULL` in LWT can return true (as is now) or whether it always returns NULL (as in SQL) Currently, dialect is an empty structure and will be filled in later. It is passed to query_processor methods that also accept a CQL string, and from there to the parser. It is part of the prepared statement cache key, so that if the dialect is changed online, previous parses of the statement are ignored and the statement is prepared again. The patch is careful to pick up the dialect at the entry point (e.g. CQL protocol server) so that the dialect doesn't change while a statement is parsed, prepared, and cached.	2024-08-29 21:19:23 +03:00
Botond Dénes	46563d719f	replica/mutation_dump: enforce pinning of effective replication map By making it a required argument, we make sure the topology version is pinned for the duration of the query. This is needed because mutation dump queries bypass the storage proxy, where this pinning usually takes place. So it has to be enforced here.	2024-08-22 06:24:06 -04:00
Avi Kivity	3de4e8f91b	Merge 'cql: process LIMIT for GROUP BY select queries' from Paweł Zakrzewski This change fixes #17237, fixes #5361 and fixes #5362 by passing the limit value down the call chain in cql3. A test is also added. fixes #17237 fixes #5361 fixes #5362 The regression happened in 5.4 as we changed the way GROUP BY is processed in `432cb02` - to force aggregation when it is used. The LIMIT value was not passed to aggregations and thus we failed to adhere to it. We want to backport this fix to 5.4 and 6.0 to have continuous correct results for the test case from #17237 This patch consists of 4 commits: - fa4225ea0fac2057b7a9976f57dc06bcbd900cd4 - cql3: respect the user-defined page size in aggregate queries - a precondition for this patch to be implementable - 8fbe69e74dca16ed8832d9a90489ca47ba271d0b - cql3/select_statement: simplify the get_limit function - the `do_get_limit()` function did a lot of legwork that should not be associated with it. This change makes it trivial and makes its callers do additional checks (for unset guards, or for an aggregate query) - 162828194a2b88c22fbee335894ff045dcc943c9 - cql3: process LIMIT for GROUP BY queries - pass the limit value down the chain and make use of it. This is the actual fix to #17237 - b3dc6de6d6cda8f5c09b01463bb52f827a6a00b4 - test/cql-pytest: Add test for GROUP BY queries with LIMIT - tests Closes scylladb/scylladb#18842 * github.com:scylladb/scylladb: test/cql-pytest: Add test for GROUP BY queries with LIMIT cql3: process LIMIT for GROUP BY queries cql3/select_statement: simplify the get_limit function cql3: respect the user-defined page size in aggregate queries	2024-08-14 17:54:59 +03:00
Łukasz Paszkowski	15a01c7111	select_statement::do_execute: Add tracing information Add information on table and query schema versions to tracing.	2024-08-13 10:07:12 +02:00
Łukasz Paszkowski	158b994676	query::trim_clustering_row_ranges_to: require reversed schema for native reversed ranges Simplify implementation and for clustering key ranges in native reversed format, require a reversed table schema. Trimming native reversed clustering key ranges requires a reversed schema to be passed in. Thus, the reverse flag is no longer required as it would always be set to false.	2024-08-13 10:07:10 +02:00
Łukasz Paszkowski	309ba68692	select_statement: Execute reversed query in native format Use a reversed schema and a native reversed slice when constructing a read_command and executing a reversed select statement. Such a created read_command is passed further down to query_pagers::pager and storage::proxy::query_result that transform it to the format they accept/know, i.e. legacy.	2024-08-13 10:03:46 +02:00
Paweł Zakrzewski	e7ae7f3662	cql3: process LIMIT for GROUP BY queries Currently LIMIT is not passed to the query executor at all and it was just an accident that it worked for the case referenced in #17237. This change passes the limit value down the chain.	2024-08-11 09:08:43 +02:00
Paweł Zakrzewski	3838ad64b3	cql3/select_statement: simplify the get_limit function The get_limit() function performed tasks outside of its scope - for example checked if the statement was an aggregate. This change moves the onus of the check to the caller.	2024-08-11 09:08:43 +02:00
Paweł Zakrzewski	08f3219cb8	cql3: respect the user-defined page size in aggregate queries The comment in the code already states that we should use the user-defined page size if it's provided. To avoid OOM conditions we'll use the internally defined limit as the upper bound or if no page size is provided. This change lays ground work for fixing #5362 and is necessary to pass the test introduced in #19392 once it is implemented.	2024-08-11 09:08:43 +02:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
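An assertion macro that stays active regardless of NDEBUG can be sketched as follows. This is not the actual contents of utils/assert.hh; `ALWAYS_ASSERT`, `always_assert_fail` and `checked_divide` are hypothetical names used only for illustration.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical failure handler: report the failed expression and abort.
[[noreturn]] inline void always_assert_fail(const char* expr, const char* file, int line) {
    std::fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
    std::abort();
}

// Unlike assert(), this macro does not consult NDEBUG, so the check
// is compiled in for every build mode.
#define ALWAYS_ASSERT(expr) \
    do { \
        if (!(expr)) { \
            always_assert_fail(#expr, __FILE__, __LINE__); \
        } \
    } while (false)

inline int checked_divide(int a, int b) {
    ALWAYS_ASSERT(b != 0);  // still evaluated when NDEBUG is defined
    return a / b;
}
```

With a macro of this shape, defining NDEBUG later only affects plain `assert()` calls in third-party code, not the checks that were written with the expectation of being enabled in release mode.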
Avi Kivity	3fc4e23a36	forward_service: rename to mapreduce_service forward_service is nondescriptive and misnamed, as it does more than forward requests. It's a classic map/reduce algorithm (and in fact one of its parameters is "reducer"), so name it accordingly. The name "forward" leaked into the wire protocol for the messaging service RPC isolation cookie, so it's kept there. It's also maintained in the name of the logger (for "nodetool setlogginglevel") for compatibility with tests. Closes scylladb/scylladb#19444	2024-07-03 19:29:47 +03:00
Michał Jadwiszczak	e9ace7c203	cql3/select_statement: do not parallelize single-partition aggregations Currently reads with WHERE clause which limits them to be single-partition reads, are unnecessarily parallelized. This commit checks this condition and the query doesn't use forward_service in single-partition reads.	2024-06-18 19:21:32 +02:00
Tomasz Grabiec	c9294b1642	lwt: Avoid deprecated sharder::shard_of() Instead, use shard_for_reads(). The justification is that: 1) In cas_shard(), we need to pick a single request coordinator. shard_for_reads() gives that, which is equivalent to shard_of() if there is no intra-node migration. 2) In paxos handler for prepare(), the shard we execute it on is the shard from which we read, so shard_for_reads() is the one. 3) Updates of paxos state are separate CQL requests, and use their own sharding. 4) Handler for learn is executing updates using calls to storage_proxy::mutate_locally() which will use the right sharder for writes However, the code is still not prepared for intra-node migration, and possibly regular migration too in case of abandoned requests, because the locking of paxos state assumes that the shard is static. That would have to be fixed separately, e.g. by locking both shards (shard_for_writes()) during migration, so that the set of locked shards always intersects during migration and local serialization of paxos state updates is achieved. I left FIXMEs for that.	2024-05-16 00:28:47 +02:00
Pavel Emelyanov	1612aa01ca	cql3: Reserve vector with pk columns When constructing a vector with partition key data, the size of that vector is known beforehand Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18239	2024-04-16 07:06:07 +03:00
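The idea of reserving a vector whose final size is known in advance can be illustrated with a small sketch. The helper `collect_pk_values` and its placeholder values are hypothetical and do not mirror the actual cql3 code.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: produce one value per partition-key column.
inline std::vector<std::string> collect_pk_values(const std::vector<std::string>& columns) {
    std::vector<std::string> values;
    // The number of elements is known up front, so a single allocation
    // replaces the repeated growth that push_back would otherwise trigger.
    values.reserve(columns.size());
    for (const auto& c : columns) {
        values.push_back("value_of_" + c);
    }
    return values;
}
```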
Kefu Chai	15d59db98b	cql3: select_statement: include <ranges> we should include used header, to avoid compilation failures like: ``` cql3/statements/select_statement.cc:229:79: error: no member named 'filter' in namespace 'std::ranges::views' for (const auto& used_function : used_functions \| std::ranges::views::filter(not_native)) { ~~~~~~~~~~~~~~~~~~~~^ 1 error generated. ``` if one of the included headers drops its own `#include <optional>`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18145	2024-04-02 18:47:54 +03:00
Botond Dénes	c228e4d518	cql3: select_statement: mutation_fragments_select_statement: fix use-after-return Don't capture stack variables by reference... it can (and will) explode in your face.	2024-02-28 06:48:09 -05:00
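The class of bug named in the commit above (a closure that outlives a stack variable it captured by reference) can be shown in a minimal sketch. `make_labeler` is a hypothetical function, not code from the commit.

```cpp
#include <functional>
#include <string>
#include <utility>

// Buggy shape (do not use): the lambda outlives `prefix`, so the
// captured reference dangles as soon as the function returns.
//   std::function<std::string(int)> make_labeler_buggy(std::string prefix) {
//       return [&prefix](int i) { return prefix + std::to_string(i); };
//   }

// Safer shape: move the string into the closure so it owns its state.
inline std::function<std::string(int)> make_labeler(std::string prefix) {
    return [prefix = std::move(prefix)](int i) { return prefix + std::to_string(i); };
}
```

The same reasoning applies to continuations and coroutines that run after the enclosing frame is gone: state they need must be captured by value or otherwise kept alive.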
Kefu Chai	2dbf044b91	cql3: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16791	2024-01-16 16:43:17 +02:00
Kamil Braun	d93074e87e	cql3: don't parallelize select aggregates to local tables We've observed errors during shutdown like the following: ``` ERROR 2023-12-26 17:36:17,413 [shard 0:main] raft - [088f01a3-a18b-4821-b027-9f49e55c1926] applier fiber stopped because of the error: std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1230): std::runtime_error (forward_service is shutting down) INFO 2023-12-26 17:36:17,413 [shard 0:strm] storage_service - raft_state_monitor_fiber aborted with raft::stopped_error (Raft instance is stopped) ERROR 2023-12-26 17:36:17,413 [shard 0:strm] storage_service - raft topology: failed to fence previous coordinator raft::stopped_error (Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1230): std::runtime_error (forward_service is shutting down)") ``` some CQL statement execution was trying to use `forward_service` during shutdown. It turns out that the statement is in `system_keyspace::load_topology_state`: ``` auto gen_rows = co_await execute_cql( format("SELECT count(range_end) as cnt FROM {}.{} WHERE key = '{}' AND id = ?", NAME, CDC_GENERATIONS_V3, cdc::CDC_GENERATIONS_V3_KEY), gen_uuid); ``` It's querying a table in the `system` keyspace. Pushing local table queries through `forward_service` doesn't make sense as the data is not distributed. Excluding local tables from this logic also fixes the shutdown error. Fixes scylladb/scylladb#16570 Closes scylladb/scylladb#16662	2024-01-08 14:44:22 -05:00
Sylwia Szunejko	91a5a41313	add a way to negotiate generation of the tablet info for drivers Tablets metadata is quite expensive to generate (each data_value is an allocation), so an old driver (without support for tablets) will generate huge amounts of such notifications. This commit adds a way to negotiate generation of the notification: a new driver will ask for them, and an old driver won't get them. It uses the OPTIONS/SUPPORTED/STARTUP protocol described in native_protocol_v4.spec. Closes scylladb/scylladb#16611	2024-01-02 20:00:50 +02:00
Nadav Har'El	fc71c34597	Merge 'select statement: verify EXECUTE permissions only for non native functions' from Eliran Sinvani Commit `62458b8e4f` introduced the enforcement of EXECUTE permissions of functions in cql select. However, according to the reference in #12869, the permissions should be enforced only on UDFs and UDAs. The code does not distinguish between the two so the permissions are also unintentionally enforced on native functions. This commit introduces the distinction and only enforces the permissions on non native functions. Fixes #16526 Manually verified (before and after change) with the reproducer supplied in #16526 and also with the `min` and `max` native functions. Also added test that checks for regression on native functions execution and verified that it fails on authorization before the fix and passes after the fix. Closes scylladb/scylladb#16556 * github.com:scylladb/scylladb: test.py: Add test for native functions permissions select statement: verify EXECUTE permissions only for non native functions	2023-12-26 18:14:21 +02:00
Eliran Sinvani	cac79977d6	select statement: verify EXECUTE permissions only for non native functions Commit `62458b8e4f` introduced the enforcement of EXECUTE permissions of functions in cql select. However, according to the reference in #12869, the permissions should be enforced only on UDFs and UDAs. The code does not distinguish between the two so the permissions are also unintentionally enforced on native functions. This commit introduces the distinction and only enforces the permissions on non native functions. Fixes #16526 Manually verified (before and after change) with the reproducer supplied in #16526 and also with the `min` and `max` native functions. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2023-12-26 10:27:04 +02:00
Nadav Har'El	1aea2136c8	cql: fix regression in SELECT * GROUP BY Recently, the expression-rewrite effort changed the way that GROUP BY is implemented. Usually GROUP BY involves an aggregation function (e.g., if you want a separate SUM per partition). But there's also a query like SELECT p, c1, c2, v FROM tbl GROUP BY p This query is supposed to return one row - the first row in clustering order - per group (in this case, partition). The expression rewrite re-implemented this feature by introducing a new internal aggregator, first(), which returns the first aggregated value. The above query is rewritten into: SELECT first(p), first(c1), first(c2), first(v) FROM tbl GROUP BY p This case works correctly, and we even have a regression test for it. But unfortunately the rewrite broke the following query: SELECT * FROM tbl GROUP BY p Note the "*" instead of the explicit list of columns. In our implementation, a selection of "*" looks like an empty selection, and it didn't get the "first()" treatment and it remained a "SELECT *" - and wrongly returned all rows instead of just the first one in each partition. This was a regression - it worked correctly in Scylla 5.2 (and also in Cassandra) - see the next patch for a regression test. In this patch we fix this regression. When there is a GROUP BY, the "*" is rewritten to the appropriate list of all visible columns and then gets the first() treatment, so it will return only the first row as expected. The next patch will be a test that confirms the bug and its fix. Fixes #16531 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-12-25 17:52:57 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
sylwiaszunejko	54f22927a3	cql3: send tablet if wrong node/shard is used during select statement	2023-11-22 09:23:43 +01:00
Avi Kivity	ef7db6df99	Merge 'schema_tables: turn view schema fixing code into a sanity check' from Kamil Braun The purpose of `maybe_fix_legacy_secondary_index_mv_schema` was to deal with legacy materialized view schemas used for secondary indexes, schemas which were created before the notion of "computed columns" was introduced. Back then, secondary index schemas would use a regular "token" column. Later it became a computed column and old schemas would be migrated during rolling upgrade. The migration code was introduced in 2019 (`db8d4a0cc6`) and then fixed in 2020 (`d473bc9b06`). The fix was present in Enterprise 2022.1 and in OSS 4.5. So, assuming that users don't try crazy things like upgrading from 2021.X to 2023.X (which we do not support), all clusters will have already executed the migration code once they upgrade to 2023.X, meaning we can get rid of it. The main motivation of this PR is to get rid of the `db::schema_tables::merge_schema` call in `parse_schema_tables`. In Raft mode this was the only call to `merge_schema` outside "group 0 code" and in fact it is unsafe -- it uses locally generated mutations with locally generated timestamp (`api::new_timestamp()`), so if we actually did it, we would permanently diverge the group 0 state machine across nodes (the schema pulling code is disabled in Raft mode). Fortunately, this should be dead code by now, as explained in the previous paragraph. The migration code is now turned into a sanity check, if the users try something crazy, they will get an error instead of silent data corruption. Closes scylladb/scylladb#15695 * github.com:scylladb/scylladb: view: remove unused `_backing_secondary_index` schema_tables: turn view schema fixing code into a sanity check schema_tables: make comment more precise feature_service: make COMPUTED_COLUMNS feature unconditionally true	2023-10-31 13:23:19 +02:00
Kamil Braun	3976808b12	schema_tables: turn view schema fixing code into a sanity check The purpose of `maybe_fix_legacy_secondary_index_mv_schema` was to deal with legacy materialized view schemas used for secondary indexes, schemas which were created before the notion of "computed columns" was introduced. Back then, secondary index schemas would use a regular "token" column. Later it became a computed column and old schemas would be migrated during rolling upgrade. The migration code was introduced in 2019 (`db8d4a0cc6`) and then fixed in 2020 (`d473bc9b06`). The fix was present in Enterprise 2022.1 and in OSS 4.5. So, assuming that users don't try crazy things like upgrading from 2021.X to 2023.X (which we do not support), all clusters will have already executed the migration code once they upgrade to 2023.X, meaning we can get rid of it. The main motivation of this patch is to get rid of the `db::schema_tables::merge_schema` call in `parse_schema_tables`. In Raft mode this was the only call to `merge_schema` outside "group 0 code" and in fact it is unsafe -- it uses locally generated mutations with locally generated timestamp (`api::new_timestamp()`), so if we actually did it, we would permanently diverge the group 0 state machine across nodes (the schema pulling code is disabled in Raft mode). Fortunately, this should be dead code by now, as explained in the previous paragraph. The migration code is now turned into a sanity check, if the users try something crazy, they will get an error instead of silent data corruption.	2023-10-24 13:33:35 +02:00
Botond Dénes	23898581d5	cql3: mutation_fragments_select_statement: use host_id instead of node The statement only uses the node to get its host_id later. Simpler to obtain and store only the host_id in the first place.	2023-10-24 03:12:58 -04:00
Botond Dénes	3cb1669340	cql3: mutation_fragments_select_statement: pin erm reference This query bypasses the usual read-path in storage-proxy and therefore also misses the erm pinning done by storage-proxy. To avoid a vnode being pulled from under its feet, do the erm pinning in the statement itself.	2023-10-24 03:12:36 -04:00
Gleb Natapov	4ffc39d885	cql3: Extend the scope of group0_guard during DDL statement execution Currently we hold group0_guard only during DDL statement's execute() function, but unfortunately some statements access underlying schema state also during check_access() and validate() calls which are called by the query_processor before it calls execute. We need to cover those calls with group0_guard as well and also move retry loop up. This patch does it by introducing new function to cql_statement class take_guard(). Schema altering statements return group0 guard while others do not return any guard. Query processor takes this guard at the beginning of a statement execution and retries if service::group0_concurrent_modification is thrown. The guard is passed to the execute in query_state structure. Fixes: #13942 Message-ID: <ZNsynXayKim2XAFr@scylladb.com>	2023-08-17 15:52:48 +03:00
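The control flow described above (take a guard before any schema state is read, run the checks and the statement under it, retry on a concurrent-modification error) can be sketched as follows. Everything here is hypothetical: `guard`, `concurrent_modification`, `execute_with_retry` and the `max_attempts` bound are illustrative stand-ins, not the query_processor API, and the source does not say whether the real retry loop is bounded.

```cpp
#include <stdexcept>

// Hypothetical stand-in for service::group0_concurrent_modification.
struct concurrent_modification : std::runtime_error {
    concurrent_modification() : std::runtime_error("concurrent modification") {}
};

// Hypothetical token representing exclusive access to schema state.
struct guard {};

// take_guard() returns a guard; run(guard&) is assumed to perform
// check_access(), validate() and execute() under that guard.
template <typename TakeGuard, typename Run>
auto execute_with_retry(TakeGuard take_guard, Run run, int max_attempts) {
    for (int attempt = 1; ; ++attempt) {
        guard g = take_guard();   // taken before any schema state is read
        try {
            return run(g);
        } catch (const concurrent_modification&) {
            if (attempt >= max_attempts) {
                throw;            // assumption: give up after a fixed number of tries
            }
        }
    }
}
```

Taking a fresh guard on every attempt means each retry sees the schema state left by whichever modification won the previous race.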
Avi Kivity	d57a951d48	Revert "cql3: Extend the scope of group0_guard during DDL statement execution" This reverts commit `70b5360a73`. It generates a failure in group0_test.test_concurrent_group0_modifications in debug mode with about 4% probability. Fixes #15050	2023-08-15 00:26:45 +03:00
Gleb Natapov	70b5360a73	cql3: Extend the scope of group0_guard during DDL statement execution Currently we hold group0_guard only during DDL statement's execute() function, but unfortunately some statements access underlying schema state also during check_access() and validate() calls which are called by the query_processor before it calls execute. We need to cover those calls with group0_guard as well and also move retry loop up. This patch does it by introducing new function to cql_statement class take_guard(). Schema altering statements return group0 guard while others do not return any guard. Query processor takes this guard at the beginning of a statement execution and retries if service::group0_concurrent_modification is thrown. The guard is passed to the execute in query_state structure. Fixes: #13942 Message-ID: <ZNSWF/cHuvcd+g1t@scylladb.com>	2023-08-13 14:19:39 +03:00
Avi Kivity	4f7e83a4d0	cql3: select_statement: reject DISTINCT with GROUP BY on clustering keys While in SQL DISTINCT applies to the result set, in CQL it applies to the table being selected, and doesn't allow GROUP BY with clustering keys. So reject the combination like Cassandra does. While this is not an important issue to fix, it blocks un-xfailing other issues, so I'm clearing it ahead of fixing those issues. An issue is unmarked as xfail, and other xfails lose this issue as a blocker. Fixes #12479 Closes #14970	2023-08-07 15:35:59 +03:00
Botond Dénes	6458ff9917	cql3/statements: wire-in mutation_fragments_select_statement This commit contains all the changes required to wire-in the new select from mutation_fragment() statement.	2023-07-19 01:28:28 -04:00
Botond Dénes	0b6b00178e	cql3/statments/select_statement: add mutation_fragments_select_statement Not wired in yet. SELECT * FROM MUTATION_FRAGMENTS($table) is a new select statement sub-type, which allows dumping the underlying mutations making up the data of a given table. The output of this statement is mutation-fragments presented as CQL rows. Each row corresponds to a mutation-fragment. Consequently, the output of this statement has a schema that is different than that of the underlying table. Data is always read from the local replica, on which the query is executed. Migrating queries between coordinators is not allowed.	2023-07-19 01:28:28 -04:00
Avi Kivity	0f59b17056	cql3: select_statement: don't copy metadata object needlessly It's a shared_ptr<const metadata>, so it's safe to pass around. perf-simple-query: before: 211989.40 tps ( 62.1 allocs/op, 13.1 tasks/op, 43812 insns/op, 0 errors) 217889.09 tps ( 62.1 allocs/op, 13.1 tasks/op, 43713 insns/op, 0 errors) 211418.75 tps ( 62.1 allocs/op, 13.1 tasks/op, 43782 insns/op, 0 errors) 217388.46 tps ( 62.1 allocs/op, 13.1 tasks/op, 43733 insns/op, 0 errors) 211528.74 tps ( 62.1 allocs/op, 13.1 tasks/op, 43766 insns/op, 0 errors) after: 215241.86 tps ( 61.1 allocs/op, 13.1 tasks/op, 43563 insns/op, 0 errors) 216172.41 tps ( 61.1 allocs/op, 13.1 tasks/op, 43562 insns/op, 0 errors) 212591.73 tps ( 61.1 allocs/op, 13.1 tasks/op, 43586 insns/op, 0 errors) 212217.28 tps ( 61.1 allocs/op, 13.1 tasks/op, 43553 insns/op, 0 errors) 215863.47 tps ( 61.1 allocs/op, 13.1 tasks/op, 43559 insns/op, 0 errors) About 200 instructions saved. Closes #14499	2023-07-04 16:41:51 +03:00
Avi Kivity	66c47d40e6	cql3: selection: drop selector_factories, selectables, and selectors The whole class hierarchy is no longer used by anything and we can just delete it.	2023-07-03 19:45:17 +03:00
Avi Kivity	d9cf81f1a6	cql3: select_statement: stop using selector_factories in SELECT JSON SELECT JSON uses selector_factories to obtain the names of the fields to insert into the json object, and we want to drop selector_factories entirely. Switch instead to the ":metadata" mode of printing expressions, which does what we want. Unfortunately, the switch changes how system functions are converted into field names. A function such as unixtimestampof() is now rendered as "system.unixtimestampof()"; before it did not have the keyspace prefix. This is a compatibility problem, albeit an obscure one. Since the new behavior matches Cassandra, and the odds of hitting this are very low, I think we can allow the change.	2023-07-03 19:45:17 +03:00
Avi Kivity	27254c4f50	cql3: selection, select_statement: fine tune add_column_for_post_processing() usage In three cases we need to consult a column that's possibly not explicitly selected: - for the WHERE clause - for GROUP BY - for ORDER BY The return value of the function is the index where the newly-added column can be found. Currently, the index is correct for both the internal column vector and the result set, but soon it won't be. In the first two cases (WHERE clause and GROUP BY), we're interested in the column before grouping, in the last case (ORDER BY) we're interested in the column after grouping, so we need to distinguish between the two. Since we already have selection::index_of() that returns the pre-grouping index, choose the post-grouping index for the return value of selection::add_column_for_post_processing(), and change the GROUP BY code to use index_of(). Comments are added.	2023-07-03 19:45:17 +03:00
Avi Kivity	432cb02d64	cql3: select_statement: force aggregation if GROUP BY is used GROUP BY is typically used with aggregation. In one case the aggregation is implicit: SELECT a, b, c FROM tab GROUP BY x, y, z One row will appear from each group, even though no aggregation was specified. To avoid this irregularity, rewrite this query as SELECT first(a), first(b), first(c) FROM tab GROUP BY x, y, z This allows us to have different paths for aggregations and non-aggregations, without worrying about this special case.	2023-07-03 19:45:17 +03:00
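The effect of the first() rewrite can be modelled with a small sketch: for each group, keep only the first row encountered. This is an illustration under stated assumptions, not the cql3 implementation; the `row` struct and `first_per_partition` are hypothetical, rows are assumed to arrive in clustering order within each partition, and the output here is ordered by key value rather than by token.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row shape: partition key p, clustering key c, value v.
struct row { int p; int c; std::string v; };

// Hypothetical model of first(): keep the first row seen for each group key.
inline std::vector<row> first_per_partition(const std::vector<row>& rows) {
    std::map<int, row> firsts;      // group key -> first row of the group
    for (const auto& r : rows) {
        firsts.emplace(r.p, r);     // emplace leaves an existing entry untouched
    }
    std::vector<row> out;
    out.reserve(firsts.size());
    for (auto& kv : firsts) {
        out.push_back(std::move(kv.second));
    }
    return out;
}
```

Because the result depends on which row is seen first, this kind of reduction is order-sensitive, which is why a later commit in this list notes that first() cannot simply be run in parallel.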
Avi Kivity	bc6c64e13c	cql3: select_statement: levellize aggregation depth Avoid mixed aggregate/non-aggregate queries by inserting calls to the first() function. This allows us to avoid internal state (simple_selector::_current) and make selector evaluation stateless apart from explicit temporaries.	2023-07-03 19:45:17 +03:00
Avi Kivity	996e02f5bf	cql3: select_statement: explicitly disable automatic parallelization with no aggregates A query of the form `SELECT foo, count(foo) FROM tab` returns the first value of the foo column along with the count. This can't be parallelized today since the first selector isn't an aggregate. We plan to rewrite the query internally as `SELECT first(foo), count(foo) FROM tab`, in order to make the query more regular (no mixing of aggregates and non-aggregates). However, this will defeat the current check since after the rewrite, all selectors are aggregates. Prepare for this by performing the check on a pre-rewrite variable, so it won't be affected by the query rewrite in the next patch. Note that even though we could add support for running first() in parallel, it's not possible to get the correct results, since first() is not commutative and we don't reduce in order. It's also not a particularly interesting query.	2023-07-03 19:45:17 +03:00
Avi Kivity	7c3ceb6473	cql3: select_statement: use prepared selectors Change one more layer of processing to work on prepared rather than raw selectors. This moves the call to prepare the selectors early in select_statement processing. In turn this changes maybe_jsonize_select_clause() and forward_service's mock_selection() to work in the prepared realm as well. This moves us one step closer to using evaluate() to process the select clause, as the prepared selectors are now available in select_statement. We can't use them yet since we can't evaluate aggregations.	2023-07-03 19:45:17 +03:00
Avi Kivity	4fb797303f	cql3: selection: prepare selectors earlier Currently, each selector expression is individually prepared, then converted into a selector object that is later executed. This is done (on a vector of raw selectors) by cql3::selection::raw_selector::to_selectables(). Split that into two phases. The first phase converts raw_selector into a new struct prepared_selector (a better name would be plain 'selector', but it's taken for now). The second phase continues the process and converts prepared_selector into selectables. This gives us a full view of the prepared expressions while we're preparing the select clause of the select statement.	2023-07-03 19:45:17 +03:00
Kamil Braun	e6942d31d3	Merge 'query processor code cleanup' from Gleb The series contains mostly cleanups for query processor and no functional change. The last patch is a small cleanup for the storage_proxy. * 'qp-cleanup' of https://github.com/gleb-cloudius/scylla: storage_proxy: remove unused variable client_state: co-routinise has_column_family_access function query_processor: get rid of internal_state and create individual query_state for each request cql3: move validation::validate_column_family from client_state::has_column_family_access client_state: drop unneeded argument from has.*access functions cql3: move check for dropping cdc tables from auth to the drop statement code itself query_processor: co-routinise execute_prepared_without_checking_exception_message function query_processor: co-routinize execute_direct_without_checking_exception_message function cql3: remove empty statement::validate functions cql3: remove empty function validate_cluster_support cql3/statements: fix indentation and spurious white spaces query_processor: move statement::validate call into execute_with_params function query_processor: co-routinise execute_with_params function query_processor: execute statement::validate before each execution of internal query instead of only during prepare query_processor: get rid of shared internal_query_state query_processor: co-routinize execute_paged_internal function query_processor: co_routinize execute_batch_without_checking_exception_message function query_processor: co-routinize process_authorized_statement function	2023-06-23 10:32:57 +02:00
Avi Kivity	b858a4669d	cql3: expr: break up expression.hh header Adding a function declaration to expression.hh causes many recompilations. Reduce that by: - moving some restrictions-related definitions to the existing expr/restrictions.hh - moving evaluation related names to a new header expr/evaluate.hh - move utilities to a new header expr/expr-utilities.hh expression.hh contains only expression definitions and the most basic and common helpers, like printing.	2023-06-22 14:21:03 +03:00
Gleb Natapov	4bad482e4b	cql3: move validation::validate_column_family from client_state::has_column_family_access Checking keyspace/table presence should not be part of authorization code and it is not done consistently today. For instance keyspace presence is not checked in "alter keyspace" during authorization, but during statement execution. Make it consistent.	2023-06-22 13:57:36 +03:00
Gleb Natapov	31bddb65c7	client_state: drop unneeded argument from has.*access functions After previous patch we can drop db argument to most of has.*access functions in the client_state.	2023-06-22 13:57:36 +03:00
Gleb Natapov	45ce608117	cql3: remove empty statement::validate functions There are a lot of empty overloads for the function so let's remove them and use the one in the parent class instead.	2023-06-22 13:57:33 +03:00
Kamil Braun	563d466de1	Merge 'cql3: select_statement: coroutinize indexed statement's do_execute()' from Avi Kivity Improves readability, and probably a little faster too. Closes #14311 * github.com:scylladb/scylladb: cql3: select_statement: reindent indexed_table_select_statement::do_execute cql3: select_statement: simplify inner lambda in indexed_table_select_statement::do_execute() cql3: select_statement: coroutinize indexed_table_select_statement::do_execute()	2023-06-22 12:10:45 +02:00


474 Commits