scylladb

Author	SHA1	Message	Date
Avi Kivity	9823e75d16	cql3: grammar: make where clause return an expression In preparation of the relaxation of the grammar to return any expression, change the whereClause production to return an expression rather than terms. Note that the expression is still constrained to be a conjunction of relations, and our filtering code isn't prepared for more. Before the patch, if the WHERE clause was optional, the grammar would pass an empty vector of expressions (which is exactly correct). After the patch, it would pass a default-constructed expression. Now that happens to be an empty conjunction, which is exactly what's needed, but it is too accidental, so the patch changes optional WHERE clauses to explicitly generate an empty conjunction if the WHERE clause wasn't specified.	2022-07-22 20:14:48 +03:00
Avi Kivity	a5dd588465	cql3: statement_restrictions: accept a single expression rather than a vector Move closer to the goal of accepting a generic expression for WHERE clause by accepting a generic expression in statement_restrictions. The various callers will synthesize it from a vector of terms.	2022-07-22 20:14:48 +03:00
Avi Kivity	4aa0a03b7e	cql3: select_statement: remove wrong but harmless std::move() in prepare_restrictions std::move(_where_clause) is wrong, because _where_clause is used later (when analyzing GROUP BY), but also harmless (because the statement_restrictions constructor accepts it by const reference). To avoid confusion in the next patch where we'll pass _where_clause to a different function, remove the bad std::move() in advance here.	2022-07-22 20:14:48 +03:00
Avi Kivity	13a64d8ab2	Merge 'Remove all remaining restrictions classes' from Jan Ciołek This PR removes all code that used classes `restriction`, `restrictions` and their children. There were two fields in `statement_restrictions` that needed to be dealt with: `_clustering_columns_restrictions` and `_nonprimary_key_restrictions`. Each function was reimplemented to operate on the new expression representation and eventually these fields weren't needed anymore. After that the restriction classes weren't used anymore and could be deleted as well. Now all of the code responsible for analyzing WHERE clause and planning a query works on expressions. Closes #11069 * github.com:scylladb/scylla: cql3: Remove all remaining restrictions code cql3: Move a function from restrictions class to the test cql3: Remove initial_key_restrictions cql3: expr: Remove convert_to_restriction cql3: Remove _new from _new_nonprimary_key_restrictions cql3: Remove _nonprimary_key_restrictions field cql3: Reimplement uses of _nonprimary_key_restrictions using expression cql3: Keep a map of single column nonprimary key restrictions cql3: Remove _new from _new_clustering_columns_restrictions cql3: Remove _clustering_columns_restrictions from statement_restrictions cql3: Use a variable instead of dynamic cast cql3: Use the new map of single column clustering restrictions cql3: Keep a map of single column clustering key restrictions cql3: Return an expression in get_clustering_columns_restrctions() cql3: Reimplement _clustering_columns_restrictions->has_supporting_index() cql3: Don't create single element conjunction cql3: Add expr::index_supports_some_column cql3: Reimplement has_unrestricted_components() cql3: Reimplement _clustering_columns_restrictions->need_filtering() cql3: Reimplement num_prefix_columns_that_need_not_be_filtered cql3: Use the new clustering restrictions field instead of ->expression cql3: Reimplement _clustering_columns_restrictions->size() using expressions cql3: Reimplement _clustering_columns_restrictions->get_column_defs() using expressions cql3: Reimplement _clustering_columns_restrictions->is_all_eq() using expressions cql3: expr: Add has_only_eq_binops function cql3: Reimplement _clustering_columns_restrictions->empty() using expressions	2022-07-20 18:01:15 +03:00
Botond Dénes	014c5b56a3	query-result: move last_pos up to query::result query_result was the wrong place to put last position into. It is only included in data-responses, but not on digest-responses. If we want to support empty pages from replicas, both data and digest responses have to include the last position. So hoist up the last position to the parent structure: query::result. This is a breaking change inter-node ABI wise, but it is fine: the current code wasn't released yet. Closes #11072	2022-07-20 13:28:09 +03:00
Jan Ciolek	599bcd6ea7	cql3: Remove all remaining restrictions code The classes restriction, restrictions and its children aren't used anywhere now and can be safely removed. Some includes need to be modified for the code to compile. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-07-20 09:10:31 +02:00
Jan Ciolek	2b7ffd57fb	cql3: Return an expression in get_clustering_columns_restrctions() get_clustering_columns_restrctions() used to return a shared pointer to the clustering_restrictions class. Now everything is being converted to expression, so it should return an expression as well. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-07-19 16:02:01 +02:00
Jadw1	182438c5f8	forward_service: enable multiple selection Enables parallelization of query like `SELECT MIN(x), MAX(x)`. Compatibility is ensured under the same cluster feature as UDA and native aggregates parallelization. (UDA_NATIVE_PARALLELIZED_AGGREGATION)	2022-07-18 15:25:41 +02:00
Jadw1	29a0be75da	forward_service: support UDA and native aggregate parallelization Enables parallelization of UDA and native aggregates. The way the query is parallelized is the same as in #9209. Separate reduction type for `COUNT(*)` is left for compatibility reasons.	2022-07-18 15:25:41 +02:00
Jadw1	6d977fcf88	cql3: selection: detect parallelize reduction type Detects the type of reduction if it is possible. A separate case for `COUNT(*)` is left for compatibility reasons. For now, only a single selection is supported.	2022-07-18 15:25:41 +02:00
Jan Ciolek	76bf75a9d3	cql3: Use expression for index restrictions Restrictions that might be used by an index are currently being kept as shared_ptr<restrictions>. This stands in the way of replacing _partition_key_restrictions with an expression, as an expression can't be cast to shared_ptr<restriction>. Change shared_ptr<restriction> to expression everywhere where necessary in index operations. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-07-01 16:29:11 +02:00
Jan Ciolek	1339ff1c79	cql3: Use expression instead of _partition_key_restrictions in the remaining code There are still some places that use partition_key_restrictions instead of _new_partition_key_restrictions in statement_restrictions. Change them to use the new representation Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-07-01 16:29:10 +02:00
Jan Ciolek	7f620cfa29	cql3: Replace parition_key_restrictions->empty() To remove partition_key_restrictions all of its methods have to be implemented using the new expression representation. The first to go is empty() as it's easy to implement. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-07-01 16:29:09 +02:00
Avi Kivity	3131cbea62	Merge 'query: allow replica to provide arbitrary continue position' from Botond Dénes Currently, we use the last row in the query result set as the position where the query is continued from on the next page. Since only live rows make it into query result set, this mandates the query to be stopped on a live row on the replica, lest any dead rows or tombstones processed after the live rows, would have to be re-processed on the next page (and the saved reader would have to be thrown away due to position mismatch). This requirement of having to stop on a live row is problematic with datasets which have lots of dead rows or tombstones, especially if these form a prefix. In the extreme case, a query can time out before it can process a single live row and the data-set becomes effectively unreadable until compaction gets rid of the tombstones. This series prepares the way for the solution: it allows the replica to determine what position the query should continue from on the next page. This position can be that of a dead row, if the query stopped on a dead row. For now, the replica supplies the same position that would have been obtained with looking at the last row in the result set, this series merely introduces the infrastructure for transferring a position together with the query result, and it prepares the paging logic to make use of this position. If the coordinator is not prepared for the new field, it will simply fall-back to the old way of looking at the last row in the result set. As I said for now this is still the same as the content of the new field so there is no problem in mixed clusters. Refs: https://github.com/scylladb/scylla/issues/3672 Refs: https://github.com/scylladb/scylla/issues/7689 Refs: https://github.com/scylladb/scylla/issues/7933 Tests: manual upgrade test. I wrote a data set with: ``` ./scylla-bench -mode=write -workload=sequential -replication-factor=3 -nodes 127.0.0.1,127.0.0.2,127.0.0.3 -clustering-row-count=10000 -clustering-row-size=8096 -partition-count=1000 ``` This creates large, 80MB partitions, which should fill many pages if read in full. Then I started a read workload: ``` ./scylla-bench -mode=read -workload=uniform -replication-factor=3 -nodes 127.0.0.1,127.0.0.2,127.0.0.3 -clustering-row-count=10000 -duration=10m -rows-per-request=9000 -page-size=100 ``` I confirmed that paging is happening as expected, then upgraded the nodes one-by-one to this PR (while the read-load was ongoing). I observed no read errors or any other errors in the logs. Closes #10829 * github.com:scylladb/scylla: query: have replica provide the last position idl/query: add last_position to query_result mutlishard_mutation_query: propagate compaction state to result builder multishard_mutation_query: defer creating result builder until needed querier: use full_position instead of ad-hoc struct querier: rely on compactor for position tracking mutation_compactor: add current_full_position() convenience accessor mutation_compactor: s/_last_clustering_pos/_last_pos/ mutation_compactor: add state accessor to compact_mutation introduce full_position idl: move position_in_partition into own header service/paging: use position_in_partition instead of clustering_key for last row alternator/serialization: extract value object parsing logic service/pagers/query_pagers.cc: fix indentation position_in_partition: add to_string(partition_region) and parse_partition_region() mutation_fragment.hh: move operator<<(partition_region) to position_in_partition.hh	2022-06-27 12:23:21 +03:00
Pavel Emelyanov' via ScyllaDB development	a78af050fd	cql: Constify select_statement restrictions It is in fact immutable (both the pointer and the object it points to), so is the pointer copy returned by get_restrictions() method, so are those propagated to filtering stuff. tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1028 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220624083351.24970-1-xemul@scylladb.com>	2022-06-24 12:27:36 +03:00
Botond Dénes	fd5f8f2275	query: have replica provide the last position Use the recently introduced query-result facility to have the replica set the position where the query should continue from. For now this is the same as what the implicit position would have been previously (last row in result), but it opens up the possibility to stop the query at a dead row.	2022-06-23 13:36:24 +03:00
Piotr Dulikowski	a7ad70600d	query-request: add allow_limit flag Adds allow_limit flag to the read_command. The flag decides whether rate limiting of this operation is allowed.	2022-06-22 20:16:49 +02:00
Avi Kivity	9e213d979f	cql3: expr: pass schema to prepare_expression Currently prepare_expression is never used where a schema is needed - it is called for the right-hand-side of binary operators (where we don't accept columns) or for attributes like WRITETIME or TTL. But when we unify expression preparation it will need to handle columns too, and these need the schema to look up the column. So pass the schema as a parameter. It is optional (a pointer) since not all contexts will have a schema (for example CREATE AGGREGATE).	2022-06-01 18:48:03 +03:00
Michał Radwański	906cee7052	treewide: remove unqualified calls to std::move clang 15 emits such a warning: cql3/statements/raw/parsed_statement.cc:46:16: error: unqualified call to 'std::move' [-Werror,-Wunqualified-std-cast-call] , warnings(move(warnings)) ^ std:: cql3/statements/raw/parsed_statement.cc:52:101: error: unqualified call to 'std::move' [-Werror,-Wunqualified-std-cast-call] : prepared_statement(statement_, ctx.get_variable_specifications(), partition_key_bind_indices, move(warnings)) ^ std:: Closes #10656	2022-05-27 16:36:49 +02:00
Piotr Sarna	ec0a3bbbd4	cql3: add a statement for deleting ghost rows In order to expose the API for deleting ghost rows from a view, a CQL statement is created. It is loosely based on select_statement, as its first step is to select view table rows.	2022-05-19 10:11:50 +02:00
Piotr Sarna	d74e25be67	cql3: convert is_json statement parameter to enum Right now is_json is used to decide if the statement needs to be treated in a special way. For two types (regular statement and JSON statement), a boolean is enough, but this series extends it for two more types, so the flag is converted to an enum.	2022-05-19 10:11:50 +02:00
cvybhu	51cdbdeacb	cql3: Make parser output expression for relations Parser used to output the where clause as a vector of relations, but now we can change it to a vector of expressions. Cql.g needs to be modified to output expressions instead of relations. The WHERE clause is kept in a few places in the code that need to be changed to vector<expression>. Finally relation->to_restriction is replaced by expr::to_restriction and the expressions are converted to restrictions where required. The relation class isn't used anywhere now and can be removed. Signed-off-by: cvybhu <jan.ciolek@scylladb.com>	2022-05-16 18:17:58 +02:00
Avi Kivity	5937b1fa23	treewide: remove empty comments in top-of-files After `fcb8d040` ("treewide: use Software Package Data Exchange (SPDX) license identifiers"), many dual-licensed files were left with empty comments on top. Remove them to avoid visual noise. Closes #10562	2022-05-13 07:11:58 +02:00
Avi Kivity	19ab3edd77	gms: feature_service: remove variable/helper function duplication Each feature has a private variable and a public accessor. Since the accessor effectively makes the variable public, avoid the intermediary and make the variable public directly. To ease mechanical translation, the variable name is chosen as the function name (without the cluster_supports_ prefix). References throughout the codebase are adjusted.	2022-05-04 18:59:56 +03:00
Piotr Sarna	83ec505fab	cql3: add tracing indexed aggregate queries Commit `1c99ed6ced` added tracing logs about the index chosen for the query, but aggregate queries have a separate code path, which wasn't taken into account. After this patch, tracing for aggregate queries also includes this additional information. Closes #10195	2022-03-11 15:27:03 +02:00
Eliran Sinvani	bf50dbd35b	cql3 statements: Change dependency test API to express better it's purpose Cql statements used to have two API functions, depends_on_keyspace and depends_on_column_family. The latter took only a table name as a parameter, which makes no sense: there could be multiple tables with the same name, each in a different keyspace, and it doesn't make sense to generalize the test, i.e. to ask "Does a statement depend on any table named XXX?" In this change we unify the two calls into one, depends_on, which takes a keyspace name and optionally also a table name. That way every logical dependency test that makes sense is supported by a single API call.	2022-02-27 11:48:03 +02:00
Piotr Dulikowski	ddf049738d	indexed_table_select_statement: return some exceptions as exception messages Adjusts the indexed_table_select_statement so that it uses the result-aware methods in storage_proxy and propagates failed results as result_message::exception.	2022-02-22 16:25:21 +01:00
Piotr Dulikowski	c5bcfee28f	select_statement: return exceptions as errors in execute_without_checking_exception_message Modifies the remaining logic of execute_without... (apart from the do_execute call) so that the result-aware versions of storage_proxy's methods are called and failed results are converted to result_message::exception.	2022-02-22 16:25:21 +01:00
Piotr Dulikowski	5106c60cd0	select_statement: return exceptions without throwing in do_execute Modifies do_execute so that it uses the result-aware versions of the query_pager's methods and returns them as result_message::exception.	2022-02-22 16:25:21 +01:00
Piotr Dulikowski	3a4d3f3175	select_statement: implement execute_without_checking_exception_message The select_statement will be able to propagate coordinator failures without throwing, so it's important to override the default implementations of execute and execute_without... so that the first calls the latter and not the other way around.	2022-02-22 16:25:21 +01:00
Piotr Dulikowski	df7668797b	select_statement: introduce helpers for working with failed results Adds: - Includes for result-related helper methods (to be used in later commits), - Alias for coordinator_result, - The wrap_result_to_error_message function - a bit similar to utils::result_wrap. Adapts a callable T -> shared_ptr<result_message> to take result<T> -> shared_ptr<result_message>. If the result is failed, it converts it into result_message::exception and returns.	2022-02-22 16:25:21 +01:00
Michał Sala	b439d6e710	db: config: add a flag to disable new parallelized aggregation algorithm Just in case the new algorithm turns out to be buggy, add a flag to fall-back to the old algorithm.	2022-02-01 21:26:25 +01:00
Michał Sala	aec96be553	forward_service: add tracing	2022-02-01 21:14:41 +01:00
Michał Sala	f344bd0aaa	cql3: statements: introduce parallelized_select_statement Detect whether a statement is a count() query in prepare time. If so, instantiate a new `select_statement` subclass - `parallelized_select_statement`. This subclass has a different execution logic, that enables it to distribute count() queries across a cluster. Also, a new counter was added - `select_parallelized` that counts the number of parallelized aggregation SELECT query executions.	2022-02-01 21:14:41 +01:00
Michał Sala	0fe59082ec	storage_proxy: extract query_ranges_to_vnodes_generator to a separate file Such separation allows using query_ranges_to_vnodes_generator by other services without needing a storage_proxy dependency.	2022-02-01 21:14:41 +01:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Pavel Emelyanov	00de5f4876	validation: Make validate_column_family use data_dictionary::database And instantly convert the validate_keyspace() as it's not called from anywhere but the validate_column_family(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-14 13:00:53 +03:00
Pavel Emelyanov	b6bc7a9b29	client_state: Make has_column_family_access use data_dictionary::database Straightforward replacement. Internals of the has_column_family_access() temporarily get .real_database(), but it will be changed soon. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-14 12:55:15 +03:00
Pavel Emelyanov	095d93eaf8	pager: Keep shared pointer to proxy onboard Pagers are created by alternator and select statement, both have the proxy reference at hands. Next, the pager's unique_ptr is put on the lambda of its fetch_page() continuation and thus it survives the fetch_page execution and then gets destroyed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-10 07:58:57 +03:00
Pavel Emelyanov	d98dd0ff80	cql3: Generalize bounce-to-shard result creation The main intention is actually to free the qp.proxy() from the need to provide the get_stats() method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-12-23 11:28:44 +03:00
Pavel Emelyanov	d32de22ee8	cql3: Get data dictionary directly from query_processor After previous patches there's a whole bunch of places that do qp.proxy().data_dictionary() while the data_dictionary is present on the query processor itself and there's a public method to get one. So use it everywhere. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-12-23 11:28:44 +03:00
Pavel Emelyanov	da4c29105d	select_statement: Replace all proxy-s with query_processor This is the largest user of proxy argument. Fix them all and their callers (all sit in the same .cc file). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-12-23 10:54:28 +03:00
Pavel Emelyanov	bce2ed9c6c	cql3: Make execution stages carry query_processor over The batch_ , modification_ and select_ statements get proxy from query processor just to push it through execution stage. Simplify that by pushing the query processor itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-12-23 10:53:44 +03:00
Pavel Emelyanov	b990ca5550	cql3: Make .validate() and .check_access() accept query_processor This is mostly a sed script that replaces methods' first argument plus fixes of compiler-generated errors. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-12-23 10:53:44 +03:00
Avi Kivity	d768e9fac5	cql3, related: switch to data_dictionary Stop using database (and including database.hh) for schema related purposes and use data_dictionary instead. data_dictionary::database::real_database() is called from several places, for these reasons: - calling yet-to-be-converted code - callers with a legitimate need to access data (e.g. system_keyspace) but with the ::database accessor removed from query_processor. We'll need to find another way to supply system_keyspace with data access. - to gain access to the wasm engine for testing whether user-defined functions compile. We'll have to find another way to do this as well. The change is a straightforward replacement. One case in modification_statement had to change a capture, but everything else was just a search-and-replace. Some files that lost "database.hh" gained "mutation.hh", which they previously had access to through "database.hh".	2021-12-15 13:54:23 +02:00
Pavel Emelyanov	b0a8c153f7	select_statement: Remove unused proxy args and captures The generate_view_paging_state_from_base_query_results() has unused proxy argument that's carried over quite a long stack for nothing. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20211210175203.26197-1-xemul@scylladb.com>	2021-12-10 20:39:55 +02:00
Nadav Har'El	c6f2afb93d	Merge 'cql3: Allow to skip EQ restricted columns in ORDER BY' from Jan Ciołek In queries like: ```cql SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c1 ASC, c2 ASC) ``` we can skip the requirement to specify ordering for `c1` column. The `c1` column is restricted by an `EQ` restriction, so it can have at most one value anyway, there is no need to sort. This commit makes it possible to write just: ```cql SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c2 ASC) ``` I reorganized the ordering code, I feel that it's now clearer and easier to understand. It's possible to only introduce a small change to the existing code, but I feel like it becomes a bit too messy. I tried it out on the [`orderby_disorder_small`](https://github.com/cvybhu/scylla/commits/orderby_disorder_small) branch. The diff is a bit messy because I moved all ordering functions to one place, it's better to read [select_statement.cc](https://github.com/cvybhu/scylla/blob/orderby_disorder/cql3/statements/select_statement.cc#L1495-L1658) lines 1495-1658 directly. In the new code it would also be trivial to allow specifying columns in any order, we would just have to sort them. For now I commented out the code needed to do that, because the point of this PR was to fix #2247. Allowing this would require some more work changing the existing tests. Fixes: #2247 Closes #9518 * github.com:scylladb/scylla: cql-pytest: Enable test for skipping eq restricted columns in order by cql3: Allow to skip EQ restricted columns in ORDER BY cql3: Add has_eq_restriction_on_column function cql3: Reorganize orderings code	2021-12-09 21:11:56 +03:00
Jan Ciolek	a548c2dac4	cql3: Allow to skip EQ restricted columns in ORDER BY In queries like: SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c1 ASC, c2 ASC) we can skip the requirement to specify ordering for c1 column. The c1 column is restricted by an EQ restriction, so it can have only one value anyway, there is no need to sort. This commit makes it possible to write just: SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c2 ASC) Fixes: #2247 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-12-09 12:07:02 +01:00
Jan Ciolek	f76a1cd4bf	cql3: Reorganize orderings code Reorganized the code that handles column ordering (ASC or DESC). I feel that it's now clearer and easier to understand. Added an enum that describes column ordering. It has two possible values: ascending or descending. It used to be a bool that was sometimes called 'reversed', which could mean multiple things. Instead of column.type->is_reversed() != <ordering bool> there is now a function called are_column_select_results_reversed. Split checking if ordering is reversed and verifying whether it's correct into two functions. Before all of this was done by is_reversed() This is a preparation to later allow skipping ORDER BY restrictions on some columns. Adding this to the existing code caused it to get quite complex, but this new version is better suited for the task. The diff is a bit messy because I moved all ordering functions to one place, it's better to read select_statement.cc lines 1495-1651 directly. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-12-09 12:06:42 +01:00
Raphael S. Carvalho	648c921af2	cql3: statements: Fix UB when getting memory consumption limit for unpaged query get_max_result_size() is called on slice moved in previous argument. This results in use-after-move with clang, whose evaluation order is left-to-right. For paged queries, max_result_size is later overridden by query_pager, but for unpaged and/or reversed queries it can happen that max result size incorrectly contains the 1MB limit for paged, non-reversed queries. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211207145133.69764-1-xemul@scylladb.com>	2021-12-07 16:57:01 +02:00
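
The use-after-move pattern described in 648c921af2 can be illustrated with a minimal sketch. The names below (`slice`, `command`, `make_command_risky`, `make_command_safe`) are hypothetical stand-ins and not ScyllaDB code; the sketch assumes a constructor that takes the slice by value, so the move takes place while the call's arguments are being initialized.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a partition slice that knows its own size limit.
struct slice {
    std::vector<int> ranges;
    std::size_t max_result_size() const { return ranges.size() * 1024; }
};

// Hypothetical stand-in for a read command that owns a slice and a limit.
struct command {
    slice s;
    std::size_t max_size;
    // Takes the slice by value: the parameter is move-constructed at the call site.
    command(slice s_, std::size_t limit) : s(std::move(s_)), max_size(limit) {}
};

// Risky: the order in which the two arguments are initialized is unspecified.
// If the by-value parameter is move-constructed from `s` first, the limit is
// then computed from a moved-from slice.
command make_command_risky(slice s) {
    return command(std::move(s), s.max_result_size());
}

// Safe: read everything needed from `s` before moving it.
command make_command_safe(slice s) {
    const std::size_t limit = s.max_result_size();
    return command(std::move(s), limit);
}
```

Reading the limit into a local before the move makes the result independent of the compiler's argument evaluation order.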

371 Commits