scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 19:35:12 +00:00

Author	SHA1	Message	Date
Nadav Har'El	ce347f4b67	test/cql-pytest: add test for meaning of fetch_size with filtering A question was raised on what fetch_size (the requested page size in a paged scan) counts when there is a filter: does it count the rows before filtering (as scanned from disk) or after filter (as will be returned to the client)? This patch adds a test which demonstrates that Cassandra and Scylla behave differently in this respect: Cassandra counts post-filtering - so fetch_size results are actually returned, while Scylla currently counts pre-filtering. It is arguable which behavior is the "correct" one - we discuss this in issue #12102. But we have already had several users (such as #11340) who complained about Scylla's behavior and expected Cassandra's behavior, so if we decide to keep Scylla's behavior we should at least explain and justify this decision in our documentation. Until then, let's have this test which reminds us of this incompatibility. This test currently passes on Cassandra and fails (xfail) on Scylla. Refs #11340 Refs #12102 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12103	2022-11-30 12:27:06 +02:00
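The two counting conventions described in the commit above can be sketched in plain Python. This is only an illustration and not Scylla or Cassandra code: `paged_scan`, its parameters, and the sample data are hypothetical names, and the real paging logic is more involved.

```python
from typing import Callable, Iterable, Iterator, List


def paged_scan(rows: Iterable[int],
               predicate: Callable[[int], bool],
               fetch_size: int,
               count_post_filter: bool) -> Iterator[List[int]]:
    """Yield pages of rows that pass `predicate`.

    count_post_filter=True  : a page ends once fetch_size rows were *returned*
                              (the behavior the commit attributes to Cassandra).
    count_post_filter=False : a page ends once fetch_size rows were *scanned*
                              (the behavior the commit attributes to Scylla).
    """
    page: List[int] = []
    scanned = 0
    for row in rows:
        scanned += 1
        if predicate(row):
            page.append(row)
        counted = len(page) if count_post_filter else scanned
        if counted >= fetch_size:
            yield page
            page, scanned = [], 0
    if page:
        yield page


def is_even(n: int) -> bool:
    return n % 2 == 0


post = list(paged_scan(range(10), is_even, 4, count_post_filter=True))
pre = list(paged_scan(range(10), is_even, 4, count_post_filter=False))
print(post)  # pages hold fetch_size matching rows, except possibly the last
print(pre)   # pages hold whatever matched among each fetch_size scanned rows
```

With pre-filter counting a page can be shorter than `fetch_size` (or even empty) although more matching rows follow, which is the surprise the users in the referenced issues reported.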
Nadav Har'El	8bd8ef3d03	test/cql-pytest: add regression test for old issue This patch adds a regression test for the old issue #65 which is about a multi-column (tuple) clustering-column relation in a SELECT when one of these columns has reversed order. It turns out that we didn't notice, but this issue was already solved - but we didn't have a regression test for it. So this patch adds just a regression test. The test confirms that Scylla now behaves as was desired when that issue was opened. The test also passes on Cassandra, confirming that Scylla and Cassandra behave the same for such requests. Fixes #65 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12130	2022-11-30 12:22:21 +02:00
Nadav Har'El	c5121cf273	cql: fix column-name aliases in SELECT JSON The SELECT JSON statement, just like SELECT, allows the user to rename selected columns using an "AS" specification. E.g., "SELECT JSON v AS foo". This specification was not honored: We simply forgot to look at the alias in SELECT JSON's implementation (we did it correctly in regular SELECT). So this patch fixes this bug. We had two tests in cassandra_tests/validation/entities/json_test.py that reproduced this bug. The checks in those tests now pass, but these two tests still continue to fail after this patch because of two other unrelated bugs that were discovered by the same tests. So in this patch I also add a new test just for this specific issue - to serve as a regression test. Fixes #8078 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12123	2022-11-29 18:16:19 +02:00
Nadav Har'El	99a72a9676	Merge 'cql3: expr: make it possible to evaluate expr::binary_operator' from Jan Ciołek As a part of the CQL rewrite we want to be able to perform filtering by calling `evaluate()` on an expression and checking if it evaluates to `true`. Currently trying to do that for a binary operator would result in an error. Right now checking if a binary operation like `col1 = 123` is true is done using `is_satisfied_by`, which is able to check if a binary operation evaluates to true for a small set of predefined cases. Eventually once the grammar is relaxed we will be able to write expressions like: `(col1 < col2) = (1 > ?)`, which doesn't fit with what `is_satisfied_by` is supposed to do. Additionally expressions like `1 = NULL` should evaluate to `NULL`, not `true` or `false`. `is_satisfied_by` is not able to express that properly. The proper way to go is implementing `evaluate(binary_operator)`, which takes a binary operation and returns what the result of it would be. Implementing `prepare_expression` for `binary_operator` requires us to be able to evaluate it first. In the next PR I will add support for `prepare_expression`. 
Closes #12052 * github.com:scylladb/scylladb: cql-pytest: enable two unset value tests that pass now cql-pytest: reduce unset value error message cql3: expr: change unset value error messages to lowercase cql_pytest: ensure that where clauses like token(p) = 0 AND p = 0 are rejected cql3: expr: remove needless braces around switch cases cql3: move evaluation IS_NOT NULL to a separate function expr_test: test evaluating LIKE binary_operator expr_test: test evaluating IS_NOT binary_operator expr_test: test evaluating CONTAINS_KEY binary_operator expr_test: test evaluating CONTAINS binary_operator expr_test: test evaluating IN binary_operator expr_test: test evaluating GTE binary_operator expr_test: test evaluating GT binary_operator expr_test: test evaluating LTE binary_operator expr_test: test evaluating LT binary_operator expr_test: test evaluating NEQ binary_operator expr_test: test evaluating EQ binary_operator cql3: expr properly handle null in is_one_of() cql3: expr properly handle null in like() cql3: expr properly handle null in contains_key() cql3: expr properly handle null in contains() cql3: expr: properly handle null in limits() cql3: expr: remove unneeded overload of limits() cql3: expr: properly handle null in equality operators cql3: expr: remove unneeded overload of equal() cql3: expr: use evaluate(binary_operator) in is_satisfied_by cql3: expr: handle IS NOT NULL when evaluating binary_operator cql3: expr: make it possible to evaluate binary_operator cql3: expr: accept expression as lhs argument to like() cql3: expr: accept expression as lhs in contains_key cql3: expr: accept expression as lhs argument to contains()	2022-11-28 11:30:00 +02:00
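The difference between `is_satisfied_by` and `evaluate(binary_operator)` described in the merge above can be sketched in Python, with `None` standing in for CQL `NULL`. This is a toy model under stated assumptions, not the C++ implementation: the function signatures and the operator table are hypothetical.

```python
import operator
from typing import Any, Optional

# Hypothetical operator table; the real code covers more operators
# (IN, CONTAINS, LIKE, IS NOT NULL, ...).
_OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def evaluate(lhs: Any, op: str, rhs: Any) -> Optional[bool]:
    """Three-valued evaluation: NULL on either side yields NULL (None)."""
    if lhs is None or rhs is None:
        return None
    return _OPS[op](lhs, rhs)


def is_satisfied_by(lhs: Any, op: str, rhs: Any) -> bool:
    """Filtering keeps a row only when the expression is exactly true."""
    return evaluate(lhs, op, rhs) is True


# `1 = NULL` evaluates to NULL, which a filter treats as "not satisfied".
print(evaluate(1, "=", None))
print(is_satisfied_by(1, "=", None))

# Because evaluate() returns a value, results can be nested, as in
# `(col1 < col2) = (1 > ?)` with col1=1, col2=2 and the bind marker set to 0.
print(evaluate(evaluate(1, "<", 2), "=", evaluate(1, ">", 0)))
```

Returning a value instead of a yes/no answer is what makes nested expressions composable; a boolean-only check cannot represent the `NULL` outcome.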
Jan Ciolek	77c7d8b8f6	cql-pytest: enable two unset value tests that pass now While implementing evaluate(binary_operator) missing checks for unset value were added for comparisons in filtering code. Because of that some tests for unset value started passing. There are still other tests for unset value that are failing because Scylla doesn't have all the checks that it should. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-24 17:07:17 +01:00
Jan Ciolek	5bc0bc6531	cql-pytest: reduce unset value error message When unset value appears in an invalid place both Cassandra and Scylla throw an error. The tests were written with Cassandra and thus the expected error messages were exactly the same as produced by Cassandra. Scylla produces different error messages, but both databases return messages with the text 'unset value'. Reduce the expected message text from the whole message to something that contains 'unset value'. It would be hard to mimic Cassandra's error messages in Scylla. There is no point in spending time on that. Instead it's better to modify the tests so that they are able to work with both Cassandra and Scylla. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-24 17:04:07 +01:00
Nadav Har'El	c6bb64ab0e	Merge 'Fix LWT insert crash if clustering key is null' from Gusev Petr [PR](https://github.com/scylladb/scylladb/pull/9314) fixed a similar issue with regular insert statements but missed the LWT code path. It's expected behaviour of `modification_statement::create_clustering_ranges` to return an empty range in this case, since `possible_lhs_values` it uses explicitly returns `empty_value_set` if it evaluates `rhs` to null, and it has a comment about it (All NULL comparisons fail; no column values match.) On the other hand, all components of the primary key are required to be set, this is checked at the prepare phase, in `modification_statement::process_where_clause`. So the only problem was `modification_statement::execute_with_condition` was not expecting an empty `clustering_range` in case of a null clustering key. Also this patch contains a fix for the problem with wrong column name in Scylla error messages. If `INSERT` or `DELETE` statement is missing a non-last element of the primary key, the error message generated contains an invalid column name. The problem occurs if the query contains a column with the list type, otherwise `statement_restrictions::process_clustering_columns_restrictions` checks that all the components of the key are specified. Closes #12047 * github.com:scylladb/scylladb: cql: refactor, inline modification_statement::validate_primary_key_restrictions cql: DELETE with null value for IN parameter should be forbidden cql: add column name to the error message in case of null primary key component cql: batch statement, inserting a row with a null key column should be forbidden cql: wrong column name in error messages modification_statement: fix LWT insert crash if clustering key is null	2022-11-24 16:15:27 +02:00
Petr Gusev	f9936bb0cb	cql: DELETE with null value for IN parameter should be forbidden If a DELETE statement contains an IN operator and the parameter value for it is NULL, this should also trigger an error. This is in line with how Cassandra behaves in this case.	2022-11-23 21:39:23 +04:00
Petr Gusev	c123f94110	cql: add column name to the error message in case of null primary key component It's more user-friendly and the error message corresponds to what Cassandra provides in this case.	2022-11-23 21:39:23 +04:00
Petr Gusev	7730c4718e	cql: batch statement, inserting a row with a null key column should be forbidden Regular INSERT statements with null values for primary key components are rejected by Scylla since #9286 and #9314. Batch statements missed a similar check, this patch fixes it. Fixes: #12060	2022-11-23 21:39:23 +04:00
Petr Gusev	89a5397d7c	cql: wrong column name in error messages If INSERT or DELETE statement is missing a non-last element of the primary key, the error message generated contains an invalid column name. The problem occurs if the query contains a column with the list type, otherwise statement_restrictions::process_clustering_columns_restrictions checks that all the components of the key are specified. Fixes: #12046	2022-11-23 21:39:16 +04:00
Jan Ciolek	84501851eb	cql_pytest: ensure that where clauses like token(p) = 0 AND p = 0 are rejected Scylla doesn't support combining restrictions on token with other restrictions on partition key columns. Some pieces of code depend on the assumption that such combinations are not allowed. If they were allowed in the future, these functions would silently start returning wrong results, and we would return invalid rows. Add a test that will start failing once this restriction is removed. It will warn the developer to change the functions that used to depend on the assumption. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 13:09:22 +01:00
Petr Gusev	0d443dfd16	modification_statement: fix LWT insert crash if clustering key is null PR #9314 fixed a similar issue with regular insert statements but missed the LWT code path. It's expected behaviour of modification_statement::create_clustering_ranges to return an empty range in this case, since possible_lhs_values it uses explicitly returns empty_value_set if it evaluates rhs to null, and it has a comment about it (All NULL comparisons fail; no column values match.) On the other hand, all components of the primary key are required to be set, this is checked at the prepare phase, in modification_statement::process_where_clause. So the only problem was modification_statement::execute_with_condition was not expecting an empty clustering_range in case of a null clustering key. Fixes: #11954	2022-11-22 16:45:16 +04:00
Nadav Har'El	ff617c6950	cql-pytest: translate a few small Cassandra tests This patch includes a translation of several additional small test files from Cassandra's CQL unit test directory cql3/validation/operations. All tests included here pass on both Cassandra and Scylla, so they did not discover any new Scylla bugs, but can be useful in the future as regression tests. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12045	2022-11-22 07:54:13 +02:00
Nadav Har'El	2ba8b8d625	test/cql-pytest: remove "xfail" from passing test testIndexOnFrozenCollectionOfUDT We had a test that used to fail because of issue #8745. But this issue was already fixed, and we forgot to remove the "xfail" marker. The test now passes, so let's remove the xfail marker. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12039	2022-11-20 19:54:59 +02:00
Jan Ciolek	286f182a8c	cql-pytest: add a reproducer for #12014 , verify that filtering multi column and regular restrictions works In issue #12014 a user has encountered an instance of #6200. When filtering a WHERE clause which contained both multi-column and regular restrictions, the regular restrictions were ignored. Add a test which reproduces the issue using a reproducer provided by the user. This problem is tested in another similar test, but this one reproduces the issue in the exact way it was found by the user. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-18 15:27:42 +01:00
Jan Ciolek	99e1032e34	cql-pytest: enable test for filtering combined multi column and regular column restrictions The test test_multi_column_restrictions_and_filtering was marked as xfail, because issue #6200 wasn't fixed. Now that filtering multi column and other restrictions together has been fixed the test passes. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-18 15:14:32 +01:00
Nadav Har'El	e393639114	test/cql-pytest: reproducer for crash in LWT with null key This patch adds a reproducer for issue #11954: Attempting an "IF NOT EXISTS" (LWT) write with a null key crashes Scylla, instead of producing a simple error message (as happens without the "IF NOT EXISTS" after #7852 was fixed). The test passes on Cassandra, but crashes Scylla. Because of this crash, we can't just mark the test "xfail" and it's temporarily marked "skip" instead. Refs #11954. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11982	2022-11-17 07:31:13 +02:00
Nadav Har'El	2f2f01b045	materialized views: fix view writes after base table schema change When we write to a materialized view, we need to know some information defined in the base table such as the columns in its schema. We have a "view_info" object that tracks each view and its base. This view_info object has a couple of mutable attributes which are used to lazily-calculate and cache the SELECT statement needed to read from the base table. If the base-table schema ever changes - and the code calls set_base_info() at that point - we need to forget this cached statement. If we don't (as before this patch), the SELECT will use the wrong schema and writes will no longer work. This patch also includes a reproducing test that failed before this patch, and passes afterwards. The test creates a base table with a view that has a non-trivial SELECT (it has a filter on one of the base-regular columns), makes a benign modification to the base table (just a silly addition of a comment), and then tries to write to the view - and before this patch it fails. Fixes #10026 Fixes #11542	2022-11-16 13:58:21 +02:00
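The caching bug described in the commit above follows a common pattern: a lazily computed value must be dropped when its input changes. A minimal Python sketch of that pattern, assuming a much simplified `ViewInfo` (the class, its fields and the generated statement text are hypothetical; only the `set_base_info()` name comes from the commit message):

```python
from typing import List, Optional


class ViewInfo:
    """Toy stand-in for the view_info object mentioned in the commit."""

    def __init__(self, base_columns: List[str]) -> None:
        self._base_columns = list(base_columns)
        self._select_statement: Optional[str] = None  # lazily computed cache

    def select_statement(self) -> str:
        # Compute the statement on first use and cache it afterwards.
        if self._select_statement is None:
            columns = ", ".join(self._base_columns)
            self._select_statement = f"SELECT {columns} FROM base"
        return self._select_statement

    def set_base_info(self, base_columns: List[str]) -> None:
        # The base schema changed: forget the cached statement so the next
        # call rebuilds it against the new schema.
        self._base_columns = list(base_columns)
        self._select_statement = None


view = ViewInfo(["p", "v"])
print(view.select_statement())
view.set_base_info(["p", "v", "w"])
print(view.select_statement())
```

Without the reset in `set_base_info()`, the second call would keep returning the statement built for the old schema, which mirrors the failure mode the commit describes.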
Nadav Har'El	e4dba6a830	test/cql-pytest: add test for when MV requires IS NOT NULL As noted in issue #11979, Scylla inconsistently (and unlike Cassandra) requires "IS NOT NULL" on some but not all materialized-view key columns. Specifically, Scylla does not require "IS NOT NULL" on the base's partition key, while Cassandra does. This patch is a test which demonstrates this inconsistency. It currently passes on Cassandra and fails on Scylla, so is marked xfail. Refs #11979 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11980	2022-11-15 14:21:48 +01:00
Nadav Har'El	ff87624fb4	test/cql-pytest: add another regression test for reversed-type bug In commit `544ef2caf3` we fixed a bug where a reversed clustering-key order caused problems using a secondary index because of incorrect type comparison. That commit also included a regression test for this fix. However, that fix was incomplete, and was improved later in commit `c8653d1321`. That later fix was labeled "better safe than sorry", and did not include a test demonstrating any actual bug, so unsurprisingly we never backported that second fix to any older branches. Recently we discovered that missing the second patch does cause real problems, and this patch includes a test which fails when the first patch is in, but the second patch isn't (and passes when both patches are in, and also passes on Cassandra). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11943	2022-11-11 11:01:22 +02:00
Nadav Har'El	b9d88a3601	cql/pytest: add reproducer for timestamp column validation issue This patch adds a reproducing test for issue #11588, which is still open so the test is expected to fail on Scylla ("xfail"), and passes on Cassandra. The test shows that Scylla allows an out-of-range value to be written to a timestamp column, but then it can't be read back. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11864	2022-11-01 08:11:01 +02:00
Nadav Har'El	c31bf4184f	test/cql-pytest: two reproducers for SI returning oversized pages This patch has two reproducing tests for issue #7432, which are cases where a paged query with a restriction backed by a secondary-index returns pages larger than the desired page size. Because these tests reproduce a still-open bug, they are both marked "xfail". Both tests pass on Cassandra. The two tests involve quite dissimilar cases - one involves requesting an entire partition (and Scylla forgetting to page through it), and the other involves GROUP BY - so I am not sure these two bugs even have the same underlying cause. But they were both reported in #7432, so let's have reproducers for both. Refs #7432 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11586	2022-10-17 11:36:05 +03:00
Nadav Har'El	9f02431064	test/cql-pytest: fix test_permissions.py when running with "--ssl" The tests in test_permissions.py use the new_session() utility function to create a new connection with a different logged-in user. It models the new connection on the existing one, but incorrectly assumed that the connection is NOT ssl. This made the test fail when cql-pytest/run is passed the "--ssl" option. In this patch we correctly infer the is_ssl state from the existing cql fixture, instead of assuming it is false. After this patch, "cql-pytest/run --ssl" works as expected for this test. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11742	2022-10-17 06:46:46 +03:00
Jan Ciolek	52bbc1065c	cql3: allow lists of IN elements to be NULL Requests like `col IN NULL` used to cause an error - Invalid null value for colum col. We would like to allow NULLs everywhere. When a NULL occurs on either side of a binary operator, the whole operation should just evaluate to NULL. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #11775	2022-10-13 15:11:32 +02:00
Nadav Har'El	ef0da14d6f	test/cql-pytest: add simple tests for USE statement This patch adds a couple of simple tests for the USE statement: that without USE one cannot create a table without explicitly specifying a keyspace name, and with USE, it is possible. Beyond testing these specific features, this patch also serves as an example of how to write more tests that need to control the effective USE setting. Specifically, it adds a "new_cql" function that can be used to create a new connection with a fresh USE setting. This is necessary in such tests, because if multiple tests use the same cql fixture and its single connection, they will share their USE setting and there is no way to undo or reset it after being set. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11741	2022-10-11 08:20:19 +03:00
Jan Ciolek	a2c359a741	cql3: Make CONTAINS KEY NULL return false A binary operator like this: {1: 2, 3: 4} CONTAINS KEY NULL used to evaluate to `true`. This is wrong, any operation involving null on either side of the operator should evaluate to NULL, which is interpreted as false. This change is not backwards compatible. Some existing code might break. partially fixes: #10359 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-10-05 18:15:44 +02:00
Jan Ciolek	bbfef4b510	cql3: Make CONTAINS NULL return false A binary operator like this: [1, 2, 3] CONTAINS NULL used to evaluate to `true`. This is wrong, any operation involving null on either side of the operator should evaluate to NULL, which is interpreted as false. This change is not backwards compatible. Some existing code might break. partially fixes: #10359 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-10-05 18:15:15 +02:00
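The NULL handling introduced by the two commits above (`CONTAINS NULL` and `CONTAINS KEY NULL`) can be modeled in a few lines of Python, with `None` standing in for CQL `NULL`. The helper names below are hypothetical and the sketch ignores CQL typing rules.

```python
from typing import Any, Optional


def contains(collection: Any, value: Any) -> Optional[bool]:
    """CONTAINS: NULL on either side yields NULL (None), not true."""
    if collection is None or value is None:
        return None
    if isinstance(collection, dict):
        # For maps, CONTAINS looks at the values, CONTAINS KEY at the keys.
        return value in collection.values()
    return value in collection


def contains_key(mapping: Optional[dict], key: Any) -> Optional[bool]:
    """CONTAINS KEY: NULL on either side yields NULL (None)."""
    if mapping is None or key is None:
        return None
    return key in mapping


def passes_filter(result: Optional[bool]) -> bool:
    """A filter keeps a row only for a true result; NULL counts as false."""
    return result is True


print(passes_filter(contains([1, 2, 3], None)))
print(passes_filter(contains_key({1: 2, 3: 4}, None)))
print(passes_filter(contains({1: 2, 3: 4}, 4)))
```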
Botond Dénes	5621cdd7f9	db/view/view_builder: don't drop partition and range tombstones when resuming The view builder builds the views from a given base table in view_builder::batch_size batches of rows. After processing this many rows, it suspends so the view builder can switch to building views for other base tables in the name of fairness. When resuming the build step for a given base table, it reuses the reader used previously (also serving the role of a snapshot, pinning sstables read from). The compactor however is created anew. As the reader can be in the middle of a partition, the view builder injects a partition start into the compactor to prime it for continuing the partition. This however only included the partition-key, crucially missing any active tombstones: partition tombstone or -- since the v2 transition -- active range tombstone. This can result in base rows covered by either of these to be resurrected and the view builder to generate view updates for them. This patch solves this by using the detach-state mechanism of the compactor which was explicitly developed for situations like this (in the range scan code) -- resuming a read with the readers kept but the compactor recreated. Also included are two test cases reproducing the problem, one with a range tombstone, the other with a partition tombstone. Fixes: #11668 Closes #11671	2022-10-03 11:28:22 +03:00
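The resurrection problem in the commit above can be reproduced with a toy model. The sketch below is an assumption-laden simplification: rows are reduced to write timestamps, only a partition tombstone is modeled (no range tombstones), and `Compactor`, `CompactorState` and `build_in_batches` are hypothetical Python names rather than the real C++ classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CompactorState:
    partition_key: str
    partition_tombstone: Optional[int]  # deletion timestamp, if any


class Compactor:
    def __init__(self, state: CompactorState) -> None:
        self.state = state

    def is_live(self, row_timestamp: int) -> bool:
        # A row is dead if a partition tombstone is at least as new as the row.
        tombstone = self.state.partition_tombstone
        return tombstone is None or row_timestamp > tombstone

    def detach_state(self) -> CompactorState:
        # Hand over everything needed to continue mid-partition.
        return self.state


def build_in_batches(state: CompactorState, row_timestamps: List[int],
                     batch_size: int, carry_tombstone: bool) -> List[int]:
    """Return timestamps of rows for which view updates would be generated."""
    live: List[int] = []
    for start in range(0, len(row_timestamps), batch_size):
        compactor = Compactor(state)  # the compactor is created anew per batch
        for ts in row_timestamps[start:start + batch_size]:
            if compactor.is_live(ts):
                live.append(ts)
        detached = compactor.detach_state()
        # Buggy variant: re-prime with the partition key only.
        state = detached if carry_tombstone else CompactorState(
            detached.partition_key, None)
    return live


start_state = CompactorState("pk", partition_tombstone=10)
print(build_in_batches(start_state, [1, 2, 3, 4], 2, carry_tombstone=True))
print(build_in_batches(start_state, [1, 2, 3, 4], 2, carry_tombstone=False))
```

In the buggy variant the rows of the second batch are treated as live even though the partition tombstone covers them, which is the resurrection the commit fixes.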
Avi Kivity	cf3830a249	Merge 'Add support for TRUNCATE USING TIMEOUT' from Benny Halevy Extend the cql3 truncate statement to accept attributes, similar to modification statements. To achieve that we define cql3::statements::raw::truncate_statement derived from raw::cf_statement, and implement its pure virtual prepare() method to make a prepared truncate_statement. The latter is no longer derived from raw::cf_statement, and just stores a schema_ptr to get to the keyspace and column_family. `test_truncate_using_timeout` cql-pytest was added to test the new USING TIMEOUT feature. Fixes #11408 Also, update docs/cql/ddl.rst truncate-statement section respectively. Closes #11409 * github.com:scylladb/scylladb: docs: cql-extensions: add TRUNCATE to USING TIMEOUT section. docs: cql: ddl: add support for TRUNCATE USING TIMEOUT cql3, storage_proxy: add support for TRUNCATE USING TIMEOUT cql3: selectStatement: restrict to USING TIMEOUT in grammar cql3: deleteStatement: restrict to USING TIMEOUT|TIMESTAMP in grammar	2022-09-28 18:19:03 +03:00
Benny Halevy	64140ccf05	cql3, storage_proxy: add support for TRUNCATE USING TIMEOUT Extend the cql3 truncate statement to accept attributes, similar to modification statements. To achieve that we define cql3::statements::raw::truncate_statement derived from raw::cf_statement, and implement its pure virtual prepare() method to make a prepared truncate_statement. The latter, statements::truncate_statement, is no longer derived from raw::cf_statement, and just stores a schema_ptr to get to the keyspace and column_family names. `test_truncate_using_timeout` cql-pytest was added to test the new USING TIMEOUT feature. Fixes #11408 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-09-26 18:30:39 +03:00
Benny Halevy	27d3e48005	cql3: selectStatement: restrict to USING TIMEOUT in grammar It is preferred to reject USING TTL / TIMESTAMP at the grammar level rather than functionally validating the USING attributes. test_using_timeout was adjusted accordingly to expect the `SyntaxException` error rather than `InvalidRequest`. Note that cql3::statements::raw::select_statement validate_attrs now asserts that the ttl or the timestamp attributes aren't set. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-09-26 18:30:39 +03:00
Benny Halevy	0728d33d5f	cql3: deleteStatement: restrict to USING TIMEOUT|TIMESTAMP in grammar It is preferred to reject USING TTL at the grammar level rather than functionally validating the USING attributes. test_using_timeout was adjusted accordingly to expect the `SyntaxException` error rather than `InvalidRequest`. Note that now the delete_statement ctor asserts that the ttl attribute is not set. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-09-26 18:30:39 +03:00
Nadav Har'El	868a884b79	test/cql-pytest: add reproducer for ignored IS NOT NULL This test reproduces issue #10365: It shows that although "IS NOT NULL" is not allowed in regular SELECT filters, in a materialized view it is allowed, even for non-key columns - but then outright ignored and does not actually filter out anything - a fact which already surprised several users. The test also fails on Cassandra - it also wrongly allows IS NOT NULL on the non-key columns but then ignores this in the filter. So the test is marked with both xfail (known to fail on Scylla) and cassandra_bug (fails on Cassandra because of what we consider to be a Cassandra bug). Refs #10365 Refs #11606 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11615	2022-09-26 09:02:08 +03:00
Nadav Har'El	4c93a694b7	cql: validate bloom_filter_fp_chance up-front Scylla's Bloom filter implementation has a minimal false-positive rate that it can support (6.71e-5). When setting bloom_filter_fp_chance any lower than that, the compute_bloom_spec() function, which writes the bloom filter, throws an exception. However, this is too late - it only happens while flushing the memtable to disk, and a failure at that point causes Scylla to crash. Instead, we should refuse the table creation with the unsupported bloom_filter_fp_chance. This is also what Cassandra did six years ago - see CASSANDRA-11920. This patch also includes a regression test, which crashes Scylla before this patch but passes after the patch (and also passes on Cassandra). Fixes #11524. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11576	2022-09-20 06:18:51 +03:00
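The idea of the commit above, rejecting an unsupported `bloom_filter_fp_chance` when the table is created instead of failing later during a memtable flush, can be sketched as follows. The minimum of 6.71e-5 is taken from the commit message; the function name, the upper bound of 1.0 and the error text are assumptions made for this illustration.

```python
# Minimal supported false-positive rate, as quoted in the commit message.
MIN_SUPPORTED_FP_CHANCE = 6.71e-5


def validate_bloom_filter_fp_chance(fp_chance: float) -> None:
    """Raise at schema-definition time if the value cannot be supported.

    The upper bound of 1.0 is an assumption (a probability cannot exceed 1).
    """
    if not (MIN_SUPPORTED_FP_CHANCE <= fp_chance <= 1.0):
        raise ValueError(
            f"bloom_filter_fp_chance must be between "
            f"{MIN_SUPPORTED_FP_CHANCE} and 1.0, got {fp_chance}")


validate_bloom_filter_fp_chance(0.01)  # accepted

try:
    validate_bloom_filter_fp_chance(1e-6)  # too small: rejected up front
except ValueError as error:
    print(error)
```

Validating at CREATE TABLE time turns what used to be a crash during flush into an ordinary, recoverable request error.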
Jadw1	ba461aca8b	cql-pytest: more neutral command in cql_test_connection fixture I found `use system` to not be neutral enough (e.g. in case of testing describe statement). `BEGIN BATCH APPLY BATCH` sounds better. Closes #11504	2022-09-11 18:49:06 +03:00
Wojciech Mitros	49dba4f0c1	functions: fix dropping of a keyspace with an aggregate in it Currently, if a keyspace has an aggregate and the keyspace is dropped, the keyspace becomes corrupted and another keyspace with the same name cannot be created again. This is caused by the fact that when removing an aggregate, we call create_aggregate() to get values for its name and signature. In create_aggregate(), we check whether the row and final functions for the aggregate exist. Normally, that's not an issue, because when dropping an existing aggregate alone, we know that its UDFs also exist. But when dropping an entire keyspace, we first drop the UDFs, making us unable to drop the aggregate afterwards. This patch fixes this behavior by removing the create_aggregate() from the aggregate dropping implementation and replacing it with specific calls for getting the aggregate name and signature. Additionally, a test that would previously fail is added to cql-pytest/test_uda.py where we drop a keyspace with an aggregate. Fixes #11327 Closes #11375	2022-08-25 16:28:57 +02:00
Wojciech Mitros	9e6e8de38f	tests: prevent test_wasm from occasionally failing Some cases in test_wasm.py assumed that all cases are run in the same order every time and depended on values that should have been added to tables in previous cases. Because of that, they were sometimes failing. This patch removes this assumption by adding the missing inserts to the affected cases. Additionally, the assert that confirms a low miss rate of UDFs is made more precise, and a comment is added to explain it clearly. Closes #11367	2022-08-25 11:32:06 +03:00
Nadav Har'El	055340ae39	cql-pytest: increase more timeouts In commit `7eda6b1e90`, we increased the request_timeout parameter used by cql-pytest tests from the default of 10 seconds to 120 seconds. 10 seconds was usually more than enough for finishing any Scylla request, but it turned out that in some extreme cases of a debug build running on an extremely over-committed machine, the default timeout was not enough. Recently, in issue #11289 we saw additional cases of timeouts which the request_timeout setting did not solve. It turns out that the Python CQL driver has two additional timeout settings - connect_timeout and control_connection_timeout, which default to 5 seconds and 2 seconds respectively. I believe that most of the timeouts in issue #11289 come from the control_connection_timeout setting - by changing it to a tiny number (e.g., 0.0001) I got the same error messages as those reported in #11289. The default of that timeout - 2 seconds - is certainly low enough to be reached on an extremely over-committed machine. So this patch significantly increases both connect_timeout and control_connection_timeout to 60 seconds. We don't care that this timeout is ridiculously large - under normal operations it will never be reached. There is no code which loops for this amount of time, for example. Refs #11289 (perhaps even Fixes, we'll need to see that the test errors go away). NOTE: This patch only changes test/cql-pytest/util.py, which is only used by the cql-pytest test suite. We have multiple other test suites which copied this code, and those test suites might need fixing separately. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11295	2022-08-16 19:11:59 +03:00
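The timeout settings named in the commit above can be collected in one place, as in the sketch below. It only builds a dictionary and does not import the driver, so it runs anywhere; the helper name `cluster_timeout_kwargs` is hypothetical, and passing the result to the Python driver's `Cluster` constructor (shown in the comment) is an assumption about how such a helper would be used, not a copy of test/cql-pytest/util.py.

```python
from typing import Dict


def cluster_timeout_kwargs(connect_timeout: float = 60,
                           control_connection_timeout: float = 60
                           ) -> Dict[str, float]:
    """Connection-level timeouts, in seconds, raised from the driver defaults
    the commit message quotes (5 and 2 seconds respectively)."""
    return {
        "connect_timeout": connect_timeout,
        "control_connection_timeout": control_connection_timeout,
    }


# Assumed usage with the Python CQL driver (not executed here):
#   from cassandra.cluster import Cluster
#   cluster = Cluster(contact_points=[host], **cluster_timeout_kwargs())
print(cluster_timeout_kwargs())
```

The request timeout of 120 seconds mentioned in the commit is a separate, per-request setting and is not covered by this helper.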
Piotr Sarna	cf30d4cbcf	Merge 'Secondary index of collection columns' from Nadav Har'El This pull request introduces global secondary-indexing for non-frozen collections. The intent is to enable such queries: ``` CREATE TABLE test(id int, somemap map<int, int>, somelist list<int>, someset set<int>, PRIMARY KEY(id)); CREATE INDEX ON test(keys(somemap)); CREATE INDEX ON test(values(somemap)); CREATE INDEX ON test(entries(somemap)); CREATE INDEX ON test(values(somelist)); CREATE INDEX ON test(values(someset)); -- index on test(c) is the same as index on (values(c)) CREATE INDEX IF NOT EXISTS ON test(somelist); CREATE INDEX IF NOT EXISTS ON test(someset); CREATE INDEX IF NOT EXISTS ON test(somemap); SELECT * FROM test WHERE someset CONTAINS 7; SELECT * FROM test WHERE somelist CONTAINS 7; SELECT * FROM test WHERE somemap CONTAINS KEY 7; SELECT * FROM test WHERE somemap CONTAINS 7; SELECT * FROM test WHERE somemap[7] = 7; ``` Here we use the familiar materialized views (MVs). Scylla treats all the collections the same way - they're a list of pairs (key, value). In case of sets, the value type is a dummy one. In case of lists, the key type is TIMEUUID. When describing the design, I will forget that there is more than one collection type. Suppose that the columns in the base table were as follows: ``` pkey int, ckey1 int, ckey2 int, somemap map<int, text>, PRIMARY KEY(pkey, ckey1, ckey2) ``` The MV schema is as follows (the names of columns which are not the same as in base might be different). All the columns here form the primary key. ``` -- for index over entries indexed_coll (int, text), idx_token long, pkey int, ckey1 int, ckey2 int -- for index over keys indexed_coll int, idx_token long, pkey int, ckey1 int, ckey2 int -- for index over values indexed_coll text, idx_token long, pkey int, ckey1 int, ckey2 int, coll_keys_for_values_index int ``` The reason for the last additional column is that the values from a collection might not be unique. 
Fixes #2962
Fixes #8745
Fixes #10707

This patch does not implement local secondary indexes for collection columns: Refs #10713.

Closes #10841

* github.com:scylladb/scylladb:
  test/cql-pytest: un-xfail yet another passing collection-indexing test
  secondary index: fix paging in map value indexing
  test/cql-pytest: test for paging with collection values index
  cql, view: rename and explain bytes_with_action
  cql, index: make collection indexing a cluster feature
  test/cql-pytest: failing tests for oversized key values in MV and SI
  cql: fix secondary index "target" when column name has special characters
  cql, index: improve error messages
  cql, index: fix default index name for collection index
  test/cql-pytest: un-xfail several collecting indexing tests
  test/cql-pytest/test_secondary_index: verify that local index on collection fails.
  docs/design-notes/secondary_index: add `VALUES` to index target list
  test/cql-pytest/test_secondary_index: add randomized test for indexes on collections
  cql-pytest/cassandra_tests/.../secondary_index_test: fix error message in test ported from Cassandra
  cql-pytest/cassandra_tests/.../secondary_index_on_map_entries,select_test: test ported from Cassandra is expected to fail, since Scylla assumes that comparison with null doesn't throw error, just evaluates to false. Since it's not a bug, but expected behavior from the perspective of Scylla, we don't mark it as xfail.
  test/boost/secondary_index_test: update for non-frozen indexes on collections
  test/cql-pytest: Uncomment collection indexes tests that should be working now
  cql, index: don't use IS NOT NULL on collection column
  cql3/statements/select_statement: for index on values of collection, don't emit duplicate rows
  cql/expr/expression, index/secondary_index_manager: needs_filtering and index_supports_expression rewrite to accomodate for indexes over collections
  cql3, index: Use entries() indexes on collections for queries
  cql3, index: Use keys() and values() indexes on collections for queries.
  types/tuple: Use std::begin() instead of .begin() in tuple_type_impl::build_value_fragmented
  cql3/statements/index_target: throw exception to signalize that we didn't miss returning from function
  db/view/view.cc: compute view_updates for views over collections
  view info: has_computed_column_depending_on_base_non_primary_key
  column_computation: depends_on_non_primary_key_column
  schema, index/secondary_index_manager: make schema for index-induced mv
  index/secondary_index_manager: extract keys, values, entries types from collection
  cql3/statements/: validate CREATE INDEX for index over a collection
  cql3/statements/create_index_statement,index_target: rewrite index target for collection
  column_computation.hh, schema.cc: collection_column_computation
  column_computation.hh, schema.cc: compute_value interface refactor
  Cql.g, treewide: support cql syntax `INDEX ON table(VALUES(collection))`	2022-08-16 14:18:51 +02:00
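As a rough illustration of the view layout described in the merge message above, the following Python sketch derives the index-view rows for one base row holding a map. It is a toy model and not Scylla's code: the function name `index_view_rows` is hypothetical, and the `idx_token` column is left out for brevity.

```python
def index_view_rows(target, pkey, ckey1, ckey2, somemap):
    """Return the index-view primary keys generated by one base row.

    Hypothetical helper for illustration only; idx_token is omitted.
    """
    base_key = (pkey, ckey1, ckey2)
    if target == "entries":
        # indexed_coll is the (key, value) pair itself
        return [((k, v),) + base_key for k, v in somemap.items()]
    if target == "keys":
        # indexed_coll is the map key
        return [(k,) + base_key for k in somemap]
    if target == "values":
        # indexed_coll is the map value; the map key is appended as the
        # extra coll_keys_for_values_index column so that equal values
        # under different keys still produce distinct view rows
        return [(v,) + base_key + (k,) for k, v in somemap.items()]
    raise ValueError(f"unknown index target: {target}")


rows = index_view_rows("values", 0, 0, 0, {1: "x", 2: "x"})
```

Here the map stores the value "x" under two keys, so the values index gets two rows that differ only in the trailing key column. Without that column the two rows would collide on the same primary key.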
Nadav Har'El	fbb0b66d0c	test/cql-pytest: fix run's "--ssl" option Commit `23acc2e848` broke the "--ssl" option of test/cql-pytest/run (which makes Scylla - and cqlpytest - use SSL-encrypted CQL). The problem was that there was a confusion between the "ssl" module (Python's SSL support) and a new "ssl" variable. A rename and a missing "import" solves the breakage. We never noticed this because Jenkins does not run cql-pytest/run with --ssl (actually, it no longer runs cql-pytest/run at all). It is still a useful option for checking SSL-related problems in Scylla and Seastar. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11292	2022-08-16 12:29:05 +02:00
Nadav Har'El	329068df99	test/cql-pytest: un-xfail yet another passing collection-indexing test After collection indexing has been implemented, yet another test which failed because of #2962 now passes. So remove the "xfail" marker. Refs #2962 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	f6f18b187a	secondary index: fix paging in map value indexing When indexing a map column's values, if the same value appears more than once, the same row will appear in the index more than once. We had code that removed these duplicates, but this deduplication did not work across page boundaries. We had two xfailing tests to demonstrate this bug. In this patch we fix this bug by looking at the page's start and not generating the same row again, thereby getting the same deduplication we had inside pages - now across pages. The previously-xfailing tests now pass, and their xfail tag is removed. I also added another test, for the case where the base table has only partition keys without clustering keys. This second test is important because the code path for the partition-key-only case is different, and the second test exposed a bug in it as well (which is also fixed in this patch). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
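The cross-page deduplication described in the commit above can be modelled with a small Python sketch. This is a simplified illustration under stated assumptions, not the actual Scylla code path: `paged_query` is a hypothetical name, the index entries matching the searched value are assumed to be sorted so that entries of the same base row are adjacent, and the page size counts index entries.

```python
def paged_query(index_entries, page_size):
    """Yield pages of base-row keys, emitting each base row at most once.

    index_entries: list of (base_key, map_key) pairs that match the
    searched value, with entries of the same base row adjacent.
    """
    last_key = None  # base key of the last index entry consumed so far
    for start in range(0, len(index_entries), page_size):
        page = []
        for base_key, _map_key in index_entries[start:start + page_size]:
            # Skip a row already produced, whether earlier in this page
            # or at the end of the previous page.
            if base_key != last_key:
                page.append(base_key)
            last_key = base_key
        yield page


entries = [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "b")]
pages = list(paged_query(entries, page_size=2))
```

Because `last_key` survives from one page to the next, base row 1 is not repeated when its third index entry opens the second page. A side effect in this model is that a page may hold fewer rows than the page size, or none at all.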
Nadav Har'El	dc445b9a73	test/cql-pytest: test for paging with collection values index If a map has several keys with the same value, then the "values(m)" index must remember all of them as matching the same row - because later we may remove one of these keys from the map but the row would still need to match the value because of the remaining keys. We already had a test (test_index_map_values) checking that, although the same row appears in the index more than once for this value, a search for this value returns the row only once. Under the hood, Scylla does find the same value multiple times, but then eliminates the duplicate matched row and returns it only once. But there is a complication: this de-duplication does not easily span paging. So in this patch we add a test that checks that paging does not cause the same row to be returned more than once. Unfortunately, this test currently fails on Scylla, so it is marked "xfail". It passes on Cassandra. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	aa86f808a6	test/cql-pytest: failing tests for oversized key values in MV and SI In issue #9013, we noticed that if a value larger than 64 KB is indexed, the write fails in a bad way, and we fixed it. But the test we wrote when fixing that issue already suggested that something was still wrong: Cassandra failed the write cleanly, with an InvalidRequest, while Scylla failed with a mysterious WriteFailure (with a relevant error message only in the log). This patch adds several xfailing tests which demonstrate what's still wrong. This is also summarized in issue #8627: 1. A write of an oversized value to an indexed column returns the wrong error message. 2. The same problem also exists when indexing a collection, and the indexed key or value is oversized. 3. The situation is even less pleasant when adding an index to a table with pre-existing data and an oversized value. In this case, the view building will fail on the bad row, and never finish. 4. We have exactly the same bugs not just with indexes but also with materialized views. Interestingly, Cassandra has similar bugs in materialized views as well (but not in the secondary index case, where Cassandra does behave as expected). Refs #8627. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	2c244c6e09	cql: fix secondary index "target" when column name has special characters Unfortunately, we encode the "target" of a secondary index in one of three ways: 1. It can be just a column name. 2. It can be a string like keys(colname) - for the new type of collection indexes introduced in this series. 3. It can be a JSON map ({ ... }). This form is used for local indexes. The code parsing this target - target_parser::parse() - must not confuse these different formats. Before this patch, if the column name contains special characters like braces or parentheses (this is allowed in CQL syntax, via quoting), we can confuse cases 1, 2, and 3: A column named "keys(colname)" will be confused with case 2, and a column named "{123}" will be confused with case 3. This problem can break indexing of some specially-crafted column names - as reproduced by test_secondary_index.py::test_index_quoted_names. The solution adopted in this patch is that the column name in case 1 should be escaped so that it cannot possibly be confused with either case 2 or case 3. The way we chose is to convert the column name to CQL (with column_definition::as_cql_name()). In other words, if the column name contains non-alphanumeric characters, it is wrapped in quotes and also quotes are doubled, as in CQL. The result of this can't be confused with case 2 or 3, neither of which may begin with a quote. This escaping is not the minimal we could have done, but incidentally it is exactly what Cassandra does, so I used it as well. This change is mostly backward compatible: Already-existing indexes will still have unescaped column names stored for their "target" string, and the unescaping code will see they are not wrapped in quotes, and not change them. Backward compatibility will only fail on existing indexes on columns whose names begin and end with quote characters - but this case is extremely unlikely. 
This patch illustrates how un-ideal our index "target" encoding is, but isn't what made it un-ideal. We should not have used three different formats for the index target - the third representation (JSON) should have sufficed. However, the two other representations are identical to Cassandra's, so using them when we can has its compatibility advantages. The patch makes test_secondary_index.py::test_index_quoted_names pass. Fixes #10707. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
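The escaping rule described in the commit above can be sketched in Python as follows. This is an illustration only and not the code of column_definition::as_cql_name(): the function names are hypothetical, and the rule for which names may stay unquoted (lowercase letter first, then lowercase letters, digits or underscores) is an assumption based on ordinary CQL identifier syntax.

```python
import re

# Assumption: names of this shape need no quoting in CQL.
_UNQUOTED = re.compile(r"[a-z][a-z0-9_]*")


def escape_target_column(name: str) -> str:
    """Encode a plain column name so it cannot look like case 2 or 3."""
    if _UNQUOTED.fullmatch(name):
        return name
    # Wrap in double quotes and double any embedded quote, as in CQL.
    return '"' + name.replace('"', '""') + '"'


def unescape_target_column(target: str) -> str:
    """Reverse the escaping; targets stored unquoted are left unchanged."""
    if len(target) >= 2 and target.startswith('"') and target.endswith('"'):
        return target[1:-1].replace('""', '"')
    return target


for col in ["v", "keys(colname)", "{123}", 'a"b']:
    assert unescape_target_column(escape_target_column(col)) == col
```

An escaped name either is a plain identifier or begins with a quote, so it can start with neither `keys(` nor `{`. The unchanged-if-unquoted branch corresponds to the backward-compatibility behaviour the message describes for targets stored before the patch.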
Nadav Har'El	56204a3794	cql, index: improve error messages Before this patch, trying to create an index on entries(x) where x is not a map results in an error message: Cannot create index on index_keys_and_values of column x The string "index_keys_and_values" is strange - Cassandra prints the easier to understand string "entries()" - which better corresponds to what the user actually did. It turns out that this string "index_keys_and_values" comes from an elaborate set of variables and functions spanning multiple source files, used to convert our internal target_type variable into such a string. But although this code was called "index_option" and sounded very important, it was actually used just for one thing - error messages! So in this patch we drop the entire "index_option" abstraction, replacing it by a static trivial function defined exactly where it's used (create_index_statement.cc), which prints a target type. While at it, we print "entries()" instead of "index_keys_and_values" ;-) After this patch, the test_secondary_index.py::test_index_collection_wrong_type finally passes (the previous patch fixed the default table names it assumes, and this patch fixes the expected error messages), so its "xfail" tag is removed. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	84461f1827	cql, index: fix default index name for collection index When creating an index "CREATE INDEX ON tbl(keys(m))", the default name of the index should be tbl_m_idx - with just "m". The current code incorrectly used the default name tbl_m_keys_idx, so this patch adds a test (which passes on Cassandra, and after this patch also on Scylla) and fixes the default name. It turns out that the default index name was based on a mysterious index_target::as_string(), which printed the target "keys(m)" as "m_keys" without explaining why it was so. This method was actually used only in three places, and all of them wanted just the column name, without the "_keys" suffix! So in this patch we rename the mysterious as_string() to column_name(), and use this function instead. Now that the default index name uses column_name() and gets just column_name(), the correct default index name is generated, and the test passes. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
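The naming rule from the commit above can be illustrated with a short Python sketch. The helper names are hypothetical and the parsing is deliberately naive (a single regular expression over the target string); only the `<table>_<column>_idx` pattern and the dropping of the keys()/values()/entries() wrapper come from the commit message.

```python
import re

# Matches a target of the form keys(col), values(col) or entries(col).
_WRAPPED = re.compile(r"(?:keys|values|entries)\((.*)\)", re.IGNORECASE)


def column_name(target: str) -> str:
    """Return just the column from an index target such as keys(m)."""
    target = target.strip()
    match = _WRAPPED.fullmatch(target)
    return match.group(1).strip() if match else target


def default_index_name(table: str, target: str) -> str:
    """Build the default index name from the table and the bare column."""
    return f"{table}_{column_name(target)}_idx"


name = default_index_name("tbl", "keys(m)")
```

With this rule, indexes on keys(m), values(m) and plain m all get the same default name, tbl_m_idx, rather than a name with a "_keys" suffix.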
Nadav Har'El	94ba03a4d6	test/cql-pytest: un-xfail several collecting indexing tests After the previous patches implemented collection indexing, several tests in test/cql-pytest/test_secondary_index.py that were marked with "xfail" started to pass - so here we remove the xfail. Only three collection indexing tests continue to xfail: test_secondary_index.py::test_index_collection_wrong_type test_secondary_index.py::test_index_quoted_names (#10707) test_secondary_index.py::test_local_secondary_index_on_collection (#10713) Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Michał Radwański	2690ecd65d	test/cql-pytest/test_secondary_index: verify that local index on collection fails. Collection indexing is being tracked by #2962. Global secondary index over collection is enabled by #10123. Leave this test to track this behaviour. Related issue: #10713	2022-08-14 10:29:52 +03:00


352 Commits