scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Author	SHA1	Message	Date
Nadav Har'El	ed34f3b5e4	cql-pytest: translate Cassandra's test for LWT with collections This is a translation of Cassandra's CQL unit test source file validation/operations/InsertUpdateIfConditionTest.java into our cql-pytest framework. This test file checks various LWT conditional updates which involve collections or UDTs (there is a separate test file for LWT conditional updates which do not involve collections, which I haven't translated yet). The tests reproduce one known bug: Refs #5855: lwt: comparing NULL collection with empty value in IF condition yields incorrect results And also uncovered three previously-unknown bugs: Refs #13586: Add support for CONTAINS and CONTAINS KEY in LWT expressions Refs #13624: Add support for UDT subfields in LWT expression Refs #13657: Misformatted printout of column name in LWT error message Beyond those bona-fide bugs, this test also demonstrates several places where we intentionally deviated from Cassandra's behavior, forcing me to comment out several checks. These deviations are known, and intentional, but some of them are undocumented and it's worth listing here the ones re-discovered by this test: 1. On a successful conditional write, Cassandra returns just True, Scylla also returns the old contents of the row. This difference is officially documented in docs/kb/lwt-differences.rst. 2. Scylla allows the test "l = [null]" or "s = {null}" with this weird null element (the result is false), whereas Cassandra prints an error. 3. Scylla allows "l[null]" or "m[null]" (resulting in null), Cassandra prints an error. 4. Scylla allows a negative list index, "l[-2]", resulting in null. Cassandra prints an error in this case. 5. Cassandra allows in "IF v IN (?, ?)" to bind individual values to UNSET_VALUE and skips them, Scylla treats this as an error. Refs #13659. 6. Scylla allows "IN null" (the condition just fails), Cassandra prints an error in this case. 
Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13663	2023-05-02 11:53:58 +03:00
Jan Ciolek	be8ef63bf5	cql3: remove expr::token Let's remove expr::token and replace all of its functionality with expr::function_call. expr::token is a struct whose job is to represent a partition key token. The idea is that when the user types in `token(p1, p2) < 1234`, this will be internally represented as an expression which uses expr::token to represent the `token(p1, p2)` part. The situation with expr::token is a bit complicated. On one hand side it's supposed to represent the partition token, but sometimes it's also assumed that it can represent a generic call to the token() function, for example `token(1, 2, 3)` could be a function_call, but it could also be expr::token. The query planning code assumes that each occurence of expr::token represents the partition token without checking the arguments. Because of this allowing `token(1, 2, 3)` to be represented as expr::token is dangerous - the query planning might think that it is `token(p1, p2, p3)` and plan the query based on this, which would be wrong. Currently expr::token is created only in one specific case. When the parser detects that the user typed in a restriction which has a call to `token` on the LHS it generates expr::token. In all other cases it generates an `expr::function_call`. Even when the `function_call` represents a valid partition token, it stays a `function_call`. During preparation there is no check to see if a `function_call` to `token` could be turned into `expr::token`. This is a bit inconsistent - sometimes `token(p1, p2, p3)` is represented as `expr::token` and the query planner handles that, but sometimes it might be represented as `function_call`, which the query planner doesn't handle. There is also a problem because there's a lot of duplication between a `function_call` and `expr::token`. All of the evaluation and preparation is the same for `expr::token` as it's for a `function_call` to the token function. 
Currently it's impossible to evaluate `expr::token` and preparation has some flaws, but implementing it would basically consist of copy-pasting the corresponding code from token `function_call`. One more aspect is multi-table queries. With `expr::token` we turn a call to the `token()` function into a struct that is schema-specific. What happens when a single expression is used to make queries to multiple tables? The schema is different, so something that is represented as `expr::token` for one schema would be represented as `function_call` in the context of a different schema. Translating expressions to different tables would require careful manipulation to convert `expr::token` to `function_call` and vice versa. This could cause trouble for index queries. Overall I think it would be best to remove expr::token. Although having a clear marker for the partition token is sometimes nice for query planning, in my opinion the pros are outweighed by the cons. I'm a big fan of having a single way to represent things; having two separate representations of the same thing without clear boundaries between them causes trouble. Instead of having expr::token and function_call we can just have the function_call and check if it represents a partition token when needed. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-04-29 13:11:31 +02:00
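The idea of keeping only `function_call` and checking on demand whether it is the partition token can be sketched roughly as follows. This is a minimal Python model, not the C++ code in cql3: the `ColumnRef`, `Constant`, and `FunctionCall` classes and the `is_partition_token()` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class ColumnRef:
    """Hypothetical stand-in for a column reference in an expression."""
    name: str


@dataclass(frozen=True)
class Constant:
    """Hypothetical stand-in for a literal value in an expression."""
    value: int


@dataclass(frozen=True)
class FunctionCall:
    """Hypothetical stand-in for a generic function call expression."""
    name: str
    args: Tuple[Union[ColumnRef, Constant], ...]


def is_partition_token(call: FunctionCall, partition_key: Sequence[str]) -> bool:
    """Return True only if `call` is token() applied to exactly the
    partition key columns of the given schema, in schema order."""
    if call.name != "token":
        return False
    if len(call.args) != len(partition_key):
        return False
    return all(
        isinstance(arg, ColumnRef) and arg.name == pk
        for arg, pk in zip(call.args, partition_key)
    )
```

Because the check takes the partition key as a parameter, the same expression can be tested against different schemas without converting between two representations.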
Kefu Chai	642854f36f	test: s/os.P_NOWAIT/os.WNOHANG/ `os.P_NOWAIT` is meant to be used as the mode in spawn calls, while `os.WNOHANG` is meant to be passed in the options parameter of wait calls. Fortunately, `P_NOWAIT` is defined as "1" in CPython, and `os.WNOHANG` is defined as "1" in the Linux kernel. That's why the existing implementation works, but we should not rely on this coincidence. So, in this change, `os.P_NOWAIT` is replaced with `os.WNOHANG` for correctness and for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13646	2023-04-24 11:42:34 +03:00
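The distinction between the two constants can be illustrated with a small sketch that uses each one where it belongs: `os.P_NOWAIT` as the spawn mode and `os.WNOHANG` as the wait option. The helper name `spawn_and_poll()` and its parameters are made up for this example and are not part of the test suite; the sketch assumes a POSIX system.

```python
import os
import sys
import time


def spawn_and_poll(exit_code=3, timeout=30.0):
    """Spawn a child without waiting for it, then poll it without blocking."""
    # P_NOWAIT is a *spawn mode*: return the child's pid immediately.
    pid = os.spawnv(
        os.P_NOWAIT,
        sys.executable,
        [sys.executable, "-c", f"import sys; sys.exit({exit_code})"],
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # WNOHANG is a *wait option*: return (0, 0) if the child is still running.
        reaped_pid, status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            return os.WEXITSTATUS(status)
        time.sleep(0.01)
    raise TimeoutError(f"child {pid} did not exit within {timeout} seconds")
```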
Nadav Har'El	9c3907bb3c	test/cql-pytest: reproducers for incorrect AVG of "decimal" type This patch contains tests reproducing issue #13601 and the corresponding Cassandra issue CASSANDRA-18470. These issues are about what the AVG aggregation does for arbitrary-precision "decimal" numbers - the tests we add here show examples where the current behavior doesn't make sense: The problem is that "decimal" has arbitrary precision - so, should an average of 1/3 be returned as 0.3 or 0.33333333333333333? This is not specified, so Scylla (and Cassandra) decided to pick the result precision based on the input precision. In particular, the average of 1 and 2 is returned as 2 (zero digits after the decimal point, like in the inputs) instead of the expected 1.5. Arguably this isn't useful behavior. The test adds a second test which fails on Cassandra, but does pass on Scylla: Cassandra returns as the average of 1, 2, 2, 3 the integer 1 whereas the correct average is 2 (and Scylla returns it correctly). The reason why this bug is even worse on Cassandra is that Scylla's AVG only loses precision when dividing the sum and count, but Cassandra tries to maintain only the average, and loses precision at every step. Refs #13601 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13603	2023-04-21 08:32:30 +03:00
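The precision loss described above can be reproduced with Python's `decimal` module. This is only a sketch of the "result scale follows input scale" rule: the helper name `avg_at_input_scale()` is hypothetical, and the rounding mode (half-even) is an assumption chosen because it reproduces the result quoted in the commit message, not something the message specifies.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def avg_at_input_scale(values):
    """Average decimals, then round to the largest scale seen in the inputs."""
    values = [Decimal(v) for v in values]
    # Scale = number of digits after the decimal point (never negative here).
    scale = max(max(-v.as_tuple().exponent for v in values), 0)
    exact = sum(values) / len(values)
    # Assumption: half-even rounding; the real implementations may differ.
    return exact.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_EVEN)
```

With inputs `"1"` and `"2"` (scale 0) the exact average 1.5 is rounded to 2, while inputs `"1.0"` and `"2.0"` (scale 1) keep the expected 1.5.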
Botond Dénes	66ee73641e	test/cql-pytest/nodetool.py: no_autocompaction_context: use the correct API This `with` context is supposed to disable, then re-enable autocompaction for the given keyspaces, but it used the wrong API for it, it used the column_family/autocompaction API, which operates on column families, not keyspaces. This oversight led to a silent failure because the code didn't check the result of the request. Both are fixed in this patch: * switch to use `storage_service/auto_compaction/{keyspace}` endpoint * check the result of the API calls and report errors as exceptions Fixes: #13553 Closes #13568	2023-04-20 16:21:16 +03:00
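A rough sketch of such a context manager is shown below. Only the endpoint path `storage_service/auto_compaction/{keyspace}` comes from the commit message. The injected `send(method, path)` callable, the use of DELETE to disable and POST to re-enable, and the expectation of status 200 are assumptions made so the example stays self-contained; the real helper lives in test/cql-pytest/nodetool.py.

```python
from contextlib import contextmanager


@contextmanager
def no_autocompaction(send, keyspaces):
    """Disable autocompaction for `keyspaces`, re-enabling it on exit.

    `send(method, path)` is a hypothetical callable that performs the REST
    request and returns the HTTP status code.
    """
    def checked(method, keyspace):
        path = f"storage_service/auto_compaction/{keyspace}"
        status = send(method, path)
        # Report failures instead of ignoring them silently.
        if status != 200:
            raise RuntimeError(f"{method} {path} failed with status {status}")

    for keyspace in keyspaces:
        checked("DELETE", keyspace)  # assumption: DELETE disables
    try:
        yield
    finally:
        for keyspace in keyspaces:
            checked("POST", keyspace)  # assumption: POST re-enables
```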
Nadav Har'El	81e0f5b581	cql3: allow SUM() aggregation to result in a NaN When floating-point data contains +Inf and -Inf, the sum is NaN. Our SUM() aggregation calculated this sum correctly, but then instead of returning it, complained that the sum overflowed by narrowing. This was a false positive: The sum() finalizer wanted to test that no precision was lost when casting the accumulator to the result type, so checked that the result before and after the cast are the same. But specifically for NaN, it is never equal to anything - not even to itself. This check is wrong for floating point, but moreover - isn't even necessary when the two types (accumulator type and result type) are identical so in this patch we skip it in this case. Note that in the current code, a different accumulator and result type is only used in the case of integer types; When accumulating floating point sums, the same type is used, so the broken check will be avoided. The test for this issue starts to pass with this patch, so the xfail tag is removed. Fixes #13551 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-04-19 09:31:41 +03:00
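The false positive can be demonstrated in a few lines of Python. The function names below are invented for the illustration and the sketch only models the logic described in the message (skip the round-trip comparison when accumulator and result types are the same); it is not the C++ finalizer itself.

```python
import math


def naive_roundtrip_ok(acc):
    """The broken check: compare the value before and after the cast."""
    return float(acc) == acc


def sum_with_finalizer(values, same_type=True):
    """Sum floats; only run the narrowing check when the types differ."""
    acc = 0.0
    for v in values:
        acc += v
    result = float(acc)
    if not same_type and result != acc:
        raise OverflowError("sum overflowed by narrowing")
    return result
```

Summing `+inf` and `-inf` gives NaN, and since NaN compares unequal to itself, `naive_roundtrip_ok()` reports a failure even though no precision was lost.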
Nadav Har'El	78555ba7f1	test/cql-pytest: add tests for data casts and inf in sums This patch adds tests to reproduce issue #13551. The issue, discovered by a dtest (cql_cast_test.py), claimed that either cast() or sum(cast()) from varint type broke. So we add two tests in cql-pytest: 1. A new test file, test_cast_data.py, for testing data casts (a CAST (...) as ... in a SELECT), starting with testing casts from varint to other types. The test uncovers a lot of interesting cases (it is heavily commented to explain these cases) but nothing there is wrong and all tests pass on Scylla. 2. An xfailing test for sum() aggregate of +Inf and -Inf. It turns out that this caused #13551. In Cassandra and older Scylla, the sum returned a NaN. In Scylla today, it generates a misleading error message. As usual, the tests were run on both Cassandra (4.1.1) and Scylla. Refs #13551. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-04-18 13:38:42 +03:00
Tomasz Grabiec	952b455310	Merge ' tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes scylla-sstable currently has two ways to obtain the schema: * via a `schema.cql` file. * load schema definition from memory (only works for system tables). This meant that for most cases it was necessary to export the schema into a CQL format and write it to a file. This is very flexible. The sstable can be inspected anywhere, it doesn't have to be on the same host where it originates form. Yet in many cases the sstable is inspected on the same host where it originates from. In this cases, the schema is readily available in the schema tables on disk and it is plain annoying to have to export it into a file, just to quickly inspect an sstable file. This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable, but if that fails, the tool check the usual locations of the scylla data dir, the scylla confguration file and even looks for environment variables that tell the location of these. The old methods are still supported. In fact, if a schema.cql is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override. If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong. A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes. This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag, the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. 
I don't think this will break anybody's workflow as this tools is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change. Example: ``` $ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db {"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}} ``` As seen above, subdirectories like qurantine, staging etc are also supported. Fixes: https://github.com/scylladb/scylladb/issues/10126 Closes #13448 * github.com:scylladb/scylladb: test/cql-pytest: test_tools.py: add tests for schema loading test/cql-pytest: add no_autocompaction_context docs: scylla-sstable.rst: remove accidentally added copy-pasta docs: scylla-sstable.rst: remove paragraph with schema limitations docs: scylla-sstable.rst: update schema section test/cql-pytest: nodetool.py: add flush_keyspace() tools/scylla-sstable: reform schema loading mechanism tools/schema_loader: add load_schema_from_schema_tables() db/schema_tables: expose types schema	2023-04-14 16:46:26 +02:00
Kefu Chai	c580e30ec7	cql3: expr: return more accurate error message for invalidated token() args Before this change, we just printed the addresses of the elements in `column_defs` if the arguments passed to the `token()` function were not valid. This is not helpful from the user's perspective, as the user is more interested in the values. Also, we can print a more accurate error message for each kind of error. In this change, following Cassandra 4.1's behavior, three cases are identified, and a corresponding error is returned for each: * duplicated partition keys * wrong order of partition keys * missing keys. If the partition key order is wrong, instead of printing the keys specified by the user, the correct order is printed in the error message to help the user correct the `token()` call. For better performance, the checks are performed only if the keys do not match, based on the assumption that the error-handling path is unlikely to be executed. Tests are added accordingly; they were also run against Cassandra 4.1.1. Fixes #13468 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13470	2023-04-14 11:46:18 +03:00
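The three error cases can be sketched as a small classifier. This is a Python illustration only: the function name `check_token_args()` and the message wording are hypothetical and do not reproduce the actual error strings of Scylla or Cassandra. As in the commit, the detailed checks run only after the cheap equality test fails.

```python
def check_token_args(partition_key, args):
    """Return None if `args` are valid token() arguments, else an error string."""
    partition_key = list(partition_key)
    args = list(args)
    # Fast path: the common, valid case needs no further analysis.
    if args == partition_key:
        return None
    if len(set(args)) != len(args):
        return "token() contains duplicated partition key components"
    missing = [column for column in partition_key if column not in args]
    if missing:
        return "token() is missing partition key components: " + ", ".join(missing)
    if sorted(args) == sorted(partition_key):
        # Print the correct order rather than echoing the user's order.
        return "token() arguments must be in partition key order: " + ", ".join(partition_key)
    # Not one of the three cases from the commit message, e.g. an extra column.
    return "token() has unexpected arguments"
```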
Raphael S. Carvalho	47b2a0a1f6	data_directory: Describe storage options of a keyspace Description of storage options is important for S3, as one needs to know if underlying storage is either local or remote, and if the latter, details about it. This relies on server-side desc statement. $ ./bin/cqlsh.py -e "describe keyspace1;" CREATE KEYSPACE keyspace1 WITH replication = { ... } AND storage = {'type': 'S3', 'bucket': 'sstables', 'endpoint': '127.0.0.1:9000'} AND durable_writes = true; Fixes #13507. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13510	2023-04-14 11:34:35 +03:00
Botond Dénes	1440efa042	test/cql-pytest: test_tools.py: add tests for schema loading A set of comprehensive tests covering all the supported ways of providing the schema to scylla-sstable, either explicitly or implicitly (auto-detect).	2023-04-12 03:14:43 -04:00
Botond Dénes	76a7d3448f	test/cql-pytest: add no_autocompaction_context	2023-04-12 03:14:43 -04:00
Botond Dénes	222f624757	test/cql-pytest: nodetool.py: add flush_keyspace() It would have been better if `flush()` could have been called with a keyspace and optional table param, but changing it now is too much churn, so we add a dedicated method to flush a keyspace instead.	2023-04-12 03:14:43 -04:00
Nadav Har'El	79114c5030	cql-pytest: translate Cassandra's tests for DELETE operations This is a translation of Cassandra's CQL unit test source file validation/operations/DeleteTest.java into our cql-pytest framework. There are 51 tests, and they did not reproduce any previously-unknown bug, but did provide additional reproducers for three known issues: Refs #4244 Add support for mixing token, multi- and single-column restrictions Refs #12474 DELETE prints misleading error message suggesting ALLOW FILTERING would work Refs #13250 one-element multi-column restriction should be handled like a single-column restriction Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13436	2023-04-11 09:10:11 +03:00
Pavel Emelyanov	ced8a07d09	cql-pytest: Add option to run scylla over stable directory The facilities in the run.py script allow launching scylla over a temporary directory, waiting for it to come alive, killing it, etc. The limitation of those is that the work-dir created for scylla is tightly coupled with its pid. The object-storage test in the next patches will need to check that the sstables are preserved on scylla restart, and this hard binding of workdir to pid won't work. This patch generalizes the scylla run/abort helpers to accept an external directory to work on and adds a call to restart the scylla process over an existing directory. And one small related change here: the log file is opened in O_APPEND mode so that the restarted scylla process continues writing into the old file. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Botond Dénes	54c0a387a2	Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes" This reverts commit `32fff17e19`, reversing changes made to `164afe14ad`. This series proved to be problematic, the new test introduced by it failing quite often. Revert it until the problems are tracked down and fixed.	2023-04-03 13:54:00 +03:00
Nadav Har'El	32fff17e19	Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes `scylla-sstable` currently has two ways to obtain the schema: * via a `schema.cql` file. * load schema definition from memory (only works for system tables). This meant that for most cases it was necessary to export the schema into a `CQL` format and write it to a file. This is very flexible. The sstable can be inspected anywhere, it doesn't have to be on the same host where it originates form. Yet in many cases the sstable is inspected on the same host where it originates from. In this cases, the schema is readily available in the schema tables on disk and it is plain annoying to have to export it into a file, just to quickly inspect an sstable file. This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable, but if that fails, the tool check the usual locations of the scylla data dir, the scylla confguration file and even looks for environment variables that tell the location of these. The old methods are still supported. In fact, if a `schema.cql` is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override. If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong. A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes. This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag, the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. 
I don't think this will break anybody's workflow as this tools is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change. Example: ``` $ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db {"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}} ``` As seen above, subdirectories like `qurantine`, `staging` etc are also supported. Fixes: https://github.com/scylladb/scylladb/issues/10126 Closes #13075 * github.com:scylladb/scylladb: docs/operating-scylla/admin-tools: scylla-sstable.rst: update schema section test/cql-pytest: test_tools.py: add test for schema loading test/cql-pytest: nodetool.py: add flush_keyspace() tools/scylla-sstable: reform schema loading mechanism tools/schema_loader: add load_schema_from_schema_tables() db/schema_tables: expose types schema	2023-03-30 09:35:59 +03:00
Kefu Chai	33f4012eeb	test: cql-pytest: test_describe: clamp bloom filter's fp rate Before this change, we used `round(random.random(), 5)` for the value of the `bloom_filter_fp_chance` config option. There is a chance that this expression returns a number lower than or equal to 6.71e-05, but this option has a minimum, which is defined by `utils::bloom_calculations::probs`, and the minimum false positive rate is 6.71e-05. We are observing test failures where we use 0 for the option, and scylla rightly rejects it with the error message ``` bloom_filter_fp_chance must be larger than 6.71e-05 and less than or equal to 1.0 (got 0) ```. So, in this change, to address the test failure, we always use a number greater than or equal to a value slightly above the minimum, to ensure that the randomly picked number is in the range of supported false positive rates. Fixes #13313 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13314	2023-03-26 19:41:22 +03:00
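The clamping can be sketched as follows. The minimum 6.71e-05 and the `round(random.random(), 5)` expression come from the commit message. The helper name `random_fp_chance()`, the `rng` parameter, and the concrete lower bound of 7e-05 are assumptions picked for the example and may differ from what the test actually uses.

```python
import random

# Minimum from the error message: the value must be strictly larger than this.
MIN_FP_CHANCE = 6.71e-05
# Assumption: a lower bound slightly above the minimum, representable in 5 decimals.
LOWER_BOUND = 7e-05


def random_fp_chance(rng=random):
    """Pick a random bloom_filter_fp_chance inside the supported range."""
    return max(LOWER_BOUND, round(rng.random(), 5))
```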
Botond Dénes	bc9341b84a	test/cql-pytest: test_tools.py: add test for schema loading A comprehensive test covering all the supported ways of providing the schema to scylla-sstable, either explicitly or implicitly (auto-detect).	2023-03-24 11:41:40 -04:00
Botond Dénes	afdfe34ca7	test/cql-pytest: nodetool.py: add flush_keyspace() It would have been better if `flush()` could have been called with a keyspace and optional table param, but changing it now is too much churn, so we add a dedicated method to flush a keyspace instead.	2023-03-24 11:41:40 -04:00
Nadav Har'El	4fdcee8415	test/alternator: increase CQL connection timeout This patch increases the connection timeout in the get_cql_cluster() function in test/cql-pytest/run.py. This function is used to test that Scylla came up, and also test/alternator/run uses it to set up the authentication - which can only be done through CQL. The Python driver has 2-second and 5-second default timeouts that should have been more than enough for everybody (TM), but in #13239 we saw that in one case it apparently wasn't enough. So to be extra safe, let's increase the default connection-related timeouts to 60 seconds. Note this change only affects the Scylla boot in the test/*/run scripts, and it does not affect the actual tests - those have different code to connect to Scylla (see cql_session() in test/cql-pytest/util.py), and we already increased the timeouts there in #11289. Fixes #13239 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13291	2023-03-23 16:03:20 +02:00
Nadav Har'El	b5e61e1b83	test/cql-pytest, lwt: test for detection of contradicting batches Cassandra detects when a batch has both an IF EXISTS and IF NOT EXISTS on the same row, and complains this is not a useful request (after all, it can never succeed, because the batch can only succeed if both conditions are true, and that can't be if one checks IF EXISTS and the other IF NOT EXISTS). This patch adds a test, test_lwt_with_batch_conflict_1, which checks that this case results in an error. It passes on Cassandra, but xfails on Scylla which doesn't report an error in this case. A second test, test_lwt_with_batch_conflict_2, shows that the detection of the EXISTS / NOT EXISTS conflict is special, and other conflicts such as having both "r=1" and "r=2" for the same row, are NOT detected by Cassandra. Refs #13011. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13270	2023-03-23 13:35:21 +02:00
Nadav Har'El	2038388268	cql-pytest: translate Cassandra's tests for multi-column relations This is a translation of Cassandra's CQL unit test source file validation/operations/SelectMultiColumnRelationTest.java into our cql-pytest framework. The tests reproduce four already-known Scylla bugs and three new bugs. All tests pass on Cassandra. Because of these bugs 9 of the 22 tests are marked xfail, and one is marked skip (it crashes Scylla). Already known issues: Refs #64: CQL Multi column restrictions are allowed only on a clustering key prefix Refs #4178: Not covered corner case for key prefix optimization in filtering Refs #4244: Add support for mixing token, multi- and single-column restrictions Refs #8627: Cleanly reject updates with indexed values where value > 64k New issue discovered by these tests: Refs #13217: Internal server error when null is used in multi-column relation Refs #13241: Multi-column IN restriction with tuples of different lengths crashes Scylla Refs #13250: One-element multi-column restriction should be handled like a single-column restriction Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13265	2023-03-22 09:54:32 +02:00
Nadav Har'El	511308bccf	test/cql-pytest: tests for single-element multi-column restrictions It turns out that Cassandra handles a restriction like `(c2) = (1)` just like `c2 = 1`, and is not limited like multi-column restrictions. In particular, this query works despite missing "c1", and may also use an index if c2 is indexed. But currently in Scylla, `(c2) = (1)` is handled like a multi-column restriction, so complains if c2 is not the first clustering key column, and cannot use an index. This patch adds several tests demonstrating this difference between Scylla and Cassandra (#13250). The xfailing tests pass on Cassandra but fail on Scylla. Refs #13250 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13252	2023-03-21 07:56:24 +02:00
Nadav Har'El	8b0822be77	test/cql-pytest: reproducer for bug crashing Scylla on mismatched tuple This patch adds a reproducing test for issue #13241, where attempting a SELECT restriction (b,c,d) IN ((1,2)) - where the tuple is shorter than needed - crashes Scylla (on segmentation fault) instead of generating a clean error as it should (and as done on Cassandra). The test also demonstrates that if the tuple is longer than needed (instead of shorter), the behavior is correct, and it is also correct if "=" is used instead of IN. Only the combination of IN and too-short tuple seems to be broken - but broken in a bad way (can be used to crash Scylla). Because the test crashes Scylla when it fails, it is marked "skip". Refs #13241 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13244	2023-03-20 11:13:02 +02:00
Nadav Har'El	c5195e0acd	cql-pytest: add reproducers for GROUP BY bugs The translated Cassandra unit tests in cassandra_tests/validation/operations/ reproduced three bugs in GROUP BY's interaction with LIMIT and PER PARTITION LIMIT - issue #5361, #5362 and #5363. Unfortunately, those test functions are very long, and each test fails on all of these issues and a few more, making it difficult to use these tests to verify when those tests have been fixed. In other words, ideally a patch for issue 5361 should un-xfail some reproducing test for this issue - but all the existing tests will continue to fail after fixing 5361, because of other remaining bugs. So in this patch, I created a new test file test_group_by.py with my own tests for the GROUP BY feature. I tried to explore the different capabilities of the GROUP BY feature, its different success and error paths, and how GROUP BY interacts with LIMIT and PER PARTITION LIMIT. As usual, I created many small test functions and not one huge test function, and as a result we now have 5 xfailing tests which each reproduces one bug and when the bug is fixed, it will start to pass. All tests added here pass on Cassandra. Refs #5361 Refs #5362 Refs #5363 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13136	2023-03-16 10:39:05 +02:00
Nadav Har'El	543d4ed726	cql-pytest: translate Cassandra's tests for GROUP BY This is a translation of Cassandra's CQL unit test source file validation/operations/SelectGroupByTest.java into our cql-pytest framework. This test file contains only 8 separate test functions, but each of them is very long checking hundreds of different combinations of GROUP BY with other things like LIMIT, ORDER BY, etc., so 6 out of the 7 tests fail on Scylla on one of the bugs listed below - most of the tests actually fail in multiple places due to multiple bugs. All tests pass on Cassandra. The tests reproduce six already-known Scylla issues and one new issue: Already known issues: Refs #2060: Allow mixing token and partition key restrictions Refs #5361: LIMIT doesn't work when using GROUP BY Refs #5362: LIMIT is not doing it right when using GROUP BY Refs #5363: PER PARTITION LIMIT doesn't work right when using GROUP BY Refs #12477: Combination of COUNT with GROUP BY is different from Cassandra in case of no matches Refs #12479: SELECT DISTINCT should refuse GROUP BY with clustering column A new issue discovered by these tests: Refs #13109: Incorrect sort order when combining IN, GROUP BY and ORDER BY Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13126	2023-03-15 12:40:24 +02:00
Nadav Har'El	e72b85e82c	Merge 'cql-pytest/lwt_test: test LWT UPDATE when partition/clustering ranges are empty' from Jan Ciołek Adds two test cases which test what happens when we perform an LWT UPDATE, but the partition/clustering key has 0 possible values. This can happen e.g when a column is supposed to be equal to two different values (`c = 0 AND c = 1`). Empty partition ranges work properly, empty clustering range currently causes a crash (#13129). I added tests for both of these cases. Closes #13130 * github.com:scylladb/scylladb: cql-pytest/test_lwt: test LWT update with empty clustering range cql-pytest/test_lwt: test LWT update with empty partition range	2023-03-12 15:11:33 +02:00
Nadav Har'El	53c8c43d8a	Merge 'cql3: improve support for C-style parenthesis casts' from Jan Ciołek CQL supports type casting using C-style casts. For example it's possible to do: `blob_column = (blob)funcReturningInt()` This functionality is pretty limited: we only allow such casts between types that have a compatible binary representation. Compatible means that the bytes will stay unchanged after the conversion. This means that it's legal to cast an int to blob (int is just a 4 byte blob), but it's illegal to cast a bigint to int (change 8 bytes -> 4 bytes). This simplifies things: to cast, we can just reinterpret the value as the other type. Another use of C-style casts is as type hints. Sometimes it's impossible to infer the exact type of an expression from the context. In such cases the type can be specified by casting the expression to this type. For example: `overloadedFunction((int)?)` Without the cast it would be impossible to guess what should be the bind marker's type. The function is overloaded, so there are many possible argument types. The type hint specifies that the bind marker has type int. An interesting thing is that such casts don't have to be explicit. CQL allows putting an int value in a place where a blob value is expected and it will be automatically converted without any explicit casting. --- I started looking at our implementation of casts because of #12900. In there the author expressed the need to specify a type hint for the bind marker used to pass the WASM code. It could be either `(text)?` for text WASM, or `(blob)?` for binary WASM. This specific use of type hints wasn't supported because there was no `receiver` and the implementation of `prepare_expression` didn't handle that. Preparing casts without a receiver should be easy to implement - we can infer the type of the expression by looking at the type to which the expression is cast. But while reading `prepare_expression` for `expr::cast` I noticed that the code there is a bit strange. 
The implementation prepared the expression to cast using the original `receiver` instead of a receiver with the cast type. This caused some issues because of which casting didn't work as expected. For example it was possible to do: ```cql blob_column = (blob)funcReturningInt() ``` But this didn't work at all: ```cql blob_column = (blob)(int)12323 ``` It tried to prepare `untyped_constant(12323)` with a `blob` receiver, which fails. This makes `expr::cast` useless for casting. Casting when the representation is compatible is already implicit. I couldn't find a single case where adding a cast would change the behavior in any way. There was some use for it as a type hint to choose a specific overload of a function, but it was worthless for casting. Cassandra has the same issue: I created a `cql-pytest` test and it showed that we behave in the same way as Cassandra does. I decided to improve this. By preparing the expression using a receiver with the cast type, `expr::cast` becomes actually useful for casting values. Things like `(blob)(int)12323` now work without any issues. This diverges from the behavior in Cassandra, but it's an extension, not a breaking incompatibility. --- This PR improves `prepare_expression` for `expr::cast` in the following ways: 1) Support for more complex casts by preparing the expression using a different receiver. This makes casts like `(blob)(int)123` possible 2) Support preparing `expr::cast` without a receiver. Type inference chooses the cast type as the type of the expression. 3) Add pytest tests for C-style casts `2)` is needed for #12900, the other changes are just something I decided to do since I was already working on this piece of code. 
Closes #13053 * github.com:scylladb/scylladb: expr_test: more tests for preparing bind variables with type hints prepare_expr: implement preparing expr::cast with no receiver prepare_expr: use :user formatting in cast_prepare_expression prepare_expr: remove std::get<> in cast_prepare_expression prepare_expr: improve cast_prepare_expression prepare_expr: improve readability in cast_prepare_expression cql-pytest: test expr::cast in test_cast.py	2023-03-12 15:07:54 +02:00
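The "compatible binary representation" rule described in this merge can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it relies on CQL `int` being serialized as 4 big-endian bytes and `bigint` as 8, and the helper names (`serialize_int`, `serialize_bigint`, `cast_to_blob`, `fits_int_representation`) are hypothetical, not Scylla functions.

```python
import struct


def serialize_int(value: int) -> bytes:
    """Serialize a CQL int as 4 bytes, big-endian two's complement."""
    return struct.pack(">i", value)


def serialize_bigint(value: int) -> bytes:
    """Serialize a CQL bigint as 8 bytes, big-endian two's complement."""
    return struct.pack(">q", value)


def cast_to_blob(serialized: bytes) -> bytes:
    """Any serialized value is already a valid blob: the bytes stay unchanged."""
    return serialized


def fits_int_representation(serialized: bytes) -> bool:
    """A value can be reinterpreted as int only if it is exactly 4 bytes long."""
    return len(serialized) == 4


int_bytes = serialize_int(12323)
bigint_bytes = serialize_bigint(12323)

# (blob)(int)12323: prepare the literal as an int first, then reinterpret as blob.
blob_value = cast_to_blob(int_bytes)

# (int)<bigint value>: rejected, because the 8-byte value would have to change.
bigint_to_int_allowed = fits_int_representation(bigint_bytes)
```

The point of the sketch is the ordering: the literal is first given the cast type (`int`), and only then checked against the outer type (`blob`), which mirrors the idea of preparing the inner expression with a receiver of the cast type.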
Nadav Har'El	843a5dfc15	Merge 'Allow setting permissions for user-defined functions' from Wojciech Mitros This series aims to allow users to set permissions on user-defined functions. The implementation is based on Cassandra's documentation and should be fully compatible: https://cassandra.apache.org/doc/latest/cassandra/cql/security.html#cql-permissions Fixes: #5572 Fixes: #10633 Closes #12869 * github.com:scylladb/scylladb: cql3: allow UDTs in permissions on UDFs cql3: add type_parser::parse() method taking user_types_metadata schema_change_test: stop using non-existent keyspace cql3: fix parameter names in function resource constructors cql3: handle complex types as when decoding function permissions cql3: enforce permissions for ALTER FUNCTION cql-pytest: add a (failing) test case for UDT in UDF cql-pytest: add a test case for user-defined aggregate permissions cql-pytest: add tests for function permissions cql3: enforce permissions on function calls selection: add a getter for used functions abstract_function_selector: expose underlying function cql3: enforce permissions on DROP FUNCTION cql3: enforce permissions for CREATE FUNCTION client_state: add functions for checking function permissions cql-pytest: add a case for serializing function permissions cql3: allow specifying function permissions in CQL auth: add functions_resource to resources	2023-03-12 14:04:34 +02:00
Wojciech Mitros	6b8c1823a3	cql3: allow UDTs in permissions on UDFs Currently, when preparing an authorization statement on a specific function, we're trying to "prepare" all cql types that appear in the function signature while parsing the statement. We cannot do that for UDTs, because we don't know the UDTs that are present in the database at parsing time. As a result, such authorization statements fail. To work around this problem, we postpone the "preparation" of cql types until the actual statement validation and execution time. Until then, we store all type strings in the resource object. The "preparation" happens in the `maybe_correct_resource` method, which is called before every `execute` during a `check_access` call. At that point, we have access to the `query_processor`, and as a result, to `user_types_metadata` which allows us to prepare the argument types even for UDTs.	2023-03-10 11:02:33 +01:00
Wojciech Mitros	9a303fd99c	cql3: handle complex types as when decoding function permissions Currently, we're parsing types that appear in a function resource using abstract_type::parse_type, which only works with simple types. This patch changes it to db::marshal::type_parser::parse, which can also handle collections. We also adjust the test_grant_revoke_udf_permissions test so that it uses both simple and complex types as parameters of the function that we're granting/revoking permissions on.	2023-03-10 11:02:32 +01:00
Wojciech Mitros	438c7fdfa7	cql3: enforce permissions for ALTER FUNCTION Currently, the ALTER permission is only enforced on ALL FUNCTIONS or on ALL FUNCTIONS IN KEYSPACE. This patch enforces the permission also on a specific function.	2023-03-10 11:02:32 +01:00
Piotr Sarna	c4e6925bb6	cql-pytest: add a (failing) test case for UDT in UDF Our permissions system is currently incapable of figuring out user-defined type definitions when preparing function permissions. This test case creates such a function, and it passes on Cassandra.	2023-03-10 11:02:32 +01:00
Piotr Sarna	63e67c9749	cql-pytest: add a test case for user-defined aggregate permissions This test case is similar to the one for user-defined functions, but checks if aggregate permissions are enforced.	2023-03-10 11:02:32 +01:00
Piotr Sarna	6deebab786	cql-pytest: add tests for function permissions The test case checks that function permissions are enforced for non-superuser users.	2023-03-10 11:01:48 +01:00
Jan Ciolek	7c384de476	prepare_expr: improve cast_prepare_expression Preparing expr::cast had some artificial limitations. Things like this worked: `blob_col = (blob)funcReturnsInt()` But this didn't: `blob_col = (blob)(int)1234` This is caused by the line: `prepare_expression(c.arg, db, keyspace, schema_opt, receiver)` Here the code prepares the expression to be cast using the original receiver which was passed to cast_prepare_expression. In the example above this meant that it tried to prepare untyped_constant(1234) using a receiver with type blob. This failed because an integer literal is invalid for a blob column. To me it looks like a mistake. What it should do instead is prepare the int literal using the type (int) and then see if int can be cast to blob, by checking if these types have compatible binary representation. This can be achieved by using `cast_type_receiver` instead of `receiver`. Making this small change makes it possible to use the cast in many situations where it was previously impossible. The tests have to be updated to reflect the change; some of them now deviate from Cassandra, so they have to be marked scylla_only. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 18:31:41 +01:00
Piotr Sarna	488934e528	cql3: enforce permissions on DROP FUNCTION Only users with DROP permission are allowed to drop user-defined functions.	2023-03-09 17:51:15 +01:00
Piotr Sarna	8de1017691	cql-pytest: add a case for serializing function permissions This test case checks that granting function permissions results in correct serialization of the permissions - so that reading system_auth.role_permissions and listing the permissions via CQL with `LIST permission OF role` works in a compatible way with both Scylla and Cassandra.	2023-03-09 17:50:56 +01:00
Jan Ciolek	e4a3e2ac14	cql-pytest/test_lwt: test LWT update with empty clustering range Add a test case which performs an LWT UPDATE, but the clustering key has 0 possible values, because it's supposed to be equal to two different values. This currently causes a crash, see https://github.com/scylladb/scylladb/issues/13129 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 15:44:10 +01:00
Jan Ciolek	5e5e4c5323	cql-pytest/test_lwt: test LWT update with empty partition range Add a test case which performs an LWT UPDATE, but the partition key has 0 possible values, because it's supposed to be equal to two different values. Such queries used to cause problems in the past. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 15:43:24 +01:00
Nadav Har'El	a4a318f394	cql: USING TTL 0 means unlimited, not default TTL Our documentation states that writing an item with "USING TTL 0" means it should never expire. This should be true even if the table has a default TTL. But Scylla mistakenly handled "USING TTL 0" exactly like having no USING TTL at all (i.e., it took the default TTL, instead of unlimited). We had two xfailing tests demonstrating that Scylla's behavior in this is different from Cassandra's. Scylla's behavior in this case was also undocumented. By the way, Cassandra used to have the same bug (CASSANDRA-11207) but it was fixed already in 2016 (Cassandra 3.6). So in this patch we fix Scylla's "USING TTL 0" behavior to match the documentation and Cassandra's behavior since 2016. One xfailing test starts to pass, and the second test gets past this bug and fails on a different one. This patch also adds a third test for "USING TTL ?" with UNSET_VALUE - it behaves, on both Scylla and Cassandra, like a missing "USING TTL". The origin of this bug was that after parsing the statement, we saved the USING TTL in an integer, and used 0 for the case of no USING TTL given. This meant that we couldn't tell if we have USING TTL 0 or no USING TTL at all. This patch uses a std::optional so we can tell the case of a missing USING TTL from the case of USING TTL 0. Fixes #6447 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13079	2023-03-08 16:18:23 +02:00
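The root cause described in this commit (one integer standing for both "no USING TTL" and "USING TTL 0") can be sketched as follows. This is a simplified Python illustration, with `Optional[int]` standing in for the C++ `std::optional`; the function names are hypothetical and do not come from Scylla's source.

```python
from typing import Optional


def effective_ttl_buggy(using_ttl: int, default_ttl: int) -> int:
    """Old behavior: 0 encodes both 'no USING TTL' and 'USING TTL 0'."""
    # The two cases are indistinguishable, so the table default always wins.
    return default_ttl if using_ttl == 0 else using_ttl


def effective_ttl_fixed(using_ttl: Optional[int], default_ttl: int) -> int:
    """Fixed behavior: None means 'no USING TTL'; 0 means 'never expire'."""
    if using_ttl is None:
        return default_ttl
    return using_ttl


# Table with a default TTL of one hour, written with USING TTL 0.
buggy = effective_ttl_buggy(0, 3600)          # default TTL applied by mistake
fixed = effective_ttl_fixed(0, 3600)          # 0, i.e. the write never expires
missing = effective_ttl_fixed(None, 3600)     # no USING TTL: default applies
```

In this sketch a returned TTL of 0 means "unlimited", matching the documented meaning of "USING TTL 0".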
Jan Ciolek	0417c48bdc	cql-pytest: test unset value in UPDATE and LWT UPDATE Add a test which performs an UPDATE and tries to pass an UNSET_VALUE as a value for the primary key. There is also an LWT variant of this test that tries to set an UNSET_VALUE in the IF condition. These two tests are analogous to test_insert_update_where and test_insert_update_where_lwt, but use an UPDATE instead of INSERT. It's useful to test UPDATE as well as INSERT. When I was developing a fix for #13001 I initially added the condition for unset value inside insert_statement, but this didn't handle update statements. These two tests allowed me to see that UPDATE still causes a crash. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #13058	2023-03-08 10:39:26 +02:00
Nadav Har'El	ef50e4022c	test: drop our "pytest" wrapper script When Fedora 37 came out, we discovered that its "pytest" script started to run Python with the "-s" option, which caused problems for packages installed personally via pip. We fixed this by adding our own wrapper script test/pytest. But this bug (https://bugzilla.redhat.com/show_bug.cgi?id=2152171) was already fixed in Fedora 37, and the new version already reached our dbuild. So we no longer need this wrapper script. Let's remove it. Fixes #12412 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13083	2023-03-08 07:31:37 +02:00
Jan Ciolek	03d37bdc14	cql-pytest: test expr::cast in test_cast.py CQL supports C-style casts with the destination type specified inside parentheses, e.g. `blob_column = (blob)funcThatReturnsInt()`. These casts can be used to convert values of types that have compatible binary representation, or as a type hint to specify the type where the situation is ambiguous. I didn't find any cql-pytest tests for this feature, so I added some. It looks like the feature works, but only partially. Doing things like this works: `blob_column = (blob)funcThatReturnsInt()` But trying to do something a bit more complex fails: `blob_column = (blob)(int)1234` This is the case in both Cassandra and Scylla; the tests introduced in this commit pass on both of them. In future commits I will extend this feature to support the more complex cases as well, then some tests will have to be marked scylla_only. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-08 03:24:13 +01:00
Nadav Har'El	f05ea80fb5	test/cql-pytest: remove unused async marker One test in test/cql-pytest/test_batch.py accidentally had the asyncio marker, despite not using any async features. Remove it. The test still runs fine. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13002	2023-03-07 14:33:34 +02:00
Wojciech Mitros	d4851ccae7	treewide: rename the "xwasm" UDF language to "wasm" When the WASM UDFs were first introduced, the LANGUAGE required in the CQL statements to use them was "xwasm", because the ABI for the UDFs was still not specified and changes to it could be backwards incompatible. Now, the ABI is stabilized, but if backwards incompatible changes are made in the future, we will add a new ABI version for them, so the name "xwasm" is no longer needed and we can finally change it to "wasm". Closes #13089	2023-03-07 10:21:11 +02:00
Alejo Sanchez	eaed778f4a	test/cql-pytest: print driver version Print driver version for cql-pytest tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12840	2023-03-06 11:31:26 +02:00
Botond Dénes	92fde47261	Merge 'test/cql-pytest - aggregation tests' from Nadav Har'El This small series reorganizes the existing functional tests for aggregation (min, max, count) and adds additional tests for sum reproducing the strange (but Cassandra-compatible) behavior described in issue #13027. Closes #13038 * github.com:scylladb/scylladb: cql-pytest: add tests for sum() aggregate test/cql-pytest: move aggregation tests to one file	2023-03-01 11:02:08 +02:00
Nadav Har'El	363f326d49	test/cql-pytest: test for CLUSTERING ORDER BY verification in MV Since commit `73e258fc34`, Scylla has partial verification for the CLUSTERING ORDER BY clause in CREATE MATERIALIZED VIEW. Specifically, invalid column names are rejected. But for reasons explained in issue #12936 and in the test in this patch, Cassandra demands that if CLUSTERING ORDER BY appears it must list all the clustering columns, with no duplicates, and do so in the right order. This patch replaces an existing test which suggested it is fine (an extension over Cassandra) to accept a partial list of clustering columns, by a test that verifies that such a partial list, or an incorrectly-ordered list, or a list with duplicates, should be rejected. The new test fails on Scylla, and passes on Cassandra, so it is marked as xfail. Refs #12936. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12938	2023-03-01 08:02:39 +02:00

481 Commits