scylladb

Author	SHA1	Message	Date
Nadav Har'El	4810937ddf	test/alternator: fix flaky test test_item_latency The Alternator test test_metrics.py::test_item_latency confirms that for several operation types (PutItem, GetItem, DeleteItem, UpdateItem) we did not forget to measure their latencies. The test checked that a latency was updated by checking that two metrics increase: scylla_alternator_op_latency_count scylla_alternator_op_latency_sum However, it turns out that the "sum" is only an approximate sum of all latencies, and when the total sum grows large it sometimes does not increase when a short latency is added to the statistics. When this happens, this test fails on the assertion that the "sum" increases after an operation. We saw this happening sometimes in CI runs. The simple fix is to stop checking _sum at all, and only verify that the _count increases - this is really an integer counter that unconditionally increases when a latency is added to the histogram. Don't worry that the strength of this test is reduced - this test was never meant to check the accuracy or correctness of the histograms - we should have different (and better) tests for that, unrelated to Alternator. The purpose of this test is only to verify that for some specific operation like PutItem, Alternator didn't forget to measure its latency and update the histogram. We want to avoid a bug like we had in counters in the past (#9406). Fixes #18847. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `13cf6c543d`) Closes scylladb/scylladb#19193	2024-06-10 20:20:54 +03:00
Pavel Emelyanov	62a23fd86a	config: Remove experimental TABLETS feature ... and replace it with boolean enable_tablets option. All the places in the code are patched to check the latter option instead of the former feature. The option is OFF by default, but the default scylla.yaml file sets this to true, so that newly installed clusters turn tablets ON. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `83d491af02`) Closes scylladb/scylladb#19012	2024-06-03 12:16:41 +03:00
Marcin Maliszkiewicz	cbf47319c1	db: auth: move auth tables to system keyspace Separate keyspace which also behaves as system brings little benefit while creating some compatibility problems like schema digest mismatch during rollback. So we decided to move auth tables into system keyspace. Fixes https://github.com/scylladb/scylladb/issues/18098 Closes scylladb/scylladb#18769 (cherry picked from commit `2ab143fb40`) [avi: adjust test/alternator/suite.yaml to reflect new keyspace]	2024-06-02 21:41:14 +03:00
Andrei Chekun	76a766cab0	Migrate alternator tests to PythonTestSuite As part of the unification process, alternator tests are migrated to the PythonTestSuite instead of using the RunTestSuite. The main idea is to have one suite, so that it is easier to maintain and to introduce new features. Introduce the prepare_sql option for suite.yaml to make it possible to run CQL statements as a precondition for the test suite. Related: https://github.com/scylladb/scylladb/issues/18188 Closes scylladb/scylladb#18442	2024-05-13 13:23:29 +03:00
Nadav Har'El	5558143014	test/alternator: test addressing LSI using REST API The name of the Scylla table backing an Alternator LSI looks like basename:!lsiname. Some REST API clients (including Scylla Manager) when they send a "!" character in the REST API request may decide to "URL encode" it - convert it to %21. Because of a Seastar bug (https://github.com/scylladb/seastar/issues/725) Scylla's REST API server forgets to do the URL decoding, which leads to the REST API request failing to address the LSI table. This patch introduces a test for this bug, which fails without the Seastar issue being fixed, and passes afterwards (i.e., after the previous patch that starts to use the new, fixed, Seastar API). The test creates an LSI, uses the REST API to find its name and then tries to call some REST API ("compaction_strategy") on this table name, after deliberately URL-encoding it. Refs #5883. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-05-02 12:33:54 +03:00
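The URL-encoding step that the test above relies on can be illustrated with a minimal Python sketch. It uses only the standard library; the table name is the placeholder from the commit message, and the choice to leave ":" unencoded is an assumption made for readability, not something the commit specifies.

```python
from urllib.parse import quote, unquote

# Placeholder name in the "basename:!lsiname" shape described in the commit.
table_name = "basename:!lsiname"

# A REST client may percent-encode "!" as %21. Keeping ":" literal here is
# an illustrative assumption, not taken from the commit.
encoded = quote(table_name, safe=":")
print(encoded)

# A server that decodes the path recovers the original table name; a server
# that skips decoding would look up the encoded string and miss the table.
decoded = unquote(encoded)
print(decoded)
```

A server that skips the `unquote` step would search for a table literally named with "%21" in it, which is the failure the test reproduces.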
Nadav Har'El	c24bc3b57a	alternator: do not use tablets on new Alternator tables A few months ago, in merge `d3c1be9107`, we decided that if Scylla has the experimental "tablets" feature enabled, new Alternator tables should use this feature by default - exactly like this is the default for new CQL tables. Sadly, it was now decided to reverse this decision: We do not yet trust enough LWT on tablets, and since Alternator often (if not always) relies on LWT, we want Alternator tables to continue to use vnodes - not tablets. The fix is trivial - just changing the default. No test needed to change because anyway, all Alternator tests work correctly on Scylla with the tablets experimental feature disabled. I added a new test to enshrine the fact that Alternator does not use tablets. An unfortunate result of this patch will be that Alternator tables created on versions with this patch (e.g., Scylla 6.0) will not use tablets and will continue to not use tablets even if Scylla is upgraded (currently, the use of tablets is decided at table creation time, and there is no way to "upgrade" a vnode-based table to be tablet based). This patch should be reverted as soon as LWT support matures on tablets. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#18157	2024-04-04 12:11:29 +03:00
Nadav Har'El	ba97fd98a3	alternator: reduce stall for Query and Scan with large pages Before this patch, Alternator's Query and Scan operations convert an entire result page to JSON without yielding. For a page of maximum size (1MB) and tiny rows, this can cause a significant stall - the test included in this patch reported stalls of 14-26ms on my laptop. The problem is the describe_items() function, which does this conversion immediately, without yielding. This patch changes this function to return a future, and use the result_set::visit_gently() method instead of visit() that yields when needed. This patch does not completely eliminate stalls in the test, but on my laptop usually reduces them to around 5ms. It appears that the remaining stalls come from other places not fixed in this PR, such as perhaps query_page::handle_result(), and will need to be fixed by additional patches. The test included in this patch is useful for manually reproducing the stall, but not useful as a regression test: It is slow (requiring a couple of seconds to set up the large partition) and doesn't check anything, and can't even report the stall without modifying the test runner. So the test is skipped by default (using the "veryslow" marker) and can be enabled and run manually by developers who want to continue working on #17995. Refs #17995. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-03-26 18:32:45 +02:00
Nadav Har'El	d207962e40	test/alternator: tests for latency metrics In test/alternator/test_metrics.py we had tests for the operation-count metrics for different Alternator API operations, but not for the latency histograms for these same operations. So this patch adds the missing tests (and removes a TODO asking to do that). Note that only a subset of the operations - PutItem, GetItem, DeleteItem, UpdateItem, and GetRecords - currently have a latency histogram, and this test verifies this. We have an issue (Refs #17616) about adding latency histograms for more operations - at which point we will be able to expand this test for the additional operations. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-03-11 19:26:59 +02:00
Nadav Har'El	970c2dc7a6	test/alternator: improve comments and unhide hidden test The original goal of this patch was to improve comments in test/alternator/test_metrics.py, but while doing that I discovered that one of the test functions was hidden by a second test with the same name! So this patch also renames the second test. The test continues to work after this patch - the hidden test was successful. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-03-11 19:26:59 +02:00
Marcin Maliszkiewicz	9cb1f111d5	alternator: add support for auth-v2 Alternator doesn't do any writes to auth tables so it's simply a change of keyspace name. Docs will be updated later, when auth-v2 is enabled as default.	2024-03-01 16:25:14 +01:00
Nadav Har'El	28db187756	alternator, tablets: return error if enabling TTL with tablets Alternator TTL doesn't yet work on tables using tablets (this is issue #16567). Before this patch, it can be enabled on a table with tablets, and the result is a lot of log spam and nothing will get expired. So let's make the attempt to enable TTL on a table that uses tablets into a clear error. The error message points to the issue, and also suggests how to create a table that uses vnodes, not tablets. This patch also adds a test that verifies that trying to enable TTL with tablets is an error. Obviously, this test should be removed once the issue is solved and TTL begins working with tablets. Refs #16567 Refs #16807 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#17306	2024-02-15 10:47:06 +02:00
Nadav Har'El	dce47a81b0	alternator, tablets: return error if enabling Streams with tablets Alternator Streams doesn't yet work on tables using tablets (this is issue #16317). Before this patch, an attempt to enable it results in an unsightly InternalServerError, which isn't terrible - but we can do better. So in this patch, we make the attempt to enable Streams and tablets together into a clear error. The error message points to the open issue, and also suggests how to create a table that uses vnodes, not tablets. Unfortunately, there are two slightly different code paths and error messages for two cases: One case is the creation of a new table (where the validation happens before the keyspace is actually created), and the other case is an attempt to enable streams on an existing table with an existing keyspace (which already might or might not be using tablets). This patch also adds a test that verifies that trying to enable Streams with tablets is an error - in both cases (table creation and update). Obviously, this test - and the validation code - should be removed once the issue is solved and Alternator Streams begins working with tablets. Fixes #16497 Refs #16807 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#17311	2024-02-13 16:42:35 +02:00
Nadav Har'El	21e7deafeb	alternator, mv: fix case of two new key columns in GSI A materialized view in CQL allows AT MOST ONE view key column that wasn't a key column in the base table. This is because if there were two or more of those, the "liveness" (timestamp, ttl) of these different columns can change at every update, and it's not possible to pick what liveness to use for the view row we create. We made an exception for this rule for Alternator: DynamoDB's API allows creating a GSI whose partition key and range key are both regular columns in the base table, and we must support this. We claim that the fact that Alternator allows neither TTL (Alternator's "TTL" is a different feature) nor user-defined timestamps, does allow picking the liveness for the view row we create. But we did it wrong! We claimed in a comment - and implemented in the code before this patch - that in Alternator we can assume that both GSI key columns will have the same liveness, and in particular timestamp. But this is only true if one modifies both columns together! In fact, in general it is not true: We can have two non-key attributes 'a' and 'b' which are the GSI's key columns, and we can modify only b, without modifying a, in which case the timestamp of the view modification should be b's newer timestamp, not a's older one. The existing code took a's timestamp, assuming it will be the same as b's, which is incorrect. The result was that if we repeatedly modify only b, all view updates will receive the same timestamp (a's old timestamp), and a deletion will always win over all the modifications. This patch includes a reproducing test written by a user (@Zak-Kent) that demonstrates how after a view row is deleted it doesn't get recreated - because all the modifications use the same timestamp. The fix is, as suggested above, to use the higher of the two timestamps of both base-regular-column GSI key columns as the timestamp for the new view rows or view row deletions. 
The reproducer that failed before this patch passes with it. As usual, the reproducer passes on AWS DynamoDB as well, proving that the test is correct and should really work. Fixes #17119 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#17172	2024-02-12 13:17:29 +02:00
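The timestamp problem described in the commit above can be sketched with a deliberately simplified Python model. The function names and the "a write survives only if strictly newer than the deletion" rule are illustrative assumptions, not Scylla's actual view-update code.

```python
def is_view_row_live(update_ts: int, deletion_ts: int) -> bool:
    """Simplified last-write-wins: the row is live only if the update is
    strictly newer than the deletion (assumed tie-break for illustration)."""
    return update_ts > deletion_ts


def old_view_timestamp(ts_a: int, ts_b: int) -> int:
    """Pre-fix behaviour per the commit message: always take a's timestamp."""
    return ts_a


def fixed_view_timestamp(ts_a: int, ts_b: int) -> int:
    """Post-fix behaviour per the commit message: take the higher timestamp."""
    return max(ts_a, ts_b)


# Attribute 'a' was written once at time 1; the view row is deleted at
# time 5; then only 'b' is rewritten at time 6.
ts_a, deletion_ts, ts_b = 1, 5, 6

print(is_view_row_live(old_view_timestamp(ts_a, ts_b), deletion_ts))
print(is_view_row_live(fixed_view_timestamp(ts_a, ts_b), deletion_ts))
```

In this model the old rule stamps the new view row with time 1, so the deletion at time 5 shadows it; taking the maximum stamps it with time 6 and the row reappears.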
Botond Dénes	d3c1be9107	Merge 'alternator: enable tablets by default if experimental feature is enabled' from Nadav Har'El This series does a similar change to Alternator as was done recently to CQL: 1. If the "tablets" experimental feature is enabled, new Alternator tables will use tablets automatically, without requiring an option on each new table. A default choice of initial_tablets is used. These choices can still be overridden per-table if the user wants to. 3. In particular, all test/alternator tests will also automatically run with tablets enabled 4. However, some tests will fail on tablets because they use features that haven't yet been implemented with tablets - namely Alternator Streams (Refs #16317) and Alternator TTL (Refs #16567). These tests will - until those features are implemented with tablets - continue to be run without tablets. 5. An option is added to the test/alternator/run to allow developers to manually run tests without tablets enabled, if they wish to (this option will be useful in the short term, and can be removed later). Fixes #16355 Closes scylladb/scylladb#16900 * github.com:scylladb/scylladb: test/alternator: add "--vnodes" option to run script alternator: use tablets by default, if available test/alternator: run some tests without tablets	2024-01-29 09:22:13 +02:00
Nadav Har'El	830e52008d	test/alternator: add more tests for TagResource Issue #16904 discovered that Alternator refuses to allow an empty tag value while it's useful (and DynamoDB allows it). This brought to my attention that our test coverage of the TagResource operation was lacking. So this patch adds more tests for some corner cases of TagResource which we missed, including the allowed lengths of tag keys and values. These tests reproduce #16904 (the case of empty tag value) and also #16908 (allowing and correctly counting unicode letters), and also add regression testing to cases which we already handled correctly. As usual, all the new tests also pass on DynamoDB. Refs #16904 Refs #16908 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-01-23 11:55:22 +02:00
Nadav Har'El	4d6b286345	test/alternator: add "--vnodes" option to run script test/cql-pytest/run.py was recently modified to add the "tablets" experimental feature, so test/alternator/run now also runs Scylla by default with tablets enabled. This is the correct default going forward, but in the short term it would be nice to also have an option to easily do a manual test run without tablets. So this patch adds a "--vnodes" option to the test/alternator/run script. This option causes "run" to run Scylla without enabling the "tablets" experimental feature. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-01-23 10:53:23 +02:00
Nadav Har'El	36f14f89df	test/alternator: run some tests without tablets If an Alternator table uses tablets (we'll turn this on in a following patch), some tests are known to fail because of features not yet supported with tablets, namely: Refs #16317 - Support Alternator Streams with tablets (CDC) Refs #16567 - Support Alternator TTL with tablets This patch changes all tests failing on tablets due to one of these two known issues to explicitly ask to disable tablets when creating their test table. This means that at least we continue to test these two features (Streams and TTL) even if they don't yet work with tablets. We'll need to remember to remove this override when tablet support for CDC and Alternator TTL arrives. I left a comment in the right places in the code with the relevant issue numbers, to remind us what to change when we fix those issues. This patch also adds xfail_tablets and skip_tablets fixtures that can be used to xfail or skip tests when running with tablets - but we don't use them yet - and may never use them, but since I already wrote this code it won't hurt having it, just in case. When running without tablets, or against an older Scylla or on DynamoDB, the tests with these marks are run normally. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-01-23 10:46:48 +02:00
Nadav Har'El	94580df1c5	test/alternator: fix flaky test in test_filter_expression.py The test test_filter_expression.py::test_filter_expression_precedence is flaky - and can fail very rarely (so far we've only actually seen it fail once). The problem is that the test generates items with random clustering keys, chosen as an integer between 1 and 1 million, and there is a chance (roughly 2/10,000) that two of the 20 items happen to have the same key, so one of the items is "lost" and the comparison we do to the expected truth fails. The solution is to just use sequential keys, not random keys. There is nothing to gain in this test by using random keys. To make this test bug easy to reproduce, I temporarily changed random_i()'s range from 1,000,000 to 3, and saw the test failing every single run before this patch. After this patch - no longer using random_i() for the keys - the test doesn't fail any more. Fixes #16647 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#16649	2024-01-04 21:36:40 +02:00
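The "roughly 2/10,000" figure quoted in the commit above can be checked with a short birthday-problem calculation. This is a minimal sketch that assumes keys are drawn uniformly and independently from 1,000,000 values; the function name is hypothetical.

```python
def collision_probability(num_items: int, key_space: int) -> float:
    """Probability that at least two of num_items uniformly random keys,
    drawn from key_space possible values, are equal."""
    p_all_distinct = 1.0
    for i in range(num_items):
        # The i-th key must avoid the i keys already chosen.
        p_all_distinct *= (key_space - i) / key_space
    return 1.0 - p_all_distinct


# 20 items with keys drawn from 1,000,000 values, as in the original test.
p = collision_probability(20, 1_000_000)
print(f"{p:.6f}")

# With the range temporarily reduced to 3, a collision is unavoidable.
p_small = collision_probability(20, 3)
print(p_small)
```

The second call mirrors the reproduction trick in the commit: with only 3 possible keys and 20 items, a duplicate is certain, so the unfixed test fails on every run.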
Patryk Jędrzejczak	5ebfbf42bc	db: config: make consistent_cluster_management mandatory Code that executed only when consistent_cluster_management=false is removed. In particular, after this patch: - raft_group0 and raft_group_registry are always enabled, - raft_group0::status_for_monitoring::disabled becomes unused, - topology tests can only run with consistent_cluster_management.	2023-12-14 16:54:04 +01:00
Botond Dénes	d2a88cd8de	Merge 'Typos: fix typos in code' from Yaniv Kaul Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255 Closes scylladb/scylladb#16289 * github.com:scylladb/scylladb: Update unified/build_unified.sh Update main.cc Update dist/common/scripts/scylla-housekeeping Typos: fix typos in code	2023-12-06 07:36:41 +02:00
Yaniv Kaul	ae2ab6000a	Typos: fix typos in code Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255	2023-12-05 15:18:11 +02:00
Yaniv Kaul	21cce458d8	test: alternator: fix typo passs instead of pass in test_gsi.py Fix a typo. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#16258	2023-12-04 18:58:31 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Marcin Maliszkiewicz	81be3e0935	test/alternator/run: port -h and --omit-scylla-output options from cql-pytest Closes scylladb/scylladb#16171	2023-11-26 13:52:01 +02:00
Marcin Maliszkiewicz	3992d1c2ce	alternator: add support for ReturnValuesOnConditionCheckFailure feature As announced in https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-dynamodb-cost-failed-conditional-writes/, DynamoDB added a new option for write operations (PutItem, UpdateItem, or DeleteItem), ReturnValuesOnConditionCheckFailure, which if set to ALL_OLD returns the current value of the item - but only if a condition check failed. Fixes https://github.com/scylladb/scylladb/issues/14481	2023-10-30 15:33:56 +01:00
Nikita Kurashkin	2a7932efa1	alternator: fix DeleteTable return values to match DynamoDB's It seems that Scylla has more values returned by DeleteTable operation than DynamoDB. In this patch I added a table status check when generating output. If we delete the table, values KeySchema, AttributeDefinitions and CreationDateTime won't be returned. The test has also been modified to check that these attributes are not returned. Fixes scylladb#14132 Closes scylladb/scylladb#15707	2023-10-19 09:34:16 +03:00
Nadav Har'El	ea56c8efcd	test/alternator: reduce code duplication in test for list_append() A reviewer noted that test_update_expression_list_append_non_list_arguments has too much code duplication - the same long API call to run "SET a = list_append(...)" was repeated many times. So in this patch we add a short inner function "try_list_append" to avoid this duplication. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes: #15298	2023-09-11 10:09:35 +03:00
Nadav Har'El	c52e0fd333	test/alternator: avoid warnings about unverified HTTPS The Alternator tests can run against HTTPS - namely when using test/alternator/run with the "--https" option (local Alternator configured with HTTPS) or "--aws" option (DynamoDB, using HTTPS). In some cases we make these HTTPS requests with verify=False, to avoid checking the SSL certificates. E.g., this is necessary for Alternator with a self-signed certificate. Unfortunately, the urllib3 library adds an ugly warning message when SSL certificate verification is disabled. In the past we tried to disable these warnings, using the documented urllib3.disable_warnings() function, but it didn't help. It turns out that pytest has its own warning handling, so to disable warnings in pytest we must say so in a special configuration parameter in pytest.ini. So in this patch, we drop the disable_warnings call from conftest.py (where it didn't help), and instead put a similar declaration in pytest.ini. The disable_warnings call in the test/alternator/run script needs to remain - it is run outside pytest, so pytest.ini doesn't affect it. After this patch, running test/alternator/run with --https or --aws finishes without warnings, as desired. Fixes #15287 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #15292	2023-09-07 07:23:57 +03:00
Nadav Har'El	cfc70810d3	test/alternator: more error-path tests for list_append() function Improved the coverage of the tests for the list_append() function in UpdateExpression - test that if one of its arguments is not a list, including a missing attribute or item, it is reported as an error as expected. The new tests pass on both Alternator and DynamoDB. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #15291	2023-09-06 11:59:54 +03:00
Nadav Har'El	04e5082d52	alternator: limit expression length and recursion depth DynamoDB limits the length of all expressions (ConditionExpression, UpdateExpression, ProjectionExpression, FilterExpression, KeyConditionExpression) to just 4096 bytes. Until now, Alternator did not enforce this limit, and we had an xfailing test showing this. But it turns out that not enforcing this limit can be dangerous: The user can pass arbitrarily-long and arbitrarily nested expressions, such as: a<b and (a<b and (a<b and (a<b and (a<b and (a<b and (...)))))) or ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( and those can cause recursive algorithms in Alternator's parser and later when applying expressions to recurse very deeply, overflow the stack, and crash. This patch includes new tests that demonstrate how Scylla crashes during parsing before enforcing the 4096-byte length limit on expressions. The patch then enforces this length limit, and these tests stop crashing. We also verify that deeply-nested expressions shorter than the 4096-byte limit are apparently short enough for our recursion ability, and work as expected. Unfortunately, running these tests many times showed that the 4096-byte limit is not low enough to avoid all crashes so this patch needs to do more: The parsers created by ANTLR are recursive, and there is no way to limit the depth of their recursion (i.e., nothing like YACC's YYMAXDEPTH). Very deep recursion can overflow the stack and crash Scylla. After we limited the length of expression strings to 4096 bytes this was almost enough to prevent stack overflows. But unfortunately the tests revealed that even limited to 4096 bytes, the expression can sometimes recurse too deeply: Consider the expression "((((((....((((" with 4000 parentheses. To realize this is a syntax error, the parser needs to do a recursive call 4000 times. 
Or worse - because of other Antlr limitations (see rants in comments in expressions.g) it's actually 12000 recursive calls, and each of these calls has a pretty large frame. In some cases, this overflows the stack. The solution used in this patch is not pretty, but works. We add to rules in alternator/expressions.g that recurse (there are two of those - "value" and "boolean_expression") an integer "depth" parameter, which we increase when the rule recurses. Moreover, we add a so-called predicate "{depth<MAX_DEPTH}?" that stops the parsing when this limit is reached. When the parsing is stopped, the user will see a special kind of parse error, saying "expression nested too deeply". With this last modification to expressions.g, the tests for deeply-nested but still-below-4096-bytes expressions (test_limits.py::test_deeply_nested_expression_*) no longer fail sporadically as they did without it. While adding the "expression nested too deeply" case, I also made the general syntax-error reporting in Alternator nicer: It no longer prints the internal "expression_syntax_error" type name (an exception type will only be printed if some sort of unexpected exception happens), and it prints the character position where the syntax error (or too deeply nested expression) was recognized. Fixes #14473 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14477	2023-07-31 08:57:54 +03:00
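The two guards described in the commit above (a length limit, plus a depth counter threaded through the recursive rule) can be sketched with a toy recursive-descent parser in Python. This is not the ANTLR grammar from alternator/expressions.g: the grammar here only handles parenthesized identifiers, and the MAX_DEPTH value and all function names are hypothetical. Only the 4096-byte limit and the error wording come from the commit message.

```python
MAX_EXPRESSION_LENGTH = 4096  # limit quoted in the commit message
MAX_DEPTH = 100               # hypothetical value, chosen for this sketch


class ExpressionError(ValueError):
    """Raised for over-long, over-nested, or malformed expressions."""


def check_expression(expr: str) -> None:
    """Validate a toy expression: an identifier wrapped in any number of
    balanced parentheses, e.g. "((abc))"."""
    if len(expr) > MAX_EXPRESSION_LENGTH:
        raise ExpressionError("expression too long")
    pos = _parse_value(expr, 0, 0)
    if pos != len(expr):
        raise ExpressionError(f"syntax error at position {pos}")


def _parse_value(expr: str, pos: int, depth: int) -> int:
    """Parse one value starting at pos; return the position after it.
    The depth argument plays the role of the grammar's depth parameter."""
    if depth >= MAX_DEPTH:
        # Stop before the recursion can exhaust the stack.
        raise ExpressionError(
            f"expression nested too deeply at position {pos}")
    if pos < len(expr) and expr[pos] == "(":
        pos = _parse_value(expr, pos + 1, depth + 1)
        if pos >= len(expr) or expr[pos] != ")":
            raise ExpressionError(f"syntax error at position {pos}")
        return pos + 1
    if pos < len(expr) and expr[pos].isalnum():
        while pos < len(expr) and expr[pos].isalnum():
            pos += 1
        return pos
    raise ExpressionError(f"syntax error at position {pos}")


def error_for(expr: str) -> str:
    """Return the error message for expr, or "" if it is accepted."""
    try:
        check_expression(expr)
    except ExpressionError as e:
        return str(e)
    return ""
```

For example, `error_for("(" * 4000)` passes the length check but is rejected by the depth guard after 100 nested calls rather than after 4000, which is the effect the commit is after.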
Nadav Har'El	59c1498338	test/alternator: don't forget to delete tables on test failures Most of the Alternator tests are careful to unconditionally remove the test tables, even if the test fails. This is important when testing on a shared database (e.g., DynamoDB) but also useful to make clean shutdown faster as there should be no user table to flush. We missed a few such cases in test_gsi.py, and this patch corrects them. We do this by using the context manager new_test_table() - which automatically deletes the table when done - instead of the function create_test_table() which needs an explicit delete at the end. There are no functional changes in this patch - most of the lines changed are just reindents. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14835	2023-07-26 21:51:22 +03:00
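The cleanup pattern described in the commit above (a context manager that deletes the table even when the test body fails) can be sketched as follows. The `FakeDatabase` class and the `temporary_table` helper are hypothetical stand-ins for illustration; the real new_test_table() in the test suite has its own signature and talks to DynamoDB or Alternator.

```python
from contextlib import contextmanager


class FakeDatabase:
    """In-memory stand-in for a database client (hypothetical)."""

    def __init__(self):
        self.tables = set()

    def create_table(self, name):
        self.tables.add(name)

    def delete_table(self, name):
        self.tables.discard(name)


@contextmanager
def temporary_table(db, name):
    """Create a table, hand it to the caller, and always delete it,
    even if the body of the with-block raises."""
    db.create_table(name)
    try:
        yield name
    finally:
        db.delete_table(name)


db = FakeDatabase()
try:
    with temporary_table(db, "test_table") as table:
        print(table in db.tables)   # the table exists inside the block
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass
print(db.tables)                    # the table is gone despite the failure
```

With an explicit create/delete pair, the delete call placed at the end of the test would be skipped by the exception; the `finally` clause is what makes the cleanup unconditional.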
Nadav Har'El	e01a369708	alternator: detect errors in AttributeDefinitions parameter Add missing validation of the AttributeDefinitions parameter of the CreateTable operation in Alternator. This validation isn't needed for correctness or safety - the invalid entries would have been ignored anyway. But this patch is useful for user-experience - the user should be notified when the request is malformed instead of ignoring the error. The fix itself is simple (a new validate_attribute_definitions() function, calling it in the right place), but much of the contents of this patch is a fairly large set of tests covering all the interesting cases of how AttributeDefinitions can be broken. Particularly interesting is the case where the same AttributeName appears more than once, e.g., attempting to give two different types to the same key attribute - which is not allowed. One of the new tests remains xfail even after this patch - it checks the case that a user attempts to add a GSI to an existing table where another GSI defined the key's type differently. This test can't succeed until we allow adding GSIs to existing tables (Refs #11567). Fixes #13870. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14556	2023-07-13 11:28:47 +03:00
Nadav Har'El	a4087f58df	alternator: fix error path for size() function on constants The DynamoDB documentation for the size() function claims that it only works on paths (attribute names or references), but it actually works on constants from the query (e.g., ":val") as well. It turns out that Alternator supports this undocumented case already, but gets the error path wrong: Usually, when size() is calculated on the data, if the data has the wrong type of size() (e.g., an integer), the condition simply doesn't match. But if the value comes from the query - it should generate an error that the query is wrong - ValidationException. This patch fixes this case, and also adds tests for it that pass on both DynamoDB and Alternator (after this patch). Fixes #14592 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14593	2023-07-12 12:29:05 +03:00
Nadav Har'El	599636b307	test/alternator: fix flaky test test_ttl_expiration_gsi_lsi The Alternator test test_ttl.py::test_ttl_expiration_gsi_lsi was flaky. The test incorrectly assumes that when we write an already expired item, it will be visible for a short time until being deleted by the TTL thread. But this doesn't need to be true - if the test is slow enough, it may go look for the item after it was already expired! So we fix this test by splitting it into two parts - in the first part we write a non-expiring item, and notice it eventually appears in the GSI, LSI, and base-table. Then we write the same item again, with an expiration time - and now it should eventually disappear from the GSI, LSI and base-table. This patch also fixes a small bug which prevented this test from running on DynamoDB. Fixes #14495 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14496	2023-07-12 11:23:12 +03:00
Marcin Maliszkiewicz	6424dd5ec4	alternator: close output_stream when exception is thrown during response streaming When an exception occurs and we omit closing the output_stream, the whole process is brought down by an assertion in ~output_stream. Fixes https://github.com/scylladb/scylladb/issues/14453 Relates https://github.com/scylladb/scylladb/issues/14403 Closes #14454	2023-07-04 16:15:08 +03:00
Marcin Maliszkiewicz	8b06684a8c	docs: dev: document pytest run convenience script Closes #13995	2023-06-07 12:37:52 +03:00
Nadav Har'El	8a1334cf6f	Merge 'alternator: eliminate duplicated rjson::find() of ExpressionAttributeNames and ExpressionAttributeValues' from Marcin Maliszkiewicz Summary of the patch set: - eliminates not needed calls to rjson::find (~1% tps improvement in `perf-simple-query --write`) - adds some very specific test in this area (more general cases were covered already) - fixes some minor validation bug Fixes https://github.com/scylladb/scylladb/issues/13251 Closes #12675 * github.com:scylladb/scylladb: alternator: fix unused ExpressionAttributeNames validation when used as a part of BatchGetItem alternator: eliminate duplicated rjson::find() of ExpressionAttributeNames and ExpressionAttributeValues	2023-06-04 23:10:12 +03:00
Alexey Novikov	ffd4fcceec	Alternator: return full table description on return of DeleteTable The DeleteTable operation in Alternator should return a TableDescription object describing the table which has just been deleted, similar to what DescribeTable returns. Fixes scylladb#11472 Closes #11628	2023-06-04 21:00:26 +03:00
Marcin Maliszkiewicz	9ce65270d5	alternator: fix unused ExpressionAttributeNames validation when used as a part of BatchGetItem BatchGetItem request is a map of table names and 'sub-requests', ExpressionAttributeNames is defined on 'sub-request' level but the code was instead checking the top level, obtaining nullptr every time which effectively disables unused names check. Fixes #13251	2023-05-26 15:03:15 +02:00
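The lookup mistake described in the commit above can be illustrated with a small Python sketch over a BatchGetItem-shaped request. The helper names and the substring-based "unused" check are hypothetical simplifications, not Alternator's actual validation code; the table name and attribute names are made up.

```python
# A BatchGetItem-shaped request: ExpressionAttributeNames lives inside each
# per-table sub-request, not at the top level.
request = {
    "RequestItems": {
        "my_table": {
            "Keys": [{"p": {"S": "key1"}}],
            "ProjectionExpression": "#a",
            "ExpressionAttributeNames": {"#a": "attr", "#zzz": "never_used"},
        }
    }
}


def find_names_top_level(req):
    """The buggy lookup: searches the top level and finds nothing."""
    return req.get("ExpressionAttributeNames")


def find_names_per_table(req):
    """The intended lookup: one name map per table sub-request."""
    return {table: sub.get("ExpressionAttributeNames")
            for table, sub in req.get("RequestItems", {}).items()}


def unused_names(projection, names):
    """Crude illustration: a name counts as unused if it does not appear
    in the projection string (a real check would use the parsed expression)."""
    return sorted(n for n in (names or {}) if n not in projection)


print(find_names_top_level(request))
for table, names in find_names_per_table(request).items():
    projection = request["RequestItems"][table]["ProjectionExpression"]
    print(table, unused_names(projection, names))
```

Because the top-level lookup always yields nothing, `unused_names` has no names to examine and the check silently passes; looking inside each sub-request is what lets the unused "#zzz" be detected.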
Pavel Emelyanov	d2f5a44e3b	test/alternator: Don't use empty AWS secret key There's a test case that checks invalid credentials (wrong key). However, some boto3 libraries don't like empty secret key values: request = <FixtureRequest for <Function test_wrong_key_access>> dynamodb = dynamodb.ServiceResource() def test_wrong_key_access(request, dynamodb): print("Please make sure authorization is enforced in your Scylla installation: alternator_enforce_authorization: true") url = dynamodb.meta.client._endpoint.host with pytest.raises(ClientError, match='UnrecognizedClientException'): if url.endswith('.amazonaws.com'): boto3.client('dynamodb',endpoint_url=url, aws_access_key_id='wrong_id', aws_secret_access_key='').describe_endpoints() else: verify = not url.startswith('https') > boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='whatever', aws_secret_access_key='', verify=verify).describe_endpoints() test_authorization.py:23: ... cls = <class 'awscrt.auth.AwsCredentialsProvider'>, access_key_id = 'whatever' secret_access_key = '', session_token = None @classmethod def new_static(cls, access_key_id, secret_access_key, session_token=None): """ Create a simple provider that just returns a fixed set of credentials. Args: access_key_id (str): Access key ID secret_access_key (str): Secret access key session_token (Optional[str]): Optional session token Returns: AwsCredentialsProvider: """ assert isinstance(access_key_id, str) assert isinstance(secret_access_key, str) assert isinstance(session_token, str) or session_token is None > binding = _awscrt.credentials_provider_new_static(access_key_id, secret_access_key, session_token) E RuntimeError: 34 (AWS_ERROR_INVALID_ARGUMENT): An invalid argument was passed to a function. 
$ pip3 show boto3 Name: boto3 Version: 1.26.139 Summary: The AWS SDK for Python Home-page: https://github.com/boto/boto3 Author: Amazon Web Services Author-email: License: Apache License 2.0 Location: /home/xemul/.local/lib/python3.11/site-packages Requires: botocore, jmespath, s3transfer Required-by: Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #14022	2023-05-24 19:46:16 +03:00
Nadav Har'El	02d31786ff	test/alternator: better README.md on how to run and write tests Improve test/alternator/README.md by adding a better and more beginner-friendly introduction to how to run the Alternator tests, as well as a section about the philosophy of the Alternator test suite, and some guidelines on how to write good tests in that framework. Much of this text was copied from test/cql-pytest/README.md. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13999	2023-05-24 09:58:12 +03:00
Nadav Har'El	3c0603558c	alternator: add validation of numbers' magnitude and precision DynamoDB limits the allowed magnitude and precision of numbers - valid decimal exponents are between -130 and 125 and up to 38 significant decimal digits are allowed. In contrast, Scylla uses the CQL "decimal" type which offers unlimited precision. This can cause two problems: 1. Users might get used to this "unofficial" feature and start relying on it, not allowing us to switch to a more efficient limited-precision implementation later. 2. If huge exponents are allowed, e.g., 1e-1000000, summing such a number with 1.0 will result in a huge number, huge allocations and stalls. This is highly undesirable. After this patch, all tests in test/alternator/test_number.py now pass. The various failing tests which verify magnitude and precision limitations in different places (key attributes, non-key attributes, and arithmetic expressions) now pass - so their "xfail" tags are removed. Fixes #6794 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
Nadav Har'El	0eccc49308	test/alternator: more tests for limits on number precision and magnitude We already have xfailing tests for issue #6794 - the missing checks on precision and magnitudes of numbers in Alternator - but this patch adds checks for additional corner cases. In particular we check the case that numbers are used in a key column, which goes to a different code path than numbers used in non-key columns, so it's worth testing as well. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
Nadav Har'El	56b8b9d670	test/alternator: reproducer for DoS in unlimited-precision addition As already noted in issue #6794, whereas DynamoDB limits the magnitude of numbers to between 10^-130 and 10^125, Scylla does not. In this patch we add yet another test for this problem, but unlike previous tests which just showed too much magnitude being allowed which always sounded like a benign problem - the test in this patch shows that this "feature" can be used to DoS Scylla - a user can send a short request that causes arbitrarily-large allocations, stalls and CPU usage. The test is currently marked "skip" because it can cause Scylla to take a very long time and/or run out of memory. It passes on DynamoDB because the excessive magnitude is simply not allowed there. Refs #6794 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:03:51 +03:00
Botond Dénes	dba1d36aa6	Merge 'alternator: fix isolation of concurrent modifications to tags' from Nadav Har'El Alternator's implementation of TagResource, UntagResource and UpdateTimeToLive (the latter uses tags to store the TTL configuration) was unsafe for concurrent modifications - some of these modifications may be lost. This short series fixes the bug, and also adds (in the last patch) a test that reproduces the bug and verifies that it's fixed. The cause of the incorrect isolation was that we separately read the old tags and wrote the modified tags. In this series we introduce a new function, `modify_tags()` which can do both under one lock, so concurrent tag operations are serialized and therefore isolated as expected. Fixes #6389. Closes #13150 * github.com:scylladb/scylladb: test/alternator: test concurrent TagResource / UntagResource db/tags: drop unsafe update_tags() utility function alternator: isolate concurrent modification to tags db/tags: add safe modify_tags() utility functions migration_manager: expose access to storage_proxy	2023-04-11 11:17:23 +03:00
Calle Wilund	6525209983	alternator/rest api tests: Remove name assumption and rely on actual scylla info Fixes #13332 The tests use the discriminator "system" as prefix to assume keyspaces are marked "internal" inside scylla. This is not true in enterprise universe (replicated key provider). It maybe/probably should, but that train is sailing right now. Fix by removing one assert (not correct) and use actual API info in the alternator test. Closes #13333	2023-03-28 15:41:23 +03:00
Nadav Har'El	c41b2d35ed	test/alternator: test concurrent TagResource / UntagResource This patch adds an Alternator test reproducing issue #6389 - that concurrent TagResource and/or UntagResource operations were broken and some of the concurrent modifications were lost. The test has two threads, one repeatedly adds and removes a tag A, the other adds and removes a tag B. After we add tag A, we expect tag A to be there - but due to issue #6389 this modification was sometimes lost when it raced with an operation on B. This test consistently failed before issue #6389 was fixed, and passes now after the issue was fixed by the previous patches. The bug reproduces by chance, so it requires a fairly long loop (a few seconds) to be sure it reproduces - so is marked a "veryslow" test and will not run in CI, but can be used to manually reproduce this issue with: test/alternator/run --runveryslow test_tag.py::test_concurrent_tag Refs #6389. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-03-13 13:38:15 +02:00
Nadav Har'El	7dc54771e1	test/cql-pytest: allow "run-cassandra" without building Scylla Before this patch, all scripts which use test/cql-pytest/run.py looked for the Scylla executable as their first step. This is usually the right thing to do, except in two cases where Scylla is not needed: 1. The script test/cql-pytest/run-cassandra. 2. The script test/alternator/run with the "--aws" option. So in this patch we change run.py to only look for Scylla when actually needed (the find_scylla() function is called). In both cases mentioned above, find_scylla() will never get called and the script can work even if Scylla was never built. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13010	2023-03-01 07:54:19 +02:00
Nadav Har'El	14cdd034ee	test/alternator: fix flaky test for partition-tombstone scan The test test_scan.py::test_scan_long_partition_tombstone_string checks that a full-table Scan operation ends a page in the middle of a very long string of partition tombstones, and does NOT scan the entire table in one page (if we did that, getting a single page could take an unbounded amount of time). The test is currently flaky, having failed in CI runs three times in the past two months. The reason for the flakiness is that we don't know exactly how long we need to make the sequence of partition tombstones in the test before we can be absolutely sure a single page will not read this entire sequence. For single-partition scans we have the "query_tombstone_page_limit" configuration parameter, which tells us exactly how long we need to make the sequence of row tombstones. But for a full-table scan of partition tombstones, the situation is more complicated - because the scan is done in parallel on several vnodes and each of them needs to read query_tombstone_page_limit before it stops. In my experiments, using query_tombstone_page_limit * 4 consecutive tombstones was always enough - I ran this test hundreds of times and it didn't fail once. But since it did fail on Jenkins very rarely (3 times in the last two months), maybe the multiplier 4 isn't enough. So this patch doubles it to 8. Hopefully this would be enough for anyone (TM). This makes this test even bigger and slower than it was. To make it faster, I changed this test's write isolation mode from the default always_use_lwt to forbid_rmw (not use LWT). This leaves the test's total run time to be similar to what it was before this patch - around 0.5 seconds in dev build mode on my laptop. Fixes #12817 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12819	2023-02-14 08:09:44 +02:00
Nadav Har'El	621c49b621	test/alternator: more tests for listing streams In issue #12601, a dtest involving paging of ListStreams showed incorrect results - the paged results had one duplicate stream and one missing stream. We believe that the cause of this bug was that the unsorted map of tables can change order between pages. In this patch we add a test test_list_streams_paged_with_new_table which can demonstrate this bug - by adding a lot of tables in mid-paging, we cause the unsorted map to be reshuffled and the paging to break. This is not the same situation as in #12601 (which did not involve new tables) but we believe it demonstrates the same bug - and checks its fix. Indeed this passes with the fix in pull request #12614 and fails without it. This patch also adds a second test, test_stream_arn_unchanging: That test eliminates a guess we had for the cause of #12601. We thought that maybe the stream ARN changes on a table if its schema version changes, but the new test confirms that it actually behaves as expected (the stream ARN doesn't change). Refs #12601 Refs #12614 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12616	2023-02-13 16:30:24 +02:00

311 Commits