scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 19:10:42 +00:00

Author	SHA1	Message	Date
Nadav Har'El	e5f6adf46c	test/alternator: improve tests for DescribeTable for indexes I created new issues for each missing field in DescribeTable's response for GSIs and LSIs, so in this patch we edit the xfail messages in the test to refer to these issues. Additionally, we only had a test for these fields for GSIs, so this patch also adds a similar test for LSIs. It turns out there is a difference between the two tests - the two fields IndexStatus and ProvisionedThroughput are returned for GSIs, but not for LSIs. Refs #7750 Refs #11466 Refs #11470 Refs #11471 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11473	2022-09-07 09:50:16 +02:00
Nadav Har'El	941c719a23	alternator: return ProvisionedThroughput in DescribeTable DescribeTable is currently hard-coded to return PAY_PER_REQUEST billing mode. Nevertheless, even in PAY_PER_REQUEST mode, the DescribeTable operation must return a ProvisionedThroughput structure, listing both ReadCapacityUnits and WriteCapacityUnits as 0. This requirement is not stated in some DynamoDB documentation but is explicitly mentioned in https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_ProvisionedThroughput.html Also, empirically, DynamoDB returns ProvisionedThroughput with zeros even in PAY_PER_REQUEST mode. We even had an xfailing test to confirm this. The ProvisionedThroughput structure being missing was a problem for applications like DynamoDB connectors for Spark, if they implicitly assume that ProvisionedThroughput is returned by DescribeTable, and fail (as described in issue #11222) if it's outright missing. So this patch adds the missing ProvisionedThroughput structure, and the xfailing test starts to pass. Note that this patch doesn't change the fact that attempting to set a table to PROVISIONED billing mode is ignored: DescribeTable continues to always return PAY_PER_REQUEST as the billing mode and zero as the provisioned capacities. Fixes #11222 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11298	2022-08-22 09:58:09 +02:00
Nadav Har'El	c27f431580	test/alternator: fix a flaky test for full-table scan page size This patch fixes the test test_scan.py::test_scan_paging_missing_limit which failed in a Jenkins run once (that we know of). That test verifies that an Alternator Scan operation without an explicit "Limit" is nevertheless paged: DynamoDB (and also Scylla) wanted this page size to be 1 MB, but it turns out (see #10327) that because of the details of how Scylla's scan works, the page size can be larger than 1 MB. How much larger? I ran this test hundreds of times and never saw it exceed a 3 MB page - so the test asserted the page must be smaller than 4 MB. But now in one run - we got to this 4 MB and failed the test. So in this patch we increase the table to be scanned from 4 MB to 6 MB, and assert the page size isn't the full 6 MB. The chance that this size will eventually fail as well should be (famous last words...) very small for two reasons: First because 6 MB is even higher than the maximum I saw in practice, and second because empirically I noticed that adding more data to the table reduces the variance of the page size, so it should become closer to 1 MB and reduce the chance of it reaching 6 MB. Refs #10327 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11280	2022-08-12 06:57:45 +03:00
Nadav Har'El	d03bd82222	Revert "test: move scylla_inject_error from alternator/ to cql-pytest/" This reverts commit `8e892426e2` and fixes the code in a different way: That commit moved the scylla_inject_error function from test/alternator/util.py to test/cql-pytest/util.py and renamed test/alternator/util.py. I found the rename confusing and unnecessary. Moreover, the moved function isn't even usable today by the test suite that includes it, cql-pytest, because it lacks the "rest_api" fixture :-) so test/cql-pytest/util.py wasn't the right place for it anyway. test/rest_api/rest_util.py could have been a good place for this function, but there is another complication: Although the Alternator and rest_api tests both had a "rest_api" fixture, it has a different type, which led to the code in rest_api which used the moved function to have to jump through hoops to call it instead of just passing "rest_api". I think the best solution is to revert the above commit, and duplicate the short scylla_inject_error() function. The duplication isn't an exact copy - the test/rest_api/rest_util.py version now accepts the "rest_api" fixture instead of the URL that the Alternator version used. In the future we can remove some of this duplication by having some shared "library" code but we should do it carefully and starting with agreeing on the basic fixtures like "rest_api" and "cql", without that it's not useful to share small functions that operate on them. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11275	2022-08-11 06:43:26 +03:00
Aleksandra Martyniuk	8e892426e2	test: move scylla_inject_error from alternator/ to cql-pytest/ Move scylla_inject_error from alternator/ to cql-pytest/ so it can be reached from various tests dirs. alternator/util.py is renamed to alternator/alternator_util.py to avoid name shadowing.	2022-07-29 09:35:20 +02:00
Nadav Har'El	eaf3579c15	test/alternator: several more simple tests for UpdateItem This patch adds several more tests for Alternator's UpdateItem operation. These tests verify a few simple cases that, surprisingly, never had test coverage. The new tests pass (on both DynamoDB and Alternator) so did not expose any bug. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11025	2022-07-12 21:48:33 +02:00
Nadav Har'El	2581b54ea0	test/{alternator,redis}: stop using deprecated "disutils" package Python has deprecated the distutils package. In several places in the Alternator and Redis test suites, we used distutils.version to check if the library is new enough for running the test (and skip the test if it's too old). On new versions of Python, we started getting deprecation warnings such as: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives PEP 632 recommends using packaging.version instead of distutils.version, and indeed it works well. After applying this patch, Alternator and Redis test runs no longer end in silly deprecation warnings. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11007	2022-07-11 08:00:45 +03:00
David Garcia	b85843b9cc	Fix broken links	2022-06-28 15:19:36 +01:00
David Garcia	bb21c3c869	Move dev docs to docs/dev	2022-06-24 18:07:08 +01:00
Nadav Har'El	3aca1ca572	alternator: make BatchGetItem group reads by partition DynamoDB API's BatchGetItem invokes a number (up to 25) of read requests in parallel, returning when all results are available. Alternator naively implemented this by sending all read requests in parallel, no matter which requests these were. That implementation was inefficient when all the requests are to different items (clustering rows) of the same partition. In a multi-node setup this will end up sending 25 separate requests to the same remote node(s). Even on a single-node setup, this may result in reading from disk more than once, and even if the partition is cached - doing an O(logN) search in it multiple times. What we do in this patch, instead, is to group all the BatchGetItem requests aimed at the same partition into a single read request asking for a (sorted) list of clustering keys. This is similar to an "IN" request in CQL. As an example of the performance benefit of this patch, I tried a BatchGetItem request asking for 20 random items from a 10-million item partition. I measured the latency of this request on a single-node Scylla. Before this patch, I saw a latency of 17-21 ms (the lower number is when the request is retried and the requested items are already in the cache). After this patch, the latency is 10-14 ms. The performance improvement on multi-node clusters is expected to be even higher. Unfortunately the patch is less trivial than I hoped it would be, because some of the old code was organized under the assumption that each read request only returned one item (and if it failed, it means only one item failed), so this part of the code had to be reorganized (and, for making the code more readable, coroutinized). An unintended benefit of the code reorganization is that it also gave me an opportunity to fail an attempt to ask BatchGetItem the same item more than once (issue #10757). The patch also adds a few more corner cases in the tests, to be even more sure that the code reorganization doesn't introduce a regression in BatchGetItem. Fixes #10753 Fixes #10757 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-06-19 14:47:57 +03:00
Nadav Har'El	0be06e0bdf	test/alternator: additional test for BatchGetItem Our simple test for BatchGetItem on a table with sort keys still has requests with just one sort key per partition, so if BatchGetItem has a bug with requesting multiple sort keys from the same partition, such a bug won't be caught by the simple tests. So in this patch we add a test that does. This will be useful for the next patch, where we are planning to refactor BatchGetItem's handling of multiple sort keys in the same partition - so it will be useful to have more regression tests. The tests test_batch_get_item_large and test_batch_get_item_partial would actually also catch such bugs, but they are more elaborate tests and it's nice to have smaller tests more focused on checking specific features. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-06-16 18:19:20 +03:00
Nadav Har'El	e20233dab1	alternator: improve error handling when trying to tag a GSI or LSI In issue #10786, we raised the idea of maybe allowing to tag (with TagResource) GSIs and LSIs, not just base tables. However, currently, neither DynamoDB nor Scylla allows it. So in this patch we add a test that confirms this. And while at it, we fix Alternator to return the same error message as DynamoDB in this case. Refs #10786. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-06-13 18:14:42 +03:00
Nadav Har'El	8866c326de	alternator: forbid duplicate index (LSI and GSI) names Adding an LSI and GSI with the same name to the same Alternator table should be forbidden - because if both exist only one of them (the GSI) would actually be usable. DynamoDB also forbids such duplicate names. So in this patch we add a test for this issue, and fix it. Since the patch involves a few more uses of the IndexName string, we also clean up its handling a bit, to use std::string_view instead of the old-style std::string&. Fixes #10789 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-06-13 18:14:42 +03:00
Nadav Har'El	00866a75d8	alternator: add ARN for indexes (LSI and GSI) DynamoDB gives an ARN ("Amazon Resource Name") to LSIs and GSIs. These look like BASEARN/index/INDEXNAME, where BASEARN is the ARN of the base table, and INDEXNAME is the name of the LSI or the GSI. These ARNs should be returned by DescribeTable as part of its description of each index, and this patch adds that missing IndexArn field. The ARN we're adding here is hardly useful (e.g., as explained in issue #10786, it can't be used to add tags to the index table), but nevertheless should exist for compatibility with DynamoDB. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-06-13 18:14:42 +03:00
Nadav Har'El	75c2bd78ae	test/alternator: reproducer for GetBatchItem duplicate keys It turns out that DynamoDB forbids requesting the same item more than once in a BatchGetItem request. Trying to do it would obviously be a waste, but DynamoDB outright refuses it - and Alternator currently doesn't (refs #10757). The test currently passes on DynamoDB and fails on Alternator, so it is marked xfail. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #10758	2022-06-09 07:04:50 +02:00
Nadav Har'El	d0ca09a925	alternator: implement DescribeContinuousBackups operation Although we don't yet support the DynamoDB API's backup features (see issue #5063), we can already implement the DescribeContinuousBackups operation. It should just say that continuous backups, and point-in-time restores, are disabled. This will be useful for client code which tries to inquire about continuous backups, even if not planning to use them in practice (e.g., see issue #10660). Refs #5063 Refs #10660 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-05-26 15:13:50 +03:00
Nadav Har'El	f6ce7891a5	test/alternator: add test for key length limits DynamoDB limits partition-key length to 2048 bytes and sort-key length to 1024 bytes. Alternator currently has no such limits officially, but if a user tries a key length of over 64 KB, the result will be an "internal server error" as Alternator runs into Scylla's low-level key length limit of 64 KB. In this patch we add (mostly xfailing) tests confirming all the above observations. The tests include extensive comments on what they are testing and why. Some of these tests (specifically, the ones checking what happens above 64 KB) should pass once Alternator is fixed. Other tests - requiring that the limits be exactly what they are in DynamoDB - may either not pass or change in the future, depending on what we decide the limits should be in Alternator. Refs #10347 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #10438	2022-04-26 18:09:19 +02:00
Nadav Har'El	84143c2ee5	alternator: implement Select option of Query and Scan This patch implements the previously-unimplemented Select option of the Query and Scan operators. The most interesting use case of this option is Select=COUNT which means we should only count the items, without returning their actual content. But there are actually four different Select settings: COUNT, ALL_ATTRIBUTES, SPECIFIC_ATTRIBUTES, and ALL_PROJECTED_ATTRIBUTES. Five previously-failing tests now pass, and their xfail mark is removed: * test_query.py::test_query_select * test_scan.py::test_scan_select * test_query_filter.py::test_query_filter_and_select_count * test_filter_expression.py::test_filter_expression_and_select_count * test_gsi.py::test_gsi_query_select_1 These tests cover many different cases of successes and errors, including combination of Select and other options. E.g., combining Select=COUNT with filtering requires us to get the parts of the items needed for the filtering function - even if we don't need to return them to the user at the end. Because we do not yet support GSI/LSI projection (issue #5036), the support for ALL_PROJECTED_ATTRIBUTES is a bit simpler than it will need to be in the future, but we can only finish that after #5036 is done. Fixes #5058. The most intrusive part of this patch is a change from attrs_to_get - a map of top-level attributes that a read needs to fetch - to an optional<attrs_to_get>. This change is needed because we also need to support the case that we want to read no attributes (Select=COUNT), and attrs_to_get.empty() used to mean that we want to read all attributes, not no attributes. After this patch, an unset optional<attrs_to_get> means read all attributes, a set but empty attrs_to_get means read no attributes, and a set and non-empty attrs_to_get means read those specific attributes. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220405113700.9768-2-nyh@scylladb.com>	2022-04-11 10:04:32 +02:00
Nadav Har'El	9c1ebdceea	alternator: forbid empty AttributesToGet In DynamoDB one can retrieve only a subset of the attributes using the AttributesToGet or ProjectionExpression parameters to read requests. Neither allows an empty list of attributes - if you don't want any attributes, you should use Select=COUNT instead. Currently we correctly refuse an empty ProjectionExpression - and have a test for it: test_projection_expression.py::test_projection_expression_toplevel_syntax However, Alternator is missing the same empty-forbidding logic for AttributesToGet. An empty AttributesToGet is currently allowed, and basically says "retrieve everything", which is sort of unexpected. So this patch adds the missing logic, and the missing test (actually two tests for the same thing - one using GetItem and the other Query). Fixes #10332 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220405113700.9768-1-nyh@scylladb.com>	2022-04-11 10:21:02 +03:00
Nadav Har'El	86d01542de	test/alternator: test another example of nested function calls In the existing test we noticed that list_append(if_not_exists(...)) is allowed, but list_append(list_append(...)) is not. I wasn't sure whether if_not_exists(if_not_exists(..)) will be allowed - and this test verifies that it is - it works on both Scylla and DynamoDB, and gives the same results on both. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220407122729.155648-1-nyh@scylladb.com>	2022-04-11 09:56:02 +03:00
Nadav Har'El	67e0590bbc	alternator: remove old TODO (with test verifying it) We had an old TODO in the Alternator "Scan" operation code which suggested that we may need to do something to limit the size of pages when a row limit ("Limit") isn't given. But we do already have a built-in limit on page sizes (1 MB), so this TODO isn't needed and can be removed. But I also wanted to make sure we have a test that this limit works: We already had a test that this 1 MB limit works for a single-partition Query (test_query.py::test_query_reverse_longish - tested both forward and reversed queries). In this patch I add a similar test for a whole-table Scan. It turns out that although page size is limited in this case as well, it's not exactly 1 MB... For small tables it can even reach 3 MB. I consider this "good enough" and that we can drop the TODO, but also opened issue #10327 to document this surprising (for me) finding. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220404145240.354198-1-nyh@scylladb.com>	2022-04-05 09:23:23 +03:00
Nadav Har'El	56936d3c16	test/alternator: add reproducers for scan of long string of tombstones This patch adds two xfailing tests for issue #7933. That issue is about what Scan or Query paging does when encountering a very long string of consecutive tombstones (partition or row tombstones). Ideally, in that case the scan could stop on one of these tombstones after already processing too many. But as these two tests demonstrate, the scan can't stop in the middle of a long string of tombstones - and as a result retrieving a single page can take an unbounded amount of time, which is wrong. Currently the tests are marked `@veryslow` (they each take more than a minute) because they each create a huge number of tombstones to demonstrate a huge amount of work for a single page. When we fix issue #7933 and have a much smaller limit on the number of tombstones processed in a single page, we can hopefully make these tests much shorter and remove the `@veryslow` tag. The `@veryslow` tag means that although these tests can be used manually (with `--runveryslow`) they will not yet be run as part of the usual regression tests. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220403070706.250147-1-nyh@scylladb.com>	2022-04-05 09:11:38 +03:00
Nadav Har'El	758f8f01d7	test/alternator: turn REST API finding into a fixture In test_tracing.py and util.py, we already have three duplicates of code which looks for the Scylla REST API. We'll soon want to add even more uses of this REST API, so it's a good time to add a single fixture, "rest_api", which can be used in all tests that need the Scylla REST API instead of duplicating the same code. A test using the "rest_api" fixture will be skipped if the server isn't Scylla, or its port 10000 is not available or not responsive. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220331195337.64352-1-nyh@scylladb.com>	2022-04-01 10:51:59 +03:00
Nadav Har'El	d8c0680585	test/alternator: add regression test for old ALL_NEW bug In commit `964500e47a`, in the middle of a larger series, I fixed a small Alternator bug that I found while working on that series. The bug was that the ReturnValues=ALL_NEW feature moved out the read previous_item, which breaks operations that need previous_item, e.g., an ADD operation. Unfortunately, we never had a regression test for this fix, so in this patch I add one. This bug was re-discovered on an old branch by a user, at which point I noticed that we don't have a test for it - so I want to add it now, even though the bug itself is long gone from Scylla master. I verified that the new test indeed fails on old versions of Scylla before the aforementioned commit, and passes when backporting only that commit. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220327074928.3608576-1-nyh@scylladb.com>	2022-03-28 08:40:28 +02:00
Nadav Har'El	653f2df28f	alternator: fix JSON escaping of error responses In the DynamoDB API, error responses are in JSON format with specific fields ("__type" and "message" in the x-amz-json-1.0 format currently used). Alternator tried to be clever and build the string representation of this JSON itself, instead of using RapidJSON. But this optimization was a mistake - if the error message contains characters that need escaping (such as double quotes and newlines), they weren't escaped, and the resulting JSON was malformed. When the client library boto3 read this malformed JSON it got confused and considered the entire error response to be a string, which resulted in an ugly error message. The fix is easy - just build the JSON output as usual with RapidJSON instead of trying to optimize using string operations. The patch also includes two tests reproducing this bug and checking its fix. The first test uses boto3 and shows it got confused on the type of error (not understanding that it is a ValidationException). The second test bypasses boto3 and shows exactly where the bug happens - the response is an unparsable JSON. Fixes #10278 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220327132705.3707979-1-nyh@scylladb.com>	2022-03-27 16:32:36 +03:00
Nadav Har'El	49a8164fb7	alternator: add configurable scan period to TTL expiration Before this patch, the experimental TTL (expiration time) feature in Alternator scans tables for expiration in a tight loop - starting the next scan one second after the previous one completed. In this patch we introduce a new configuration option, alternator_ttl_period_in_seconds, which determines how frequently to start the scan. The default is 24 hours - meaning that the next scan is started 24 hours after the previous one started. The tests (test/alternator/run) change this configuration back to one second, so that expiration tests finish as quickly as possible. Please note that the scan is not slowed down to fill this 24 hours - if it finishes in one hour, it will then sleep for 23 hours. Additional work would be needed to slow down the scan to not finish too quickly. One idea not yet implemented is to move the expiration service from the "maintenance" scheduling group which it uses today to a new scheduling group, and modifying the number of shares that this group gets. Another thing worth noting about the configurable period (which defaults to 24 hours) is that when TTL is enabled on an Alternator table, it can take that amount of time until its scan starts and items start expiring from it. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-02-25 07:26:11 +02:00
Nadav Har'El	4349514064	test/alternator: add smaller reproducer for Limit-less reverse query The regression test we have for Alternator's issue #9487 (where a reverse query without a Limit given was broken into 100MB pages instead of the expected 1MB) is test_query.py::test_query_reverse_long. But this is a very long test requiring a 100MB partition, and because of its slowness isn't run by default. This patch adds another version of that test, test_query_reverse_longish, which reproduces the same issue #9487 with a partition 50 times shorter (2MB) so it only takes a fraction of a second and can be enabled by default. It also requires much less network traffic which is important when running these tests non-locally. We leave the original test test_query_reverse_long behind, it can be still useful to stress Scylla even beyond the 100MB boundary, but it remains in @veryslow mode so won't run in default test runs. Refs #9487 Refs #7586 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220220161905.852994-1-nyh@scylladb.com>	2022-02-21 09:12:16 +01:00
Nadav Har'El	f292d3d679	alternator: make schema modifications in CreateTable atomic The Alternator CreateTable operation currently performs several schema-changing operations separately - one by one: It creates a keyspace, a table in that keyspace and possibly also multiple views, and it sets tags on the table. A consequence of this is that concurrent CreateTable and DeleteTable operations (for example) can result in unexpected errors or inconsistent states - for example CreateTable wants to create the table in the keyspace it just created, but a concurrent DeleteTable deleted it. We have two issues about this problem (#6391 and #9868) and three tests (test_table.py::test_concurrent_create_and_delete_table) reproducing it. In this patch we fix these problems by switching to the modern Scylla schema-changing API: Instead of doing several schema-changing operations one by one, we create a vector of schema mutations performing all these operations - and then perform all these mutations together. When the experimental Raft-based schema modification feature is enabled, this completely solves the races, and the tests begin to pass. However, if the experimental Raft mode is not enabled, these tests continue to fail because there is still no locking while applying the different schema mutations (not even on a single node). So I put a special fixture "fails_without_raft" on these tests - which means that the tests xfail if run without Raft, and are expected to pass when run on Raft. Indeed, after this patch test/alternator/run --raft test_table.py::test_concurrent_create_and_delete_table shows three passing tests (they also pass if we drastically increase the number of iterations), while test/alternator/run test_table.py::test_concurrent_create_and_delete_table shows three xfailing tests. All other Alternator tests pass as before with this patch, verifying that the handling of new tables, new views, tags, and CDC log tables, all happen correctly even after this patch. A note about the implementation: Before this patch, the CreateTable code used high-level functions like prepare_new_column_family_announcement(). These high-level functions become unusable if we write multiple schema operations to one list of mutations, because for example this function validates that the keyspace had already been created - when it hasn't and that's the whole point. So instead we had to use lower-level functions like add_table_or_view_to_schema_mutation() and before_create_column_family(). However, despite being lower level, these functions were public so I think it's reasonable to use them, and we probably have no other alternative. Fixes #6391 Fixes #9868 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-02-18 09:03:52 +02:00
Nadav Har'El	212c321c55	test/alternator: add reproducers for non-atomic table creation We add reproducing tests for two known Alternator issues, #6391 and #9868, which involve the non-atomicity of table creation. Creating a table currently involves multiple steps - creating a keyspace, a table, materialized views, and tags. If some of these steps succeed and some fail, we get an InternalServerError and potentially leave behind some half-built table. Both issues will be solved by making better use of the new Raft-based capabilities of making multiple modifications to the schema atomically, but this patch doesn't fix the problem - it just proves it exists. The new tests involve two threads - one repeatedly trying to create a table with a GSI or with tags - and the other thread repeatedly trying to delete the same table under its feet. Both bugs are reproduced almost immediately. Note that like all test/alternator tests, the new tests are usually run on just one node. So when we fix the bug and these tests begin to pass, it will not be a proof that concurrent schema modification works safely on different nodes. To prove that, we will also need a multi-node test. However, this test can prove that we used Raft-based schema modification correctly - and if we assume that the Raft-based schema modification feature is itself correct, then we can be sure that CreateTable will be correct also across multiple nodes. Although it won't hurt to check it directly. Refs #6391 Refs #9868 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220207223100.207074-1-nyh@scylladb.com>	2022-02-14 18:21:21 +02:00
Nadav Har'El	4937270803	test/alternator: add option to run with Raft-based schema changes This patch adds a "--raft" option to test/alternator/run to enable the experimental Raft-based schema changes ("--experimental-features=raft") when running Scylla for the tests. This is the same option we added to test/cql-pytest/run in a previous patch. Note that we still don't have any Alternator tests that pass or fail differently in these two modes - these will probably come later as we fix issues #9868 and #6391. But in order to work on fixing those issues we need to be able to run the tests in Raft mode. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220209123144.321344-1-nyh@scylladb.com>	2022-02-10 09:43:10 +02:00
Nadav Har'El	9982a28007	alternator: allow REMOVE of non-existent nested attribute DynamoDB allows an UpdateItem operation "REMOVE x.y" when a map x exists in the item, but x.y doesn't - the removal silently does nothing. Alternator incorrectly generated an error in this case, and unfortunately we didn't have a test for this case. So in this patch we add the missing test (which fails on Alternator before this patch - and passes on DynamoDB) and then fix the behavior. After this patch, "REMOVE x.y" will remain an error if "x" doesn't exist (saying "document paths not valid for this item"), but if "x" exists and is a map, but "x.y" doesn't, the removal will silently do nothing and will not be an error. Fixes #10043. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220207133652.181994-1-nyh@scylladb.com>	2022-02-07 18:40:48 +02:00
Nadav Har'El	8a745593a2	Merge 'alternator: fill UnprocessedKeys for failed batch reads' from Piotr Sarna DynamoDB protocol specifies that when getting items in a batch failed only partially, unprocessed keys can be returned so that the user can perform a retry. Alternator used to fail the whole request if any of the reads failed, but right now it instead produces the list of unprocessed keys and returns them to the user, as long as at least 1 read was successful. This series comes with a test based on Scylla's error injection mechanism, and thus is only useful in modes which come with error injection compiled in. In release mode, expect to see the following message: SKIPPED (Error injection not enabled in Scylla - try compiling in dev/debug/sanitize mode) Fixes #9984 Closes #9986 * github.com:scylladb/scylla: test: add total failure case for GetBatchItem test: add error injection case for GetBatchItem test: add a context manager for error injection to alternator alternator: add error injection to BatchGetItem alternator: fill UnprocessedKeys for failed batch reads	2022-01-31 15:28:24 +02:00
Piotr Sarna	c87126198d	test: add total failure case for GetBatchItem The test verifies that if all reads from a batch operation failed, the result is an error, and not a success response with UnprocessedKeys parameter set to all keys.	2022-01-31 14:21:55 +01:00
Piotr Sarna	e79c2943fc	test: add error injection case for GetBatchItem The new test case is based on Scylla error injection mechanism and forces a partial read by failing some requests from the batch.	2022-01-31 14:21:55 +01:00
Piotr Sarna	99c5bec0e2	test: add a context manager for error injection to alternator With the new context manager it's now easier to request an error to be injected via REST API. Note that error injection is only enabled in certain build modes (dev, debug, sanitize) and the test case will be skipped if it's not possible to use this mechanism.	2022-01-31 14:21:55 +01:00
Nadav Har'El	a25e265373	test/alternator: improve comment on why we need "global_random" Improve the comment that explains why we needed to use an explicitly shared random sequence instead of the usual "random". We now understand that we need this workaround to undo what the pytest-randomly plugin does. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220130155557.1181345-1-nyh@scylladb.com>	2022-01-31 10:07:56 +01:00
Piotr Sarna	471205bdcf	test/alternator: use a global random generator for all test cases It was observed (perhaps it depends on the Python implementation) that an identical seed was used for multiple test cases, which violated the assumption that generated values are in fact unique. Using a global generator instead makes sure that it was only seeded once. Tests: unit(dev) # alternator tests used to fail for me locally before this patch was applied Message-Id: <315d372b4363f449d04b57f7a7d701dcb9a6160a.1643365856.git.sarna@scylladb.com>	2022-01-30 16:40:20 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Nadav Har'El	a30e71e27a	alternator: doc, test: fix mentions of reverse queries Now that issues #7586 and #9487 were fixed and reverse queries - even in long partitions - work well, we can drop the claim in alternator/docs/compatibility.md that reverse queries are buggy for large partitions. We can also remove the "xfail" mark from the test that checks this feature, as it now passes. Refs #7586 Refs #9487 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #9831	2022-01-16 17:46:26 +02:00
Nadav Har'El	e7e9001808	test/alternator: add more tests for GSI "Projection" We already have multiple tests for the unimplemented "Projection" feature of GSI and LSI (see issue #5036). This patch adds seven more test cases, focusing on various types of error conditions (e.g., trying to project the same attribute twice), esoteric corner cases (it's fine to list a key in NonKeyAttributes!), and corner cases that I expect we will have in our implementation (e.g., a projected attribute may either be a real Scylla column or just an element in a map column). All new tests pass on DynamoDB and fail on Alternator (due to #5036), so marked with "xfail". Refs #5036. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211228193748.688060-1-nyh@scylladb.com>	2022-01-05 10:35:36 +02:00
Nadav Har'El	31eeb44d28	alternator: fix error on UpdateTable for non-existent table When the UpdateTable operation is called for a non-existent table, the appropriate error is ResourceNotFoundException, but before this patch we ran into an exception, which resulted in an ugly "internal server error". In this patch we use the existing get_table() function which most other operations use, and which does all the appropriate verifications and generates the appropriate Alternator api_error instead of letting internal Scylla exceptions escape to the user. This patch also includes a test for UpdateTable on a non-existent table, which used to fail before this patch and pass afterwards. We also add a test for DeleteTable in the same scenario, and see it didn't have this bug. As usual, both tests pass on DynamoDB, which confirms we generate the right error codes. Fixes #9747. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211206181605.1182431-1-nyh@scylladb.com>	2021-12-14 13:09:27 +01:00
Nadav Har'El	815324713e	test/alternator: add more tests for ADD operand mismatch The "ADD" operator in UpdateItem's AttributeUpdates supports a number of types (numbers, sets and strings), and should result in a ValidationException if the attribute's existing type is different from the type of the operand - e.g., trying to ADD a number to an attribute which has a set as a value. So far we only had partial testing for this (we tested the case where both operands are sets, but of different types) so this patch adds the missing tests. The new tests pass (on both Alternator and DynamoDB) - we don't have a bug there. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211213195023.1415248-1-nyh@scylladb.com>	2021-12-14 11:15:23 +02:00
Nadav Har'El	03d67440ef	alternator: test additional metrics and fix another broken counter In issue #9406 we noticed that a counter for BatchGetItem operations was missing. When we fixed it, we added a test which checked this counter - but only this counter. It was left as a TODO to test the rest of the Alternator metrics, and this is what this patch does. Here we add a comprehensive test for all of the operations supported by Scylla and how they increase the appropriate operation counter. With this test we discovered a new bug: the DescribeTimeToLive operation incremented the UpdateTimeToLiveCounter :-( So in this patch we also include a fix for that bug, and the new test verifies that it is fixed. In addition to the operation counters, Alternator also has additional metrics, and we also added tests for some of them - but not all. The remaining untested metrics are listed in a TODO comment. Message-Id: <20211206154727.1170112-1-nyh@scylladb.com>	2021-12-10 08:08:54 +02:00
Piotr Sarna	26288c1a86	test,alternator: make TTL tests less prone to false negatives On my local machine, a 3 second deadline proved to cause flakiness of test_ttl_expiration case, because its execution time is just around 3 seconds. This patch addresses the problem by bumping the local timeout to 10 (and 15 for test_ttl_expiration_long, since it's dangerously near the 10 second deadline on my machine as well). Moreover, some test cases short-circuited once they detected that all needed items expired, but other ones lacked it and always used their full time slot. Since 10 seconds is a little too long for a single test case, even one marked with --veryslow, this patch also adds a couple of other short-circuits. One exception is test_ttl_expiration_hash_wrong_type, which actually depends on the fact that we should wait for the whole loop to finish. Since this case was never flaky for me with the 3 second timeout, it's left as is. Theoretically, test_ttl_expiration also kind of depends on checking the condition more than once (because the TTL of one of the values is bumped on each iteration), but empirical evidence shows that multiple iterations always occur in this test case anyway - for me, it always spun at least 3 times. Tests: unit(release) Message-Id: <a0a479929dac37daace744e0a970567a8aa3b518.1638431933.git.sarna@scylladb.com>	2021-12-08 16:02:45 +02:00
Nadav Har'El	92e7fbe657	test/alternator: check correct error for unknown operation Add a short test verifying that Alternator responds with the correct error code (UnknownOperationException) when receiving an unknown or unsupported operation. The test passes on both AWS and Alternator, confirming that the behavior is the same. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211206125710.1153008-1-nyh@scylladb.com>	2021-12-08 13:56:38 +02:00
Nadav Har'El	d3abff9ea1	test/alternator: validate that TagResource needs a Tags parameter A short new test to verify that in the TagResource operation, the Tags parameter - specifying which tags to set - is required. The test passes on both AWS and Alternator - they both produce a ValidationException in this case (the specific human-readable error message is different, though, so we don't check it). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211206140541.1157574-1-nyh@scylladb.com>	2021-12-06 15:08:16 +01:00
Calle Wilund	3e21fea2b6	test_streamts: test_streams_starting_sequence_number fix 'LastEvaluatedShardId' usage It is not part of the raw response, but of the 'StreamDescription' object. Test fails intermittently depending on PK randomization. Closes #9710	2021-12-01 11:05:40 +02:00
Nadav Har'El	d9c5c4eab6	test/alternator: tests for Select parameter in GSI and LSI We already have tests for the behavior of the "Select" parameter when querying a base table, but this patch adds additional tests for its behavior when querying a GSI or a LSI. There are some differences: Select=ALL_PROJECTED_ATTRIBUTES is not allowed for base tables, but is allowed - and in fact is the default - for GSI and LSI. Also, GSI may not allow ALL_ATTRIBUTES (which is the default for base tables) if only a subset of the attributes were projected. The new tests xfail because the Select and Projection features have not yet been implemented in Alternator. They pass in DynamoDB. After this patch we have (hopefully) complete test coverage of the Select feature, which will be helpful when we start implementing it. Refs #5058 (Select) Refs #5036 (Projection) Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211125100443.746917-1-nyh@scylladb.com>	2021-11-29 20:28:43 +01:00
Nadav Har'El	1c279118f4	test/alternator: more test cases for Select parameter Add to the existing tests for the Select parameter of the Query and Scan operations another check: That when Select is ALL_ATTRIBUTES or COUNT, specifying AttributesToGet or ProjectionExpression is forbidden - because the combination doesn't make sense. The expanded test continues to xfail on Alternator (because the Select parameter isn't yet implemented), and passes on DynamoDB. Strengthening the tests for this feature will be helpful when we decide to implement it. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211125074128.741677-1-nyh@scylladb.com>	2021-11-29 20:28:25 +01:00
Piotr Sarna	ecd122a1b0	Merge 'alternator: rudimentary implementation of TTL expiration service' from Nadav Har'El In this patch series we add an implementation of an expiration service to Alternator, which periodically scans the data in the table, looking for expired items and deleting them. We also continue to improve the TTL test suite to cover additional corner cases discovered during the development of the code. This implementation is good enough to make all existing tests but one, plus a few new ones, pass, but is still a very partial and inefficient implementation littered with FIXMEs throughout the code. Among other things, this initial implementation doesn't do anything reasonable about pacing of the scan or about multiple tables, it scans entire items instead of only the needed parts, and because each shard "owns" a different subset of the token ranges, if a node goes down, partitions which it "owns" will not get expired. The current tests cannot expose these problems, so we will need to develop additional tests for them. Because this implementation is very partial, the Alternator TTL continues to remain "experimental", cannot be used without explicitly enabling this experimental feature, and must not be used for any important deployment. Refs #5060 but doesn't close the issue (let's not close it until we have a reasonably complete implementation - not this partial one). 
Closes #9624 * github.com:scylladb/scylla: alternator: fix TTL expiration scanner's handling of floating point test/alternator: add TTL test for more data test/alternator: remove "xfail" tag from passing tests in test_ttl.py test/alternator: make test_ttl.py tests fast on Alternator alternator: initial implementation of TTL expiration service alternator: add another unwrap_number() variant alternator: add find_tag() function test/alternator: test another corner case of TTL setting test/alternator: test TTL expiration for table with sort key test/alternator: improve basic test for TTL expiration test/alternator: extract is_aws() function	2021-11-28 22:12:52 +02:00


233 Commits