scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 22:25:48 +00:00

Author	SHA1	Message	Date
Nadav Har'El	6e1344eb4f	alternator: better error handling for wrongly-encoded numbers In the DynamoDB API, a number is encoded in JSON requests as something like: {"N": "123"} - the type is "N" and the value "123". Note that the value of the number is encoded as a string, because the floating-point range and accuracy of DynamoDB differs from what various JSON libraries may support. We have a function unwrap_number() which supported the value of the number being encoded as an actual number, not a string. But we should NOT support this case - DynamoDB doesn't. In this patch we add a test that confirms that DynamoDB doesn't, and remove the unnecessary case from unwrap_number(). The unnecessary case also had a FIXME, so it's a good opportunity to get rid of a FIXME. When writing the test, I noticed that the error which DynamoDB returns in this case is SerializionException instead of the more usual ValidationException. I don't know why, but let's also change the error type in this patch. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211115125738.197099-1-nyh@scylladb.com>	2021-11-15 14:47:49 +01:00
Nadav Har'El	b95e431228	alternator: fix bug in ReturnValues=ALL_NEW This patch fixes a bug in UpdateItem's ReturnValues=ALL_NEW, which in some cases returned the OLD (pre-modification) value of some of the attributes, instead of its NEW value. The bug was caused by a confusion in our JSON utility function, rjson::set(), which sounds like it can set any member of a map, but in fact may only be used to add a new member - if a member with the same name (key) already existed, the result is undefined (two values for the same key). In ReturnValues=ALL_NEW we did exactly this: we started with a copy of the original item, and then used set() to override some of the members. This is not allowed. So in this patch, we introduce a new function, rjson::replace(), which does what we previously thought that rjson::set() does - i.e., replace a member if it exists, or if not, add it. We call this function in the ReturnValues=ALL_NEW code. This patch also adds a test case that reproduces the incorrect ALL_NEW results - and gets fixed by this patch. In an upcoming patch, we should rename the confusingly-named set() functions and audit all their uses. But we don't do this in this patch yet. We just add some comments to clarify what set() does - but don't change it, and just add one new function for replace(). Fixes #9542 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211104134937.40797-1-nyh@scylladb.com>	2021-11-04 16:34:58 +01:00
Piotr Sarna	f36bbe05b4	Merge 'alternator: add support for AttributeUpdates Add operation' from Nadav Har'El In UpdateItem's AttributeUpdates (old-style parameter) we were missing support for the ADD operation - which can increment a number, or add items to sets (or to lists, even though this fact isn't documented). This two-patch series add this missing feature. The first patch just moves an existing function to where we can reuse it, and the second patch is the actual implementation of the feature (and enabling its test). Fixes #5893 Closes #9574 * github.com:scylladb/scylla: alternator: add support for AttributeUpdates ADD operation alternator: move list_concatenate() function	2021-11-03 09:33:50 +01:00
Nadav Har'El	00335b1901	alternator: add support for AttributeUpdates ADD operation In UpdateItem's AttributeUpdates (old-style parameter) we were missing support for the ADD operation - which can increment a number, or add items to sets (or to lists, even though this fact isn't documented). This patch adds this feature, and the test for it begins to pass so its "xfail" marker is removed. Fixes #5893 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-11-03 10:19:26 +02:00
Nadav Har'El	56eb994d8f	alternator: allow Authorization header to be without spaces The "Authorization" HTTP header is used in DynamoDB API to sign requests. Our parser for this header, in server::verify_signature(), required the different components of this header to be separated by a comma followed by a whitespace - but it turns out that in DynamoDB both spaces and commas are optional - one of them is enough. At least one DynamoDB client library - the old "boto" (which predated boto3) - builds this header without spaces. In this patch we add a test that shows that an Authorization header with spaces removed works fine in DynamoDB but didn't work in Alternator, and after this patch modifies the parsing code for this header, the test begins to pass (and the other tests show that the previously-working cases didn't break). Fixes #9568 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211101214114.35693-1-nyh@scylladb.com>	2021-11-03 06:38:28 +02:00
Nadav Har'El	6ae0ea0c48	alternator: return the correct Content-Type header Although the DynamoDB API responses are JSON, additional conventions apply to these responses - such as how error codes are encoded in JSON. For this reason, DynamoDB uses the content type `application/x-amz-json-1.0` instead of the standard `application/json` in its responses. Until this patch, Scylla used `application/json` in its responses. This unexpected content-type didn't bother any of the AWS libraries which we tested, but it does bother the aiodynamo library (see HENNGE/aiodynamo#27). Moreover, we should return the x-amz-json-1.0 content type for future proofing: It turns out that AWS already defined x-amz-json-1.1 - see: https://awslabs.github.io/smithy/1.0/spec/aws/aws-json-1_1-protocol.html The 1.1 content type differs (only) in how it encodes error replies. If one day DynamoDB starts to use this new reply format (it doesn't yet) and if DynamoDB libraries will need to differenciate between the two reply formats, Alternator better return the right one. This patch also includes a new test that the Content-Type header is returned with the expected value. The test passes on DynamoDB, and after this patch it starts to pass on Alternator as well. Fixes #9554. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211031094621.1193387-1-nyh@scylladb.com>	2021-10-31 10:50:25 +01:00
Nadav Har'El	1d751491a3	test/alternator: recognize when Scylla crashes Before this patch, if Scylla crashes during some test in test/alternator, all tests after it will fail because they can't connect to Scylla - and we can get a report on hundreds of failures without a clear sign of where the real problem was. This patch introduces an autouse fixture (i.e., a fixture automatically used by every test) which tries to run a do-nothing health-check request after each test. If this health-check request fails, we conclude that Scylla crashed and report the test in which this happened - and exit pytest instead of failing a hundred more tests. The failure report looks something like this: ``` ! _pytest.outcomes.Exit: Scylla appears to have crashed in test test_batch.py::test_batch_get_item ! ``` And the entire test run fails. These extra health checks are not free, but they come fairly close to being free: In my tests I measured less than 0.1 seconds slowdown of the entire test suite (which has 618 tests) caused by the extra health checks. Fixes #9489 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211017123222.217559-1-nyh@scylladb.com>	2021-10-17 20:45:30 +03:00
Nadav Har'El	86e8979ff2	test/alternator, test/cql-pytest: enable specific experimental features Issue #9467 deprecated the blanket "--experimental" option which we used to enable all experimental Scylla features for testing, and suggests that individual experimental features should be enabled instead. So this is what we do in this patch for the Scylla-running scripts in test/alternator and test/cql-pytest: We need to enable UDF for the CQL tests, and to enable Alternator Streams and Alternator TTL for the Alternator tests. Refs #9467 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211012110312.719654-2-nyh@scylladb.com>	2021-10-15 16:36:35 +03:00
Nadav Har'El	77bd4afda7	test/alternator: avoid client-side validation Ever since we started testing Alternator with tests written in Python and using Amazon's "boto3" library, one limitation kept annoying us: Boto3 verifies the validity of the request parameters before passing them on to the server. It verifies that mandatory parameters are not missing, that parameters have the right types, and sometimes even the right ranges - all in the library before ever sending the request. This meant that in many cases, we couldn't get good test coverage for Alternator's server-side handling of wrong parameters. As it turns out, it is trivial to tell boto3 to not do its client-side request validation, with the `parameter_validation=False` config flag. We just never noticed that such a flag existed :-) So this patch adds this flag. It then fixes a few tests which expected ParameterValidationError - this error is the client-side validation failure, but should now be replaced by checking the server-side error. The patch also adds a couple of invalid parameter checks that we couldn't do before because of boto3's eagerness to check them on the client side. We can add a lot more of these error tests in the future, now that we got rid of client-side validation. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211005095514.537226-1-nyh@scylladb.com>	2021-10-05 13:26:51 +02:00
Nadav Har'El	6dee86eade	test/alternator: another test for adding a GSI to an existing table This patch adds yet another test for Alternator's unimplemented feature of adding a GSI to an already existing table (issue #5022), but this test is for a very specific corner case - tables which contain string attributes with an empty value - the corner case described in issue #9424: DynamoDB used to forbid any string attributes from being set to an empty string, but this changed in May 2020, and since then empty strings are allowed - but NOT as keys. So although it is legal to set a string attribute to an empty string, if this table has a GSI whose key is that specific attribute, the update command is refused. We already had a test for this - test_gsi_empty_value. However, the case in this patch is the case where a GSI is added to a table after the table already has data. In this case (as this test demonstrates), we are supposed to drop the items which have the empty string key from the GSI. Even when #5022 (the ability to add GSIs to existing tables) will be done, this test will continue to fail. The unique problem of this test is that Scylla's materialized views do allow empty strings as clustering keys (right now) and even partition keys (after #9375 will be solved), while we don't want them to enter the GSI. We will probably need to add to the view's filter, which right now contains (as required) "x IS NOT NULL" also the filter "x != ''" (when x's type is a string or binary) so that items with empty-string keys will be dropped. Refs #5022 Refs #9375 Refs #9424 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211003170636.477582-1-nyh@scylladb.com>	2021-10-05 13:26:43 +02:00
Nadav Har'El	b136104298	alternator/test: test for invalid numeric values DynamoDB has a rather baroque definition of numbers, and in particular it does not allow numeric attributes to be set to infinity or NaN. Although I did check invalid numbers in the past, manually, I was never able to write a unit test for this in the past - because the boto3 library catches such errors on the client side, and prevents the test from sending broken requests to the server. So in this patch, I finally came up with a solution - a context manager client_no_transform() which yields a client which does NOT do any transformation or validation on the request's parameters, allowing us to use boto3 to create improper requests - and test the server's handling of them. The test in this patch passes - it did not discover a new bug, but it is a useful regression test and the client_no_transform() trick can be used in more error-case tests which until now we were unable to write. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211004161809.520236-1-nyh@scylladb.com>	2021-10-05 13:13:45 +02:00
Nadav Har'El	414b672e22	test/alternator: verify that empty-string keys are NOT allowed Since May 2020 empty strings are allowed in DynamoDB as attribute values (see announcment in [1]). However, they are still not allowed as keys. We had tests that they are not allowed in keys of LSI or GSI, but missed tests that they are not allowed as keys (partition or sort key) of base tables. This patch add these missing tests. These tests pass - we already had code that checked for empty keys and generated an appropriate error. Note that for compatibility with DynamoDB, Alternator will forbid empty strings as keys even though Scylla does support this possibility (Scylla always supported empty strings as clustering key, and empty partition keys will become possible with issue #9352). [1] https://aws.amazon.com/about-aws/whats-new/2020/05/amazon-dynamodb-now-supports-empty-values-for-non-key-string-and-binary-attributes-in-dynamodb-tables/ Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211003122842.471001-1-nyh@scylladb.com>	2021-10-04 08:40:43 +02:00
Nadav Har'El	1edcc3a218	test/alternator: add test for reverse queries This patch adds a reproducer for issue #7586 - that Alternator queries (Query) operating in reverse order (ScanIndexForward = false) are artificially limited to 100 MB partitions because of their memory use. This test generates a partition over 100 MB in size and then tries various reverse queries on it - with or without Limit, starting at the end or the middle of the partition. The test currently fails when a reverse query refuses to operate on such a large partition - the log reports this: ERROR ... Memory usage of reversed read exceeds hard limit of 104857600 (configured via max_memory_for_unlimited_query_hard_limit), while reading partition K1H6ON3A1C With yet-uncommitted reverse-scan improvements, the test proceeds further, but still fails where we test that a reverse query with Limit not explicitly specified should still be limited to a certain size (e.g. 1MB) and cannot return the entire 100 MB partition in one response. Please note that this is not a comprehensive test for Scylla's reverse scan implementation: In particular we do not have separate tests for reverse scan's implementation on different sources - memtables, sstables, or the cache. Nor do we check all sorts of edge cases. We assume that Scylla's reverse scan implementation will have its own unit tests elsewhere that will check these things - and this test can focus on the Alternator use case. This test is marked "xfail" because it still fails on Alternator. It is marked "veryslow" because it's a (relatively) slow test, taking multiple seconds to set up the 100 MB partition. So run the test with the pytest options "--runxfail --runveryslow" to see how it fails. Refs #7586 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210930063700.407511-1-nyh@scylladb.com>	2021-09-30 09:34:39 +02:00
Nadav Har'El	5cbe9178fd	alternator: add missing BatchGetItem metric Unfortunately, defining metrics in Scylla requires some code duplication, with the metrics declared in one place but exported in a different place in the code. When we duplicated this code in Alternator, we accidentally dropped the first metric - for BatchGetItem. The metric was accounted in the code, but not exported to Prometheus. In addition to fixing the missing metric, this patch also adds a test that confirms that the BatchGetItem metric increases when the BatchGetItem operation is used. This test failed before this patch, and passes with it. The test only currently tests this for BatchGetItem (and BatchWriteItem) but it can be later expanded to cover all the other operations as well. Fixes #9406 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210929121611.373074-1-nyh@scylladb.com>	2021-09-29 14:16:54 +02:00
Nadav Har'El	88177d7be7	test/alternator: add test for too many items in BatchWriteItem DynamoDB limits the number of items that a BatchWriteItem call can write to 25. As noted in issue #5057, in Alternator we don't have this limit or any limit on the number of items in a BatchWriteItem - which probably isn't wise. This patch adds a simple xfailing test for this. Refs #5057 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210912140736.76995-1-nyh@scylladb.com>	2021-09-29 10:48:58 +02:00
Nadav Har'El	58078a3f84	test/alternator: rename utility function test_table_name() We have a utility function test_table_name() to create a unique name for a test table. The funny thing is, that because this function starts with the string "test_", pytest believes it's a test. This doesn't cause any problems (it's consider a passing test), but it's nevertheless strange to see it listed on the list of tests. So in this page, we trivially rename this function to unique_table_name(), a name why pytest doesn't think is the name of test. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-09-19 21:05:21 +03:00
Nadav Har'El	4ffd8c1f2b	alternator: stub TTL operations This patch adds stubs for the UpdateTimeToLive and DescribeTimeToLive operations to Alternator. These operations can enable, disable, or inquire about, the chosen expiration-time attribute. Currently, the information about the chosen attribute is only saved, with no actual expiration of any items taking place. Some of the tests for the TTL feature start to pass, so their xfail tag is removed. Because this this new feature is incomplete, it is not enabled unless the "alternator-ttl" experimental feature is enabled. Moreover, for these operations to be allowed, the entire cluster needs to support this experimental feature, because all nodes need to participate in the data expiration - if some old nodes don't support Alternator TTL, some of the data they hold won't get expired... So we don't allow enabling TTL until all the nodes in the cluster support this feature. The implementation is in a new source file, alternator/ttl.cc. This source file will continue to grow as we implement the expiration feature. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-09-19 21:05:21 +03:00
Nadav Har'El	82d2942ac8	test/alternator: test another corner case of TTL Usually the TTL feature's expiration-time attribute is a schema-less attribute, implemented in Alternator as a JSON-serialized item in a bigger map column. However, key attributes are a special case because they are implemented as separate columns. We already had test cases showing that this case works too - for the case of hash and range keys. In this test we test another possibility of an attribute that is implemented as a schema column - the case of an LSI key. As the other TTL tests, this test too passes on DynamoDB but xfails on Alternator because the TTL feature is not yet implemented. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-09-19 21:05:21 +03:00
Nadav Har'El	1d4474d543	test/alternator/run: don't run Scylla if "--aws" option The test/alternator/run script runs Scylla and then runs pytest against it. But when passing the "--aws" option, the intention is that these tests be run against AWS DynamoDB, not a local Scylla, so there is no point in starting Scylla at all - so this is what we do in this patch. This doesn't really add a new feature - "test/alternator/run --aws" will now be nothing more than "cd test/alternator; pytest --aws". But it adds the convenience that you can run the same tests on Scylla and AWS with exactly the same "run" command, just adding the "--aws" option, and don't need to sometimes use "run" and sometimes "pytest". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210912133239.75463-1-nyh@scylladb.com>	2021-09-12 16:50:38 +03:00
Nadav Har'El	b3f4a37a75	test/alternator: verify that nulls are valid inside string and bytes The tests in this patch verify that null characters are valid characters inside string and bytes (blob) attributes in Alternator. The tests verify this for both key attributes and non-key attributes (since those are serialized differently, it's important to check both cases). The tests pass on both DynamoDB and Alternator - confirming that we don't have a bug in this area. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210824163442.186881-1-nyh@scylladb.com>	2021-09-03 08:49:06 +02:00
Liu Lan	a5c54867f8	alternator: Exclusive start key must lie within the segment ...when using Segment/TotalSegment option. The requirement is not specified in DynamoDB documents, but found in DynamoDB Local: {"__type":"com.amazon.coral.validate#ValidationException", "message":"Exclusive start key must lie within the segment"} Fixes #9272 Signed-off-by: Liu Lan <liulan_yewu@cmss.chinamobile.com> Closes #9270	2021-09-01 11:05:45 +03:00
Nadav Har'El	cf06b7cd40	test/alternator: correct some typos in comments Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210729125317.1610573-1-nyh@scylladb.com>	2021-08-24 19:43:29 +03:00
Nadav Har'El	65381bd155	test/alternator: add tests for expression length limits The DynamoDB documentation https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html describes several hard limits on the size of the size of expressions (ProjectionExpression, ConditionExpression, UpdateExpression, FilterExpression) and various elements they contain. In this patch we begin testing those limits with a comprehensive test for the length of each of these four expressions: we test that lengths up to (and including) 4096 bytes are allowed but longer expressions are rejected. We also add TODOs for additional documented limits that should be tested in the future. Currently, this test passes on DynamoDB but xfails on Alternator because Alternator does not enforce any limits on the expression length. I don't think this is a real problem, and we may consider keeping it this way, but we should at least be aware that this difference exists and an xfailing test will remind us. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210810081948.2012120-2-nyh@scylladb.com>	2021-08-10 12:06:21 +02:00
Nadav Har'El	9d49a32486	test/alternator: add tests for attribute name limits DynamoDB limits attribute names in items to lengths of up 65535 bytes, but in some cases (such as key attributes) the limit is lower - 255. This patch adds tests for many of these cases. All the new tests pass on DynamoDB, but some still xfail on Alternator because Alternator is too lenient - sometimes allowing longer attribute names than DynamoDB allows. While this may sound great, it also has downsides: The oversized attribute names perform badly, and as they grow, Alternator's internal limits will be reached as well, and result in an unsightly "internal server error" being reported instead of the expected user-friendly error. Refs #9169. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210810081948.2012120-1-nyh@scylladb.com>	2021-08-10 12:06:13 +02:00
Nadav Har'El	dba184039a	test/alternator: another test for Query's ExclusiveStartKey We already have tests for Query's ExclusiveStartKey option, but we only exercised it as a way for paging linearly through all the results. Now we add a test that confirms that ExclusiveStartKey can be used not just for paging through all the result - but also for jumping directly to the middle of a partition after any clustering key (existing or non- existing clustering key). The new test also for the first time verifies that ExclusiveStartKey with a specific format works (previous tests just copied LastEvaluatedKey to ExclusiveStartKey, so any opaque cookie could have worked). The test passes on both DynamoDB and Alternator so it did not find a new bug. But it's useful to have as a regression test, in case in the future we want to improve paging performance (see #6278) - and need to keep in mind that ExclusiveStartKey is not just for paging. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210729114703.1609058-1-nyh@scylladb.com>	2021-08-04 15:24:47 +03:00
Nadav Har'El	2acfee8118	test/alternator: add tests for the Alternator TTL feature This patch adds a comprehensive test suite for the DynamoDB API's TTL (item expiration) feature. The tests check the two new API commands added by this feature (UpdateTimeToLive and DescribeTimeToLive), and also how items are expired in practice, and how item expiration interacts with other features such as GSI, LSI and DynamoDB Streams. Because DynamoDB has extremely long delays until items are expired, or until expiration configuration may be changed, several of these tests take up to 30 minutes to complete. We mark these tests with the "verylong" marker, so they are skipped in ordinary test runs - use the "--runverylong" option to run them. All these tests currently pass on DynamoDB, but xfail on Alternator because the two commands UpdateTimeToLive and DescribeTimeToLive are currently rejected by Alternator. Refs #5060 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-07-14 14:08:55 +03:00
Nadav Har'El	9d07ce3cb6	test/alternator: add marker for "veryslow" tests Until now, Alternator test have all been very fast, taking milliseconds or at worst seconds each - or a bit longer on DynamoDB. However, sometimes we need to write tests which take a huge amount of time - for example, tests for the TTL feature may take 10 minutes because the item expiration is delayed by that much. Because a 10 minute test is ridiculous (all 500 Alternator tests together take just one minute today!), we would normally run such test once, and then mark it "skip" so will never run again. One annoying thing about skipped tests is that there is no way to temporarily "unskip" them when we want to run such a test anyway. So in this patch, we introduce a better option for these very slow tests instead of the simple "skip": The patch introduces a marker "@pytest.mark.veryslow". By default, a test with this marker is skipped. However, an command-line option "--runveryslow" is introduced which causes tests with the veryslow mark to be run anyway, and not skipped. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-07-14 00:26:21 +03:00
Nadav Har'El	2fb379bb94	test/alternator: add new_test_table() utility function This patch adds a convenient function new_test_table() that Alternator tests can use to safely create a temporary table, and be sure it is deleted in any case. This function is used in a "with", as follows: with new_test_table(dynamodb, ...) as table: do_something(table) # at this point table has already been deleted. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-07-14 00:26:21 +03:00
Nadav Har'El	c174eeae06	alternator: do not allow LSI on base table with no sort key The purpose of an LSI (local secondary index) in Alternator is to allow a different sort key for the existing partitions, keeping the same division into partititions. So it doesn't make sense to create an LSI on a table that did not originally have a sort key (i.e., single-item partitions). DynamoDB indeed doesn't allow this case, and Alternator forgot to forbid it - so this patch adds the missing check to the CreateTable operation. This patch also adds a test case for this, test_lsi_wrong_no_sort_key, which failed before the patch and passes after it (and also passes on DynamoDB). Also, the existing test_lsi_wrong tests for bad LSI creation attempts by mistake used a base table without a sort key - so while they encountered an error as expected, it was not the right error! So we fix that test (and split it into two tests), adding the missing sort key and exposing the actual errors that the tests were meant to expose. That test passed before this patch and also afterwards - but at least after the patch it is actually testing what it was meant to be testing. Fixes #9018. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210713123747.1012954-1-nyh@scylladb.com>	2021-07-13 15:12:01 +02:00
Nadav Har'El	775a64b003	test/alternator: test for change in CDC preimage In pull request #8568, the CDC API changed slightly, with preimage data gaining extra "delete$k" values for columns whose preimage was missing. In this new test, we verify that this change did not break Alternator. We didn't expect it to break Alternator, because it just outputs the known base-table columns and ignores the columns which weren't a real base-table column - like this "delete$k". In the test we set up a stream with preimages, ensure that a real column (note that an LSI key is a real column instead of a map element) has a null preimage - and see that the preimage is returned as expected, without fake columns like "delete$k". The test passes, showing that PR #8568 was ok. The test also passes, as expected, on DynamoDB. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210504120121.915829-1-nyh@scylladb.com>	2021-07-06 14:53:42 +02:00
Nadav Har'El	b26fcf5567	test/alternator: increase timeouts in test_tracing.py The query tracing tests in test/alternator's test_tracing.py had one timeout of 30 seconds to find the trace, and one unclearly-coded timeout for finding the right content for the trace. We recently saw both timeouts exceeded in tests, but only rarely and only in debug mode, in a run 100 times slower than normal. This patch increases both timeouts to 100 seconds. Whatever happens then, we win: If the test stops failing, we know the new timeout was enough. If the test continues to fail, we will be able to conclude that we have a real bug - e.g., perhaps one of the LWT operations has a bug causing it to hang indefinitely. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210608205026.1600037-1-nyh@scylladb.com>	2021-06-10 09:19:01 +03:00
Piotr Sarna	cb17aa1e53	Merge 'test/alternator: rewrite run script to share code with cql-pytest's run script' from Nadav Har'El In this small series, I rewrite test/alternator/run to Python using the utility functions developed for test/cql-pytest. In the future, we should do the same to test/redis/run and test/scylla-gdb/run. The benefit of this rewrite is less code duplication (all run scripts start with the same duplicate code to deal with temporary directories, to run Scylla IP addresses, etc.), but most importantly - in the future fixes we do to cql-pytest (e.g., parameters needed to start Scylla efficiently, how to shut down Scylla, etc.) will appear automatically in alternator test without needing to remember to change both. Another benefit is that test/alternator/run will now be Python, not a shell script. This should make it easier to integrate it into test.py (refs #6212) in the future - if we want to. Closes #8792 * github.com:scylladb/scylla: test/alternator: rewrite test/alternator/run script in Python test/cql-pytest: make test run code more general	2021-06-06 19:18:49 +03:00
Avi Kivity	a55b434a2b	treewide: extent copyright statements to present day	2021-06-06 19:18:49 +03:00
Nadav Har'El	f22ed3ff5c	test/alternator: reduce very high timeout in one tracing test In test_tracing.py::test_slow_query_log, the was what looked like an innocent 30-second timeout, but this was in fact a 8 minute timeout - because it started with sleeping 1 second, then 2 seconds, then 3, ... until 30 seconds. Such a high timeout is frustrating when trying to debug failures in the test - which is only expected to take 2 seconds (and all of it because of an artificial timeout). So fix the loop to stop iterating after 60 seconds (a compromise between 30 seconds and 8 minutes...), sleeping a constant amount between iterations. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210601150631.1037158-1-nyh@scylladb.com>	2021-06-06 09:21:23 +03:00
Nadav Har'El	0bb2e010f5	test/alternator: rewrite test/alternator/run script in Python We already wrote the test/cql-pytest/run script in Python in a way it can be reusable for the other test//run scripts. So this patch replaces the test/alternator/run shell script with Python code which does the same thing (safely runs Scylla with Alternator and pytest on it in a temporary directory and IP address), but sharing most of the code that cql-pytest uses. The benefit of reusing the test/cql-pytest/run.py library goes beyond shorter code - the main benefit will be that we can't forget to fix one of the test//run scripts (e.g., add more command line options or fix a bug) when fixing another one. To make the test/cql-pytest/run.py library reusable for running Alternator, I needed to generalize a few things in this patch (e.g., the way we check and wait for Scylla to boot with the different APIs we intend to check). There is also one bug-fix on how interrupts are handled (they are now better guaranteed to kill pytest) - and now fixing this bug benefits all runners using run.py (cql-pytest/run, cql-pytest/run-cassandra and alternator/run). In the future, we can port the runners which are still duplicate shell scripts - test/redis/run and test/scylla-gdb/run - to Python in a similar manner to what we did here for test/alternator/run. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-06-03 11:23:00 +03:00
Nadav Har'El	a2379b96b1	alternator test: test for large BatchGetItem This patch adds an Alternator test, test_batch_get_item_large, which checks a BatchGetItem with a moderately large (1.5 MB) response. The test passes - we do not have a bug in BatchGetItem - but it does reproduce issue #8522 - the long response is stored in memory as one long contiguous string and causes a warning about an over-sized allocation: WARN ... seastar_memory - oversized allocation: 2281472 bytes. Incidentally, this test also reproduces a second contiguous allocation problem - issue #8183 (in BatchWriteItem which we use in this test to set up the item to read). Refs #8522 Refs #8183 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210520161619.110941-1-nyh@scylladb.com>	2021-05-21 08:38:53 +02:00
Nadav Har'El	7d2df8a9bc	test/alternator,cql-pytest: fix resource leak on failure In the alternator and cql-pytest test frameworks, we have some convenient contextmanager-based functions that allows us to create a temporary resource (e.g., a table) that will be automatically deleted, for example: with create_stream_test_table(...) as table: test_something(table) However, our implementation of these functions wasn't safe. We had code looking like: table = ... yield table table.delete() The thinking was that the cleanup part (the table.delete()) will be called after the user's code. However, if the user's code threw (i.e., a failed assertion), the cleanup wasn't called... When the user's code throws, it looks as if the "yield" throws. So the correct code should look like: table = ... try: yield table finally: table.delete() Python's contextmanager documentation indeed gives this idiom in its example. This patch fixes all contextmanager implementations in our tests to do the cleanup even if the user's "with" block throws. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210428083748.552203-1-nyh@scylladb.com>	2021-04-28 10:51:02 +02:00
Piotr Sarna	0779fa8428	test: add username verification to alternator tracing tests The test case now additionally checks if the username entry from found tracing events matches the username used by the test suite.	2021-04-26 11:54:02 +02:00
Nadav Har'El	50f3201ee2	alternator: fix inequality check of two sets In issue #5021 we noted that Alternator's equality operator needs to be fixed for the case of comparing two sets, because the equality check needs to take into account the possibility of different element order. Unfortunately, we fixed only the equality check operator, but forgot there is also an inequality operator! So in this patch we fix the inequality operator, and also add a test for it that was previously missing. The implementation of the inequality operator is trivial - it's just the negation of the equality test. Our pre-existing tests verify that this is the correct implementation (e.g., if attribute x doesn't exist, then "x = 3" is false but "x <> 3" is true). Refs #5021 Fixes #8513 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210419141450.464968-1-nyh@scylladb.com>	2021-04-20 13:14:19 +02:00
Nadav Har'El	dae7528fe5	alternator: fix equality check of nested document containing a set In issue #5021 we noticed that the equality check in Alternator's condition expressions needs to handle sets differently - we need to compare the set's elements ignoring their order. But the implementation we added to fix that issue was only correct when the entire attribute was a set... In the general case, an attribute can be a nested document, with only some inner set. The equality-checking function needs to tranverse this nested document, and compare the sets inside it as appropriate. This is what we do in this patch. This patch also adds a new test comparing equality of a nested document with some inner sets. This test passes on DynamoDB, failed on Alternator before this patch, and passes with this patch. Refs #5021 Fixes #8514 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210419184840.471858-1-nyh@scylladb.com>	2021-04-20 13:14:10 +02:00
Nadav Har'El	46448b0983	alternator: fix equality check of two unset attributes When a condition expression (ConditionExpression, FilterExpression, etc.) checks for equality of two item attributes, i.e., "x = y", and when one of these attributes was missing we correctly returned false. However, we also need to return false when both attributes are missing in the item, because this is what DynamoDB does in this case. In other words an unset attribute is never equal to anything - not even to another unset attribute. This was not happening before this patch: When x and y were both missing attributes, Alternator incorrectly returned true for "x = y", and this patch fixes this case. It also fixes "x <> y" which should to be true when both x and y are unset (but was false before this patch). The other comparison operators - <, <=, >, >=, BETWEEN, were all implemented correctly even before this patch. This patch also includes tests for all the two-unset-attribute cases of all the operators listed above. As usual, we check that these tests pass on both DynamoDB and Alternator to confirm our new behavior is the correct one - before this patch, two of the new tests failed on Alternator and passed on DynamoDB. Fixes #8511 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210419123911.462579-1-nyh@scylladb.com>	2021-04-20 13:14:00 +02:00
Nadav Har'El	989589b570	test/cql-pytest,alternator,redis: avoid an annoying warning This patch avoids an annoying warning Warning: Unknown config ini key: flake8-ignore when running one of the pytest-based test projects (cql-pytest, alternator and redis) on recent versions of pytest. In commit `2022da2405`, we added to the toplevel Scylla directory a "tox.ini" file with some intention to configure Python syntax checking. One of the configurations in this tox.ini is: [pytest] flake8-ignore = E501 It turns out that pytest, if a certain test directory does not have its own pytest.ini file, looks up in ancestor directory for various configuration files (the configuration file precedence is described in https://docs.pytest.org/en/stable/customize.html), and this includes this tox.ini configuration section. Recent versions of pytest complain about the "flake8-ignore" configuration parameter, which they don't recognize. This parameter may be ok (?) if you install a flake8 pytest plugin, but we do not require users to do this for running these tests. Moreover, whatever noble intentions this commit and its tox.ini had, nobody ever followed up on it. The three pytest-based test directories never adhered to flake8's recommended syntax, and never intended to do so. None of the developers of these tests use flake8, or seem to wish to do so. If this ever changes, we can change the pytest.ini or undo this commit and go back to a top-level tox.ini, but I don't see this happening anytime soon. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210411085708.300851-1-nyh@scylladb.com>	2021-04-12 08:04:06 +02:00
Nadav Har'El	0d0db05cf3	test/alternator: speed up two slow xfailing tests By far the two slowest Alternator tests when running a development build on my laptop are test_gsi.py::test_gsi_projection_include and test_gsi.py::test_gsi_projection_keys_only Each of those takes around 3.2, and the sum of just these two tests is as much as 10% (!) of all other 600 tests. The reason why these tests are slow is that they check scanning a GSI with projection. Scylla currently ignores the projection, so the scan returns the wrong value. Because this is a GSI, which supports only eventually- consistent reads, we need to retry the read - and did it for up to 3 seconds! But this retry only makes sense if the GSI read did not yet return the expected data. But in these xfailing test, we read a wrong item (with too many attributes) almost immediately, and this should indicate an immediate failure - no amount of retry would help. So in this patch we detect this case and fail the test immediately instead of wasting 3 seconds in retries. On my laptop with dev build, this patch reduces the time to run the entire Alternator test suite from 70 seconds to 63 seconds. Also, now that we never just waste time until the timeout, we can increase it to any number, and in this patch we increase it from 3 seconds to 5. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210317183918.1775383-1-nyh@scylladb.com>	2021-04-06 14:49:15 +02:00
Nadav Har'El	15cab90f7b	test/alternator: switch some fixture scopes from "session" to "module" In conftest.py we have several fixtures creating shared tables which many test files can share, so they are marked with the "session" scope - all the tests in the testing session may share the same instance. This is fine. Some of test files have additional fixtures for creating special tables needed only in those files. Those were also, unnecessarily, marked "session" scope as well. This means that these temporary tables are only deleted at the very end of test suite, event though they can be deleted at the end of the test file which needed them. This is exactly what the "module" fixture scope is, so this patch changes all the fixtures private to one test file to be "module". After this patch, the teardown of the last test in the suite goes down from 4 seconds to just 1.5 seconds (it's still long because there are still plenty of session-scoped fixtures in conftest.py). Another small benefit is that the peak disk usage of the test suite is lower, because some of the temporary tables are deleted sooner. This patch does not change any test functionality, and also does not make any test faster - it just changes the order of the fixture teardowns. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210317175036.1773774-1-nyh@scylladb.com>	2021-04-06 14:43:36 +02:00
Avi Kivity	8785dd62cb	tests: use kernel page cache Tests are short-lived and use a small amount of data. They are also often run repeatly, and the data is deleted immediately after the test. This is a good scenario for using the kernel page cache, as it can cache read-only data from test to test, and avoid spilling write data to disk if it is deleted quickly. Acknowledge this by using the new --kernel-page-cache option for tests. This is expected to help on large machines, where the disk can be overloaded. Smaller machines with NVMe disks probably will not see a difference. Closes #8347	2021-03-30 12:04:55 +02:00
Nadav Har'El	0b2cf21932	alternator-test: increase read timeout and avoid retries By default the boto3 library waits up to 60 second for a response, and if got no response, it sends the same request again, multiple times. We already noticed in the past that it retries too many times thus slowing down failures, so in our test configuration lowered the number of retries to 3, but the setting of 60-second-timeout plus 3 retries still causes two problems: 1. When the test machine and the build are extremely slow, and the operation is long (usually, CreateTable or DeleteTable involving multiple views), the 60 second timeout might not be enough. 2. If the timeout is reached, boto3 silently retries the same operation. This retry may fail because the previous one really succeeded at least partially! The symptom is tests which report an error when creating a table which already exists, or deleting a table which dooesn't exist. The solution in this patch is first of all to never do retries - if a query fails on internal server error, or times out, just report this failure immediately. We don't expect to see transient errors during local tests, so this is exactly the right behavior. The second thing we do is to increase the default timeout. If 1 minute was not enough, let's raise it to 5 minutes. 5 minutes should be enough for every operation (famous last words...). Even if 5 minutes is not enough for something, at least we'll now see the timeout errors instead of some wierd errors caused by retrying an operation which was already almost done. Fixes #8135 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210222125630.1325011-1-nyh@scylladb.com>	2021-03-18 18:58:08 +02:00
Nadav Har'El	4a7d3175e9	test/alternator: make another test faster The slowest test in test_streams.py is test_list_streams_paged. It is meant to test the ListStreams operation with paging. The existing test repeated its test four times, for four different stream types. However, there is no reason to suspect that the ListStreams operation might somehow be different for the four stream types... We already have other tests which create streams of the four types, and uses these streams - we don't need the test for ListStreams to also test creating the four types. By doing this test just once, not four times, we can save around 1.5 seconds of test time. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210318073755.1784349-1-nyh@scylladb.com>	2021-03-18 11:24:18 +01:00
Nadav Har'El	79af728335	test/alternator: make tracing test a bit faster In the test test_tracing.py::test_tracing_all, we do some operations and then need to wait until they appear in the tracing table. The current code used an exponentially-increasing delay during this wait, starting with 0.1 seconds and then doubling the delay until we find what we're looking for. However, it turns out that the delay until the data appears in the table is deliberately chosen by Scylla - and is always around 2 seconds. In this case, an exponential delay is really bad - we will usually wait for around 1 seconds too long after the needed wait of 2 seconds. So in this patch we replace the exponential delay by a constant delay - we wait 0.3 seconds between each retry. This change makes the test test_tracing.py::test_tracing_all finish in a little over 2 seconds, instead of a little over 3 seconds before this patch. We cannot reduce this 2 second time any further unless we make the 2-second tracing delay configurable. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210318000040.1782933-1-nyh@scylladb.com>	2021-03-18 11:24:18 +01:00
Nadav Har'El	4e87f95b42	test/alternator: remove slow and unhelpful test The test test_table.py::test_table_streams_on creates tables with various stream types, and then immediately deletes them without testing anything. This is a slow test (taking almost a full second on my laptop), and is redundant because in test_streams.py we have tests which create tables with streams in the same way - but then actually test that things work with these streams. So this test might as well be removed, and this is what we do in this patch. Removing this test shaves another second from the Alternator test suite's run time. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210317230530.1780849-1-nyh@scylladb.com>	2021-03-18 11:24:18 +01:00
Nadav Har'El	879656e3e0	test/alternator: make a test faster, safer and more correct The test test_condition_expression.py::test_condition_expression_with_forbidden_rmw takes half a second to run (dev build, on my laptop), one of the slowest tests in Alternator's test suite. Part of the reason was that it needlessly set the same table to forbidden_rmw, multiple times. Instead of doing that, we switch to using the test_table_s_forbid_rmw fixture, which is a table like test_table_s but created just once in forbid_rmw mode. The result is a faster test (0.05 seconds instead of 0.5 seconds), but also safer if we ever want to run tests in parallel. It also fixes a bug in the test: At the end of the test, we intended to double-check that although the forbid_rmw table forbids read-modify-write operations, it does allow pure writes. Yet the test did this after clearing the forbid_rmw mode... So after this patch the test verifies this on the forbid_rmw table, as intended. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210317222703.1779992-1-nyh@scylladb.com>	2021-03-18 11:24:18 +01:00

1 2 3 4

173 Commits