This patch increases compatibility with DynamoDB Streams by integrating DynamoDB's event type rules (described in https://github.com/scylladb/scylladb/issues/6918) into Alternator. The main changes are:
- introduce a new flag `alternator_streams_strict_compatibility`, meant as a guard for performance-intensive operations that increase compatibility with DynamoDB Streams. If enabled, Alternator always performs a read-before-write (RBW) before a data-modifying operation, and propagates its result to CDC. Then, the old item is compared to the new one to determine the mutation type (INSERT vs MODIFY). This option is a no-op for tables with disabled Alternator Streams,
- reduce splitting of simple Alternator mutations,
- correctly distinguish event types described in #6918, except for item deletes. Deleting a missing item with DeleteItem or BatchWriteItem, or a missing field with UpdateItem, still emits REMOVEs.
To summarize, the emitted events of the data manipulation operations should be as follows:
- DeleteItem/BatchWriteItem.DeleteItem of existing item: REMOVE (OK)
- DeleteItem of nonexistent item: nothing (OK)
- BatchWriteItem.DeleteItem of nonexistent item: nothing (OK)
- PutItem/UpdateItem/BatchWriteItem.PutItem of existing and not equal item: MODIFY (OK)
- PutItem/UpdateItem/BatchWriteItem.PutItem of existing and equal item: nothing (OK)
- PutItem/UpdateItem/BatchWriteItem.PutItem of nonexistent item: INSERT (OK)
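The rules in the list above can be read as a small decision function. The following is only an illustrative sketch: `classify_event` and its arguments are hypothetical names, items are modeled as plain dicts, and it does not reflect how Alternator's C++ code is actually structured.

```python
from typing import Optional


def classify_event(old_item: Optional[dict], new_item: Optional[dict]) -> Optional[str]:
    """Return the expected stream event type, or None if no event is expected.

    old_item: the item as read before the write (None if it did not exist).
    new_item: the item after the write (None if the write deleted it).
    Both arguments are hypothetical; they only model the rules listed above.
    """
    if new_item is None:
        # A delete: only deleting an existing item produces an event.
        return "REMOVE" if old_item is not None else None
    if old_item is None:
        # A put/update that created the item.
        return "INSERT"
    if old_item == new_item:
        # The write did not change anything.
        return None
    return "MODIFY"
```

For example, `classify_event({"p": 1, "x": 1}, {"p": 1, "x": 2})` yields `"MODIFY"`, while writing an identical item yields `None`.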
No backport is necessary.
Refs https://github.com/scylladb/scylladb/pull/26149
Refs https://github.com/scylladb/scylladb/pull/26396
Refs https://github.com/scylladb/scylladb/issues/26382
Fixes https://github.com/scylladb/scylladb/issues/6918
Closes scylladb/scylladb#26121
* github.com:scylladb/scylladb:
test/alternator: Enable the tests failing because of #6918
alternator, cdc: Don't emit events for no-op removes
alternator, cdc: Don't emit an event for equal items
alternator/streams, cdc: Differentiate item replace and item update in CDC
alternator: Change the return type of rmw_operation_return
config: Add alternator_streams_strict_compatibility flag
cdc: Don't split a row marker away from row cells
The tag was recently renamed from `experimental:initial_tablets` to
`system::initial_tablets`. This commit fixes both the tests and
the exceptions sent to the user instructing how to create a table with
vnodes.
The tests pass only with alternator_streams_strict_compatibility flag
enabled, because of a suspected non-negligible performance impact (i.e.
an additional entire-item comparison and type conversions).
Refs https://github.com/scylladb/scylladb/issues/6918
Until this patch, CDC didn't fetch a preimage for mutations
containing only a partition tombstone. Therefore, single-row deletions
in a table without a clustering key didn't include a preimage, which was
inconsistent with single-row clustered deletions. This commit addresses
this inconsistency.
A second reason is compatibility with DynamoDB Streams, which doesn't
support entire-partition deletes. Alternator uses partition tombstones
for single-row deletions, though, and in these cases the 'OldImage' was
missing from REMOVE records.
Fixes https://github.com/scylladb/scylladb/issues/26382
Closes scylladb/scylladb#26578
This commit adds tests to `test_streams.py` (i.e. Alternator Streams)
checking the following cases:
* putting an item with BatchWriteItem shouldn't emit a log if the old
item and the new item are identical,
* deleting an item with BatchWriteItem shouldn't emit a log if the item
doesn't exist,
* UpdateItem shouldn't emit a log if the old item and the new item are
identical.
These cases haven't been tested until this commit.
Refs https://github.com/scylladb/scylladb/issues/6918
Closes scylladb/scylladb#26396
This patch makes three small mostly-cosmetic improvements to a test in
test/alternator/test_streams.py:
1. The test is renamed "test_streams_deleteitem_old_image_no_ck" to
emphasize its focus on the combination of deleteitem, old image,
and no ck. The "putitem" we had in the name was not relevant, and
the "old_image" was missing and important.
2. Moreover, using PutItem in this test just to set up the test scenario
mixed the bug which the test tries to reproduce with a different
only-recently-fixed bug (that PutItem also generated a spurious
"REMOVE" event). So I changed the use of PutItem by using UpdateItem,
to make this test independent of the other bug. Test independence is
important because it allows us - if we want - to backport a fix for
just one bug independently of the fix to the other bug.
3. Also improved the comment in front of the test to mention where we
already tested the with-ck case, and also to mention issue 26382
which this test reproduces (the xfail line also mentions it, but
the xfail line will be removed when the bug is fixed - but the
mention in the comment will remain - and should remain).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#26526
Until now, every PutItem operation appeared in the Alternator Streams as
two events - a REMOVE and a MODIFY. DynamoDB Streams emits only INSERT
or MODIFY, depending on whether a row was replaced, or created anew. A
related issue scylladb#6918 concerns distinguishing the mutation type properly.
This was because each call to PutItem emitted the two CDC rows, returned
by GetRecords. Since this patch, we use a collection tombstone for the
`:attrs` column, and a separate tombstone for each regular column in the
table's schema. We don't expect that new tables would have any other
regular column, except for the `:attrs` and keys, but we may encounter
them in upgraded tables which had old GSIs or LSIs.
Fixes: scylladb#6930.
Closes scylladb/scylladb#24991
Update all the references about the issue of tablets support for
alternator streams to issue #23838 instead of #16317.
The issue #16317 is about support of CDC with tablets, but it is now
closed and it didn't address alternator streams. The remaining issues
about alternator streams should be addressed as part of #23838, so fix
the references in order for them not to be missed.
This commit adds missing fields to GetRecords responses: `awsRegion` and
`eventVersion`. We also considered changing `eventSource` from
`scylladb:alternator` to `aws:dynamodb` and setting `SizeBytes` subfield
inside the `dynamodb` field.
We set `awsRegion` to the datacenter's name of the node that received
the request. This is in line with the AWS documentation, except that
Scylla has no direct equivalent of a region, so we use the datacenter's
name, which is analogous to DynamoDB's concept of region.
The field `eventVersion` determines the structure of a Record. It is
updated whenever the structure changes. We think that adding a field
`userIdentity` bumped the version from `1.0` to `1.1`. Currently, Scylla
doesn't support this field (#11523), hence we use the older 1.0 version.
We have decided to leave `eventSource` as is, since it's easy to modify
it in case of problems to `aws:dynamodb` used by DynamoDB.
Not setting `SizeBytes` subfield inside the `dynamodb` field was
dictated by the lack of apparent use cases. The documentation is unclear
about how `SizeBytes` is calculated and after experimenting a little
bit, I haven't found an obvious pattern.
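To make the resulting record shape concrete, here is a minimal sketch of the metadata described above. The helper `build_record` and its parameters are hypothetical and only illustrate the field values this commit message describes; they are not Alternator's implementation.

```python
def build_record(event_name: str, keys: dict, sequence_number: str, datacenter: str) -> dict:
    """Build a GetRecords-style record with the metadata fields discussed above.

    All parameter names are illustrative assumptions.
    """
    return {
        "eventName": event_name,
        # The older record structure version, without `userIdentity`.
        "eventVersion": "1.0",
        # Left as is; DynamoDB itself uses "aws:dynamodb".
        "eventSource": "scylladb:alternator",
        # The datacenter name stands in for DynamoDB's region.
        "awsRegion": datacenter,
        # Note: no `SizeBytes` subfield is set.
        "dynamodb": {
            "Keys": keys,
            "SequenceNumber": sequence_number,
        },
    }
```

A consumer that relies on `awsRegion` would therefore see the datacenter name of the node that served the request.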
Fixes: #6931
Closes scylladb/scylladb#24903
Currently, in Alternator it is possible to create a table whose name has
222 characters, and then trying to add Streams to that table results in
an attempt to create a CDC log table with the same name plus a
15-character suffix "_scylla_cdc_log", which resulted (Ref #24598) in
an IO-error and a Scylla shutdown.
This patch adds code to the Stream-adding operations (both CreateTable
and UpdateTable) that validates that the table's name, plus that 15
character suffix, doesn't exceed max_auxiliary_table_name_length, i.e.,
222.
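The length check can be sketched as follows. This is an illustration in Python rather than the actual C++ code; the function name `validate_stream_table_name` and the use of `ValueError` are assumptions, while the suffix and the limit of 222 come from this message.

```python
CDC_LOG_SUFFIX = "_scylla_cdc_log"  # 15 characters
MAX_AUXILIARY_TABLE_NAME_LENGTH = 222


def validate_stream_table_name(table_name: str) -> None:
    """Raise ValueError if the CDC log table name would be too long.

    Hypothetical helper: the real check lives in the CreateTable and
    UpdateTable code paths.
    """
    max_len = MAX_AUXILIARY_TABLE_NAME_LENGTH - len(CDC_LOG_SUFFIX)  # 207
    if len(table_name) > max_len:
        raise ValueError(
            f"Streams cannot be added if the table name is longer than {max_len} characters."
        )
```

With these constants, a 207-character name is accepted and a 208-character name is rejected.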
After this patch, if you have a table whose name is between 207 and 222
characters, attempting to enable Streams on it will fail with:
"Streams cannot be added if the table name is longer than 207 characters."
Note that in the future, if we lower max_table_name_length to below 207,
e.g., to 192, then it will always be possible to add a stream to any
legal table, and the new checks we had here will be mostly redundant.
But only "mostly" - not entirely: Checking in UpdateTable is still
important because of the possibility that an upgrading user might have
a pre-existing table whose name is longer than the new limit, and might
try to enable Streams.
After this patch, the crash reported in #24598 can no longer happen, so
in this sense the bug is solved. However, we still want to lower
max_table_name_length from 222 to 192, so that it will always be
possible to enable streams on any table with a legal name length.
We'll do this in the next patch.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The two tests in this patch reproduce issue #24598: When enabling
Alternator streams on an Alternator table with a very long name,
such as the maximum allowed name length 222, the result is an
I/O error and a Scylla shutdown.
The two tests are currently marked "skip", otherwise they would
crash the Scylla being tested.
Refs #24598
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Alternator Streams' "GetRecords" operation has a "Limit" parameter on
how many records to return. The DynamoDB documentation says that the
upper limit on this Limit parameter is 1000 - but Alternator didn't
enforce this. In this patch we begin enforcing this highest Limit, and
also add a test for verifying this enforcement. As usual, the new test
passes on DynamoDB, and after this patch - also on Alternator.
The reason why it's useful to have *some* upper limit on Limit is that
the existing executor::get_records() implementation does not really have
preemption points in all the necessary places. In particular, we have a
loop on all returned records without preemption points. We also store
the returned records in a RapidJson vector, which requires a contiguous
allocation.
Even before this patch, GetRecords had a hard limit of 1 MB of results.
But still, in some cases 1 MB of results may be a lot of results, and we
can see stalls in the aforementioned places being O(number of results).
Fixes #23534
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#23547
In commit c24bc3b we decided that creating a new table in Alternator
will by default use vnodes - not tablets - because of all the missing
features in our tablets implementation that are important for
Alternator, namely - LWT, CDC and Alternator TTL.
We never documented this, or the fact that we support a tag
`experimental:initial_tablets` which allows overriding this decision
and creating an Alternator table using tablets. We also never documented
what exactly doesn't work when Alternator uses tablets.
This patch adds the missing documentation in docs/alternator/new-apis.md
(which is a good place for describing the `experimental:initial_tablets`
tag). The patch also adds a new test file, test_tablets.py, which
includes tests for all the statements made in the document regarding
how `experimental:initial_tablets` works and what works or doesn't
work when tablets are enabled.
Two existing tests - for TTL and Streams non-support with tablets -
are moved to the new test file.
When the tablets feature will finally be completed, both the document
and the tests will need to be modified (some of the tests should be
outright deleted). But it seems this will not happen for at least
several months, and that is too long to wait without accurate
documentation.
Fixes #21629
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#22462
Table updates that try to enable stream (while changing or not the
StreamViewType) on a table that already has the stream enabled
will result in ValidationError.
Table updates that try to disable stream on a table that does not
have the stream enabled will result in ValidationError.
Add two tests to verify the above.
Mark the test for changing the existing stream's StreamViewType
not to xfail.
Fixes scylladb/scylladb#6939
Closes scylladb/scylladb#22827
The test test_streams.py::test_stream_list_tables reproduces a bug where
enabling streams added a spurious result to ListTables. A reviewer of
that patch asked to also add a check that name of the table itself
doesn't disappear from ListTables when a stream is enabled, so this is
what this patch adds.
This theoretical scenario (a table's name disappearing from ListTables)
never happened, so the new check doesn't reproduce any known bug, but
I guess it never hurts to make the test stronger for regression testing.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#19934
The Alternator command ListTables is supposed to list actual tables
created with CreateTable, and should not list things like materialized
views (created for GSI or LSI) or CDC log tables.
We already properly excluded materialized views from the list - and
had the tests to prove it - but forgot both the exclusion and the testing
for CDC log tables - so creating a table xyz with streams enabled would
cause ListTables to also list "xyz_scylla_cdc_log".
This patch fixes both oversights: It adds the code to exclude CDC logs
from the output of ListTables, and adds a test which reproduces the bug
before this fix, and verifies the fix works.
Fixes #19911.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#19914
Alternator Streams doesn't yet work on tables using tablets (this is
issue #16317). Before this patch, an attempt to enable it results in
an unsightly InternalServerError, which isn't terrible - but we can
do better.
So in this patch, we make the attempt to enable Streams and tablets
together into a clear error. The error message points to the open issue,
and also suggests how to create a table that uses vnodes, not tablets.
Unfortunately, there are two slightly different code paths and error
messages for two cases: One case is the creation of a new table (where
the validation happens before the keyspace is actually created), and
the other case is an attempt to enable streams on an existing table
with an existing keyspace (which already might or might not be using
tablets).
This patch also adds a test that verifies that trying to enable Streams
with tablets is an error - in both cases (table creation and update).
Obviously, this test - and the validation code - should be removed once
the issue is solved and Alternator Streams begins working with tablets.
Fixes #16497
Refs #16807
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#17311
If an Alternator table uses tablets (we'll turn this on in a following
patch), some tests are known to fail because of features not yet
supported with tablets, namely:
Refs #16317 - Support Alternator Streams with tablets (CDC)
Refs #16567 - Support Alternator TTL with tablets
This patch changes all tests failing on tablets due to one of these two
known issues to explicitly ask to disable tablets when creating their
test table. This means that at least we continue to test these two
features (Streams and TTL) even if they don't yet work with tablets.
We'll need to remember to remove this override when tablet support
for CDC and Alternator TTL arrives. I left a comment in the right
places in the code with the relevant issue numbers, to remind us what
to change when we fix those issues.
This patch also adds xfail_tablets and skip_tablets fixtures that can
be used to xfail or skip tests when running with tablets - but we
don't use them yet - and may never use them, but since I already wrote
this code it won't hurt having it, just in case. When running without
tablets, or against an older Scylla or on DynamoDB, the tests with
these marks are run normally.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
In issue #12601, a dtest involving paging of ListStreams showed
incorrect results - the paged results had one duplicate stream and one
missing stream. We believe that the cause of this bug was that the
unsorted map of tables can change order between pages. In this patch
we add a test test_list_streams_paged_with_new_table which can
demonstrate this bug - by adding a lot of tables in mid-paging, we
cause the unsorted map to be reshuffled and the paging to break.
This is not the same situation as in #12601 (which did not involve
new tables) but we believe it demonstrates the same bug - and check
its fix. Indeed this passes with the fix in pull request #12614 and
fails without it.
This patch also adds a second test, test_stream_arn_unchanging:
That test eliminates a guess we had for the cause of #12601. We
thought that maybe the stream ARN changes on a table if its schema
version changes, but the new test confirms that it actually behaves
as expected (the stream ARN doesn't change).
Refs #12601
Refs #12614
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #12616
DynamoDB Streams limits the "Limit" parameter of ListStreams to 100 -
anything larger will result in an error. Scylla doesn't necessarily
need to uphold the same limit, but we should uphold *some* limit, as
not having any limit can result (in the theoretical case of a huge
number of tables with streams enabled) in an unbounded response size.
So here we add a test to check that a Limit of 100,000 is not allowed.
It passes on DynamoDB (in fact, any number higher than 100 will be
enough there) but fails on Alternator, so is marked "xfail".
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
We had a skipped test on how Alternator handles Limit=0 for ListStreams
which should be reported as an error. We had to skip it because boto3
did us a "favor" of discovering this parameter error before ever sending
it to the server. We discovered long ago how to avoid this client-side
checking in boto3, but only used it for the "dynamodb" fixture and
forgot to copy the same trick to the "dynamodbstreams" fixture - and
in this patch we do, and can run this test successfully.
While at it, also copy the extended timeout configuration we had in
the dynamodb fixture also to the dynamodbstreams fixture. There is
no reason why it should be different.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This reverts commit 8e892426e2 and fixes
the code in a different way:
That commit moved the scylla_inject_error function from
test/alternator/util.py to test/cql-pytest/util.py and renamed
test/alternator/util.py. I found the rename confusing and unnecessary.
Moreover, the moved function isn't even usable today by the test suite
that includes it, cql-pytest, because it lacks the "rest_api" fixture :-)
so test/cql-pytest/util.py wasn't the right place for it anyway.
test/rest_api/rest_util.py could have been a good place for this function,
but there is another complication: Although the Alternator and rest_api
tests both had a "rest_api" fixture, it has a different type, which led
to the code in rest_api which used the moved function to have to jump
through hoops to call it instead of just passing "rest_api".
I think the best solution is to revert the above commit, and duplicate
the short scylla_inject_error() function. The duplication isn't an
exact copy - the test/rest_api/rest_util.py version now accepts the
"rest_api" fixture instead of the URL that the Alternator version used.
In the future we can remove some of this duplication by having some
shared "library" code but we should do it carefully and starting with
agreeing on the basic fixtures like "rest_api" and "cql", without that
it's not useful to share small functions that operate on them.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #11275
Move scylla_inject_error from alternator/ to cql-pytest/ so it
can be reached from various tests dirs. alternator/util.py is
renamed to alternator/alternator_util.py to avoid name shadowing.
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes were applied mechanically with a script, except to
licenses/README.md.
Closes #9937
We have a utility function test_table_name() to create a unique name for
a test table. The funny thing is, that because this function starts with
the string "test_", pytest believes it's a test. This doesn't cause any
problems (it's considered a *passing* test), but it's nevertheless strange
to see it listed on the list of tests.
So in this patch, we trivially rename this function to unique_table_name(),
a name which pytest doesn't think is the name of a test.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
In pull request #8568, the CDC API changed slightly, with preimage data
gaining extra "delete$k" values for columns whose preimage was missing.
In this new test, we verify that this change did not break Alternator.
We didn't expect it to break Alternator, because it just outputs the known
base-table columns and ignores the columns which weren't a real base-table
column - like this "delete$k".
In the test we set up a stream with preimages, ensure that a real column
(note that an LSI key is a real column instead of a map element) has a
null preimage - and see that the preimage is returned as expected,
without fake columns like "delete$k".
The test passes, showing that PR #8568 was ok.
The test also passes, as expected, on DynamoDB.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210504120121.915829-1-nyh@scylladb.com>
In the alternator and cql-pytest test frameworks, we have some convenient
contextmanager-based functions that allow us to create a temporary
resource (e.g., a table) that will be automatically deleted, for
example:
    with create_stream_test_table(...) as table:
        test_something(table)
However, our implementation of these functions wasn't safe. We had
code looking like:
    table = ...
    yield table
    table.delete()
The thinking was that the cleanup part (the table.delete()) will be
called after the user's code. However, if the user's code threw
(i.e., a failed assertion), the cleanup wasn't called... When the user's
code throws, it looks as if the "yield" throws. So the correct code
should look like:
    table = ...
    try:
        yield table
    finally:
        table.delete()
Python's contextmanager documentation indeed gives this idiom in its
example.
This patch fixes all contextmanager implementations in our tests to do
the cleanup even if the user's "with" block throws.
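The difference between the two idioms can be demonstrated without any database. In the sketch below, `FakeTable`, `unsafe_table`, `safe_table` and `run_failing_block` are hypothetical stand-ins invented for illustration; only `contextlib.contextmanager` is real.

```python
from contextlib import contextmanager


class FakeTable:
    """Stand-in for a real table object; records whether it was deleted."""

    def __init__(self):
        self.deleted = False

    def delete(self):
        self.deleted = True


@contextmanager
def unsafe_table(table):
    yield table
    table.delete()  # Never reached if the "with" block raises.


@contextmanager
def safe_table(table):
    try:
        yield table
    finally:
        table.delete()  # Runs whether or not the "with" block raises.


def run_failing_block(manager, table):
    """Run a failing "with" block and report whether cleanup happened."""
    try:
        with manager(table):
            raise AssertionError("simulated test failure")
    except AssertionError:
        pass
    return table.deleted
```

Here `run_failing_block(unsafe_table, FakeTable())` reports that the table was left behind, while the `safe_table` variant reports that it was deleted.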
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210428083748.552203-1-nyh@scylladb.com>
In conftest.py we have several fixtures creating shared tables which many
test files can share, so they are marked with the "session" scope - all
the tests in the testing session may share the same instance. This is fine.
Some of test files have additional fixtures for creating special tables
needed only in those files. Those were also, unnecessarily, marked
"session" scope as well. This means that these temporary tables are
only deleted at the very end of test suite, even though they can be
deleted at the end of the test file which needed them. This is exactly
what the "module" fixture scope is, so this patch changes all the
fixtures private to one test file to be "module".
After this patch, the teardown of the last test in the suite goes down
from 4 seconds to just 1.5 seconds (it's still long because there are
still plenty of session-scoped fixtures in conftest.py).
Another small benefit is that the peak disk usage of the test suite is
lower, because some of the temporary tables are deleted sooner.
This patch does not change any test functionality, and also does not
make any test faster - it just changes the order of the fixture
teardowns.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210317175036.1773774-1-nyh@scylladb.com>
The slowest test in test_streams.py is test_list_streams_paged. It is meant
to test the ListStreams operation with paging. The existing test repeated
its test four times, for four different stream types. However, there is
no reason to suspect that the ListStreams operation might somehow be
different for the four stream types... We already have other tests which
create streams of the four types, and use these streams - we don't
need the test for ListStreams to also test creating the four types.
By doing this test just once, not four times, we can save around 1.5
seconds of test time.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210318073755.1784349-1-nyh@scylladb.com>
In test_streams.py we had some code to get a list of shards and iterators
duplicated three times. Put it in a function, shards_and_latest_iterators(),
to reduce this duplication.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201006112421.426096-1-nyh@scylladb.com>
Three tests in test_streams.py run update_table() on a table without
waiting for it to complete, and then call update_table() on the same
table or delete it. This always works in Scylla, and usually works in
AWS, but if we reach the second call, it may fail because the previous
update_table() did not take effect yet. We sometimes see these failures
when running the Alternator test suite against AWS.
So in this patch, after each update_table() we wait for the table
to return from UPDATING to ACTIVE status.
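Such a wait can be sketched as a simple polling loop. The helper name `wait_for_active`, its timeout values and the injectable `sleep` argument are assumptions made for illustration; the only API relied upon is a client-style `describe_table(TableName=...)` call whose response contains `Table.TableStatus`, as in boto3.

```python
import time


def wait_for_active(client, table_name, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll describe_table() until the table status is ACTIVE.

    Raises TimeoutError if the table does not become ACTIVE in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = client.describe_table(TableName=table_name)["Table"]["TableStatus"]
        if status == "ACTIVE":
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"table {table_name} still {status} after {timeout}s")
        sleep(interval)
```

A test would call this right after each `update_table()` and before issuing the next schema-changing request on the same table.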
The entire Alternator test suite now passes (or is skipped) on AWS,
so: Fixes #7778.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201213164931.2767236-1-nyh@scylladb.com>
Add a test that better clarifies what StartingSequenceNumber returned by
DescribeStream really guarantees (this question was raised in a review
of a different patch). The main thing we can guarantee is that reading a
shard from that position returns all the information in that shard -
similar to TRIM_HORIZON. This test verifies this, and it passes.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201112081250.862119-1-nyh@scylladb.com>
We already have a test for the behavior of a closed shard and how
iterators previously created for it are still valid. In this patch
we add to this also checking that the shard id itself, not just the
iterator, is still valid.
Additionally, although the aforementioned test used a disabled stream
to create a closed shard, it was not a complete test for the behavior
of a disabled stream, and this patch adds such a test. We check that
although the stream is disabled, it is still fully usable (for 24 hours) -
its original ARN is still listed on ListStreams, the ARN is still usable,
its shards can be listed, all are marked as closed but still fully readable.
Both tests pass on DynamoDB, and xfail on Alternator because of
issue #7239 - CDC drops the CDC log table as soon as CDC is disabled,
so the stream data is lost immediately instead of being retained for
24 hours.
Refs #7239
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201006183915.434055-1-nyh@scylladb.com>
Fixes #7424
The AWS SDK (Kinesis) assumes SequenceNumbers are monotonically
growing bigints. Since we sort on and use timeuuids as sequence
numbers, a "raw" bit representation of them will _not_ fulfill this
requirement. However, we can "unwrap" the timestamp from the uuid
msb and give the value as timestamp<<64|lsb, which will
ensure sort order == bigint order.
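The following sketch illustrates the idea using Python's standard `uuid` module. The helpers `make_timeuuid` and `sequence_number` are hypothetical and only model the timestamp<<64|lsb formula; the actual implementation is in Scylla's C++ code and may differ in detail.

```python
import uuid

LSB_MASK = (1 << 64) - 1


def make_timeuuid(timestamp: int, clock_seq: int = 0, node: int = 0) -> uuid.UUID:
    """Build a version-1 UUID from a 60-bit timestamp (illustration only)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def sequence_number(u: uuid.UUID) -> int:
    """Return timestamp<<64 | lsb, which orders by time when compared as an integer."""
    return (u.time << 64) | (u.int & LSB_MASK)


# In the raw bit layout the low 32 bits of the timestamp come first,
# so a later UUID can have a smaller raw integer value.
earlier = make_timeuuid(0xFFFFFFFF)
later = make_timeuuid(0x100000000)
assert later.int < earlier.int
assert sequence_number(later) > sequence_number(earlier)
```

The two assertions at the end show why the raw representation is unsuitable and why the unwrapped value is.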
Fixes #7409
AWS kinesis Java sdk requires/expects shards to be reported in
lexical order, and even worse, ignores lastevalshard. Thus not
upholding said order will break their stream introspection badly.
Added asserts to unit tests.
v2:
* Added more comments
* use unsigned_cmp
* unconditional check in streams_test
Fixes #7344
It is not really needed data, as shard_id:s are not required
to be unique across streams, and also because of the length limit
on the shard_id text representation.
As a side effect, shard iter instead carries the stream arn.
As the test test_streams_closed_read confirmed, when a stream shard is
closed, GetRecords should not return a NextShardIterator at all.
Before this patch we wrongly returned an empty string for it.
Before this patch, several Alternator Stream tests (in test_streams.py)
failed when running against a multi-node Scylla cluster. The reason is as
follows: As a multi-node cluster boots and more and more nodes enter the
cluster, the cluster changes its mind about the token ownership, and
therefore the list of stream shards changes. By the time we have the full
cluster, a bunch of shards were created and closed without any data yet.
All the tests will see these closed shards, and need to understand them.
The fetch_more() utility function correctly assumed that a closed shard
does not return a NextShardIterator, and got confused by the empty string
we used to return.
Now that closed shards can return responses without NextShardIterator,
we also needed to fix in this patch a couple of tests which wrongly assumed
this can't happen. These tests did not fail on DynamoDB because unlike in
Scylla, DynamoDB does not have any closed shards in normal tests which
do not specifically cause them (only test_streams_closed_read).
We also need to fix test_streams_closed_read to get rid of an unnecessary
assumption: It currently assumes that when the very last item in
a closed shard is read, the end-of-shard is immediately signaled (i.e.,
NextShardIterator is not returned). Although DynamoDB does in fact do this,
it is also perfectly legal for Alternator's implementation to return the
last item with a new NextShardIterator - and only when the client reads
from that iterator, we finally signal the end of the shard.
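A reader that tolerates both behaviors can be sketched like this. `read_shard_to_end` and its `max_calls` bound are hypothetical; the sketch assumes only a boto3-style `get_records(ShardIterator=...)` call whose response may or may not contain `NextShardIterator`.

```python
def read_shard_to_end(client, shard_iterator, max_calls=100):
    """Read records until the shard signals its end or max_calls is reached.

    Returns (records, closed), where closed is True if a response arrived
    without NextShardIterator, i.e. the end of a closed shard was reached.
    """
    records = []
    for _ in range(max_calls):
        response = client.get_records(ShardIterator=shard_iterator)
        records.extend(response.get("Records", []))
        shard_iterator = response.get("NextShardIterator")
        if shard_iterator is None:
            # End of a closed shard: there is nothing more to read.
            return records, True
    # Still open (or not yet known to be closed) after max_calls reads.
    return records, False
```

The loop does not care whether the end-of-shard signal arrives together with the last item or in a separate, empty response.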
Fixes #7237.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200922082529.511199-1-nyh@scylladb.com>
This patch adds a test, test_streams_closed_read, which reproduces
two issues in Alternator Streams, regarding the behavior of *closed*
stream shards:
Refs #7239: After streaming is disabled, the stream should still be readable,
it's just that all its shards are now "closed".
Refs #7237: When reaching the end of a closed shard, NextShardIterator should
be missing. Not set to an empty string as we do today.
The test passes on DynamoDB, and xfails on Alternator, and should continue to
do so until both issues are fixed.
This patch changes the implementation of the disable_stream() function.
This function was never actually used by the existing code, and now that
I wanted to use it, I discovered it didn't work as expected and had to fix it.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200915134643.236273-1-nyh@scylladb.com>
When the test suite is run with Scylla serving in HTTPS mode, using
test/alternator/run --https, two Alternator Streams tests failed.
With this patch fixing a bug in the test, the tests pass.
The bug was in the is_local_java() function which was supposed to detect
DynamoDB Local (which behaves in some things differently from the real
DynamoDB). When that detection code makes an HTTPS request and does not
disable checking the server's certificate (which on Alternator is
self-signed), the request fails - but not in the way that the code expected.
So we need to fix the is_local_java() to allow the failure mode of the
self-signed certificate. Anyway, this case is *not* DynamoDB Local so
the detection function would return false.
Fixes #7214
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200910194738.125263-1-nyh@scylladb.com>
This patch adds regression tests for four recently-fixed issues which did not yet
have tests:
Refs #7157 (LatestStreamArn)
Refs #7158 (SequenceNumber should be numeric)
Refs #7162 (LatestStreamLabel)
Refs #7163 (StreamSpecification)
I verified that all the new tests failed before these issues were fixed, but
now pass.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200907155334.562844-1-nyh@scylladb.com>
This patch adds a test for the TRIM_HORIZON option of GetShardIterator in
Alternator Streams. This option asks to fetch again *all* the available
history in this shard stream. We had an implementation for it, but not a
test - so this patch adds one. The test passes.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200830131458.381350-1-nyh@scylladb.com>
Alternator Streams already support the AT_SEQUENCE_NUMBER and
AFTER_SEQUENCE_NUMBER options for iterators. These options allow replaying
a stream of changes from a known position or after that known position.
However, we never had a test verifying that these features actually work
as intended, beyond just checking syntax. Having such tests is important
because recently we changed the implementation of these iterators, but
didn't have a test verifying that they still work.
So in this patch we add such tests. The tests pass (as usual, on both
Alternator and DynamoDB).
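The intended semantics of the two options can be modeled on an in-memory list of records. This is a simplified sketch: `replay` is a hypothetical helper, records are plain dicts, and sequence numbers are compared as integers; it is not how the tests or the server are written.

```python
def replay(records, iterator_type, sequence_number):
    """Return the records an iterator of the given type would start from.

    records: list of dicts with a numeric-string "SequenceNumber" field,
             in stream order (a simplified model of a shard).
    """
    start = int(sequence_number)
    if iterator_type == "AT_SEQUENCE_NUMBER":
        # Include the record at the known position.
        return [r for r in records if int(r["SequenceNumber"]) >= start]
    if iterator_type == "AFTER_SEQUENCE_NUMBER":
        # Start just past the known position.
        return [r for r in records if int(r["SequenceNumber"]) > start]
    raise ValueError(f"unsupported iterator type: {iterator_type}")
```

Replaying from the sequence number of the second of three records returns the last two records with AT_SEQUENCE_NUMBER and only the last one with AFTER_SEQUENCE_NUMBER.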
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200830115817.380075-1-nyh@scylladb.com>
We had a test, test_streams_last_result, that verifies that after reading from
an Alternator Stream the last event, reading again will find nothing.
But we didn't actually have a test which checks that if at that point a new event
*does* arrive, we can read it. This test checks this case, and it passes (we don't
have a bug there, but it's good as a regression test for NextShardIterator).
This test also verifies that after reading an event for a particular key on
a specific stream "shard", the next event for the same key will arrive on the
same shard.
This test passes on both Alternator and DynamoDB.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200830105744.378790-1-nyh@scylladb.com>