Commit Graph

21 Commits

Author SHA1 Message Date
Nadav Har'El
ba97fd98a3 alternator: reduce stall for Query and Scan with large pages
Before this patch, Alternator's Query and Scan operations convert an
entire result page to JSON without yielding. For a page of maximum
size (1MB) and tiny rows, this can cause a significant stall - the
test included in this patch reported stalls of 14-26ms on my laptop.

The problem is the describe_items() function, which does this conversion
immediately, without yielding. This patch changes this function to
return a future, and use the result_set::visit_gently() method
instead of visit() that yields when needed.

This patch does not completely eliminate stalls in the test, but
on my laptop usually reduces them to around 5ms. It appears that
the remaining stalls some from other places not fixed in this PR,
such as perhaps query_page::handle_result(), and will need to be
fixed by additional patches.

The test included in this patch is useful for manually reproducing
the stall, but not useful as a regression test: It is slow (requiring
a couple of seconds to set up the large partition) and doesn't
check anything, and can't even report the stall without modifying the
test runner. So the test is skipped by default (using the "veryslow"
marker) and can be enabled and run manually by developers who want
to continue working on #17995.

Refs #17995.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-03-26 18:32:45 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Nadav Har'El
d03bd82222 Revert "test: move scylla_inject_error from alternator/ to cql-pytest/"
This reverts commit 8e892426e2 and fixes
the code in a different way:

That commit moved the scylla_inject_error function from
test/alternator/util.py to test/cql-pytest/util.py and renamed
test/alternator/util.py. I found the rename confusing and unnecessary.
Moreover, the moved function isn't even usable today by the test suite
that includes it, cql-pytest, because it lacks the "rest_api" fixture :-)
so test/cql-pytest/util.py wasn't the right place for it anyway.
test/rest_api/rest_util.py could have been a good place for this function,
but there is another complication: Although the Alternator and rest_api
tests both had a "rest_api" fixture, it has a different type, which led
to the code in rest_api which used the moved function to have to jump
through hoops to call it instead of just passing "rest_api".

I think the best solution is to revert the above commit, and duplicate
the short scylla_inject_error() function. The duplication isn't an
exact copy - the test/rest_api/rest_util.py version now accepts the
"rest_api" fixture instead of the URL that the Alternator version used.

In the future we can remove some of this duplication by having some
shared "library" code but we should do it carefully and starting with
agreeing on the basic fixtures like "rest_api" and "cql", without that
it's not useful to share small functions that operate on them.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11275
2022-08-11 06:43:26 +03:00
Aleksandra Martyniuk
8e892426e2 test: move scylla_inject_error from alternator/ to cql-pytest/
Move scylla_inject_error from alternator/ to cql-pytest/ so it
can be reached from various tests dirs. alternator/util.py is
renamed to alternator/alternator_util.py to avoid name shadowing.
2022-07-29 09:35:20 +02:00
Nadav Har'El
84143c2ee5 alternator: implement Select option of Query and Scan
This patch implements the previously-unimplemented Select option of the
Query and Scan operators.

The most interesting use case of this option is Select=COUNT which means
we should only count the items, without returning their actual content.
But there are actually four different Select settings: COUNT,
ALL_ATTRIBUTES, SPECIFIC_ATTRIBUTES, and ALL_PROJECTED_ATTRIBUTES.

Five previously-failing tests now pass, and their xfail mark is removed:

 *  test_query.py::test_query_select
 *  test_scan.py::test_scan_select
 *  test_query_filter.py::test_query_filter_and_select_count
 *  test_filter_expression.py::test_filter_expression_and_select_count
 *  test_gsi.py::test_gsi_query_select_1

These tests cover many different cases of successes and errors, including
combination of Select and other options. E.g., combining Select=COUNT
with filtering requires us to get the parts of the items needed for the
filtering function - even if we don't need to return them to the user
at the end.

Because we do not yet support GSI/LSI projection (issue #5036), the
support for ALL_PROJECTED_ATTRIBUTES is a bit simpler than it will need
to be in the future, but we can only finish that after #5036 is done.

Fixes #5058.

The most intrusive part of this patch is a change from attrs_to_get -
a map of top-level attributes that a read needs to fetch - to an
optional<attrs_to_get>. This change is needed because we also need
to support the case that we want to read no attributes (Select=COUNT),
and attrs_to_get.empty() used to mean that we want to read *all*
attributes, not no attributes. After this patch, an unset
optional<attrs_to_get> means read *all* attributes, a set but empty
attrs_to_get means read *no* attributes, and a set and non-empty
attrs_to_get means read those specific attributes.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220405113700.9768-2-nyh@scylladb.com>
2022-04-11 10:04:32 +02:00
Nadav Har'El
9c1ebdceea alternator: forbid empty AttributesToGet
In DynamoDB one can retrieve only a subset of the attributes using the
AttributesToGet or ProjectionExpression paramters to read requests.
Neither allows an empty list of attributes - if you don't want any
attributes, you should use Select=COUNT instead.

Currently we correctly refuse an empty ProjectionExpression - and have
a test for it:
test_projection_expression.py::test_projection_expression_toplevel_syntax

However, Alternator is missing the same empty-forbidding logic for
AttributesToGet. An empty AttributesToGet is currently allowed, and
basically says "retrieve everything", which is sort of unexpected.

So this patch adds the missing logic, and the missing test (actually
two tests for the same thing - one using GetItem and the other Query).

Fixes #10332

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220405113700.9768-1-nyh@scylladb.com>
2022-04-11 10:21:02 +03:00
Nadav Har'El
4349514064 test/alternator: add smaller reproducer for Limit-less reverse query
The regression test we have for Alternator's issue #9487 (where a reverse
query without a Limit given was broken into 100MB pages instead of the
expected 1MB) is test_query.py::test_query_reverse_long. But this is a
very long test requiring a 100MB partition, and because of its slowness
isn't run by default.

This patch adds another version of that test, test_query_reverse_longish,
which reproduces the same issue #9487 with a partition 50 times shorter
(2MB) so it only takes a fraction of a second and can be enabled by
default. It also requires much less network traffic which is important
when running these tests non-locally.

We leave the original test test_query_reverse_long behind, it can be
still useful to stress Scylla even beyond the 100MB boundary, but it
remains in @veryslow mode so won't run in default test runs.

Refs #9487
Refs #7586

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220220161905.852994-1-nyh@scylladb.com>
2022-02-21 09:12:16 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Nadav Har'El
a30e71e27a alternator: doc, test: fix mentions of reverse queries
Now that issues #7586 and #9487 were fixed, reverse queries - even in
long partitions - work well, we can drop the claim in
alternator/docs/compatibility.md that reverse queries are buggy for
large partitions.

We can also remove the "xfail" mark from the tes that checks this
feature, as it now passes.

Refs #7586
Refs #9487

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #9831
2022-01-16 17:46:26 +02:00
Nadav Har'El
1c279118f4 test/alternator: more test cases for Select parameter
Add to the existing tests for the Select parameter of the Query and Scan
operations another check: That when Select is ALL_ATTRIBUTES or COUNT,
specifying AttributesToGet or ProjectionExpression is forbidden -
because the combination doesn't make sense.

The expanded test continues to xfail on Alternator (because the Select
parameter isn't yet implemented), and passes on DynamoDB. Strengthening
the tests for this feature will be helpful when we decide to implement it.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211125074128.741677-1-nyh@scylladb.com>
2021-11-29 20:28:25 +01:00
Nadav Har'El
77bd4afda7 test/alternator: avoid client-side validation
Ever since we started testing Alternator with tests written in Python
and using Amazon's "boto3" library, one limitation kept annoying us:

Boto3 verifies the validity of the request parameters before passing
them on to the server. It verifies that mandatory parameters are not
missing, that parameters have the right types, and sometimes even the
right ranges - all in the library before ever sending the request.
This meant that in many cases, we couldn't get good test coverage for
Alternator's server-side handling of *wrong* parameters.

As it turns out, it is trivial to tell boto3 to *not* do its client-side
request validation, with the `parameter_validation=False` config flag.
We just never noticed that such a flag existed :-)

So this patch adds this flag. It then fixes a few tests which expected
ParameterValidationError - this error is the client-side validation
failure, but should now be replaced by checking the server-side error.

The patch also adds a couple of invalid parameter checks that we
couldn't do before because of boto3's eagerness to check them on the
client side. We can add a lot more of these error tests in the future,
now that we got rid of client-side validation.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211005095514.537226-1-nyh@scylladb.com>
2021-10-05 13:26:51 +02:00
Nadav Har'El
1edcc3a218 test/alternator: add test for reverse queries
This patch adds a reproducer for issue #7586 - that Alternator queries
(Query) operating in reverse order (ScanIndexForward = false) are
artificially limited to 100 MB partitions because of their memory use.

This test generates a partition over 100 MB in size and then tries various
reverse queries on it - with or without Limit, starting at the end or
the middle of the partition. The test currently fails when a reverse query
refuses to operate on such a large partition - the log reports this:

  ERROR ... Memory usage of reversed read exceeds hard limit of 104857600
  (configured via max_memory_for_unlimited_query_hard_limit), while reading
  partition K1H6ON3A1C

With yet-uncommitted reverse-scan improvements, the test proceeds further,
but still fails where we test that a reverse query with Limit not
explicitly specified should still be limited to a certain size (e.g. 1MB)
and cannot return the entire 100 MB partition in one response.

Please note that this is not a comprehensive test for Scylla's reverse
scan implementation: In particular we do not have separate tests for
reverse scan's implementation on different sources - memtables, sstables,
or the cache. Nor do we check all sorts of edge cases. We assume that
Scylla's reverse scan implementation will have its own unit tests
elsewhere that will check these things - and this test can focus on the
Alternator use case.

This test is marked "xfail" because it still fails on Alternator. It is
marked "veryslow" because it's a (relatively) slow test, taking multiple
seconds to set up the 100 MB partition. So run the test with the
pytest options "--runxfail --runveryslow" to see how it fails.

Refs #7586

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210930063700.407511-1-nyh@scylladb.com>
2021-09-30 09:34:39 +02:00
Nadav Har'El
dba184039a test/alternator: another test for Query's ExclusiveStartKey
We already have tests for Query's ExclusiveStartKey option, but we
only exercised it as a way for paging linearly through all the results.
Now we add a test that confirms that ExclusiveStartKey can be used not
just for paging through all the result - but also for jumping directly to
the middle of a partition after any clustering key (existing or non-
existing clustering key). The new test also for the first time verifies
that ExclusiveStartKey with a specific format works (previous tests just
copied LastEvaluatedKey to ExclusiveStartKey, so any opaque cookie could
have worked).

The test passes on both DynamoDB and Alternator so it did not find a new
bug. But it's useful to have as a regression test, in case in the future
we want to improve paging performance (see #6278) - and need to keep in
mind that ExclusiveStartKey is not just for paging.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210729114703.1609058-1-nyh@scylladb.com>
2021-08-04 15:24:47 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Nadav Har'El
86779664f4 alternator: fix broken Scan/Query paging with bytes keys
When an Alternator table has partition keys or sort keys of type "bytes"
(blobs), a Scan or Query which required paging used to fail - we used
an incorrect function to output LastEvaluatedKey (which tells the user
where to continue at the next page), and this incorrect function was
correct for strings and numbers - but NOT for bytes (for bytes, we
need to encode them as base-64).

This patch also includes two tests - for bytes partition key and
for bytes sort key - that failed before this patch and now pass.
The test test_fetch_from_system_tables also used to fail after a
Limit was added to it, because one of the tables it scans had a bytes
key. That test is also fixed by this patch.

Fixes #7768

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201207175957.2585456-1-nyh@scylladb.com>
2020-12-08 09:38:23 +01:00
Nadav Har'El
095ddf0d41 alternator test: use ConsistentRead=True where missing
All tests that write some data and then read it back need to use
ConsistentRead=True, otherwise the test may sporadically fail on a multi-
node cluster.

In the previous patch we fixed the full_query()/full_scan() convenience
functions. In this patch, I audited the calls to the boto3 read methods -
get_item(), batch_get_item(), query(), scan(), and although most of them
did use ConsistentRead=True as needed, I found some missing and this patch
fixes them.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200616080334.825893-1-nyh@scylladb.com>
2020-06-17 14:57:45 +02:00
Nadav Har'El
0b9f25ab50 alternator: implement FilterExpression
This patch provides a complete implementation for the FilterExpression
parameter - the newer syntax for filtering the results of the Query or
Scan operations.

The implementation is pretty straightforward - we already added earlier
a result-filtering framework to Alternator, and used it for the older
filtering syntax - QuryFilter and ScanFilter. All we had to do now was
to run the FilterExpression (which has the same syntax as a
ConditionExpression) on each individual items. The previous cleanup
patches were important to reduce the friction of running these expressions
on the items.

After the previous patches fixing small esoteric bugs in a few expression
functions, with this patch *all* the tests in test_filter_expression.py
now pass, and so do the two FilterExpression tests in test_query.py and
test_scan.py. As far as I know (and of course minus any bugs we'll discover
later), this marks the FilterExpression feature complete.

Fixes #5038.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-06-14 12:16:26 +03:00
Nadav Har'El
cd0fbb8d38 alternator test: add comprehensive tests for QueryFilter feature
The QueryFilter parameter of Query is only partially implemented (issue
tests for it.

In this patch, we add comprehensive tests for this feature and all its
various operators, types, and corner cases. The tests cover both the
parts we already implemented, and the parts we did not yet.

As usual, all tests succeed on DynamoDB, but many still xfail on Alternator
pending the complete implementation.

Refs #5028.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200525141242.133710-1-nyh@scylladb.com>
2020-05-27 15:29:27 +02:00
Nadav Har'El
bf7b5a0a0d alternator test: add tests for Query's KeyConditions
We had a very limited set of tests for the KeyConditions feature of
Query, which some error cases as well as important use cases (such as
bytes keys), leading to bugs #6490 and #6495 remaining undiscovered.

This patch adds a comprehensive test for the KeyConditions and (hopefully)
all its different combinations of operators, types, and many cases of errors.

We already had a comprehensive test suite for the newer
KeyConditionsExpression syntax, and this patch brings a similar level of
coverage for the older KeyConditions syntax.

Refs #6490
Refs #6495

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200524141800.104950-3-nyh@scylladb.com>
2020-05-25 09:59:06 +02:00
Piotr Sarna
09e4f3b917 alternator: implement ScanIndexForward
The ScanIndexForward parameter is now fully implemented
and can accept ScanIndexForward=false in order to query
the partitions in reverse clustering order.
Note that reading partition slices in reverse order is less
efficient than forward scans and may put a strain on memory
usage, especially for large partitions, since the whole partition
is currently fetched in order to be reversed.

Fixes #5153
2020-04-28 11:44:46 +03:00
Nadav Har'El
4e2bf28b84 alternator-test: make Alternator tests runnable from test.py
To make the tests in alternator-test runnable by test.py, we need to
move the directory alternator-test/ to test/alternator, because test.py
only looks for tests in subdirectories of test/. Then, we need to create
a test/alternator/suite.yaml saying that this test directory is of type
"Run", i.e., it has a single run script "run" which runs all its tests.

The "run" script had to be slightly modified to be aware of its new
location relative to the source directory.

To run the Alternator tests from test.py, do:

	./test.py --mode dev alternator

Note that in this version, the "--mode" has no effect - test/alternator/run
always runs the latest compiled Scylla, regardless of the chosen mode.

The Alternator tests can still be run manually and individually against
a running Scylla or DynamoDB as before - just go to the test/alternator
directory (instead of alternator-test previously) and run "pytest" with
the desired parameters.

Fixes #6046

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-04-12 16:27:45 +03:00