to 1024 bytes, and the entire item to 400 KB which therefore also
limits the size of one attribute. This test checks that we can
reach up to these limits, with binary keys and attributes.
The test does *not* check what happens once we exceed these
limits. In such a case, DynamoDB throws an error (I checked that
manually) but Alternator currently simply succeeds. If in the
future we decide to add artificial limits to Alternator as well,
we should add such tests as well.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
"len" is an unfortunate choice for a variable name, in case one
day the implementation may want to call the built-in "len" function.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
We already have a test for *string* sort-key ordering of items returned
by the Scan operation, and this test adds a similar test for the Query
operation. We verify that items are retrieved in the desired sorted
order (sorted by the aptly-named sort key) and not in creation order
or any other wrong order.
But beyond just checking that Query works as expected (it should,
given it uses the same machinary as Scan), the nice thing about this
test is that it doesn't create a new table - it uses a shared table
and creates one random partition inside it. This makes this test
faster and easier to write (no need for a new fixture), and most
importantly - easily allows us to write similar tests for other
key types.
So this patch also tests the correct ordering of *binary* sort keys.
It helped exposed bugs in previous versions of the binary key implementation.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Simple tests for item operations (PutItem, GetItem) with binary key instead
of string for the hash and sort keys. We need to be able to store such
keys, and then retrieve them correctly.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Until now we only supported string for key columns (hash or sort key).
This patch adds support for the bytes type (a.k.a binary or blob) as well.
The last missing type to be supported in keys is the number type.
Note that in JSON, bytes values are represented with base64 encoding,
so we need to decode them before storing the decoded value, and re-encode
when the user retrieves the value. The decoding is important not just
for saving storage space (the encoding is 4/3 the size of the decoded)
but also for correct *sorting* of the binary keys.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The DynamoDB API uses base64 encoding to encode binary blobs as JSON
strings. So we need functions to do these conversions.
This code was "inspired" by https://github.com/ReneNyffenegger/cpp-base64
but doesn't actually copy code from it.
I didn't write any specific unit tests for this code, but it will be
exercised and tested in a following patch which tests Alternator's use
of these functions.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
BEGINS_WITH behaves in a special way when a key postfix
consists of <255> bytes. The initial test does not use that
and instead checks UTF-8 characters, but once bytes type
is implemented for keys, it should also test specifically for
corner cases, like strings that consist of <255> byte only.
Message-Id: <fe10d7addc1c9d095f7a06f908701bb2990ce6fe.1558603189.git.sarna@scylladb.com>
BEGINS_WITH statement increments a string in order to compute
the upper bound for a clustering range of a query.
Unfortunately, previous implementation was not correct,
as it appended a <0> byte if the last character was <255>,
instead of incrementing a last-but-one character.
If the string contains <255> bytes only, the upper bound
of the returned upper bound is infinite.
Message-Id: <3a569f08f61fca66cc4f5d9e09a7188f6daad578.1558524028.git.sarna@scylladb.com>
We had several places in the code that need to parse the
ConsistentRead flag in the request. Let's add a function
that does this, and while at it, checks for more error
cases and also returns LOCAL_QUORUM and LOCAL_ONE instead
of QUORUM and ONE.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
As Shlomi suggested in the past, it is more likely that when we
eventually support global tables, we will use LOCAL_QUORUM,
not QUORUM. So let's switch to that now.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
So far, all of the tests in test_item.py (for PutItem, GetItem, UpdateItem),
were arbitrarily done on a test table with both hash key and sort key
(both with string type). While this covers most of the code paths, we still
need to verify that the case where there is *not* a sort key, also works
fine. E.g., maybe we have a bug where a missing clustering key is handled
incorrectly or an error is incorrectly reported in that case?
But in this patch we add tests for the hash-key-only case, and see that
it already works correctly. No bug :-)
We add a new fixture test_table_s for creating a test table with just
a single string key. Later we'll probably add more of these test tables
for additional key types.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Another type of key type error can be to forget part of the key
(the hash or sort key). Let's test that too (it already works correctly,
no need to patch the code).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
When a table has a hash key or sort key of a certain type (this can
be string, bytes, or number), one cannot try to choose an item using
values of different types.
We previously did not handle this case gracefully, and PutItem handled
it particularly bad - writing malformed data to the sstable and basically
hanging Scylla. In this patch we fix the pk_from_json() and ck_from_json()
functions to verify the expected type, and fail gracefully if the user
sent the wrong type.
This patch also adds tests for these failures, for the GetItem, PutItem,
and UpdateItem operations.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
According to the documentation, trying to GetItem a non-existant item
should result in an empty response - NOT a response with an empty "Item"
map as we do before this patch.
This patch fixes this case, and adds a test case for it. As usual,
we verify that the test case also works on Amazon DynamoDB, to verify
DynamoDB really behaves the way we thik it does.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
If an empty item (i.e., no attributes except the key) is created, or an item
becomes empty (by deleting its existing attributes), the empty item must be
maintained - it cannot just disappear. To do this in Scylla, we must add a
row marker - otherwise an empty attribute map is not enough to keep the
row alive.
This patch includes 4 test cases for all the various ways an empty item can be
created empty or non-empty item be emptied, and verifies that the empty item
can be correctly retrieved (as usual, to verify that our expectation of
"correctness" is indeed correct, we run the same tests against DynamoDB).
All these 4 tests failed before this patch, and now succeed.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
These lines of codes were superfluous and their result unused: the
make_item_mutation() function finds the pk and ck on its own.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
his patch adds a statistics framework to Alternator: Executor has (for
each shard) a _stats object which contains counters for various events,
and also is in charge of making these counters visible via Scylla's regular
metrics API (http://localhost:9180/metrics).
This patch includes a counter for each of DynamoDB's operation types,
and we increase the ones we support when handled. We also added counters
for total operations and unsupported operations (operation types we don't
yet handle). In the future we can easily add many more counters: Define
the counter in stats.hh, export it in stats.cc, and increment it in
where relevant in executor.cc (or server.cc).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Ask to retrieve only an attribute name which *none* of the items have.
The result should be a silly list of empty items, and indeed it is.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Use full_scan() in another test instead of open-coding the scan.
There are two more tests that could have used full_scan(), but
since they seem to be specifically adding more assertions or
using a different API ("paginators"), I decided to leave them
as-is. But new tests should use full_scan().
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This is a short, but extensive, test to the AttributesToGet parameter
to Scan, allowing to select for output only some of the attributes.
The AttributesToGet feature has several non-obvious features. Firstly,
it doesn't require that any key attributes be selected. So since each
item may have different non-key attributes, some scanned items may
be missing some of the selected columns, and some of the items may
even be missing *all* the selected columns - in which case DynamoDB
returns an empty item (and doesn't entirely skip this item). This
test covers all these cases, and it adds yet another item to the
'filled_test_table' fixture, one which has different attributes,
so we can see these issues.
As usual, this test passes in both DynamoDB and Alternator, to
assure we correspond to the *right* behavior, not just what we
think is right.
This test actually exposed a bug in the way our code returned
empty items (items which had none of the selected columns),
a bug which was fixed by the previous patch.
Instead of having yet another copy of table-scanning code, this
patch adds a utility function full_scan(), to scan an entire
table (with optional extra parameters for the scan) and return
the result as an array. We should simply existing tests in
test_scan.py by using this new function.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
When a Scan selects only certain attributes, and none of the key
attributes are selected, for some of the scanned items *nothing*
will remain to be output, but still Dynamo outputs an empty item
in this case. Our code had a bug where after each item we "moved"
the object leaving behind a null object, not an empty map, so a
completely empty item wasn't output as an empty map as expected,
and resulted in boto3 failing to parse the response.
This simple one-line patch fixes the bug, by resetting the item
to an empty map after moving it out.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Instead of blindly returning "localhost:8000" in response to
DescribeEndpoints and for sure causing us problems in the future,
the right thing to do is to return the same domain name which the
user originally used to get to us, be it "localhost:8000" or
"some.domain.name:1234". But how can we know what this domain name
was? Easy - this is why HTTP 1.1 added a mandatory "Host:" header,
and the DynamoDB driver I tested (boto3) adds it as expected,
indeed with the expected value of "localhost:8000" on my local setup.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Although different partitions are returned by a Scan in (seemingly)
random order, items in a single partition need to be returned sorted
by their sort key. This adds a test to verify this.
This patch adds to the filled_test_table fixture, which until now
had just one item in each partition, another partition (with the key
"long") with 164 additional items. The test_scan_sort_order_string
test then scans this table, and verifies that the items are really
returned in sorted order.
The sort order is, of course, string order. So we have the first
item with sort key "1", then "10", then "100", then "101", "102",
etc. When we implement numeric keys we'll need to add a version
of this test which uses a numeric clustering key and verifies the
sort order is numeric.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Because of a typo, we incorrectly set the table's sort key as a second
partition key column instead of a clustering key column. This has bad
but subtle consequences - such as that the items are *not* sorted
according to the sort key. So in this patch we fix the typo.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
DescribeEndpoints is not a very important API (and by default, clients
don't use it) but I wanted to understand how DynamoDB responds to it,
and what better way than to write a test :-)
And then, if we already have a test, let's implement this request in
Scylla as well. This is a silly implementation, which always returns
"localhost:8000". In the future, this will need to be configurable -
we're not supposed here to return *this* server's IP address, but rather
a domain name which can be used to get to all servers.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Most of the request types need to a TableName parameter, specifying the
name of the table they operate on. There's a lot of boilerplate code
required to get this table name and verify that it is valid (the parameter
exists, is a string, passes DynamoDB's naming rules, and the table
actually exists), which resulted in a lot of code duplication - and
in some cases missing checks.
So this patch introduces two utility functions, get_table_name()
and get_table(), to fetch a table name or the schema of an existing
table, from the request, with all necessary validation. If validation
fails, the appropriate api_error() is thrown so the user gets the
right error message.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Remove unused random-string code from conftest.py, and also add a
TODO comment how we should speed up filled_test_table fixture by
using a batch write - when that becomes available in Alternator.
(right now this fixture takes almost 4 seconds to prepare on a local
Alternator, and a whopping 3 minutes (!) to prepare on DynamoDB).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The test test_put_and_get_attribute_types needlessly named all the
different attributes and their variables, causing a lot of repetition
and chance for mistakes when adding additional attributes to the test.
In this rewrite, we only have a list of items, and automatically build
attributes with them as values (using sequential names for the attributes)
and check we read back the same item (Python's dict equality operator
checks the equality recursively, as expected).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Although we planned to initially support only string types, it turns out
for the attributes (*not* the key), we actually support all types already,
including all scalar types (string, number, bool, binary and null) and
more complex types (list, nested document, and sets).
This adds a tests which PutItem's these types and verifies that we can
retrieve them.
Note that this test deals with top-level attributes only. There is no
attempt to modify only a nested attribute (and with the current code,
it wouldn't work).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
In our tests, we cannot really assume that ListTables should returns *only*
the tables we created for the test, or even that a page size of 100 will
be enough to list our 3 pages. The issue is that on a shared DynamoDB, or
in hypothetical cases where multiple tests are run in parallel, or previous
tests had catestrophic errors and failed to clean up, we have no idea how
many unrelated tables there are in the system. There may be hundreds of
them. So every ListTables test will need to use paging.
So in this re-implementation, we begin with a list_tables() utility function
which calls ListTables multiple times to fetch all tables, and return the
resulting list (we assume this list isn't so huge it becomes unreasonable
to hold it in memory). We then use this utility function to fetch the table
list with various page sizes, and check that the test tables we created are
listed in the resulting list.
There's no longer a separate test for "all" tables (really was a page of 100
tables) and smaller pages (1,2,3,4) - we now have just one test that does the
page sizes 1,2,3,4, 50 and 100.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch cleans up some comments and reorganizes some functions in
conftest.py, where the test_table fixture was defined. The goal is to
later add additional types of test tables with different schemas (e.g.,
just a partition key, different key types, etc.) without too much
code duplication.
This patch doesn't change anything functional in the tests, and they
still pass ("pytest --local" runs all tests against the local Alternator).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The ck_from_json() utility function is easier to use if it handles
the no-clustering-key case as the callers need them too, instead of
requiring them to handle the no-clustering-key case separately.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>