In order to prepare alternator for adding regular columns to schema,
i.e. in order to create a materialized view over them,
the code is changed so that updating no longer assumes that only keys
are included in the table schema.
Since in the future we may want to have more regular columns
in alternator tables' schemas, the code is changed accordingly,
so all regular columns will be fetched instead of just the attribute
map.
If no regular column attributes are passed to PutItem, the attr
collector serializes an empty collection mutation nonetheless
and sends it. It's redundant, so instead, if the attr colector
is empty, the collection does not get serialized and sent to replicas.
Keeping an instance of client_state is a convenient way of being able
to use tracing for alternator. It's also currently used in paging,
so adding a client state to executor removes the need of keeping
a dummy value.
If a user tries to create a table with a unsupported feature -
a local secondary index, a used-defined encryption key or supporting
streams (CDC), let's refuse the table creation, so the application
doesn't continue thinking this feature is available to it.
The "Tags" feature is also not supported, but it is more harmless
(it is used mostly for accounting purposes) so we do not fail the
table creation because of it.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818125528.9091-1-nyh@scylladb.com>
In CQL, before a user can create a table, they must create a keyspace to
contain this table and, among other things, specify this keyspace's RF.
But in the DynamoDB API, there is no "create keyspace" operation - the
user just creates a table, and there is no way, and no opportunity,
to specify the requested RF. Presumably, Amazon always uses the same
RF for all tables, most likely 3, although this is not officially
documented anywhere.
The existing code creates the keyspace during Scylla boot, with RF=1.
This RF=1 always works, and is a good choice for a one-node test run,
but was a really bad choice for a real cluster with multiple nodes, so
this patch fixes this choice:
With this patch, the keyspace creation is delayed - it doesn't happen
when the first node of the cluster boots, but only when the user creates
the first table. Presumably, at that time, the cluster is already up,
so at that point we can make the obvious choice automatically: a one-node
cluster will get RF=1, a >=3 node cluster will get RF=3. The choice of
RF is logged - and the choice of RF=1 is considered a warning.
Note that with this patch, keyspace creation is still automatic as it
was before. The user may manually create the keyspace via CQL, to
override this automatic choice. In the future we may also add additional
keyspace configuration options via configuration flags or new REST
requests, and the keyspace management code will also likely change
as we start to support clusters with multiple regions and global
tables. But for now, I think the automatic method is easiest for
users who want to test-drive Alternator without reading lengthy
instructions on how to set up the keyspace.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190820180610.5341-1-nyh@scylladb.com>
We allow BillingMode to be set to either PAY_PER_REQUEST (the default)
or PROVISIONED, although neither mode is fully implemented: In the former
case the payment isn't accounted, and in the latter case the throughput
limits are not enforced.
But other settings for BillingMode are now refused, and we add a new test
to verify that.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818122919.8431-1-nyh@scylladb.com>
Currently Alternator starts all Scylla requests (including both reads
and writes) without any timeout set. Because of bugs and/or network
problems, Requests can theoretically hang and waste Scylla request for
hours, long after the client has given up on them and closed their
connection.
The DynamoDB protocol doesn't let a user specify which timeout to use,
so we should just use something "reasonable", in this patch 10 seconds.
Remember that all DynamoDB read and write requests are small (even scans
just scan a small piece), so 10 seconds should be above and beyond
anything we actually expect to see in practice.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190812105132.18651-1-nyh@scylladb.com>
It turns out that recent rjson patches introduced some buggy
tabs instead of spaces due to bad IDE configuration. The indentation
is restored to spaces.
Alternator allows passing hash and sort key restrictions
as filters - it is, however, better to incorporate these restrictions
directly into partition and clustering ranges, if possible.
It's also necessary, as optimizations inside restrictions_filter
assume that it will not be fed unneeded rows - e.g. if filtering
is not needed on partition key restrictions, they will not be checked.
Some underlying operations (e.g. paging) make use of cql_stats
structure from CQL3. As such, cql_stats structure is added
to alternator stats in order to gather and use these statistics.
This patch adds partial support for GSI (Global Secondary Index) in
Alternator, implemented using a materialized view in Scylla.
This initial version only supports the specific cases of the index indexing
a column which was already part of the base table's key - e.g., indexing
what used to be a sort key (clustering key) in the base table. Indexing
of non-key attributes (which today live in a map) is not yet supported in
this version.
Creation of a table with GSIs is supported, and so is deleting the table.
UpdateTable which adds a GSI to an existing table is not yet supported.
Query and Scan operations on the index are supported.
DescribeTable does not yet list the GSIs as it should.
Seven previously-failing tests now pass, so their "xfail" tag is removed.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190808090256.12374-1-nyh@scylladb.com>
With libjsoncpp we were forced to work around the problem of
non-noexcept constructors by using an intermediate unique pointer.
Objects provided by rapidjson have correct noexcept specifiers,
so the workaround can be dropped.
Profiling alternator implied that JSON parsing takes up a fair amount
of CPU, and as such should be optimized. libjsoncpp is a standard
library for handling JSON objects, but it also proves slower than
rapidjson, which is hereby used instead.
The results indicated that libjsoncpp used roughly 30% of CPU
for a single-shard alternator instance under stress, while rapidjson
dropped that usage to 18% without optimizations.
Future optimizations should include eliding object copying, string copying
and perhaps experimenting with different JSON allocators.
Our CreateTable handler assumed that the function
migration_manager::announce_new_column_family()
returns a failed future if the table already exists. But in some of
our code branches, this is not the case - the function itself throws
instead of returning a failed future. The solution is to use
seastar::futurize_apply() to handle both possibilities (direct exception
or future holding an exception).
This fixes a failure of the test_table.py::test_create_table_already_exists
test case.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The function attrs_type() return a supposedly singleton, but because
it is a seastar::shared_ptr we can't use the same one for multiple
threads, and need to use a separate one per thread.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190804163933.13772-1-nyh@scylladb.com>
Our ListTables implementation uses get_column_families(), which lists both
base tables and materialized views. We will use materialized views to
implement DynamoDB's secondary indexes, and those should not be listed in
the results of ListTables.
The patch also includes a test for this.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190717133103.26321-2-nyh@scylladb.com>
As in case of set_diff, an exception message in set_sum should include
the user-provided request (ADD) rather than our internal helper function
set_sum.
Although we do not support GSI yet, until now we silently ignored
CreateTable's GSI parameter, and the user wouldn't know the table
wasn't created as intended.
In this patch, GSI is still unsupported, but now CreateTable will
fail with an error message that GSI is not supported.
We need to change some of the tests which test the error path, and
expect an error - but should not consider a table creation error
as the expected error.
After this patch, test_gsi.py still fails all the tests on
Alternator, but much more quickly :-)
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190711161420.18547-1-nyh@scylladb.com>
The Query operation's conditions can be used to search for a particular
hash key or both hash and sort keys - but not any other combinations.
We previously forgot to verify most errors, so in this patch we add
missing verifications - and tests to confirm we fail the query when
DynamoDB does.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190711132720.17248-1-nyh@scylladb.com>
The holds_path() utility function is actually used to check if a value
needs read before write, so its name is changed to more fitting
check_needs_read_before_write.
Alternator currently keeps an item's attributes inside a map, and we
had a serious bug in the way we build mutations for this map:
We didn't know there was a requirement to build this mutation sorted by
the attribute's name. When we neglect to do this sorting, this confuses
Scylla's merging algorithms, which assume collection cells are thus
sorted, and the result can be duplicate cells in a collection, and the
visible effect is a mutation that seems to be ignored - because both
old and new values exist in the collection.
So this patch includes a new helper class, "attribute_collector", which
helps collect attribute updates (put and del) and extract them in correctly
sorted order. This helper class also eliminates some duplication of
arcane code to create collection cells or deletions of collection cells.
This patch includes a simple test that previously failed, and one xfail
test that failed just because of this bug (this was the test that exposed
this bug). Both tests now succeed.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190709160858.6316-1-nyh@scylladb.com>
In order to serve update requests that depend on read-before-write,
a proper helper function which fetches the existing item with a given
key from the database is added.
This read-before-write mechanism is not considered safe, because it
provides no linearizability guarantees and offers no synchronization
protection. As such, it should be consider a placeholder that works
fine on a single machine and/or no concurrent access to the same key.
The calculate_value utility function is going to need more context
in order to resolve paths present in the right-hand side of update_item
operators: update_info and schema.
Historically, resolving a path checked for key columns, which are not
allowed to be on the left-hand side of the assignment. However, path
resolving will now also be used for right-hand side, where it should
be allowed to use the key value.
In order to implement read-before-write in the future, calculate_value
now accepts an additional parameter: previous_item. If read-before-write
was performed, previous_item will contain an item for the given key
which already exists in the database at the time of the update.
Calculating value represented as 'v1 + v2' or 'v1 - v2' was previously
implemented with a double type, which offers limited precision.
From now on, these computations are based on big_decimal, which
allows returning values without losing precision.
This patch depends on 'add big_decimal arithmetic operators' series.
Message-Id: <f741017fe3d3287fa70618068bdc753bfc903e74.1562318971.git.sarna@scylladb.com>
The idiomatic way to use an std::variant depending the type holds is to use
std::visit. This modern API makes it unnecessary to write many boiler-plate
functions to test and cast the type of the variant, and makes it impossible
to forget one of the options. So in this patch we throw out the old ways,
and welcome the new.
Thanks to Piotr Sarna for the idea.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190704205625.20300-1-nyh@scylladb.com>
This patch adds to Alternator an implementation of the BatchGetItem
operation, which allows to start a number of GetItem requests in parallel
in a single request.
The implementation is almost complete - the only missing feature is the
ability to ask only for non-top-level attributes in ProjectionExpression.
Everything else should work, and this patch also includes tests which,
as usual, pass on DynamoDB and now also on Alternator.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Amazingly, it appears we never tested booting Alternator a second time :-)
Our initialization code creates a new keyspace, and was supposed to ignore
the error if this keyspace already existed - but we thought the error will
come as an exceptional future, which it didn't - it came as a thrown
exception. So we need to change handle_exception() to a try/catch.
With this patch, I can kill Alternator and it will correctly start again.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Operations which take a key as parameter, namely GetItem, UpdateItem,
DeleteItem and BatchWriteItem's DeleteRequest, already fail if the given
key is missing one of the nessary key attributes, or has the wrong types
for them. But they should also fail if the given key has spurious
attributes beyond those actually needed in a key.
So this patch adds this check, and tests to confirm that we do these checks
correctly.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The PutItem operation, and also the PutRequest of BatchWriteItem, are
supposed to completely replace the item - not to merge the new value with
the previous value. We implemented this wrongly - we just wrote the new
item forgetting a tombstone to remove the old item.
So this patch fixes these operations, and adds tests which confirm the
fix (as usual, these tests pass on DynamoDB, failed on Alternator before
this patch, and pass after the patch).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>