Commit Graph

184 Commits

Author SHA1 Message Date
Piotr Sarna
1bd79705fb alternator: use partition tombstone if there's no clustering key
As @tgrabiec helpfully pointed out, creating a row tombstone
for a table which does not have a clustering key in its schema
creates something that looks like an open-ended range tombstone.
That's problematic for KA/LA sstable formats, which are incapable
of writing such tombstones, so a workaround is provided
in order to allow using KA/LA in alternator.

Fixes #6035

(cherry picked from commit 0a2d7addc0)
2020-04-16 12:01:51 +03:00
Piotr Sarna
781fbe8070 alternator: add service permit to callbacks
As a first step towards introducing admission control, the API
of alternator callbacks is extended with an additional 'permit'
parameter.
2020-03-16 07:44:25 +01:00
Nadav Har'El
77444a38a1 alternator: allow consistent reads on LSI - but not on GSI
Recently, Materialized Views were modified (see issue #4365) so that local
view updates (when both base and view replicas are the same node) are
synchronous. In particular, when the view's partition key is the same as
the base table's, view writes are synchronous: A write now only returns
after CL copies of the view data have been written.

Alternator's LSI have exactly this case (same partition key as the base).
This makes strongly-consistent (CL=LOCAL_QUORUM) reads in Alternator work
correctly, so we update the documentation accordingly to no longer say
that we don't support this DynamoDB feature.

However unlike LSIs, for GSIs strongly-consistent reads are still not
supported, and should not be supported (they are also not supported by
DynamoDB). Such reads should generate an error. So this patch fixes this
too. A GSI test which tested that strongly consistent reads are forbidden,
which used to xfail, now passes so the patch removes the "xfail".

Finally, we can simplify the LSI tests by using consistent reads instead of
eventually-consistent reads with retries. Beyond simplifying the test, it's
also an opportunity to *use* strongly-consistent reads and make sure that
they work (while, as mentioned above, similar reads for GSIs are refused).

Fixes #5007

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200311170446.28611-1-nyh@scylladb.com>
2020-03-12 09:18:00 +01:00
Nadav Har'El
96ca5ac2c8 alternator: use separate smp_service_group for bouncing requests
Until this patch, we used the default_smp_service_group() when bouncing
Alternator requests between shards (which is needed for LWT).

This patch creates a new smp_service_group for this purpose, which is
limited to 5000 concurrent requests (the same limit used for CQL's
bounce_request_smp_service_group). The purpose of this limit is to avoid
many shards admitting a huge number of requests and bouncing all of them
to the same shard who now can't "unadmit" these requests.

Fixes #5664.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200304170825.27226-1-nyh@scylladb.com>
2020-03-05 10:17:51 +01:00
Nadav Har'El
91d9632909 alternator: add rjson::remove_member() convenience function
This patch adds a rjson::remove_member() wrapper to the RemoveMember
method, which takes a std::string_view. But beyond the convenience, this
actually works around a subtle bug in RemoveMember where, if given a
StringRef parameter, ignores its length (see upstream issue
https://github.com/Tencent/rapidjson/issues/1649).

In the one place we used RemoveMember, it forced us to copy the string
because it wasn't null-terminated. The solution proposed here involves
wrapping the string view in a GenericValue - which no longer needs to copy
the string, but still works around the bug.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200303143524.28300-1-nyh@scylladb.com>
2020-03-03 16:35:41 +01:00
Nadav Har'El
0fcb226412 alternator: switch rjson::find() to use std::string_view
Our rjson::find() convenience function used RapidJson's "StringRef" type,
which is almost exactly like std::string_view. If we switch to use
string_view as we do in this patch, a lot of call sites become much simpler.

Moreover, there was an even more important motivation for this patch:
the RapidJson FindMember() function we used in rjson::find() has a bug when
given a StringRef - although a StringRef contains a length, the FindMember()
code ignores it and expects the string to be null-terminated (see:
https://github.com/Tencent/rapidjson/issues/1649). In this patch, we wrap
the pointer and length of a std::string_view in an rjson::value, a code path
which bypasses the FindMember bug, and yet does not require copying the
string.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200303141814.26929-1-nyh@scylladb.com>
2020-03-03 16:35:41 +01:00
Piotr Sarna
6f8c70d54b alternator: fix returning raw JSON errors
A couple of places in executor code leaked raw JSON errors to the user
instead of formulating a proper ValidationException message.
These places are now fixed, and the next patch in this series will
act as a regression checker, since all JSON errors will be returned
as SerializationException, not ValidationException instances.
2020-02-28 07:57:12 +02:00
Piotr Sarna
2402955d45 alternator: move parsing in front of executor
Parsing a request string into JSON happens as a first thing
in every request, so it can be performed before calling
any executor callbacks. The most important thing however,
is that making parsing a separate stage allows certain optimizations,
e.g. running all parsing in a single seastar thread, which allows
adding yields to rjson parsing later.
2020-02-28 07:57:12 +02:00
Nadav Har'El
0ab6c7fcef alternator: stricter checks for user-supplied attribute values
Until now, PutItem or UpdateItem could be used to insert almost any JSON
as an attribute's value - even those that do not match DynamoDB's typed
value specification.

Among other things, the new validation allows us to reject empty sets,
strings or byte arrays - which are (somewhat artificially) forbidden in
DynamoDB.

Also added tests for the empty sets, strings and byte arrays that should
be rejected.

Fixes #5896

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200225150525.4926-1-nyh@scylladb.com>
2020-02-26 08:12:26 +01:00
Nadav Har'El
6339f419ac alternator: removing all elements from a set should delete it
DynamoDB does not support empty sets. Operations which remove elements
from a set attribute should remove the attribute when the last item is
removed - not leave an empty set as it incorrectly does now.

Incidentally, the same patch fixes another bug - deleting elements from
a non-existent set attribute should be allowed (and do nothing), not fail
as it does now.

This patch also includes tests for both bugs.

Fixes #5895

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200225125343.31629-1-nyh@scylladb.com>
2020-02-26 08:12:19 +01:00
Nadav Har'El
e075eff915 alternator: complete implementation of ReturnValues parameter
This patch completes the support for the ReturnValues parameter for
the UpdateItem operation. This parameter has five settings - NONE, ALL_OLD,
ALL_NEW, UPDATED_OLD and UPDATED_NEW. Before this patch we already
supported NONE and ALL_OLD - and this patch completes the support for the
three remaining modes: ALL_NEW, UPDATED_OLD and UPDATED_NEW.

The patch also continues to improve test_returnvalues.py with additional
corner cases discovered during the development. After this patch, only
one xfailing test remains - testing updates to nested document paths,
which we do not yet support (even without the ReturnValues parameter).

After this patch, the support of ReturnValues is complete - for all
operations (UpdateItem, PutItem and DeleteItem) and all of its possible
settings.

Fixes #5053

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200221224221.31237-5-nyh@scylladb.com>
2020-02-24 10:40:53 +01:00
Nadav Har'El
1e500a2a34 alternator: rjson: another variant of set_with_string_name() utility
The rjson::set_with_string_name() utility function copies the given
string into the JSON key. The existing implementation required that this
input string be an std::string&, but a std::string_view would be fine too,
and I want to use it in new code to avoid yet another unnecessary copy.

Adding the overloads also exposes a few places where things were
implicitly converted to std::string and now cause an ambiguity - and
clearing up this ambiguity also allowed me to find places where this
conversion was unnecessary.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200221224221.31237-4-nyh@scylladb.com>
2020-02-24 10:38:54 +01:00
Nadav Har'El
fa5c2a4f58 alternator: UpdateItem only deleting attribute shouldn't create item
UpdateItem operations usually need to add a row marker:

 * An empty UpdateItem is supposed to create a new empty item (row).
   Such an empty item needs to have a row marker.

 * An UpdateItem to add an attribute x and then later an UpdateItem
   to remove this attribute x should leave an empty item behind.
   This means the first UpdateItem needed to add a row marker, so
   it will be left behind after the second UpdateItem.

So the existing code always added a row marker in UpdateItem.

However, there is one case where we should NOT create the row marker:
When the UpdateItem operation only has attribute deletions, and nothing
else, and it is applied to a key with no pre-existing item, DynamoDB
does not create this item. So neither should we.

This patch includes a new test for this test_update_item_non_existent,
which passes on DynamoDB, failed on Alternator before this patch, and
passes after the patch.

Fixes #5862.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200221224221.31237-3-nyh@scylladb.com>
2020-02-24 10:38:10 +01:00
Nadav Har'El
e8cbbba653 alternator: partial implementation of ReturnValues parameter
Before this patch, we only supported the ReturnValues=NONE setting of the
PutItem, UpdateItem and DeleteItem operations.

This patch also adds full support for the ReturnValues=ALL_OLD option
in all three operation. This option directs Alternator to return the full
old (i.e., pre-modification) contents of the item.

We implement this as a RMW (read-modify-write) operation just as we do
other RMW operations - i.e., by default we use LWT, to ensure that we really
return the value of the item directly before the modification, the same
value that would have been used in a conditional expression if there was one.

NOTE: This implementation means one cannot use ReturnValues=ALL_OLD in
forbid_rmw write isolation mode. One may theorize that if we only need the
read-before-write for ReturnValues and not for a conditional expression,
it should have been enough to use a separate read (as we do in unsafe_rmw
isolation mode) before the write. But we don't have this "optimization" yet
and I'm not sure it's a valid optimization at all - see discussion in
a new issue #5851.

This patch completes the ReturnValues support for the PutItem and DeleteItem
operations. However, the third operation, UpdateItem, supports three more
ReturnValues modes: UPDATED_OLD, ALL_NEW and UPDATED_NEW. We do not yet
support those in this patch. If a user tries to use one of these three modes,
an informative error message will be returned. The three tests for these
three unimplemented settings continue to xfail, but the rest of the tests
in test_returnvalues.py (except one test of nested attribute paths) now
pass so their xfail flag is dropped.

Refs #5053

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200219135658.7158-1-nyh@scylladb.com>
2020-02-21 08:32:47 +01:00
Nadav Har'El
405115fa5f alternator: cleanup of get_string_attribute() function
The get_string_attribute() function used attribute_value->GetString()
to return an std::string. But this function does not actually return a
std::string - it returns a char*, which gets implicitly converted to
an std::string by looking for the first null character. This lookup is
unnecessary, because rjson already knows the length of the string, and
we can use it.

This patch is just a cleanup and a very small performance improvement -
I do not expect it fixes any bugs or changes anything functional, because
JSON strings anyway cannot contain verbatim nulls.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200219101159.26717-1-nyh@scylladb.com>
2020-02-19 11:59:54 +01:00
Avi Kivity
6c7aa18238 Merge "Introduce schema::get_partitioner" from Piotr
"
Introduce schema::get_partitioner and use it instead of dht::global_partitioner.

Fixes #5493

Tests: unit(dev, release, debug)
"

* 'per_table_partitioner_prep' of https://github.com/haaawk/scylla: (35 commits)
  cdc: stop using partitioners
  partitioner_test: stop calling set_global_partitioner
  storage_service: stop calling global_partitioner()
  mutation_writer_test: stop calling global_partitioner()
  schema: reduce number of global_partitioner() calls
  test_services: stop calling global_partitioner()
  sstable_utils: stop calling global_partitioner()
  sstable_resharding_test: stop depending on global partitioner
  sstable_mutation_test: stop calling global_partitioner()
  sstable_data_file_test: stop calling global_partitioner()
  random_schema: stop taking partitioner in constructor
  mutation_reader_test: stop calling global_partitioner()
  multishard_mutation_query_test: stop calling global_partitioner()
  row_level repair: stop calling global_partitioner()
  distribute_reader_and_consume_on_shards: don't take partitioner
  thrift: reduce global_partitioner() calls
  binary_search: stop calling global_partitioner()
  index_entry: stop calling global_partitioner()
  mc writer: stop calling global_partitioner()
  sstable: stop calling global_partitioner()
  ...
2020-02-17 18:12:53 +02:00
Piotr Jastrzebski
2d7532f87f dht: add dht::get_token
and replace all calls to dht::global_partitioner().get_token

dht::get_token is better because it takes schema and uses it
to obtain partitioner instead of using a global partitioner.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-17 10:59:15 +01:00
Piotr Jastrzebski
ca4a89d239 dht: add dht::decorate_key
and replace all dht::global_partitioner().decorate_key
with dht::decorate_key

It is an improvement because dht::decorate_key takes schema
and uses it to obtain partitioner instead of using global
partitioner as it was before.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-17 10:59:06 +01:00
Piotr Jastrzebski
abd76e566f dht::shard_of: stop calling global_partitioner()
Take const schema& as a parameter of shard_of and
use it to obtain partitioner instead of calling
global_partitioner().

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-17 10:23:16 +01:00
Pavel Solodovnikov
d64fd52ae5 paging_state: switch from shared_ptr to lw_shared_ptr
Change the way `service::pager::paging_state` is passed around
from `shared_ptr` to `lw_shared_ptr`. It's safe since
`paging_state` is final.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-02-16 17:23:36 +03:00
Nadav Har'El
b01b11c1f3 alternator: implement KeyConditionExpression
This patch adds to Alternator's Query operation full support for the
KeyConditionExpression parameter - a newer syntax for specifying which
partition and which sort-key range are to be queried. The older syntax
for the same thing, "KeyConditions", was already supported by Alternator.

The patch also includes additional test cases for more corner cases
discovered during the development. After this patch, all 47 test cases
in test_key_condition_expression.py pass on Alternator (and, of course,
also on DynamoDB).

One interesting thing to note about this patch is that it does *not*
include a new parser for the KeyConditionExpression syntax. It turns out
that we need - to be fully compatible with DynamoDB - to use the
already existing parser for *ConditionExpression* syntax, and then forbid
certain things not allowed in KeyConditionExpression (you can see a lot
of examples in code comments and in the tests included in this patch).
Most importantly, allowing the full ConditionExpression syntax also
means we allow completely useless parentheses on key conditions, e.g.,
'((p=:p) AND (c=:c))'. While the KeyConditionExpression documentation
doesn't mention allowing these parentheses, DynamoDB does support them -
and it turns out that boto3 uses them when you use its condition builders,
as we do in one test case (test_query_key_condition_expression).

Fixes #5037.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200213192509.32685-4-nyh@scylladb.com>
2020-02-16 11:22:30 +02:00
Nadav Har'El
15515b2cc1 alternator: more useful get_key_from_typed_value() utility function
We had a get_key_from_typed_value() utility function to decode a
JSON-encoded value with a known type (the JSON encoding is a map whose
key is the type, the value always a string because all possible key types -
string, bytes and number, are encoded as strings).

However, the function was less useful than it could have been - it was
missing one check for a malformed object (a check which only appeared in
one of its callers), it unnecessarily received the column's expected type
(all the callers passed it the given key column's type).

The cleaned up function will be more useful for the following patch
to support KeyConditionExpression, which wants to reuse it.

While at it, this patch also uses rjson::to_string_view(it->value)
instead of the less correct it->value.GetString() (the latter relies
on null-termination, which is actually true for JSON strings, but there
is no reason to rely on it).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200213192509.32685-3-nyh@scylladb.com>
2020-02-16 11:22:30 +02:00
Nadav Har'El
65d0a776c2 merge: alternator: Add keyspace per table
This series implements keyspace-per-table approach for Alternator.
The changes are as follows:
 - when a table is created, its keyspace is created first
 - after table deletion, its keyspace is deleted as well;
   works with views too, since these must be deleted
   before the base table is dropped
 - instead of SimpleStrategy, network topology is used

Keyspaces are created with a prefix not legal from CQL - 'a#'.
I validated that even though not reachable via CQL, keyspaces
created with # character work well and produce correct directories,
restarts work flawlessly too.

Fixes #5611
Refs #5596

Tests: alternator(local, remote)

Piotr Sarna (3):
  alternator: switch to keyspace-per-table approach
  alternator: move to NetworkTopologyStrategy
  alternator-test: add test for recreating a table
2020-02-16 11:22:30 +02:00
Piotr Sarna
fa4ddd2947 alternator: add validating write_isolation tag
In order to prevent users from using incorrect write isolation
configuration, a set of allowed values is introduced.
When tagging a resource (which is considered rare), a tag
will only be allowed if it belongs to the allowed set.
2020-02-13 13:51:31 +01:00
Piotr Sarna
7e6c9cad9a alternator: move rmw_operation to a header
rmw_operation is a class with a public interface, including
a write_isolation enum and a fixed tag name for its configuration.
For convenience, it's moved to a header file, so that code
from executor.cc can use the definitions regardless of their
position in the source file - it prevents reordering functions
just to make sure that rmw_operation is defined before a function
that uses its attributes.
2020-02-13 13:51:31 +01:00
Piotr Sarna
dca6c2c81d alternator: move to NetworkTopologyStrategy
Imstead of SimpleStrategy, NetworkTopologyStrategy is used
for setting up the replication configuration for alternator tables.
Replication factor 3 is used along with a local datacenter,
unless alternator discovers that it's running on a test cluster with
less than 3 nodes - then, RF is reduced accordingly and emits a warning,
which was also the case for SimpleStrategy.
2020-02-13 09:46:46 +01:00
Piotr Sarna
3eb6da224b alternator: switch to keyspace-per-table approach
Instead of a monolith alternator keyspace, each table creates its own
keyspace, named in the following pattern: `a#TABLE_NAME`.
The `a#` prefix contains an illegal CQL character in order to ensure
that these keyspaces are never created via CQL.
2020-02-13 09:46:19 +01:00
Piotr Sarna
dcf54331ea alternator: allow custom names for keyspaces
The maybe_create_keyspace utility now accepts a parameter - the desired
name for a newly created keyspace.
2020-02-13 09:16:37 +01:00
Nadav Har'El
b93204d6bf Alternator: allow CreateTable with streams explicitly turned off
While Alternator doesn't yet support creating a table with streams
(i.e., CDC) turned on, we should only failed the creation if streams
were really turned on. If the StreamSpecification option exists, but
does *not* ask to turn on streams, we should not fail the creation -
and this patch fixes this.

This patch also adds two tests - one where StreamSpecification is
passed but does not ask to turn on streams (so table creation should
succeed), and another test which explicitly requests to turn on
streams. The second test still xfails on Alternator, and should continue
to do so until we implement streams (we do *not* want to silently
ignore a request to turn on streams).

Fixes #5796

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200212100546.16337-1-nyh@scylladb.com>
2020-02-12 17:29:02 +01:00
Piotr Sarna
f4e51a96ca alternator: replace overloaded with overloaded_functor
Turns out we already have a utility header for a visitor
with overloaded lambdas. This patch purges the explicit
reimplementation of the same trick and uses the existing
class instead.
Message-Id: <60c0b9a978f8208b188ef6ddc0564cb133bed707.1581496049.git.sarna@scylladb.com>
2020-02-12 14:21:42 +02:00
Gleb Natapov
38fcab3db4 alternator: pass tracing state explicitly instead of relying on it been in the client_state
Multiple requests can use the same client_state simultaneously, so it is
not safe to use it as a container for a tracing state which is per
request. This is not yet an issue for the alternator since it creates
new client_state object for each request, but first of all it should not
and second trace state will be dropped from the client_state, by later
patch.
2020-02-10 14:50:55 +02:00
Piotr Sarna
4a9536b7c1 alternator: add configuring write isolation policy via tags
Until now, write isolation policy was hardcoded to always enforcing LWT.
From now on, setting a tag via UpdateTags request or during table
creation will associate a policy with given table.
The tag key is 'system:write_isolation' and its value can be one of:
 * 'f' - forbid RMW
 * 'a' - always enforce RMW
 * 'o' - only RMW writes will go through LWT
 * 'u' - unsafe RMW (to be deprecated/eradicated)
2020-02-06 10:26:26 +01:00
Piotr Sarna
0479a1bf67 alternator: make _write_isolation a protected attribute
No useful semantic changes yet, but it will help produce better
diffs for future patches.
2020-02-06 10:04:34 +01:00
Piotr Sarna
51c14cb1ce alternator: fix overwriting tags
Tagging a resource with a tag key that already exists should result
in overwriting the old value. It wasn't the case, so it's now fixed
and an appropriate test is added.
2020-02-06 10:04:34 +01:00
Piotr Sarna
ed940f000d alternator: return tags for a table via const reference
The signature of the helper function is changed, so that it's possible
to acquire a const reference of the tags, instead of being forced
to get a copy of the whole map (potentially large).
2020-02-06 10:04:34 +01:00
Nadav Har'El
9fd9ec14c2 storage_proxy: make it into a peering sharded service
We consider globals like service::get_storage_proxy() a bad idea,
and would like to reduce their use as much as possible - and eventually,
eliminate it completely.

One easy case to fix case is when we already have a shard-local proxy,
but now we need the sharded object, to invoke_on() something on it.

In this patch, we turn storage_proxy into a peering_sharded_service.
This means that if you already have a storage_proxy, you can call
its container() function to get the sharded<storage_proxy>, without
needing to call the global service::get_storage_proxy().

We found a few such cases in storage_proxy itself, and in Alternator,
and fixed them to use container() instead of the global function.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-02-05 21:14:18 +02:00
Nadav Har'El
95351016fd alternator: use LWT timestamp - in BatchWriteItems too
A previous patch fixed Alternator's writes to use the timestamp provided
by LWT instead of the current timestamp. That patch fixed the PutItem,
DeleteItem and UpdateItem operations - and this patch fixes the remaining
write operation: BatchWriteItems. So,

Fixes #5653.

Unfortunatly, the requirements of both BatchWriteItems and LWT make the
resulting code - and this patch - somewhat inelegant. BatchWriteItems
requires that we prepare all the operations first - failing if any of them
has an error. Before this patch, the result of this preparation was an
array of mutations, which in a second step we wrote to the database.
But we can no longer use mutations for the result of the first step,
because creating a mutation requires knowing the timestamp, which we don't
know during the preparate phase - we will only know it during the later
LWT operation. So now we need to invent a new intermediate format between
the request and the mutation. This intermediate format is further
complicated by the need to be send it between shards (for LWT's shard
forwarding) so it cannot, for example, contain a reference to a schema.
The fact that different sub-operations need to be sent to different shards,
and that different sub-operations may write to different tables, further
complicate the book-keeping and gives us a bunch of funky-typed maps.
But eventually it all fits together.

After this patch, as before this patch, the same code (now called
put_or_delete_item), is used to implement both the PutItem and DeleteItem
stand-alone operation, and the BachWriteItems operation which includes a
whole list of these PutItem and DeleteItem operation.

This patch also includes two more tests in test_batch.py, which test two
more corner tests we haven't tested before: One tests the capability of
BatchWriteItems to write to more than one table. The other tests that
BatchWriteItems can write an empty item (it is not surprising that it does,
but we do have special code for this case, so we should test it).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-02-05 21:14:18 +02:00
Nadav Har'El
acafcbfdf4 alternator: use LWT timestamp, not current timestamp
The DynamoDB API doesn't have the notion of client-supplied timestamps,
so the server is supposed to use its own current timestamp for write
operations.

However, for LWT writes, we should not use this node's current time:
Different nodes may slightly differ in their clocks, and LWT needs
a monotonically-increasing notion of time for the consistent operations.
LWT provides to the operation's apply() method the specific timestamp that
it should use in its returned mutation - and we should use this timestamp,
not the current timestamp.

In the optional write modes where LWT is not used, we continue to use
the current timestamp (api::new_timestamp()) as before.

This patch fixes the PutItem, UpdateItem and DeleteItem operations.
The BatchWriteItem operation is not yet fixed by this patch - fixing
it will require more elaborate code changes so will be done in a
separate patch.

Refs #5653.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200130122853.7658-1-nyh@scylladb.com>
2020-02-04 10:18:49 +01:00
Nadav Har'El
0a23471eae alternator: switch BatchWriteItems to use LWT too
Today, we use LWT for all PutItem, UpdateItem and GetItem operations.
We do this even for pure writes - writes which do not involve a read
before the write).

But BatchWriteItem also does pure writes - and it doesn't use LWT yet.
So this patch changes it so it does. As before we keep in the code -
not yet configurable by a user - also the option to do these unconditional
writes without LWT.

A BatchWriteItem may change multiple partitions (but a fairly low number -
DynamoDB allows each BatchWriteItem to only do 25 updates) and we start the
different LWT operations in parallel.

This patch collects multiple mutations to the same partition together to
be done with a single LWT operation, so we also add a test for this case,
were we have a batch of writes involving several items in each of several
partitions.

Fixes #5637

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200128160538.11775-1-nyh@scylladb.com>
2020-02-04 10:08:18 +01:00
Piotr Sarna
668e15643d alternator: allow tagging on table creation
During table creation, it's now possible to provide a 'Tags' parameter,
which will add tags to a newly created table. Note that creating a table
and tagging it is not atomic, so in case of failure it's possible to end
up with a created table, but without appropriate tags.
This commit comes with a test.
Message-Id: <00c2e202e9075d2c61e4ee5ba322ff4d5dbe718c.1579618972.git.sarna@scylladb.com>
2020-01-29 10:20:05 +01:00
Piotr Sarna
4c9f2f3c0a alternator: implement tagging
The following requests are implemented:
 - TagResource
 - UntagResource
 - ListTagsOfResource

Also, more tests are added for validating inputs, for both
arns, tag values and tag keys.

Message-Id: <a7ce9534ca580736fea445813fafef75a6139e29.1579618972.git.sarna@scylladb.com>
2020-01-29 10:20:05 +01:00
Piotr Sarna
a81640d402 alternator: replace top-level throws with returns in executor
In order to elide unnecessary throwing, all errors previously thrown
from top-level executor methods (the ones that handle user requests)
are now returned directly.
Message-Id: <73e05d1057ee842576fae11be9d77265ffb2e96f.1579515640.git.sarna@scylladb.com>
2020-01-28 12:39:23 +02:00
Piotr Sarna
854adf5b70 alternator: elide throwing in condition checks
Conditional updates inform the user that the condition is not met
by returning an error. An initial implementation was based on rethrowing
these errors, but returning them directly is considered better
for performance.
2020-01-28 12:39:23 +02:00
Piotr Sarna
a6a65abc3c alternator: change request return type to variant<value, error>
In order to minimize the use of exceptions during normal operations,
each request handler is now able to return either a proper JSON value,
or an instance of api_error, which indicates that something went wrong,
but without having to throw, catch and rethrow C++ exceptions.
This is especially important for conditional updates, since it's
expected to be common to return ConditionalCheckFailedException.
Message-Id: <d8996a0a270eb0d9db8fdcfb7046930b96781e69.1579515640.git.sarna@scylladb.com>
2020-01-28 12:39:23 +02:00
Nadav Har'El
b50274e8a7 alternator: add support for ConditionExpression
This patch adds support for the ConditionExpression parameter of the
item-writing operations in Alternator: PutItem, UpdateItem and DeleteItem.

We already supported conditional updates/put/delete using the "Expected"
parameter. The ConditionExpression parameter implemented here provides a
very similar feature, using a different - and also newer and more powerful -
syntax.

The implementation here reuses much of our existing expression-parsing
infrastructure. Unsurprisingly, ConditionExpression's syntax has much in
common with UpdateExpression which we already support) and also many of the
comparison functions already implemented for "Expected". However, it's still
quite a bit of new code, because of the many different comparisons, functions,
and syntax variations we need to support.

This patch also expands alternator-test/test_condition_expression.py with
a few additional corner cases discovered during the development of this
patch.

Almost all of the tests for this feature (35 out of 39) now pass.
Two tests still fail because we don't yet support nested attributes (this
is a missing feature across Alternator), and two tests fail because of minor
ideosyncracies in DynamoDB's error path that we chose not to duplicate
yet (but still remember the difference in the form of an xfailing test).

Fixes #5035

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-01-23 13:57:33 +02:00
Nadav Har'El
370b963ce5 alternator: reimplement read-modify-write operations using LWT
In this patch, we re-implement the three read-modify-write operations -
PutItem, UpdateItem, DeleteItem. All three operations may need to read the
item before writing it to support conditional updates (the "Expected"
parameter) and UpdateItem may also need the previous item's value for
its update expression (e.g., a user may ask to "set a=a+1" or "set a=b").

Before this patch, the implementation of RMW operations simply did a read,
and then a write - without any attempt to protect concurrent operations.

In this patch, Scylla's LWT mechanism (storage_proxy::cas()) is used
instead, to ensure that concurrent update operations are correctly
isolated even if they are conditional. This means that Alternator now
requires the experimental LWT feature to be enabled (and refuses to
boot if it isn't).

The version presented here is configured to always use LWT for *every*
write, regardless of whether it has a condition or not. So it will
will significantly slow down write-only workloads like YCSB. But the code
in this patch actually includes three other modes, which can be chosen by
setting an enum constant in the code. In the future we will want to let the
user configure this mode, globally, per table or per attribute.

Note that read requests are NOT modified, and work exactly as they did
before: i.e., strongly-consistent reads are done using a normal
CL=LOCAL_QUORUM read - not via LWT. I believe this is good enough given
Dynamo's guarantees, and critical for our read performance.

Also note that patch doesn't yet fix the BatchWriteItem operation.
Although BatchWriteItem does not support any RMW operations - just pure
writes - we may still need to do those pure writes using LWT. This
should be fixed in a follow-up patch.

Unfortunately, this patch involves a large amount of code movement and
reorganization, because:
1. The cas operation requires each operation to be made into an object,
   with a separate apply() function, forcing a lot of code to move.
2. Moreover, we need to do this for three different operations (PutItem,
   UpdateItem, DeleteItem) so to avoid massive code duplication, I had
   to move some common code.
3. The cas operation also forced us to change some of the utility functions'
   APIs.

The end result is that this patch focuses more on a compact and
understandable *end result* than it does on an easy to understand *patch*,
so reviewers - sorry about that.

All alternator-test/ tests pass with this patch (and also with all of the
different optional modes enabled). However, other than that, I did not yet
do any real isolation tests (are concurrent operations really isolated
correctly? or is LWT just faking it? :-) ), performance tests or stress
tests - and I'll definitely need to do those as well.

Fixes #5054

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-01-23 13:57:28 +02:00
Piotr Sarna
8c17b5aec4 alternator: add Arn support for tables
Several API-s, e.g. TagResource, UntagResource and ListTagsOfResource
rely on identifying tables by their "Arn". According to the docs,
an Arn should uniquely identify a resource, so it's implemented as:
  arn:KEYSPACE_NAME:TABLE_NAME
which is a minimal set of information that uniquely identifies a table
in Scylla. The `arn:` prefix is needed for compatibility purposes.

This commit adds a simple function for generating the Arn string,
and also includes it in DescribeTable result under the TableArn attribute.

Refs #5066
2020-01-20 12:24:51 +01:00
Nadav Har'El
5b08ec3d2c alternator: error on unsupported ScanIndexForward=false
We do not yet support the ScanIndexForward=false option for reversing
the sort order of a Query operation, as reported in issue #5153.
But even before implementing this feature, it is important that we
produce an error if a user attempts to use it - instead of outright
ignoring this parameter and giving the user wrong results. This is
what this patch does.

Before this patch, the reverse-order query in the xfailing test
test_query.py::test_query_reverse seems to succeed - yet gives
results in the wrong order. With this patch, the query itself fails -
stating that the ScanIndexForward=false argument is not supported.

Refs #5153

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200105113719.26326-1-nyh@scylladb.com>
2020-01-14 10:01:06 +02:00
Nadav Har'El
1f64a3bbc9 alternator: error on unsupported ReturnValues option
We don't support yet the ReturnValues option on PutItem, UpdateItem or
DeleteItem operations (see issue #5053), but if a user tries to use such
an option anyway, we silently ignore this option. It's better to fail,
reporting the unsupported option.

In this patch we check the ReturnValues option and if it is anything but
the supported default ("NONE"), we report an error.

Also added a test to confirm this fix. The test verifies that "NONE" is
allowed, and something which is unsupported (e.g., "DOG") is not ignored
but rather causes an error.

Refs #5053.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191216193310.20060-1-nyh@scylladb.com>
2020-01-03 15:48:20 +02:00
Nadav Har'El
fc85c49491 alternator: error on unsupported parallel scan
We do not yet support the parallel Scan options (TotalSegments, Segment),
as reported in issue #5059. But even before implementing this feature, it
is important that we produce an error if a user attempts to use it - instead
of outright ignoring this parameter. This is what this patch does.

The patch also adds a full test, test_scan.py::test_scan_parallel, for the
parallel scan feature. The test passes on DynamoDB, and still xfails
on Alternator after this patch - but now the Scan request fails immediately
reporting the unsupported option - instead of what the pre-patch code did:
returning the wrong results and the test failing just when the results
do not match the expectations.

Refs #5059.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191217084917.26191-1-nyh@scylladb.com>
2019-12-17 11:27:56 +02:00