Commit Graph

262 Commits

Author SHA1 Message Date
Piotr Sarna
495b7b5596 alternator: use unique_ptr for storing attribute paths
Previous commit eliminated the only copying of the attribute paths,
so it's now safe to make the object noncopyable.
Message-Id: <5468e8c17d3d42a03c1dd33706bbaac0c58959ce.1613398751.git.sarna@scylladb.com>
2021-02-15 18:22:59 +02:00
Piotr Sarna
7e1641224c alternator: batch: pass attrs_to_get by a shared pointer
The attrs_to_get object was previously copied, but it's quite
a heavyweight operations, since this object may contain an
instance of std::map or std::unordered_map.
To avoid copying whole maps, the object is wrapped in a shared
const pointer.
Message-Id: <75ad810de16c630b65ae8d319cb4b37e1de8085f.1613398751.git.sarna@scylladb.com>
2021-02-15 18:22:56 +02:00
Nadav Har'El
49cd9b3fd5 alternator: correct implemention of UpdateItem with nested attributes and ReturnValues
This patch fixes the last missing part of nested attribute support in
UpdateItem - returning the correct attributes when ReturnValues is requested.
When the expression says "a.b = :val" and ReturnValues is set to UPDATED_OLD
or UPDATED_NEW, only the actual updated attribute a.b should be returned, not
the entire top-level attribute a as we did before this patch.

This patch was made very simple because our existing hierarchy_filter()
function already does exactly the right thing, and can trivially be made to
accept any attribute_path_map<T> (in our case attribute_path_map<action>),
not just attrs_to_get as it did until now.

This patch also adds several more checks to the test in test_returnvalues.py
to improve the test's coverage even more. Interestingly, I discovered two
esoteric cases where DynamoDB does something which makes little sense, but
apparently simplified their implementation - but the beautiful thing is that
it also simplifies our implementation! See long comments about these two
cases in the test code.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
964500e47a alternator: fix bug in ReturnValues=UPDATED_NEW
Commit 0c460927bf broke UpdateItem's
ReturnValues=UPDATED_NEW by moving previous_item while it is still
needed. None of the existing tests broke because none of them needed
previous_item after it was moved - but it started to break when we
add support for nested attribute paths, which need this previous_item.

So this patch returns the move to a copy, as it was before the
aforementioned patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
33685a683e alternator: implemented nested attribute paths in UpdateExpression
This patch adds full support for nested attribute paths (e.g., a.b[3].c)
in UpdateExpression. After in previous patches we already added such
support for ProjectionExpression, ConditionExpression and FilterExpression
this means the nested attribute paths feature is now complete, so we
remove the warning from the documents. However, there is one last loose
end to tie and we will do it in the next patch: After this patch, the
combination of UpdateExpression with nested attributes and ReturnValues
is still wrong, and the test for it in test_returnvalues.py still xfails.

Note that previous patches already implemented support for attribute paths
in expression evaluations - i.e., the right-hand side of UpdateExpression
actions, and in this patch we just needed to implement the left hand side:
When an update action is on an attribute a.b we need to read the entire
content of the top-level a (an RWM operation), modify just the b part of
its json with the result of the action, and finally write back the entire
content of a. Of course everything gets complicated by the fact that we
can have multiple actions on multiple pieces of the same JSON, and we also
need to detect overlapping and conflicting actions (we already have this
detection in the attribute_path_map<> class we introduced in a previous
patch).

I decided to leave one small esoteric difference, reproduced by the xfailing
test_update_expression.py::test_nested_attribute_remove_from_missing_item:
As expected, "SET x.y = :val" fails for an item if its attribute x doesn't
exist or the item itself does not exist. For the update expression
"REMOVE x.y", DynamoDB fails if the attribute x doesn't exist, but oddly
silently passes if the entire item doesn't exist. Alternator does not
currently reproduce this oddity - it will fail this write as well.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
4c7e27c688 alternator: prepare for UpdateItem nested attribute paths
This patch prepares UpdateItem for updating of nested attribute paths
(e.g., "SET a.b = :val"), but does not yet support them.

Instead of _update_expression holding an unsorted list of "actions",
we change it to hold a attribute_path_map of actions. This will allow
us to process all the actions on a top-level attribute together, and
moreover gets us "for free" the correct checking for overlapping and
conflicting updates - exactly the same checking we already had in
attribute_path_map for ProjectionExpression. Other than this change,
most of this patch is just code movement, not functional changes.

After this patch, the tests for update path overlap and conflict pass:
test_update_expression_multi_overlap_nested and
test_update_expression_multi_conflict_nested.

We can also mark test_update_expression_nested_attribute_rhs as passing -
this test involves an attribute path in the right-hand-side of an update,
but the left-hand-side is still a top-level attribute, so it works (it
actually worked before this patch - it started working when we implemented
attribute paths in expressions, for ConditionExpression and
FilterExpression).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
7c5db2da83 alternator: overhaul ProjectionExpression hierarchy implementation
For ProjectionExpression we implemented a hierarchical filter object which
can be used to hold a tree of attribute paths groups by a the top-level
attributes, and also detect overlapping and conflicting entries.

For UpdateExpression, we need almost exactly the same object: We need to
group update actions (e.g., SET a.b=3) by the top-level attribute, and
also detect and fail overlapping or conflicting paths.

So in this patch we rewrite the data structure we had for ProjectionExpression
in a more genric manner, using the template attribute_path_map<T> - which
holds data of type T for each attribute path. We also implement a template
function attribute_path_map_add() to add a path/value pair to this map,
and includes all the overlap and conflict detecting logic.

There shouldn't be functional changes in this patch. The ProjectionExpression
code uses the new generic code instead of the specific code, but should work
the same. In the next patch we can use the new generic code to implement
UpdateExpression as well.

The only somewhat functional change is better error messages for
conflicting or overlapping paths - which now include one of the
conflicting paths.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
104ef5242b alternator: support attribute paths in ProjectionExpression
This patch fully implements support for attribute paths (e.g. a.b.c, a.d[3])
for the ProjectionExpression in the various operations where this parameter
is supported - GetItem, BatchGetItem, Query and Scan. After this patch, all
xfailing tests in test_projection_expression.py now pass.

In the previous patch we remembered in the "attrs_to_get" object not only
the top-level attributes to read from the table, but also how to filter
from it only the desired pieces of the nested document. In this patch we
add a filter() function to do this filtering, and call it in the right
places to post-process the JSON objects we read from the table.

We also had to fix reference resolution in paths to resolve all the
components of the path (e.g., #name1.#name2) and not just the top-level
attribute.

This is not the end of attribute path support, there are still other
expressions (ConditionExpression, UpdateExpression, FilterExpression,
ReturnValues) where they are not yet supported. This will come in following
patches.

Refs #5024

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-08 14:16:40 +02:00
Nadav Har'El
6340619e69 alternator: overhaul attrs_to_get handling
In the existing code, the variable "attrs_to_get" is a list of top-level
attributes to fetch for an item. It is used to implement features like
ProjectionExpression or AttributesToGet in GetItem and other places.

However, to support attribute paths (e.g., a.b.c[2]) in ProjectionExpression,
i.e., issue #5024, we need more than that. We still need to know the top-
level attribute "a", because this is the granularity we have in the Scylla
table (all the content inside "a" is serialized as a single JSON); But we
also need to remember exactly which parts *inside* "a" we will need to
extract and return.

So in this patch we add a new type, "attrs_to_get", which is more than
just a list of top-level attributes. Instead, it is a *map*, whose keys
are the top-level attributes, and the value for each of them is a
"hierarchy_filter", an object which describes which part of the attribute
is needed.

This patch includes the code which converts the AttributesToGet and
ProjectionExpression into the new attrs_to_get structure. During this
conversion, we recognize two kinds of errors which DynamoDB complains
about: We recognize "overlapping" attributes (e.g., requesting both
a.b and a.b.c) and "conflicting" attributes (e.g, requesting both
a.b and a[1]). After this, two xfailing tests we had for detecting
these overlap and conflicts finally pass and their "xfail" label is
removed.

After this patch, we have the attrs_to_get object which can allow us
to filter only the requested pieces of the top-level attributes, but
we don't use it yet - so this patch is not enough for complete support
of attribute paths in ProjectionExpression. We will complete this
support in the next patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-08 14:16:40 +02:00
Piotr Sarna
6ae94d31c1 treewide: remove shared pointer usage from the pager
The pager interface doesn't really need to be virtual,
so the next step could be to remove the need for pointers
entirely, but migrating from shared_ptr to unique_ptr
is a low-hanging fruit.

Message-Id: <a5bdecb17ae58e914da020fb58a41f4574565c66.1610709560.git.sarna@scylladb.com>
2021-01-15 15:03:14 +02:00
Piotr Sarna
12b5184933 alternator: drop unneeded sstring creation
It's now possible to use string views to check if a particular
table is a system table, so it's no longer needed to explicitly
create an sstring instance.
2021-01-04 09:47:01 +01:00
Gleb Natapov
d3aa17591c migration_manager: drop announce_locally flag
It looks like the history of the flag begins in Cassandra's
https://issues.apache.org/jira/browse/CASSANDRA-7327 where it is
introduced to speedup tests by not needing to start the gossiper.
The thing is we always start gossiper in our cql tests, so the flag only
introduce noise. And, of course, since we want to move schema to use raft
it goes against the nature of the raft to be able to apply modification only
locally, so we better get rid of the capability ASAP.

Tests: units(dev, debug)
Message-Id: <20201230111101.4037543-2-gleb@scylladb.com>
2021-01-03 13:58:09 +02:00
Piotr Sarna
3b26fc01c2 alternator: coroutinize untagging a resource
Historically, a seastar thread was used for this request
because it's not on a critical path, but a coroutine makes
the code simpler.
2020-12-23 15:53:57 +01:00
Piotr Sarna
1ca39cc8c1 alternator: coroutinize tagging a resource
Historically, a seastar thread was used for this request
because it's not on a critical path, but a coroutine makes
the code simpler.
2020-12-23 15:53:57 +01:00
Nadav Har'El
a8fdbf31cd alternator: fix UpdateItem ADD for non-existent attribute
UpdateItem's "ADD" operation usually adds elements to an existing set
or adds a number to an existing counter. But it can *also* be used
to create a new set or counter (as if adding to an empty set or zero).

We unfortunately did not have a test for this case (creating a new set
or counter), and when I wrote such a test now, I discovered the
implementation was missing. So this patch adds both the test and the
implementation. The new test used to fail before this patch, and passes
with it - and passes on DynamoDB.

Note that we only had this bug for the newer UpdateItem syntax.
For the old AttributeUpdates syntax, we already support ADD actions
on missing attributes, and already tested it in test_update_item_add().
I just forgot to test the same thing for the newer syntax, so I missed
this bug :-(

Fixes #7763.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201207085135.2551845-1-nyh@scylladb.com>
2020-12-09 18:44:30 +01:00
Nadav Har'El
781f9d9aca alternator: make default timeout configurable
Whereas in CQL the client can pass a timeout parameter to the server, in
the DynamoDB API there is no such feature; The server needs to choose
reasonable timeouts for its own internal operations - e.g., writes to disk,
querying other replicas, etc.

Until now, Alternator had a fixed timeout of 10 seconds for its
requests. This choice was reasonable - it is much higher than we expect
during normal operations, and still lower than the client-side timeouts
that some DynamoDB libraries have (boto3 has a one-minute timeout).
However, there's nothing holy about this number of 10 seconds, some
installations might want to change this default.

So this patch adds a configuration option, "--alternator-timeout-in-ms",
to choose this timeout. As before, it defaults to 10 seconds (10,000ms).

In particular, some test runs are unusually slow - consider for example
testing a debug build (which is already very slow) in an extremely
over-comitted test host. In some cases (see issue #7706) we noticed
the 10 second timeout was not enough. So in this patch we increase the
default timeout chosen in the "test/alternator/run" script to 30 seconds.

Please note that as the code is structured today, this timeout only
applies to some operations, such as GetItem, UpdateItem or Scan, but
does not apply to CreateTable, for example. This is a pre-existing
issue that this patch does not change.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201207122758.2570332-1-nyh@scylladb.com>
2020-12-09 14:30:43 +01:00
Nadav Har'El
86779664f4 alternator: fix broken Scan/Query paging with bytes keys
When an Alternator table has partition keys or sort keys of type "bytes"
(blobs), a Scan or Query which required paging used to fail - we used
an incorrect function to output LastEvaluatedKey (which tells the user
where to continue at the next page), and this incorrect function was
correct for strings and numbers - but NOT for bytes (for bytes, we
need to encode them as base-64).

This patch also includes two tests - for bytes partition key and
for bytes sort key - that failed before this patch and now pass.
The test test_fetch_from_system_tables also used to fail after a
Limit was added to it, because one of the tables it scans had a bytes
key. That test is also fixed by this patch.

Fixes #7768

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201207175957.2585456-1-nyh@scylladb.com>
2020-12-08 09:38:23 +01:00
Nadav Har'El
78c598e08e alternator: add missing TableId field to DescribeTable response
DescribeTable should return a UUID "TableId" in its reponse.
We alread had it for CreateTable, and now this patch adds it to
DescribeTable.

The test for this feature is no longer xfail. Moreover, I improved
the test to not only check that the TableId field is present - it
should also match the documented regular expression (the standard
representation of a UUID).

Refs #5026

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201104114234.363046-1-nyh@scylladb.com>
2020-11-09 20:21:47 +01:00
Nadav Har'El
2fc3a30b45 alternator: forbid combining old and new-style parameters
The DynamoDB API has for the Query and Scan requests two filtering
syntaxes - the old (QueryFilter or ScanFilter) and the new (FilterExpression).
Also for projection, it has an old syntax (AttributesToGet) and a new
one (ProjectionExpression). Combining an old-style and new-style parameter
is forbidden by DynamoDB, and should also be forbidden by Alternator.

This patch fixes, and removes the "xfails" tag, of two tests:
  test_query_filter.py::test_query_filter_and_projection_expression
  test_filter_expression.py::test_filter_expression_and_attributes_to_get

Refs #6951

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-10-05 02:19:22 +03:00
Nadav Har'El
282742a469 alternator: fix query with both projection and filtering
We had a bug when a Query/Scan had both projection (ProjectionExpression
or AttributesToGet) and filtering (FilterExpression or Query/ScanFilter).
The problem was that projection left only the requested attributes, and
the filter might have needed - and not got - additional attributes.

The solution in this patch is to add the generated JSON item also
the extra attributes needed by filtering (if any), run the filter on
that, and only at the end remove the extra filtering attributes from
the item to be returned.

The two tests

 test_query_filter.py::test_query_filter_and_attributes_to_get
 test_filter_expression.py::test_filter_expression_and_projection_expression

Which failed before this patch now pass so we drop their "xfail" tag.

Fixes #6951.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-10-05 02:19:22 +03:00
Nadav Har'El
5e8bdf6877 alternator: fix corruption of PutItem operation in case of contention
This patch fixes a bug noted in issue #7218 - where PutItem operations
sometimes lose part of the item's data - some attributes were lost,
and the name of other attributes replaced by empty strings. The problem
happened when the write-isolation policy was LWT and there was contention
of writes to the same partition (not necessarily the same item).

To use CAS (a.k.a. LWT), Alternator builds an alternator::rmw_operation
object with an apply() function which takes the old contents of the item
(if needed) and a timestamp, and builds a mutation that the CAS should
apply. In the case of the PutItem operation, we wrongly assumed that apply()
will be called only once - so as an optimization the strings saved in the
put_item_operation were moved into the returned mutation. But this
optimization is wrong - when there is contention, apply() may be called
again when the changed proposed by the previous one was not accepted by
the Paxos protocol.

The fix is to change the one place where put_item_operation *moved* strings
out of the saved operations into the mutations, to be a copy. But to prevent
this sort of bug from reoccuring in future code, this patch enlists the
compiler to help us verify that it can't happen: The apply() function is
marked "const" - it can use the information in the operation to build the
mutation, but it can never modify this information or move things out of it,
so it will be fine to call this function twice.

The single output field that apply() does write (_return_attributes) is
marked "mutable" to allow the const apply() to write to it anyway. Because
apply() might be called twice, it is important that if some apply()
implementation sometimes sets _return_attributes, then it must always
set it (even if to the default, empty, value) on every call to apply().

The const apply() means that the compiler verfies for us that I didn't
forget to fix additional wrong std::move()s. Additionally, a test I wrote
to easily reproduce issue #7218 (which I will submit as a dtest later)
passes after this fix.

Fixes #7218.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200916064906.333420-1-nyh@scylladb.com>
2020-09-16 10:30:19 +02:00
Calle Wilund
f16792aad0 alternator: Alloc BILLING_MODE in update_table
While it does not do anything, we want something to
update for testing (dynamo python libs refuse empty
update).
2020-09-07 14:15:17 +00:00
Calle Wilund
f5c79d15a8 alternator: Include stream arn in table description if enabled
Fixes #7157

When creating/altering/describing a table, if streams are enabled, the
"latest active" stream arn should be included as LatestStreamArn.

Not doing so breaks java kinesis.
2020-09-07 08:16:11 +00:00
Nadav Har'El
4c73d43153 Alternator: allow CreateTable with SSESpecification explicitly disabled
While Alternator doesn't yet support creating a table with a different
"server-side encryption" (a.k.a. encryption-at-rest) parameters, the
SSESpecification option with Enabled=false should still be allowed, as
it is just the default, and means exactly the same as would a missing
SSESpecification.

This patch also adds a test for this case, which failed on Alternator
before this patch.

Fixes #7031.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200812205853.173846-1-nyh@scylladb.com>
2020-08-17 13:48:52 +02:00
Piotr Jastrzebski
01ea159fde codebase wide: use try_emplace when appropriate
C++17 introduced try_emplace for maps to replace a pattern:
if(element not in a map) {
    map.emplace(...)
}

try_emplace is more efficient and results in a more concise code.

This commit introduces usage of try_emplace when it's appropriate.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <4970091ed770e233884633bf6d46111369e7d2dd.1597327358.git.piotr@scylladb.com>
2020-08-16 14:41:09 +03:00
Piotr Jastrzebski
c001374636 codebase wide: replace count with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
`count` function was often used in various ways.

`contains` does not only express the intend of the code better but also
does it in more unified way.

This commit replaces all the occurences of the `count` with the
`contains`.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
2020-08-15 20:26:02 +03:00
Wojciech Mitros
45215746fe increase the maximum size of query results to 2^64
Currently, we cannot select more than 2^32 rows from a table because we are limited by types of
variables containing the numbers of rows. This patch changes these types and sets new limits.

The new limits take effect while selecting all rows from a table - custom limits of rows in a result
stay the same (2^32-1).

In classes which are being serialized and used in messaging, in order to be able to process queries
originating from older nodes, the top 32 bits of new integers are optional and stay at the end
of the class - if they're absent we assume they equal 0.

The backward compatibility was tested by querying an older node for a paged selection, using the
received paging_state with the same select statement on an upgraded node, and comparing the returned
rows with the result generated for the same query by the older node, additionally checking if the
paging_state returned by the upgraded node contained new fields with correct values. Also verified
if the older node simply ignores the top 32 bits of the remaining rows number when handling a query
with a paging_state originating from an upgraded node by generating and sending such a query to
an older node and checking the paging_state in the reply(using python driver).

Fixes #5101.
2020-08-03 17:32:49 +02:00
Botond Dénes
92a7b16cba query: read_command: add max_result_size
This field will replace max size which is currently passed once per
established rpc connection via the CLIENT_ID verb and stored as an
auxiliary value on the client_info. For now it is unused, but we update
all sites creating a read command to pass the correct value to it. In the
next patch we will phase out the old max size and use this field to pass
max size on each verb instead.
2020-07-28 18:00:29 +03:00
Botond Dénes
8992bcd1f8 query: read_command: use tagged ints for limit ctor params
The convenience constructor of read_command now has two integer
parameter next to each other. In the next patch we intend to add another
one. This is recipe for disaster, so to avoid mistakes this patch
converts these parameters to tagged integers. This makes sure callers
pass what they meant to pass. As a matter of fact, while fixing up
call-sites, I already found several ones passing `query::max_partitions`
to the `row_limit` parameter. No harm done yet, as
`query::max_partitions` == `query::max_rows` but this shows just how
easy it is to mix up parameters with the same type.
2020-07-28 18:00:29 +03:00
Piotr Sarna
d08e22c4eb alternator: fix tracing BatchGetItem
The BatchGetItem request did not pass its trace state to lower layers
in a correct manner, which resulted in losing tracing information.

Refs #6891
Message-Id: <078f58a0f76b9f182f671a8d16e147ded489138c.1595515815.git.sarna@scylladb.com>
2020-07-23 20:05:10 +03:00
Piotr Sarna
7256572e41 alternator: fix tracing GetItem
The GetItem request did not pass the trace state properly,
which resulted in having almost empty traces.

Refs #6891

Tests: manual:

Before:
 session_id                           | event_id                             | activity                                                                                                               | scylla_parent_id | scylla_span_id  | source    | source_elapsed | thread
--------------------------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------------------+------------------+-----------------+-----------+----------------+---------
 57995da0-cce4-11ea-97ea-000000000000 | 579971c4-cce4-11ea-97ea-000000000000 |                                                                                                                GetItem |                0 | 131309406144163 | 127.0.0.1 |              0 | shard 0

After:
 session_id                           | event_id                             | activity                                                                                                               | scylla_parent_id | scylla_span_id  | source    | source_elapsed | thread
--------------------------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------------------+------------------+-----------------+-----------+----------------+---------
 57995da0-cce4-11ea-97ea-000000000000 | 579971c4-cce4-11ea-97ea-000000000000 |                                                                                                                GetItem |                0 | 131309406144163 | 127.0.0.1 |              0 | shard 0
 57995da0-cce4-11ea-97ea-000000000000 | 57997327-cce4-11ea-97ea-000000000000 | Creating read executor for token -7535857341981351089 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE |                0 | 131309406144163 | 127.0.0.1 |             35 | shard 0
 57995da0-cce4-11ea-97ea-000000000000 | 5799733d-cce4-11ea-97ea-000000000000 |                                                                                            read_data: querying locally |                0 | 131309406144163 | 127.0.0.1 |             38 | shard 0
 57995da0-cce4-11ea-97ea-000000000000 | 57997358-cce4-11ea-97ea-000000000000 |                                                   Start querying the token range that starts with -7535857341981351089 |                0 | 131309406144163 | 127.0.0.1 |             40 | shard 0
 57995da0-cce4-11ea-97ea-000000000000 | 57997579-cce4-11ea-97ea-000000000000 |                                                                                                       Querying is done |                0 | 131309406144163 | 127.0.0.1 |             95 | shard 0
Message-Id: <d585ff7aaaeebf2050890643d40cdafb2efb8d98.1595509338.git.sarna@scylladb.com>
2020-07-23 20:05:06 +03:00
Nadav Har'El
06ba0c0232 alternator: use api_error factory functions in executor.cc
All the places in executor.cc where we constructed an api_error with inline
strings now use api_error factory functions. Most of them, but not all of
them, were api_error::validation(). We also needed to add a couple more of
these factory functions.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-07-23 15:36:39 +03:00
Nadav Har'El
5a35632cd3 alternator: refactor api_error class
In the patch "Add exception overloads for Dynamo types", Alternator's single
api_error exception type was replaced by a more complex hierarchy of types.
The implementation was not only longer and more complex to understand -
I believe it also negated an important observation:

The "api_error" exception type is special. It is not an exception created
by code for other code. It is not meant to be caught in Alternator code.
Instead, it is supposed to contain an error message created for the *user*,
containing one of the few supported exception exception "names" described
in the DynamoDB documentation, and a user-readable text message. Throwing
such an exception in Alternator code means the thrower wants the request
to abort immediately, and this message to reach the user. These exceptions
are not designed to be caught in Alternator code. Code should use other
exceptions - or alternatives to exceptions (e.g., std::optional) for
problems that should be handled before returning a different error to the
user. Moreover, "api_error" isn't just thrown as an exception - it can
also be returned-by-value in a executor::request_return_type) - which is
another reason why it should not be subclassed.

For these reasons, I believe we should have a single api_error type, and
it's wrong to subclass it. So in this patch I am reverting the subclasses
and template added in the aforementioned patch.

Still, one correct observation made in that patch was that it is
inconvenient to type in DynamoDB exception names (no help from the editor
in completing those strings) and also error-prone. In this patch we
propse a different - simpler - solution to the same problem:

We add trivial factory functions, e.g., api_error::validation(std::string)
as a shortcut to api_error("ValidationException"). The new implementation
is easy to understand, and also more self explanatory to readers:
It is now clear that "api_error::validation()" is actually a user-visible
"api_error", something which was obscured by the name validation_exception()
used before this patch.

Finally, this patch also improves the comment in error.hh explaining the
purpose of api_error and the fact it can be returned or thrown. The fact
it should not be subclassed is legislated with a "finally". There is also
no point of this class inheriting from std::exception or having virtual
functions, or an empty constructor - so all these are dropped as well.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-07-23 15:36:39 +03:00
Calle Wilund
cbb70f4af4 executor: "UpdateTable" support for streams
Partial implementation of the "UpdateTable" command.
Supports only enabling/disabling streams.
2020-07-15 08:21:34 +00:00
Calle Wilund
45ee73969d executor: Allow streams specification in CreateTable schema 2020-07-15 08:21:34 +00:00
Calle Wilund
3756febbf5 alternator: expose describe_single_item and default_timeout
To be able to describe single alternator items from other files.
And query with the default timeout.
2020-07-15 08:10:23 +00:00
Calle Wilund
e382d79bcd executor: Make some helper and subroutines class-visible
Subroutines needed by (in this case) streams implementation
moved from being file-static to class-static (exported).
To make putting handler routines in separate sources possible.
Because executor.cc is large and slow to compile.
Separation is nice.

Unfortunately, not all methods can be kept class-private,
since unrelated types also use them.

Reviewer suggested to instead place there is a top-level
header for export, i.e. not class-private at all.
I am skipping that for now, mainly because I can't come up
with a good file name. Can be part of a generate refactor
of helper routine organization in executor.
2020-07-15 08:10:23 +00:00
Amnon Heiman
edd3c97364 alternator: change estimated_histogram to time_estimated_histogram
This patch moves the alternator latencies histograms to use the time_estimated_histogram.
The changes requires changing the defined type and use the simpler
insertion method.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2020-07-14 11:17:43 +03:00
Nadav Har'El
35f7048228 alternator: CreateTable with bad Tags shouldn't create a table
Currently, if a user tries to CreateTable with a forbidden set of tags,
e.g., the Tags list is too long or contains an invalid value for
system:write_isolation, then the CreateTable request fails but the table
is still created. Without the tag of course.

This patch fixes this bug, and adds two test cases for it that fail
before this patch, and succeed with it. One of the test cases is
scylla_only because it checks the Scylla-specific system:write_isolation
tag, but the second test case works on DynamoDB as well.

What this patch does is to split the update_tags() function into two
parts - the first part just parses the Tags, validates them, and builds
a map. Only the second part actually writes the tags to the schema.
CreateTable now does the first part early, before creating the table,
so failure in parsing or validating the Tags will not leave a created
table behind.

Fixes #6809.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200713120611.767736-1-nyh@scylladb.com>
2020-07-13 17:14:44 +03:00
Piotr Sarna
df91e9a4c7 alternator: clean up string view conversions
Manual translation from JSON to string_view is replaced
with rjson::to_string_view helper function. In one place,
a redundant string_view intermediary is removed
in favor of creating the string straight from JSON.
Message-Id: <2aa9d9fedd73f14b7640870d14db4f2f0bd7bd8a.1592936139.git.sarna@scylladb.com>
2020-06-23 21:45:27 +03:00
Piotr Sarna
4558401aee alternator: drop using global migration manager
As part of "war on globals", the unneeded usage of global migration manager
instance is dropped.
Message-Id: <c9b2fab57e62185daa2441458f9a3a5e7e0a3908.1592936139.git.sarna@scylladb.com>
2020-06-23 21:43:57 +03:00
Piotr Sarna
f4e8cfe03b alternator: fix propagating tags
Updating tags was erroneously done locally, which means that
the schema change was not propagated to other nodes.
The new code announces new schema globally.

Fixes #6513
Branches: 4.0,4.1
Tests: unit(dev)
       dtest(alternator_tests.AlternatorTest.test_update_condition_expression_and_write_isolation)
Message-Id: <3a816c4ecc33c03af4f36e51b11f195c231e7ce1.1592935039.git.sarna@scylladb.com>
2020-06-23 21:27:55 +03:00
Piotr Sarna
7480015721 cql3, service: decouple cql_stats from query pagers
Pager belongs to a different layer than CQL and thus should not be
coupled with CQL stats - if any different frontends want to use paging,
they shouldn't be forced to instantiate CQL stats at all.

Same goes with CQL restrictions, but that will require much bigger
refactoring, so is left for later.

Message-Id: <5585eb470949e3457334ffd6dba80742abf3a631.1592902295.git.sarna@scylladb.com>
2020-06-23 19:40:18 +03:00
Piotr Sarna
45bf039357 alternator: use has_function instead of try-catch
With the new interface available, the try-catch idiom
can be removed, thus resolving a TODO.

Tests: unit(dev)

Message-Id: <788a29f8f9d7bcf952b28a6148670dbadb97a619.1592233511.git.sarna@scylladb.com>
2020-06-15 23:55:20 +03:00
Piotr Sarna
e76fba6f86 alternator: remove outdated TODO for adding timeouts
The TODO is already fixed, not to mention that it had
an incorrect ordinal number (:
Message-Id: <006dc3061e0f30641c2e63ff471686f4c2e82829.1592230155.git.sarna@scylladb.com>
2020-06-15 23:04:42 +03:00
Nadav Har'El
8c026b9f10 alternator: move some code out of executor.cc
The source file alternator/executor.cc has grown too much, reaching almost
4,000 lines. In this patch I move about 400 lines out of executor.cc:

1. Some functions related to serialization of sets and lists were moved to
   serialization.cc,
2. Functions related to evaluating parsed expressions were moved to
   expressions.cc.

The header file expressions_eval.hh was also removed - the calculate_value()
functions now live in expressions.cc, so we can just define them in
expressions.hh, no need for a separate header files.

This patch just moves code around. It doesn't make any functional changes.

Refs #5783.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-06-14 12:16:26 +03:00
Nadav Har'El
0b9f25ab50 alternator: implement FilterExpression
This patch provides a complete implementation for the FilterExpression
parameter - the newer syntax for filtering the results of the Query or
Scan operations.

The implementation is pretty straightforward - we already added earlier
a result-filtering framework to Alternator, and used it for the older
filtering syntax - QuryFilter and ScanFilter. All we had to do now was
to run the FilterExpression (which has the same syntax as a
ConditionExpression) on each individual items. The previous cleanup
patches were important to reduce the friction of running these expressions
on the items.

After the previous patches fixing small esoteric bugs in a few expression
functions, with this patch *all* the tests in test_filter_expression.py
now pass, and so do the two FilterExpression tests in test_query.py and
test_scan.py. As far as I know (and of course minus any bugs we'll discover
later), this marks the FilterExpression feature complete.

Fixes #5038.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-06-14 12:16:26 +03:00
Nadav Har'El
f87259a762 alternator: improve error path of attribute_type() function
The attribute_type() function, which can be used in expressions like
ConditionExpression and FilterExpression, is supposed to generate an
error if its second parameter is not one of the known types. What we
did until now was to just report a failed check in this case.

We already had a reproducing test with FilterExpression, but in this patch
we also add a test with ConditionExpression - which fails before this
patch and passes afterwards (and of course, passes with DynamoDB).

Fixes #6641.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-06-14 12:16:20 +03:00
Nadav Har'El
11d86dfb06 alternator: fix begins_with() error path
The begins_with() function should report an error if a constant is
passed to it which isn't one of the supported types - string or bytes
(e.g., a number).

The code we had to check this had wrong logic, though. If the item
attribute was also a number, we silently returned false, and didn't
go on to detect that the second parameter - a constant - was a number
too and should generate an error - not be silent.

Fixed and added a reproducing test case and another test to validate
my understanding of the type of parameters that begins_with() accepts.

Fixes #6640.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-06-14 12:13:23 +03:00
Nadav Har'El
13ef31f38b alternator: refactor resolving of references in expressions
In the DynamoDB API, expressions (e.g., ConditionExpression and many more)
may contain references to column names ("#name") or to values (":val")
given in a separate part of the request - ExpressionAttributeNames and
ExpressionAttributeValues respectively.

Before this patch, we resolved these references as part of the expression's
evaluation. This approach had two downsides:

1. It often misdiagnosed (both false negatives and false positives) cases
   of unused names and values in expressions. We already had two xfailing
   tests with examples - which pass after this patch. This patch also
   adds two additional tests, which failed before this patch and pass
   with it.

2. In one of the following patches we will add support for FilterExpression,
   where the same expression is used repeatedly on many items. It is a waste
   (as well as makes the code uglier) to resolve the same references again
   and again each time the expression is evaluated. We should be able
   to do it just once.

So this patch introduces an intermediate step between parsing and evaluating
an expression - "resolving" the expression. The new resolve_*() functions
modify the already parsed expression, replacing references to attribute
names and constant values by the actual names and values taken from the
request. The resolve_*() functions also keep track which references were
used, making it very easy to check (as DynamoDB does) if there are any
unused names or values, before starting the evaluation.

The interface of evaluate() functions become much simpler - they no longer
need to know the original request (which was previously needed for
ExpressionAttributeNames/Values), the table's schema (which was previously
needed only for some error checking), keep track of which references were
used. This simplification is helpful for using the expressions in contexts
where these things (request and schema) are no longer conveniently available,
namely in FilterExpression.

A small side-benefit of this patch is that it moves a bit of code, which
handled resolving of references in expressions, from executor.cc to
expressions.cc. This is just the first step in a bigger effort to
reduce the size of executor.cc by moving code to smaller source files.
There is no attempt in this patch to move as much code as we can.
We will move more code in a separate patch in this series.

Fixes #6572.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-06-14 11:57:13 +03:00