Commit Graph

462 Commits

Author SHA1 Message Date
Piotr Sarna
ba264e7199 alternator: drop read_content_and_verify_signature
The only use of this helper function was inlined in a bigger
coroutine, so it's no longer needed.
2021-03-10 14:42:53 +01:00
Piotr Sarna
35da51879f alternator: coroutinize handle_api_request
The indentation level is significantly reduced, and so is the number
of allocations.
The function signature is changed from taking an rvalue ref to taking
the unique_ptr by value, because otherwise the coroutine captures
the request as a reference, which results in use-after-free.
2021-03-10 14:42:52 +01:00
Nadav Har'El
f41dac2a3a alternator: avoid large contiguous allocation for request body
Alternator request sizes can be up to 16 MB, but the current implementation
had the Seastar HTTP server read the entire request as a contiguous string,
and then processed it. We can't avoid reading the entire request up-front -
we want to verify its integrity before doing any additional processing on it.
But there is no reason why the entire request needs to be stored in one big
*contiguous* allocation. This always a bad idea. We should use a non-
contiguous buffer, and that's the goal of this patch.

We use a new Seastar HTTPD feature where we can ask for an input stream,
instead of a string, for the request's body. We then begin the request
handling by reading lthe content of this stream into a
vector<temporary_buffer<char>> (which we alias "chunked_content"). We then
use this non-contiguous buffer to verify the request's signature and
if successful - parse the request JSON and finally execute it.

Beyond avoiding contiguous allocations, another benefit of this patch is
that while parsing a long request composed of chunks, we free each chunk
as soon as its parsing completed. This reduces the peak amount of memory
used by the query - we no longer need to store both unparsed and parsed
versions of the request at the same time.

Although we already had tests with requests of different lengths, most
of them were short enough to only have one chunk, and only a few had
2 or 3 chunks. So we also add a test which makes a much longer request
(a BatchWriteItem with large items), which in my experiment had 17 chunks.
The goal of this test is to verify that the new signature and JSON parsing
code which needs to cross chunk boundaries work as expected.

Fixes #7213.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210309222525.1628234-1-nyh@scylladb.com>
2021-03-10 09:22:34 +01:00
Calle Wilund
8bbc976ff1 alternator::streams: Use better method for generation timestamp
Get timestamp via system_distributed, instead of local gen.
2021-03-03 15:46:38 +00:00
Avi Kivity
8747c684e0 Merge 'Move timeouts to client state' from Piotr Sarna
This series is extracted from #7913 as it may prove useful to other series as well, and #7913 might take a while until its merged, given that it also depends on other unmerged pull requests.

The idea of this series is to move timeouts to the client state, which will allow changing them independently for each session - e.g. by setting per-service-level timeouts and initializing the values from attached service levels (see #7867).

Closes #8140

* github.com:scylladb/scylla:
  treewide: remove timeout config from query options
  cql3: use timeout config from client state instead of query options
  cql3: use timeout config from client state instead of query options
  cql3: use timeout config from client state instead of query options
  service: add timeout config to client state
2021-03-01 20:34:35 +02:00
Piotr Sarna
0e0282cdf1 Merge ' cdc: move (most of) CDC generation management to a new service' from Kamil Braun
Currently all management of CDC generations happens in storage_service,
which is a big ball of mud that does many unrelated things.

This PR introduces a new service crafted to handle CDC generation
management: listening and reacting to generation changes in the cluster.

We plug the service in, initializing it in main and test code,
passing a reference to storage_service and having storage_service call
the service (using the `after_join` method): the service only starts
doing its job after the node joins the token ring (either on bootstrap
or restart).

Some parts of generation management still remain in storage_service:
the bootstrap procedure, which happens inside storage_service,
must also do some initialization regarding CDC generations,
for example: on restart it must retrieve the latest known generation
timestamp from disk; on bootstrap it must create a new generation
and announce it to other nodes. The order of these operations w.r.t
the rest of the startup procedure is important, hence the startup
procedure is the only right place for them. We may try decoupling
these services even more in follow-up PRs, but that requires a bit
of careful reasoning. What this PR does is a low-hanging fruit.

Still, what remains in storage_service is a small part of the entire
CDC generation management logic; most of it has been moved to the
new service. This includes listening for generation changes and
updating the data structures for performing CDC log writes (cdc::metadata).
Furthermore these handling functions now return futures (and are internally
coroutines), where previously they required a seastar::async context.

This PR is a prerequisite to fixing #7985. The fact that all the CDC generation
management code was in storage_service is technical debt. It will be easier
to modify the management algorithms when they sit in their own module.

Tests: unit (dev) and cdc_tests.py dtest (dev), and local replication test using scylla-cdc-java

Closes #8172

* github.com:scylladb/scylla:
  cdc: move (most of) CDC generation management code to the new service
  cdc: coroutinize make_new_cdc_generation
  cdc: coroutinize update_streams_description
  cdc: introduce cdc::generation_service
  main: move cdc_service initialization just prior to storage_service initialization
2021-02-26 12:42:27 +01:00
Kamil Braun
e2f03e4aba cdc: move (most of) CDC generation management code to the new service
Currently all management of CDC generations happens in storage_service,
which is a big ball of mud that does many unrelated things.

Previous commits have introduced a new service for managing CDC
generations. This code moves most of the relevant code to this new
service.

However, some part still remains in storage_service: the bootstrap
procedure, which happens inside storage_service, must also do some
initialization regarding CDC generations, for example: on restart it
must retrieve the latest known generation timestamp from disk; on
bootstrap it must create a new generation and announce it to other
nodes. The order of these operations w.r.t the rest of the startup
procedure is important, hence the startup procedure is the only right
place for them.

Still, what remains in storage_service is a small part of the entire
CDC generation management logic; most of it has been moved to the
new service. This includes listening for generation changes and
updating the data structures for performing CDC log writes (cdc::metadata).
Furthermore these functions now return futures (and are internally
coroutines), where previously they required a seastar::async context.
2021-02-26 12:06:12 +01:00
Piotr Sarna
c5214eb096 treewide: remove timeout config from query options
Timeout config is now stored in each connection, so there's no point
in tracking it inside each query as well. This patch removes
timeout_config from query_options and follows by removing now
unnecessary parameters of many functions and constructors.
2021-02-25 17:20:27 +01:00
Nadav Har'El
d905e71a90 Alternator: add support for CORS protocol
This patch adds to Alternator support for the CORS (Cross-Origin Resource
Sharing) protocol - a simple extension over the HTTP protocol which
browsers use when Javascript code contacts HTTP-based servers.

Although we usually think of Alternator as being used in a three-tier
application, in some setups there is no middle layer and the user's
browser, running Javascript code, wants to communicate directly with the
database. However, for security reasons, by default Javascript loaded
from domain X is not allowed to communicate with different domains Y.
The CORS protocol is meant to allow this, and Alternator needs to
participate in this protocol if it is to be used directly from Javascript
in browsers.

To implement CORS, Alternator needs to respond to the OPTIONS method
which it didn't allow before - with certain headers based on the
input headers. It also needs to do some of these things for the
regular methods (mostly, POST). The patch includes a comprehensive
test that runs against both Alternator and DynamoDB and shows that
Alternator handles these headers and methods the same as DynamoDB.

Additionally, I tested manually a Javascript DynamoDB client - which
didn't work prior to this patch (the browser reported CORS errors),
and works after this patch.

Fixes #8025.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210217222027.1219319-1-nyh@scylladb.com>
2021-02-23 13:15:03 +01:00
Kamil Braun
67d4e5576d sys_dist_ks: split CDC streams table partitions into clustered rows
Until now, the lists of streams in the `cdc_streams_descriptions` table
for a given generation were stored in a single collection. This solution
has multiple problems when dealing with large clusters (which produce
large lists of streams):
1. large allocations
2. reactor stalls
3. mutations too large to even fit in commitlog segments

This commit changes the schema of the table as described in issue #7993.
The streams are grouped according to token ranges, each token range
being represented by a separate clustering row. Rows are inserted in
reasonably large batches for efficiency.

The table is renamed to enable easy upgrade. On upgrade, the latest CDC
generation's list of streams will be (re-)inserted into the new table.

Yet another table is added: one that contains only the generation
timestamps clustered in a single partition. This makes it easy for CDC
clients to learn about new generations. It also enables an elegant
two-phase insertion procedure of the generation description: first we
insert the streams; only after ensuring that a quorum of replicas
contains them, we insert the timestamp. Thus, if any client observes a
timestamp in the timestamps table (even using a ONE query),
it means that a quorum of replicas must contain the list of streams.
2021-02-18 11:44:59 +01:00
Piotr Sarna
495b7b5596 alternator: use unique_ptr for storing attribute paths
Previous commit eliminated the only copying of the attribute paths,
so it's now safe to make the object noncopyable.
Message-Id: <5468e8c17d3d42a03c1dd33706bbaac0c58959ce.1613398751.git.sarna@scylladb.com>
2021-02-15 18:22:59 +02:00
Piotr Sarna
7e1641224c alternator: batch: pass attrs_to_get by a shared pointer
The attrs_to_get object was previously copied, but it's quite
a heavyweight operations, since this object may contain an
instance of std::map or std::unordered_map.
To avoid copying whole maps, the object is wrapped in a shared
const pointer.
Message-Id: <75ad810de16c630b65ae8d319cb4b37e1de8085f.1613398751.git.sarna@scylladb.com>
2021-02-15 18:22:56 +02:00
Piotr Sarna
cbbb7f08a0 Merge 'Alternator: support nested attribute paths...
in all expressions' from Nadav Har'El.

This series fixes #5024 - which is about adding support for nested attribute
paths (e.g., a.b.c[2]) to Alternator.  The series adds complete support for this
feature in ProjectionExpression, ConditionExpression, FilterExpression and
UpdateExpression - and also its combination with ReturnValues. Many relevant
tests - and also some new tests added in this series - now pass.

The first patch in the series fixes #8043 a bug in some error cases in
conditions, which was discovered while working in this series, and is
conceptually separate from the rest of the series.

Closes #8066

* github.com:scylladb/scylla:
  alternator: correct implemention of UpdateItem with nested attributes and ReturnValues
  alternator: fix bug in ReturnValues=UPDATED_NEW
  alternator: implemented nested attribute paths in UpdateExpression
  alternator: limit the depth of nested paths
  alternator: prepare for UpdateItem nested attribute paths
  alternator: overhaul ProjectionExpression hierarchy implementation
  alternator: make parsed::path object printable
  alternator-test: a few more ProjectionExpression conflict test cases
  alternator-test: improve tests for nested attributes in UpdateExpression
  alternator: support attribute paths in ConditionExpression, FilterExpression
  alternator-test: improve tests for nested attributes in ConditionExpression
  alternator: support attribute paths in ProjectionExpression
  alternator: overhaul attrs_to_get handling
  alternator-test: additional tests for attribute paths in ProjectionExpression
  alternator-test: harden attribute-path tests for ProjectionExpression
  alternator: fix ValidationException in FilterExpression - and more
2021-02-15 15:45:49 +02:00
Nadav Har'El
49cd9b3fd5 alternator: correct implemention of UpdateItem with nested attributes and ReturnValues
This patch fixes the last missing part of nested attribute support in
UpdateItem - returning the correct attributes when ReturnValues is requested.
When the expression says "a.b = :val" and ReturnValues is set to UPDATED_OLD
or UPDATED_NEW, only the actual updated attribute a.b should be returned, not
the entire top-level attribute a as we did before this patch.

This patch was made very simple because our existing hierarchy_filter()
function already does exactly the right thing, and can trivially be made to
accept any attribute_path_map<T> (in our case attribute_path_map<action>),
not just attrs_to_get as it did until now.

This patch also adds several more checks to the test in test_returnvalues.py
to improve the test's coverage even more. Interestingly, I discovered two
esoteric cases where DynamoDB does something which makes little sense, but
apparently simplified their implementation - but the beautiful thing is that
it also simplifies our implementation! See long comments about these two
cases in the test code.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
964500e47a alternator: fix bug in ReturnValues=UPDATED_NEW
Commit 0c460927bf broke UpdateItem's
ReturnValues=UPDATED_NEW by moving previous_item while it is still
needed. None of the existing tests broke because none of them needed
previous_item after it was moved - but it started to break when we
add support for nested attribute paths, which need this previous_item.

So this patch returns the move to a copy, as it was before the
aforementioned patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
33685a683e alternator: implemented nested attribute paths in UpdateExpression
This patch adds full support for nested attribute paths (e.g., a.b[3].c)
in UpdateExpression. After in previous patches we already added such
support for ProjectionExpression, ConditionExpression and FilterExpression
this means the nested attribute paths feature is now complete, so we
remove the warning from the documents. However, there is one last loose
end to tie and we will do it in the next patch: After this patch, the
combination of UpdateExpression with nested attributes and ReturnValues
is still wrong, and the test for it in test_returnvalues.py still xfails.

Note that previous patches already implemented support for attribute paths
in expression evaluations - i.e., the right-hand side of UpdateExpression
actions, and in this patch we just needed to implement the left hand side:
When an update action is on an attribute a.b we need to read the entire
content of the top-level a (an RWM operation), modify just the b part of
its json with the result of the action, and finally write back the entire
content of a. Of course everything gets complicated by the fact that we
can have multiple actions on multiple pieces of the same JSON, and we also
need to detect overlapping and conflicting actions (we already have this
detection in the attribute_path_map<> class we introduced in a previous
patch).

I decided to leave one small esoteric difference, reproduced by the xfailing
test_update_expression.py::test_nested_attribute_remove_from_missing_item:
As expected, "SET x.y = :val" fails for an item if its attribute x doesn't
exist or the item itself does not exist. For the update expression
"REMOVE x.y", DynamoDB fails if the attribute x doesn't exist, but oddly
silently passes if the entire item doesn't exist. Alternator does not
currently reproduce this oddity - it will fail this write as well.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
7789606545 alternator: limit the depth of nested paths
DynamoDB limits the depth of a nested path in expressions (e.g. "a.b.c.d")
to 32 levels. This patch adds the same limit also to Alternator.

The exact value of this limit is less important (although it did make
sense to choose the same limit as DynamoDB does), but it's important
to have *some* limit: It's often convenient to handle paths with a
recursive algorithm, and if we allow unlimited path depth, it can
result in unlimited recursion depth, and a crash. Let's avoid this
possibility.

We detect the over-long path while building the parsed::path object
in the parser, and generate a parse error.

This patch also includes a test that verifies that both Alternator
and DynamoDB have the same 32-level nesting limit on paths.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
4c7e27c688 alternator: prepare for UpdateItem nested attribute paths
This patch prepares UpdateItem for updating of nested attribute paths
(e.g., "SET a.b = :val"), but does not yet support them.

Instead of _update_expression holding an unsorted list of "actions",
we change it to hold a attribute_path_map of actions. This will allow
us to process all the actions on a top-level attribute together, and
moreover gets us "for free" the correct checking for overlapping and
conflicting updates - exactly the same checking we already had in
attribute_path_map for ProjectionExpression. Other than this change,
most of this patch is just code movement, not functional changes.

After this patch, the tests for update path overlap and conflict pass:
test_update_expression_multi_overlap_nested and
test_update_expression_multi_conflict_nested.

We can also mark test_update_expression_nested_attribute_rhs as passing -
this test involves an attribute path in the right-hand-side of an update,
but the left-hand-side is still a top-level attribute, so it works (it
actually worked before this patch - it started working when we implemented
attribute paths in expressions, for ConditionExpression and
FilterExpression).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
7c5db2da83 alternator: overhaul ProjectionExpression hierarchy implementation
For ProjectionExpression we implemented a hierarchical filter object which
can be used to hold a tree of attribute paths groups by a the top-level
attributes, and also detect overlapping and conflicting entries.

For UpdateExpression, we need almost exactly the same object: We need to
group update actions (e.g., SET a.b=3) by the top-level attribute, and
also detect and fail overlapping or conflicting paths.

So in this patch we rewrite the data structure we had for ProjectionExpression
in a more genric manner, using the template attribute_path_map<T> - which
holds data of type T for each attribute path. We also implement a template
function attribute_path_map_add() to add a path/value pair to this map,
and includes all the overlap and conflict detecting logic.

There shouldn't be functional changes in this patch. The ProjectionExpression
code uses the new generic code instead of the specific code, but should work
the same. In the next patch we can use the new generic code to implement
UpdateExpression as well.

The only somewhat functional change is better error messages for
conflicting or overlapping paths - which now include one of the
conflicting paths.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
f78d33dd73 alternator: make parsed::path object printable
Make the parsed::path object printable - which is useful for error messages.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-14 12:21:34 +02:00
Nadav Har'El
e52785be08 alternator: support attribute paths in ConditionExpression, FilterExpression
This patch fully implements support for attribute paths (e.g. a.b.c, a.d[3])
for the ConditionExpression in conditional updates, and FilterExpression in
queries and scans. After this patch, all previously-xfailing tests in
test_projection_expression.py and test_filter_expression.py now pass.

The fix is simple: Both ConditionExpression and FilterExpression use the
function calculate_value() to calculate the value of the expression. When
this function calculates the value of a path, it mustn't just take the
top-level attribute - it needs to walk into the specific sub-object as
specified by the attribute path.

This is not the end of attribute path support, UpdateExpression and
ReturnValues are not yet fully supported. This will come in following
patches.

Refs #5024

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-08 19:19:09 +02:00
Nadav Har'El
104ef5242b alternator: support attribute paths in ProjectionExpression
This patch fully implements support for attribute paths (e.g. a.b.c, a.d[3])
for the ProjectionExpression in the various operations where this parameter
is supported - GetItem, BatchGetItem, Query and Scan. After this patch, all
xfailing tests in test_projection_expression.py now pass.

In the previous patch we remembered in the "attrs_to_get" object not only
the top-level attributes to read from the table, but also how to filter
from it only the desired pieces of the nested document. In this patch we
add a filter() function to do this filtering, and call it in the right
places to post-process the JSON objects we read from the table.

We also had to fix reference resolution in paths to resolve all the
components of the path (e.g., #name1.#name2) and not just the top-level
attribute.

This is not the end of attribute path support, there are still other
expressions (ConditionExpression, UpdateExpression, FilterExpression,
ReturnValues) where they are not yet supported. This will come in following
patches.

Refs #5024

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-08 14:16:40 +02:00
Nadav Har'El
6340619e69 alternator: overhaul attrs_to_get handling
In the existing code, the variable "attrs_to_get" is a list of top-level
attributes to fetch for an item. It is used to implement features like
ProjectionExpression or AttributesToGet in GetItem and other places.

However, to support attribute paths (e.g., a.b.c[2]) in ProjectionExpression,
i.e., issue #5024, we need more than that. We still need to know the top-
level attribute "a", because this is the granularity we have in the Scylla
table (all the content inside "a" is serialized as a single JSON); But we
also need to remember exactly which parts *inside* "a" we will need to
extract and return.

So in this patch we add a new type, "attrs_to_get", which is more than
just a list of top-level attributes. Instead, it is a *map*, whose keys
are the top-level attributes, and the value for each of them is a
"hierarchy_filter", an object which describes which part of the attribute
is needed.

This patch includes the code which converts the AttributesToGet and
ProjectionExpression into the new attrs_to_get structure. During this
conversion, we recognize two kinds of errors which DynamoDB complains
about: We recognize "overlapping" attributes (e.g., requesting both
a.b and a.b.c) and "conflicting" attributes (e.g, requesting both
a.b and a[1]). After this, two xfailing tests we had for detecting
these overlap and conflicts finally pass and their "xfail" label is
removed.

After this patch, we have the attrs_to_get object which can allow us
to filter only the requested pieces of the top-level attributes, but
we don't use it yet - so this patch is not enough for complete support
of attribute paths in ProjectionExpression. We will complete this
support in the next patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-08 14:16:40 +02:00
Nadav Har'El
653610f4bc alternator: fix ValidationException in FilterExpression - and more
The first condition expressions we implemented in Alternator were the old
"Expected" syntax of conditional updates. That implementation had some
specific assumptions on how it handles errors: For example, in the "LT"
operator in "Expected", the second operand is always part of the query, so
an error in it (e.g., an unsupported type) resulted it a ValidationException
error.

When we implemented ConditionExpression and FilterExpression, we wrongly
used the same functions check_compare(), check_BETWEEN(), etc., to implement
them. This results in some inaccurate error handling. The worst example is
what happens when you use a FilterExpression with an expression such as
"x < y" - this filter is supposed to silently skip items whose "x" and "y"
attributes have unsupported or different types, but in our implementation
a bad type (e.g., a list) for y resulted in a ValidationException which
aborted the entire scan! Interestingly, in once case (that of BEGINS_WITH)
we actually noticed the slightly different behavior needed and implemented
the same operator twice - with ugly code duplication. But in other operators
we missed this problem completely.

This patch first adds extensive tests of how the different expressions
(Expected, QueryFilter, FilterExpression, ConditionExpression) and the
different operators handle various input errors - unsupported types,
missing items, incompatible types, etc. Importantly, the tests demonstrate
that there is often different behavior depending on whether the bad
input comes from the query, or from the item. Some of the new tests
fail before this patch, but others pass and were useful to verify that
the patch doesn't break anything that already worked correctly previously.
As usual, all the tests pass on Cassandra.

Finally, this patch *fixes* all these problems. The comparison functions
like check_compare() and check_BETWEEN() now not only take the operands,
they also take booleans saying if each of the operands came from the
query or from an item. The old-syntax caller (Expected or QueryFilter)
always say that the first operand is from the item and the second is
from the query - but in the new-syntax caller (ConditionExpression or
FilterExpression) any or all of the operands can come from the query
and need verification.

The old duplicated code for check_BEGINS_WITH() - which a TODO to remove
it - is finally removed. Instead we use the same idea of passing booleans
saying if each of its operands came from an item or from the query.

Fixes #8043

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-02-08 14:16:30 +02:00
Piotr Sarna
d7848750d8 alternator: server: return api_error instead of throwing
Throwing a C++ exception creates unnecessary overhead, so when
an unsupported operation is encountered, the api error is directly
returned instead of being thrown.
2021-02-04 17:23:41 +01:00
Piotr Sarna
868e04e8e2 alternator: add requests_shed metrics
The counter shows the total number of requests shed due to overload.
2021-02-04 17:23:41 +01:00
Piotr Sarna
1b8c946ad7 alternator: add handling max_concurrent_requests_per_shard
The config value is already used to set an upper limit of concurrent
CQL requests, and now it's also abided by alternator.
Excessive requests result in returning RequestLimitExceeded error
to the client.

Tests: manual
Running multiple concurrent requests via the test suite results in:
botocore.errorfactory.RequestLimitExceeded: An error occurred (RequestLimitExceeded) when calling the CreateTable operation: too many in-flight requests: 17
2021-02-04 17:23:41 +01:00
Piotr Sarna
32dc692b8b alternator: add RequestLimitExceeded error
The error code is used when requests are shed due to crossing
the user-defined threshold of the rate of incoming requests.
2021-02-04 17:14:21 +01:00
Piotr Sarna
6ae94d31c1 treewide: remove shared pointer usage from the pager
The pager interface doesn't really need to be virtual,
so the next step could be to remove the need for pointers
entirely, but migrating from shared_ptr to unique_ptr
is a low-hanging fruit.

Message-Id: <a5bdecb17ae58e914da020fb58a41f4574565c66.1610709560.git.sarna@scylladb.com>
2021-01-15 15:03:14 +02:00
Piotr Sarna
12b5184933 alternator: drop unneeded sstring creation
It's now possible to use string views to check if a particular
table is a system table, so it's no longer needed to explicitly
create an sstring instance.
2021-01-04 09:47:01 +01:00
Gleb Natapov
d3aa17591c migration_manager: drop announce_locally flag
It looks like the history of the flag begins in Cassandra's
https://issues.apache.org/jira/browse/CASSANDRA-7327 where it is
introduced to speedup tests by not needing to start the gossiper.
The thing is we always start gossiper in our cql tests, so the flag only
introduce noise. And, of course, since we want to move schema to use raft
it goes against the nature of the raft to be able to apply modification only
locally, so we better get rid of the capability ASAP.

Tests: units(dev, debug)
Message-Id: <20201230111101.4037543-2-gleb@scylladb.com>
2021-01-03 13:58:09 +02:00
Piotr Sarna
3b26fc01c2 alternator: coroutinize untagging a resource
Historically, a seastar thread was used for this request
because it's not on a critical path, but a coroutine makes
the code simpler.
2020-12-23 15:53:57 +01:00
Piotr Sarna
1ca39cc8c1 alternator: coroutinize tagging a resource
Historically, a seastar thread was used for this request
because it's not on a critical path, but a coroutine makes
the code simpler.
2020-12-23 15:53:57 +01:00
Nadav Har'El
4ab98a4c68 alternator: use a more specific error when Authorization header is missing
When request signature checking is enabled in Alternator, each request
should come with the appropriate Authorization header. Most errors in
this preparing this header will result in an InvalidSignatureException
response; But DynamoDB returns a more specific error when this header is
completely missing: MissingAuthenticationTokenException. We should do the
same, but before this patch we return InvalidSignatureException also for
a missing header.

The test test_authorization.py::test_no_authorization_header used to
enshrine our wrong error message, and failed when run against AWS.
After this patch, we fix the error message and the test - which now
passes against both Alternator and AWS.

Refs #7778.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201213133825.2759357-1-nyh@scylladb.com>
2020-12-14 09:18:24 +01:00
Nadav Har'El
a8fdbf31cd alternator: fix UpdateItem ADD for non-existent attribute
UpdateItem's "ADD" operation usually adds elements to an existing set
or adds a number to an existing counter. But it can *also* be used
to create a new set or counter (as if adding to an empty set or zero).

We unfortunately did not have a test for this case (creating a new set
or counter), and when I wrote such a test now, I discovered the
implementation was missing. So this patch adds both the test and the
implementation. The new test used to fail before this patch, and passes
with it - and passes on DynamoDB.

Note that we only had this bug for the newer UpdateItem syntax.
For the old AttributeUpdates syntax, we already support ADD actions
on missing attributes, and already tested it in test_update_item_add().
I just forgot to test the same thing for the newer syntax, so I missed
this bug :-(

Fixes #7763.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201207085135.2551845-1-nyh@scylladb.com>
2020-12-09 18:44:30 +01:00
Nadav Har'El
781f9d9aca alternator: make default timeout configurable
Whereas in CQL the client can pass a timeout parameter to the server, in
the DynamoDB API there is no such feature; The server needs to choose
reasonable timeouts for its own internal operations - e.g., writes to disk,
querying other replicas, etc.

Until now, Alternator had a fixed timeout of 10 seconds for its
requests. This choice was reasonable - it is much higher than we expect
during normal operations, and still lower than the client-side timeouts
that some DynamoDB libraries have (boto3 has a one-minute timeout).
However, there's nothing holy about this number of 10 seconds, some
installations might want to change this default.

So this patch adds a configuration option, "--alternator-timeout-in-ms",
to choose this timeout. As before, it defaults to 10 seconds (10,000ms).

In particular, some test runs are unusually slow - consider for example
testing a debug build (which is already very slow) in an extremely
over-comitted test host. In some cases (see issue #7706) we noticed
the 10 second timeout was not enough. So in this patch we increase the
default timeout chosen in the "test/alternator/run" script to 30 seconds.

Please note that as the code is structured today, this timeout only
applies to some operations, such as GetItem, UpdateItem or Scan, but
does not apply to CreateTable, for example. This is a pre-existing
issue that this patch does not change.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201207122758.2570332-1-nyh@scylladb.com>
2020-12-09 14:30:43 +01:00
Nadav Har'El
86779664f4 alternator: fix broken Scan/Query paging with bytes keys
When an Alternator table has partition keys or sort keys of type "bytes"
(blobs), a Scan or Query which required paging used to fail - we used
an incorrect function to output LastEvaluatedKey (which tells the user
where to continue at the next page), and this incorrect function was
correct for strings and numbers - but NOT for bytes (for bytes, we
need to encode them as base-64).

This patch also includes two tests - for bytes partition key and
for bytes sort key - that failed before this patch and now pass.
The test test_fetch_from_system_tables also used to fail after a
Limit was added to it, because one of the tables it scans had a bytes
key. That test is also fixed by this patch.

Fixes #7768

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201207175957.2585456-1-nyh@scylladb.com>
2020-12-08 09:38:23 +01:00
Piotr Jastrzebski
d2897d8f8b alternator: guard streams with an experimental flag
Add new alternator-streams experimental flag for
alternator streams control.

CDC becomes GA and won't be guarded by an experimental flag any more.
Alternator Streams stay experimental so now they need to be controlled
by their own experimental flag.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-11-12 12:36:16 +01:00
Benny Halevy
3fab0f8694 storage_proxy: convert to shared_token_metadata
get() the latest token_metadata_ptr from the
shared_token_metadata before each use.

expose get_token_metadata_ptr() rather than get_token_metadata()
so that caller can keep it across continuations.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-11-11 14:20:23 +02:00
Nadav Har'El
78c598e08e alternator: add missing TableId field to DescribeTable response
DescribeTable should return a UUID "TableId" in its reponse.
We alread had it for CreateTable, and now this patch adds it to
DescribeTable.

The test for this feature is no longer xfail. Moreover, I improved
the test to not only check that the TableId field is present - it
should also match the documented regular expression (the standard
representation of a UUID).

Refs #5026

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201104114234.363046-1-nyh@scylladb.com>
2020-11-09 20:21:47 +01:00
Tomasz Grabiec
6d0d55aa72 Merge "Unglobal query processor instance" from Pavel Emelyanov
The query processor is present in the global namespace and is
widely accessed with global get(_local)?_query_processor().
There's a long-term task to get rid of this globality and make
services and componenets reference each-other and, for and
due-to this, start and stop in specific order. This set makes
this for the query processor.

The remaining users of it are -- alternator, controllers for
client services, schema_tables and sys_dist_ks. All of them
except for the schema_tables are fixed just by passing the
reference on query processor with small patches. The schema
tables accessing qp sit deep inside the paxos code, but can
be "fixed" with the qctx thing until the qctx itself is
de-globalized.

* https://github.com/xemul/scylla/tree/br-rip-global-query-processor:
  code: RIP global query processor instance
  cql test env: Keep query processor reference on board
  system distributed keyspace: Start sharded service erarlier
  schema_tables: Use qctx to make internal requests
  transport: Keep sharded query processor reference on controller
  thrift: Keep sharded query processor reference on controller
  alternator: Use local query processor reference to get keys
  alternator: Keep local query processor reference in server
2020-11-06 14:24:41 +01:00
Calle Wilund
1db9da2353 alternator::streams: Workaround fix for apparent code gen bug in seq_number
Fixes #7325

When building with clang on fedora32, calling the string_view constructor
of bignum generates broken ID:s (i.e. parsing borks). Creating a temp
std::string fixes it.

Closes #7542
2020-11-04 09:26:08 +02:00
Nadav Har'El
ffbd487c86 Merge 'alternator::streams: Use end-of-record info in get_records' from Calle Wilund
Fixes #7496

Since cdc log now has an end-of-batch/record marker that tells
us explicitly that we've read the last row of a change, we
can use this instead of timestamp checks + limit extra to
ensure we have complete records.

Note that this does not try to fulfill user query limit
exact. To do this we would need to add a loop and potentially
re-query if quried rows are not enough. But that is a
separate exercise, and superbly suited for coroutines!

Closes #7498

* github.com:scylladb/scylla:
  alternator::streams: Reduce the query limit depending on cdc opts
  alternator::streams: Use end-of-record info in get_records
2020-11-02 13:34:00 +02:00
Calle Wilund
7c8f457bab alternator::streams: Reduce the query limit depending on cdc opts
Avoid querying much more than needed.
Since we have exact row markers now, this is more safe to do.
2020-11-02 08:37:27 +00:00
Calle Wilund
c79108edbb alternator::streams: Use end-of-record info in get_records
Fixes #7496

Since cdc log now has an end-of-batch/record marker that tells
us explicitly that we've read the last row of a change, we
can use this instead of timestamp checks + limit extra to
ensure we have complete records.

Note that this does not try to fulfill user query limit
exact. To do this we would need to add a loop and potentially
re-query if quried rows are not enough. But that is a
separate exercise, and superbly suited for coroutines!
2020-11-02 08:35:36 +00:00
Piotr Sarna
ed047d54bf Merge 'alternator: fix combination of filter and projection' from Nadav
The main goal of this this series is to fix issue #6951 - a Query (or Scan) with
a combination of filtering and projection parameters produced wrong results if
the filter needs some attributes which weren't projected.

This series also adds new tests for various corner cases of this issue. These
new tests also pass after this fix, or still fail because some other missing
feature (namely, nested attributes). These additional tests will be important if
we ever want to refactor or optimize this code, because they exercise some rare
corner code paths at the intersection of filtering and projection.

This series also fixes some additional problems related to this issue, like
combining old and new filtering/projection syntaxes (should be forbidden), and
even one fix to a wrong comment.

Closes #7328

* github.com:scylladb/scylla:
  alternator test: tests for nested attributes in FilterExpression
  alternator test: fix comment
  alternator tests: additional tests for filter+projection combination
  alternator: forbid combining old and new-style parameters
  alternator: fix query with both projection and filtering
2020-11-02 07:28:41 +01:00
Pavel Emelyanov
cf172cf656 alternator: Use local query processor reference to get keys
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-31 15:44:21 +03:00
Pavel Emelyanov
94a9f22002 alternator: Keep local query processor reference in server
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-31 15:44:21 +03:00
Calle Wilund
1bc96a5785 alternator::streams: Make describe_stream use actual log ttl as window
Allows QA to bypass the normal hardcoded 24h ttl of data and still
get "proper" behaviour w.r.t. available stream set/generations.
I.e. can manually change cdc ttl option for alternator table after
streams enabled. Should not be exposed, but perhaps useful for
testing.

Closes #7483
2020-10-26 12:16:36 +02:00
Calle Wilund
83339f4bac Alternator::streams: Make SequenceNumber monotinically growing
Fixes #7424

AWS sdk (kinesis) assumes SequenceNumbers are monotonically
growing bigints. Since we sort on and use timeuuids are these
a "raw" bit representation of this will _not_ fulfill the
requirement. However, we can "unwrap" the timestamp of uuid
msb and give the value as timestamp<<64|lsb, which will
ensure sort order == bigint order.
2020-10-14 16:45:21 +03:00