Following DynamoDB, Alternator also places a 16 MB limit on the size of
a request. Such a limit is necessary to avoid running out of memory -
because the AWS message authentication protocol requires reading the
entire request into memory before its signature can be verified.
Our implementation for this limit used Seastar's HTTP server's
content_length_limit feature. However, this Seastar feature is
incomplete - it only works when the request uses the Content-Length
header, and doesn't do anything if the request doesn't have a
Content-Length (it may use chunked encoding, or have no length at all).
So malicious users can cause Scylla to OOM by sending a huge request
without a Content-Length.
So in this patch we stop using the incomplete Seastar feature, and
implement the length limit in Scylla in a way that works correctly with
or without Content-Length: We read from the input stream and if we go
over 16MB, we generate an error.
Because we dropped Seastar's protection against a long Content-Length,
we also need to fix a piece of code which used Content-Length to reserve
some semaphore units to prevent reading many large requests in parallel.
We fix two problems in the code:
1. If Content-Length is over the limit, we shouldn't attempt to reserve
semaphore units - this should just be a Payload Too Large error.
2. If Content-Length is missing, the existing code did nothing and had
a TODO that we should. In this patch we implement what was suggested
in that TODO: We temporarily reserve the whole 16 MB limit, and
after reading the actual request, we return part of the reservation
according to the real request size.
That last fix is important, because typically the largest requests will be
BatchWriteItem where a well-written client would want to use chunked
encoding, not Content-Length, to avoid materializing the entire request
up-front. For such clients, the memory use semaphore did nothing, and
now it does the right thing.
Note that this patch does *not* solve the problem #12166 that existed
with Seastar's length-limiting implementation but still exists in the
new in-Scylla length-limiting implementation: The fact we send an
error response in the middle of the request and then close the
connection, while the client continues to send the request, can lead
to an RST being sent by the server kernel. Usually this will be fine -
well-written client libraries will be able to read the response before
the RST. But even with a well-written library in some rare timings
the client may get the RST before the response, and will miss the
response, and get an empty or partial response or "connection reset
by peer". This issue existed before this patch, and still exists, but
is probably of minor impact.
Fixes#8196
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#23434
This patch adds to Alternator's api_error type yet another type of
error, api_error::limit_exceeded (error code "LimitExceededException").
DynamoDB returns this error code in certain situations where certain
low limits were exceeded, such as the case we'll need in a following
patch - an UpdateTable that tries to create more than one GSI at once.
The LimitExceededException error type should not be confused with
other similarly-named but different error messages like
ProvisionedThroughputExceededException or RequestLimitExceeded.
In general, we make an attempt to return the same error code that
DynamoDB returns for a given error.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
While it may not be explicitly documented DynamoDB sometimes enchriches error
message by additional fields. For instance when ConditionalCheckFailedException
occurs while ReturnValuesOnConditionCheckFailure is set it will add Item object,
similarly for TransactionCanceledException it will add CancellationReasons object.
There may be more cases like this so generic json field is added to our error class.
The change will be used by future commit implementing ReturnValuesOnConditionCheckFailure
feature.
Although we don't yet support the DynamoDB API's backup features (see
issue #5063), we can already implement the DescribeContinuousBackups
operation. It should just say that continuous backups, and point-in-time
restores, and disabled.
This will be useful for client code which tries to inquire about
continuous backups, even if not planning to use them in practice
(e.g., see issue #10660).
Refs #5063
Refs #10660
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
In the DynamoDB API, a number is encoded in JSON requests as something
like: {"N": "123"} - the type is "N" and the value "123". Note that the
value of the number is encoded as a string, because the floating-point
range and accuracy of DynamoDB differs from what various JSON libraries
may support.
We have a function unwrap_number() which supported the value of the
number being encoded as an actual number, not a string. But we should
NOT support this case - DynamoDB doesn't. In this patch we add a test
that confirms that DynamoDB doesn't, and remove the unnecessary case
from unwrap_number(). The unnecessary case also had a FIXME, so it's
a good opportunity to get rid of a FIXME.
When writing the test, I noticed that the error which DynamoDB returns
in this case is SerializionException instead of the more usual
ValidationException. I don't know why, but let's also change the error
type in this patch.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211115125738.197099-1-nyh@scylladb.com>
Objects of type "api_error" are used in Alternator when throwing an
error which will be reported as-is to the user as part of the official
DynamoDB protocol.
Although api_error objects are often thrown, the api_error class was not
derived from std::exception, because that's not necessary in C++.
However, it is *useful* for this exception to derive from std::except,
so this is what this patch does. It is useful for api_error to inherit
from std::exception because then our logging and debugging code knows
how to print this exception with all its details. All we need to do is
to implement a what() virtual function for api_error.
Before this patch, logging an api_error just logs the type's name
(i.e., the string "api_error"). After this patch, we get the full
information stored in the api_error - the error's type and its message.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211017150555.225464-1-nyh@scylladb.com>
When request signature checking is enabled in Alternator, each request
should come with the appropriate Authorization header. Most errors in
this preparing this header will result in an InvalidSignatureException
response; But DynamoDB returns a more specific error when this header is
completely missing: MissingAuthenticationTokenException. We should do the
same, but before this patch we return InvalidSignatureException also for
a missing header.
The test test_authorization.py::test_no_authorization_header used to
enshrine our wrong error message, and failed when run against AWS.
After this patch, we fix the error message and the test - which now
passes against both Alternator and AWS.
Refs #7778.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201213133825.2759357-1-nyh@scylladb.com>
All the places in executor.cc where we constructed an api_error with inline
strings now use api_error factory functions. Most of them, but not all of
them, were api_error::validation(). We also needed to add a couple more of
these factory functions.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
All the places in server.cc where we constructed an api_error with inline
strings now use api_error factory functions - we needed to add a few more.
Interestingly, we had a wrong type string for "Internal Server Error",
which we fix in this patch. We wrote the type string like that - with spaces -
because this is how it was listed in the DynamoDB documentation at
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html
But this was in fact wrong, and it should be without spaces:
"InternalServerError". The botocore library (for example) recognizes it
this way, and this string can also be seen in other online DynamoDB examples.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
In the patch "Add exception overloads for Dynamo types", Alternator's single
api_error exception type was replaced by a more complex hierarchy of types.
The implementation was not only longer and more complex to understand -
I believe it also negated an important observation:
The "api_error" exception type is special. It is not an exception created
by code for other code. It is not meant to be caught in Alternator code.
Instead, it is supposed to contain an error message created for the *user*,
containing one of the few supported exception exception "names" described
in the DynamoDB documentation, and a user-readable text message. Throwing
such an exception in Alternator code means the thrower wants the request
to abort immediately, and this message to reach the user. These exceptions
are not designed to be caught in Alternator code. Code should use other
exceptions - or alternatives to exceptions (e.g., std::optional) for
problems that should be handled before returning a different error to the
user. Moreover, "api_error" isn't just thrown as an exception - it can
also be returned-by-value in a executor::request_return_type) - which is
another reason why it should not be subclassed.
For these reasons, I believe we should have a single api_error type, and
it's wrong to subclass it. So in this patch I am reverting the subclasses
and template added in the aforementioned patch.
Still, one correct observation made in that patch was that it is
inconvenient to type in DynamoDB exception names (no help from the editor
in completing those strings) and also error-prone. In this patch we
propse a different - simpler - solution to the same problem:
We add trivial factory functions, e.g., api_error::validation(std::string)
as a shortcut to api_error("ValidationException"). The new implementation
is easy to understand, and also more self explanatory to readers:
It is now clear that "api_error::validation()" is actually a user-visible
"api_error", something which was obscured by the name validation_exception()
used before this patch.
Finally, this patch also improves the comment in error.hh explaining the
purpose of api_error and the fact it can be returned or thrown. The fact
it should not be subclassed is legislated with a "finally". There is also
no point of this class inheriting from std::exception or having virtual
functions, or an empty constructor - so all these are dropped as well.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Add types exception overloads for ValidationException, ResourceNotFoundException, etc,
to avoid writing explicit error type as string everywhere (with the potential for
spelling errors ever present).
Also allows intellisense etc to complete the exception when coded.
error.hh file implicitly assumed that seastar:: namespace is available
when it's included, which is not always the case. To remedy that,
seastar::httpd namespace is used explicitly.