scylladb

Author	SHA1	Message	Date
Nadav Har'El	653f2df28f	alternator: fix JSON escaping of error responses In the DynamoDB API, error responses are in JSON format with specific fields ("__type" and "message" in the x-amz-json-1.0 format currently used). Alternator tried to be clever and build the string representation of this JSON itself, instead of using RapidJSON. But this optimization was a mistake - if the error message contains characters that need escaping (such as double quotes and newlines), they weren't escaped, and the resulting JSON was malformed. When the client library boto3 read this malformed JSON it got confused and considered the entire error response to be a string, which resulted in an ugly error message. The fix is easy - just build the JSON output as usual with RapidJSON instead of trying to optimize using string operations. The patch also includes two tests reproducing this bug and checking its fix. The first test uses boto3 and shows it got confused on the type of error (not understanding that it is a ValidationException). The second test bypasses boto3 and shows exactly where the bug happens - the response is an unparsable JSON. Fixes #10278 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220327132705.3707979-1-nyh@scylladb.com>	2022-03-27 16:32:36 +03:00
Pavel Solodovnikov	95c8d65949	treewide: fix compilation issues with fmtlib 8.1.0+ Due to `fd62fba985` scoped enums are not automatically converted to integers anymore, this is the intended behavior, according to the fmtlib devs. A bit nicer solution would be to use `std::to_underlying` instead of a direct `static_cast`, but it's not available until C++23 and some compilers are still missing the support for it. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2022-03-16 12:31:50 +03:00
Nadav Har'El	79776ff2ff	alternator: fix error handling during Alternator startup A recent restructuring of the startup of Alternator (and also other protocol servers) led to incorrect error-handling behavior during startup: If an error was detected on one of the shards of the sharded service (in alternator/server.cc), the sharded service itself was never stopped (in alternator/controller.cc), leading to an assertion failure instead of the desired error message. A common example of this problem is when the requested port for the server was already taken (this was issue #9914). So in this patch, exception handling is removed from server.cc - the exception will propagate to the code in controller.cc, which will properly stop the server (including the sharded services) before returning. Fixes #9914. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220130131709.1166716-1-nyh@scylladb.com>	2022-02-02 10:35:57 +01:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Nadav Har'El	36c3b92b19	alternator, schema_loader: get rid of deprecation warnings Seastar moved the read_entire_stream(), read_entire_stream_contiguous() and skip_entire_stream() from the "httpd" namespace to the "util" namespace. Using them with their old names causes deprecation warnings when compiling alternator/server.cc. This patch fixes the namespace (and adds the new include) to get rid of the deprecation warnings. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211209132759.1319420-1-nyh@scylladb.com>	2021-12-09 21:11:56 +03:00
Nadav Har'El	56eb994d8f	alternator: allow Authorization header to be without spaces The "Authorization" HTTP header is used in DynamoDB API to sign requests. Our parser for this header, in server::verify_signature(), required the different components of this header to be separated by a comma followed by a whitespace - but it turns out that in DynamoDB both spaces and commas are optional - one of them is enough. At least one DynamoDB client library - the old "boto" (which predated boto3) - builds this header without spaces. In this patch we add a test that shows that an Authorization header with spaces removed works fine in DynamoDB but didn't work in Alternator, and after this patch modifies the parsing code for this header, the test begins to pass (and the other tests show that the previously-working cases didn't break). Fixes #9568 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211101214114.35693-1-nyh@scylladb.com>	2021-11-03 06:38:28 +02:00
Nadav Har'El	6ae0ea0c48	alternator: return the correct Content-Type header Although the DynamoDB API responses are JSON, additional conventions apply to these responses - such as how error codes are encoded in JSON. For this reason, DynamoDB uses the content type `application/x-amz-json-1.0` instead of the standard `application/json` in its responses. Until this patch, Scylla used `application/json` in its responses. This unexpected content-type didn't bother any of the AWS libraries which we tested, but it does bother the aiodynamo library (see HENNGE/aiodynamo#27). Moreover, we should return the x-amz-json-1.0 content type for future proofing: It turns out that AWS already defined x-amz-json-1.1 - see: https://awslabs.github.io/smithy/1.0/spec/aws/aws-json-1_1-protocol.html The 1.1 content type differs (only) in how it encodes error replies. If one day DynamoDB starts to use this new reply format (it doesn't yet) and if DynamoDB libraries need to differentiate between the two reply formats, Alternator had better return the right one. This patch also includes a new test that the Content-Type header is returned with the expected value. The test passes on DynamoDB, and after this patch it starts to pass on Alternator as well. Fixes #9554. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211031094621.1193387-1-nyh@scylladb.com>	2021-10-31 10:50:25 +01:00
Nadav Har'El	034f79cfb4	alternator: make api_error an std::exception Objects of type "api_error" are used in Alternator when throwing an error which will be reported as-is to the user as part of the official DynamoDB protocol. Although api_error objects are often thrown, the api_error class was not derived from std::exception, because that's not necessary in C++. However, it is useful for this exception to derive from std::exception, so this is what this patch does. It is useful for api_error to inherit from std::exception because then our logging and debugging code knows how to print this exception with all its details. All we need to do is to implement a what() virtual function for api_error. Before this patch, logging an api_error just logs the type's name (i.e., the string "api_error"). After this patch, we get the full information stored in the api_error - the error's type and its message. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211017150555.225464-1-nyh@scylladb.com>	2021-10-29 10:23:55 +03:00
Piotr Sarna	0b11771731	alternator: decouple auth from CQL query processor Alternator auth module used to piggy-back on top of CQL query processor to retrieve authentication data, but it's no longer the case. Instead, storage proxy is used directly. Closes #9538	2021-10-28 21:55:56 +03:00
Nadav Har'El	4ffd8c1f2b	alternator: stub TTL operations This patch adds stubs for the UpdateTimeToLive and DescribeTimeToLive operations to Alternator. These operations can enable, disable, or inquire about, the chosen expiration-time attribute. Currently, the information about the chosen attribute is only saved, with no actual expiration of any items taking place. Some of the tests for the TTL feature start to pass, so their xfail tag is removed. Because this new feature is incomplete, it is not enabled unless the "alternator-ttl" experimental feature is enabled. Moreover, for these operations to be allowed, the entire cluster needs to support this experimental feature, because all nodes need to participate in the data expiration - if some old nodes don't support Alternator TTL, some of the data they hold won't get expired... So we don't allow enabling TTL until all the nodes in the cluster support this feature. The implementation is in a new source file, alternator/ttl.cc. This source file will continue to grow as we implement the expiration feature. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-09-19 21:05:21 +03:00
Avi Kivity	4aaddd8609	alternator: remove uses of get_local_gossiper() Replace with a gossiper parameter passed from the controller.	2021-09-07 20:08:15 +03:00
Benny Halevy	e9aff2426e	everywhere: make deferred actions noexcept Prepare for updating seastar submodule to a change that requires deferred actions to be noexcept (and return void). Test: unit(dev, debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:52 +03:00
Benny Halevy	4439e5c132	everywhere: cleanup defer.hh includes Get rid of unused includes of seastar/util/{defer,closeable}.hh and add a few that are missing from source files. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:39 +03:00
Pavel Emelyanov	a965a742fc	alternator: Take token metadata from server's storage_proxy There's a local_nodelist_handler serving API requests that calls for global storage service to get token metadata from. Now it can get storage proxy reference from server upon construction and use it for tokens. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-29 05:12:36 +03:00
Pavel Emelyanov	ba10e96c75	alternator: Keep storage_proxy on server It's already available on controller and will be needed by API handlers in the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-29 05:12:36 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Piotr Sarna	1b400b07b9	alternator: add user context to tracing Before this patch, each entry in alternator tracing included an "<unauthenticated request>" field. It's not really true, because most of alternator requests are actually performed by authenticated users (unless auth is disabled).	2021-04-26 11:54:01 +02:00
Piotr Sarna	ddd9c2f2d7	alternator: return username when verifying signature The username will be used later for tracing purposes. It will also very likely be useful later when we decide to add ACL support.	2021-04-26 11:53:19 +02:00
Piotr Sarna	f9adee70d2	alternator: allow enabling slow query logging Alternator is now aware of the slow query logging configuration and can start tracing slow queries.	2021-03-17 11:20:42 +01:00
Piotr Sarna	ba264e7199	alternator: drop read_content_and_verify_signature The only use of this helper function was inlined in a bigger coroutine, so it's no longer needed.	2021-03-10 14:42:53 +01:00
Piotr Sarna	35da51879f	alternator: coroutinize handle_api_request The indentation level is significantly reduced, and so is the number of allocations. The function signature is changed from taking an rvalue ref to taking the unique_ptr by value, because otherwise the coroutine captures the request as a reference, which results in use-after-free.	2021-03-10 14:42:52 +01:00
Nadav Har'El	f41dac2a3a	alternator: avoid large contiguous allocation for request body Alternator request sizes can be up to 16 MB, but the current implementation had the Seastar HTTP server read the entire request as a contiguous string, and then processed it. We can't avoid reading the entire request up-front - we want to verify its integrity before doing any additional processing on it. But there is no reason why the entire request needs to be stored in one big contiguous allocation. This is always a bad idea. We should use a non-contiguous buffer, and that's the goal of this patch. We use a new Seastar HTTPD feature where we can ask for an input stream, instead of a string, for the request's body. We then begin the request handling by reading the content of this stream into a vector<temporary_buffer<char>> (which we alias "chunked_content"). We then use this non-contiguous buffer to verify the request's signature and if successful - parse the request JSON and finally execute it. Beyond avoiding contiguous allocations, another benefit of this patch is that while parsing a long request composed of chunks, we free each chunk as soon as its parsing is completed. This reduces the peak amount of memory used by the query - we no longer need to store both unparsed and parsed versions of the request at the same time. Although we already had tests with requests of different lengths, most of them were short enough to only have one chunk, and only a few had 2 or 3 chunks. So we also add a test which makes a much longer request (a BatchWriteItem with large items), which in my experiment had 17 chunks. The goal of this test is to verify that the new signature and JSON parsing code which needs to cross chunk boundaries works as expected. Fixes #7213. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210309222525.1628234-1-nyh@scylladb.com>	2021-03-10 09:22:34 +01:00
Nadav Har'El	d905e71a90	Alternator: add support for CORS protocol This patch adds to Alternator support for the CORS (Cross-Origin Resource Sharing) protocol - a simple extension over the HTTP protocol which browsers use when Javascript code contacts HTTP-based servers. Although we usually think of Alternator as being used in a three-tier application, in some setups there is no middle layer and the user's browser, running Javascript code, wants to communicate directly with the database. However, for security reasons, by default Javascript loaded from domain X is not allowed to communicate with different domains Y. The CORS protocol is meant to allow this, and Alternator needs to participate in this protocol if it is to be used directly from Javascript in browsers. To implement CORS, Alternator needs to respond to the OPTIONS method which it didn't allow before - with certain headers based on the input headers. It also needs to do some of these things for the regular methods (mostly, POST). The patch includes a comprehensive test that runs against both Alternator and DynamoDB and shows that Alternator handles these headers and methods the same as DynamoDB. Additionally, I tested manually a Javascript DynamoDB client - which didn't work prior to this patch (the browser reported CORS errors), and works after this patch. Fixes #8025. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210217222027.1219319-1-nyh@scylladb.com>	2021-02-23 13:15:03 +01:00
Piotr Sarna	d7848750d8	alternator: server: return api_error instead of throwing Throwing a C++ exception creates unnecessary overhead, so when an unsupported operation is encountered, the api error is directly returned instead of being thrown.	2021-02-04 17:23:41 +01:00
Piotr Sarna	868e04e8e2	alternator: add requests_shed metrics The counter shows the total number of requests shed due to overload.	2021-02-04 17:23:41 +01:00
Piotr Sarna	1b8c946ad7	alternator: add handling max_concurrent_requests_per_shard The config value is already used to set an upper limit of concurrent CQL requests, and now it's also abided by alternator. Excessive requests result in returning RequestLimitExceeded error to the client. Tests: manual Running multiple concurrent requests via the test suite results in: botocore.errorfactory.RequestLimitExceeded: An error occurred (RequestLimitExceeded) when calling the CreateTable operation: too many in-flight requests: 17	2021-02-04 17:23:41 +01:00
Nadav Har'El	4ab98a4c68	alternator: use a more specific error when Authorization header is missing When request signature checking is enabled in Alternator, each request should come with the appropriate Authorization header. Most errors in preparing this header will result in an InvalidSignatureException response, but DynamoDB returns a more specific error when this header is completely missing: MissingAuthenticationTokenException. We should do the same, but before this patch we return InvalidSignatureException also for a missing header. The test test_authorization.py::test_no_authorization_header used to enshrine our wrong error message, and failed when run against AWS. After this patch, we fix the error message and the test - which now passes against both Alternator and AWS. Refs #7778. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20201213133825.2759357-1-nyh@scylladb.com>	2020-12-14 09:18:24 +01:00
Pavel Emelyanov	cf172cf656	alternator: Use local query processor reference to get keys Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-10-31 15:44:21 +03:00
Pavel Emelyanov	94a9f22002	alternator: Keep local query processor reference in server Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-10-31 15:44:21 +03:00
Nadav Har'El	81589be00a	alternator: use api_error factory functions in server.cc All the places in server.cc where we constructed an api_error with inline strings now use api_error factory functions - we needed to add a few more. Interestingly, we had a wrong type string for "Internal Server Error", which we fix in this patch. We wrote the type string like that - with spaces - because this is how it was listed in the DynamoDB documentation at https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html But this was in fact wrong, and it should be without spaces: "InternalServerError". The botocore library (for example) recognizes it this way, and this string can also be seen in other online DynamoDB examples. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-07-23 15:36:39 +03:00
Nadav Har'El	5a35632cd3	alternator: refactor api_error class In the patch "Add exception overloads for Dynamo types", Alternator's single api_error exception type was replaced by a more complex hierarchy of types. The implementation was not only longer and more complex to understand - I believe it also negated an important observation: The "api_error" exception type is special. It is not an exception created by code for other code. It is not meant to be caught in Alternator code. Instead, it is supposed to contain an error message created for the user, containing one of the few supported exception "names" described in the DynamoDB documentation, and a user-readable text message. Throwing such an exception in Alternator code means the thrower wants the request to abort immediately, and this message to reach the user. These exceptions are not designed to be caught in Alternator code. Code should use other exceptions - or alternatives to exceptions (e.g., std::optional) for problems that should be handled before returning a different error to the user. Moreover, "api_error" isn't just thrown as an exception - it can also be returned-by-value (in an executor::request_return_type) - which is another reason why it should not be subclassed. For these reasons, I believe we should have a single api_error type, and it's wrong to subclass it. So in this patch I am reverting the subclasses and template added in the aforementioned patch. Still, one correct observation made in that patch was that it is inconvenient to type in DynamoDB exception names (no help from the editor in completing those strings) and also error-prone. In this patch we propose a different - simpler - solution to the same problem: We add trivial factory functions, e.g., api_error::validation(std::string) as a shortcut to api_error("ValidationException"). 
The new implementation is easy to understand, and also more self-explanatory to readers: It is now clear that "api_error::validation()" is actually a user-visible "api_error", something which was obscured by the name validation_exception() used before this patch. Finally, this patch also improves the comment in error.hh explaining the purpose of api_error and the fact it can be returned or thrown. The fact it should not be subclassed is legislated with a "final". There is also no point of this class inheriting from std::exception or having virtual functions, or an empty constructor - so all these are dropped as well. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-07-23 15:36:39 +03:00
Calle Wilund	cbb70f4af4	executor: "UpdateTable" support for streams Partial implementation of the "UpdateTable" command. Supports only enabling/disabling streams.	2020-07-15 08:21:34 +00:00
Calle Wilund	bbc544748f	alternator: Implement GetRecords Simplistic variant, using 1:1 mapping of scylla stream id <-> shard	2020-07-15 08:21:34 +00:00
Calle Wilund	c45781de1e	alternator: Implement GetShardIterator	2020-07-15 08:10:23 +00:00
Calle Wilund	8084b5a9b7	alternator: Implement DescribeStream	2020-07-15 08:10:23 +00:00
Calle Wilund	8fb9b32bd3	alternator: Implement ListStreams command	2020-07-15 08:10:23 +00:00
Nadav Har'El	8b3dac040a	alternator: add request headers to trace-level logging When "trace"-level logging is enabled for Alternator, we log every request, but currently only the request's body. For debugging, it is sometimes useful to also see the headers - which are important to debug authentication, for example. So let's print the headers as well. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200709103414.599883-1-nyh@scylladb.com>	2020-07-09 12:38:45 +02:00
Piotr Sarna	4de23d256e	alternator,utils: move rjson.hh to utils/ rjson is going to replace libjsoncpp, so it's moved from alternator to the common utils/ directory.	2020-07-03 08:30:01 +02:00
Rafael Ávila de Espíndola	555d8fe520	build: Be consistent about system versus regular headers We were not consistent about using '#include "foo.hh"' instead of '#include <foo.hh>' for scylla's own headers. This patch fixes that inconsistency and, to enforce it, changes the build to use -iquote instead of -I to find those headers. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200608214208.110216-1-espindola@scylladb.com>	2020-06-10 15:49:51 +03:00
Piotr Sarna	b8df958811	alternator: deduplicate logs on boot Alternator server used to print a startup log line for each shard, which is redundant and creates churn for nodes with many cores. Instead of all that, a single line is now printed once alternator server properly boots. Fixes #6347 Tests: manual(boot), unit(dev)	2020-05-05 16:19:18 +03:00
Calle Wilund	cc9bb6454c	alternator: Use reloadable tls certificates	2020-05-04 11:32:21 +00:00
Rafael Ávila de Espíndola	eca0ac5772	everywhere: Update for deprecated apply functions Now apply is only for tuples, for varargs use invoke. This depends on the seastar changes adding invoke. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200324163809.93648-1-espindola@scylladb.com>	2020-03-25 08:49:53 +02:00
Piotr Sarna	f43e68b383	alternator: hook admission control to alternator server From now on, alternator requests use the memory limiter semaphore to control the amount of memory used by alternator requests.	2020-03-16 08:43:49 +01:00
Piotr Sarna	a1ea650d83	alternator: add memory limiter to alternator server With the memory limiter semaphore, the server will be able to apply admission control to alternator requests.	2020-03-16 07:44:26 +01:00
Piotr Sarna	781fbe8070	alternator: add service permit to callbacks As a first step towards introducing admission control, the API of alternator callbacks is extended with an additional 'permit' parameter.	2020-03-16 07:44:25 +01:00
Piotr Sarna	2137017bc3	alternator: revert to ValidationException for JSON errors Both rapidjson library and DynamoDB induce enough corner cases for incorrect JSON, that the simplest way out is to simply conform back to ValidationException in all cases. This commit comes with an updated test, which is now aware of 3 possible outcomes for an incorrect JSON: a ValidationException, a SerializationException and HTTP 404. Message-Id: <5e39d2dc077f4ea5ce360035a4adcddaf3a342a0.1582876734.git.sarna@scylladb.com>	2020-03-01 14:35:20 +02:00
Piotr Sarna	c370586189	alternator: change json errors class to SerializationException In order to be consistent with DynamoDB - a parsing error on incorrect JSON input is reported as SerializationException instead of ValidationException.	2020-02-28 07:57:12 +02:00
Piotr Sarna	1be1cfc5d8	alternator: make rjson yieldable in thread context In order to fight reactor stalls, rjson parsing and writing routines can now yield if they run in seastar thread context. In order to run a yieldable version of the parser which needs to be run in seastar thread context, use parse_yieldable() instead of parse().	2020-02-28 07:57:12 +02:00
Piotr Sarna	aad6c01b98	alternator: implement json parser inside the server The json parser runs in a static thread which accepts and parses documents. Documents smaller than a parsing threshold (currently: 16KiB) will be parsed in place without yielding. The assumption is that most alternator requests are small and there's no need to parse them in a yieldable way, which also induces overhead. For reference, parsing a 128KiB document made of many small objects with rapidjson takes around 0.5 millisecond, and a 16KiB document is parsed in around 0.06ms - a value small enough not to disturb Seastar's current value of 0.5ms task quota too much.	2020-02-28 07:57:12 +02:00
Piotr Sarna	2402955d45	alternator: move parsing in front of executor Parsing a request string into JSON happens as a first thing in every request, so it can be performed before calling any executor callbacks. The most important thing however, is that making parsing a separate stage allows certain optimizations, e.g. running all parsing in a single seastar thread, which allows adding yields to rjson parsing later.	2020-02-28 07:57:12 +02:00
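
The JSON-escaping bug described in commit 653f2df28f can be illustrated with a minimal Python sketch. This is not Alternator's C++ code: the function names are hypothetical, the "__type" value is simplified, and Python's json module stands in for RapidJSON. It only shows why a hand-built error body becomes unparsable once the message contains double quotes or newlines.

```python
import json


def naive_error_body(error_type: str, message: str) -> str:
    # Hypothetical hand-built JSON: the message is pasted in verbatim,
    # so quotes and newlines inside it are not escaped.
    return '{"__type":"' + error_type + '","message":"' + message + '"}'


def escaped_error_body(error_type: str, message: str) -> str:
    # Let a JSON serializer do the escaping (the role RapidJSON plays
    # in the fix described by the commit message).
    return json.dumps({"__type": error_type, "message": message})


# A message containing both a double quote and a newline.
message = 'Unknown attribute "x"\nin expression'

broken = naive_error_body("ValidationException", message)
fixed = escaped_error_body("ValidationException", message)

# The hand-built body is not valid JSON, so a client cannot parse it.
try:
    json.loads(broken)
    broken_parses = True
except json.JSONDecodeError:
    broken_parses = False

# The serialized body round-trips, so the client can read the error type.
parsed = json.loads(fixed)
```

In this sketch the client-side parse of the hand-built body fails, while the serialized body parses back to the original type and message.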


105 Commits