scylladb

Author	SHA1	Message	Date
Piotr Sarna	45bf039357	alternator: use has_function instead of try-catch With the new interface available, the try-catch idiom can be removed, thus resolving a TODO. Tests: unit(dev) Message-Id: <788a29f8f9d7bcf952b28a6148670dbadb97a619.1592233511.git.sarna@scylladb.com>	2020-06-15 23:55:20 +03:00
Piotr Sarna	e76fba6f86	alternator: remove outdated TODO for adding timeouts The TODO is already fixed, not to mention that it had an incorrect ordinal number (: Message-Id: <006dc3061e0f30641c2e63ff471686f4c2e82829.1592230155.git.sarna@scylladb.com>	2020-06-15 23:04:42 +03:00
Nadav Har'El	8c026b9f10	alternator: move some code out of executor.cc The source file alternator/executor.cc has grown too much, reaching almost 4,000 lines. In this patch I move about 400 lines out of executor.cc: 1. Some functions related to serialization of sets and lists were moved to serialization.cc, 2. Functions related to evaluating parsed expressions were moved to expressions.cc. The header file expressions_eval.hh was also removed - the calculate_value() functions now live in expressions.cc, so we can just define them in expressions.hh, no need for a separate header file. This patch just moves code around. It doesn't make any functional changes. Refs #5783. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-06-14 12:16:26 +03:00
Nadav Har'El	0b9f25ab50	alternator: implement FilterExpression This patch provides a complete implementation for the FilterExpression parameter - the newer syntax for filtering the results of the Query or Scan operations. The implementation is pretty straightforward - we already added earlier a result-filtering framework to Alternator, and used it for the older filtering syntax - QueryFilter and ScanFilter. All we had to do now was to run the FilterExpression (which has the same syntax as a ConditionExpression) on each individual item. The previous cleanup patches were important to reduce the friction of running these expressions on the items. After the previous patches fixing small esoteric bugs in a few expression functions, with this patch all the tests in test_filter_expression.py now pass, and so do the two FilterExpression tests in test_query.py and test_scan.py. As far as I know (and of course minus any bugs we'll discover later), this marks the FilterExpression feature complete. Fixes #5038. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-06-14 12:16:26 +03:00
Nadav Har'El	f87259a762	alternator: improve error path of attribute_type() function The attribute_type() function, which can be used in expressions like ConditionExpression and FilterExpression, is supposed to generate an error if its second parameter is not one of the known types. What we did until now was to just report a failed check in this case. We already had a reproducing test with FilterExpression, but in this patch we also add a test with ConditionExpression - which fails before this patch and passes afterwards (and of course, passes with DynamoDB). Fixes #6641. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-06-14 12:16:20 +03:00
Nadav Har'El	11d86dfb06	alternator: fix begins_with() error path The begins_with() function should report an error if a constant is passed to it which isn't one of the supported types - string or bytes (e.g., a number). The code we had to check this had wrong logic, though. If the item attribute was also a number, we silently returned false, and didn't go on to detect that the second parameter - a constant - was a number too and should generate an error - not be silent. Fixed and added a reproducing test case and another test to validate my understanding of the type of parameters that begins_with() accepts. Fixes #6640. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-06-14 12:13:23 +03:00
Nadav Har'El	13ef31f38b	alternator: refactor resolving of references in expressions In the DynamoDB API, expressions (e.g., ConditionExpression and many more) may contain references to column names ("#name") or to values (":val") given in a separate part of the request - ExpressionAttributeNames and ExpressionAttributeValues respectively. Before this patch, we resolved these references as part of the expression's evaluation. This approach had two downsides: 1. It often misdiagnosed (both false negatives and false positives) cases of unused names and values in expressions. We already had two xfailing tests with examples - which pass after this patch. This patch also adds two additional tests, which failed before this patch and pass with it. 2. In one of the following patches we will add support for FilterExpression, where the same expression is used repeatedly on many items. It is a waste (as well as makes the code uglier) to resolve the same references again and again each time the expression is evaluated. We should be able to do it just once. So this patch introduces an intermediate step between parsing and evaluating an expression - "resolving" the expression. The new resolve_() functions modify the already parsed expression, replacing references to attribute names and constant values by the actual names and values taken from the request. The resolve_() functions also keep track of which references were used, making it very easy to check (as DynamoDB does) if there are any unused names or values, before starting the evaluation. The interface of the evaluate() functions becomes much simpler - they no longer need to know the original request (which was previously needed for ExpressionAttributeNames/Values) or the table's schema (which was previously needed only for some error checking), or to keep track of which references were used. This simplification is helpful for using the expressions in contexts where these things (request and schema) are no longer conveniently available, namely in FilterExpression. A small side-benefit of this patch is that it moves a bit of code, which handled resolving of references in expressions, from executor.cc to expressions.cc. This is just the first step in a bigger effort to reduce the size of executor.cc by moving code to smaller source files. There is no attempt in this patch to move as much code as we can. We will move more code in a separate patch in this series. Fixes #6572. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-06-14 11:57:13 +03:00
Nadav Har'El	0c460927bf	alternator: cleanup - don't use unique_ptr when not needed In the existing Alternator code, we used std::unique_ptr<rjson::value> for passing the optional old value of an item read for a RMW operation. The benefit of this type over the simpler "const rjson::value" is that it gives the callee ownership of the item, and thus the ability to move parts of it into the response without copying them. We only used this ability in a handful of obscure cases involving ReturnValues, but I am not going to break this dubious feature in this patch. Nevertheless, a lot of internal code, like condition checks, just needs read-only access to that previous item, so we passed a reference to the unique_ptr, i.e., "const std::unique_ptr<rjson::value>&". This is ugly, and also forces new code that wants to use the same condition checks (i.e., filtering code), to artificially allocate a unique_ptr just because that is what these functions expect. So in this patch, we change the utility functions such as verify_condition_expression() and everything they use, to pass around a "const rjson::value" instead of a "const std::unique_ptr<rjson::value>&". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200604131352.436506-1-nyh@scylladb.com>	2020-06-10 07:33:31 +02:00
Nadav Har'El	db45ff2733	alternator: clean up usage of describe_item() The DynamoDB GetItem request returns the requested item in a specific way, wrapped in a map with an "Item" member. For historic reasons, we used the same function that returns this (describe_item()) also in other code which reads items - e.g. for checking conditional operations. The result is wasteful - after adding this "Item" member we had other code to extract it, all for no good reason. It is also ugly and confusing. Importantly, this situation also makes it harder for me to add support for FilterExpression. The issue is that the expression evaluator got the item with the wrapper (from the existing ConditionExpression code) but the filtering code had it without this wrapper, as it didn't use describe_item(). So this patch uses describe_single_item(), which doesn't add the wrapper map, instead of describe_item(). The latter function is now used just once - to implement GetItem. The unnecessary code to unwrap the item in multiple places was then dropped. All the tests still pass. I also tested test_expected.py in unsafe_rmw write isolation mode, because code only for this mode had to be modified as well. Refs #5038. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200604092050.422092-1-nyh@scylladb.com>	2020-06-04 12:33:48 +02:00
Piotr Sarna	8fc3ca855e	alternator: fix the return type of PutItem Even if there are no attributes to return from PutItem requests, we should return a valid JSON object, not an empty string. Fixes #6568 Tests: unit(dev)	2020-06-03 16:03:13 +03:00
Piotr Sarna	3aff52f56e	alternator: fix returning UnprocessedKeys unconditionally Client libraries (e.g. PynamoDB) expect the UnprocessedKeys and UnprocessedItems attributes to appear in the response unconditionally - they are hereby added, along with a simple test case. Fixes #6569 Tests: unit(dev)	2020-06-03 15:48:16 +03:00
Nadav Har'El	bea9629031	alternator: implement remaining QueryFilter / ScanFilter functionality This patch implements the missing QueryFilter (and ScanFilter) functionality: 1. All operators. Previously, only the "EQ" operator was implemented. 2. Either "OR" or "AND" of conditions (previously only "AND"). 3. Correctly returning Count and ScannedCount for post-filter and pre-filter item counts, respectively. All of the previously-xfailing tests in test_query_filter.py are now passing. The implementation in this patch abandons our previous attempts to translate the DynamoDB API filters into Scylla's CQL filters. Doing this correctly for all operators would have been exceedingly difficult (for reasons explained in #5028), and simply not worth the effort: CQL's filters receive a page of results and then filter them, and we can do exactly the same without CQL's filters: The new code just retrieves an unfiltered page of items, and then for each of these items checks whether it passes the filters. The great thing is that we already had code for this checking - the QueryFilter syntax is identical to the "Expected" syntax (for conditional operations) that we already supported, so we already had code for checking these conditions, including all the different operators. This patch prepares for the future need to support also the newer FilterExpression syntax (see issue #5038), and the "filter" class supports either type of filter - the implementation for the second syntax is just missing and can be added (fairly easily) later. Fixes #5028. Refs #5038. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200603110118.399325-1-nyh@scylladb.com>	2020-06-03 13:16:45 +02:00
Nadav Har'El	17649ad0b5	alternator: error when unimplemented ConditionalOperator is used The ScanFilter and QueryFilter features are only partially implemented. Most of their unimplemented features cause clear errors telling the user of the unimplemented feature, but one exception is the ConditionalOperator parameter, which can be used to "OR", instead of the default "AND", of several conditions. Before this patch, we simply ignored this parameter - causing wrong results to be returned instead of an error. In this patch, ScanFilter and QueryFilter parse, instead of ignoring, the ConditionalOperator. The common implementation, get_filtering_restrictions(), still does not implement the OR case, but returns an error if we reach this case instead of just ignoring it. There is no new test. The existing test_query_filter.py::test_query_filter_or xfailed before this patch, and continues to xfail after it, but the failure is different (you can see it by running the test with "--runxfail"): Before this patch, the failure was because of different results. After this patch, the failure is because of an "unimplemented" error message. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200528214721.230587-2-nyh@scylladb.com>	2020-05-29 08:26:43 +02:00
Nadav Har'El	51adaea499	alternator: use C++20 std::string_view::starts_with() We had to wait many years for it, but finally we have a starts_with() method in C++20. Let's use it instead of ugly substr()-based code. This is probably not a performance gain - substr() for a string_view was already efficient. But it makes the code easier to understand, and it allows us to rejoice in our decision to switch to C++20. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200526185812.165038-2-nyh@scylladb.com>	2020-05-27 08:14:12 +02:00
Nadav Har'El	c3da9f2bd4	alternator: add mandatory configurable write isolation mode Alternator supports four ways in which write operations can use quorum writes or LWT or both, which we called "write isolation policies". Until this patch, Alternator defaulted to the most generally safe policy, "always_use_lwt". This default could have been overridden for each table separately, but there was no way to change this default for all tables. This patch adds a "--alternator-write-isolation" configuration option which allows changing the default. Moreover, @dorlaor asked that users must explicitly choose this default mode, and not get "always_use_lwt" without noticing. The previous default, "always_use_lwt" supports any workload correctly but because it uses LWT for all writes it may be disappointingly slow for users who run write-only workloads (including most benchmarks) - such users might find the slow writes so disappointing that they will drop Scylla. Conversely, a default of "forbid_rmw" will be faster and still correct, but will fail on workloads which need read-modify-write operations - and surprise users that need these operations. So Dor asked that none of the write modes be made the default, and users must make an informed choice between the different write modes, rather than being disappointed by a default choice they weren't aware of. So after this patch, Scylla refuses to boot if Alternator is enabled but a "--alternator-write-isolation" option is missing. The patch also modifies the relevant documentation, adds the same option to our docker image, and modifies the test-running script test/alternator/run to run Scylla with the old default mode (always_use_lwt), which we need because we want to test RMW operations as well. Fixes #6452 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200524160338.108417-1-nyh@scylladb.com>	2020-05-27 08:40:05 +03:00
Nadav Har'El	f2eab853a5	alternator: improve Query's KeyConditions error message Improve error messages coming from Query's KeyConditions parameter when wrong ComparisonOperators were used (issue discovered by @Orenef11). At one point the error message was missing a parameter so resulted in an internal error, while in another place the message mentioned an unhelpful number (enum) for the operator instead of its name. This patch fixes these error messages. Fixes #6490 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200524141800.104950-2-nyh@scylladb.com>	2020-05-25 09:59:00 +02:00
Nadav Har'El	6b38126a8f	alternator: fix support for bytes type in Query's KeyConditions Our parsing of values in a KeyConditions parameter of Query was done naively. As a result, we got bizarre error messages "condition not met: false" when these values had incorrect type (this is issue #6490). Worse - the naive conversion did not decode base64-encoded bytes value as needed, so KeyConditions on bytes-typed keys did not work at all. This patch fixes these bugs by using our existing utility function get_key_from_typed_value(), which takes care of throwing sensible errors when types don't match, and decoding base64 as needed. Unfortunately, we didn't have test coverage for many of the KeyConditions features including bytes keys, which is why this issue escaped detection. A patch will follow with much more comprehensive tests for KeyConditions, which also reproduce this issue and verify that it is fixed. Refs #6490 Fixes #6495 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200524141800.104950-1-nyh@scylladb.com>	2020-05-25 09:58:37 +02:00
Nadav Har'El	5ef9854e86	alternator: better error messages when 'forbid_rmw' mode is on When the 'forbid_rmw' write isolation policy is selected, read-modify-write operations are intentionally forbidden. The error message in this case used to say: "Read-modify-write operations not supported" Which can lead users to believe that this operation isn't supported by this version of Alternator - instead of realizing that this is in fact a configurable choice. So in this patch we just change the error message to say: "Read-modify-write operations are disabled by 'forbid_rmw' write isolation policy. Refer to https://github.com/scylladb/scylla/blob/master/docs/alternator/alternator.md#write-isolation-policies for more information." Fixes #6421. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200518125538.8347-1-nyh@scylladb.com>	2020-05-24 16:31:38 +02:00
Piotr Sarna	e503075aac	alternator: apply the string_view helper function Explicit transformation from a JSON value to a string view can be replaced with a shorter helper function from rjson.hh.	2020-05-21 18:26:59 +03:00
Piotr Sarna	cb7d3c6b55	alternator: compute begins_with on base64 without decoding In order to remove a FIXME, code which checks a BEGINS_WITH relation between base64-encoded strings is computed in a way which does not involve decoding the whole string. In case of padding, the remainders are still decoded, but their size is bounded by 3, which means they will be eligible for the small string optimization.	2020-05-21 18:26:59 +03:00
Piotr Sarna	3148571834	alternator: compute decoded base64 size without actually decoding In order to get rid of a FIXME, the code which computes the size of decoded base64 string based only on encoded size + padding is added. The result is an O(1) function with just a couple of ops (15 when checking with godbolt and gcc9), so it's a general improvement over having to allocate a string and get its size.	2020-05-21 18:26:59 +03:00
Piotr Sarna	9f8202806a	alternator: allow empty strings in values Given the new update from DynamoDB: https://aws.amazon.com/about-aws/whats-new/2020/05/amazon-dynamodb-now-supports-empty-values-for-non-key-string-and-binary-attributes-in-dynamodb-tables/ ... empty strings are now allowed for non-key attributes, so alternator and its tests are updated accordingly. Fixes #6480 Tests: alternator(local, remote)	2020-05-19 11:32:18 +02:00
Piotr Sarna	5f2eadce09	alternator: wait for schema agreement after table creation In order to be sure that all nodes acknowledged that a table was created, the CreateTable request will now only return after seeing that schema agreement was reached. Rationale: alternator users check if the table was created by issuing a DescribeTable request, and assume that the table was correctly created if it returns nonempty results. However, our current implementation of DescribeTable returns local results, which is not enough to judge if all the other nodes acknowledge the new table. CQL drivers are reported to always wait for schema agreement after issuing DDL-changing requests, so there should be no harm in waiting a little longer for alternator's CreateTable as well. Fixes #6361 Tests: alternator(local)	2020-05-11 21:51:12 +03:00
Piotr Sarna	517f2c0490	alternator: unify error messages for existing tables/keyspaces Since alternator is based on Scylla, two "already exists" error types can appear when trying to create a table - that a table itself exists, or that its keyspace does. That's however an implementation detail, since alternator does not have a notion of keyspaces at all. This patch unifies the error message to simply mention that a table already exists, and comes with a more robust test case. If the keyspace already exists, table creation will still be attempted. Fixes #6340 Tests: alternator(local, remote)	2020-05-11 18:30:02 +03:00
Gleb Natapov	0fed86e4c6	lwt: change cas_request::apply signature Change the way query result is passed from getting a reference to a result to getting a foreign_ptr<lw_shared_ptr<query::result>>. This will allow cas_request to keep it without copying.	2020-05-05 12:38:23 +03:00
Piotr Sarna	09e4f3b917	alternator: implement ScanIndexForward The ScanIndexForward parameter is now fully implemented and can accept ScanIndexForward=false in order to query the partitions in reverse clustering order. Note that reading partition slices in reverse order is less efficient than forward scans and may put a strain on memory usage, especially for large partitions, since the whole partition is currently fetched in order to be reversed. Fixes #5153	2020-04-28 11:44:46 +03:00
Pavel Solodovnikov	ed7a7554b8	storage_proxy: allow cas() to accept nullptr read_command This patch allows users of storage_proxy::cas() to supply nullptr as `query::read_command`, which skips the procedure of reading the existing value. The feature is used in alternator code for Read-Modify-Write operations: some of them don't require reading previous item values before updating. Move `read_nothing_read_command` from alternator code to storage_proxy layer and fabricate a new no-op command from it when storage_proxy::cas() is used with nullptr read_command. This avoids sprinkling if-else branches all over the code in order to check for null-equality of `cmd`. We return from storage_proxy::query() very early with an empty result in case we're given an empty partition_slice (which resides inside the passed `read_command`) so this approach should be perfectly fine. Expand documentation for the `cas()` function to cover new possible value for `cmd` argument. Fixes: #6238 Tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200428065235.5714-1-pa.solodovnikov@scylladb.com>	2020-04-28 10:44:19 +03:00
Piotr Sarna	e17c237feb	alternator: fix integer overflow warning in token generation When generating tokens for parallel scan, debug mode undefined behavior sanitizer complained that integer overflow sometimes happens when multiplying two big values - delta and segment number. In order to mitigate this warning, the multiplication is now split into two smaller ones, and the generated machine code remains identical (verified on gcc and clang via compiler explorer). Fixes #6280 Tests: unit(dev)	2020-04-26 19:06:07 +03:00
Nadav Har'El	1f75efb556	alternator: use RF=3 even if some nodes are temporarily down Alternator is supposed to use RF=3 for new tables. Only when the cluster is smaller than 3 nodes do we use RF=1 (and warn about it) - this is useful for testing. However, our implementation incorrectly tested the number of live nodes in the cluster instead of the total number of nodes. As a result, if a 3-node cluster had one node down, and a new table was created, it was created with RF=1, and immediately could not be written because when RF=1, any node down means part of the data is unavailable. This patch fixes this: The total number of nodes in the cluster - not the number of live nodes - is consulted. The three-node-cluster-with-a-dead-node setup above creates the table with RF=3, and it can be written because two living nodes out of three are enough when RF=3 and we do quorum writes and reads. We have a dtest to reproduce this bug (and its fix), and it's also easy to reproduce manually by starting a 3-node cluster, killing one of the nodes, and then running "pytests". Before this patch, the tests can create tables but then fail to write to them. After this patch, the tests succeed on the same cluster with the dead node. Fixes #6267 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200422182035.15106-2-nyh@scylladb.com>	2020-04-23 08:23:05 +02:00
Piotr Sarna	dbb9574aa2	alternator: allow parallel scan Parallel scans can be performed by providing Segment and TotalSegments attributes to Scan request, which can be used to split the work among many workers. This patch makes the parallel scan test succeed, so the xfail is removed. Fixes #5059	2020-04-22 11:06:15 +03:00
Piotr Sarna	53bbef1e6c	alternator: add a way of accessing system tables from alternator Scylla's system tables often provide interesting information for clients. In order to be able to access this information without CQL, a notion of virtual tables is introduced to alternator. When a table named .scylla.alternator.KS_NAME.TABLE_NAME is accessed with read-only operation - Query or Scan, Scylla's internal KS_NAME.TABLE_NAME table will be queried instead. For instance, if a user wants to read about system_auth.roles, the Scan request should target the following table: ".scylla.alternator.system_auth.roles". Fixes #6122	2020-04-09 09:41:30 +02:00
Piotr Sarna	09d09ddefb	alternator: add fetching static columns if they exist Until now, the list of static column ids was always empty for alternator tables anyway, so the list wasn't fetched. However, with the virtual interface of fetching Scylla internal tables, we need to list the ids of selected static columns explicitly to avoid segfaults - since we select the whole row, static columns included.	2020-04-09 09:41:30 +02:00
Piotr Sarna	123edfc10c	alternator: fix failure on incorrect table name with no indexes If a table name is not found, it may still exist as a local index, but the check tried to fetch a local index name regardless of whether it was present in the request, which was a nullptr dereference bug. Fixes #6161 Tests: alternator-test(local, remote) Message-Id: <428c21e94f6c9e450b1766943677613bd46cbc68.1586347130.git.sarna@scylladb.com>	2020-04-08 15:33:48 +03:00
Piotr Sarna	0a2d7addc0	alternator: use partition tombstone if there's no clustering key As @tgrabiec helpfully pointed out, creating a row tombstone for a table which does not have a clustering key in its schema creates something that looks like an open-ended range tombstone. That's problematic for KA/LA sstable formats, which are incapable of writing such tombstones, so a workaround is provided in order to allow using KA/LA in alternator. Fixes #6035	2020-04-08 08:08:45 +02:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Rafael Ávila de Espíndola	eca0ac5772	everywhere: Update for deprecated apply functions Now apply is only for tuples, for varargs use invoke. This depends on the seastar changes adding invoke. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200324163809.93648-1-espindola@scylladb.com>	2020-03-25 08:49:53 +02:00
Piotr Sarna	781fbe8070	alternator: add service permit to callbacks As a first step towards introducing admission control, the API of alternator callbacks is extended with an additional 'permit' parameter.	2020-03-16 07:44:25 +01:00
Nadav Har'El	77444a38a1	alternator: allow consistent reads on LSI - but not on GSI Recently, Materialized Views were modified (see issue #4365) so that local view updates (when both base and view replicas are the same node) are synchronous. In particular, when the view's partition key is the same as the base table's, view writes are synchronous: A write now only returns after CL copies of the view data have been written. Alternator's LSI have exactly this case (same partition key as the base). This makes strongly-consistent (CL=LOCAL_QUORUM) reads in Alternator work correctly, so we update the documentation accordingly to no longer say that we don't support this DynamoDB feature. However unlike LSIs, for GSIs strongly-consistent reads are still not supported, and should not be supported (they are also not supported by DynamoDB). Such reads should generate an error. So this patch fixes this too. A GSI test which tested that strongly consistent reads are forbidden, which used to xfail, now passes so the patch removes the "xfail". Finally, we can simplify the LSI tests by using consistent reads instead of eventually-consistent reads with retries. Beyond simplifying the test, it's also an opportunity to use strongly-consistent reads and make sure that they work (while, as mentioned above, similar reads for GSIs are refused). Fixes #5007 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200311170446.28611-1-nyh@scylladb.com>	2020-03-12 09:18:00 +01:00
Nadav Har'El	96ca5ac2c8	alternator: use separate smp_service_group for bouncing requests Until this patch, we used the default_smp_service_group() when bouncing Alternator requests between shards (which is needed for LWT). This patch creates a new smp_service_group for this purpose, which is limited to 5000 concurrent requests (the same limit used for CQL's bounce_request_smp_service_group). The purpose of this limit is to avoid many shards admitting a huge number of requests and bouncing all of them to the same shard, which now can't "unadmit" these requests. Fixes #5664. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200304170825.27226-1-nyh@scylladb.com>	2020-03-05 10:17:51 +01:00
Nadav Har'El	91d9632909	alternator: add rjson::remove_member() convenience function This patch adds a rjson::remove_member() wrapper to the RemoveMember method, which takes a std::string_view. But beyond the convenience, this actually works around a subtle bug in RemoveMember which, if given a StringRef parameter, ignores its length (see upstream issue https://github.com/Tencent/rapidjson/issues/1649). In the one place we used RemoveMember, it forced us to copy the string because it wasn't null-terminated. The solution proposed here involves wrapping the string view in a GenericValue - which no longer needs to copy the string, but still works around the bug. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200303143524.28300-1-nyh@scylladb.com>	2020-03-03 16:35:41 +01:00
Nadav Har'El	0fcb226412	alternator: switch rjson::find() to use std::string_view Our rjson::find() convenience function used RapidJson's "StringRef" type, which is almost exactly like std::string_view. If we switch to use string_view as we do in this patch, a lot of call sites become much simpler. Moreover, there was an even more important motivation for this patch: the RapidJson FindMember() function we used in rjson::find() has a bug when given a StringRef - although a StringRef contains a length, the FindMember() code ignores it and expects the string to be null-terminated (see: https://github.com/Tencent/rapidjson/issues/1649). In this patch, we wrap the pointer and length of a std::string_view in an rjson::value, a code path which bypasses the FindMember bug, and yet does not require copying the string. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200303141814.26929-1-nyh@scylladb.com>	2020-03-03 16:35:41 +01:00
Piotr Sarna	6f8c70d54b	alternator: fix returning raw JSON errors A couple of places in executor code leaked raw JSON errors to the user instead of formulating a proper ValidationException message. These places are now fixed, and the next patch in this series will act as a regression checker, since all JSON errors will be returned as SerializationException, not ValidationException instances.	2020-02-28 07:57:12 +02:00
Piotr Sarna	2402955d45	alternator: move parsing in front of executor Parsing a request string into JSON happens as a first thing in every request, so it can be performed before calling any executor callbacks. The most important thing however, is that making parsing a separate stage allows certain optimizations, e.g. running all parsing in a single seastar thread, which allows adding yields to rjson parsing later.	2020-02-28 07:57:12 +02:00
Nadav Har'El	0ab6c7fcef	alternator: stricter checks for user-supplied attribute values Until now, PutItem or UpdateItem could be used to insert almost any JSON as an attribute's value - even those that do not match DynamoDB's typed value specification. Among other things, the new validation allows us to reject empty sets, strings or byte arrays - which are (somewhat artificially) forbidden in DynamoDB. Also added tests for the empty sets, strings and byte arrays that should be rejected. Fixes #5896 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200225150525.4926-1-nyh@scylladb.com>	2020-02-26 08:12:26 +01:00
Nadav Har'El	6339f419ac	alternator: removing all elements from a set should delete it DynamoDB does not support empty sets. Operations which remove elements from a set attribute should remove the attribute when the last item is removed - not leave an empty set as it incorrectly does now. Incidentally, the same patch fixes another bug - deleting elements from a non-existent set attribute should be allowed (and do nothing), not fail as it does now. This patch also includes tests for both bugs. Fixes #5895 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200225125343.31629-1-nyh@scylladb.com>	2020-02-26 08:12:19 +01:00
Nadav Har'El	e075eff915	alternator: complete implementation of ReturnValues parameter This patch completes the support for the ReturnValues parameter for the UpdateItem operation. This parameter has five settings - NONE, ALL_OLD, ALL_NEW, UPDATED_OLD and UPDATED_NEW. Before this patch we already supported NONE and ALL_OLD - and this patch completes the support for the three remaining modes: ALL_NEW, UPDATED_OLD and UPDATED_NEW. The patch also continues to improve test_returnvalues.py with additional corner cases discovered during the development. After this patch, only one xfailing test remains - testing updates to nested document paths, which we do not yet support (even without the ReturnValues parameter). After this patch, the support of ReturnValues is complete - for all operations (UpdateItem, PutItem and DeleteItem) and all of its possible settings. Fixes #5053 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200221224221.31237-5-nyh@scylladb.com>	2020-02-24 10:40:53 +01:00
Nadav Har'El	1e500a2a34	alternator: rjson: another variant of set_with_string_name() utility The rjson::set_with_string_name() utility function copies the given string into the JSON key. The existing implementation required that this input string be an std::string&, but a std::string_view would be fine too, and I want to use it in new code to avoid yet another unnecessary copy. Adding the overloads also exposes a few places where things were implicitly converted to std::string and now cause an ambiguity - and clearing up this ambiguity also allowed me to find places where this conversion was unnecessary. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200221224221.31237-4-nyh@scylladb.com>	2020-02-24 10:38:54 +01:00
Nadav Har'El	fa5c2a4f58	alternator: UpdateItem only deleting attribute shouldn't create item UpdateItem operations usually need to add a row marker: * An empty UpdateItem is supposed to create a new empty item (row). Such an empty item needs to have a row marker. * An UpdateItem to add an attribute x and then later an UpdateItem to remove this attribute x should leave an empty item behind. This means the first UpdateItem needed to add a row marker, so it will be left behind after the second UpdateItem. So the existing code always added a row marker in UpdateItem. However, there is one case where we should NOT create the row marker: When the UpdateItem operation only has attribute deletions, and nothing else, and it is applied to a key with no pre-existing item, DynamoDB does not create this item. So neither should we. This patch includes a new test for this, test_update_item_non_existent, which passes on DynamoDB, failed on Alternator before this patch, and passes after the patch. Fixes #5862. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200221224221.31237-3-nyh@scylladb.com>	2020-02-24 10:38:10 +01:00
Nadav Har'El	e8cbbba653	alternator: partial implementation of ReturnValues parameter Before this patch, we only supported the ReturnValues=NONE setting of the PutItem, UpdateItem and DeleteItem operations. This patch also adds full support for the ReturnValues=ALL_OLD option in all three operations. This option directs Alternator to return the full old (i.e., pre-modification) contents of the item. We implement this as a RMW (read-modify-write) operation just as we do other RMW operations - i.e., by default we use LWT, to ensure that we really return the value of the item directly before the modification, the same value that would have been used in a conditional expression if there was one. NOTE: This implementation means one cannot use ReturnValues=ALL_OLD in forbid_rmw write isolation mode. One may theorize that if we only need the read-before-write for ReturnValues and not for a conditional expression, it should have been enough to use a separate read (as we do in unsafe_rmw isolation mode) before the write. But we don't have this "optimization" yet and I'm not sure it's a valid optimization at all - see discussion in a new issue #5851. This patch completes the ReturnValues support for the PutItem and DeleteItem operations. However, the third operation, UpdateItem, supports three more ReturnValues modes: UPDATED_OLD, ALL_NEW and UPDATED_NEW. We do not yet support those in this patch. If a user tries to use one of these three modes, an informative error message will be returned. The three tests for these three unimplemented settings continue to xfail, but the rest of the tests in test_returnvalues.py (except one test of nested attribute paths) now pass so their xfail flag is dropped. Refs #5053 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200219135658.7158-1-nyh@scylladb.com>	2020-02-21 08:32:47 +01:00
Nadav Har'El	405115fa5f	alternator: cleanup of get_string_attribute() function The get_string_attribute() function used attribute_value->GetString() to return an std::string. But this function does not actually return a std::string - it returns a char*, which gets implicitly converted to an std::string by looking for the first null character. This lookup is unnecessary, because rjson already knows the length of the string, and we can use it. This patch is just a cleanup and a very small performance improvement - I do not expect it fixes any bugs or changes anything functional, because JSON strings anyway cannot contain verbatim nulls. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200219101159.26717-1-nyh@scylladb.com>	2020-02-19 11:59:54 +01:00

219 Commits