scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 19:10:42 +00:00

Author	SHA1	Message	Date
Piotr Sarna	5f2eadce09	alternator: wait for schema agreement after table creation In order to be sure that all nodes acknowledged that a table was created, the CreateTable request will now only return after seeing that schema agreement was reached. Rationale: alternator users check if the table was created by issuing a DescribeTable request, and assume that the table was correctly created if it returns nonempty results. However, our current implementation of DescribeTable returns local results, which is not enough to judge if all the other nodes acknowledge the new table. CQL drivers are reported to always wait for schema agreement after issuing DDL-changing requests, so there should be no harm in waiting a little longer for alternator's CreateTable as well. Fixes #6361 Tests: alternator(local)	2020-05-11 21:51:12 +03:00
Piotr Sarna	517f2c0490	alternator: unify error messages for existing tables/keyspaces Since alternator is based on Scylla, two "already exists" error types can appear when trying to create a table - that a table itself exists, or that its keyspace does. That's however an implementation detail, since alternator does not have a notion of keyspaces at all. This patch unifies the error message to simply mention that a table already exists, and comes with a more robust test case. If the keyspace already exists, table creation will still be attempted. Fixes #6340 Tests: alternator(local, remote)	2020-05-11 18:30:02 +03:00
Gleb Natapov	0fed86e4c6	lwt: change cas_request::apply signature Change the way query result is passed from getting a reference to a result to getting a foreign_ptr<lw_shared_ptr<query::result>>. This will allow cas_request to keep it without copying.	2020-05-05 12:38:23 +03:00
Piotr Sarna	09e4f3b917	alternator: implement ScanIndexForward The ScanIndexForward parameter is now fully implemented and can accept ScanIndexForward=false in order to query the partitions in reverse clustering order. Note that reading partition slices in reverse order is less efficient than forward scans and may put a strain on memory usage, especially for large partitions, since the whole partition is currently fetched in order to be reversed. Fixes #5153	2020-04-28 11:44:46 +03:00
Pavel Solodovnikov	ed7a7554b8	storage_proxy: allow cas() to accept nullptr read_command This patch allows users of storage_proxy::cas() to supply nullptr as `query::read_command` which is supposed to skip the procedure of reading the existing value. The feature is used in alternator code for Read-Modify-Write operations: some of them don't require reading previous item values before updating. Move `read_nothing_read_command` from alternator code to storage_proxy layer and fabricate a new no-op command from it when storage_proxy::cas() is used with nullptr read_command. This allows to avoid sprinkling if-else branches all over the code in order to check for null-equality of `cmd`. We return from storage_proxy::query() very early with an empty result in case we're given an empty partition_slice (which resides inside the passed `read_command`) so this approach should be perfectly fine. Expand documentation for the `cas()` function to cover new possible value for `cmd` argument. Fixes: #6238 Tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200428065235.5714-1-pa.solodovnikov@scylladb.com>	2020-04-28 10:44:19 +03:00
Piotr Sarna	e17c237feb	alternator: fix integer overflow warning in token generation When generating tokens for parallel scan, debug mode undefined behavior sanitizer complained that integer overflow sometimes happens when multiplying two big values - delta and segment number. In order to mitigate this warning, the multiplication is now split into two smaller ones, and the generated machine code remains identical (verified on gcc and clang via compiler explorer). Fixes #6280 Tests: unit(dev)	2020-04-26 19:06:07 +03:00
Nadav Har'El	1f75efb556	alternator: use RF=3 even if some nodes are temporarily down Alternator is supposed to use RF=3 for new tables. Only when the cluster is smaller than 3 nodes do we use RF=1 (and warn about it) - this is useful for testing. However, our implementation incorrectly tested the number of live nodes in the cluster instead of the total number of nodes. As a result, if a 3-node cluster had one node down, and a new table was created, it was created with RF=1, and immediately could not be written because when RF=1, any node down means part of the data is unavailable. This patch fixes this: The total number of nodes in the cluster - not the number of live nodes - is consulted. The three-node-cluster-with-a-dead-node setup above creates the table with RF=3, and it can be written because two living nodes out of three are enough when RF=3 and we do quorum writes and reads. We have a dtest to reproduce this bug (and its fix), and it's also easy to reproduce manually by starting a 3-node cluster, killing one of the nodes, and then running "pytests". Before this patch, the tests can create tables but then fail to write to them. After this patch, the tests succeed on the same cluster with the dead node. Fixes #6267 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200422182035.15106-2-nyh@scylladb.com>	2020-04-23 08:23:05 +02:00
Piotr Sarna	dbb9574aa2	alternator: allow parallel scan Parallel scans can be performed by providing Segment and TotalSegments attributes to Scan request, which can be used to split the work among many workers. This patch makes the parallel scan test succeed, so the xfail is removed. Fixes #5059	2020-04-22 11:06:15 +03:00
Piotr Sarna	53bbef1e6c	alternator: add a way of accessing system tables from alternator Scylla's system tables often provide interesting information for clients. In order to be able to access this information without CQL, a notion of virtual tables is introduced to alternator. When a table named .scylla.alternator.KS_NAME.TABLE_NAME is accessed with read-only operation - Query or Scan, Scylla's internal KS_NAME.TABLE_NAME table will be queried instead. For instance, if a user wants to read about system_auth.roles, the Scan request should target the following table: ".scylla.alternator.system_auth.roles". Fixes #6122	2020-04-09 09:41:30 +02:00
Piotr Sarna	09d09ddefb	alternator: add fetching static columns if they exist Until now, the list of static column ids was always empty for alternator tables anyway, so the list wasn't fetched. However, with the virtual interface of fetching Scylla internal tables, we need to list the ids of selected static columns explicitly to avoid segfaults - since we select the whole row, static columns included.	2020-04-09 09:41:30 +02:00
Piotr Sarna	123edfc10c	alternator: fix failure on incorrect table name with no indexes If a table name is not found, it may still exist as a local index, but the check tried to fetch a local index name regardless if it was present in the request, which was a nullptr dereference bug. Fixes #6161 Tests: alternator-test(local, remote) Message-Id: <428c21e94f6c9e450b1766943677613bd46cbc68.1586347130.git.sarna@scylladb.com>	2020-04-08 15:33:48 +03:00
Piotr Sarna	0a2d7addc0	alternator: use partition tombstone if there's no clustering key As @tgrabiec helpfully pointed out, creating a row tombstone for a table which does not have a clustering key in its schema creates something that looks like an open-ended range tombstone. That's problematic for KA/LA sstable formats, which are incapable of writing such tombstones, so a workaround is provided in order to allow using KA/LA in alternator. Fixes #6035	2020-04-08 08:08:45 +02:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Rafael Ávila de Espíndola	eca0ac5772	everywhere: Update for deprecated apply functions Now apply is only for tuples, for varargs use invoke. This depends on the seastar changes adding invoke. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200324163809.93648-1-espindola@scylladb.com>	2020-03-25 08:49:53 +02:00
Piotr Sarna	781fbe8070	alternator: add service permit to callbacks As a first step towards introducing admission control, the API of alternator callbacks is extended with an additional 'permit' parameter.	2020-03-16 07:44:25 +01:00
Nadav Har'El	77444a38a1	alternator: allow consistent reads on LSI - but not on GSI Recently, Materialized Views were modified (see issue #4365) so that local view updates (when both base and view replicas are the same node) are synchronous. In particular, when the view's partition key is the same as the base table's, view writes are synchronous: A write now only returns after CL copies of the view data have been written. Alternator's LSI have exactly this case (same partition key as the base). This makes strongly-consistent (CL=LOCAL_QUORUM) reads in Alternator work correctly, so we update the documentation accordingly to no longer say that we don't support this DynamoDB feature. However unlike LSIs, for GSIs strongly-consistent reads are still not supported, and should not be supported (they are also not supported by DynamoDB). Such reads should generate an error. So this patch fixes this too. A GSI test which tested that strongly consistent reads are forbidden, which used to xfail, now passes so the patch removes the "xfail". Finally, we can simplify the LSI tests by using consistent reads instead of eventually-consistent reads with retries. Beyond simplifying the test, it's also an opportunity to use strongly-consistent reads and make sure that they work (while, as mentioned above, similar reads for GSIs are refused). Fixes #5007 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200311170446.28611-1-nyh@scylladb.com>	2020-03-12 09:18:00 +01:00
Nadav Har'El	96ca5ac2c8	alternator: use separate smp_service_group for bouncing requests Until this patch, we used the default_smp_service_group() when bouncing Alternator requests between shards (which is needed for LWT). This patch creates a new smp_service_group for this purpose, which is limited to 5000 concurrent requests (the same limit used for CQL's bounce_request_smp_service_group). The purpose of this limit is to avoid many shards admitting a huge number of requests and bouncing all of them to the same shard who now can't "unadmit" these requests. Fixes #5664. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200304170825.27226-1-nyh@scylladb.com>	2020-03-05 10:17:51 +01:00
Nadav Har'El	91d9632909	alternator: add rjson::remove_member() convenience function This patch adds a rjson::remove_member() wrapper to the RemoveMember method, which takes a std::string_view. But beyond the convenience, this actually works around a subtle bug in RemoveMember where, if given a StringRef parameter, ignores its length (see upstream issue https://github.com/Tencent/rapidjson/issues/1649). In the one place we used RemoveMember, it forced us to copy the string because it wasn't null-terminated. The solution proposed here involves wrapping the string view in a GenericValue - which no longer needs to copy the string, but still works around the bug. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200303143524.28300-1-nyh@scylladb.com>	2020-03-03 16:35:41 +01:00
Nadav Har'El	0fcb226412	alternator: switch rjson::find() to use std::string_view Our rjson::find() convenience function used RapidJson's "StringRef" type, which is almost exactly like std::string_view. If we switch to use string_view as we do in this patch, a lot of call sites become much simpler. Moreover, there was an even more important motivation for this patch: the RapidJson FindMember() function we used in rjson::find() has a bug when given a StringRef - although a StringRef contains a length, the FindMember() code ignores it and expects the string to be null-terminated (see: https://github.com/Tencent/rapidjson/issues/1649). In this patch, we wrap the pointer and length of a std::string_view in an rjson::value, a code path which bypasses the FindMember bug, and yet does not require copying the string. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200303141814.26929-1-nyh@scylladb.com>	2020-03-03 16:35:41 +01:00
Piotr Sarna	6f8c70d54b	alternator: fix returning raw JSON errors A couple of places in executor code leaked raw JSON errors to the user instead of formulating a proper ValidationException message. These places are now fixed, and the next patch in this series will act as a regression checker, since all JSON errors will be returned as SerializationException, not ValidationException instances.	2020-02-28 07:57:12 +02:00
Piotr Sarna	2402955d45	alternator: move parsing in front of executor Parsing a request string into JSON happens as a first thing in every request, so it can be performed before calling any executor callbacks. The most important thing however, is that making parsing a separate stage allows certain optimizations, e.g. running all parsing in a single seastar thread, which allows adding yields to rjson parsing later.	2020-02-28 07:57:12 +02:00
Nadav Har'El	0ab6c7fcef	alternator: stricter checks for user-supplied attribute values Until now, PutItem or UpdateItem could be used to insert almost any JSON as an attribute's value - even those that do not match DynamoDB's typed value specification. Among other things, the new validation allows us to reject empty sets, strings or byte arrays - which are (somewhat artificially) forbidden in DynamoDB. Also added tests for the empty sets, strings and byte arrays that should be rejected. Fixes #5896 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200225150525.4926-1-nyh@scylladb.com>	2020-02-26 08:12:26 +01:00
Nadav Har'El	6339f419ac	alternator: removing all elements from a set should delete it DynamoDB does not support empty sets. Operations which remove elements from a set attribute should remove the attribute when the last item is removed - not leave an empty set as it incorrectly does now. Incidentally, the same patch fixes another bug - deleting elements from a non-existent set attribute should be allowed (and do nothing), not fail as it does now. This patch also includes tests for both bugs. Fixes #5895 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200225125343.31629-1-nyh@scylladb.com>	2020-02-26 08:12:19 +01:00
Nadav Har'El	e075eff915	alternator: complete implementation of ReturnValues parameter This patch completes the support for the ReturnValues parameter for the UpdateItem operation. This parameter has five settings - NONE, ALL_OLD, ALL_NEW, UPDATED_OLD and UPDATED_NEW. Before this patch we already supported NONE and ALL_OLD - and this patch completes the support for the three remaining modes: ALL_NEW, UPDATED_OLD and UPDATED_NEW. The patch also continues to improve test_returnvalues.py with additional corner cases discovered during the development. After this patch, only one xfailing test remains - testing updates to nested document paths, which we do not yet support (even without the ReturnValues parameter). After this patch, the support of ReturnValues is complete - for all operations (UpdateItem, PutItem and DeleteItem) and all of its possible settings. Fixes #5053 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200221224221.31237-5-nyh@scylladb.com>	2020-02-24 10:40:53 +01:00
Nadav Har'El	1e500a2a34	alternator: rjson: another variant of set_with_string_name() utility The rjson::set_with_string_name() utility function copies the given string into the JSON key. The existing implementation required that this input string be an std::string&, but a std::string_view would be fine too, and I want to use it in new code to avoid yet another unnecessary copy. Adding the overloads also exposes a few places where things were implicitly converted to std::string and now cause an ambiguity - and clearing up this ambiguity also allowed me to find places where this conversion was unnecessary. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200221224221.31237-4-nyh@scylladb.com>	2020-02-24 10:38:54 +01:00
Nadav Har'El	fa5c2a4f58	alternator: UpdateItem only deleting attribute shouldn't create item UpdateItem operations usually need to add a row marker: * An empty UpdateItem is supposed to create a new empty item (row). Such an empty item needs to have a row marker. * An UpdateItem to add an attribute x and then later an UpdateItem to remove this attribute x should leave an empty item behind. This means the first UpdateItem needed to add a row marker, so it will be left behind after the second UpdateItem. So the existing code always added a row marker in UpdateItem. However, there is one case where we should NOT create the row marker: When the UpdateItem operation only has attribute deletions, and nothing else, and it is applied to a key with no pre-existing item, DynamoDB does not create this item. So neither should we. This patch includes a new test for this test_update_item_non_existent, which passes on DynamoDB, failed on Alternator before this patch, and passes after the patch. Fixes #5862. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200221224221.31237-3-nyh@scylladb.com>	2020-02-24 10:38:10 +01:00
Nadav Har'El	e8cbbba653	alternator: partial implementation of ReturnValues parameter Before this patch, we only supported the ReturnValues=NONE setting of the PutItem, UpdateItem and DeleteItem operations. This patch also adds full support for the ReturnValues=ALL_OLD option in all three operations. This option directs Alternator to return the full old (i.e., pre-modification) contents of the item. We implement this as a RMW (read-modify-write) operation just as we do other RMW operations - i.e., by default we use LWT, to ensure that we really return the value of the item directly before the modification, the same value that would have been used in a conditional expression if there was one. NOTE: This implementation means one cannot use ReturnValues=ALL_OLD in forbid_rmw write isolation mode. One may theorize that if we only need the read-before-write for ReturnValues and not for a conditional expression, it should have been enough to use a separate read (as we do in unsafe_rmw isolation mode) before the write. But we don't have this "optimization" yet and I'm not sure it's a valid optimization at all - see discussion in a new issue #5851. This patch completes the ReturnValues support for the PutItem and DeleteItem operations. However, the third operation, UpdateItem, supports three more ReturnValues modes: UPDATED_OLD, ALL_NEW and UPDATED_NEW. We do not yet support those in this patch. If a user tries to use one of these three modes, an informative error message will be returned. The three tests for these three unimplemented settings continue to xfail, but the rest of the tests in test_returnvalues.py (except one test of nested attribute paths) now pass so their xfail flag is dropped. Refs #5053 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200219135658.7158-1-nyh@scylladb.com>	2020-02-21 08:32:47 +01:00
Nadav Har'El	405115fa5f	alternator: cleanup of get_string_attribute() function The get_string_attribute() function used attribute_value->GetString() to return an std::string. But this function does not actually return a std::string - it returns a char*, which gets implicitly converted to an std::string by looking for the first null character. This lookup is unnecessary, because rjson already knows the length of the string, and we can use it. This patch is just a cleanup and a very small performance improvement - I do not expect it fixes any bugs or changes anything functional, because JSON strings anyway cannot contain verbatim nulls. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200219101159.26717-1-nyh@scylladb.com>	2020-02-19 11:59:54 +01:00
Avi Kivity	6c7aa18238	Merge "Introduce schema::get_partitioner" from Piotr " Introduce schema::get_partitioner and use it instead of dht::global_partitioner. Fixes #5493 Tests: unit(dev, release, debug) " * 'per_table_partitioner_prep' of https://github.com/haaawk/scylla: (35 commits) cdc: stop using partitioners partitioner_test: stop calling set_global_partitioner storage_service: stop calling global_partitioner() mutation_writer_test: stop calling global_partitioner() schema: reduce number of global_partitioner() calls test_services: stop calling global_partitioner() sstable_utils: stop calling global_partitioner() sstable_resharding_test: stop depending on global partitioner sstable_mutation_test: stop calling global_partitioner() sstable_data_file_test: stop calling global_partitioner() random_schema: stop taking partitioner in constructor mutation_reader_test: stop calling global_partitioner() multishard_mutation_query_test: stop calling global_partitioner() row_level repair: stop calling global_partitioner() distribute_reader_and_consume_on_shards: don't take partitioner thrift: reduce global_partitioner() calls binary_search: stop calling global_partitioner() index_entry: stop calling global_partitioner() mc writer: stop calling global_partitioner() sstable: stop calling global_partitioner() ...	2020-02-17 18:12:53 +02:00
Piotr Jastrzebski	2d7532f87f	dht: add dht::get_token and replace all calls to dht::global_partitioner().get_token dht::get_token is better because it takes schema and uses it to obtain partitioner instead of using a global partitioner. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Jastrzebski	ca4a89d239	dht: add dht::decorate_key and replace all dht::global_partitioner().decorate_key with dht::decorate_key It is an improvement because dht::decorate_key takes schema and uses it to obtain partitioner instead of using global partitioner as it was before. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:06 +01:00
Piotr Jastrzebski	abd76e566f	dht::shard_of: stop calling global_partitioner() Take const schema& as a parameter of shard_of and use it to obtain partitioner instead of calling global_partitioner(). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:23:16 +01:00
Pavel Solodovnikov	d64fd52ae5	paging_state: switch from shared_ptr to lw_shared_ptr Change the way `service::pager::paging_state` is passed around from `shared_ptr` to `lw_shared_ptr`. It's safe since `paging_state` is final. Tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2020-02-16 17:23:36 +03:00
Nadav Har'El	b01b11c1f3	alternator: implement KeyConditionExpression This patch adds to Alternator's Query operation full support for the KeyConditionExpression parameter - a newer syntax for specifying which partition and which sort-key range are to be queried. The older syntax for the same thing, "KeyConditions", was already supported by Alternator. The patch also includes additional test cases for more corner cases discovered during the development. After this patch, all 47 test cases in test_key_condition_expression.py pass on Alternator (and, of course, also on DynamoDB). One interesting thing to note about this patch is that it does not include a new parser for the KeyConditionExpression syntax. It turns out that we need - to be fully compatible with DynamoDB - to use the already existing parser for ConditionExpression syntax, and then forbid certain things not allowed in KeyConditionExpression (you can see a lot of examples in code comments and in the tests included in this patch). Most importantly, allowing the full ConditionExpression syntax also means we allow completely useless parentheses on key conditions, e.g., '((p=:p) AND (c=:c))'. While the KeyConditionExpression documentation doesn't mention allowing these parentheses, DynamoDB does support them - and it turns out that boto3 uses them when you use its condition builders, as we do in one test case (test_query_key_condition_expression). Fixes #5037. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200213192509.32685-4-nyh@scylladb.com>	2020-02-16 11:22:30 +02:00
Nadav Har'El	15515b2cc1	alternator: more useful get_key_from_typed_value() utility function We had a get_key_from_typed_value() utility function to decode a JSON-encoded value with a known type (the JSON encoding is a map whose key is the type, the value always a string because all possible key types - string, bytes and number, are encoded as strings). However, the function was less useful than it could have been - it was missing one check for a malformed object (a check which only appeared in one of its callers), it unnecessarily received the column's expected type (all the callers passed it the given key column's type). The cleaned up function will be more useful for the following patch to support KeyConditionExpression, which wants to reuse it. While at it, this patch also uses rjson::to_string_view(it->value) instead of the less correct it->value.GetString() (the latter relies on null-termination, which is actually true for JSON strings, but there is no reason to rely on it). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200213192509.32685-3-nyh@scylladb.com>	2020-02-16 11:22:30 +02:00
Nadav Har'El	65d0a776c2	merge: alternator: Add keyspace per table This series implements keyspace-per-table approach for Alternator. The changes are as follows: - when a table is created, its keyspace is created first - after table deletion, its keyspace is deleted as well; works with views too, since these must be deleted before the base table is dropped - instead of SimpleStrategy, network topology is used Keyspaces are created with a prefix not legal from CQL - 'a#'. I validated that even though not reachable via CQL, keyspaces created with # character work well and produce correct directories, restarts work flawlessly too. Fixes #5611 Refs #5596 Tests: alternator(local, remote) Piotr Sarna (3): alternator: switch to keyspace-per-table approach alternator: move to NetworkTopologyStrategy alternator-test: add test for recreating a table	2020-02-16 11:22:30 +02:00
Piotr Sarna	fa4ddd2947	alternator: add validating write_isolation tag In order to prevent users from using incorrect write isolation configuration, a set of allowed values is introduced. When tagging a resource (which is considered rare), a tag will only be allowed if it belongs to the allowed set.	2020-02-13 13:51:31 +01:00
Piotr Sarna	7e6c9cad9a	alternator: move rmw_operation to a header rmw_operation is a class with a public interface, including a write_isolation enum and a fixed tag name for its configuration. For convenience, it's moved to a header file, so that code from executor.cc can use the definitions regardless of their position in the source file - it prevents reordering functions just to make sure that rmw_operation is defined before a function that uses its attributes.	2020-02-13 13:51:31 +01:00
Piotr Sarna	dca6c2c81d	alternator: move to NetworkTopologyStrategy Instead of SimpleStrategy, NetworkTopologyStrategy is used for setting up the replication configuration for alternator tables. Replication factor 3 is used along with a local datacenter, unless alternator discovers that it's running on a test cluster with fewer than 3 nodes - then, RF is reduced accordingly and a warning is emitted, which was also the case for SimpleStrategy.	2020-02-13 09:46:46 +01:00
Piotr Sarna	3eb6da224b	alternator: switch to keyspace-per-table approach Instead of a monolith alternator keyspace, each table creates its own keyspace, named in the following pattern: `a#TABLE_NAME`. The `a#` prefix contains an illegal CQL character in order to ensure that these keyspaces are never created via CQL.	2020-02-13 09:46:19 +01:00
Piotr Sarna	dcf54331ea	alternator: allow custom names for keyspaces The maybe_create_keyspace utility now accepts a parameter - the desired name for a newly created keyspace.	2020-02-13 09:16:37 +01:00
Nadav Har'El	b93204d6bf	Alternator: allow CreateTable with streams explicitly turned off While Alternator doesn't yet support creating a table with streams (i.e., CDC) turned on, we should only fail the creation if streams were really turned on. If the StreamSpecification option exists, but does not ask to turn on streams, we should not fail the creation - and this patch fixes this. This patch also adds two tests - one where StreamSpecification is passed but does not ask to turn on streams (so table creation should succeed), and another test which explicitly requests to turn on streams. The second test still xfails on Alternator, and should continue to do so until we implement streams (we do not want to silently ignore a request to turn on streams). Fixes #5796 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200212100546.16337-1-nyh@scylladb.com>	2020-02-12 17:29:02 +01:00
Piotr Sarna	f4e51a96ca	alternator: replace overloaded with overloaded_functor Turns out we already have a utility header for a visitor with overloaded lambdas. This patch purges the explicit reimplementation of the same trick and uses the existing class instead. Message-Id: <60c0b9a978f8208b188ef6ddc0564cb133bed707.1581496049.git.sarna@scylladb.com>	2020-02-12 14:21:42 +02:00
Gleb Natapov	38fcab3db4	alternator: pass tracing state explicitly instead of relying on it being in the client_state Multiple requests can use the same client_state simultaneously, so it is not safe to use it as a container for a tracing state which is per request. This is not yet an issue for the alternator since it creates a new client_state object for each request, but first of all it should not, and second, the trace state will be dropped from the client_state by a later patch.	2020-02-10 14:50:55 +02:00
Piotr Sarna	4a9536b7c1	alternator: add configuring write isolation policy via tags Until now, write isolation policy was hardcoded to always enforcing LWT. From now on, setting a tag via UpdateTags request or during table creation will associate a policy with given table. The tag key is 'system:write_isolation' and its value can be one of: * 'f' - forbid RMW * 'a' - always enforce RMW * 'o' - only RMW writes will go through LWT * 'u' - unsafe RMW (to be deprecated/eradicated)	2020-02-06 10:26:26 +01:00
Piotr Sarna	0479a1bf67	alternator: make _write_isolation a protected attribute No useful semantic changes yet, but it will help produce better diffs for future patches.	2020-02-06 10:04:34 +01:00
Piotr Sarna	51c14cb1ce	alternator: fix overwriting tags Tagging a resource with a tag key that already exists should result in overwriting the old value. It wasn't the case, so it's now fixed and an appropriate test is added.	2020-02-06 10:04:34 +01:00
Piotr Sarna	ed940f000d	alternator: return tags for a table via const reference The signature of the helper function is changed, so that it's possible to acquire a const reference of the tags, instead of being forced to get a copy of the whole map (potentially large).	2020-02-06 10:04:34 +01:00
Nadav Har'El	9fd9ec14c2	storage_proxy: make it into a peering sharded service We consider globals like service::get_storage_proxy() a bad idea, and would like to reduce their use as much as possible - and eventually, eliminate it completely. One easy case to fix is when we already have a shard-local proxy, but now we need the sharded object, to invoke_on() something on it. In this patch, we turn storage_proxy into a peering_sharded_service. This means that if you already have a storage_proxy, you can call its container() function to get the sharded<storage_proxy>, without needing to call the global service::get_storage_proxy(). We found a few such cases in storage_proxy itself, and in Alternator, and fixed them to use container() instead of the global function. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-02-05 21:14:18 +02:00
Nadav Har'El	95351016fd	alternator: use LWT timestamp - in BatchWriteItems too A previous patch fixed Alternator's writes to use the timestamp provided by LWT instead of the current timestamp. That patch fixed the PutItem, DeleteItem and UpdateItem operations - and this patch fixes the remaining write operation: BatchWriteItems. So, Fixes #5653. Unfortunately, the requirements of both BatchWriteItems and LWT make the resulting code - and this patch - somewhat inelegant. BatchWriteItems requires that we prepare all the operations first - failing if any of them has an error. Before this patch, the result of this preparation was an array of mutations, which in a second step we wrote to the database. But we can no longer use mutations for the result of the first step, because creating a mutation requires knowing the timestamp, which we don't know during the preparation phase - we will only know it during the later LWT operation. So now we need to invent a new intermediate format between the request and the mutation. This intermediate format is further complicated by the need to send it between shards (for LWT's shard forwarding) so it cannot, for example, contain a reference to a schema. The fact that different sub-operations need to be sent to different shards, and that different sub-operations may write to different tables, further complicates the book-keeping and gives us a bunch of funky-typed maps. But eventually it all fits together. After this patch, as before this patch, the same code (now called put_or_delete_item), is used to implement both the PutItem and DeleteItem stand-alone operations, and the BatchWriteItems operation which includes a whole list of these PutItem and DeleteItem operations. This patch also includes two more tests in test_batch.py, which test two more corner cases we haven't tested before: One tests the capability of BatchWriteItems to write to more than one table. The other tests that BatchWriteItems can write an empty item (it is not surprising that it does, but we do have special code for this case, so we should test it). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2020-02-05 21:14:18 +02:00


197 Commits