scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 13:45:53 +00:00

Author	SHA1	Message	Date
Botond Dénes	f3735dc8e0	Merge 'utils: add fmt::formatter for utils types' from Kefu Chai before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for * utils::human_readable_value * std::strong_ordering * std::weak_ordering * std::partial_ordering * utils::exception_container Refs https://github.com/scylladb/scylladb/issues/13245 Closes scylladb/scylladb#17710 * github.com:scylladb/scylladb: utils/exception_container: add fmt::formatter for exception_container utils/human_readable: add fmt::formatter for human_readable_value utils: add fmt::formatter for std::strong_ordering and friends	2024-03-12 13:27:37 +02:00
Botond Dénes	3a7364525f	Merge 'test/alternator: improve metrics tests' from Nadav Har'El This small series improves the Alternator tests for metrics: 1. Improves some comments in the test. 2. Restores a test that was previously hidden by two tests having the same name. 3. Adds tests for latency histogram metrics. Closes scylladb/scylladb#17623 * github.com:scylladb/scylladb: test/alternator: tests for latency metrics test/alternator: improve comments and unhide hidden test	2024-03-12 09:13:17 +02:00
Kefu Chai	007d7f1355	utils: add fmt::formatter for std::strong_ordering and friends before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for * std::strong_ordering * std::weak_ordering * std::partial_ordering and their operator<<:s are moved to test/lib/test_utils.{hh,cc}, as they are only used by Boost.test. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-03-12 14:53:55 +08:00
Tomasz Grabiec	47a66d0150	Merge 'Handle tablet migration failure in wrapping-up stages' from Pavel Emelyanov There are four stages left to handle: cleanup, cleanup_target, end_migration and revert_migration. All are handling removed nodes already, so the PR just extends the test. fixes: #16527 Closes scylladb/scylladb#17684 * github.com:scylladb/scylladb: test/tablets_migration: Test revert_migration failure handling test/tablets_migration: Test end_migration failure handling test/tablets_migration: Test cleanup_target failure handling test/tablets_migration: Test cleanup failure handling test/tablets_migration: Prepare for do_... stages test/tablets_migration: Add ability to removenode via any other node test/tablets_migration: Wrap migration stages failing code into a helper class storage_service: Add failure injection to crash cleanup_tablet	2024-03-12 00:20:56 +01:00
Asias He	ebc0ab94e5	repair: Add ranges option support for tablet repair The management tool, e.g., scylla manager, needs the ranges option to select which ranges to repair on a node to schedule repair jobs. This patch adds ranges option support. E.g., curl -X POST "http://127.0.0.1:10000/storage_service/repair_async/ks1?ranges=-4611686018427387905:-1,4611686018427387903:9223372036854775807" Fixes: #17416 Tests: test_tablet_repair_ranges_selection Closes scylladb/scylladb#17436	2024-03-11 20:03:12 +02:00
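The `ranges` option in the example request above is a comma-separated list of `start:end` token pairs. A minimal sketch of how a client might split such a string into integer pairs follows; `parse_ranges` is a hypothetical helper written for illustration, not ScyllaDB's own parser, and it makes no assumption about whether the bounds are inclusive.

```python
def parse_ranges(option: str) -> list[tuple[int, int]]:
    """Split 'start:end,start:end' into a list of (start, end) integer pairs."""
    ranges = []
    for item in option.split(","):
        # partition() splits on the first ':' only; a leading '-' stays with the number
        start, sep, end = item.partition(":")
        if not sep:
            raise ValueError(f"expected 'start:end', got {item!r}")
        ranges.append((int(start), int(end)))
    return ranges


# The value used in the curl example of the commit message
example = "-4611686018427387905:-1,4611686018427387903:9223372036854775807"
parsed = parse_ranges(example)
```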
Nadav Har'El	d207962e40	test/alternator: tests for latency metrics In test/alternator/test_metrics.py we had tests for the operation-count metrics for different Alternator API operations, but not for the latency histograms for these same operations. So this patch adds the missing tests (and removes a TODO asking to do that). Note that only a subset of the operations - PutItem, GetItem, DeleteItem, UpdateItem, and GetRecords - currently have a latency histogram, and this test verifies this. We have an issue (Refs #17616) about adding latency histograms for more operations - at which point we will be able to expand this test for the additional operations. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-03-11 19:26:59 +02:00
Nadav Har'El	970c2dc7a6	test/alternator: improve comments and unhide hidden test The original goal of this patch was to improve comments in test/alternator/test_metrics.py, but while doing that I discovered that one of the test functions was hidden by a second test with the same name! So this patch also renames the second test. The test continues to work after this patch - the hidden test was successful. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-03-11 19:26:59 +02:00
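The "hidden test" effect described above follows from how Python binds names: a second `def` with the same name rebinds it, so anything that collects tests by scanning the module namespace only sees the later function. A small self-contained sketch, using a made-up module and the made-up test name `test_metric`:

```python
import types

# Source of a hypothetical test module with two functions sharing one name
source = '''
def test_metric():
    return "first"

def test_metric():
    return "second"
'''

module = types.ModuleType("hypothetical_test_module")
exec(source, module.__dict__)

# Only one binding survives, and it points at the second definition
collected = [name for name in vars(module) if name.startswith("test_")]
result = module.test_metric()
```

Here `collected` holds a single name and `result` comes from the second definition; the first function never runs, which is why renaming one of the two tests makes both visible.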
Botond Dénes	7d31093d4b	Merge 'storage_service/ownership: handle requests when tablets are enabled' from Patryk Wróbel Before this change, when user tried to utilize 'storage_service/ownership/{keyspace}' API with keyspace parameter that uses tablets, then internal error was thrown. The code was calling a function, that is intended for vnodes: get_vnode_effective_replication_map(). This commit introduces graceful handling of such scenario and extends the API to allow passing 'cf' parameter that denotes table name. Now, when keyspace uses tablets and cf parameter is not passed a descriptive error message is returned via BAD_REQUEST. Users cannot query ownership for keyspace that uses tablets, but they can query ownership for a table in a given keyspace that uses tablets. Also, new tests have been added to test/rest_api/test_storage_service.py and to test/topology_experimental_raft/test_tablets.py in order to verify the behavior with and without tablets enabled. Fixes: https://github.com/scylladb/scylladb/issues/17342 Closes scylladb/scylladb#17405 * github.com:scylladb/scylladb: storage_service/ownership: discard get_ownership() requests when tablets enabled storage_service/ownership/{keyspace}: handle requests when tablets are enabled locator/effective_replication_map: make 'get_ranges(inet_address ep)' virtual locator/tablets: add tablet_map::get_sorted_tokens() pylib/rest_client.py: add ownership API to ScyllaRESTAPIClient rest_api/test_storage_service: add simplistic tests of ownership API for vnodes	2024-03-11 14:55:26 +02:00
Kefu Chai	1ab30fc306	clustering_bounds_comparator: add fmt::formatter for bound_{kind,view} before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `bound_kind` and `bound_view`, and drop the latter's operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17706	2024-03-11 11:37:48 +02:00
Patryk Wrobel	9eb91b5526	storage_service/ownership: discard get_ownership() requests when tablets enabled This change introduces a logic, that is responsible for checking if tablets are enabled for any of keyspaces when get_ownership() is invoked. Without it, the result would be calculated based solely on sorted_tokens() which was invalid. Refs: scylladb#17342 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-03-11 09:52:25 +01:00
Patryk Wrobel	51da80da7d	storage_service/ownership/{keyspace}: handle requests when tablets are enabled Before this change, when user tried to utilize 'storage_service/ownership/{keyspace}' API with keyspace parameter that uses tablets, then internal error was thrown. The code was calling a function, that is intended for vnodes: get_vnode_effective_replication_map(). This commit introduces graceful handling of such scenario and extends the API to allow passing 'cf' parameter that denotes table name. Now, when keyspace uses tablets and cf parameter is not passed a descriptive error message is returned via BAD_REQUEST. Users cannot query ownership for keyspace that uses tablets, but they can query ownership for a table in a given keyspace that uses tablets. Also, new tests have been added to test/rest_api/test_storage_service.py and to test/topology_experimental_raft/test_tablets.py in order to verify the behavior with and without tablets enabled. Refs: scylladb#17342 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-03-11 09:52:23 +01:00
Patryk Wrobel	a39a5b671e	pylib/rest_client.py: add ownership API to ScyllaRESTAPIClient This change adds a member function that can be used to access 'storage_service/ownership' API. It will be used by tests that need to access this API. Refs: scylladb#17342 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-03-11 09:50:20 +01:00
Patryk Wrobel	dea76c4763	rest_api/test_storage_service: add simplistic tests of ownership API for vnodes This change is intended to introduce tests for vnodes for the following API paths: - 'storage_service/ownership' - 'storage_service/ownership/{keyspace}' In next patches the logic that is tested will be adjusted to work correctly when tablets are enabled. This is a safety net that ensures that the logic is not broken. Refs: scylladb#17342 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-03-11 09:50:20 +01:00
Kefu Chai	38ae52d5cd	add fmt::formatter for reader_permit::state and reader_resources before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for * reader_permit::state * reader_resources Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17707	2024-03-11 09:55:51 +02:00
Pavel Emelyanov	feae470475	test/tablets_migration: Test revert_migration failure handling This stage is also the error path that starts from write_both_read_old, so check this failure in two steps -- first fail the latter stage in one of the nodes, then fail the former in another. For that one more node in the cluster is needed. Also, to avoid name conflicts, the do_revert_migration pseudo stage name is used. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 08:16:13 +03:00
Pavel Emelyanov	c3d96b1a86	test/tablets_migration: Test end_migration failure handling This stage is pure barrier. Barriers already take ignored nodes into account, so do the fail-injector, so just wire the stage name into the test. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 08:16:13 +03:00
Pavel Emelyanov	180446e7b8	test/tablets_migration: Test cleanup_target failure handling This stage is error path, so in order to fail it we need to fail some other stage prior to that. This leads to the testing sequence of 1. fail streaming via source node 2. stop and remove source node to let state machine proceed 3. fail cleanup_target on the destination node 4. stop and remove destination node First thing to note here, is that the test doesn't fail source node for cleanup_target stage, symmetrically to how it does for cleanup stage. Next, since we're removing two nodes, the cluster is equipped with more nodes to have raft quorum. Finally, since remove of source node doesn't finish until tablet migration finishes, it's impossible to remove destination node via the same node-0, so the 2nd removenode happens via node-3. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 08:16:13 +03:00
Pavel Emelyanov	724c79ecf6	test/tablets_migration: Test cleanup failure handling The handling itself is already there -- if the leaving node is excluded the cleanup stage resolves immediately. So just add a code that validates that. Also, skip testing of pending replica failure during cleanup stage, as it doesn't really participate in it any longer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 08:16:13 +03:00
Pavel Emelyanov	ccefb7f21f	test/tablets_migration: Prepare for do_... stages The tablets migration test is parametrized with stage name to inject failure in. Internal class node_failer uses this parameter as is when injecting a failure into scylla barrier handler. Next patch will need to extend the test with revert_migration value and add handling of this name to node_failer class. The node_failer class, in turn, will want to instantiate two other instances of the same class -- one to fail the write_both_read_old stage, and the other one to fail the revert_migration barrier. So internally the class will need to tell revert_migration value as full test parameter from revert_migration as barrier-only parameter. This test adds the ability to add do_ prefix to node_failer parameter to tell full test from barrier-only. When injecting a failure into scylla the do_ prefix needs to be cut off, since scylla still needs to fail the barrier named revert_migration, not do_revert_migration. Also split the long line while at it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 07:56:58 +03:00
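The `do_` prefix convention described above can be sketched as follows. The function name `barrier_stage_name` is hypothetical and only illustrates the idea that the prefix marks a full-test parameter and is cut off before the stage name is handed to the injection; the real `node_failer` class may be organized differently.

```python
def barrier_stage_name(param: str) -> str:
    """Return the stage name to inject into the barrier handler.

    Hypothetical helper: a leading 'do_' marks the full-test variant of a
    stage and is stripped, since the barrier itself is still named without it.
    """
    return param.removeprefix("do_")  # str.removeprefix requires Python 3.9+


full_test = barrier_stage_name("do_revert_migration")
barrier_only = barrier_stage_name("revert_migration")
```

Both calls yield `revert_migration`, so the caller can distinguish the two test modes by the parameter while the injected barrier name stays the same.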
Pavel Emelyanov	abbd22cb90	test/tablets_migration: Add ability to removenode via any other node Currently the test calls removenode via node-0 in the cluster, which is always alive. Next test case will need to call removenode on some other node (more details in that patch later). refs: #17681 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 07:56:55 +03:00
Pavel Emelyanov	5d3291f322	test/tablets_migration: Wrap migration stages failing code into a helper class One of the next stages will need to use two of them at the same time and it's going to be easier if the failing code is encapsulated. No functional changes here, just large portions of code and local variables are moved into class and its methods. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 07:56:55 +03:00
Botond Dénes	9f97d21339	Merge 'Enhance perf-simple-query test' from Pavel Emelyanov While measuring #17149 with this test some changes were applied, here they are - keep initial_tablets number in output json's parameters section - disable auto compaction - add control over the amount of sstables generated for --bypass-cache case Closes scylladb/scylladb#17473 * github.com:scylladb/scylladb: perf_simple_query: Add --memtable-partitions option perf_simple_query: Disable auto compaction perf_simple_query: Keep number of initial tablets in output json	2024-03-08 15:21:04 +02:00
Kefu Chai	079d70145e	raft: add fmt::formatter for raft tracker types before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for * raft::election_tracker * raft::votes * raft::vote_result and drop their operator<<:s. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17670	2024-03-08 15:19:37 +02:00
Botond Dénes	630be97d2f	Merge 'tools/scylla-nodetool: print hostname if --resolve-ip is passed to "ring"' from Kefu Chai before this change, "ring" subcommand has two issues: 1. `--resolve-ip` option accepts a boolean argument, but this option should be a switch, which does not accept any argument at all 2. it always prints the endpoint no matter if `--resolve-ip` is specified or not. but it should print the resolved name, instead of an IP address if `--resolve-ip` is specified. in this change, both issues are addressed. and the test is updated accordingly to exercise the case where `--resolve-ip` is used. Closes scylladb/scylladb#17553 * github.com:scylladb/scylladb: tools/scylla-nodetool: print hostname if --resolve-ip is passed to "ring" test/nodetool: calc max_width from all_hosts test/nodetool: keep tokens as Host's member test/nodetool: remove unused import	2024-03-08 15:15:19 +02:00
Kefu Chai	8ca672a02c	test/pylib: return better error if self.create_server() raises in `ScyllaServer::add_server()`, `self.create_server()` is called to create a server, but if it raises, we would reference a local variable of `server` which is not bound to any value, as `server` is not assigned at that moment. if `ScyllaServer` is used by `ScyllaClusterManager`, we would not be able to see the real exception apart from the error like ``` cannot access local variable 'server' where it is not associated with a value ``` which is but the error from Python runtime. in this change, `server` is always initialized, and we check for None, before dereference it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17693	2024-03-08 15:10:27 +02:00
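The failure mode described above can be reproduced in a few lines. This is a simplified sketch with hypothetical function names, not the actual `add_server()` code: when the call that would assign `server` raises, any later reference to `server` in the error path raises `UnboundLocalError`, which hides the original exception. Initializing the variable to `None` and checking it keeps the real error visible.

```python
def create_server():
    # Stand-in for a creation step that fails (hypothetical)
    raise RuntimeError("hypothetical failure while creating the server")


def add_server_buggy():
    try:
        server = create_server()
    except Exception:
        server.stop()  # 'server' was never bound: raises UnboundLocalError
        raise


def add_server_fixed():
    server = None  # always bound, so the error path can test it safely
    try:
        server = create_server()
    except Exception:
        if server is not None:
            server.stop()
        raise  # the original RuntimeError propagates


def raised_type(func):
    """Return the type of the exception that func() raises, or None."""
    try:
        func()
    except Exception as exc:
        return type(exc)
    return None
```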
Botond Dénes	505f137cc9	Merge 'Make object_store suite use ManagerClient' from Pavel Emelyanov The test cases in this suite need to start scylla with custom config options, restart it and call API on it. By the time the suite was created all this wasn't possible with any library facility, so the suite carries its version of managed_cluster class that piggy-backs cql-pytest scylla starting. Now test.py has pretty flexible manager that provides all the scylla cluster management object_store suite needs. This PR makes the suite use the manager client instead of the home-brew managed_cluster thing refs: #16006 fixes: #16268 Closes scylladb/scylladb#17292 * github.com:scylladb/scylladb: test/object_store: Remove unused managed_cluster (and other stuff) test/object_store: Use tmpdir fixture in flush-retry case test/object_store: Turn flush-retry case to use ManagerClient test/object_store: Turn "misconfigured" case to use ManagerClient test/object_store: Turn garbage-collect case to use ManagerClient test/object_store: Turn basic case to use ManagerClient test/object_store: Prepare to work with ManagerClient	2024-03-08 15:04:46 +02:00
Kamil Braun	ae954fb2ec	test: unflake test_tablets_removenode These tests are inserting data into RF=3 tables, but used the default consistency level which is taken from the default execution profile which is set to LOCAL_QUORUM. The tests would then read with CL=ONE, so we cannot give a guarantee that some of the data won't be missed. Fix this by inserting the data with CL=ALL. (Do it for all RF cases for simplicity.) Fixes scylladb/scylladb#17695 Closes scylladb/scylladb#17700	2024-03-08 12:47:47 +01:00
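The reasoning above rests on the usual overlap rule: a read is guaranteed to see a write only when the number of replicas written plus the number of replicas read exceeds the replication factor. A minimal sketch of that arithmetic, assuming a single datacenter (so LOCAL_QUORUM behaves like QUORUM) and using hypothetical helper names:

```python
def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must acknowledge an operation at a given level."""
    if level == "ONE":
        return 1
    if level in ("QUORUM", "LOCAL_QUORUM"):  # single-datacenter assumption
        return rf // 2 + 1
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def read_sees_write(write_level: str, read_level: str, rf: int) -> bool:
    """True when every read replica set must overlap every write replica set."""
    return replicas_required(write_level, rf) + replicas_required(read_level, rf) > rf


before_fix = read_sees_write("LOCAL_QUORUM", "ONE", rf=3)  # 2 + 1 is not > 3
after_fix = read_sees_write("ALL", "ONE", rf=3)            # 3 + 1 > 3
```

With RF=3, a LOCAL_QUORUM write reaches two replicas, so a CL=ONE read may land on the third; writing with CL=ALL removes that possibility.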
Kamil Braun	76fb902858	test: unflake test_topology_remove_garbage_group0 The test is booting nodes, and then immediately starts shutting down nodes and removing them from the cluster. The shutting down and removing may happen before driver manages to connect to all nodes in the cluster. In particular, the driver didn't yet connect to the last bootstrapped node. Or it can even happen that the driver has connected, but the control connection is established to the first node, and the driver fetched topology from the first node when the first node didn't yet consider the last node to be normal. So the driver decides to close connection to the last node like this: ``` 22:34:03.159 DEBUG> [control connection] Removing host not found in peers metadata: <Host: 127.42.90.14:9042 datacenter1> ``` Eventually, at the end of the test, only the last node remains, all other nodes have been removed or stopped. But the driver does not have a connection to that last node. Fix this problem by ensuring that: - all nodes see each other as NORMAL, - the driver has connected to all nodes at the beginning of the test, before we start shutting down and removing nodes. Fixes scylladb/scylladb#16373 Closes scylladb/scylladb#17676	2024-03-08 10:08:09 +01:00
Nadav Har'El	ea53db379f	Merge 'tools/scylla-nodetool: listsnapshot: make it compatible with origin' from Botond Dénes The following incompatibilities were identified by `listsnapshots_test.py` in dtests: * Command doesn't bail out when there are no snapshots, instead it prints meaningless empty report * Formatting is incompatible Both are fixed in this mini-series. Closes scylladb/scylladb#17541 * github.com:scylladb/scylladb: tools/scylla-nodetool: listsnapshots: make the formatting compatible with origin's tools/scylla-nodetool: listsnapshots: bail out if there are no snapshots	2024-03-08 10:08:09 +01:00
Kefu Chai	de276901f2	tools/scylla-nodetool: print hostname if --resolve-ip is passed to "ring" before this change, "ring" subcommand has two issues: 1. `--resolve-ip` option accepts a boolean argument, but this option should be a switch, which does not accept any argument at all 2. it always prints the endpoint no matter if `--resolve-ip` is specified or not. but it should print the resolved name, instead of an IP address if `--resolve-ip` is specified. in this change, both issues are addressed. and the test is updated accordingly to exercise the case where `--resolve-ip` is used. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-03-07 22:29:31 +08:00
Kefu Chai	d927ee8d8f	test/nodetool: calc max_width from all_hosts for better readability. as `token_to_endpoint` is but a derived variable from `all_hosts`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-03-07 22:28:54 +08:00
Kefu Chai	4a748c7fb0	test/nodetool: keep tokens as Host's member to be more consistent with the test_status.py. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-03-07 22:28:54 +08:00
Kefu Chai	aefc385786	test/nodetool: remove unused import and add two empty lines in between global functions Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-03-07 22:28:54 +08:00
Botond Dénes	b69ee6bc27	Merge 'Fix load-and-stream for tablets' from Raphael "Raph" Carvalho It might happen that multiple tablets co-habit the same shard, so we want load-and-stream to jump into a new streaming session for every tablet, such that the receiver will have the data properly segregated. That's a similar treatment we gave to repair. Today, load-and-stream fails due to sstables spanning more than 1 tablet in the receiver. Synchronization with migration is done by taking replication map, so migrations cannot advance while streaming new data. A bug was fixed too, where data must be streamed to pending replicas too, to handle case where migration is ongoing and new data must reach both old and new replica set. A test was added stressing this synchronization path. Another bug was fixed in sstable loading, which expected sharder to not be invalidated throughout the operation, but that breaks during migrations. Fixes #17315. Closes scylladb/scylladb#17449 * github.com:scylladb/scylladb: test: test_tablets: Add load-and-stream test sstables_loader: Stream to pending tablet replica if needed sstables_loader: Implement tablet based load-and-stream sstables_loader: Virtualize sstable_streamer for tablet sstables_loader: Avoid reallocations in vector sstable_loader: Decouple sstable streaming from selection sstables_loader: Introduce sstable_streamer Fix online SSTable loading with concurrent tablet migration	2024-03-07 14:18:30 +02:00
Botond Dénes	09068d20ea	tools/scylla-nodetool: scrub: make keyspace parameter optional When no keyspace is provided, request all keyspaces from the server, then scrub all of them. This is what the legacy nodetool does, for some reason this was missed when re-implementing scrub. Closes scylladb/scylladb#17495	2024-03-07 11:15:46 +02:00
Tomasz Grabiec	ec6ed18b5c	Merge 'Handle tablet migration failure in barrier stages' from Pavel Emelyanov There are 4 barrier-only stages when migrating a tablet and the test needs to fail pending/leaving replica that handles it in order to validate how coordinator handles dead node. Failing the barrier is done by suspending it with injection code and stopping the node without waking it up. The main difficulty here is how to tell one barrier RPC call from another, because they don't have anything onboard that could tell which stage the barrier is run for. This PR suggests that barrier injection code looks directly into the system.tablets table for the transition stage, the stage is already there by the time barrier is about to ack itself over RPC. refs: #16527 Closes scylladb/scylladb#17450 * github.com:scylladb/scylladb: topology.tablets_migration: Handle failed use_new topology.tablets_migration: Handle failed write_both_read_new topology.tablets_migration: Handle failed write_both_read_old topology.tablets_migration: Handle failed allow_write_both_read_old test/tablets_migration: Add conditional break-point into barrier handler replica: Add helper to read tablet transition stage topology_coordinator: Add action_failed() helper	2024-03-07 09:56:13 +01:00
Botond Dénes	5dfaa69bde	tools/scylla-nodetool: listsnapshots: make the formatting compatible with origin's The author (me) tried to be clever and fix the formatting, but then he realized this just means a lot of unnecessary fighting with tests. So this patch makes the formatting compatible with that of the legacy nodetool: * Use compatible rounding and precision formatting * Use incorrect unit (KB instead of KiB) * Align numbers to the left * Add trailing white-space to "Snapshot Details: "	2024-03-07 03:54:54 -05:00
Botond Dénes	80483ba732	tools/scylla-nodetool: listsnapshots: bail out if there are no snapshots Print a message and exit, don't continue to output the snapshot table. This is what the legacy nodetool does too.	2024-03-07 03:54:54 -05:00
Botond Dénes	ac15e4c109	tools/scylla-nodetool: repair: accept and ignore -full/--full and -j/--job-threads These two parameters are not used by the native nodetool, because ScyllaDB itself doesn't support them. These should be just ignored and indeed there was a unit test checking that this is the case. However, due to a mistake in the unit test, this was not actually tested and nodetool complained when seeing these params. This patch fixes both the test and the native nodetool. Closes scylladb/scylladb#17477	2024-03-07 11:53:50 +03:00
Botond Dénes	75fe2f5c3a	Merge 'test: rest_api: fix tests to work with tablets' from Aleksandra Martyniuk Fix test_compaction_task.py, test_repair_task.py and test_storage_service.py to work with tablets. Fixes: #17338. Closes scylladb/scylladb#17474 * github.com:scylladb/scylladb: test: rest_api: enable tablets by default test: fix indentation and delete unused this_dc param test: rest_api: fix test_storage_service.py test: rest_api: fix test_repair_task.py test: rest_api: fix test_compaction_task.py test: rest_api: use skip_without_tablets fixture test: rest_api: add some tablet related fixtures	2024-03-07 10:00:09 +02:00
Michał Chojnowski	f9e97fa632	sstables: fix a use-after-free in key_view::explode() key_view::explode() contains a blatant use-after-free: unless the input is already linearized, it returns a view to a local temporary buffer. This is rare, because partition keys are usually not large enough to be fragmented. But for a sufficiently large key, this bug causes a corrupted partition_key down the line. Fixes #17625 Closes scylladb/scylladb#17626	2024-03-07 09:07:07 +02:00
Kefu Chai	64e14d21db	locator/tablets: add fmt::formatter for tablet_* before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for * tablet_id * tablet_replica * tablet_metadata * tablet_map their operator<<:s are dropped Refs scylladb/scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17504	2024-03-07 09:00:49 +03:00
Pavel Emelyanov	52a1b2c413	Merge 'mutation: add fmt::formatter for mutation types' from Kefu Chai before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for * position_range * mutation_fragment * range_tombstone_stream * mutation_fragment_v2::printer Refs #13245 Closes scylladb/scylladb#17521 * github.com:scylladb/scylladb: mutation: add fmt::formatter for position_range mutation: add fmt::formatter for mutation_fragment and range_tombstone_stream mutation: add fmt::formatter for mutation_fragment_v2::printer	2024-03-07 08:56:21 +03:00
Pavel Emelyanov	df6048adec	topology.tablets_migration: Handle failed use_new This stage doesn't need any special treatment, because we cannot revert to old replicas and should proceed normally. The barrier itself won't get stuck, because it already handles excluded/ignored nodes. Just make the test validate it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-07 08:47:26 +03:00
Pavel Emelyanov	fb7428c560	topology.tablets_migration: Handle failed write_both_read_new Two options here -- go revert to old replicas by jumping into cleanup_target stage or proceed normally. The choice depends on which replica set has fewer dead nodes. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-07 08:47:26 +03:00
Pavel Emelyanov	324eaaf873	topology.tablets_migration: Handle failed write_both_read_old At this stage it can happen that target replica got some writes, so its tablet needs to be cleaned up, so jump to cleanup_target stage. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-07 08:47:26 +03:00
Pavel Emelyanov	f81e0b2e88	topology.tablets_migration: Handle failed allow_write_both_read_old This is early stage, just proceed to existing revert_migration Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-07 08:47:26 +03:00
Pavel Emelyanov	5bb1597a30	test/tablets_migration: Add conditional break-point into barrier handler There are several transition stages that are executed by the topology coordinator with the help of barrier-and-drain raft commands. For the test to stop and remove a node while handling this stage it must inject a break-point into barrier handler, wait for it to happen and then stop the node without resuming the break-point. Then removenode from the cluster. The break-point suspends barrier handling when a specific tablet is in specific transition stage. Tablet ID and desired stage are configured via injector parameters. With today's error-injection facilities the way to suspend code execution is with injecting a lambda that waits for a message from the injection engine. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-07 08:47:26 +03:00
Botond Dénes	8dd6fe75e7	Merge 'tools/scylla-nodetool: implement info ' from Kefu Chai Refs #15588 Closes scylladb/scylladb#17498 * github.com:scylladb/scylladb: tools/scylla-nodetool: implement info test/nodetool: move format_size into utils.py	2024-03-07 07:14:51 +02:00
Kamil Braun	19b816bb68	Merge 'Migrate system_auth to raft group0' from Marcin Maliszkiewicz This patch series makes all auth writes serialized via raft. Reads stay eventually consistent for performance reasons. To make transition to new code easier data is stored in a newly created keyspace: system_auth_v2. Internally the difference is that instead of executing CQL directly for writes we generate mutations and then announce them via raft group0. Per commit descriptions provide more implementation details. Refs https://github.com/scylladb/scylladb/issues/16970 Fixes https://github.com/scylladb/scylladb/issues/11157 Closes scylladb/scylladb#16578 * github.com:scylladb/scylladb: test: extend auth-v2 migration test to catch stale static test: add auth-v2 migration test test: add auth-v2 snapshot transfer test test: auth: add tests for lost quorum and command splitting test: pylib: disconnect driver before re-connection test: adjust tests for auth-v2 auth: implement auth-v2 migration auth: remove static from queries on auth-v2 path auth: coroutinize functions in password_authenticator auth: coroutinize functions in standard_role_manager auth: coroutinize functions in default_authorizer storage_service: add support for auth-v2 raft snapshots storage_service: extract getting mutations in raft snapshot to a common function auth: service: capture string_view by value alternator: add support for auth-v2 auth: add auth-v2 write paths auth: add raft_group0_client as dependency cql3: auth: add a way to create mutations without executing cql3: run auth DML writes on shard 0 and with raft guard service: don't lose service_level_controller when bouncing client_state auth: put system_auth and users consts in legacy namespace cql3: parametrize keyspace name in auth related statements auth: parametrize keyspace name in roles metadata helpers auth: parametrize keyspace name in password_authenticator auth: parametrize keyspace name in standard_role_manager auth: remove redundant consts auth::meta::*::qualified_name auth: parametrize keyspace name in default_authorizer db: make all system_auth_v2 tables use schema commitlog db: add system_auth_v2 tables db: add system_auth_v2 keyspace	2024-03-06 10:11:33 +01:00


6480 Commits