scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Kefu Chai	aea6cd0b2d	test/tablets: do not compare signed and unsigned this change should silence following warning: ``` test/boost/tablets_test.cc:1600:27: error: comparison of integers of different signs: 'int' and 'unsigned int' [-Werror,-Wsign-compare] 19:47:04 for (int i = 0; i < smp::count * 20; i++) { 19:47:04 ~ ^ ~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-02 20:49:21 +08:00
Botond Dénes	dc8e13baed	Merge 'Move some tablets tests from topology_custom to cql-pytest' from Pavel Emelyanov The latter suite is now tablets-aware and tablets cases from the former one can happily work with single shared scylla instance Closes scylladb/scylladb#17101 * github.com:scylladb/scylladb: test/topology_custom: Remove test_tablets.py test/topology: Move test_tablet_change_initial_tablets test/topology: Move test_tablet_explicit_disabling test/topology: Move test_tablet_default_initialization test/topology: Move test_tablet_change_replication_strategy test/topology: Move test_tablet_change_replication_vnode_to_tablets cql-pytest: Add skip_without_tablets fixture	2024-02-01 16:28:43 +02:00
Kamil Braun	c911bf1a33	test_raft_snapshot_request: fix flakiness (again) At the end of the test, we wait until a restarted node receives a snapshot from the leader, and then verify that the log has been truncated. To check the snapshot, the test used the `system.raft_snapshots` table, while the log is stored in `system.raft`. Unfortunately, the two tables are not updated atomically when Raft persists a snapshot (scylladb/scylladb#9603). We first update `system.raft_snapshots`, then `system.raft` (see `raft_sys_table_storage::store_snapshot_descriptor`). So after the wait finishes, there's no guarantee the log has been truncated yet -- there's a race between the test's last check and Scylla doing that last delete. But we can check the snapshot using `system.raft` instead of `system.raft_snapshots`, as `system.raft` has the latest ID. And since `1640f83fdc`, storing that ID and truncating the log in `system.raft` happens atomically. Closes scylladb/scylladb#17106	2024-02-01 16:06:12 +02:00
Patryk Wrobel	25324bbe50	cql_test_env.cc: remove dead code This change removes empty anonymous namespace that is a dead code. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#17099	2024-02-01 13:17:48 +02:00
Pavel Emelyanov	64cb3a6496	test/topology_custom: Remove test_tablets.py It's now empty, all test cases had been moved to cql-pytest Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-01 13:59:51 +03:00
Pavel Emelyanov	3fbe93e45d	test/topology: Move test_tablet_change_initial_tablets Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-01 13:59:51 +03:00
Pavel Emelyanov	480227fcad	test/topology: Move test_tablet_explicit_disabling Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-01 13:59:51 +03:00
Pavel Emelyanov	45b0490100	test/topology: Move test_tablet_default_initialization Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-01 13:59:51 +03:00
Pavel Emelyanov	3258c56ca3	test/topology: Move test_tablet_change_replication_strategy Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-01 13:59:51 +03:00
Pavel Emelyanov	6f50cc2783	test/topology: Move test_tablet_change_replication_vnode_to_tablets Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-01 13:59:51 +03:00
Botond Dénes	2a4b991772	Merge 'Fix mintimeuuid() call that could crash Scylla' from Nadav Har'El This PR fixes a bug where certain calls to the `mintimeuuid()` CQL function with large negative timestamps could crash Scylla. It turns out we already had protections in place against very positive timestamps, but very negative timestamps could still cause bugs. The actual fix in this series is just a few lines, but the bigger effort was improving the test coverage in this area. I added tests for the "date" type (the original reproducer for this bug used totimestamp() which takes a date parameter), and also reproducers for this bug directly, without the totimestamp() function, and one with that function. Finally this PR also replaces the assert() which made this molehill-of-a-bug into a mountain, by a throw. Fixes #17035 Closes scylladb/scylladb#17073 * github.com:scylladb/scylladb: utils: replace assert() by on_internal_error() utils: add on_internal_error with common logger utils: add a timeuuid minimum, like we had maximum test/cql-pytest: tests for "date" type	2024-02-01 10:48:48 +02:00
Asias He	2888c3086c	utils: Add uuid_xor_to_uint32 helper Convert the uuid to a uint32_t using xor. It is useful to get a uint32_t number from the uuid. Refs: #16927 Closes scylladb/scylladb#17049	2024-02-01 10:27:55 +02:00
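The commit above describes `uuid_xor_to_uint32` only as converting a UUID to a `uint32_t` using XOR. A minimal Python sketch of one way to do this, assuming the 128-bit value is folded by XOR-ing its four 32-bit words (the folding scheme and this Python function are illustrative assumptions, not the code from the commit):

```python
import uuid


def uuid_xor_to_uint32(u: uuid.UUID) -> int:
    """Fold a 128-bit UUID into 32 bits by XOR-ing its four 32-bit words.

    Assumption: this word-wise folding is only one plausible reading of
    "convert the uuid to a uint32_t using xor".
    """
    value = u.int  # the UUID as a 128-bit unsigned integer
    result = 0
    for _ in range(4):
        result ^= value & 0xFFFFFFFF  # take the lowest 32 bits
        value >>= 32                  # move on to the next word
    return result


if __name__ == "__main__":
    print(hex(uuid_xor_to_uint32(uuid.uuid4())))
```

A fold like this keeps every input bit involved in the result, which is usually enough when the 32-bit number is only needed as a cheap, deterministic tag derived from the UUID.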
Pavel Emelyanov	ab7ce3d1fa	cql-pytest: Add skip_without_tablets fixture It's opposite to skip_with_tablets one and thus also depends on scylla_only one Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-01 10:58:13 +03:00
Pavel Emelyanov	7c5c89ba8d	Revert "Merge 'Use utils::directories instead of db::config to get dirs' from Patryk Wróbel" This reverts commit `370fbd346c`, reversing changes made to `0912d2a2c6`. This makes scylla-manager mis-interpret the data_file_directories somehow, issue #17078	2024-01-31 15:08:14 +03:00
Avi Kivity	c8397f0287	Merge 'Implement tablet splitting' from Raphael "Raph" Carvalho The motivation for tablet resizing is that we want to keep the average tablet size reasonable, such that load rebalancing can remain efficient. A tablet that is too large makes migration inefficient, therefore slowing down the balancer. If the avg size grows beyond the upper bound (split threshold), then balancer decides to split. Split spans all tablets of a table, due to power-of-two constraint. Likewise, if the avg size decreases below the lower bound (merge threshold), then merge takes place in order to grow the avg size. Merge is not implemented yet, although this series lays foundation for it to be implemented later on. A resize decision can be revoked if the avg size changes and the decision is no longer needed. For example, let's say table is being split and avg size drops below the target size (which is 50% of split threshold and 100% of merge one). That means after split, the avg size would drop below the merge threshold, causing a merge after split, which is wasteful, so it's better to just cancel the split. Tablet metadata gains 2 new fields for managing this: resize_type: resize decision type, can be either of "merge", "split", or "none". resize_seq_number: a sequence number that works as the global identifier of the decision (monotonically increasing, increased by 1 on every new decision emitted by the coordinator). A new RPC was implemented to pull stats from each table replica, such that load balancer can calculate the avg tablet size and know the "split status", for a given table. Avg size is aggregated carefully while taking RF of each DC into account (which might differ). When a table is done splitting its storage, it loads (mirrors) the resize_seq_number from tablet metadata into its local state (in other words, my split status is ready). If a table is split ready, coordinator will see that table's seq number is the same as the one in tablet metadata. 
Helps to distinguish stale decisions from the latest one (in case decisions are revoked and re-emitted later on). Also, it's aggregated carefully, by taking the minimum among all replicas, so coordinator will only update topology when all replicas are ready. When load balancer emits split decision, replicas will listen for the need to split with a "split monitor" that is awakened once a table has replication metadata updated and detects the need for split (i.e. resize_type field is "split"). The split monitor will start splitting of compaction groups (using mechanism introduced here: `081f30d149`) for the table. And once splitting work is completed, the table updates its local state as having completed split. When coordinator pulls the split status of all replicas for a table via RPC, the balancer can see whether that table is ready for "finalizing" the decision, which is about updating tablet metadata to split each tablet into two. Once table replicas have their replication metadata updated with the new tablet count, they can update appropriately their set of compaction groups (that were previously split in the preparation step). Fixes #16536. 
Closes scylladb/scylladb#16580 * github.com:scylladb/scylladb: test/topology_experimental_raft: Add tablet split test replica: Bypass reshape on boot with tablets temporarily replica: Fix table::compaction_group_for_sstable() for tablet streaming test/topology_experimental_raft: Disable load balancer in test fencing replica: Remap compaction groups when tablet split is finalized service: Split tablet map when split request is finalized replica: Update table split status if completed split compaction work storage_service: Implement split monitor topology_cordinator: Generate updates for resize decisions made by balancer load_balancer: Introduce metrics for resize decisions db: Make target tablet size a live-updateable config option load_balancer: Implement resize decisions service: Wire table_resize_plan into migration_plan service: Introduce table_resize_plan tablet_mutation_builder: Add set_resize_decision() topology_coordinator: Wire load stats into load balancer storage_service: Allow tablet split and migration to happen concurrently topology_coordinator: Periodically retrieve table_load_stats locator: Introduce topology::get_datacenter_nodes() storage_service: Implement table_load_stats RPC replica: Expose table_load_stats in table replica: Introduce storage_group::live_disk_space_used() locator: Introduce table_load_stats tablets: Add resize decision metadata to tablet metadata locator: Introduce resize_decision	2024-01-31 13:59:56 +02:00
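The resize logic described in the merge above (thresholds, revocation, and the sequence number that identifies the latest decision) can be modelled roughly as follows. This is a simplified Python sketch, not the load balancer's code: `ResizeDecision`, `next_decision` and `ready_to_finalize` are hypothetical names, the rule for revoking a merge is assumed to mirror the split case, and bumping the sequence number on a revocation is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ResizeDecision:
    resize_type: str = "none"    # "split", "merge" or "none"
    resize_seq_number: int = 0   # bumped whenever the decision changes


def next_decision(current: ResizeDecision, avg_tablet_size: int,
                  target_tablet_size: int) -> ResizeDecision:
    # Per the description: the target size is 50% of the split threshold
    # and 100% of the merge threshold.
    split_threshold = 2 * target_tablet_size
    merge_threshold = target_tablet_size

    if current.resize_type == "split":
        # Revoke an ongoing split once the average drops below the target.
        wanted = "split" if avg_tablet_size >= target_tablet_size else "none"
    elif current.resize_type == "merge":
        # Assumption: a merge is revoked symmetrically.
        wanted = "merge" if avg_tablet_size < merge_threshold else "none"
    elif avg_tablet_size > split_threshold:
        wanted = "split"
    elif avg_tablet_size < merge_threshold:
        wanted = "merge"
    else:
        wanted = "none"

    if wanted == current.resize_type:
        return current
    return ResizeDecision(wanted, current.resize_seq_number + 1)


def ready_to_finalize(decision: ResizeDecision,
                      replica_seq_numbers: Iterable[int]) -> bool:
    # A split is finalized only when the minimum sequence number reported
    # by the replicas matches the one stored in tablet metadata.
    seqs = list(replica_seq_numbers)
    return (decision.resize_type == "split" and bool(seqs)
            and min(seqs) == decision.resize_seq_number)
```

Taking the minimum over replicas means one lagging replica is enough to hold back finalization, which matches the stated goal of updating topology only when all replicas are ready.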
Botond Dénes	181f68f248	Merge 'raft_group0: trigger snapshot if existing snapshot index is 0' from Kamil Braun The persisted snapshot index may be 0 if the snapshot was created in an older version of Scylla, which means snapshot transfer won't be triggered to a bootstrapping node. Commands present in the log may not cover all schema changes --- group 0 might have been created through the upgrade procedure, on a cluster with existing schema. So a deployment with index=0 snapshot is broken and we need to fix it. We can use the new `raft::server::trigger_snapshot` API for that. Also add a test. Fixes scylladb/scylladb#16683 Closes scylladb/scylladb#17072 * github.com:scylladb/scylladb: test: add test for fixing a broken group 0 snapshot raft_group0: trigger snapshot if existing snapshot index is 0	2024-01-31 13:04:59 +02:00
Nadav Har'El	827c20467c	utils: add a timeuuid minimum, like we had maximum Our time-handling code in UUID_gen.hh is very fragile for very large timestamps, because the different types - such as Cassandra "timestamp" and Timeuuid use very different resolution and ranges. In issue #17035 we discovered a situation where a certain CQL "timestamp"-type value could cause an assertion-failure and a crash in the create_time() function that creates a timeuuid - because that timestamp didn't fit the place we have in timeuuid. We already added in the past a limit, UUID_UNIXTIME_MAX, beyond which we refuse timestamps, to avoid these assertions failure. However, we missed the possibility of negative timestamps (which are allowed in CQL), and indeed a negative timestamp (or a timestamp which was "wrapped" to a negative value) is what caused issue #17035. So this patch adds a second limit, UUID_UNIXTIME_MIN - limiting the most negative timestamp that we support to well below the area which causes problems, and adds tests that reproduce #17035 and that we didn't break anything else (e.g., negative timestamps are still allowed - just not extremely negative timestamps). Fixes #17035. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-01-31 11:32:26 +02:00
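The idea of rejecting timestamps outside a supported window, instead of asserting later, can be sketched as below. This is an illustrative Python model only: the bounds here are derived from the 60-bit timestamp field of a version 1 UUID (100 ns ticks since 1582-10-15, per RFC 4122) and are an assumption; the actual values of `UUID_UNIXTIME_MIN` and `UUID_UNIXTIME_MAX` in Scylla are not given in this log and may differ.

```python
# 100 ns ticks between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch),
# the constant used in the RFC 4122 sample code.
GREGORIAN_OFFSET_100NS = 0x01B21DD213814000
MAX_TIMESTAMP_FIELD = (1 << 60) - 1   # v1 UUIDs carry a 60-bit timestamp
TICKS_PER_MS = 10_000

# Hypothetical limits, in Unix milliseconds, chosen so the converted value
# always fits the 60-bit field. Not Scylla's actual constants.
UNIXTIME_MIN_MS = -(GREGORIAN_OFFSET_100NS // TICKS_PER_MS)
UNIXTIME_MAX_MS = (MAX_TIMESTAMP_FIELD - GREGORIAN_OFFSET_100NS) // TICKS_PER_MS


def timeuuid_timestamp(unix_ms: int) -> int:
    """Convert Unix milliseconds to UUID ticks, refusing out-of-range input."""
    if not (UNIXTIME_MIN_MS <= unix_ms <= UNIXTIME_MAX_MS):
        raise ValueError(f"timestamp {unix_ms} ms is out of range for a timeuuid")
    return unix_ms * TICKS_PER_MS + GREGORIAN_OFFSET_100NS
```

With bounds on both sides, moderately negative timestamps (dates before 1970) still convert, while extreme values raise an ordinary error that the caller can report.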
Kamil Braun	bb22e06a9e	Merge 'abort failed rebuild instead of retrying it forever' from Gleb Add error handling to rebuild instead of retrying it until succeeds. * 'gleb/rebuild-fail-v2' of github.com:scylladb/scylla-dev: test: add test for rebuild failure test: add expected_error to rebuild_node operation topology_coordinator: Propagate rebuild failure to the initiator	2024-01-31 10:07:28 +01:00
Nadav Har'El	47955642d9	test/cql-pytest: tests for "date" type This patch adds a few simple tests for the values of the "date" column type, and how it can be initialized from string or integers, and what do those values mean. Two of the tests reproduce issue #17066, where validation is missing for values that don't fit in a 32-bit unsigned integer. Refs #17066 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-01-31 10:58:02 +02:00
Kamil Braun	57d5aa5a68	test: add test for fixing a broken group 0 snapshot In a cluster with group 0 with snapshot at index 0 (such group 0 might be established in a 5.2 cluster, then preserved once it upgrades to 5.4 or later), no snapshot transfer will be triggered when a node is bootstrapped. This way the new node might not obtain full schema, or obtain incorrect schema, like in scylladb/scylladb#16683. Simulate this scenario in a test case using the RECOVERY mode and error injections. Check that the newly added logic for creating a new snapshot if such situation is detected helps in this case.	2024-01-30 16:44:01 +01:00
Kamil Braun	74bf60a8ca	test_raft_snapshot_request: fix flakiness Add workaround for scylladb/python-driver#295. Also an assert made at the end of the test was false, it is fixed with appropriate comment added.	2024-01-30 16:21:24 +01:00
Kamil Braun	39339b9f70	test: topology/util: update comment for `reconnect_driver` The issues mentioned in the comment before are already fixed. Unfortunately, there is another, opposite issue which this function can be used for. The previous issue was about the existing driver session not reconnecting. The current issue is about the existing driver session reconnecting too much... (and in the middle of queries.)	2024-01-30 15:36:48 +01:00
Kamil Braun	cf3f26dc94	test_maintenance_mode: fix flakiness Wait until CQL is available and nodes see each other before trying to perform a query. Closes scylladb/scylladb#17059	2024-01-30 14:11:14 +02:00
Gleb Natapov	8b50613465	test: add test for rebuild failure	2024-01-30 11:04:19 +02:00
Gleb Natapov	d62204e758	test: add expected_error to rebuild_node operation	2024-01-30 11:04:19 +02:00
Michał Chojnowski	904bb25987	test: test_tablet_cleanup: wait for servers to see each other before multi-node queries Waiting for CQL connections is not enough. For the queries to succeed, nodes must see each other. We have to wait for this, otherwise the test will be flaky. Fixes #17029 Closes scylladb/scylladb#17040	2024-01-30 08:56:01 +02:00
Tomasz Grabiec	36f218c83b	Merge 'main: refuse startup when tablet resharding is required' from Botond Dénes We do not support tablet resharding yet. All tablet-related code assumes that the (host_id, shard) tablet replica is always valid. Violating this leads to undefined behaviour: errors in the tablet load balancer and potential crashes. Avoid this by refusing to start if the need for resharding is detected. Be as lenient as possible: check all tablets with a replica on this node, and only refuse startup if at least one tablet has an invalid replica shard. Startup will fail as: ERROR 2024-01-26 07:03:06,931 [shard 0:main] init - Startup failed: std::runtime_error (Detected a tablet with invalid replica shard, reducing shard count with tablet-enabled tables is not yet supported. Replace the node instead.) Refs: #16739 Fixes: #16843 Closes scylladb/scylladb#17008 * github.com:scylladb/scylladb: test/topolgy_experimental_raft: test_tablets.py: add test for resharding test/pylib: manager[_client]: add update_cmdline() main: refuse startup when tablet resharding is required locator: tablets: add check_tablet_replica_shards()	2024-01-29 23:39:41 +01:00
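The startup check described above (look only at tablets with a replica on this node, and fail if any of those replicas names a shard the node no longer has) can be sketched as below. This is a Python model under stated assumptions: the data layout (`tablets` as a mapping from tablet id to `(host_id, shard)` pairs) and the Python signature are hypothetical; only the function name and the error text come from the log.

```python
from typing import Dict, Iterable, Tuple

Replica = Tuple[str, int]  # (host_id, shard); hypothetical representation


def check_tablet_replica_shards(tablets: Dict[int, Iterable[Replica]],
                                this_host: str, shard_count: int) -> None:
    """Raise if a replica on this host points at a shard that does not exist."""
    for tablet_id, replicas in tablets.items():
        for host, shard in replicas:
            if host != this_host:
                continue  # replicas on other nodes are not our concern
            if shard >= shard_count:
                raise RuntimeError(
                    "Detected a tablet with invalid replica shard, reducing "
                    "shard count with tablet-enabled tables is not yet "
                    "supported. Replace the node instead.")


if __name__ == "__main__":
    layout = {0: [("node-a", 0), ("node-b", 3)], 1: [("node-a", 1)]}
    check_tablet_replica_shards(layout, "node-a", shard_count=2)  # passes
```

Restricting the check to local replicas is what keeps it lenient: a node restarted with fewer shards still starts if none of its own tablets were placed on the removed shards.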
Pavel Emelyanov	370fbd346c	Merge 'Use utils::directories instead of db::config to get dirs' from Patryk Wróbel `db::config` is a class, that is used in many places across the code base. When it is changed, its clients' code need to be recompiled. It represents the configuration of the database. Some fields of the configuration that describe the location of directories may be empty. In such cases `db::config::setup_directories()` function is called - it modifies the provided configuration. Such modification is not good - it is better to keep `db::config` intact. This PR: - extends the public interface of utils::directories class to provide required directory paths to the users - removes 'db::config::setup_directories()' to avoid altering the fields of configuration object - replaces usages of db::config object with utils::directories object in places that require obtaining paths to dirs Fixes: scylladb#5626 Closes scylladb/scylladb#16787 * github.com:scylladb/scylladb: utils/directories: make utils::directories::set an internal type db::config: keep dir paths unchanged cql_transport/controler: use utils::directories to get paths of dirs service/storage_proxy: use utils::directories to get paths of dirs api/storage_service.cc: use utils::directories to get paths of dirs tools/scylla-sstable.cc: use utils::directories to get paths db/commitlog: do not use db::config to get dirs Use utils::directories to get dirs paths in replica::database Allow utils::directories to provide paths to dirs Clean-up of utils::directories	2024-01-29 18:01:15 +03:00
Kamil Braun	0912d2a2c6	Merge 'raft topology: make left_token_ring a transition state' from Patryk Jędrzejczak When a node is in the `left_token_ring` state, we don't know how it has ended up in this state. We cannot distinguish a node that has finished decommissioning from a node that has failed bootstrap. The main problem it causes is that we incorrectly send the `barrier_and_drain` command to a node that has failed bootstrapping or replacing. We must do it for a node that has finished decommissioning because it could still coordinate requests. However, since we cannot distinguish nodes in the `left_token_ring` state, we must send the command to all of them. This issue appeared in scylladb/scylladb#16797 and this PR is a follow-up that fixes it. The solution is changing `left_token_ring` from a node state to a transition state. Fixes scylladb/scylladb#16944 Closes scylladb/scylladb#17009 * github.com:scylladb/scylladb: docs: dev: topology-over-raft: document the left_token_ring state topology_coordinator: adjust reason string in left_token_ring handler raft topology: make left_token_ring a transition state topology_coordinator: rollback_current_topology_op: remove unused exclude_nodes	2024-01-29 15:29:01 +01:00
Botond Dénes	d202d32f81	Merge 'Add an API to trigger snapshot in Raft servers' from Kamil Braun This allows the user of `raft::server` to cause it to create a snapshot and truncate the Raft log (leaving no trailing entries; in the future we may extend the API to specify number of trailing entries left if needed). In a later commit we'll add a REST endpoint to Scylla to trigger group 0 snapshots. One use case for this API is to create group 0 snapshots in Scylla deployments which upgraded to Raft in version 5.2 and started with an empty Raft log with no snapshot at the beginning. This causes problems, e.g. when a new node bootstraps to the cluster, it will not receive a snapshot that would contain both schema and group 0 history, which would then lead to inconsistent schema state and trigger assertion failures as observed in scylladb/scylladb#16683. In 5.4 the logic of initial group 0 setup was changed to start the Raft log with a snapshot at index 1 (`ff386e7a44`) but a problem remains with these existing deployments coming from 5.2, we need a way to trigger a snapshot in them (other than performing 1000 arbitrary schema changes). Another potential use case in the future would be to trigger snapshots based on external memory pressure in tablet Raft groups (for strongly consistent tables). The PR adds the API to `raft::server` and a HTTP endpoint that uses it. In a follow-up PR, we plan to modify group 0 server startup logic to automatically call this API if it sees that no snapshot is present yet (to automatically fix the aforementioned 5.2 deployments once they upgrade.) 
Closes scylladb/scylladb#16816 * github.com:scylladb/scylladb: raft: remove `empty()` from `fsm_output` test: add test for manual triggering of Raft snapshots api: add HTTP endpoint to trigger Raft snapshots raft: server: add `trigger_snapshot` API raft: server: track last persisted snapshot descriptor index raft: server: framework for handling server requests raft: server: inline `poll_fsm_output` raft: server: fix indentation raft: server: move `io_fiber`'s processing of `batch` to a separate function raft: move `poll_output()` from `fsm` to `server` raft: move `_sm_events` from `fsm` to `server` raft: fsm: remove constructor used only in tests raft: fsm: move trace message from `poll_output` to `has_output` raft: fsm: extract `has_output()` raft: pass `max_trailing_entries` through `fsm_output` to `store_snapshot_descriptor` raft: server: pass `*_aborted` to `set_exception` call	2024-01-29 15:06:04 +02:00
Patryk Wrobel	f08768e767	service/storage_proxy: use utils::directories to get paths of dirs This change replaces usage of db::config with usage of utils::directories to get paths of directories in service/storage_proxy. Refs: scylladb#5626 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-01-29 13:11:33 +01:00
Patryk Wrobel	804afffb11	db/commitlog: do not use db::config to get dirs This change removes usage of db::config to get path of commitlog_directory. Instead, it introduces a new parameter to directly pass the path to db::commitlog::config::from_db_config(). Refs: scylladb#5626 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-01-29 13:11:33 +01:00
Patryk Wrobel	9483d149af	Use utils::directories to get dirs paths in replica::database This change replaces the usage of db::config with usage of utils::directories to get dirs paths in replica::database class. Moreover, it adjusts tests that require construction of replica::database - its constructor has been changed to accept utils::directories object. Refs: scylladb#5626 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-01-29 13:11:33 +01:00
Patryk Wrobel	1cd676e438	Allow utils::directories to provide paths to dirs This change extends utils::directories class in the following way: - adds new member variables that correspond to fields from db::config that describe paths of directories - introduces a public interface to retrieve the values of the new members - allows construction of utils::directories object based on db::config to setup internal member variables related to paths to dirs The new members of utils::directories are overridden when the provided values are empty. The way of setting paths is taken from db::config. To ensure that the new logic works correctly `utils_directories_test` has been created. Refs: scylladb#5626 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-01-29 13:11:33 +01:00
Botond Dénes	fd66ce1591	test/topolgy_experimental_raft: test_tablets.py: add test for resharding Check that scylla refuses to start when the shard count is reduced.	2024-01-29 07:04:33 -05:00
Botond Dénes	a7a5aada2a	test/pylib: manager[_client]: add update_cmdline() Similar to the existing update_config(). Updates the command-line arguments of the specified nodes, merging the new options into the existing ones. Needs a restart to take effect.	2024-01-29 07:04:33 -05:00
Patryk Jędrzejczak	b0eef50b2e	raft topology: make left_token_ring a transition state A node can be in the `left_token_ring` state after: - a finished decommission, - a failed bootstrap, - a failed replace. When a node is in the `left_token_ring` state, we don't know how it has ended up in this state. We cannot distinguish a node that has finished decommissioning from a node that has failed bootstrap. The main problem it causes is that we incorrectly send the `barrier_and_drain` command to a node that has failed bootstrapping or replacing. We must do it for a node that has finished decommissioning because it could still coordinate requests. However, since we cannot distinguish nodes in the `left_token_ring` state, we must send the command to all of them. This issue appeared in scylladb/scylladb#16797 and this patch is a follow-up that fixes it. The solution is changing `left_token_ring` from a node state to a transition state. Regarding implementation, most of the changes are simple refactoring. The less obvious are: - Before this patch, in `system_keyspace::left_topology_state`, we had to keep the ignored nodes' IDs for replace to ensure that the replacing node will have access to it after moving to the `left_token_ring` state, which happens when replace fails. We don't need this workaround anymore. When we enter the new `left_token_ring` transition state, the new node will still be in the `decommissioning` state, so it won't lose its request param. - Before this patch, a decommissioning node lost its tokens while moving to the `left_token_ring` state. After the patch, it loses tokens while still being in the `decommissioning` state. We ensure that all `decommissioning` handlers correctly handle a node that lost its tokens. Moving the `left_token_ring` handler from `handle_node_transition` to `handle_topology_transition` created a large diff. 
There are only three changes: - adding `auto node = get_node_to_work_on(std::move(guard));`, - adding `builder.del_transition_state()`, - changing error logged when `global_token_metadata_barrier` fails.	2024-01-29 10:39:07 +01:00
Kefu Chai	8f38bd5376	commitlog: add formatter for db::replay_position before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `db::replay_position`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17014	2024-01-29 09:59:30 +02:00
Botond Dénes	d3c1be9107	Merge 'alternator: enable tablets by default if experimental feature is enabled' from Nadav Har'El This series does a similar change to Alternator as was done recently to CQL: 1. If the "tablets" experimental feature is enabled, new Alternator tables will use tablets automatically, without requiring an option on each new table. A default choice of initial_tablets is used. These choices can still be overridden per-table if the user wants to. 3. In particular, all test/alternator tests will also automatically run with tablets enabled 4. However, some tests will fail on tablets because they use features that haven't yet been implemented with tablets - namely Alternator Streams (Refs #16317) and Alternator TTL (Refs #16567). These tests will - until those features are implemented with tablets - continue to be run without tablets. 5. An option is added to the test/alternator/run to allow developers to manually run tests without tablets enabled, if they wish to (this option will be useful in the short term, and can be removed later). Fixes #16355 Closes scylladb/scylladb#16900 * github.com:scylladb/scylladb: test/alternator: add "--vnodes" option to run script alternator: use tablets by default, if available test/alternator: run some tests without tablets	2024-01-29 09:22:13 +02:00
Dawid Medrek	b92fb3537a	main: Postpone start-up of hint manager In this commit, we postpone the start-up of the hint manager until we obtain information about other nodes in the cluster. When we start the hint managers, one of the things that happen is creating endpoint managers -- structures managed by db::hints::manager. Whether we create an instance of endpoint manager depends on the value returned by host_filter::can_hint_for, which, in turn, may depend on the current state of locator::topology. If locator::topology is incomplete, some endpoint managers may not be started even though they should (because the target node IS part of the cluster and we SHOULD send hints to it if there are some). The situation like that can happen because we start the hint managers too early. This commit aims to solve that problem. We only start the hint managers when we've gathered information about the other nodes in the cluster and created the locator::topology using it. Hinted Handoff is not negatively affected by these changes since in between the previous point of starting the hint managers and the current one, all of the mutations performed by service::storage_proxy target the local node, so no hints would need to be generated anyway. Fixes scylladb/scylladb#11870 Closes scylladb/scylladb#16511	2024-01-26 12:49:40 +01:00
Kefu Chai	a9d781d70f	test/nodetool: only test "storage_service/cleanup_all" with scylla this RESTful API is a scylla specific extension and is only used by scylla-nodetool. currently, the java-based nodetool does not use it at all, so mark it with "scylla_only". one can verify this change with: ``` pytest --mode=debug --nodetool=cassandra test_cleanup.py::test_cleanup ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17001	2024-01-26 13:19:15 +02:00
Botond Dénes	582ddc70ec	Merge 'test/nodetool: return a randomized address if not running with unshare' from Kefu Chai we should allow user to run nodetool tests without `test.py`. but there are good chance that the host could be reused by multiple tests or multiple users who could be using port 12345. by randomizing the IP and port, they would have better chance to complete the test without running into used port problem. Closes scylladb/scylladb#16996 * github.com:scylladb/scylladb: test/nodetool: return a randomized address if not running with unshare test/nodetool: return an address from loopback_network fixture	2024-01-26 13:15:58 +02:00
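The approach in the merge above, picking a random loopback IP and port so that concurrent runs on one host are unlikely to collide, might look like the sketch below. The helper name, the port range, and the use of the whole 127.0.0.0/8 block are assumptions for illustration; the fixture's real code is not shown in this log.

```python
import ipaddress
import random
from typing import Tuple


def random_loopback_address(rng: random.Random) -> Tuple[str, int]:
    """Pick a random address in 127.0.0.0/8 and a random unprivileged port."""
    # Assumption: any 127.x.y.z address can be bound locally, which holds on
    # Linux by default but may need extra setup on other systems.
    ip = f"127.{rng.randint(0, 255)}.{rng.randint(0, 255)}.{rng.randint(1, 254)}"
    port = rng.randint(1024, 65535)
    return ip, port


if __name__ == "__main__":
    ip, port = random_loopback_address(random.Random())
    assert ipaddress.ip_address(ip).is_loopback
    print(f"{ip}:{port}")
```

Randomization lowers the chance of hitting a port already in use; it does not remove it, so a caller may still want to retry on a bind failure.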
Kamil Braun	4f736894e1	Merge 'Add maintenance mode' from Mikołaj Grzebieluch In this mode, the node is not reachable from the outside, i.e. * it refuses all incoming RPC connections, * it does not join the cluster, thus * all group0 operations are disabled (e.g. schema changes), * all cluster-wide operations are disabled for this node (e.g. repair), * other nodes see this node as dead, * cannot read or write data from/to other nodes, * it does not open Alternator and Redis transport ports and the TCP CQL port. The only way to make CQL queries is to use the maintenance socket. The node serves only local data. To start the node in maintenance mode, use the `--maintenance-mode true` flag or set `maintenance_mode: true` in the configuration file. REST API works as usual, but some routes are disabled: * authorization_cache * failure_detector * hinted_hand_off_manager This PR also updates the maintenance socket documentation: * add cqlsh usage to the documentation * update the documentation to use `WhiteListRoundRobinPolicy` Fixes #5489. 
Closes scylladb/scylladb#15346 * github.com:scylladb/scylladb: test.py: add test for maintenance mode test.py: generalize usage of cluster_con test.py: when connecting to node in maintenance mode use maintenance socket docs: add maintenance mode documentation main: add maintenance mode main: move some REST routes initialization before joining group0 message_service: add sanity check that rpc connections are not created in the maintenance mode raft_group0_client: disable group0 operations in the maintenance mode service/storage_service: add start_maintenance_mode() method storage_service: add MAINTENANCE option to mode enum service/maintenance_mode: add maintenance_mode_enabled bool class service/maintenance_mode: move maintenance_socket_enabled definition to seperate file db/config: add maintenance mode flag docs: add cqlsh usage to maintenance socket documentation docs: update maintenance socket documentation to use WhiteListRoundRobinPolicy	2024-01-26 11:02:34 +01:00
Kefu Chai	01727a5399	test/nodetool: return a randomized address if not running with unshare we should allow user to run nodetool tests without `test.py`. but there are good chance that the host could be reused by multiple tests or multiple users who could be using port 12345. by randomizing the IP and port, they would have better chance to complete the test without running into used port problem. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-01-26 13:32:47 +08:00
Kefu Chai	358d30fd29	test/nodetool: return an address from loopback_network fixture * rename "maybe_setup_loopback_network" to "server_address" * return an address from the fixture this change prepares for bringing back the randomized IP and port, in case users run this test without test.py, by randomizing the IP and port, they would have better chance to complete the test without running into used port problem. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-01-26 13:20:37 +08:00
Raphael S. Carvalho	3b14c5b84a	test/topology_experimental_raft: Add tablet split test Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-01-25 18:58:43 -03:00
Raphael S. Carvalho	4245ad333a	test/topology_experimental_raft: Disable load balancer in test fencing This is easier to reproduce after changes in load balancer, to emit resize decisions, which in turn results in topology version being incremented, and that might race with fencing tests that manipulate the topology version manually. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-01-25 18:36:08 -03:00
Raphael S. Carvalho	7ed5b44d52	load_balancer: Implement resize decisions This implements the ability in load balancer to emit split or merge requests, cancel ongoing ones if they're no longer needed, and also finalize those that are ready for the topology changes. That's all based on average tablet size, collected by coordinator from all nodes, and split and merge thresholds. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-01-25 18:36:08 -03:00
Raphael S. Carvalho	ed2138a35a	tablet_mutation_builder: Add set_resize_decision() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-01-25 18:36:08 -03:00
Raphael S. Carvalho	0d5ba1ee4b	tablets: Add resize decision metadata to tablet metadata The new metadata describes the ongoing resize operation (can be either of merge, split or none) that spans tablets of a given table. That's managed by group0, so down nodes will be able to see the decision when they come back up and see the changes to the metadata. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-01-25 18:36:06 -03:00

6251 Commits