Files
scylladb/test/lib
Tomasz Grabiec c077283352 Merge 'service: support conversion of tablet keyspaces to rack-list using ALTER KEYSPACE' from Aleksandra Martyniuk
If a keyspace has a numeric replication factor in a DC and rf < #racks,
then the replicas of tablets in this keyspace can be distributed among
all racks in the DC (different for each tablet). With rack list, we need all
tablet replicas to be placed on the same racks. Hence, the conversion
requires tablet co-location.

After this series, the conversion can be done using ALTER KEYSPACE
statement. The statement that does this conversion in any DC is not
allowed to change a rf in any DC. So, if we have dc1 and dc2 with 3 racks
each and a keyspace ks then with a single ALTER KEYSPACE we can do:
- {dc1 : 2} -> {dc1 : [r1, r2]};
- {dc1 : 2, dc2: 2} ->  {dc1 : [r1, r2], dc2: [r2,r3]};
- {dc1 : 2, dc2: 2} -> {dc1 : [r1, r2], dc2: 2}
- {dc1 : 2} -> {dc1 : 2, dc2 : [r1]}
But we cannot do:
- {dc1 : 2} -> {dc1 : [r1, r2, r3]};
- {dc1 : 1, dc2 : [r1, r2] → dc1: [r1], dc2: [r1].

In order to do the co-locations rf change request is paused. Tablet
load balancer examines the paused rf change requests and schedules
necessary tablet migrations. During the process of co-location, no other
cross-rack migration is allowed.

Load balancer checks whether any paused rf change request is
ready to be resumed. If so, it puts the request back to global topology
request queue.

While an rf change request for a keyspace is running, any other rf change
of this keyspace will fail.

Fixes: #26398.

New feature, no backport

Closes scylladb/scylladb#27279

* github.com:scylladb/scylladb:
  test: add est_rack_list_conversion_with_two_replicas_in_rack
  test: test creating tablet_rack_list_colocation_plan
  test: add test_numeric_rf_to_rack_list_conversion test
  tasks: service: add global_topology_request_virtual_task
  cql3: statements: allow altering from numeric rf to rack list
  service: topology_coordinator: pause keyspace_rf_change request
  service: implement make_rack_list_colocation_plan
  service: add tablet_rack_list_colocation_plan
  cql3: reject concurrent alter of the same keyspace
  test: check paused rf change requests persistence
  db: service: add paused_rf_change_requests to system.topology
  service: pass topology and system_keyspace to load_balancer ctor
  service: tablet_allocator: extract load updates
  service: tablet_allocator: extract ensure_node
  tasks, system_keyspace: Introduce get_topology_request_entry_opt()
  node_ops: Drop get_pending_ids()
  node_ops: Drop redundant get_status_helper()
2025-12-17 10:05:06 +01:00
..