scylladb/service at a95ad052dff4b388ff35ca6cffa3fadf259fe831 - scylladb - Anomalous Gitea

mirrors/scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 16:22:15 +00:00


Botond Dénes 22942c0a85 Merge '[Backport 2025.2] Raft-based recovery procedure: simplify rolling restart with recovery_leader' from Scylladb[bot]

The following steps are performed in sequence as part of the
Raft-based recovery procedure:
- set `recovery_leader` to the host ID of the recovery leader in
  `scylla.yaml` on all live nodes,
- send the `SIGHUP` signal to all Scylla processes to reload the config,
- perform a rolling restart (with the recovery leader being restarted
  first).

These steps are unintuitive and more complicated than necessary.

This PR simplifies them: `recovery_leader` can now be set on each node
just before that node is restarted.
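
The difference between the two orderings can be sketched as follows.
This is only an illustration in Python: the function names, step labels,
and node names are hypothetical and not part of Scylla, and restarting
the recovery leader first in the new procedure is an assumption carried
over from the old steps.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (action, node); labels are illustrative only


def _leader_first(nodes: List[str], leader: str) -> List[str]:
    """Return the nodes with the recovery leader moved to the front."""
    if leader not in nodes:
        raise ValueError("recovery leader must be one of the live nodes")
    return [leader] + [n for n in nodes if n != leader]


def old_procedure(nodes: List[str], leader: str) -> List[Step]:
    """Old flow: configure every node, reload every node, then restart."""
    steps: List[Step] = [("set recovery_leader in scylla.yaml", n) for n in nodes]
    steps += [("send SIGHUP", n) for n in nodes]
    steps += [("restart", n) for n in _leader_first(nodes, leader)]
    return steps


def new_procedure(nodes: List[str], leader: str) -> List[Step]:
    """New flow: configure each node just before restarting it.

    Assumption: the recovery leader is still handled first.
    """
    steps: List[Step] = []
    for n in _leader_first(nodes, leader):
        steps.append(("set recovery_leader in scylla.yaml", n))
        steps.append(("restart", n))
    return steps


if __name__ == "__main__":
    for action, node in new_procedure(["n1", "n2", "n3"], leader="n2"):
        print(f"{node}: {action}")
```

In this sketch the new flow needs no separate `SIGHUP` pass, and each
node is touched only once.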

Apart from making necessary changes in the code, we also update all
tests of the Raft-based recovery procedure and the user-facing
documentation.

Fixes scylladb/scylladb#25015

The Raft-based procedure was added in 2025.2. This PR makes the
procedure simpler and less error-prone, so it should be backported
to 2025.2 and 2025.3.

- (cherry picked from commit ec69028907)

- (cherry picked from commit 445a15ff45)

- (cherry picked from commit 23f59483b6)

- (cherry picked from commit ba5b5c7d2f)

- (cherry picked from commit 9e45e1159b)

- (cherry picked from commit f408d1fa4f)

Parent PR: #25032

Closes scylladb/scylladb#25334

* github.com:scylladb/scylladb:
  docs: document the option to set recovery_leader later
  test: delay setting recovery_leader in the recovery procedure tests
  gossip: add recovery_leader to gossip_digest_syn
  db: system_keyspace: peers_table_read_fixup: remove rows with null host_id
  db/config, gms/gossiper: change recovery_leader to UUID
  db/config, utils: allow using UUID as a config option
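
The last two entries above make `recovery_leader` a UUID-typed option.
One plausible benefit, not stated in the source, is that a malformed
host ID can be rejected when the config is read. A rough Python analogue
follows; Scylla itself is written in C++, and `parse_recovery_leader` is
a hypothetical helper, not a Scylla function.

```python
import uuid


def parse_recovery_leader(value: str) -> uuid.UUID:
    """Parse a host ID string, raising ValueError if it is not a UUID."""
    try:
        return uuid.UUID(value)
    except ValueError as exc:
        raise ValueError(f"recovery_leader is not a valid UUID: {value!r}") from exc


if __name__ == "__main__":
    # Example host ID; the value is made up for illustration.
    print(parse_recovery_leader("12345678-1234-5678-1234-567812345678"))
```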

2025-08-06 09:41:17 +03:00

..

broadcast_tables/experimental

service: do not include unused headers

2025-03-20 11:18:16 +08:00

direct_failure_detector

Move direct_failure_detector from root to service/

2025-04-08 13:03:24 +03:00

service: do not include unused headers

2025-03-20 11:18:16 +08:00

…

storage_service, group0_state_machine: move SL cache update from topology_state_load() to load_snapshot()

2025-08-06 09:39:55 +03:00

Merge '[Backport 2025.2] Raft-based recovery procedure: simplify rolling restart with recovery_leader' from Scylladb[bot]

2025-08-06 09:41:17 +03:00

address_map.hh

service: do not include unused headers

2025-03-20 11:18:16 +08:00

cache_hitrate_calculator.hh

…

client_state.cc

auth: forbid modifying system ks by non-superusers

2025-03-30 16:55:04 +03:00

client_state.hh

…

CMakeLists.txt

raft: implement the limited voters feature

2025-04-07 12:31:18 +02:00

endpoint_lifecycle_subscriber.hh

treewide: pass host id to endpoint_lifecycle_subscriber

2025-03-11 12:09:22 +02:00

load_broadcaster.hh

load_meter: move to host id

2025-03-11 12:09:22 +02:00

load_meter.hh

service: do not include unused headers

2025-03-20 11:18:16 +08:00

maintenance_mode.hh

…

mapreduce_service.cc

mapreduce_service: Prevent race condition

2025-06-06 08:49:15 +03:00

mapreduce_service.hh

service/mapreduce_service: Cancel query when stopping

2025-02-10 20:12:59 +02:00

memory_limiter.hh

…

migration_listener.hh

service: do not include unused headers

2025-03-20 11:18:16 +08:00

migration_manager.cc

service: migration_manager: Run group0 barrier in gossip scheduling group

2025-07-17 17:25:10 +00:00

migration_manager.hh

service: migration_manager: use named gate

2025-04-12 11:28:49 +03:00

misc_services.cc

load_meter: move to host id

2025-03-11 12:09:22 +02:00

query_state.hh

…

session.cc

service: do not include unused headers

2025-03-20 11:18:16 +08:00

session.hh

service: session: use named gate

2025-04-12 11:28:49 +03:00

state_id.hh

…

storage_proxy_stats.hh

service: do not include unused headers

2025-03-20 11:18:16 +08:00

storage_proxy.cc

storage_service: Cancel all write requests on storage_proxy shutdown

2025-07-24 13:02:56 +00:00

storage_proxy.hh

storage_service: Cancel all write requests on storage_proxy shutdown

2025-07-24 13:02:56 +00:00

storage_service.cc

Merge '[Backport 2025.2] Raft-based recovery procedure: simplify rolling restart with recovery_leader' from Scylladb[bot]

2025-08-06 09:41:17 +03:00

storage_service.hh

token_metadata: move make_token_metadata_ptr into shared_token_metadata class

2025-07-21 09:36:40 +03:00

tablet_allocator_fwd.hh

…

tablet_allocator.cc

test_tablet_tasks: use injection to revoke resize

2025-04-30 07:04:57 +03:00

tablet_allocator.hh

service: tablets: Keep load_stats inside tablet_allocator

2025-04-09 20:21:51 +02:00

tablet_operation.hh

…

task_manager_module.cc

tasks: check whether a node is alive before rpc

2025-04-17 12:51:22 +02:00

task_manager_module.hh

tasks: replace ip with host_id in task_identity

2025-02-05 10:11:52 +01:00

topology_coordinator.cc

topology_coordinator: Trigger load stats refresh after replace

2025-08-02 01:26:59 +02:00

topology_coordinator.hh

topology coordinator: add REST endpoint to query the status of ongoing topology cmd rpc

2025-07-08 06:23:48 +00:00

topology_guard.hh

service: do not include unused headers

2025-03-20 11:18:16 +08:00

topology_mutation.cc

…

topology_mutation.hh

service: do not include unused headers

2025-03-20 11:18:16 +08:00

topology_state_machine.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

topology_state_machine.hh

service: do not include unused headers

2025-03-20 11:18:16 +08:00

view_update_backlog_broker.hh

treewide: pass host id to endpoint state change subscribers

2025-03-11 12:09:22 +02:00