scylladb/service at 9258d0e1cfceebe68ec0081a569a159fd5577415 - scylladb - Anomalous Gitea

mirrors/scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 11:36:54 +00:00


Piotr Dulikowski e59374b721 Merge '[Backport 2025.1] batchlog_manager: abort replay of a failed batch on shutdown or node down' from Scylladb[bot]

When replaying a failed batch and sending the mutation to all replicas, make the write response handler cancellable and abort it on shutdown or if some target is marked down. Also set a reasonable timeout so that the write is aborted if it gets stuck for some other, unexpected reason.

Previously, the write response handler was not cancellable and had no timeout. A write issued by the batchlog manager could therefore get stuck indefinitely, and node shutdown would hang as well, because shutdown waits for the batchlog manager to complete without aborting the operation.
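
The behaviour described above (a wait that ends on completion, on abort, or on timeout, whichever comes first) can be illustrated with a minimal standard-library sketch. This is an analogy only, not ScyllaDB code: the actual change is built on Seastar and the storage_proxy write response handler, and the names `cancellable_write`, `write_result`, `complete`, `abort` and `wait` below are hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical outcome of waiting for replica acknowledgements.
enum class write_result { completed, aborted, timed_out };

// Hypothetical stand-in for a cancellable write response handler.
// It is NOT the ScyllaDB class; it only models the three exit paths.
class cancellable_write {
    std::mutex _m;
    std::condition_variable _cv;
    bool _done = false;     // set when the write has been acknowledged
    bool _aborted = false;  // set on shutdown or when a target goes down
public:
    // Called when the required replicas have acknowledged the write.
    void complete() {
        {
            std::lock_guard<std::mutex> lk(_m);
            _done = true;
        }
        _cv.notify_all();
    }

    // Called on shutdown or when some target node is marked down.
    void abort() {
        {
            std::lock_guard<std::mutex> lk(_m);
            _aborted = true;
        }
        _cv.notify_all();
    }

    // Blocks until completion, abort, or timeout, whichever comes first.
    // If both flags are set, completion takes precedence.
    write_result wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(_m);
        const bool signalled =
            _cv.wait_for(lk, timeout, [this] { return _done || _aborted; });
        if (!signalled) {
            return write_result::timed_out;
        }
        return _done ? write_result::completed : write_result::aborted;
    }
};
```

In this sketch, a shutdown path would call `abort()` on every outstanding write before waiting for the replay loop to finish, so a waiter never blocks longer than the timeout passed to `wait()`. Without the abort flag and the timeout, the waiter would block until `complete()` is called, which is the hang the commit message describes.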

Backport to relevant versions, since the issue can cause node shutdown to hang.

Fixes scylladb/scylladb#24599

- (cherry picked from commit 8d48b27062)

- (cherry picked from commit fc5ba4a1ea)

- (cherry picked from commit 7150632cf2)

- (cherry picked from commit 74a3fa9671)

- (cherry picked from commit a9b476e057)

- (cherry picked from commit d7af26a437)

Parent PR: #24595

Closes scylladb/scylladb#24878

* github.com:scylladb/scylladb:
  test: test_batchlog_manager: batchlog replay includes cdc
  test: test_batchlog_manager: test batch replay when a node is down
  batchlog_manager: set timeout on writes
  batchlog_manager: abort writes on shutdown
  batchlog_manager: create cancellable write response handler
  storage_proxy: add write type parameter to mutate_internal

2025-07-09 17:23:26 +02:00

broadcast_tables/experimental

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

service: query_pager: fix last-position for filtering queries

2025-02-13 09:40:05 +02:00

treewide: use angle brackets when including seastar headers

2024-12-20 16:16:28 +02:00

raft topology: Add support for raft topology system tables initialization to happen before group0 initialization

2025-02-20 21:21:31 +00:00

group0: modify start_operation logic to account for synchronize phase race condition

2025-06-29 14:33:01 +03:00

address_map.hh

service: address_map: add lookup function that expects address to exist

2025-01-15 16:30:28 +02:00

cache_hitrate_calculator.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

client_state.cc

[Backport 2025.1] auth: forbid modifying system ks by non-superusers

2025-04-06 15:10:06 +03:00

client_state.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

CMakeLists.txt

service: add tablet_virtual_task

2024-11-28 11:42:38 +01:00

endpoint_lifecycle_subscriber.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

load_broadcaster.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

load_meter.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

maintenance_mode.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

mapreduce_service.cc

mapreduce_service: Prevent race condition

2025-06-13 14:45:21 +03:00

mapreduce_service.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

memory_limiter.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

migration_listener.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

migration_manager.cc

base_info: remove the lw_shared_ptr variant

2025-05-27 21:40:23 +02:00

migration_manager.hh

raft topology: Add support for raft topology system tables initialization to happen before group0 initialization

2025-02-20 21:21:31 +00:00

misc_services.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query_state.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

session.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

session.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

state_id.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

storage_proxy_stats.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

storage_proxy.cc

test: test_batchlog_manager: test batch replay when a node is down

2025-07-08 12:32:26 +03:00

storage_proxy.hh

batchlog_manager: abort writes on shutdown

2025-07-08 06:24:30 +00:00

storage_service.cc

replica: Fix truncate assert failure

2025-07-09 17:39:19 +03:00

storage_service.hh

topology coordinator: add REST endpoint to query the status of ongoing topology cmd rpc

2025-07-08 11:56:30 +03:00

tablet_allocator_fwd.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tablet_allocator.cc

test: add reproducer and test for mutation source refresh after merge

2025-06-30 10:50:53 -03:00

tablet_allocator.hh

service: Introduce rack-aware co-location migrations for tablet merge

2025-03-14 20:02:33 +01:00

tablet_operation.hh

service: Add tablet_operation.hh

2025-01-17 16:12:05 +08:00

task_manager_module.cc

tasks: check whether a node is alive before rpc

2025-04-30 10:14:58 +02:00

task_manager_module.hh

service: retrun status_helper struct from tablet_virtual_task::get_status_helper

2025-01-10 10:03:08 +01:00

topology_coordinator.cc

topology coordinator: add REST endpoint to query the status of ongoing topology cmd rpc

2025-07-08 11:56:30 +03:00

topology_coordinator.hh

topology coordinator: add REST endpoint to query the status of ongoing topology cmd rpc

2025-07-08 11:56:30 +03:00

topology_guard.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

topology_mutation.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

topology_mutation.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

topology_state_machine.cc

Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec

2025-03-13 14:08:30 +01:00

topology_state_machine.hh

Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec

2025-03-13 14:08:30 +01:00

view_update_backlog_broker.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00