scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00


Dawid Mędrek c56e47f72f db/hints: Cancel draining when stopping node

Draining hints may occur in one of two scenarios:

* a node leaves the cluster and the local node drains all of the hints
  saved for that node,
* the local node is being decommissioned.

Draining may take some time and the hint manager won't stop until it
finishes. It's not a problem when decommissioning a node, especially
because we want the cluster to retain the data stored in the hints.
However, it may become a problem when the local node started draining
hints saved for another node and now it's being shut down.

There are two reasons for that:

* Generally, in situations like that, we'd like to be able to shut down
  nodes as fast as possible. The data stored in the hints won't
  disappear from the cluster yet since we can restart the local node.
* Draining hints may introduce flakiness in tests. Replaying hints doesn't
  have the highest priority and it's reflected in the scheduling groups we
  use as well as the explicitly enforced throughput. If there are a large
  number of hints to be replayed, it might affect our tests.
  It's already happened, see: scylladb/scylladb#21949.

To solve those problems, we change the semantics of draining. It will behave
as before when the local node is being decommissioned. However, when the
local node is only being stopped, we will immediately cancel all ongoing
draining processes and stop the hint manager. To make up for that, when we
start a node and it initializes a hint endpoint manager corresponding to
a node that's already left the cluster, we will begin the draining process
of that endpoint manager right away.

That should ensure all data is retained, while possibly speeding up
the shutdown process.
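
The semantics described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not ScyllaDB's hint manager (which is C++): the names `HintManagerSketch`, `StopReason`, `start`, `drain`, and `stop` are hypothetical, hints are modelled as plain counters, and "replaying" a hint just decrements a counter.

```python
import asyncio
from enum import Enum


class StopReason(Enum):
    DECOMMISSION = "decommission"  # local node leaves the cluster for good
    SHUTDOWN = "shutdown"          # local node is only being stopped


class HintManagerSketch:
    """Toy model of the draining semantics; not ScyllaDB's real hint manager."""

    def __init__(self, pending_hints, departed_nodes):
        # pending_hints: dict mapping endpoint name -> number of saved hints
        self.pending = dict(pending_hints)
        self.departed = set(departed_nodes)
        self.drains = {}

    async def _drain(self, endpoint):
        # Replay hints one at a time, yielding to the event loop in between.
        while self.pending.get(endpoint, 0) > 0:
            await asyncio.sleep(0)
            self.pending[endpoint] -= 1

    def start(self):
        # On startup, immediately drain endpoints that already left the cluster.
        for endpoint in self.pending:
            if endpoint in self.departed:
                self.drain(endpoint)

    def drain(self, endpoint):
        # Must be called while an event loop is running.
        if endpoint not in self.drains:
            self.drains[endpoint] = asyncio.create_task(self._drain(endpoint))

    async def stop(self, reason):
        tasks = list(self.drains.values())
        if reason is StopReason.SHUTDOWN:
            # Plain shutdown: cancel ongoing drains; hints stay saved for later.
            for task in tasks:
                task.cancel()
        # Decommission: wait for every drain to finish so no hint is lost.
        await asyncio.gather(*tasks, return_exceptions=True)
        self.drains.clear()
```

In this model, stopping with `StopReason.SHUTDOWN` leaves the remaining hints in `pending`, and a later `start()` on a manager that knows the endpoint has departed picks the drain up again.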

There's a small trade-off, though. If we stop a node, we can then
remove it. In that case it won't get a chance to replay hints that it
might have replayed before these changes, but that's an edge case. We
expect this commit to bring more benefit than harm.

We also provide tests verifying that the implementation works as intended.

Fixes scylladb/scylladb#21949

Closes scylladb/scylladb#22811

(cherry picked from commit 0a6137218a)

Closes scylladb/scylladb#23370

2025-04-03 09:09:05 +02:00

alternator

alternator: document the state of tablet support in Alternator

2025-03-16 18:25:21 +02:00

auth_cluster

test/auth_cluster: make test_service_level_metric_name_change useful

2025-01-20 18:17:15 +01:00

boost

Merge 'tablets: Make load balancing capacity-aware' from Tomasz Grabiec

2025-03-25 23:16:35 +01:00

broadcast_tables

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cql

test: enable_create_table_with_compact_storage for tests that need it

2025-01-20 08:14:37 +02:00

cqlpy

service: query_pager: fix last-position for filtering queries

2025-02-13 09:40:05 +02:00

ldap

Introduce LDAP role manager & saslauthd authenticator

2025-01-12 14:50:29 +02:00

lib

Merge 'tablets: Make load balancing capacity-aware' from Tomasz Grabiec

2025-03-25 23:16:35 +01:00

manual

messaging_service: drop the usage of ip based token_metadata APIs

2025-01-16 16:37:06 +02:00

nodetool

tools/scylla-nodetool: netstats: don't assume both senders and receivers

2025-02-17 14:34:36 +02:00

object_store

sstables_loader: report progress with the unit of batch

2025-01-13 09:04:35 +03:00

perf

Merge 'tablets: Make load balancing capacity-aware' from Tomasz Grabiec

2025-03-25 23:16:35 +01:00

pylib

test: add test to check dcs and hosts repair filter

2025-02-27 12:14:47 +01:00

pylib_test

test.py: Create central conftest.

2024-11-24 20:09:48 +02:00

raft

test: Use linux-aio backend again on seastar-based tests

2025-02-12 20:50:51 +02:00

redis

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

resource

build: cmake: use wasm32-wasip1 as an alternative of wasm32-wasi

2025-01-16 16:28:29 +03:00

rest_api

api: task_manager: do not unregister tasks on get_status

2025-01-31 08:21:03 +00:00

scylla_gdb

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

topology

test: Fix inconsistent naming of the log files.

2025-01-21 10:45:17 +02:00

topology_custom

db/hints: Cancel draining when stopping node

2025-04-03 09:09:05 +02:00

topology_random_failures

test.py: random_failures: deselect topology ops for some injections

2025-03-27 13:19:59 +02:00

topology_tasks

Merge '[Backport 2025.1] api: task_manager: do not unregister finish task when its status is queried' from Scylladb[bot]

2025-02-13 09:38:12 +02:00

unit

utils: do not include unused headers

2025-01-14 07:56:39 -05:00

__init__.py

…

CMakeLists.txt

Introduce LDAP role manager & saslauthd authenticator

2025-01-12 14:50:29 +02:00

conftest.py

test.py: Create central conftest.

2024-11-24 20:09:48 +02:00

pytest.ini

test/pytest.ini: ignore warning on deprecated record_property fixture

2024-12-30 10:58:31 +02:00

README.md

…

README.md

Scylla in-source tests.

For details on how to run the tests, see docs/dev/testing.md

Shared C++ utilities and libraries are in lib/; shared Python ones are in pylib/.

* alternator - Python tests which connect to a single server and use the DynamoDB API
* unit, boost, raft - unit tests in C++
* cqlpy - Python tests which connect to a single server and use CQL
* topology* - tests that set up clusters and add/remove nodes
* cql - approval tests that use CQL and pre-recorded output
* rest_api - tests for Scylla REST API Port 9000
* scylla-gdb - tests for scylla-gdb.py helper script
* nodetool - tests for C++ implementation of nodetool

If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment differs significantly from that of any existing suite: for example, when you plan to start scylladb with a different configuration, intend to add many tests, and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).

To add a new folder, create a new directory, then give it a suite.ini by copying and editing one from an existing folder.
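
The folder-creation step can be scripted. The following is a minimal sketch, assuming the suite.ini is copied from an existing suite folder; the helper name `add_test_folder` and its parameters are hypothetical and not part of the repository, and the copied file still has to be edited by hand afterwards.

```python
import shutil
from pathlib import Path


def add_test_folder(test_root, new_folder, template_folder, config_name="suite.ini"):
    """Create test_root/new_folder and seed it with a copy of the template's config."""
    root = Path(test_root)
    source = root / template_folder / config_name
    if not source.is_file():
        raise FileNotFoundError(f"no {config_name} in {template_folder}")
    target_dir = root / new_folder
    target_dir.mkdir()  # raises FileExistsError if the folder already exists
    # Copy the template config; editing it for the new suite is left to the author.
    return Path(shutil.copy(source, target_dir / config_name))
```

For example, `add_test_folder("test", "my_suite", "cqlpy")` would create `test/my_suite/` with a copy of the config file from `test/cqlpy/`, provided that file exists there.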