Previously, raft_group0::abort() was called in storage_service::do_drain (introduced in #24418) to stop the group0 Raft server before destroying local storage. This was necessary because raft::server depends on storage (via raft_sys_table_storage and group0_state_machine). However, this caused issues: services like sstable_dict_autotrainer and auth::service, which use group0_client but are not stopped by storage_service, could trigger use-after-free if raft_group0 was destroyed too early. This can happen both during normal shutdown and when 'nodetool drain' is used. This commit reworks the shutdown logic: * Introduces abort_and_drain(), which aborts the server and waits for background tasks to finish, but keeps the server object alive. Clients will see raft::stopped_error if they try to access group0 after abort_and_drain(). * Final destruction happens in a separate method destroy(), called later from main.cc. The raft_server_for_group::aborted is changed to a shared_future -- abort_server now returns a future so that we can wait for it in abort_and_drain(), it should return the future from the previous abort_server call, which can happen in the on_background_error callback. Node startup can fail before reaching storage_service, in which case ss.drain_on_shutdown() and abort_and_drain() are never called. To ensure proper cleanup, abort_and_drain() is called from main.cc before destroy(). Clients of raft_group_registry are expected to call destroy_server() for the servers they own. Currently, the only such client is raft_group0, which satisfies this requirement. As a result, raft_group_registry::stop_servers() is no longer needed. Instead, raft_group_registry::stop() now verifies that all servers have been properly destroyed. If any remain, it calls on_internal_error(). The call to drain_on_shutdown() in cql_test_env.cc appears redundant. The only source of raft::server instances in raft_group_registry is group0_service, and if group0_service.start() succeeds, both abort_and_drain() and destroy() are guaranteed to be called during shutdown.
Scylla in-source tests.
For details on how to run the tests, see docs/dev/testing.md
Shared C++ utils, libraries are in lib/, for Python - pylib/
alternator - Python tests which connect to a single server and use the DynamoDB API unit, boost, raft - unit tests in C++ cqlpy - Python tests which connect to a single server and use CQL topology* - tests that set up clusters and add/remove nodes cql - approval tests that use CQL and pre-recorded output rest_api - tests for Scylla REST API Port 9000 scylla-gdb - tests for scylla-gdb.py helper script nodetool - tests for C++ implementation of nodetool
If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment is significantly different from some existing suite, e.g. you plan to start scylladb with different configuration, and you intend to add many tests and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).
To add a new folder, create a new directory, and then
copy & edit its suite.ini.