scylladb/test/boost
Patryk Jędrzejczak 07a7a75b98 Merge 'raft: implement the limited voters feature' from Emil Maskovsky
Currently, if raft is enabled, all nodes are voters in group0. However, it is not necessary for all nodes to be voters: it only slows down raft group operations (since the quorum is large) and makes deployments with asymmetrical DCs problematic (2 DCs with 5 nodes each alongside 1 DC with 10 nodes will lose the majority if the large DC is isolated).

The topology coordinator will now maintain a state where there is only a limited number of voters, evenly distributed across the DCs and racks.

After each node addition or removal the voters are recalculated and rebalanced if necessary. That means:
* When a new node is added, it might become a voter depending on the current distribution of voters - either if there are still some voter "slots" available, or if the new node is a better candidate than some existing voter (in which case the existing node's voter status might be revoked).
* When a voter node is removed or stopped (shut down), its voter status is revoked and another node might become a voter instead (this can also depend on other circumstances, e.g. a change in the number of DCs).
* If a node addition or removal changes the number of data centers (DCs) or racks, the rebalance action might become wider (special rules apply to 1 vs 2 vs more DCs, and a change in the number of racks can have similar effects on the voter distribution).

Special conditions for various numbers of DCs:
* 1 DC: Can have up to the maximum allowed number of voters (5 - see below).
* 2 DCs: The distribution of the voters will be asymmetric (if possible), meaning that we can tolerate the loss of the DC with the smaller number of voters (if both had the same number of voters, we'd lose majority if either DC were lost). For example, if we have 2 DCs with 2 nodes each, one of them will only have 1 voter (despite the limit of 5). Also, if one of the 2 DCs has more racks than the other and the node count allows it, the DC with more racks will have more voters.
* 3 or more DCs: The voters will be distributed so that every DC has strictly less than half of the total voters (so the loss of any single DC cannot lead to majority loss). Again, DCs with more racks are preferred in the voter distribution.
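
The per-DC rules above come down to simple majority arithmetic. The following is a minimal sketch of that arithmetic only, not the voter registry implementation from this PR; the function names and the DC labels are hypothetical and chosen for illustration.

```python
from typing import List, Mapping


def majority(total_voters: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return total_voters // 2 + 1


def survives_loss_of(voters_per_dc: Mapping[str, int], lost_dc: str) -> bool:
    """True if the remaining DCs still hold a majority after lost_dc fails."""
    total = sum(voters_per_dc.values())
    remaining = total - voters_per_dc[lost_dc]
    return remaining >= majority(total)


def tolerated_dc_losses(voters_per_dc: Mapping[str, int]) -> List[str]:
    """Names of the DCs whose loss alone does not cost the majority."""
    return sorted(dc for dc in voters_per_dc if survives_loss_of(voters_per_dc, dc))


# Hypothetical voter layouts (DC name -> number of voters).
all_nodes_vote = {"dc1": 5, "dc2": 5, "dc3": 10}   # every node is a voter
two_dcs_asymmetric = {"dc1": 2, "dc2": 1}          # 2 DCs, asymmetric split
two_dcs_symmetric = {"dc1": 2, "dc2": 2}           # 2 DCs, equal split
three_dcs = {"dc1": 2, "dc2": 2, "dc3": 1}         # each DC below half of 5
```

With every node voting, losing the 10-node DC leaves 10 of 20 voters, which is below the majority of 11. With the asymmetric 2+1 split only the smaller DC can be lost, with the symmetric 2+2 split neither can, and with three DCs each holding fewer than half of the voters any single DC can be lost.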

At the moment we handle zero-token nodes in the same way as regular nodes (i.e. zero-token nodes do not take any priority in the voter distribution). Technically it doesn't make much sense to have a zero-token node that is not a voter (when regular nodes in the same DC are voters), but currently the intended purpose of zero-token nodes is to form an "arbiter DC" (in the case of 2 DCs, creating a third DC with zero-token nodes only), so for that purpose no special handling is needed and it will work out of the box. If a preference for zero-token nodes is eventually needed or requested, it will be added separately from this PR.

The maximum of 5 voters has been chosen as the smallest "safe" value. We can lose majority when multiple nodes (possibly in different DCs and racks) die independently in a short time span. With fewer than 5 voters, we would lose majority if 2 voters died, which is very unlikely to happen but not entirely impossible. With 5 voters, at least 3 voters must die to lose majority, which can be safely considered impossible in the case of independent failures.
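
The choice of 5 can be checked with a short calculation. This is a minimal sketch of the reasoning above, with a hypothetical helper name; it is not code from the PR.

```python
def failures_to_lose_majority(voters: int) -> int:
    """Minimum number of voters that must fail before the majority is lost."""
    needed = voters // 2 + 1      # size of a strict majority
    return voters - needed + 1    # one failure more than the group can absorb


# Map each voter count up to the limit of 5 to the failures it cannot survive.
table = {n: failures_to_lose_majority(n) for n in range(1, 6)}
```

Here `table` maps 3 and 4 voters to 2 failures and 5 voters to 3 failures, which matches the numbers quoted in the paragraph above.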

Currently the limit is not configurable (we might introduce configurable limits later if needed or requested).

Tests added:
* boost/group0_voter_registry_test.cc: run time on CI: ~3.5s
* topology_custom/test_raft_voters.py: parametrized with 1 or 3 nodes per DC, run time on CI: ~20s with 1 node per DC, ~40s with 3 nodes per DC, approx. 1 min total

Fixes: scylladb/scylladb#18793

No backport: This is a new feature that will not be backported.

Closes scylladb/scylladb#21969

* https://github.com/scylladb/scylladb:
  raft: distribute voters by rack inside DC
  raft/test: fix lint warnings in `test_raft_no_quorum`
  raft/test: add the upgrade test for limited voters feature
  raft topology: handle on_up/on_down to add/remove node from voters
  raft: fix the indentation after the limited voters changes
  raft: implement the limited voters feature
  raft: drop the voter removal from the decommission
  raft/test: disable the `stop_before_becoming_raft_voter` test
  raft/test: stop the server less gracefully in the voters test
2025-04-10 15:29:15 +02:00

Scylla unit tests using C++ and the Boost test framework

The source files in this directory are Scylla unit tests written in C++ using the Boost.Test framework. These unit tests come in three flavors:

  1. Some simple tests that check stand-alone C++ functions or classes use Boost's BOOST_AUTO_TEST_CASE.

  2. Some tests require Seastar features, and need to be declared with Seastar's extensions to Boost.Test, namely SEASTAR_TEST_CASE.

  3. Even more elaborate tests require not just a functioning Seastar environment but also a complete (or partial) Scylla environment. Those tests use the do_with_cql_env() or do_with_cql_env_thread() function to set up a mostly-functioning environment behaving like a single-node Scylla, in which the test can run.

While we have many tests of the third flavor, writing new tests of this type should be reserved for white-box tests - tests where it is necessary to inspect or control Scylla internals that do not have user-facing APIs such as CQL. In contrast, black-box tests - tests that can be written using only user-facing APIs - should be written in one of the newer test frameworks that we offer, such as test/cqlpy or test/alternator (in Python, using the CQL or DynamoDB APIs respectively) or test/cql (using textual CQL commands), or - if more than one Scylla node is needed for a test - using the test/topology* framework.

Running tests

Because these are C++ tests, they need to be compiled before running. To compile a single test executable row_cache_test, use a command like

ninja build/dev/test/boost/row_cache_test

You can also use ninja dev-test to build all C++ tests, or use ninja dev-build to build the C++ tests and also the full Scylla executable (however, note that the full Scylla executable isn't needed to run Boost tests).

Replace "dev" with "debug" or "release" in the examples above and below to use the "debug" build mode (which, importantly, compiles the test with ASAN and UBSAN enabled and helps catch difficult-to-catch use-after-free bugs) or the "release" build mode (optimized for run speed).

To run an entire test file row_cache_test, including all its test functions, use a command like:

build/dev/test/boost/row_cache_test -- -c1 -m1G 

To run a single test function test_reproduce_18045() from the longer test file, use a command like:

build/dev/test/boost/row_cache_test -t test_reproduce_18045 -- -c1 -m1G 

In these command lines, the parameters before the -- are passed to Boost.Test, while the parameters after the -- are passed to the test code, and in particular to Seastar. In this example Seastar is asked to run on one CPU (-c1) and use 1G of memory (-m1G) instead of hogging the entire machine. The Boost.Test option -t test_reproduce_18045 asks it to run just this one test function instead of all the test functions in the executable.
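
To make the `--` convention concrete, here is a small illustration of how such a command line divides into the two groups of parameters. It only mirrors the description above; it is not how Boost.Test or Seastar parse their arguments, and the helper name is hypothetical.

```python
import shlex


def split_test_args(argv):
    """Split arguments at the first "--": (before, after)."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    # No separator: everything belongs to the first group.
    return list(argv), []


command = "build/dev/test/boost/row_cache_test -t test_reproduce_18045 -- -c1 -m1G"
executable, *args = shlex.split(command)
boost_args, seastar_args = split_test_args(args)
```

For the command shown, `boost_args` holds `-t test_reproduce_18045` and `seastar_args` holds `-c1 -m1G`.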

Unfortunately, interrupting a running test with control-C doesn't work. This is a known bug (#5696). Kill a test with SIGKILL (-9) if you need to stop it while it's running.

Boost tests can also be run using test.py - which is a script that provides a uniform way to run all tests in scylladb.git - C++ tests, Python tests, etc.

Execution with pytest

To run all tests with pytest execute

pytest test/boost

To execute all tests in one file, provide the path to the source filename as a parameter

pytest test/boost/aggregate_fcts_test.cc

Since it's a normal path, autocompletion works in the terminal out of the box.

To execute only one test function, provide the path to the source file and function name

pytest --mode dev test/boost/aggregate_fcts_test.cc::test_aggregate_avg

To select a specific mode, pass the --mode parameter (for example --mode dev). If the parameter isn't provided, pytest tries to use ninja mode_list to find out the compiled modes.

Parallel execution is controlled by pytest-xdist and its -n parameter. With -n auto, pytest starts as many workers as there are CPU cores. A useful command to discover the tests in a file or directory is

pytest --collect-only -q --mode dev test/boost/aggregate_fcts_test.cc

That will list all test functions in the file. To execute only one function, you can pass a test ID from that output to pytest, but without its mode suffix. For example, if the output shows test/boost/aggregate_fcts_test.cc::test_aggregate_avg.dev, run that test function with the following command:

pytest --mode dev test/boost/aggregate_fcts_test.cc::test_aggregate_avg
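
When scripting around the collected output, the suffix-stripping step can be automated. The helper below is a hypothetical sketch, not part of test.py or the pytest plugin; it assumes the suffix is one of the modes named in this README (dev, debug, release).

```python
# Build modes mentioned in this README; an assumption, other modes may exist.
KNOWN_MODES = ("dev", "debug", "release")


def pytest_command(collected_id: str) -> list:
    """Turn a collected test ID with a mode suffix into a pytest command line."""
    base, sep, mode = collected_id.rpartition(".")
    if sep and mode in KNOWN_MODES:
        # Drop the ".<mode>" suffix and pass the mode explicitly instead.
        return ["pytest", "--mode", mode, base]
    # No recognised suffix: pass the ID through unchanged.
    return ["pytest", collected_id]


cmd = pytest_command("test/boost/aggregate_fcts_test.cc::test_aggregate_avg.dev")
```

Joining `cmd` with spaces gives the same command line as shown above.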

Writing tests

Because of the large build time and build size of each separate test executable, it is recommended to put test functions into relatively large source files. But not too large - to keep compilation time of a single source file (during development) at reasonable levels.

When adding new source files in test/boost, don't forget to list the new source file in configure.py and also in CMakeLists.txt. The former is needed by our CI, but the latter is preferred by some developers.