Avi Kivity b58dbe57aa Merge 'repair: introduce and use buffer size hint for mixed-shard multishard reader' from Botond Dénes
Add a buffer hint to the multishard reader. This is an internal hint, used by the multishard reader to provide a hint to the shard reader, on how much data exactly is needed by the multishard reader from the respective shard. This hint allows eliminating extraneous cross-shard round-trips and possible shard reader evict-recreate cycles. Building on this, repair sets its own row buffer size as the max buffer size on the multishard reader, ensuring that the row buffer is filled with the minimum amount of cross-shard round trips and minimal reader recreation.
To further eliminate unnecessary evictions, this PR also disables the multishard reader's read-ahead. This mechanism was designed to reduce latency for user reads, but it can be too aggressive for repair, causing unnecessary extra congestion on the already struggling streaming semaphores.

Refs: https://github.com/scylladb/scylladb/issues/18269
Fixes: https://github.com/scylladb/scylladb/issues/21113

The performance impact was measured with an SCT test, which creates a cluster of 3 nodes with 16 shards, then adds a 4th one with 12 shards.
Currently, bootstrap time is what suffers the most in mixed-shard clusters. The table below shows the improvement measured during bootstrap:

|              | master        | buffer-hint   | metric                                              |
| ------------ | ------------- | ------------- | --------------------------------------------------- |
| evictions    |          0.9M |         93.0K | scylla_database_paused_reads_permit_based_evictions |
| read (bytes) |          9.0T |          3.9T | scylla_reactor_aio_bytes_read                       |
| read (ops)   |         88.0M |         33.5M | scylla_reactor_aio_reads                            |
| time         |         56min |         20min | N/A                                                 |

This is a performance improvement, no backport required.

Closes scylladb/scylladb#20815

* github.com:scylladb/scylladb:
  test/boost/mutation_reader_test: add test for multishard reader buffer hint
  repair/row_level: disable read-ahead
  db/config: introduce repair_multishard_reader_enable_read_ahead
  readers/multishard: implement the read_ahead flag
  replica/database: make_multishard_streaming_reader(): expose the read_ahead parameter
  readers/multishard: add read_ahead parameter
  repair/row_level: set max buffer size on multishard reader
  replica/database: make_multishard_streaming_reader(): expose buffer_hint parameter
  db/config: introduce enable_repair_multishard_reader_buffer_hint
  readers/multishard: multishard_reader: pass hint to shard_reader
  readers/multishard: shard_reader_v2::fill_reader_buffer(): respect the hint
  readers/multishard: propagate fill_buffer_hint to shard_reader:fill_reader_buffer()
  readers/multishard: shard_reader: extract buffer-fill into its own method

Scylla unit tests using C++ and the Boost test framework

The source files in this directory are Scylla unit tests written in C++ using the Boost.Test framework. These unit tests come in three flavors:

  1. Some simple tests that check stand-alone C++ functions or classes use Boost's BOOST_AUTO_TEST_CASE.

  2. Some tests require Seastar features, and need to be declared with Seastar's extensions to Boost.Test, namely SEASTAR_TEST_CASE.

  3. Even more elaborate tests require not just a functioning Seastar environment but also a complete (or partial) Scylla environment. Those tests use the do_with_cql_env() or do_with_cql_env_thread() function to set up a mostly-functioning environment behaving like a single-node Scylla, in which the test can run.

While we have many tests of the third flavor, writing new tests of this type should be reserved for white-box tests - tests where it is necessary to inspect or control Scylla internals that do not have user-facing APIs such as CQL. In contrast, black-box tests - tests that can be written using only user-facing APIs - should be written in one of the newer test frameworks that we offer, such as test/cqlpy or test/alternator (in Python, using the CQL or DynamoDB APIs respectively) or test/cql (using textual CQL commands), or - if more than one Scylla node is needed for a test - using the test/topology* framework.

Running tests

Because these are C++ tests, they need to be compiled before running. To compile a single test executable row_cache_test, use a command like

ninja build/dev/test/boost/row_cache_test

You can also use ninja dev-test to build all C++ tests, or use ninja dev-build to build the C++ tests and also the full Scylla executable (however, note that the full Scylla executable isn't needed to run Boost tests).

Replace "dev" by "debug" or "release" in the examples above and below to use the "debug" build mode (which, importantly, compiles the test with ASAN and UBSAN enabled and helps catch hard-to-find use-after-free bugs) or the "release" build mode (optimized for run speed).

To run an entire test file row_cache_test, including all its test functions, use a command like:

build/dev/test/boost/row_cache_test -- -c1 -m1G 

To run a single test function test_reproduce_18045() from the longer test file, use a command like:

build/dev/test/boost/row_cache_test -t test_reproduce_18045 -- -c1 -m1G 

In these command lines, the parameters before the -- are passed to Boost.Test, while the parameters after the -- are passed to the test code, and in particular to Seastar. In this example Seastar is asked to run on one CPU (-c1) and use 1G of memory (-m1G) instead of hogging the entire machine. The Boost.Test option -t test_reproduce_18045 asks it to run just this one test function instead of all the test functions in the executable.
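The pieces of these command lines (build mode, test executable, optional test function, Seastar options) can be put together mechanically. The following is only a minimal sketch: boost_test_cmd is a hypothetical helper, not something that exists in scylladb.git, and it prints the command line rather than running it, so it can be tried without a build tree.

```shell
# Hypothetical helper (not part of scylladb.git): print the command line for
# running a Boost test executable, following the layout described above.
#   $1 - build mode: dev, debug or release
#   $2 - test executable name, e.g. row_cache_test
#   $3 - optional test function to select with Boost.Test's -t option
boost_test_cmd() {
    mode="$1"
    name="$2"
    func="${3:-}"
    cmd="build/${mode}/test/boost/${name}"
    # Options before the "--" go to Boost.Test.
    if [ -n "$func" ]; then
        cmd="$cmd -t $func"
    fi
    # Options after the "--" go to the test code, and in particular Seastar:
    # one CPU and 1G of memory, as in the examples above.
    echo "$cmd -- -c1 -m1G"
}

boost_test_cmd dev row_cache_test
boost_test_cmd debug row_cache_test test_reproduce_18045
```

The second call prints the single-function command line from above with "dev" replaced by "debug". To run a test instead of printing the command, paste the printed line into the shell.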

Unfortunately, interrupting a running test with control-C doesn't work. This is a known bug (#5696). Kill a test with SIGKILL (-9) if you need to kill it while it's running.
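As an illustration of the SIGKILL approach, the sketch below starts a background process, remembers its PID and kills it with -9. The sleep command is only a stand-in so the snippet runs anywhere; in practice the background command would be the test executable itself.

```shell
# Stand-in for a long-running test; substitute the real test command line.
sleep 300 &
test_pid=$!

# SIGKILL cannot be caught or ignored, so the process is terminated immediately.
kill -9 "$test_pid"

# Collect the exit status; a process killed by signal 9 is reported
# by the shell as 128 + 9 = 137.
status=0
wait "$test_pid" 2>/dev/null || status=$?
echo "test exited with status $status"
```

If the test was started from another terminal, its PID has to be looked up first, for example with ps, before passing it to kill -9.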

Boost tests can also be run using test.py - which is a script that provides a uniform way to run all tests in scylladb.git - C++ tests, Python tests, etc.

Writing tests

Because of the large build time and build size of each separate test executable, it is recommended to put test functions into relatively large source files. But not too large - to keep compilation time of a single source file (during development) at reasonable levels.

When adding new source files in test/boost, don't forget to list the new source file in configure.py and also in CMakeLists.txt. The former is needed by our CI, but the latter is preferred by some developers.