mirror of https://github.com/scylladb/scylladb.git synced 2026-04-29 20:57:00 +00:00

Tomasz Grabiec 7e2875d648 Merge 'Add tablet merge support' from Raphael Raph Carvalho

The goal of merge is to reduce the tablet count for a shrinking table, similar to how split increases the count while the table is growing. The load balancer's decision to merge was already implemented (it came with the infrastructure introduced for split), but it wasn't handled until now.

The initial tablet count is respected while the table is in "growing mode". For example, the table leaves this mode if there was a need to split above the initial tablet count. After the table leaves the mode, the average tablet size can be trusted to determine that the table is shrinking. A merge decision is emitted if the average tablet size is 50% of the target. Hysteresis is applied to avoid oscillations between splits and merges.
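The decision rule above can be sketched as follows. This is a minimal illustration, not the load balancer's code: the names `resize_kind`, `resize_params` and `decide_resize` are hypothetical, and the split threshold of twice the target is an assumption (the text only states the 50% merge threshold). The gap between the two thresholds is what provides the hysteresis in this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; these are not ScyllaDB's types.
enum class resize_kind { none, split, merge };

struct resize_params {
    std::uint64_t target_tablet_size = 0; // desired average tablet size, in bytes
    double split_factor = 2.0;            // ASSUMPTION: split once avg >= 2x target
    double merge_factor = 0.5;            // from the text: merge at 50% of target
};

// Returns the resize decision for a table given its average tablet size.
// While the table is in "growing mode", merge is suppressed so that the
// initial tablet count is respected.
resize_kind decide_resize(std::uint64_t avg_tablet_size,
                          bool in_growing_mode,
                          const resize_params& p) {
    // The thresholds must leave a dead band between them, otherwise a
    // merge could immediately trigger a split (and vice versa).
    assert(p.merge_factor < p.split_factor);
    const double avg = static_cast<double>(avg_tablet_size);
    const double target = static_cast<double>(p.target_tablet_size);
    if (avg >= p.split_factor * target) {
        return resize_kind::split;
    }
    if (!in_growing_mode && avg <= p.merge_factor * target) {
        return resize_kind::merge;
    }
    return resize_kind::none;
}
```

With these assumed factors, a merge doubles the average size from half the target to the target, which is still below the split threshold, so the two decisions do not chase each other.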

Similar to split, the decision to merge is recorded in the tablet map's resize_type field with the string "merge". This is important in case of coordinator failover, so that the new coordinator continues from where the old one left off.

Unlike split, the preparation phase during merge is not done by the replica (with split compactions), but rather by the coordinator, which co-locates sibling tablets in the same node's shard. Sibling tablets are tablets that have contiguous token ranges and will become one after merge. The concept is based on the power-of-two constraint and token contiguity. For example, in a table with 4 tablets, tablets with ids 0 and 1 are siblings, and 2 and 3 are also siblings.
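The sibling relation can be expressed with simple integer arithmetic. The helpers below are a hypothetical sketch assuming tablet ids are numbered from 0 in token order and the tablet count is a power of two; they are not the tablet_map API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// True if n is a non-zero power of two (the constraint on tablet counts).
inline bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// The sibling of a tablet differs only in the lowest bit of its id:
// 0 <-> 1, 2 <-> 3, and so on.
inline std::size_t sibling_of(std::size_t tablet_id) {
    return tablet_id ^ 1;
}

// Id of the tablet that `tablet_id` becomes part of after the map is halved.
inline std::size_t merged_id(std::size_t tablet_id) {
    return tablet_id / 2;
}

// The (even, odd) pair of old tablets that form new tablet `new_id`.
inline std::pair<std::size_t, std::size_t> siblings_for(std::size_t new_id) {
    return std::make_pair(2 * new_id, 2 * new_id + 1);
}
```

For the 4-tablet example, `sibling_of(2)` is 3 and both map to `merged_id` 1 in the resulting 2-tablet table.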

The algorithm for co-locating sibling tablets is very simple. The balancer is responsible for it, and it will emit migrations so that the "odd" tablet follows the "even" one. For example, tablet 1 will be migrated to where tablet 0 lives. Co-location is low priority: delaying a merge is not the end of the world, but delaying e.g. decommission or even regular load balancing is not ideal, as that can translate into temporary imbalance that impacts user activity. So co-location migrations will happen only when there is no more important work to do.
While regular balancing is higher in priority, it will not undo the co-location work done so far. It achieves that by treating co-located tablets as if they were already merged. The load inversion convergence check was adjusted so that the balancer understands when two tablets are being migrated instead of one, to avoid oscillations.
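The "odd follows even" rule can be sketched as below. This is a simplified illustration with hypothetical types (`replica`, `migration`, `plan_colocation`): it assumes a single replica per tablet and ignores priorities, load, and replication constraints, all of which the real balancer has to account for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model: one replica (node + shard) per tablet.
struct replica {
    int node;
    int shard;
    bool operator==(const replica& o) const {
        return node == o.node && shard == o.shard;
    }
};

struct migration {
    std::size_t tablet; // id of the tablet being moved
    replica src;        // where it lives now
    replica dst;        // where its even sibling lives
};

// For every sibling pair (even, even + 1), emit a migration that moves the
// odd tablet to the even tablet's location, unless already co-located.
std::vector<migration> plan_colocation(const std::vector<replica>& location) {
    assert(location.size() % 2 == 0); // tablets come in sibling pairs
    std::vector<migration> plan;
    for (std::size_t even = 0; even + 1 < location.size(); even += 2) {
        const std::size_t odd = even + 1;
        if (!(location[odd] == location[even])) {
            plan.push_back({odd, location[odd], location[even]});
        }
    }
    return plan;
}
```

An empty plan means every pair is already co-located, which is the condition under which the table can be reported as ready for merge.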

When the balancer completes co-location work for a table undergoing merge, it puts the id of the table into the resize_plan, which communicates to the topology coordinator that the table is ready for merge. With all sibling tablets co-located, the coordinator can resize the tablet map (reduce it by a factor of 2) and record the new map into group0. All the replicas will react to it (on token metadata update) by merging the storage (memtable(s) + sstables) of sibling tablets into one.
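The resize step can be sketched as halving a vector of tablet locations. This is only an illustration under the same simplifying assumption of one replica per tablet; `replica`, `all_siblings_colocated` and `merge_tablet_map` are hypothetical names, and recording the map into group0 and merging storage on the replicas are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model: one replica (node + shard) per tablet.
struct replica {
    int node;
    int shard;
};

inline bool same_location(const replica& a, const replica& b) {
    return a.node == b.node && a.shard == b.shard;
}

// True if the map has an even, non-zero number of tablets and every
// sibling pair (2i, 2i + 1) lives on the same node and shard.
bool all_siblings_colocated(const std::vector<replica>& map) {
    if (map.empty() || map.size() % 2 != 0) {
        return false;
    }
    for (std::size_t i = 0; i < map.size(); i += 2) {
        if (!same_location(map[i], map[i + 1])) {
            return false;
        }
    }
    return true;
}

// Reduces the map by a factor of 2: new tablet i takes the (shared)
// location of old tablets 2i and 2i + 1.
std::vector<replica> merge_tablet_map(const std::vector<replica>& old_map) {
    assert(all_siblings_colocated(old_map)); // precondition from the text
    std::vector<replica> merged;
    merged.reserve(old_map.size() / 2);
    for (std::size_t i = 0; i < old_map.size(); i += 2) {
        merged.push_back(old_map[i]);
    }
    return merged;
}
```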

Fixes #18181.

system test details:

test: https://github.com/pehala/scylla-cluster-tests/blob/tablets_split_merge/tablets_split_merge_test.py
yaml file: https://github.com/pehala/scylla-cluster-tests/blob/tablets_split_merge/test-cases/features/tablets/tablets-split-merge-test.yaml

instance type: i3.8xlarge
nodes: 3
target tablet size: 0.5G (scaled down by 10, to make it easier to trigger splits and merges)
description: multiple cycles of growing and shrinking the data set in order to trigger splits and merges.
data_set_size: ~100G
initial_tablets: 64, so it grew to 128 tablets on split, and back to 64 on merge.

latency of reads and writes that happened in parallel to split and merge:
```
$ for i in scylla-bench*; do cat $i | grep "Mode\|99th:\|99\.9th:"; done
Mode:			 write
  99.9th:	 3.145727ms
  99th:		 1.998847ms
  99.9th:	 3.145727ms
  99th:		 2.031615ms
Mode:			 read
  99.9th:	 3.145727ms
  99th:		 2.031615ms
  99.9th:	 3.145727ms
  99th:		 2.031615ms
Mode:			 write
  99.9th:	 3.047423ms
  99th:		 1.933311ms
  99.9th:	 3.047423ms
  99th:		 1.933311ms
Mode:			 read
  99.9th:	 3.145727ms
  99th:		 1.900543ms
  99.9th:	 3.145727ms
  99th:		 1.900543ms
Mode:			 write
  99.9th:	 5.079039ms
  99th:		 3.604479ms
  99.9th:	 35.389439ms
  99th:		 25.624575ms
Mode:			 write
  99.9th:	 3.047423ms
  99th:		 1.998847ms
  99.9th:	 3.047423ms
  99th:		 1.998847ms
Mode:			 read
  99.9th:	 3.080191ms
  99th:		 2.031615ms
  99.9th:	 3.112959ms
  99th:		 2.031615ms
```

Closes scylladb/scylladb#20572

* github.com:scylladb/scylladb:
  docs: Document tablet merging
  tests/boost: Add test to verify correctness of balancer decisions during merge
  tests/topology_experimental_raft: Add tablet merge test
  service: Handle exception when retrying split
  service: Co-locate sibling tablets for a table undergoing merge
  gms: Add cluster feature for tablet merge
  service: Make merge of resize plan commutative
  replica: Implement merging of compaction groups on merge completion
  replica: Handle tablet merge completion
  service: Implement tablet map resize for merge
  locator: Introduce merge_tablet_info()
  service: Rename topology::transition_state::tablet_split_finalization
  service: Respect initial_tablet_count if table is in growing mode
  service: Wire migration_tablet_set into the load balancer
  locator: Add tablet_map::sibling_tablets()
  service: Introduce sorted_replicas_for_tablet_load()
  locator/tablets: Extend tablet_replica equality comparator to three-way
  service: Introduce alias to per-table candidate map type
  service: Add replication constraint check variant for migration_tablet_set
  service: Add convergence check variant for migration_tablet_set
  service: Add migration helpers for migration_tablet_set
  service/tablet_allocator: Introduce migration_tablet_set
  service: Introduce migration_plan::add(migrations_vector)
  locator/tablets: Introduce tablet_map::for_each_sibling_tablets()
  locator/tablets: Introduce tablet_map::needs_merge()
  locator/tablets: Introduce resize_decision::initial_decision()
  locator/tablets: Fix return type of three-way comparison operators
  service: Extract update of node load on migrations
  service: Extract converge check for intra-node migration
  service: Extract erase of tablet replicas from candidate list
  scripts/tablet-mon: Allow visualization of tablet id

2024-12-06 18:06:20 +01:00

address_map_test.cc

test: rename raft_address_map_test to address_map_test and move if from raft tests

2024-12-02 10:31:14 +02:00

aggregate_fcts_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

allocation_strategy_test.cc

…

alternator_unit_test.cc

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

anchorless_list_test.cc

…

auth_passwords_test.cc

…

auth_resource_test.cc

test: use generic boost_test_print_type()

2024-05-20 12:56:20 +03:00

auth_test.cc

cql3: Rename SALTED HASH to HASHED PASSWORD

2024-10-30 14:07:58 +02:00

aws_error_injection_test.cc

aws_errors: Make error messages more verbose.

2024-11-07 21:01:25 +02:00

aws_errors_test.cc

aws_errors: Change aws_error::parse to return std::optional<>

2024-11-07 21:01:25 +02:00

batchlog_manager_test.cc

db/batchlog_manager: do_batch_log_replay(): add cleanup flag

2024-10-30 11:07:57 +08:00

big_decimal_test.cc

utils/big_decimal: add fast paths to operator <=>

2024-12-03 14:56:51 +02:00

bloom_filter_test.cc

boost/bloom_filter_test: wait for total memory reclaimed update

2024-07-26 08:15:11 +03:00

bptree_test.cc

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

bptree_validation.hh

test: include fmt/iostream.h and iostream when appropriate

2024-11-12 17:34:08 +02:00

broken_sstable_test.cc

test: Make tests use schema_builder instead of make_shared_schema

2024-09-05 19:31:30 +03:00

btree_test.cc

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

btree_validation.hh

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

bytes_ostream_test.cc

serialization: replace boost::type with std::type_identity

2024-11-05 00:43:27 +01:00

cache_algorithm_test.cc

auth: do not include unused headers

2024-06-17 17:33:55 +03:00

cache_mutation_reader_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

cached_file_test.cc

cached_file: Adapt page_view to ContiguousSharedBuffer

2024-09-27 01:25:15 +02:00

caching_options_test.cc

…

canonical_mutation_test.cc

canonical_mutation: add make_canonical_mutation_gently

2024-05-02 19:37:04 +03:00

cartesian_product_test.cc

…

castas_fcts_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cdc_generation_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

cdc_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

cell_locker_test.cc

…

checksum_utils_test.cc

treewide: include seastar/core/format.hh instead of seastar/core/print.hh

2024-11-14 17:45:07 +02:00

chunked_managed_vector_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

chunked_vector_test.cc

utils: chunked_vector: add from_range_t constructor

2024-10-31 19:32:16 +02:00

clustering_ranges_walker_test.cc

…

CMakeLists.txt

test: rename raft_address_map_test to address_map_test and move if from raft tests

2024-12-02 10:31:14 +02:00

collection_stress.hh

test: Move stress-collecton header from unit to boost

2024-09-24 13:42:13 +03:00

column_mapping_test.cc

test: use generic boost_test_print_type()

2024-05-20 12:56:20 +03:00

commitlog_cleanup_test.cc

raft_group0_client: uninclude "db/system_keyspace.hh"

2024-09-28 16:31:53 +03:00

commitlog_test.cc

cross-tree: change to_sstring_view() to to_string_view()

2024-11-18 14:57:49 +02:00

compaction_group_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

compound_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

compress_test.cc

…

config_test.cc

treewide: drop thrift support

2024-06-07 06:44:59 +08:00

continuous_data_consumer_test.cc

…

counter_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

cql_auth_query_test.cc

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

cql_auth_syntax_test.cc

cql3: Rename SALTED HASH to HASHED PASSWORD

2024-10-30 14:07:58 +02:00

cql_functions_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cql_query_group_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cql_query_large_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cql_query_like_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cql_query_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

crc_test.cc

treewide: include seastar/core/format.hh instead of seastar/core/print.hh

2024-11-14 17:45:07 +02:00

data_listeners_test.cc

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

database_test.cc

snapshots: Stop taking snapshots of MVs

2024-11-26 15:27:30 +02:00

dirty_memory_manager_test.cc

treewide: include seastar/core/format.hh instead of seastar/core/print.hh

2024-11-14 17:45:07 +02:00

double_decker_test.cc

treewide: include seastar/core/format.hh instead of seastar/core/print.hh

2024-11-14 17:45:07 +02:00

duration_test.cc

…

dynamic_bitset_test.cc

treewide: s/boost::algorithm::any_of/std::ranges::any_of/

2024-11-05 14:06:09 +08:00

enum_option_test.cc

…

enum_set_test.cc

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

error_injection_test.cc

error_injection: Remove unused inject(sleep, then invoke) overload

2024-11-05 09:56:08 +02:00

estimated_histogram_test.cc

test/estimated_histogram_test Add summary tests

2024-08-22 23:34:24 +03:00

exception_container_test.cc

…

exceptions_fallback_test.cc

…

exceptions_optimized_test.cc

…

exceptions_test.inc.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

expr_test.cc

cql3: introduce dialect infrastructure

2024-08-29 21:19:23 +03:00

extensions_test.cc

…

filtering_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

flush_queue_test.cc

treewide: include seastar/core/format.hh instead of seastar/core/print.hh

2024-11-14 17:45:07 +02:00

fragmented_temporary_buffer_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

frozen_mutation_test.cc

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

generic_server_test.cc

test/generic_server: add test case

2024-08-28 10:59:44 +02:00

gossiping_property_file_snitch_test.cc

test: use generic boost_test_print_type()

2024-05-20 12:56:20 +03:00

group0_cmd_merge_test.cc

raft_group0_client: uninclude "db/system_keyspace.hh"

2024-09-28 16:31:53 +03:00

group0_test.cc

raft: add the check for the group0 tables

2024-10-08 21:08:11 +02:00

hash_test.cc

…

hashers_test.cc

…

hint_test.cc

test: change sstring_view to std::string_view

2024-11-18 16:26:20 +02:00

idl_test.cc

serialization: replace boost::type with std::type_identity

2024-11-05 00:43:27 +01:00

index_reader_test.cc

sstables: bsearch_clustered_cursor: Narrow down range using "end" position of the block

2024-10-03 14:16:05 +02:00

index_with_paging_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

input_stream_test.cc

…

intrusive_array_test.cc

…

json_cql_query_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

json_test.cc

…

keys_test.cc

serialization: replace boost::type with std::type_identity

2024-11-05 00:43:27 +01:00

large_paging_state_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

like_matcher_test.cc

…

limiting_data_source_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

linearizing_input_stream_test.cc

…

lister_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

loading_cache_test.cc

…

locator_topology_test.cc

locator: topology: make topology object always contain local node

2024-12-02 10:31:11 +02:00

log_heap_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

logalloc_standard_allocator_segment_pool_backend_test.cc

…

logalloc_test.cc

treewide: include seastar/core/format.hh instead of seastar/core/print.hh

2024-11-14 17:45:07 +02:00

managed_bytes_test.cc

…

managed_vector_test.cc

…

map_difference_test.cc

…

memtable_test.cc

replica: implement memtable_flush_period_in_ms schema option

2024-10-17 13:41:15 +03:00

multishard_combining_reader_as_mutation_source_test.cc

treewide: s/boost::adaptors::map_values/std::views::values/

2024-10-27 21:32:45 +02:00

multishard_mutation_query_test.cc

serialization: replace boost::type with std::type_identity

2024-11-05 00:43:27 +01:00

murmur_hash_test.cc

treewide: include seastar/core/format.hh instead of seastar/core/print.hh

2024-11-14 17:45:07 +02:00

mutation_fragment_test.cc

schema_registry: stop including replica/database.hh

2024-11-04 13:16:27 +01:00

mutation_query_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

mutation_reader_another_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

mutation_reader_test.cc

test/boost/mutation_reader_test: add test for multishard reader buffer hint

2024-11-07 02:47:54 -05:00

mutation_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

mutation_writer_test.cc

treewide: s/boost::adaptors::map_values/std::views::values/

2024-10-27 21:32:45 +02:00

mvcc_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

network_topology_strategy_test.cc

Merge "move storage proxy and adjacent services to identify hosts by ids" from Gleb

2024-12-03 18:18:48 +02:00

nonwrapping_interval_test.cc

test/boost: include test/lib/test_utils.hh

2024-05-26 12:32:43 +08:00

observable_test.cc

…

partitioner_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

per_partition_rate_limit_test.cc

test: Avoid using deprecated sharded API

2024-05-16 00:28:47 +02:00

pretty_printers_test.cc

…

querier_cache_test.cc

treewide: use std::ranges sort functions rather than boost

2024-10-01 14:19:05 +03:00

query_processor_test.cc

test: change sstring_view to std::string_view

2024-11-18 16:26:20 +02:00

radix_tree_printer.hh

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

radix_tree_test.cc

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

range_assert.hh

…

range_tombstone_list_assertions.hh

…

range_tombstone_list_test.cc

…

rate_limiter_test.cc

…

reader_concurrency_semaphore_test.cc

treewide: use coroutine::parallel_for_each(range) when appropriate

2024-11-27 21:00:47 +02:00

README.md

test: rename "cql-pytest" to "cqlpy"

2024-11-06 16:48:36 +02:00

recent_entries_map_test.cc

…

repair_test.cc

test/boost/repair_test: close reader after use

2024-09-13 06:52:26 -04:00

restrictions_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

result_utils_test.cc

…

reusable_buffer_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

role_manager_test.cc

cql3: auth: use mutation collector for alter role

2024-06-04 15:43:04 +02:00

row_cache_test.cc

Improve compation on read of expired tombstones

2024-11-22 10:31:21 +02:00

rust_test.cc

…

s3_test.cc

s3_tests: Add s3 test to check object re-uploading

2024-11-28 12:46:59 +03:00

schema_change_test.cc

system_tables: Compute schema version automatically

2024-11-15 19:16:41 +01:00

schema_changes_test.cc

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

schema_loader_test.cc

test: Use sstables::test_env to make sstables for schema loader test

2024-09-09 14:22:58 +03:00

schema_registry_test.cc

db: move schema merging code into a separate unit

2024-09-23 12:01:36 +02:00

secondary_index_test.cc

treewide: migrate from boost::adaptors::filtered to std::views::filter

2024-11-26 14:26:50 +02:00

serialization_test.cc

serialization: replace boost::type with std::type_identity

2024-11-05 00:43:27 +01:00

serialized_action_test.cc

…

service_level_controller_test.cc

locator: topology: make topology object always contain local node

2024-12-02 10:31:11 +02:00

sessions_test.cc

…

small_vector_test.cc

utils: small_vector: support from_range_t

2024-10-21 09:31:38 +03:00

snitch_reset_test.cc

treewide: include used headers

2024-05-27 17:34:38 +03:00

sorting_test.cc

test/boost: add test for topological sorting

2024-05-16 13:30:03 +02:00

sstable_3_x_test.cc

test: change sstring_view to std::string_view

2024-11-18 16:26:20 +02:00

sstable_compaction_test.cc

Merge 'Use checksummed input streams in validate_checksums()' from Nikos Dragazis

2024-12-04 10:46:18 +02:00

sstable_conforms_to_mutation_source_test.cc

readers: Use reversed schema and native reversed slices

2024-08-13 10:03:46 +02:00

sstable_datafile_test.cc

Merge 'Use checksummed input streams in validate_checksums()' from Nikos Dragazis

2024-12-04 10:46:18 +02:00

sstable_directory_test.cc

test: Fix test_multiple_data_dirs

2024-10-07 12:04:23 +03:00

sstable_generation_test.cc

…

sstable_move_test.cc

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

sstable_mutation_test.cc

treewide: use std::ranges sort functions rather than boost

2024-10-01 14:19:05 +03:00

sstable_partition_index_cache_test.cc

…

sstable_resharding_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

sstable_set_test.cc

boost/sstable_set_test: add testcase to test tablet_sstable_set copy constructor

2024-08-17 23:38:05 +05:30

sstable_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

sstable_test.hh

test: Tune up indentation in uncompressed_schema()

2024-09-05 19:33:29 +03:00

stall_free_test.cc

utils/stall_free: introduce reserve_gently

2024-06-18 23:36:30 +05:30

statement_restrictions_test.cc

test: change sstring_view to std::string_view

2024-11-18 16:26:20 +02:00

storage_proxy_test.cc

storage_proxy: move to addressing nodes by host ids instead of ips

2024-12-02 10:31:11 +02:00

string_format_test.cc

test: string_format_test: disable test if {fmt} >= 10.0.0

2024-05-03 11:34:23 +03:00

suite.yaml

boost/bloom_filter_test: add testcase to verify unlinked sstables are not reloaded

2024-07-19 13:15:57 +05:30

summary_test.cc

…

tablets_test.cc

Merge 'Add tablet merge support' from Raphael Raph Carvalho

2024-12-06 18:06:20 +01:00

tagged_integer_test.cc

…

token_metadata_test.cc

locator: topology: add_or_update_endpoint: use none as the default node state

2024-08-29 10:37:07 +02:00

top_k_test.cc

treewide: include used headers

2024-05-27 17:34:38 +03:00

total_order_check.hh

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

tracing_test.cc

…

transport_test.cc

treewide: migrate from boost::adaptors::transformed to std::views::transform

2024-12-03 09:41:32 +02:00

tree_test_key.hh

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

types_test.cc

treewide: remove dependency on boost asio address_v4

2024-10-01 14:00:50 +03:00

user_function_test.cc

test/boost: include test/lib/test_utils.hh

2024-05-26 12:32:43 +08:00

user_types_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

utf8_test.cc

…

UUID_test.cc

test: change sstring_view to std::string_view

2024-11-18 16:26:20 +02:00

view_build_test.cc

test/boost: stop using ranges::to()

2024-10-19 13:21:20 +08:00

view_complex_test.cc

compaction: replace optional<task_info> with task_info param

2024-08-02 14:38:46 +02:00

view_schema_ckey_test.cc

test/lib: do not include unused headers

2024-05-05 23:31:48 +03:00

view_schema_pkey_test.cc

test/lib: do not include unused headers

2024-05-05 23:31:48 +03:00

view_schema_test.cc

test/boost/view_schema_test: Improve comments in test_view_update_generating_writetime

2024-11-24 22:48:15 +01:00

vint_serialization_test.cc

…

virtual_reader_test.cc

test: Move test_query_built_indexes_virtual_table from boost to cqlpy

2024-12-05 09:17:23 +02:00

virtual_table_mutation_source_test.cc

…

virtual_table_test.cc

system_tables: Compute schema version automatically

2024-11-15 19:16:41 +01:00

wasm_alloc_test.cc

main, test: use seastar::handle_signal() instead

2024-09-19 18:10:07 +03:00

wasm_test.cc

…

wrapping_interval_test.cc

test/boost: include test/lib/test_utils.hh

2024-05-26 12:32:43 +08:00

README.md

Scylla unit tests using C++ and the Boost test framework

The source files in this directory are Scylla unit tests written in C++ using the Boost.Test framework. These unit tests come in three flavors:

1. Some simple tests that check stand-alone C++ functions or classes use Boost's BOOST_AUTO_TEST_CASE.
2. Some tests require Seastar features, and need to be declared with Seastar's extensions to Boost.Test, namely SEASTAR_TEST_CASE.
3. Even more elaborate tests require not just a functioning Seastar environment but also a complete (or partial) Scylla environment. Those tests use the do_with_cql_env() or do_with_cql_env_thread() function to set up a mostly-functioning environment behaving like a single-node Scylla, in which the test can run.

While we have many tests of the third flavor, writing new tests of this type should be reserved for white-box tests - tests where it is necessary to inspect or control Scylla internals that do not have user-facing APIs such as CQL. In contrast, black-box tests - tests that can be written using only user-facing APIs - should be written in one of the newer test frameworks that we offer, such as test/cqlpy or test/alternator (in Python, using the CQL or DynamoDB APIs respectively) or test/cql (using textual CQL commands), or - if more than one Scylla node is needed for a test - using the test/topology* framework.

Running tests

Because these are C++ tests, they need to be compiled before running. To compile a single test executable row_cache_test, use a command like

ninja build/dev/test/boost/row_cache_test

You can also use ninja dev-test to build all C++ tests, or use ninja dev-build to build the C++ tests and also the full Scylla executable (however, note that the full Scylla executable isn't needed to run Boost tests).

Replace "dev" with "debug" or "release" in the examples above and below to use the "debug" build mode (which, importantly, compiles the test with ASAN and UBSAN enabled and helps catch difficult-to-catch use-after-free bugs) or the "release" build mode (optimized for run speed).

To run an entire test file row_cache_test, including all its test functions, use a command like:

build/dev/test/boost/row_cache_test -- -c1 -m1G

To run a single test function test_reproduce_18045() from the longer test file, use a command like:

build/dev/test/boost/row_cache_test -t test_reproduce_18045 -- -c1 -m1G

In these command lines, the parameters before the -- are passed to Boost.Test, while the parameters after the -- are passed to the test code, and in particular to Seastar. In this example Seastar is asked to run on one CPU (-c1) and use 1G of memory (-m1G) instead of hogging the entire machine. The Boost.Test option -t test_reproduce_18045 asks it to run just this one test function instead of all the test functions in the executable.
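The way the -- separator divides the command line can be illustrated with a small stand-alone function. This is only a sketch of the idea, with hypothetical names (`split_args`, `split_at_double_dash`); it is not how Boost.Test or Seastar actually parse their arguments.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container for the two halves of the command line.
struct split_args {
    std::vector<std::string> framework; // before "--": for Boost.Test
    std::vector<std::string> test_code; // after "--": for the test code / Seastar
};

// Splits the arguments at the first "--". The separator itself is dropped.
// If there is no "--", everything is treated as framework arguments.
split_args split_at_double_dash(const std::vector<std::string>& args) {
    split_args out;
    bool after_separator = false;
    for (const std::string& arg : args) {
        if (!after_separator && arg == "--") {
            after_separator = true;
            continue;
        }
        (after_separator ? out.test_code : out.framework).push_back(arg);
    }
    return out;
}
```

Applied to the second command above, the framework half would be -t test_reproduce_18045 and the test-code half would be -c1 -m1G.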

Unfortunately, interrupting a running test with control-C doesn't work. This is a known bug (#5696). Kill a test with SIGKILL (-9) if you need to stop it while it's running.

Boost tests can also be run using test.py - which is a script that provides a uniform way to run all tests in scylladb.git - C++ tests, Python tests, etc.

Writing tests

Because of the large build time and build size of each separate test executable, it is recommended to put test functions into relatively large source files. But not too large - to keep compilation time of a single source file (during development) at reasonable levels.

When adding new source files in test/boost, don't forget to list the new source file in configure.py and also in CMakeLists.txt. The former is needed by our CI, but the latter is preferred by some developers.