mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 12:36:56 +00:00

Avi Kivity 4d9271df98 Merge 'sstables: introduce sstable version ms' from Michał Chojnowski

This is yet another part in the BTI index project.

Overarching issue: https://github.com/scylladb/scylladb/issues/19191
Previous part: https://github.com/scylladb/scylladb/pull/25626
Next parts: make `ms` the default. Then, general tweaks and improvements. Later, potentially a full `da` format implementation.

This patch series introduces a new, Scylla-only sstable format version `ms`, which is like `me`, but with the index components (Summary.db and Index.db) replaced with BTI index components (Partitions.db and Rows.db), as they are in Cassandra 5.0's `da` format version.

(Eventually we want to just implement `da`, but there are several other changes (unrelated to the index files) between `me` and `da`. By adding this `ms` as an intermediate step we can adapt the new index formats without dragging all the other changes into the mix (and raising the risk of regressions, which is already high)).

The high-level structure of the PR is:
1. Introduce new component types — `Partitions` and `Rows`.
2. Teach `class sstable` to open them when they exist.
3. Teach the sstable writer how to write index data to them.
4. Teach `class sstable` and unit tests how to deal with sstables that have no `Index` or `Summary` (but have `Partitions` and `Rows` instead).
5. Introduce the new sstable version `ms`, specify that it has `Partitions` and `Rows` instead of `Index` and `Summary`.
6. Prepare unit tests for the appearance of `ms`.
7. Enable `ms` in unit tests.
8. Make `ms` selectable via db::config (with a silent fallback to `me` until the new `MS_SSTABLE_FORMAT` cluster feature is enabled).
9. Prepare integration tests for the appearance of `ms`.
10. Enable both `ms` and `me` in tests where we want both versions to be tested.

This series doesn't make `ms` the default yet, because that requires teaching Scylla Manager and a few dtests about the new format first. It can be enabled by setting `sstable_format: ms` in the config.

Per a review request, here is an example from `perf_fast_forward`, demonstrating some motivation for a new format (though not the main one: the main motivations are removing restrictions on the RAM:disk ratio and improving index read throughput for datasets with tiny partitions). The dataset was populated with `build/release/scylla perf-fast-forward --smp=1 --sstable-format=$VERSION --data-directory=data.$VERSION --column-index-size-in-kb=1 --populate --random-seed=0`.
This test involves a partition with 1000000 clustering rows (with 32-bit keys and 100-byte values) and ~500 index blocks, and queries a few particular rows from the partition. Since the branching factor for the BIG promoted index is 2 (it's a binary search), the lookup involves ~11.2 sequential page reads per row. The BTI format has a more reasonable branching factor, so it involves ~2.3 page reads per row.
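
As a rough illustration of why the branching factor matters, the sketch below counts how many sequential probes an idealized search needs to narrow `n` sorted index blocks down to one. It is not a model of either on-disk format: `probes_needed` is a hypothetical helper, and the fan-out of 16 used in the usage note is an assumed value chosen only for illustration. The measured per-row figures quoted above come from the benchmark and are not reproduced by this idealized count.

```cpp
#include <cassert>
#include <cstdint>

// Number of sequential probes needed to narrow `n` sorted index blocks
// down to one when each probe splits the remaining range `fanout` ways.
// Equivalent to ceil(log_fanout(n)), computed with integers only.
// Hypothetical helper for illustration; not part of Scylla.
inline unsigned probes_needed(std::uint64_t n, std::uint64_t fanout) {
    assert(fanout >= 2);
    unsigned depth = 0;
    std::uint64_t reach = 1;  // how many blocks `depth` probes can tell apart
    while (reach < n) {
        reach *= fanout;
        ++depth;
    }
    return depth;
}
```

With 500 blocks, `probes_needed(500, 2)` gives 9, while `probes_needed(500, 16)` gives 3, so a wider fan-out cuts the number of dependent reads by roughly the ratio of the logarithms.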

`build/release/scylla perf-fast-forward --smp=1 --data-directory=perf_fast_forward_data/me --run-tests=large-partition-select-few-rows`:
```
offset  stride  rows     iterations    avg aio    aio      (KiB)
500000  1       1                70       18.0     18        128
500001  1       1               647       19.0     19        132
0       1000000 1               748       15.0     15        116
0       500000  2               372       29.0     29        284
0       250000  4               227       56.0     56        504
0       125000  8               116      106.0    106        928
0       62500   16               67      195.0    195       1732
```
`build/release/scylla perf-fast-forward --smp=1 --data-directory=perf_fast_forward_data/ms --run-tests=large-partition-select-few-rows`:
```
offset  stride  rows     iterations    avg aio    aio      (KiB)
500000  1       1                51        5.1      5         20
500001  1       1                64        5.3      5         20
0       1000000 1               679        4.0      4         16
0       500000  2               492        8.0      8         88
0       250000  4               804       16.0     16        232
0       125000  8               409       31.0     31        516
0       62500   16               97       54.0     54       1056
```

Index file size comparison for the default `perf_fast_forward` tables with `--random-seed=0`:
Large partitions table (dominated by intra-partition index): 2.4 MB with `me`, 732 kB with `ms`.
Small partitions table (dominated by inter-partition index): 11 MB with `me`, 8.4 MB with `ms`.

External tests:
I ran the SCT test `longevity-mv-si-4days-streaming-test` on 6 nodes with 30 shards each for 8 hours. No anomalies were observed.

New functionality, no backport needed.

Closes scylladb/scylladb#26215

* github.com:scylladb/scylladb:
  test/boost/bloom_filter_test: add test_rebuild_from_temporary_hashes
  test/cluster: add test_bti_index.py
  test: prepare bypass_cache_test.py for `ms` sstables
  sstables/trie/bti_index_reader: add a failure injection in advance_lower_and_check_if_present
  test/cqlpy/test_sstable_validation.py: prepare the test for `ms` sstables
  tools/scylla-sstable: add `--sstable-version=?` to `scylla sstable write`
  db/config: expose "ms" format to the users via database config
  test: in Python tests, prepare some sstable filename regexes for `ms`
  sstables: add `ms` to `all_sstable_versions`
  test/boost/sstable_3_x_test: add `ms` sstables to multi-version tests
  test/lib/index_reader_assertions: skip some row index checks for BTI indexes
  test/boost/sstable_inexact_index_test: explicitly use a `me` sstable
  test/boost/sstable_datafile_test: skip test_broken_promoted_index_is_skipped for `ms` sstables
  test/resource: add `ms` sample sstable files for relevant tests
  test/boost/sstable_compaction_test: prepare for `ms` sstables.
  test/boost/index_reader_test: prepare for `ms` sstables
  test/boost/bloom_filter_tests: prepare for `ms` sstables
  test/boost/sstable_datafile_test: prepare for `ms` sstables
  test/boost/sstable_test: prepare for `ms` sstables.
  sstables: introduce `ms` sstable format version
  tools/scylla-sstable: default to "preferred" sstable version, not "highest"
  sstables/mx/reader: use the same hashed_key for the bloom filter and the index reader
  sstables/trie/bti_index_reader: allow the caller to passing a precalculated murmur hash
  sstables/trie/bti_partition_index_writer: in add(), get the key hash from the caller
  sstables/mx: make Index and Summary components optional
  sstables: open Partitions.db early when it's needed to populate key range for sharding metadata
  sstables: adapt sstable::set_first_and_last_keys to sstables without Summary
  sstables: implement an alternative way to rebuild bloom filters for sstables without Index
  utils/bloom_filter: add `add(const hashed_key&)`
  sstables: adapt estimated_keys_for_range to sstables without Summary
  sstables: make `sstable::estimated_keys_for_range` asynchronous
  sstables/sstable: compute get_estimated_key_count() from Statistics instead of Summary
  replica/database: add table::estimated_partitions_in_range()
  sstables/mx: implement sstable::has_partition_key using a regular read
  sstables: use BTI index for queries, when present and enabled
  sstables/mx/writer: populate BTI index files
  sstables: create and open BTI index files, when enabled
  sstables: introduce Partition and Rows component types
  sstables/mx/writer: make `_pi_write_m.partition_tombstone` a `sstables::deletion_time`

2025-09-30 09:40:02 +03:00

__init__.py

test.py: Add the possibility to run boost test from pytest

2025-02-07 21:40:25 +01:00

address_map_test.cc

…

advanced_rpc_compressor_test.cc

treewide: use angle brackets for including seastar headers

2025-03-17 10:03:06 +02:00

aggregate_fcts_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

allocation_strategy_test.cc

…

alternator_unit_test.cc

alternator: adds expression cache implementation

2025-09-28 04:27:44 +02:00

anchorless_list_test.cc

…

auth_passwords_test.cc

auth: require scheme as parameter for generate_salt

2025-07-15 20:26:39 +02:00

auth_resource_test.cc

…

auth_test.cc

Revert "Merge 'transport: service_level_controller: create and use driver service level' from Andrzej Jackowski"

2025-09-22 09:32:46 +03:00

aws_error_injection_test.cc

treewide: move away from accessing httpd::request::query_parameters

2025-09-24 11:52:15 +03:00

aws_errors_test.cc

…

batchlog_manager_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

big_decimal_test.cc

utils/big_decimal: fix scale overflow when parsing values with large exponents

2025-06-26 15:29:28 +03:00

bloom_filter_test.cc

test/boost/bloom_filter_test: add test_rebuild_from_temporary_hashes

2025-09-29 22:15:26 +02:00

bptree_test.cc

…

bptree_validation.hh

test/boost/bptree_validation.hh: add missing include <fmt/format.h>

2025-01-23 06:05:57 -05:00

broken_sstable_test.cc

…

bti_index_test.cc

sstables/trie/bti_partition_index_writer: in add(), get the key hash from the caller

2025-09-29 13:01:21 +02:00

bti_key_translation_test.cc

sstables/trie: handle partition_regions other than clustered in BTI position encoding

2025-09-07 00:30:08 +02:00

bti_node_sink_test.cc

sstables/trie: fix a special case in max_offset_from_child

2025-09-07 00:30:15 +02:00

btree_test.cc

…

btree_validation.hh

…

bytes_ostream_test.cc

bytes_ostream: overload write() to support writing from FragmentedView

2025-07-01 22:19:07 +05:30

cache_algorithm_test.cc

…

cache_mutation_reader_test.cc

treewide: Move keys related files to a new keys directory

2025-07-25 10:45:32 +03:00

cached_file_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

caching_options_test.cc

…

canonical_mutation_test.cc

…

cartesian_product_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

castas_fcts_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

cdc_generation_test.cc

…

cdc_test.cc

cdc: generate_stream_diff helper function

2025-09-17 14:47:12 +02:00

cell_locker_test.cc

treewide: Move replica related files to replica directory

2025-09-18 08:00:35 +03:00

checksum_utils_test.cc

…

chunked_managed_vector_test.cc

…

chunked_vector_test.cc

test: avoid #include <boost/test/included/...>

2025-09-22 15:26:06 +03:00

clustering_ranges_walker_test.cc

treewide: Move keys related files to a new keys directory

2025-07-25 10:45:32 +03:00

CMakeLists.txt

Merge 'Alternator/cache expressions' from Szymon Malewski

2025-09-29 11:36:31 +03:00

collection_stress.hh

…

column_mapping_test.cc

…

combined_tests.cc

…

commitlog_cleanup_test.cc

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

commitlog_test.cc

Merge 'transport: replace throwing protocol_exception with returns' from Dario Mirovic

2025-09-10 21:54:15 +03:00

compaction_group_test.cc

compaction: remove using namespace {compaction,sstables}

2025-09-25 15:03:57 +03:00

comparable_bytes_test.cc

types/comparable_bytes: add a missing implementation for date_type_impl

2025-09-17 12:22:40 +02:00

compound_test.cc

treewide: Move keys related files to a new keys directory

2025-07-25 10:45:32 +03:00

compress_test.cc

…

config_test.cc

…

conftest.py

test.py: pytest: support --mode/--repeat in a common way for all tests

2025-08-17 15:26:23 +00:00

continuous_data_consumer_test.cc

…

counter_test.cc

treewide: Move mutation related files to a mutation directory

2025-09-24 13:23:38 +03:00

cql_auth_query_test.cc

tree: Make values mutable to enable move semantics

2025-03-03 13:53:02 +03:00

cql_auth_syntax_test.cc

…

cql_functions_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

cql_query_group_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

cql_query_large_test.cc

treewide: Rename table_state to compaction_group_view

2025-08-08 06:51:28 +03:00

cql_query_like_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

cql_query_test.cc

compaction: remove using namespace {compaction,sstables}

2025-09-25 15:03:57 +03:00

crc_test.cc

…

data_listeners_test.cc

…

database_test.cc

root,replica: move multishard_mutation_query to replica/

2025-09-26 11:15:38 +03:00

dict_trainer_test.cc

…

dirty_memory_manager_test.cc

replica/memtable: move region_listener handlers from dirty_memory_manager to memtable

2025-06-20 11:42:30 +02:00

disk_space_monitor_test.cc

disk_space_monitor_test.cc: Start a monitor after fake space source function is registered

2025-09-18 15:03:34 +03:00

double_decker_test.cc

…

duration_test.cc

treewide: Move type related files to a type directory As requested in #22110 , moved the files and fixed other includes and build system.

2025-09-17 17:32:19 +03:00

dynamic_bitset_test.cc

…

encrypted_file_test.cc

encryption: add encrypted_data_source class

2025-07-06 09:18:39 +03:00

encryption_at_rest_test.cc

treewide: move away from accessing httpd::request::query_parameters

2025-09-24 11:52:15 +03:00

enum_option_test.cc

…

enum_set_test.cc

…

error_injection_test.cc

…

estimated_histogram_test.cc

…

exception_container_test.cc

…

exceptions_fallback_test.cc

…

exceptions_optimized_test.cc

…

exceptions_test.inc.cc

exceptions: Add try_catch_nested to universally handle nested exceptions of the same type.

2025-03-26 11:15:13 +01:00

expr_test.cc

boost/expr_test: add vector expression tests

2025-01-28 21:14:49 +01:00

extensions_test.cc

sstables::file_io_extension: Make sstable argument to "wrap" const

2025-03-20 14:54:09 +00:00

file_stream_test.cc

streaming: use host_id in file streaming

2025-05-12 09:36:48 +03:00

filtering_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

flush_queue_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

fragmented_temporary_buffer_test.cc

transport: replace throwing protocol_exception with returns

2025-08-28 23:31:36 +02:00

frozen_mutation_test.cc

readers: mv from_mutations_v2.hh from_mutations.hh

2025-04-16 04:46:08 -04:00

gcp_object_storage_test.cc

test::boost::gcp_object_storage_test: Initial unit tests for GCP obj storage

2025-09-01 18:14:20 +00:00

generic_server_test.cc

treewide: Move transport related files to a transport directory As requested in #22112 , moved the files and fixed other includes and build system.

2025-09-29 11:46:06 +03:00

gossiping_property_file_snitch_test.cc

…

group0_cmd_merge_test.cc

service/migration_manager: pass storage_proxy to prepare_keyspace_drop_announcement()

2025-08-27 08:55:47 +02:00

group0_test.cc

…

group0_voter_calculator_test.cc

raft: refactor can_vote logic and type

2025-09-24 13:55:05 +02:00

hash_test.cc

…

hashers_test.cc

…

hint_test.cc

…

idl_test.cc

…

incremental_compaction_test.cc

compaction: remove using namespace {compaction,sstables}

2025-09-25 15:03:57 +03:00

index_reader_test.cc

test/boost/index_reader_test: prepare for ms sstables

2025-09-29 22:15:25 +02:00

index_with_paging_test.cc

…

input_stream_test.cc

…

intrusive_array_test.cc

utils: do not include unused headers

2025-01-14 07:56:39 -05:00

json_cql_query_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

json_test.cc

…

keys_test.cc

keys: from_nodetool_style_string don't split single partition keys

2025-08-14 19:52:04 +03:00

kmip_wrapper.py

test: Use in-memory SQLite for PyKMIP server

2025-08-01 12:11:27 +03:00

large_paging_state_test.cc

…

like_matcher_test.cc

…

limiting_data_source_test.cc

…

linearizing_input_stream_test.cc

…

lister_test.cc

utils: do not include unused headers

2025-01-14 07:56:39 -05:00

loading_cache_test.cc

loading_cache_test: test_loading_cache_reload_during_eviction: use manual_clock

2025-03-31 14:53:06 +03:00

locator_topology_test.cc

tablets: prevent accidental copy of tablets_map

2025-07-22 15:07:26 +03:00

log_heap_test.cc

…

logalloc_standard_allocator_segment_pool_backend_test.cc

…

logalloc_test.cc

logalloc_test: don't test performance in test background_reclaim

2025-05-06 18:59:18 +02:00

lru_string_map_test.cc

utils: add lru_string_map

2025-09-28 04:06:00 +02:00

managed_bytes_test.cc

managed_bytes: make empty managed_bytes constexpr friendly

2025-07-29 23:51:43 +03:00

managed_vector_test.cc

…

map_difference_test.cc

treewide: Move misc files to utils directory

2025-07-21 11:56:40 +03:00

memtable_test.cc

tests: adjust for incremental repair

2025-08-08 06:49:17 +03:00

multishard_combining_reader_as_mutation_source_test.cc

everywhere: use utils::chunked_vector for list of mutations

2025-07-13 19:13:11 +03:00

multishard_query_test.cc

test/boost: rename multishard_mutation_query_test to multishard_query_test

2025-09-26 11:15:38 +03:00

murmur_hash_test.cc

…

mutation_fragment_test.cc

everywhere: use utils::chunked_vector for list of mutations

2025-07-13 19:13:11 +03:00

mutation_query_test.cc

treewide: Move query related files to a new query directory

2025-09-16 23:40:47 +03:00

mutation_reader_another_test.cc

everywhere: use utils::chunked_vector for list of mutations

2025-07-13 19:13:11 +03:00

mutation_reader_test.cc

compaction: move code to namespace compaction

2025-09-25 15:03:56 +03:00

mutation_test.cc

test: Fix indentation after previous patch

2025-09-26 16:39:09 +03:00

mutation_writer_test.cc

everywhere: use utils::chunked_vector for list of mutations

2025-07-13 19:13:11 +03:00

mvcc_test.cc

treewide: Move keys related files to a new keys directory

2025-07-25 10:45:32 +03:00

network_topology_strategy_test.cc

mv: handle mismatched base/view replica count caused by RF change

2025-09-22 12:50:16 +02:00

nonwrapping_interval_test.cc

test: nonwrapping_interval_test: verify an interval of tokens is trivial

2025-09-06 18:41:00 +03:00

observable_test.cc

…

partitioner_test.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

per_partition_rate_limit_test.cc

…

pluggable_test.cc

utils: phased_barrier, pluggable: use named gate

2025-04-12 11:47:00 +03:00

pretty_printers_test.cc

…

querier_cache_test.cc

everywhere: use utils::chunked_vector for list of mutations

2025-07-13 19:13:11 +03:00

query_processor_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

radix_tree_printer.hh

…

radix_tree_test.cc

…

range_assert.hh

…

range_tombstone_list_assertions.hh

…

range_tombstone_list_test.cc

treewide: Move keys related files to a new keys directory

2025-07-25 10:45:32 +03:00

rate_limiter_test.cc

…

reader_concurrency_semaphore_test.cc

readers: mv from_mutations_v2.hh from_mutations.hh

2025-04-16 04:46:08 -04:00

README.md

test.py: Add the possibility to run boost test from pytest

2025-02-07 21:40:25 +01:00

recent_entries_map_test.cc

…

repair_test.cc

compaction: Add tablet incremental repair support

2025-08-18 11:01:21 +08:00

reservoir_sampling_test.cc

…

rest_client_test.cc

rest_client: set version on http::request to avoid invalid state

2025-09-18 07:36:25 +03:00

restrictions_test.cc

…

result_utils_test.cc

test/result_utils: Do not assume map_reduce reducing order

2025-05-30 09:38:59 +02:00

reusable_buffer_test.cc

utils/reusable_buffer: accept non-throwing writer callbacks via result_with_exception

2025-07-17 16:40:02 +02:00

role_manager_test.cc

…

row_cache_test.cc

db: optimize cache invalidation following repair/streaming

2025-09-14 19:48:14 +03:00

rust_test.cc

…

s3_test.cc

s3_test: extract file writing code to a function

2025-08-14 16:18:43 +03:00

schema_change_test.cc

db/schema_tables: create/cleanup tasks when an index is created/dropped

2025-08-27 08:55:47 +02:00

schema_changes_test.cc

sstables: add ms to all_sstable_versions

2025-09-29 22:15:25 +02:00

schema_loader_test.cc

test/boost/schema_loader_test: add specific test with interesting types

2025-09-25 11:28:35 +03:00

schema_registry_test.cc

code: Update callers generating feature service config

2025-07-21 19:19:09 +03:00

scoped_item_list_test.cc

utils: unit test for utils::scoped_item_list

2025-08-01 02:15:04 +03:00

secondary_index_test.cc

mv: delete previously undetected ghost rows in PRUNE MATERIALIZED VIEW statement

2025-09-10 07:35:00 +02:00

serialization_test.cc

…

serialized_action_test.cc

utils: phased_barrier, pluggable: use named gate

2025-04-12 11:47:00 +03:00

service_level_controller_test.cc

qos: don't populate effective service level cache until auth is migrated to raft

2025-07-29 11:37:37 +02:00

sessions_test.cc

…

small_vector_test.cc

test: avoid #include <boost/test/included/...>

2025-09-22 15:26:06 +03:00

snitch_reset_test.cc

…

sorting_test.cc

…

sstable_3_x_test.cc

test/boost/sstable_3_x_test: add ms sstables to multi-version tests

2025-09-29 22:15:25 +02:00

sstable_compaction_test.cc

Merge 'sstables: introduce sstable version ms' from Michał Chojnowski

2025-09-30 09:40:02 +03:00

sstable_compression_config_test.cc

test/boost: Add tests for SSTable compression config options

2025-09-26 12:02:42 +03:00

sstable_compressor_factory_test.cc

test/boost/sstable_compressor_factory_test: define a test suite name

2025-05-26 09:35:30 +02:00

sstable_conforms_to_mutation_source_test.cc

sstables: add ms to all_sstable_versions

2025-09-29 22:15:25 +02:00

sstable_datafile_test.cc

test/boost/sstable_datafile_test: skip test_broken_promoted_index_is_skipped for ms sstables

2025-09-29 22:15:25 +02:00

sstable_directory_test.cc

test: Use map_reduce0 in sstable_directory_test.cc (and coroutinize)

2025-09-25 21:22:12 +03:00

sstable_generation_test.cc

test: ignore unused fmt::to_string() result

2025-03-24 10:19:09 +03:00

sstable_inexact_index_test.cc

test/boost/sstable_inexact_index_test: explicitly use a me sstable

2025-09-29 22:15:25 +02:00

sstable_move_test.cc

test: sstable_move_test: always use uuid sstable generation

2025-06-18 11:30:29 +03:00

sstable_mutation_test.cc

sstables: make sstable::estimated_keys_for_range asynchronous

2025-09-29 13:01:21 +02:00

sstable_partition_index_cache_test.cc

…

sstable_resharding_test.cc

compaction: remove using namespace {compaction,sstables}

2025-09-25 15:03:57 +03:00

sstable_set_test.cc

replica: Fix range reads spanning sibling tablets

2025-05-27 22:39:40 -03:00

sstable_test.cc

test/boost/sstable_test: prepare for ms sstables.

2025-09-29 22:15:24 +02:00

sstable_test.hh

compaction: move code to namespace compaction

2025-09-25 15:03:56 +03:00

stall_free_test.cc

utils: stall_free: clear_gently: release wrapped objects

2025-09-17 11:44:26 +03:00

statement_restrictions_test.cc

…

storage_proxy_test.cc

token_metadata: move make_token_metadata_ptr into shared_token_metadata class

2025-07-06 14:22:20 +03:00

stream_compressor_test.cc

…

string_format_test.cc

…

summary_test.cc

…

symmetric_key_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

tablets_test.cc

tablets: disallow chains of colocated tables

2025-09-26 16:52:43 +02:00

tagged_integer_test.cc

…

test_config.yaml

test/boost: rename multishard_mutation_query_test to multishard_query_test

2025-09-26 11:15:38 +03:00

token_metadata_test.cc

locator: abstract_replication_map: rename calculate_effective_replication_map

2025-08-06 16:03:53 +03:00

top_k_test.cc

…

total_order_check.hh

…

tracing_test.cc

…

transport_test.cc

transport: replace make_frame throw with return result

2025-08-28 23:33:33 +02:00

tree_test_key.hh

…

trie_traversal_test.cc

sstables/trie: support reader_permit and trace_state properly

2025-09-17 12:22:40 +02:00

trie_writer_test.cc

tests/lib: extract generate_all_strings to test/lib

2025-08-14 22:38:38 +02:00

types_test.cc

treewide: Move keys related files to a new keys directory

2025-07-25 10:45:32 +03:00

unique_view_test.cc

utils: implement drop-in replacement for replacing boost::adaptors::uniqued

2025-01-21 16:24:45 +08:00

user_function_test.cc

treewide: include boost headers as "system" headers

2025-08-22 17:21:24 +03:00

user_types_test.cc

raft: make group0 Raft operation timeout configurable

2025-04-15 10:57:39 +03:00

utf8_test.cc

…

UUID_test.cc

…

view_build_test.cc

view_builder: register view on all shards atomically

2025-09-21 10:39:05 +02:00

view_complex_test.cc

treewide: Rename table_state to compaction_group_view

2025-08-08 06:51:28 +03:00

view_schema_ckey_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

view_schema_pkey_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

view_schema_test.cc

test/boost/view_schema_test.cc: fix race in wait_until_built

2025-07-01 13:20:19 +03:00

vint_serialization_test.cc

test: avoid spaces when defining user-defined literal operator

2025-03-24 10:17:12 +03:00

virtual_reader_test.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

virtual_table_mutation_source_test.cc

everywhere: use utils::chunked_vector for list of mutations

2025-07-13 19:13:11 +03:00

virtual_table_test.cc

config: specialize config_from_string() for sstring

2025-01-26 15:53:12 +02:00

wasm_alloc_test.cc

…

wasm_test.cc

…

wrapping_interval_test.cc

treewide: include boost headers as "system" headers

2025-08-22 17:21:24 +03:00

README.md

Scylla unit tests using C++ and the Boost test framework

The source files in this directory are Scylla unit tests written in C++ using the Boost.Test framework. These unit tests come in three flavors:

1. Some simple tests that check stand-alone C++ functions or classes use Boost's BOOST_AUTO_TEST_CASE.
2. Some tests require Seastar features, and need to be declared with Seastar's extensions to Boost.Test, namely SEASTAR_TEST_CASE.
3. Even more elaborate tests require not just a functioning Seastar environment but also a complete (or partial) Scylla environment. Those tests use the do_with_cql_env() or do_with_cql_env_thread() function to set up a mostly-functioning environment behaving like a single-node Scylla, in which the test can run.

While we have many tests of the third flavor, writing new tests of this type should be reserved for white-box tests - tests where it is necessary to inspect or control Scylla internals that do not have user-facing APIs such as CQL. In contrast, black-box tests - tests that can be written using only user-facing APIs - should be written in one of the newer test frameworks that we offer, such as test/cqlpy or test/alternator (in Python, using the CQL or DynamoDB APIs respectively) or test/cql (using textual CQL commands), or, if more than one Scylla node is needed for a test, using the test/topology* framework.

Running tests

Because these are C++ tests, they need to be compiled before running. To compile a single test executable row_cache_test, use a command like

ninja build/dev/test/boost/row_cache_test

You can also use ninja dev-test to build all C++ tests, or use ninja dev-build to build the C++ tests and also the full Scylla executable (however, note that the full Scylla executable isn't needed to run Boost tests).

Replace "dev" by "debug" or "release" in the examples above and below to use the "debug" build mode (which, importantly, compiles the test with ASAN and UBSAN enabled and helps catch difficult-to-catch use-after-free bugs) or the "release" build mode (optimized for run speed).

To run an entire test file row_cache_test, including all its test functions, use a command like:

build/dev/test/boost/row_cache_test -- -c1 -m1G

To run a single test function test_reproduce_18045() from the longer test file, use a command like:

build/dev/test/boost/row_cache_test -t test_reproduce_18045 -- -c1 -m1G

In these command lines, the parameters before the -- are passed to Boost.Test, while the parameters after the -- are passed to the test code, and in particular to Seastar. In this example Seastar is asked to run on one CPU (-c1) and use 1G of memory (-m1G) instead of hogging the entire machine. The Boost.Test option -t test_reproduce_18045 asks it to run just this one test function instead of all the test functions in the executable.
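
The split at `--` described above can be sketched in a few lines. This is only an illustration of the convention, not the code that Boost.Test or Seastar actually uses; `split_at_double_dash` and `arg_list` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using arg_list = std::vector<std::string>;

// Split a command line at the first "--": arguments before it are meant for
// the test framework, arguments after it for the test code (e.g. Seastar).
// If there is no "--", everything goes to the framework side.
// Hypothetical helper for illustration only.
inline std::pair<arg_list, arg_list> split_at_double_dash(const arg_list& args) {
    arg_list framework_args;
    arg_list test_args;
    bool seen_separator = false;
    for (const auto& arg : args) {
        if (!seen_separator && arg == "--") {
            seen_separator = true;  // the separator itself is dropped
        } else if (seen_separator) {
            test_args.push_back(arg);
        } else {
            framework_args.push_back(arg);
        }
    }
    return {framework_args, test_args};
}
```

For the single-function example above, the first half would hold `-t test_reproduce_18045` and the second half `-c1 -m1G`.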

Unfortunately, interrupting a running test with control-C doesn't work. This is a known bug (#5696). Kill a test with SIGKILL (-9) if you need to kill it while it's running.

Boost tests can also be run using test.py - which is a script that provides a uniform way to run all tests in scylladb.git - C++ tests, Python tests, etc.

Execution with pytest

To run all tests with pytest, execute:

pytest test/boost

To execute all tests in one file, provide the path to the source file as a parameter:

pytest test/boost/aggregate_fcts_test.cc

Since it's a normal path, autocompletion works in the terminal out of the box.

To execute only one test function, provide the path to the source file and the function name:

pytest --mode dev test/boost/aggregate_fcts_test.cc::test_aggregate_avg

To select a specific mode, pass the parameter --mode dev. If the parameter isn't provided, pytest tries to use ninja mode_list to find out the compiled modes.

Parallel execution is controlled by pytest-xdist and its parameter -n auto, which starts the tests with a number of workers equal to the number of CPU cores. A useful command to discover the tests in a file or directory is:

pytest --collect-only -q --mode dev test/boost/aggregate_fcts_test.cc

That will list all test functions in the file. To execute only one of them, pass its ID from that output to pytest, but without the mode suffix. For example, if the output shows test/boost/aggregate_fcts_test.cc::test_aggregate_avg.dev, run this specific test function with the following command:

pytest --mode dev test/boost/aggregate_fcts_test.cc::test_aggregate_avg
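
If the collected IDs are post-processed by a tool rather than by hand, dropping the mode suffix could look like the sketch below. `strip_mode_suffix` is a hypothetical helper, not something provided by pytest or by the Scylla test scripts, and it assumes the suffix is everything after the last dot in the function part of the ID.

```cpp
#include <cassert>
#include <string>

// Drop a trailing ".<mode>" (for example ".dev") from a collected test ID so
// that the result can be passed back to pytest together with --mode.
// Only a dot that appears after the last "::" is treated as the mode
// separator, so the ".cc" in the file name is left alone.
// Hypothetical helper for illustration only.
inline std::string strip_mode_suffix(const std::string& test_id) {
    const auto name_start = test_id.rfind("::");
    const auto dot = test_id.rfind('.');
    if (name_start == std::string::npos || dot == std::string::npos || dot < name_start) {
        return test_id;  // no function part or no suffix: nothing to strip
    }
    return test_id.substr(0, dot);
}
```

An ID without a function part, such as a bare file path, is returned unchanged.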

Writing tests

Because of the large build time and build size of each separate test executable, it is recommended to put test functions into relatively large source files, but not so large that the compilation time of a single source file (during development) becomes unreasonable.

When adding new source files in test/boost, don't forget to list the new source file in configure.py and also in CMakeLists.txt. The former is needed by our CI, but the latter is preferred by some developers.