mirror of https://github.com/scylladb/scylladb.git synced 2026-05-25 01:02:20 +00:00

Files

Avi Kivity 94c21e5c05 Merge 'sstables: Reduce amount of I/O for clustering-key-bounded reads from large partitions' from Tomasz Grabiec

Single-row reads from a large partition issue 64 KiB reads to the data file,
which is equal to the default span of the promoted index block in the data file.
If users want to increase the selectivity of the index to speed up single-row reads,
this won't be effective. The reason is that the reader uses the promoted index
to look up the start position of the read in the data file, but the end position
will in practice extend to the next partition, and the amount of I/O will be
determined by the underlying file input stream implementation and its
read-ahead heuristics. By default, that results in at least 2 I/Os of 32 KiB each.

There is already infrastructure to look up the end position based on the upper
bound of the read, in anticipation of sharing the promoted index cache,
but it's not effective because it's a non-populating lookup and the upper
bound cursor has its own private cached_promoted_index, which is cold
when positions are computed. It's non-populating on purpose, to avoid
extra index file I/O to read the upper bound. When the upper bound is far enough
from the lower bound, a populating lookup would only increase the cost of the read.

The solution employed here is to warm up the lower bound cursor's
cache before positions are computed, and use that cursor for
non-populating lookup of the upper bound.

We use the lower bound cursor and the slice's lower bound so that we
read the same blocks as later lower-bound slicing would, so that we
don't incur extra IO for cases where looking up upper bound is not
worth it, that is when upper bound is far from the lower bound. If
upper bound is near lower bound, then warming up using lower bound
will populate cached_promoted_index with blocks which will allow us to
locate the upper bound block accurately.  This is especially important
for single-row reads, where the bounds are around the same key.  In
this case we want to read the data file range which belongs to a
single promoted index block.  It doesn't matter that the upper bound
is not exactly the same. They both will likely lie in the same block,
and if not, binary search will bring adjacent blocks into cache.  Even
if upper bound is not near, the binary search will populate the cache
with blocks which can be used to narrow down the data file range
somewhat.
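
The lookup scheme described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the code from this series: clustering keys are plain integers, and `pi_block`, `toy_cursor`, `start_offset`, `end_offset` and `make_index` are hypothetical names invented for the example. The populating binary search for the lower bound fills the cache, and the non-populating upper-bound lookup then consults only what is cached.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// One promoted index block: clustering keys [first_key, last_key] live in
// data file bytes [offset, offset + width). Keys are plain ints in this toy.
struct pi_block {
    int first_key;
    int last_key;
    std::uint64_t offset;
    std::uint64_t width;
};

// Toy cursor. `index` stands in for the promoted index stored in the index
// file, `cache` for the blocks parsed so far. All names are hypothetical.
struct toy_cursor {
    std::vector<pi_block> index;
    std::map<std::size_t, pi_block> cache;  // ordered by block number
    int index_reads = 0;                    // simulated index file reads

    explicit toy_cursor(std::vector<pi_block> blocks) : index(std::move(blocks)) {
        assert(!index.empty());
    }

    std::uint64_t partition_end() const {
        return index.back().offset + index.back().width;
    }

    // Returns block i, reading it into the cache on a miss.
    const pi_block& load(std::size_t i) {
        auto it = cache.find(i);
        if (it == cache.end()) {
            ++index_reads;
            it = cache.emplace(i, index[i]).first;
        }
        return it->second;
    }

    // Populating lookup: binary search for the first block whose last_key >= key.
    std::uint64_t start_offset(int key) {
        std::size_t lo = 0, hi = index.size();
        while (lo < hi) {
            std::size_t mid = lo + (hi - lo) / 2;
            if (load(mid).last_key < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo < index.size() ? load(lo).offset : partition_end();
    }

    // Non-populating lookup: consults cached blocks only, never reads the index.
    std::uint64_t end_offset(int key) const {
        for (const auto& entry : cache) {
            const pi_block& b = entry.second;
            if (b.last_key >= key) {
                // Key inside b: stop at b's end. Key before b: stop at b's start.
                return b.first_key <= key ? b.offset + b.width : b.offset;
            }
        }
        return partition_end();  // nothing useful cached: read to partition end
    }
};

// Builds n_blocks consecutive blocks of keys_per_block keys and width bytes each.
std::vector<pi_block> make_index(int n_blocks, int keys_per_block, std::uint64_t width) {
    std::vector<pi_block> blocks;
    for (int i = 0; i < n_blocks; ++i) {
        blocks.push_back({i * keys_per_block, (i + 1) * keys_per_block - 1, i * width, width});
    }
    return blocks;
}
```

In this model, with `make_index(1000, 10, 1024)`, calling `start_offset(5003)` and then `end_offset(5003)` on the same cursor yields the byte range [512000, 513024), a single block. Calling `end_offset(5003)` on a fresh cursor with a cold cache returns 1024000, the end of the partition, which mirrors the behaviour before the change.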

Fixes #10030.

The change was tested with perf-fast-forward.

I populated the data set with `column_index_size_in_kb` set to 1

  scylla perf-fast-forward --populate --run-tests=large-partition-slicing --column-index-size-in-kb=1

Test run:

  build/release/scylla perf-fast-forward --run-tests=large-partition-select-few-rows -c1 --keep-cache-across-test-cases --test-case-duration=0

This test issues two reads of subsequent keys from the middle of a large partition (1M rows in total). The first read will miss in the index file page cache, the second read will hit.

Notice that before the change, the second read issued 2 aio requests worth 64 KiB in total.
After the change, the second read issued 1 aio request worth 2 KiB. That's because the promoted index block is larger than 1 KiB.
I verified using logging that the data file range matches a single promoted index block.

Also, the first read which misses in cache is still faster after the change.

Before:

```
running: large-partition-select-few-rows on dataset large-part-ds1
Testing selecting few rows from a large partition:
stride  rows      time (s)   iterations     frags     frag/s    mad f/s    max f/s    min f/s    avg aio    aio      (KiB) blocked dropped  idx hit idx miss  idx blk    c hit   c miss    c blk    allocs   tasks insns/f    cpu
500000  1         0.009802            1         1        102          0        102        102       21.0     21        196       2       1        0        1        1        0        0        0       568     269 4716050  53.4%
500001  1         0.000321            1         1       3113          0       3113       3113        2.0      2         64       1       0        1        0        0        0        0        0       116      26  555110  45.0%
```

After:

```
running: large-partition-select-few-rows on dataset large-part-ds1
Testing selecting few rows from a large partition:
stride  rows      time (s)   iterations     frags     frag/s    mad f/s    max f/s    min f/s    avg aio    aio      (KiB) blocked dropped  idx hit idx miss  idx blk    c hit   c miss    c blk    allocs   tasks insns/f    cpu
500000  1         0.009609            1         1        104          0        104        104       20.0     20        137       2       1        0        1        1        0        0        0       561     268 4633407  43.1%
500001  1         0.000217            1         1       4602          0       4602       4602        1.0      1          2       1       0        1        0        0        0        0        0       110      26  313882  64.1%
```

Backports: none, not a regression

Closes scylladb/scylladb#20522

* github.com:scylladb/scylladb:
  perf: perf_fast_forward: Add test case for querying missing rows
  perf-fast-forward: Allow overriding promoted index block size
  perf-fast-forward: Test subsequent key reads from the middle in test_large_partition_select_few_rows
  perf-fast-forward: Allow adding key offset in test_large_partition_select_few_rows
  perf-fast-forward: Use single-partition reads in test_large_partition_select_few_rows
  sstables: bsearch_clustered_cursor: Add more tracing points
  sstables: reader: Log data file range
  sstables: bsearch_clustered_cursor: Unify skip_info logging
  sstables: bsearch_clustered_cursor: Narrow down range using "end" position of the block
  sstables: bsearch_clustered_cursor: Skip even to the first block
  test: sstables: sstable_3_x_test: Improve failure message
  sstables: mx: writer: Never include partition_end marker in promoted index block width
  sstables: Reduce amount of I/O for clustering-key-bounded reads from large partitions
  sstables: clustered_cursor: Track current block

2024-10-28 21:13:23 +02:00

aggregate_fcts_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

allocation_strategy_test.cc

…

alternator_unit_test.cc

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

anchorless_list_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

auth_passwords_test.cc

…

auth_resource_test.cc

test: use generic boost_test_print_type()

2024-05-20 12:56:20 +03:00

auth_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

aws_error_injection_test.cc

test: add complete_multipart_upload completion tests

2024-10-01 09:06:24 +03:00

aws_errors_test.cc

test: add complete_multipart_upload completion tests

2024-10-01 09:06:24 +03:00

batchlog_manager_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

big_decimal_test.cc

…

bloom_filter_test.cc

boost/bloom_filter_test: wait for total memory reclaimed update

2024-07-26 08:15:11 +03:00

bptree_test.cc

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

bptree_validation.hh

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

broken_sstable_test.cc

test: Make tests use schema_builder instead of make_shared_schema

2024-09-05 19:31:30 +03:00

btree_test.cc

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

btree_validation.hh

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

bytes_ostream_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

cache_algorithm_test.cc

auth: do not include unused headers

2024-06-17 17:33:55 +03:00

cache_mutation_reader_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

cached_file_test.cc

cached_file: Adapt page_view to ContiguousSharedBuffer

2024-09-27 01:25:15 +02:00

caching_options_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

canonical_mutation_test.cc

canonical_mutation: add make_canonical_mutation_gently

2024-05-02 19:37:04 +03:00

cartesian_product_test.cc

…

castas_fcts_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cdc_generation_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

cdc_test.cc

treewide: s/boost::adaptors::map_values/std::views::values/

2024-10-27 21:32:45 +02:00

cell_locker_test.cc

treewide: replace seastar::future::get0() with seastar::future::get()

2024-02-02 22:12:57 +08:00

checksum_utils_test.cc

…

chunked_managed_vector_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

chunked_vector_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

clustering_ranges_walker_test.cc

…

CMakeLists.txt

build: cmake: correct some tests' KIND

2024-10-22 07:10:47 +03:00

collection_stress.hh

test: Move stress-collecton header from unit to boost

2024-09-24 13:42:13 +03:00

column_mapping_test.cc

test: use generic boost_test_print_type()

2024-05-20 12:56:20 +03:00

commitlog_cleanup_test.cc

raft_group0_client: uninclude "db/system_keyspace.hh"

2024-09-28 16:31:53 +03:00

commitlog_test.cc

treewide: move log.hh into utils/log.hh

2024-10-22 06:54:46 +03:00

compaction_group_test.cc

replica: Fix tombstone GC during tablet split preparation

2024-10-02 11:26:13 -03:00

compound_test.cc

test/boost: include test/lib/test_utils.hh

2024-05-26 12:32:43 +08:00

compress_test.cc

…

config_test.cc

treewide: drop thrift support

2024-06-07 06:44:59 +08:00

continuous_data_consumer_test.cc

reader_permit: store schema_ptr instead of raw schema pointer

2024-01-11 08:37:56 +02:00

counter_test.cc

treewide: use std::ranges sort functions rather than boost

2024-10-01 14:19:05 +03:00

cql_auth_query_test.cc

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

cql_auth_syntax_test.cc

test: Add tests for CREATE ROLE WITH SALTED HASH

2024-09-20 14:24:53 +02:00

cql_functions_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cql_query_group_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cql_query_large_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cql_query_like_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

cql_query_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

crc_test.cc

…

data_listeners_test.cc

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

database_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

dirty_memory_manager_test.cc

treewide: move log.hh into utils/log.hh

2024-10-22 06:54:46 +03:00

double_decker_test.cc

utils: do not include unused headers

2024-01-18 12:50:06 +02:00

duration_test.cc

Typos: fix typos in comments

2023-12-02 22:37:22 +02:00

dynamic_bitset_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

enum_option_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

enum_set_test.cc

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

error_injection_test.cc

treewide: move log.hh into utils/log.hh

2024-10-22 06:54:46 +03:00

estimated_histogram_test.cc

test/estimated_histogram_test Add summary tests

2024-08-22 23:34:24 +03:00

exception_container_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

exceptions_fallback_test.cc

…

exceptions_optimized_test.cc

…

exceptions_test.inc.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

expr_test.cc

cql3: introduce dialect infrastructure

2024-08-29 21:19:23 +03:00

extensions_test.cc

…

filtering_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

flush_queue_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

fragmented_temporary_buffer_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

frozen_mutation_test.cc

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

generic_server_test.cc

test/generic_server: add test case

2024-08-28 10:59:44 +02:00

gossiping_property_file_snitch_test.cc

test: use generic boost_test_print_type()

2024-05-20 12:56:20 +03:00

group0_cmd_merge_test.cc

raft_group0_client: uninclude "db/system_keyspace.hh"

2024-09-28 16:31:53 +03:00

group0_test.cc

raft: add the check for the group0 tables

2024-10-08 21:08:11 +02:00

hash_test.cc

…

hashers_test.cc

reader_permit: store schema_ptr instead of raw schema pointer

2024-01-11 08:37:56 +02:00

hint_test.cc

test/boost/hint_test.cc: Add missing parse() callback

2024-06-19 23:19:33 +02:00

idl_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

index_reader_test.cc

sstables: bsearch_clustered_cursor: Narrow down range using "end" position of the block

2024-10-03 14:16:05 +02:00

index_with_paging_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

input_stream_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

intrusive_array_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

json_cql_query_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

json_test.cc

…

keys_test.cc

…

large_paging_state_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

like_matcher_test.cc

…

limiting_data_source_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

linearizing_input_stream_test.cc

…

lister_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

loading_cache_test.cc

treewide: replace seastar::future::get0() with seastar::future::get()

2024-02-02 22:12:57 +08:00

locator_topology_test.cc

treewide: move log.hh into utils/log.hh

2024-10-22 06:54:46 +03:00

log_heap_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

logalloc_standard_allocator_segment_pool_backend_test.cc

…

logalloc_test.cc

treewide: move log.hh into utils/log.hh

2024-10-22 06:54:46 +03:00

managed_bytes_test.cc

…

managed_vector_test.cc

…

map_difference_test.cc

…

memtable_test.cc

replica: implement memtable_flush_period_in_ms schema option

2024-10-17 13:41:15 +03:00

multishard_combining_reader_as_mutation_source_test.cc

treewide: s/boost::adaptors::map_values/std::views::values/

2024-10-27 21:32:45 +02:00

multishard_mutation_query_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

murmur_hash_test.cc

…

mutation_fragment_test.cc

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

mutation_query_test.cc

readers: Use reversed schema and native reversed slices

2024-08-13 10:03:46 +02:00

mutation_reader_another_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

mutation_reader_test.cc

treewide: s/boost::adaptors::map_values/std::views::values/

2024-10-27 21:32:45 +02:00

mutation_test.cc

treewide: Remove table::config::datadir

2024-09-19 13:06:39 +03:00

mutation_writer_test.cc

treewide: s/boost::adaptors::map_values/std::views::values/

2024-10-27 21:32:45 +02:00

mvcc_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

network_topology_strategy_test.cc

treewide: move log.hh into utils/log.hh

2024-10-22 06:54:46 +03:00

nonwrapping_interval_test.cc

test/boost: include test/lib/test_utils.hh

2024-05-26 12:32:43 +08:00

observable_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

partitioner_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

per_partition_rate_limit_test.cc

test: Avoid using deprecated sharded API

2024-05-16 00:28:47 +02:00

pretty_printers_test.cc

utils/pretty_printers: add "I" specifier support

2024-01-11 14:33:47 +08:00

querier_cache_test.cc

treewide: use std::ranges sort functions rather than boost

2024-10-01 14:19:05 +03:00

query_processor_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

radix_tree_printer.hh

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

radix_tree_test.cc

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

range_assert.hh

…

range_tombstone_list_assertions.hh

…

range_tombstone_list_test.cc

clustering_bounds_comparator: add fmt::formtter for bound_{kind,view}

2024-03-11 11:37:48 +02:00

rate_limiter_test.cc

…

reader_concurrency_semaphore_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

README.md

test/boost: add README.md

2024-09-24 15:16:55 +03:00

recent_entries_map_test.cc

…

repair_test.cc

test/boost/repair_test: close reader after use

2024-09-13 06:52:26 -04:00

restrictions_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

result_utils_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

reusable_buffer_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

role_manager_test.cc

cql3: auth: use mutation collector for alter role

2024-06-04 15:43:04 +02:00

row_cache_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

rust_test.cc

…

s3_test.cc

s3-client: Add support for lister::filter

2024-08-27 16:15:40 +03:00

schema_change_test.cc

service/migration_listener: update_tablet_metadata(): add hint parameter

2024-08-11 09:53:19 -04:00

schema_changes_test.cc

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

schema_loader_test.cc

test: Use sstables::test_env to make sstables for schema loader test

2024-09-09 14:22:58 +03:00

schema_registry_test.cc

db: move schema merging code into a separate unit

2024-09-23 12:01:36 +02:00

secondary_index_test.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

serialization_test.cc

gms: remove unused operator<<

2024-06-18 15:55:22 +08:00

serialized_action_test.cc

…

service_level_controller_test.cc

service/qos/service_level_controller: notify subscribers on effective

2024-08-08 10:42:09 +02:00

sessions_test.cc

storage_service: Introduce session concept

2023-12-05 14:09:34 +01:00

small_vector_test.cc

utils: small_vector: support from_range_t

2024-10-21 09:31:38 +03:00

snitch_reset_test.cc

treewide: include used headers

2024-05-27 17:34:38 +03:00

sorting_test.cc

test/boost: add test for topological sorting

2024-05-16 13:30:03 +02:00

sstable_3_x_test.cc

Merge 'sstables: Reduce amount of I/O for clustering-key-bounded reads from large partitions' from Tomasz Grabiec

2024-10-28 21:13:23 +02:00

sstable_compaction_test.cc

Merge 'sstables: Add digest checking in the validation path of the sstable layer' from Nikos Dragazis

2024-10-09 21:33:08 +03:00

sstable_conforms_to_mutation_source_test.cc

readers: Use reversed schema and native reversed slices

2024-08-13 10:03:46 +02:00

sstable_datafile_test.cc

sstables: scylla_metadata: add sstable identifier

2024-10-10 08:52:46 +03:00

sstable_directory_test.cc

test: Fix test_multiple_data_dirs

2024-10-07 12:04:23 +03:00

sstable_generation_test.cc

test/boost/sstable_generation_test: s/LE/LT/ when appropriate

2023-12-05 08:25:04 +03:00

sstable_move_test.cc

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

sstable_mutation_test.cc

treewide: use std::ranges sort functions rather than boost

2024-10-01 14:19:05 +03:00

sstable_partition_index_cache_test.cc

treewide: replace seastar::future::get0() with seastar::future::get()

2024-02-02 22:12:57 +08:00

sstable_resharding_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

sstable_set_test.cc

boost/sstable_set_test: add testcase to test tablet_sstable_set copy constructor

2024-08-17 23:38:05 +05:30

sstable_test.cc

test: Squash test::change_generation_number() into test::store()

2024-10-24 11:29:17 +03:00

sstable_test.hh

test: Tune up indentation in uncompressed_schema()

2024-09-05 19:33:29 +03:00

stall_free_test.cc

utils/stall_free: introduce reserve_gently

2024-06-18 23:36:30 +05:30

statement_restrictions_test.cc

cql3: statement_restrictions, expr: move restrictions-related expression utilities out of expression.cc

2024-09-22 11:00:51 +03:00

storage_proxy_test.cc

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

string_format_test.cc

test: string_format_test: disable test if {fmt} >= 10.0.0

2024-05-03 11:34:23 +03:00

suite.yaml

boost/bloom_filter_test: add testcase to verify unlinked sstables are not reloaded

2024-07-19 13:15:57 +05:30

summary_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

tablets_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

tagged_integer_test.cc

…

token_metadata_test.cc

locator: topology: add_or_update_endpoint: use none as the default node state

2024-08-29 10:37:07 +02:00

top_k_test.cc

treewide: include used headers

2024-05-27 17:34:38 +03:00

total_order_check.hh

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

tracing_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

transport_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

tree_test_key.hh

test: Move other collection-testing headers from unit to boost

2024-09-24 13:42:13 +03:00

types_test.cc

treewide: remove dependency on boost asio address_v4

2024-10-01 14:00:50 +03:00

user_function_test.cc

test/boost: include test/lib/test_utils.hh

2024-05-26 12:32:43 +08:00

user_types_test.cc

treewide: replace boost::irange with std::views::iota where possible

2024-10-03 10:33:33 +03:00

utf8_test.cc

…

UUID_test.cc

treewide: use seastar::format() or fmt::format() explicitly

2024-09-11 23:21:40 +03:00

view_build_test.cc

test/boost: stop using ranges::to()

2024-10-19 13:21:20 +08:00

view_complex_test.cc

compaction: replace optional<task_info> with task_info param

2024-08-02 14:38:46 +02:00

view_schema_ckey_test.cc

test/lib: do not include unused headers

2024-05-05 23:31:48 +03:00

view_schema_pkey_test.cc

test/lib: do not include unused headers

2024-05-05 23:31:48 +03:00

view_schema_test.cc

node_update_backlog: divide adding and fetching backlogs

2024-06-06 10:45:13 +02:00

vint_serialization_test.cc

…

virtual_reader_test.cc

treewide: drop includes of <boost/range/adaptors.hpp>

2024-10-20 17:17:11 +03:00

virtual_table_mutation_source_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

virtual_table_test.cc

test/boost: do not include unused headers

2024-02-06 13:22:16 +02:00

wasm_alloc_test.cc

main, test: use seastar::handle_signal() instead

2024-09-19 18:10:07 +03:00

wasm_test.cc

…

wrapping_interval_test.cc

test/boost: include test/lib/test_utils.hh

2024-05-26 12:32:43 +08:00

README.md

Scylla unit tests using C++ and the Boost test framework

The source files in this directory are Scylla unit tests written in C++ using the Boost.Test framework. These unit tests come in three flavors:

1. Some simple tests that check stand-alone C++ functions or classes use Boost's BOOST_AUTO_TEST_CASE.
2. Some tests require Seastar features, and need to be declared with Seastar's extensions to Boost.Test, namely SEASTAR_TEST_CASE.
3. Even more elaborate tests require not just a functioning Seastar environment but also a complete (or partial) Scylla environment. Those tests use the do_with_cql_env() or do_with_cql_env_thread() function to set up a mostly-functioning environment behaving like a single-node Scylla, in which the test can run.

While we have many tests of the third flavor, writing new tests of this type should be reserved for white-box tests - tests where it is necessary to inspect or control Scylla internals that do not have user-facing APIs such as CQL. In contrast, black-box tests - tests that can be written using only user-facing APIs - should be written in one of the newer test frameworks that we offer, such as test/cql-pytest or test/alternator (in Python, using the CQL or DynamoDB APIs respectively) or test/cql (using textual CQL commands), or - if more than one Scylla node is needed for a test - using the test/topology* framework.

Running tests

Because these are C++ tests, they need to be compiled before running. To compile a single test executable row_cache_test, use a command like

ninja build/dev/test/boost/row_cache_test

You can also use ninja dev-test to build all C++ tests, or use ninja dev-build to build the C++ tests and also the full Scylla executable (however, note that the full Scylla executable isn't needed to run Boost tests).

Replace "dev" by "debug" or "release" in the examples above and below to use the "debug" build mode (which, importantly, compiles the test with ASAN and UBSAN enabled and helps catch difficult-to-catch use-after-free bugs) or the "release" build mode (optimized for run speed).

To run an entire test file row_cache_test, including all its test functions, use a command like:

build/dev/test/boost/row_cache_test -- -c1 -m1G

To run a single test function test_reproduce_18045() from the longer test file, use a command like:

build/dev/test/boost/row_cache_test -t test_reproduce_18045 -- -c1 -m1G

In these command lines, the parameters before the -- are passed to Boost.Test, while the parameters after the -- are passed to the test code, and in particular to Seastar. In this example Seastar is asked to run on one CPU (-c1) and use 1G of memory (-m1G) instead of hogging the entire machine. The Boost.Test option -t test_reproduce_18045 asks it to run just this one test function instead of all the test functions in the executable.
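
The `--` convention described above can be sketched in a few lines of C++. This only illustrates how the two groups of parameters are separated; it is not how Boost.Test or Seastar actually parse their command lines, and `split_args` and `arg_list` are hypothetical names used for this example.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using arg_list = std::vector<std::string>;

// Splits arguments at the first "--": the first list is what the test
// framework would see, the second is what the test code (Seastar) would see.
// If there is no "--", every argument lands in the first list.
std::pair<arg_list, arg_list> split_args(const arg_list& args) {
    auto sep = std::find(args.begin(), args.end(), "--");
    arg_list before(args.begin(), sep);
    arg_list after(sep == args.end() ? sep : sep + 1, args.end());
    return {before, after};
}
```

For the command line shown earlier, the arguments `-t test_reproduce_18045 -- -c1 -m1G` split into `-t test_reproduce_18045` for Boost.Test and `-c1 -m1G` for Seastar.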

Unfortunately, interrupting a running test with control-C doesn't work. This is a known bug (#5696). Kill a test with SIGKILL (-9) if you need to kill it while it's running.

Boost tests can also be run using test.py - which is a script that provides a uniform way to run all tests in scylladb.git - C++ tests, Python tests, etc.

Writing tests

Because of the large build time and build size of each separate test executable, it is recommended to put test functions into relatively large source files. But not too large - to keep compilation time of a single source file (during development) at reasonable levels.

When adding new source files in test/boost, don't forget to list the new source file in configure.py and also in CMakeLists.txt. The former is needed by our CI, but the latter is preferred by some developers.