" The current scrub compaction has a serious drawback, while it is very effective at removing any corruptions it recognizes, it is very heavy-handed in its way of repairing such corruptions: it simply drops all data that is suspected to be corrupt. While this *is* the safest way to cleanse data, it might not be the best way from the point of view of a user who doesn't want to loose data, even at the risk of retaining some business-logic level corruption. Mind you, no database-level scrub can ever fully repair data from the business-logic point of view, they can only do so on the database-level. So in certain cases it might be desirable to have a less heavy-handed approach of cleansing the data, that tries as hard as it can to not loose any data. This series introduces a new scrub mode, with the goal of addressing this use-case: when the user doesn't want to loose any data. The new mode is called "segregate" and it works by segregating its input into multiple outputs such that each output contains a valid stream. This approach can fix any out-of-order data, be that on the partition or fragment level. Out-of-order partitions are simply written into a separate output. Out of order fragments are handled by injecting a partition-end/partition-start pair right before them, so that they are now in a separate (duplicate) partition, that will just be written into a separate output, just like a regular out-of-order partition. The reason this series is posted as an RFC is that although I consider the code stable and tested, there are some questions related to the UX. 
* First and foremost, every scrub that does more than just discard data suspected to be corrupt (and even those, to a certain degree) has to consider the possibility that it is rehabilitating corruption, leaving it in the system without a warning, in the sense that the user won't see any more problems due to low-level corruption and hence might think everything is all right, while the data is still corrupt from the business-logic point of view. It is very hard to draw a line between what scrub should and shouldn't do, yet there is demand from users for a scrub that can restore data without losing any of it. Note that anybody executing such a scrub is already in bad shape: even if they can read their data (they often can't), it is already corrupt; scrub is not making anything worse here.
* This series converts the previous `skip_corrupted` boolean into an enum, which now selects the scrub mode. This means that `skip_corrupted` cannot be combined with segregate to throw out what the latter can't fix. This was chosen for simplicity: a bunch of flags, all interacting with each other, is very hard to see through in my opinion, while a linear mode selector is much easier to reason about.
* The new segregate mode goes all-in, trying to fix even fragment-level disorder. Maybe it should only operate on the partition level, or maybe this should be made configurable, allowing the user to select what happens to the data that cannot be fixed.
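To illustrate the second point, a linear mode selector replacing the boolean could look like the following sketch. Only "segregate" is a name taken from this cover letter; the other mode names, and the mapping helper, are hypothetical stand-ins for the behaviours the `skip_corrupted` boolean used to select:

```python
from enum import Enum

class ScrubMode(Enum):
    # Hypothetical names: "abort" and "skip" stand in for the behaviours
    # previously selected by skip_corrupted=False/True.
    abort = "abort"          # stop scrubbing on the first corruption
    skip = "skip"            # drop data suspected to be corrupt
    segregate = "segregate"  # reorder data into multiple valid streams

def mode_from_skip_corrupted(skip_corrupted: bool) -> ScrubMode:
    """Map the old boolean onto the new linear selector (illustrative)."""
    return ScrubMode.skip if skip_corrupted else ScrubMode.abort
```

With a single enum there is exactly one mode in effect at a time, so no flag combinations need to be defined or validated against each other.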
Tests: unit(dev), unit(sstable_datafile_test:debug) "

* 'sstable-scrub-segregate-by-partition/v1' of https://github.com/denesb/scylla:
  test: boost/sstable_datafile_test: add tests for segregate mode scrub
  api: storage_service/keyspace_scrub: expose new segregate mode
  sstables: compaction/scrub: add segregate mode
  mutation_fragment_stream_validator: add reset methods
  mutation_writer: add segregate_by_partition
  api: /storage_service/keyspace_scrub: add scrub mode param
  sstables: compaction/scrub: replace skip_corrupted with mode enum
  sstables: compaction/scrub: prevent infinite loop when last partition end is missing
  tests: boost/sstable_datafile_test: use the same permit for all fragments in scrub tests