The only user is row level repair: it is replaced with
downgrade_to_v1(make_empty_flat_reader_v2()). The row level reader has
lots of downgrade_to_v1() calls; we will deal with these later, all at
once.
Another use is the empty mutation source; this is trivially converted to
use the v2 variant.
The patchset embeds the mutation_fragment upgrading logic from v1 to v2 into the mutation_fragment_queue. This way the mutation fragments coming to the mutation_fragment_queue can be v1, but the underlying queue_reader receives mutation_fragment_v2, eliminating the last usage of queue_reader (v1). The last commit removes queue_reader, queue_reader_handle and associated factory functions.
tests: unit(dev), dtest(incremental_repair_test, read_repair_test, repair_additional_test, repair_test)
Closes #10371
* github.com:scylladb/scylla:
readers: Remove queue_reader v1 and associated code.
repair: Make mutation_fragment_queue internally upgrade fragments to v2
repair: Make mutation_fragment_queue::impl a seastar::shared_ptr
It makes mutation_fragment_queue copyable and makes the pointer to the
pending mutation fragments (introduced in the next commit) stable. This
allows moving the mutation_fragment_queue without breaking the
underlying upgrading_consumer.
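Why a stable pointer matters here can be sketched in plain C++. This is a minimal, hypothetical model and not the Scylla implementation: fragment_v1, fragment_v2, upgrading_consumer_sketch, queue_impl_sketch and fragment_queue_sketch are invented stand-ins, std::shared_ptr stands in for seastar::shared_ptr, and the "upgrade" is a plain payload move rather than the real v1 to v2 conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the v1 and v2 fragment types.
struct fragment_v1 { std::string payload; };
struct fragment_v2 { std::string payload; };

// Hypothetical consumer: converts v1 fragments and appends the result to a
// vector it does not own. It keeps a raw pointer to that vector, so the
// vector must not change address while the consumer is alive.
class upgrading_consumer_sketch {
    std::vector<fragment_v2>* _out;
public:
    explicit upgrading_consumer_sketch(std::vector<fragment_v2>* out) : _out(out) {}
    void consume(fragment_v1 f) {
        // Placeholder conversion; the real upgrade logic is more involved.
        _out->push_back(fragment_v2{std::move(f.payload)});
    }
};

// State shared by all copies of the queue handle. It lives on the heap, so
// its address (and the address of `pending`) never changes.
struct queue_impl_sketch {
    std::vector<fragment_v2> pending;
    upgrading_consumer_sketch consumer;
    queue_impl_sketch() : consumer(&pending) {}
};

// The queue itself is only a handle; copying or moving it copies or moves
// the shared pointer, not the state the consumer points into.
class fragment_queue_sketch {
    std::shared_ptr<queue_impl_sketch> _impl;
public:
    fragment_queue_sketch() : _impl(std::make_shared<queue_impl_sketch>()) {}
    void push(fragment_v1 f) { _impl->consumer.consume(std::move(f)); }
    std::size_t pending_count() const { return _impl->pending.size(); }
    const std::vector<fragment_v2>* pending_address() const { return &_impl->pending; }
};
```

Had the pending vector been a direct member of the queue object, moving the queue would have left the consumer pointing at the moved-from object.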
This patch series splits up parts of the repair pipeline to allow unit testing
various bits of code without having to run the full dtest suite. The reason why
the repair pipeline has no unit tests is that by definition repair requires multiple
nodes, while the unit test environment works only for a single node.
However, it is possible to explicitly define interfaces between various parts of the
pipeline, inject dependencies and test them individually. This patch series focuses
on taking repair_rows_on_wire (the frozen mutation representation of changes coming from
another node) and flushing them to an sstable.
The commits are split into the following parts:
- pulling out classes to separate headers so that they can be included (potentially indirectly) from the test,
- pulling out repair_meta::to_repair_rows_list and part of repair_meta::flush_rows_in_working_row_buf so that they can be tested,
- refactoring repair_writer so that the actual writing logic can be injected as dependency,
- creating the unit test.
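The dependency-injection step in the list above can be illustrated with a small, hypothetical sketch. The names writer_impl_sketch, recording_writer_sketch and repair_writer_sketch are assumptions made for illustration, and rows are modeled as plain strings; the real interfaces are asynchronous and operate on mutation fragments.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface for the part that actually persists data.
class writer_impl_sketch {
public:
    virtual ~writer_impl_sketch() = default;
    virtual void write(const std::string& row) = 0;
};

// Test double: records rows in memory instead of producing an sstable.
class recording_writer_sketch : public writer_impl_sketch {
public:
    std::vector<std::string> written;
    void write(const std::string& row) override { written.push_back(row); }
};

// Hypothetical front end: it receives the writing logic through its
// constructor, so a unit test can pass in the recording double.
class repair_writer_sketch {
    std::shared_ptr<writer_impl_sketch> _impl;
public:
    explicit repair_writer_sketch(std::shared_ptr<writer_impl_sketch> impl)
        : _impl(std::move(impl)) {}
    void flush_rows(const std::vector<std::string>& rows) {
        for (const auto& row : rows) {
            _impl->write(row);
        }
    }
};
```

A single-node test can then construct repair_writer_sketch with a recording_writer_sketch and inspect `written`, with no cluster involved.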
tests: unit(dev), dtest(incremental_repair_test, read_repair_test, repair_additional_test, repair_test)
Closes #10345
* github.com:scylladb/scylla:
repair: Add unit test for flushing repair_rows_on_wire to disk.
repair: Extract mutation_fragment_queue and repair_writer::impl interfaces.
repair: Make parts of repair_writer interface private.
repair: Rename inputs to flush_rows.
repair: Make repair_meta::flush_rows a free function.
repair: Split flush_rows_in_working_row_buf to two functions and make one static.
repair: Rename inputs to to_repair_rows_list.
repair: Make to_repair_rows_list a free function.
repair: Make repair_meta::to_repair_rows_list a static function
repair: Fix indentation in repair_writer.
repair: Move repair_writer to separate header.
repair: Move repair_row to a separate header.
repair: Move repair_sync_boundary to a separate header.
repair: Move decorated_key_with_hash to separate header.
repair: Move row_repair hashing logic to separate class and file.
Querying the table is now done with the help of qctx directly. This
patch replaces it with a querying helper that calls the consumer
function with the entry struct as the argument.
After this change repair code can stop including query_context and
messing with untyped_result_set.
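The shape of such a helper can be sketched as follows. This is a hypothetical, synchronous model: repair_history_entry_sketch, its fields, system_keyspace_sketch and for_each_entry are invented names, and an in-memory vector stands in for the system.repair_history table.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical typed entry; the field names are assumptions.
struct repair_history_entry_sketch {
    std::string table_name;
    std::int64_t repair_time;
};

// Hypothetical stand-in for the system keyspace object.
class system_keyspace_sketch {
    std::vector<repair_history_entry_sketch> _rows;
public:
    void update(repair_history_entry_sketch e) { _rows.push_back(std::move(e)); }

    // The caller supplies a consumer and receives typed entries; it never
    // sees the query context or an untyped result set.
    void for_each_entry(
            const std::function<void(const repair_history_entry_sketch&)>& consumer) const {
        for (const auto& row : _rows) {
            consumer(row);
        }
    }
};
```

The caller only depends on the entry struct and the helper, which is what lets the repair code drop its knowledge of how the table is queried.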
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Patch the history entry loader to use the recently introduced
history entry. This is just to reduce the churn in the next patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Current code works directly on the qctx which is not nice. Instead,
make it use the system keyspace reference. To make it work, the patch
adds a helper method and introduces a helper struct for the table
entry. This struct will also be used to query the table (next patch).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Repair updates (and queries on start) the system.repair_history table
and thus depends on the system_keyspace object.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It allows pulling out the logic of writing internal representation
of repair mutations to disk. This in turn is needed to unit test
this functionality without spinning up clusters, which significantly
improves developer iteration time.
It allows pulling out the logic of converting on-the-wire representation
of repair mutations to an internal representation used later for
flushing repair mutations to disk. This in turn is needed to unit test
the functionality without spinning up clusters, which significantly
improves developer iteration time.
The flat_mutation_reader files were conflated and contained multiple
readers that were not strictly necessary there. Splitting them improves
iterative compilation times, as touching rarely used readers no longer
recompiles large chunks of the codebase. Total compilation times are also
improved, as flat_mutation_reader.hh and flat_mutation_reader_v2.hh
have shrunk and those files are included by many files in the
codebase.
With changes
real 29m14.051s
user 168m39.071s
sys 5m13.443s
Without changes
real 30m36.203s
user 175m43.354s
sys 5m26.376s
Closes #10194
The flush of hints and batchlog is needed only for tables with
tombstone_gc_mode set to repair mode. We should skip the flush if
tombstone_gc_mode is not repair mode.
Fixes #10004
Closes #10124
More and more places are using the repair[uuid]: format for logging
repair jobs with the uuid. Convert more places to use the new format to
unify the log format.
This makes it easier to grep a specific repair job in the log.
Closes #10125
Rather than using a static uint32_t next_id,
move the next_id variable into repair_service shard 0
and manage it there.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Note that we can't pass the repair_service container()
from its ctor since it's not populated until all shards start.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Keep repair_meta in repair_meta_map as shared_ptr<repair_meta>
rather than lw_shared_ptr<repair_meta> so the map can be defined
in the header file using only a forward-declared
class repair_meta.
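The idea can be illustrated with the standard library, under the assumption that std::shared_ptr is an acceptable analogue of seastar::shared_ptr for this purpose: it can be declared, stored and destroyed while the pointee type is still incomplete. All names below (repair_meta_sketch, repair_meta_map_sketch, repair_service_sketch and the two free functions) are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>

// "Header" part: only a forward declaration is visible here.
class repair_meta_sketch;
using repair_meta_map_sketch = std::map<int, std::shared_ptr<repair_meta_sketch>>;

struct repair_service_sketch {
    // Compiles even though repair_meta_sketch is still an incomplete type.
    repair_meta_map_sketch metas;
};

// Declared in the "header", defined where the full type is known.
std::shared_ptr<repair_meta_sketch> make_repair_meta_sketch(int id);
int repair_meta_id_sketch(const repair_meta_sketch& rm);

// "Source file" part: the full definition appears only here.
class repair_meta_sketch {
    int _id;
public:
    explicit repair_meta_sketch(int id) : _id(id) {}
    int id() const { return _id; }
};

std::shared_ptr<repair_meta_sketch> make_repair_meta_sketch(int id) {
    return std::make_shared<repair_meta_sketch>(id);
}

int repair_meta_id_sketch(const repair_meta_sketch& rm) {
    return rm.id();
}
```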
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Define the static {get,insert,remove}_repair_meta functions out
of the repair_meta class definition, on the way to moving them,
along with the repair_meta_map itself, to repair_service.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
All repair_meta needs is the local instance.
If need be, it's a peering service, so container()
can be used.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Use repair_service as the authoritative source for
the database, messaging_service, system_distributed_keyspace,
and view_update_generator, similar to repair_info.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes were applied mechanically with a script, except for
licenses/README.md.
Closes #9937
"
Repair obtains a permit for each repair-meta instance it creates. This
permit is supposed to track all resources consumed by that repair as
well as ensure concurrency limit is respected. However when the
non-local reader path is used (shard config of master != shard config of
follower), a second permit will be obtained -- for the shard reader of
the multishard reader. This creates a situation where the repair-meta's
permit can block the shard permit, creating a deadlock situation.
This patch solves this by dropping the count resource on the
repair-meta's permit when a non-local reader path is executed -- that is,
when a multishard reader is created.
Fixes: #9751
"
* 'repair-double-permit-block/v4' of https://github.com/denesb/scylla:
repair: make sure there is one permit per repair with count res
reader_permit: add release_base_resource()
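The double-permit problem described in this series can be modeled with a deliberately simplified, single-threaded sketch. Only the name release_base_resource() comes from the commit list above; semaphore_sketch, permit_sketch and their behavior are assumptions for illustration, and "blocking" is modeled as a failed admission rather than an actual wait.

```cpp
#include <cassert>

// Hypothetical model of a semaphore that limits the number of concurrent
// readers through a count resource.
class semaphore_sketch {
    int _available_count;
public:
    explicit semaphore_sketch(int count) : _available_count(count) {}
    bool try_consume() {
        if (_available_count == 0) {
            return false;
        }
        --_available_count;
        return true;
    }
    void signal() { ++_available_count; }
    int available() const { return _available_count; }
};

// Hypothetical permit: takes one count unit on construction if one is free.
class permit_sketch {
    semaphore_sketch& _sem;
    bool _holds_count;
public:
    explicit permit_sketch(semaphore_sketch& sem)
        : _sem(sem), _holds_count(sem.try_consume()) {}
    permit_sketch(const permit_sketch&) = delete;
    permit_sketch& operator=(const permit_sketch&) = delete;
    ~permit_sketch() { release_base_resource(); }

    bool admitted() const { return _holds_count; }

    // Give the count unit back early while keeping the permit object alive.
    void release_base_resource() {
        if (_holds_count) {
            _sem.signal();
            _holds_count = false;
        }
    }
};
```

With a count of one, a second permit cannot be admitted while the first still holds its unit; once the first permit releases its base resource, the second one is admitted, which is the effect the fix relies on.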
"
SSTables created by repair will potentially not conform to the
compaction strategy layout goal. If a node shuts down before
off-strategy has a chance to reshape those files, the node will be
forced to reshape them on restart. That causes unexpected downtime.
Turns out we can skip reshape of those files on boot, and allow them to
be reshaped after the node becomes online, as if the node never went
down. Those files will go through the same procedure as files created by
repair-based ops. They will be placed in the maintenance set, and be
reshaped iteratively until ready for integration into the main set.
"
Fixes #9895.
tests: UNIT(dev).
* 'postpone_reshape_on_repair_originated_files' of https://github.com/raphaelsc/scylla:
distributed_loader: postpone reshape of repair-originated sstables
sstables: Introduce filter for sstable_directory::reshape
table: add fast path when offstrategy is not needed
sstables: add constant for repair origin
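The filtering idea behind this series can be sketched as a predicate applied while planning the boot-time reshape. Everything here is hypothetical: sstable_sketch, reshape_plan_sketch, plan_reshape_sketch and the value of repair_origin_sketch are assumptions for illustration, not the sstable_directory API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sstable descriptor carrying the origin it was written with.
struct sstable_sketch {
    std::string name;
    std::string origin;
};

// Assumed value of the repair origin constant.
const std::string repair_origin_sketch = "repair";

// A filter returns true for sstables that should be reshaped right away.
using reshape_filter_sketch = std::function<bool(const sstable_sketch&)>;

struct reshape_plan_sketch {
    std::vector<sstable_sketch> reshape_now;  // handled during boot
    std::vector<sstable_sketch> postponed;    // left for off-strategy later
};

// Split the input according to the filter instead of reshaping everything.
reshape_plan_sketch plan_reshape_sketch(const std::vector<sstable_sketch>& all,
                                        const reshape_filter_sketch& filter) {
    reshape_plan_sketch plan;
    for (const auto& sst : all) {
        (filter(sst) ? plan.reshape_now : plan.postponed).push_back(sst);
    }
    return plan;
}

// Filter that excludes repair-originated files from the boot-time reshape.
bool not_repair_originated_sketch(const sstable_sketch& sst) {
    return sst.origin != repair_origin_sketch;
}
```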
This series greatly reduces the gossiper's dependence on `seastar::async` (though not completely).
`i_endpoint_state_change_subscriber` callbacks are converted to return futures (again, to get rid of `seastar::async` dependency), all users are adjusted appropriately (e.g. `storage_service`, `cdc::generation_service`, `streaming::stream_manager`, `view_update_backlog_broker` and `migration_manager`).
This includes futurizing and coroutinizing the whole function call chain up to the `i_endpoint_state_change_subscriber` callback functions.
To aid the conversion process, a non-`seastar::async` dependent variant of `utils::atomic_vector::for_each` is introduced (`for_each_futurized`). A different name is used to clearly distinguish converted and non-converted code, so that the last step (remove `seastar::async()` wrappers around callback-calling code in gossiper) is easier. This is left for a follow-up series, though.
Tests: unit(dev)
Closes #9844
* github.com:scylladb/scylla:
service: storage_service: coroutinize `set_gossip_tokens`
service: storage_service: coroutinize `leave_ring`
service: storage_service: coroutinize `handle_state_left`
service: storage_service: coroutinize `handle_state_leaving`
service: storage_service: coroutinize `handle_state_removing`
service: storage_service: coroutinize `do_drain`
service: storage_service: coroutinize `shutdown_protocol_servers`
service: storage_service: coroutinize `excise`
service: storage_service: coroutinize `remove_endpoint`
service: storage_service: coroutinize `handle_state_replacing`
service: storage_service: coroutinize `handle_state_normal`
service: storage_service: coroutinize `update_peer_info`
service: storage_service: coroutinize `do_update_system_peers_table`
service: storage_service: coroutinize `update_table`
service: storage_service: coroutinize `handle_state_bootstrap`
service: storage_service: futurize `notify_*` functions
service: storage_service: coroutinize `handle_state_replacing_update_pending_ranges`
repair: row_level_repair_gossip_helper: coroutinize `remove_row_level_repair`
locator: reconnectable_snitch_helper: coroutinize `reconnect`
gms: i_endpoint_state_change_subscriber: make callbacks to return futures
utils: atomic_vector: introduce future-returning `for_each` function
utils: atomic_vector: rename `for_each` to `thread_for_each`
gms: gossiper: coroutinize `start_gossiping`
gms: gossiper: coroutinize `force_remove_endpoint`
gms: gossiper: coroutinize `do_status_check`
gms: gossiper: coroutinize `remove_endpoint`
"
The first patch introduces evictable_reader_v2, and the second one
further simplifies it. We clone instead of converting because there
is at least one downstream (by way of multishard_combining_reader) use
that is not itself straightforward to convert at the moment
(multishard_mutation_query), and because evictable_reader instances
cannot be {up,down}graded (since users also access the underlying
buffers). This also means that shard_reader, reader_lifecycle_policy
and multishard_combining_reader have to be cloned.
"
* tag 'clone-evictable-reader-to-v2/v3' of https://github.com/cmm/scylla:
convert make_multishard_streaming_reader() to flat_mutation_reader_v2
convert table::make_streaming_reader() to flat_mutation_reader_v2
convert make_flat_multi_range_reader() to flat_mutation_reader_v2
view_update_generator: remove unneeded call to downgrade_to_v1()
introduce multishard_combining_reader_v2
introduce shard_reader_v2
introduce the reader_lifecycle_policy_v2 abstract base
evictable_reader_v2: further code simplifications
introduce evictable_reader_v2 & friends