mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-24 02:20:37 +00:00
" This patchset adds a reshape operation to each compaction strategy; that is a strategy-specific way of detecting if SSTables are in-strategy or off-strategy, and in case they are offstrategy moving them to in-strategy. Often times the number of SSTables in a particular slice of the sstable set matters for that decision (number of SSTables in the same time window for TWCS, number of SSTables per tier for STCS, number of L0 SSTables for LCS). We want to be more lenient for operations that keep the node offline, like reshape at boot, but more forgiving for operations like upload, which run in maintenance mode. To accomodate for that the threshold for considering a slice of the SSTable set offstrategy is passed as a parameter Once this patchset is applied, the upload directory will reshape the SSTables before moving them to the main directory (if needed). One side effect of it is that it is no longer necessary to take locks for the refresh operation nor disable writes in the table. With the infrastructure that we have built in the upload directory, we can apply the same set of steps to populate_column_family. Using the sstable_directory to scan the files we can reshard and reshape (usually if we resharded a reshape will be necessary) with the node still offline. This has the benefit of never adding shared SSTables to the table. Applying this patchset will unlock a host of cleanups: - we can get rid of all testing for shared sstables, sstable_need_rewrite, etc. - we can remove the resharding backlog tracker. and many others. Most cleanups are deferred for a later patchset, though. " * 'reshard-reshape-v4' of github.com:glommer/scylla: distributed_loader: reshard before the node is made online distributed_loader: rework uploading of SSTables sstable_directory: add helper to reshape existing unshared sstables compaction_strategy: add method to reshape SSTables compaction: add a new compaction type, Reshape compaction: add a size and throught pretty printer. compaction: add default implementation for some pure functions tests: fix fragile database tests distributed_loader.cc: add a helper function to extract the highest SSTable version found distributed_loader.cc : extract highest_generation_seen code compaction_manager: rename run_resharding_job distributed_loader: assume populate_column_families is run in shard 0 api: do not allow user to meddle with auto compaction too early upload: use custom error handler for upload directory sstable_directory: fix debug message