Files
scylladb/db
Avi Kivity 7351db7cab Merge "Reshape upload files and reshard+reshape at boot" from Glauber
"

This patchset adds a reshape operation to each compaction strategy;
that is a strategy-specific way of detecting if SSTables are in-strategy
or off-strategy, and in case they are offstrategy moving them to in-strategy.

Often times the number of SSTables in a particular slice of the sstable set
matters for that decision (number of SSTables in the same time window for TWCS,
number of SSTables per tier for STCS, number of L0 SSTables for LCS). We want
to be more lenient for operations that keep the node offline, like reshape at
boot, but more forgiving for operations like upload, which run in maintenance
mode. To accomodate for that the threshold for considering a slice of the SSTable
set offstrategy is passed as a parameter

Once this patchset is applied, the upload directory will reshape the SSTables
before moving them to the main directory (if needed). One side effect of it
is that it is no longer necessary to take locks for the refresh operation nor
disable writes in the table.

With the infrastructure that we have built in the upload directory, we can
apply the same set of steps to populate_column_family. Using the sstable_directory
to scan the files we can reshard and reshape (usually if we resharded a reshape
will be necessary) with the node still offline. This has the benefit of never
adding shared SSTables to the table.

Applying this patchset will unlock a host of cleanups:
- we can get rid of all testing for shared sstables, sstable_need_rewrite, etc.
- we can remove the resharding backlog tracker.

and many others. Most cleanups are deferred for a later patchset, though.
"

* 'reshard-reshape-v4' of github.com:glommer/scylla:
  distributed_loader: reshard before the node is made online
  distributed_loader: rework uploading of SSTables
  sstable_directory: add helper to reshape existing unshared sstables
  compaction_strategy: add method to reshape SSTables
  compaction: add a new compaction type, Reshape
  compaction: add a size and throught pretty printer.
  compaction: add default implementation for some pure functions
  tests: fix fragile database tests
  distributed_loader.cc: add a helper function to extract the highest SSTable version found
  distributed_loader.cc : extract highest_generation_seen code
  compaction_manager: rename run_resharding_job
  distributed_loader: assume populate_column_families is run in shard 0
  api: do not allow user to meddle with auto compaction too early
  upload: use custom error handler for upload directory
  sstable_directory: fix debug message
2020-06-18 17:04:53 +03:00
..
2019-10-25 12:04:44 +02:00