scylladb/compaction
Kefu Chai 9ce3695a0d compaction_manager: prevent gc-only sstables from being compacted
before this change, the temporary sstables that one compaction
creates to collect its GC-able data could be picked up by another
compaction job while the first one was still running. this
wastes CPU cycles and adds write amplification.

in general, these GC-only SSTables are created with the same run id
as the non-GC SSTables. but when a new sstable exhausts its input
sstable(s), we proactively replace the old main set with a new one
so that we can free up the space as soon as possible. the
GC-only SSTables are then added to the new main set along with
the non-GC SSTables, and since the former have a good chance of
overlapping the latter, these GC-only SSTables are assigned
different run ids. however, we fail to register them with the
`compaction_manager` when replacing the main sstable set.
that's why other compactions can pick them up while the
compaction which created them is not yet completed.

so, in this change,

* to prevent sstables in the transient stage from being picked
  up by regular compactions, a new interface class is introduced
  so that an sstable is always added to the registration before
  it is added to the sstable set, and removed from the registration
  after it is removed from the sstable set. the class consolidates
  the registration-related logic in a single place, and makes it
  more obvious that the timespan of an sstable in the registration
  should cover its timespan in the sstable set.
* use a different run_id for the gc sstable run, as it can
  overlap with the output sstable run. the run_id for the
  gc sstable run is created only when the gc sstable writer
  is created, because gc sstables are not created by every
  compaction.
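
the ordering in the two bullets above can be sketched as follows. this is
a minimal illustration, not the code in this change: every type and name
below (`compaction_registry`, `sstable_set`, `registered_sstable_set`,
`candidates`, `gc_run`) is a hypothetical stand-in, and sstables are
reduced to plain integer ids.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <set>
#include <vector>

// Hypothetical stand-ins, not ScyllaDB's real types.
using sstable_id = int;
using run_id = std::uint64_t;

// Sstables currently owned by an in-flight compaction.
struct compaction_registry {
    std::set<sstable_id> compacting;
};

// Sstables visible to readers and to compaction strategies.
struct sstable_set {
    std::set<sstable_id> live;
};

// Keeps the two containers ordered so that an sstable's time in the
// registry always covers its time in the set.
class registered_sstable_set {
    compaction_registry& _registry;
    sstable_set& _set;
public:
    registered_sstable_set(compaction_registry& r, sstable_set& s)
        : _registry(r), _set(s) {}

    void add(sstable_id id) {
        _registry.compacting.insert(id);   // register first
        _set.live.insert(id);              // then publish
    }

    void remove(sstable_id id) {
        assert(_registry.compacting.count(id) == 1);
        _set.live.erase(id);               // unpublish first
        _registry.compacting.erase(id);    // then deregister
    }
};

// A regular compaction may only pick sstables that are in the set
// and not registered by another compaction.
std::vector<sstable_id> candidates(const compaction_registry& r,
                                   const sstable_set& s) {
    std::vector<sstable_id> out;
    for (sstable_id id : s.live) {
        if (r.compacting.count(id) == 0) {
            out.push_back(id);
        }
    }
    return out;
}

// The run id of the gc sstable run is allocated only when it is first
// needed, i.e. when the gc writer is created.
class gc_run {
    std::optional<run_id> _id;
    run_id& _next;   // hypothetical run id generator
public:
    explicit gc_run(run_id& next) : _next(next) {}
    bool created() const { return _id.has_value(); }
    run_id get_or_create() {
        if (!_id) {
            _id = _next++;
        }
        return *_id;
    }
};
```

with this shape, an sstable added through `registered_sstable_set::add()`
never shows up in `candidates()` while its compaction is in flight, and a
compaction that never creates a gc writer never consumes a run id.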

please note, all (indirect) callers of
`compaction_task_executor::compact_sstables()` pass a non-empty
`std::function` to this function, so there is no need to check
whether it is empty before calling it. so in this change, the
check is dropped.
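
for context, a sketch of what the dropped check guards against. the
callback type and function names here (`on_replace_fn`, `call_checked`,
`call_unchecked`) are hypothetical and do not mirror the real signature
of `compact_sstables()`; the only fact relied on is that invoking an
empty `std::function` throws `std::bad_function_call`.

```cpp
#include <cassert>
#include <functional>

// Hypothetical callback type, for illustration only.
using on_replace_fn = std::function<void(int)>;

// With the emptiness check: silently skips an empty callback.
bool call_checked(const on_replace_fn& fn, int value) {
    if (fn) {
        fn(value);
        return true;
    }
    return false;
}

// Without the check: correct only if every caller passes a
// non-empty function.
void call_unchecked(const on_replace_fn& fn, int value) {
    fn(value);
}

// Returns true if calling an empty std::function throws.
bool empty_call_throws() {
    try {
        call_unchecked(on_replace_fn{}, 1);
    } catch (const std::bad_function_call&) {
        return true;
    }
    return false;
}

void demo() {
    int seen = 0;
    call_unchecked([&seen](int v) { seen = v; }, 5);
    assert(seen == 5);
    assert(empty_call_throws());
}
```

since every caller supplies a callable, the unchecked form behaves the
same as the checked one and is simpler to read.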

Fixes #14560
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14725

(cherry picked from commit fdf61d2f7c)

Closes #14827
2023-08-04 09:59:10 +03:00