LCS backlog tracker uses STCS tracker for L0. Turns out LCS tracker
is calling STCS tracker's replace_sstables() with empty arguments
even when higher levels (> 0) *only* had sstables replaced.
This unnecessary call to STCS tracker will cause it to recompute
the L0 backlog, yielding the same value as before.
As LCS has a fragment size of 0.16G on higher levels, we may be
updating the tracker multiple times during incremental compaction,
which operates on SSTables on higher levels.
Inefficiency is fixed by only updating the STCS tracker if any
L0 sstable is being added or removed from the table.
This may be fixing a quadratic behavior during boot or refresh,
as new sstables are loaded one by one.
Higher levels have a substantial higher number of sstables,
therefore updating STCS tracker only when level 0 changes, reduces
significantly the number of times L0 backlog is recomputed.
Refs #12499.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#12676
(cherry picked from commit 1b2140e416)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
LCS reshape is compacting all levels if a single one breaks
disjointness. That's unnecessary work because rewriting that single
level is enough to restore disjointness. If multiple levels break
disjointness, they'll each be reshaped in its own iteration, so
reducing operation time for each step and disk space requirement,
as input files can be released incrementally.
Incremental compaction is not applied to reshape yet, so we need to
avoid "major compaction", to avoid the space overhead.
But space overhead is not the only problem, the inefficiency, when
deciding what to reshape when overlapping is detected, motivated
this patch.
Fixes#12495.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#12496
(cherry picked from commit f2f839b9cc)
compaction_manager::task (and thus compaction_data) can be stopped
because of many different reasons. Thus, abort can be requested more
than once on compaction_data abort source causing a crash.
To prevent this before each request_abort() we check whether an abort
was requested before.
Closes#12004
(cherry picked from commit 7ead1a7857)
Fixes#12002.
When being stopped compaction manager may step on ENOSPC. This is not a
reason to fail stopping process with abort, better to warn this fact in
logs and proceed as if nothing happened
refs: #11245
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
If user stops off-strategy via API, compaction manager can decide
to give up on it completely, so data will sit unreshaped in
maintenance set, preventing it from being compacted with data
in the main set. That's problematic because it will probably lead
to a significant increase in read and space amplification until
off-strategy is triggered again, which cannot happen anytime
soon.
Let's handle it by moving data in maintenance set into main one,
even if unreshaped. Then regular compaction will be able to
continue from where off-strategy left off.
Fixes#11543.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#11545
(cherry picked from commit a04047f390)
Currently they are copied for the get_sstables function
so this change reduces copies.
Also, it will allow further decoupling of compaction_manager
from replica::database, by letting the caller of
perform_cleanup and perform_sstable_upgrade get the
owned token ranges from db and pass it to the perform_*
functions in the following patch.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And pass a reference to it to the database rather
than having the database construct its own compaction_manager.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
To prepare for the next patch, implement default initialization
of the scheduling_group and io_priority_class, to the default values.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
over the rest api
Performing compaction scrub user did not know whether an operation
was aborted.
If compaction scrub is aborted, return status the user gets over
rest api is set to 1.
Performing compaction scrub user did not know whether any validation
errors were encountered.
The number of validation errors per given compaction scrub is gathered
and summed from each shard. Basing on that value return status over
the rest api is set to 3 if any validation errors were encountered.
The number of validation errors is registered in metrics. Metrics
provide common counters for all scrub operation within a compaction
manager, though. Thus, to check the exact number of validation errors,
the comparison of counters before and after scrub operation needs
to be done.
Called from try_flush_memtable_to_sstable,
maybe_wait_for_sstable_count_reduction will wait for
compaction to catch up with memtable flush if there
the bucket to compact is inflated, having too many
sstables. In that case we don't want to add fuel
to the fire by creating yet another sstable.
Fixes#4116
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
To make sure fully_expired sstables are not missed
if get_sstables_for_compaction is called just heuristically,
change the state by setting _last_expired_check
to the current time only when no fully_expired_sstables are found
among the candidates.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Print the compaction_strategy `this` pointer
so we can distinguish between different instance of the
compaction_strategy object (some code paths copy it and
some may instantiate a branch new compaction_strategy object).
The motivation is detecting when the side effects of this function are
applied on the "master" instance, stored in the table shard.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This patch makes compaction_static_shares liveupdateable
to avoid having to restart the cluster after updating
this config.
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
This patch adds the _static_shares variable to the backlog_controller so that
instead of having to use a separate constructor when controller is disabled,
we can use a single constructor and periodically check on the adjust method
if we should use the static shares or the controller. This will be useful on
the next patches to make compaction_static_shares and memtable_flush_static_shares
live updateable.
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
compaction_manager.cc still cannot stop including replica/database.hh
because upgrade and scrub still take replica::database as param,
but I'll remove it soon in another series.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
rewrite_sstables() is used by maintenance compactions that perform
an operation on a single file at a time.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Now that submit() switched to table_state, compaction_reenabler
and friends can switch to table_state too.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
auto_compaction_disabled_by_user is a configuration that can be enabled
or disabled on a particular table. We're adding this interface to
avoid having to push the configuration for every compaction_state,
which would result in redundant information as the configuration
value is the same for all table states.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The idea is that we'll have a single on-completion interface for both
"in-strategy" and off-strategy compactions, so not to pollute table_state
with one interface for each.
replica::table::on_compaction_completion is being moved into private namespace.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
compaction_manager needs this interface when setting the sstable
creation lambda in compaction_descriptor, which is then forwarded
into the actual compaction procedure.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
they're used to stop all ongoing compaction on behalf of a given table
T. Today, each table has a single table_state representing it, but after
we implement compaction groups, we'll need to call the procedure for
each group in a table. But the discussion doesn't belong here, as
compaction group work will only come later. By the time being, we're
only making compaction manager fully switch to table_state.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The only external user of get_compactions() doesn't use any filtering,
so after table_state switch, one will be allowed to get all jobs
running associated with a table_state.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
propagate_replacement is used by incremental compaction to notify
ongoing compaction about sstable list updates, such that the
ongoing job won't hold reference to exhausted sstables.
So it needs to switch to table_state, too.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
in_strategy_sstables() doesn't have to be implemented in table, as it's
simply about main set with maintenance and staging files filtered out.
Also, let's make it switch to table_state as part of ongoing work.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>