As the cleanup process can now be driven by the compaction strategy,
let's move cleanup into a new task type that uses the new
compaction_strategy::get_cleanup_compaction_jobs().
By the time being all strategies are using the default method that
returns one descriptor for each sstable that needs clean up.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
For compaction to be able to purge expired data, like tombstones, a
sstable set snapshot is set in the compaction descriptor.
That's a decision that belongs to task type. For example, all regular
compaction enable GC, whereas scrub for example doesn't for safety
reasons.
The problem is that the decision is being made by every instantiation
of compaction_descriptor in the strategies, which is both unnecessary
and also adds lots of boilerplate to the code, making it hard to
understand and work with.
As sstable set snapshot is an implementation detail, a new method
is being added to compaction_descriptor to make the intention
clearer, making the interface easier to understand.
can_purge_tombstones, used previously by rewrite task only, is being
reused for communicating GC intention into task::compact_sstables().
The boilerplate was a pain when adding a new strategy method for
the ongoing work on cleanup, described by issue #10097.
Another benefit is that we'll now only create a set snapshot when
compaction will really run. Before, it could happen that the snapshot
would be discarded if the compaction attempt had to be postponed,
which is a waste of cpu cycles.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Flushing the base table triggers view building
and corresponding compactions on the view tables.
Temporarily disable compaction on both the base
table and all its view before flush and snapshot
since those flushed sstables are about to be truncated
anyway right after the snapshot is taken.
This should make truncate go faster.
In the process, this series also embeds `database::truncate_views`
into `truncate` and coroutinizes both
Refs #6309
Test: unit(dev)
Closes#10203
* github.com:scylladb/scylla:
replica/database: truncate: fixup indentation
replica/database: truncate: temporarily disable compaction on table and views before flush
replica/database: truncate: coroutinize per-view logic
replica/database: open-code truncate_view in truncate
replica/database: truncate: coroutinize run_with_compaction_disabled lambda
replica/database: coroutinize truncate
compaction_manager: add disable_compaction method
Compaction manager is calling back the table to run off-strategy compaction,
but the logic clearly belongs to manager which should perform the
operation independently and only call table to update its state with the
result.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220315174504.107926-2-raphaelsc@scylladb.com>
Returns a RAII class compaction_reenabler
that conditionally reenables compaction
for the given table when destroyed.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
With compact_sstables() now living in compaction_manager::task,
release_exhausted no longer has to live inside compaction_descriptor,
which is a good direction because implementation detail is being
removed from the interface.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220311023410.250149-2-raphaelsc@scylladb.com>
Table submits compaction request into manager, which in turn calls
back table to run the compaction when the time has come, i.e.:
table -> compaction manager -> table -> execute compaction
But manager should not rely on table to run compaction, as compaction
execution procedure sits one layer below the manager and should be
accessed directly by it, i.e:
table -> compaction manager -> execute compaction
This makes code easier to understand and update_compaction_history()
can now be noop for unit tests using table_state.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220311023410.250149-1-raphaelsc@scylladb.com>
Replace the _compaction_running boolean member
by calculating _state == state::active
now that setup_new_compaction switches state to
`active`
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Move the business logic into the task specific classes.
Separating initialization during task construction,
from the compaction_done task, moved into
a do_run() method, and in some cases moving
a lambda function that was called per table (as in
rewrite_sstables) into a private method of the
derived class.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Add an enum class representing the task state machine
and a switch_state function to transition between the states
and update the corresponding compaction_manager stats counters.
Refs #9974
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Define task::describe and use it via operator<<
to print the task metadata to the log in a standard way.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Move the task-internal parts of can_proceed
to a respective compaction_manager::task method,
preparing for turning it into a class with
a proper hierarchy of access to private members.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And use it to get the compaction state of the
table to compact.
It will be used in a later patch
to manage the task state from task
methods.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
task_stop is called exclusively from stop_tasks,
Now that stop_tasks calls task::stop() directly,
there is no need for this separation, so open-code
task_stop in stop_tasks, using coroutines.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220302081547.2205813-1-bhalevy@scylladb.com>
Currently the output_run_identifier is assigned right
after the calling setup_new_compaction.
Move setting the uuid to setup_new_compaction to simplify
the flow.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220301083643.1845096-1-bhalevy@scylladb.com>
Instead, rely solely on compaction_data.abort source
that is task::stop now uses to stop the task.
This makes task stopping permanent, so it can't be undone
(as used to be the case where task_stop
set stopping to false after waiting for compaction_done,
to allow rerite_sstables's task to be created before
calling run_with_compaction_disabled, and start
running after it - which is no longer the case)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220301083535.1844829-1-bhalevy@scylladb.com>
Provide a function to make a sstables::compaction_stopped_exception
based on the information in the stopped task.
To be reused by the next patch that will
also throw this exception from the retry sleep path,
when the task is stopped.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that the table layer is using perform_offstrategy,
submit_offstrategy is no longer in use and can be deleted.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Return a future from perform_offstrategy, resolved
when the offstrategy compaction completes so that callers
can wait on it.
submit_offstrategy still submits the offstrategy compaction
in the background.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
Move the ::database, ::keyspace, and ::table classes to a new replica
namespace and replica/ directory. This designates objects that only
have meaning on a replica and should not be used on a coordinator
(but note that not all replica-only classes should be in this module,
for example compaction and sstables are lower-level objects that
deserve their own modules).
The module is imperfect - some additional classes like distributed_loader
should also be moved, but there is only one way to untie Gordian knots.
Closes#9872
* github.com:scylladb/scylla:
replica: move ::database, ::keyspace, and ::table to replica namespace
database: Move database, keyspace, table classes to replica/ directory
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.
References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.
scylla-gdb.py is adjusted to look for both the new and old names.
It's hard to integrate maybe_stop_on_error() with coroutines as it
accepts a resolved future, not an exception pointer. Let's adjust
its interface, making it more flexible to work with.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
"
TWCS perform STCS on a window as long as it's the most recent one.
From there on, TWCS will compact all files in the past window into
a single file. With some moderate write load, it could happen that
there's still some compaction activity in that past window, meaning
that per-window major may miss some files being currently compacted.
As a result, a past window may contain more than 1 file after all
compaction activity is done on its behalf, which may increase read
amplification. To avoid that, TWCS will now make sure that per-window
major is serialized, to make sure no files are missed.
Fixes#9553.
tests: unit(dev).
"
* 'fix_twcs_per_window_major_v3' of https://github.com/raphaelsc/scylla:
TWCS: Make sure major on past window is done on all its sstables
TWCS: remove needless param for STCS options
TWCS: kill unused param in newest_bucket()
compaction: Implement strategy control and wire it
compaction: Add interface to control strategy behavior.
Allow stopping compaction by type on a given keyspace and list of tables.
Also add api unit test suite that tests the existing `stop_compaction` api
and the new `stop_keyspace_compaction` api.
Fixes#9700Closes#9746
* github.com:scylladb/scylla:
api: storage_service: validate_keyspace: improve exception error message
api: compaction_manager: add stop_keyspace_compaction
api: storage_service: expose validate_keyspace and parse_tables
api: compaction_manager: stop_compaction: fix type description
compaction_manager: stop_compaction: expose optional table*
test: api: add basic compaction_manager test
We have three semaphores for serialization of maintenance ops.
1) _rewrite_sstables_sem: for scrub, cleanup and upgrade.
2) _major_compaction_sem: for major
3) _custom_job_sem: for reshape, resharding and offstrategy
scrub, cleanup and upgrade should be serialized with major,
so rewrite sem should be merged into major one.
offstrategy is also a maintenance op that should be serialized
with others, to reduce compaction aggressiveness and space
requirement.
resharding is one-off operation, so can be merged there too.
the same applies for reshape, which can take long and not
serializing it with other maintenance activity can lead to
exhaustion of resources and high space requirement.
let's have a single semaphore to guarantee their serialization.
deadlock isn't an issue because locks are always taken in same
order.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211201182046.100942-1-raphaelsc@scylladb.com>
new code in manager adopted name and type table, whereas historical
code still uses name and type column family. let's make it consistent
for newcomers to not get confused.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
After commit 1f5b17f, overlapping can be introduced in level 1 because
procedure that filters out sstables from partial runs is considering
inactive tasks, so L1 sstables can be incorrectly filtered out from
next compaction attempt. When L0 is merged into L1, overlapping is
then introduced in L1 because old L1 sstables weren't considered in
L0 -> L1 compaction.
From now on, compaction_manager::get_candidates() will only consider
active tasks, to make sure actual partial runs are filtered out.
Fixes#9693.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211129180459.125847-1-raphaelsc@scylladb.com>
Similar to #9313, stop_compaction should also reuse the
stop_ongoing_comapctions() infrastructure and wait on ongoing
compactions of the given type to stop.
Fixes#9695
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now stop_ongoing_compactions(reason) is equivalent to
to stop_ongoing_compactions(reason, nullptr, std::nullopt)
so share the code of the latter for the former entry point.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And make the table optional as well, so it can be used
by stop_compaction() to a particular compaction type on all tables.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Optionally get running compaction on the provided table.
This is required for stop_ongoing_compactions on a given table.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>