Commit Graph

394 Commits

Author SHA1 Message Date
Avi Kivity
87b08c957f Merge 'treewide: drop FMT_DEPRECATED_OSTREAM macro and homebrew range formatters' from Kefu Chai
before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map, optional and variant using {fmt} instead of the homebrew formatter based on operator<<.
with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Closes scylladb/scylladb#17968

* github.com:scylladb/scylladb:
  treewide: do not define FMT_DEPRECATED_OSTREAM
  treewide: include fmt/ranges.h and/or fmt/std.h
  utils/managed_bytes: add support for fmt::to_string() to bytes and friends
2024-04-20 22:25:00 +03:00
Kefu Chai
a439ebcfce treewide: include fmt/ranges.h and/or fmt/std.h
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map,
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:16 +08:00
Raphael S. Carvalho
223214439b compaction: Disconsider active tables in the hourly compaction reevaluation
This hourly reevaluation is there to help tablets that have very low
write activity, which can go a long time without flushing a memtable,
and it's important to reevaluate compaction as data can get expired.
Today it can happen that we reevaluate a table that is being compacted
actively, which is a waste of CPU as the reevaluation will happen anyway
when there are changes to the sstable set. This waste can be amplified with
a significant tablet count in a given shard.
Eventually, we could make the reevaluation time per table based on
expiration histogram, but until we get there, let's avoid this waste
by only reevaluating tables that are compaction idle for more than 1h.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#18280
2024-04-19 14:33:40 +03:00
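The idle-table filter described in 223214439b can be sketched as follows. This is a minimal illustration, not Scylla's code: `table_state`, `last_compaction_activity` and `tables_to_reevaluate` are hypothetical names, and the one-hour threshold is taken from the commit message.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <vector>

// Hypothetical stand-in for per-table compaction state (not Scylla's type).
struct table_state {
    std::string name;
    std::chrono::steady_clock::time_point last_compaction_activity;
};

// Returns the names of tables that have been compaction-idle for longer
// than `idle_threshold`. Actively compacted tables are skipped because
// they get reevaluated anyway whenever their sstable set changes.
std::vector<std::string> tables_to_reevaluate(
        const std::vector<table_state>& tables,
        std::chrono::steady_clock::time_point now,
        std::chrono::steady_clock::duration idle_threshold = std::chrono::hours(1)) {
    std::vector<std::string> idle;
    for (const auto& t : tables) {
        if (now - t.last_compaction_activity > idle_threshold) {
            idle.push_back(t.name);
        }
    }
    return idle;
}
```

Passing `now` explicitly keeps the filter deterministic in tests; a real periodic timer would presumably supply the current clock reading.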
Pavel Emelyanov
1f44a374b8 error_injection: Overload inject() instead of inject_with_handler()
The inject_with_handler() method accepts a coroutine that can be called
with injection_handler. With such a function as an argument, there's no
need for a distinct inject_with_handler() method name; it can be an
overload of the existing inject() methods.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-03-11 19:30:19 +03:00
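The idea in 1f44a374b8, selecting the overload by the callback's signature rather than by a distinct method name, can be illustrated with a small sketch. It uses plain callables instead of coroutines, and `error_injection_sketch`, the `injection_handler` layout and the returned strings are assumptions made for the example, not the real error_injection API.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the handler passed to injection callbacks.
struct injection_handler {
    std::string injection_name;
};

class error_injection_sketch {
public:
    // Overload chosen for callbacks that take no arguments.
    template <typename F, std::enable_if_t<std::is_invocable_v<F>, int> = 0>
    std::string inject(std::string_view /*name*/, F&& f) {
        std::forward<F>(f)();
        return "plain";
    }

    // Overload chosen for callbacks that accept an injection_handler&.
    template <typename F,
              std::enable_if_t<std::is_invocable_v<F, injection_handler&>, int> = 0>
    std::string inject(std::string_view name, F&& f) {
        injection_handler handler{std::string(name)};
        std::forward<F>(f)(handler);
        return "handler";
    }
};
```

Because the two constraints are mutually exclusive for ordinary lambdas, callers only ever write `inject(...)` and the compiler picks the matching overload.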
Raphael S. Carvalho
f07c233ad5 Fix potential data resurrection when another compaction type does cleanup work
Since commit f1bbf70, many compaction types can do cleanup work, but turns out
we forgot to invalidate cache on their completion.

So if a node regains ownership of token that had partition deleted in its previous
owner (and tombstone is already gone), data can be resurrected.

Tablet is not affected, as it explicitly invalidates cache during migration
cleanup stage.

Scylla 5.4 is affected.

Fixes #17501.
Fixes #17452.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#17502
2024-02-25 13:08:04 +02:00
Avi Kivity
605bf6e221 range.hh: retire
range.hh was deprecated in bd794629f9 (2020) since its names
conflict with the C++ library concept of an iterator range. The name
::range also mapped to the dangerous wrapping_interval rather than
nonwrapping_interval.

Complete the deprecation by removing range.hh and replacing all the
aliases by the names they point to from the interval library. Note
this now exposes uses of wrapping intervals as they are now explicit.

The unit tests are renamed and range.hh is deleted.

Closes scylladb/scylladb#17428
2024-02-21 00:24:25 +02:00
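To see why 605bf6e221 calls the wrapping variant dangerous, compare how containment behaves once the start bound exceeds the end bound. The sketch below assumes closed integer bounds for brevity; the real interval library supports open, closed and unbounded ends, and `interval_sketch` is a hypothetical name.

```cpp
#include <cassert>

// Hypothetical closed interval over ints.
struct interval_sketch {
    int start;
    int end;
};

// Non-wrapping semantics: start must not exceed end; anything else is empty.
bool nonwrapping_contains(const interval_sketch& i, int x) {
    return i.start <= x && x <= i.end;
}

// Wrapping semantics: when start > end the interval wraps around the ring,
// covering [start, max] and [min, end].
bool wrapping_contains(const interval_sketch& i, int x) {
    if (i.start <= i.end) {
        return nonwrapping_contains(i, x);
    }
    return x >= i.start || x <= i.end;
}
```

With the bounds `{20, 10}`, the non-wrapping check matches nothing while the wrapping check matches almost the whole ring, which is why making the choice explicit in the type name helps.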
Lakshmi Narayanan Sreethar
e86965c272 compaction: run rewrite_sstables_compaction_task_executor tasks in maintenance group
Use maintenance group to run all the compaction tasks that use the
rewrite_sstables_compaction_task_executor.

Fixes #16699

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#17112
2024-02-02 11:18:49 +02:00
Botond Dénes
c67698ea06 compaction/compaction_manager: perform_cleanup(): hold the compaction gate
While the cleanup is ongoing. Otherwise, a concurrent table drop might
trigger a use-after-free, as we have seen in dtests recently.

Fixes: #16770

Closes scylladb/scylladb#16874
2024-01-25 14:52:50 +01:00
Benny Halevy
51a46aa83b compaction_manager: perform_task_on_all_files: return early when there are no sstables to compact
Prevent the creation of a compaction task when
the list of sstables is known to be empty ahead
of time.

Refs scylladb/scylladb#16694
Fixes scylladb/scylladb#16803

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-01-17 11:53:39 +02:00
Benny Halevy
bd1d65ec38 compaction_manager: perform_cleanup: use compaction_manager::eligible_for_compaction
3b424e391b introduced a loop
in `perform_cleanup` that waits until all sstables that require
cleanup are cleaned up.

However, with f1bbf705f9,
an sstable that is_eligible_for_compaction (i.e. it
is not in staging, awaiting view update generation),
may already be compacted by e.g. regular compaction.
And so perform_cleanup should interrupt that
by calling try_perform_cleanup, since the latter
reevaluates `update_sstable_cleanup_state` with
compaction disabled - that stops ongoing compactions.

Refs scylladb/scylladb#15673

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-01-17 11:53:39 +02:00
Kefu Chai
eb9216ef11 compaction: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16707
2024-01-10 11:07:36 +02:00
Aleksandra Martyniuk
6f13e55187 tasks: call release_resources when task is finished
Call task_manager::task::impl::release_resources when task is finished
instead of putting the responsibility on user.

Closes scylladb/scylladb#16660
2024-01-09 11:41:54 +02:00
Lakshmi Narayanan Sreethar
1d6eaf2985 compaction manager: remove: cleanup _compaction_state on exceptions
If for some reason an exception is thrown in compaction_manager::remove,
it might leave behind stale table pointers in _compaction_state. Fix
that by setting up a deferred action to perform the cleanup.

Fixes #16635

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16632
2024-01-03 22:03:24 +02:00
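The deferred-action pattern that 1d6eaf2985 relies on can be sketched with a small scope guard. This is an illustration under assumptions: `deferred_action`, `manager_sketch` and the integer table ids are hypothetical, and the real code would use the project's own defer facility rather than this hand-written class.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <unordered_set>
#include <utility>

// Minimal scope guard: runs the stored action when it leaves scope,
// both on normal return and during exception unwinding.
class deferred_action {
    std::function<void()> _action;
public:
    explicit deferred_action(std::function<void()> action)
        : _action(std::move(action)) {}
    deferred_action(const deferred_action&) = delete;
    deferred_action& operator=(const deferred_action&) = delete;
    ~deferred_action() {
        if (_action) {
            _action();
        }
    }
};

// Hypothetical manager keyed by table id (not Scylla's compaction_manager).
struct manager_sketch {
    std::unordered_set<int> compaction_state;

    void remove(int table_id, bool fail) {
        // The entry is erased even if the code below throws.
        deferred_action cleanup([this, table_id] { compaction_state.erase(table_id); });
        if (fail) {
            throw std::runtime_error("stopping compactions failed");
        }
    }
};
```

The guard is declared before any step that may throw, so no exit path can leave a stale entry behind.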
Raphael S. Carvalho
dd1a6d6309 compaction: Add splitting compaction task to manager
The task for splitting compaction will run until all sstables
in the main set are split. The only exceptions are shutdown
or user has explicitly asked for abort.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-12-17 11:40:09 -03:00
Raphael S. Carvalho
f87161e556 compaction: Prepare rewrite_sstables_compaction_task_executor to be reused for splitting
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-12-17 11:40:09 -03:00
Raphael S. Carvalho
c96938c49b compaction: remove scrub-specific code from rewrite_sstables_compaction_task_executor
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-12-17 11:40:09 -03:00
Benny Halevy
0bcce35abd treewide: get rid of now unused fb_utilities
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-05 16:22:49 +02:00
Aleksandra Martyniuk
8639eae0ce test: test running compaction task abort
Test whether a task which is aborted while running has a proper status.
2023-11-24 19:25:20 +01:00
Aleksandra Martyniuk
aa7bba2d8b compaction: abort task manager compaction tasks
Set top level compaction tasks as abortable.

Compaction tasks which have no children, i.e. compaction task
executors, have abort method overridden to stop compaction data.
2023-11-24 15:44:34 +01:00
Botond Dénes
0ae1335daa Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk"
This reverts commit 11cafd2fc8, reversing
changes made to 2bae14f743.

Reverting because this series causes frequent CI failures, and the
proposed quickfix causes other failures of its own.

Fixes: #16113
2023-11-22 17:44:07 +02:00
Aleksandra Martyniuk
a63a6dcd93 test: test running compaction task abort
Test whether a task which is aborted while running has a proper status.
2023-11-13 16:06:36 +01:00
Aleksandra Martyniuk
599d6ebd52 compaction: abort task manager compaction tasks
Set top level compaction tasks as abortable.

Compaction tasks which have no children, i.e. compaction task
executors, have abort method overridden to stop compaction data.
2023-11-13 15:46:58 +01:00
Benny Halevy
68a7bbe582 compaction_manager: perform_cleanup: ignore condition_variable_timed_out
The polling loop was intended to ignore
`condition_variable_timed_out` and check for progress
using a longer `max_idle_duration` timeout in the loop.

Fixes #15669

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#15671
2023-11-12 13:53:51 +02:00
Botond Dénes
1cccc86813 Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk"
This reverts commit 2860d43309, reversing
changes made to a3621dbd3e.

Reverting because rest_api.test_compaction_task started failing after
this was merged.

Fixes: #16005
2023-11-09 10:43:11 +01:00
Aleksandra Martyniuk
520d9db92d test: test running compaction task abort
Test whether a task which is aborted while running has a proper status.
2023-10-19 10:47:20 +02:00
Aleksandra Martyniuk
0681795417 compaction: abort task manager compaction tasks
Set top level compaction tasks as abortable.

Compaction tasks which have no children, i.e. compaction task
executors, have abort method overridden to stop compaction data.
2023-10-19 10:47:17 +02:00
Aleksandra Martyniuk
198119f737 compaction: add get_progress method to compaction_task_impl
compaction_task_impl::get_progress is used by the lowest level
compaction tasks whose progress can be taken from
compaction_progress_monitor.
2023-10-12 17:16:05 +02:00
Aleksandra Martyniuk
3553556708 compaction: keep compaction_progress_monitor in compaction_task_executor
Keep compaction_progress_monitor in compaction_task_executor and pass a reference
to it further, so that the compaction progress could be retrieved out of it.
2023-10-12 17:03:46 +02:00
Aleksandra Martyniuk
f42be12f43 repair: release resources of shard_repair_task_impl
Before integration with task manager the state of one shard repair
was kept in repair_info. repair_info object was destroyed immediately
after shard repair was finished.

In an integration process repair_info's fields were moved to
shard_repair_task_impl as the two served similar purposes.
Though, shard_repair_task_impl isn't immediately destroyed, but is
kept in task manager for task_ttl seconds after it's complete.
Thus, some of repair_info's fields have their lifetime prolonged,
which makes the repair state change delayed.

Release shard_repair_task_impl resources immediately after shard
repair is finished.

Fixes: #15505.

Closes scylladb/scylladb#15506
2023-09-26 17:09:47 +03:00
Botond Dénes
d5f095d5a4 Merge 'Make interaction of compaction strategy with sstable runs more robust and efficient' from Raphael "Raph" Carvalho
SSTable runs work hard to keep the disjointness invariant, therefore they're
expensive to build from scratch.
For every insertion, it keeps the elements sorted by their first key in
order to reject insertion of element that would introduce overlapping.

Additionally, a sstable run can grow to dozens of elements (or hundreds)
therefore, we can also make interaction with compaction strategies more
efficient by not copying them when building a list of candidates in compaction
manager. And less fragile by filtering out any sstable runs that are not
completely eligible for compaction.

Previously, ICS had to give up on using runs managed by sstable set due to
fragility of the interface (meaning runs are being built from scratch
on every call to the strategy, which is very inefficient, but that had to
be done for correctness), but now we can restore that.

Closes scylladb/scylladb#15440

* github.com:scylladb/scylladb:
  compaction: Switch to strategy_control::candidates() for regular compaction
  tests: Prepare sstable_compaction_test for change in compaction_strategy interface
  compaction: Allow strategy to retrieve candidates either as sstables or runs
  compaction: Make get_candidates() work with frozen_sstable_run too
  sstables: add sstable_run::run_identifier()
  sstables: tag sstable_run::insert() with nodiscard
  sstables: Make all_sstable_runs() more efficient by exposing frozen shared runs
  sstables: Simplify sstable_set interface to retrieve runs
2023-09-26 14:56:05 +03:00
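The disjointness invariant mentioned in d5f095d5a4 (elements kept sorted by first key, overlapping insertions rejected) can be sketched as below. Keys are plain ints and each sstable is reduced to an inclusive `[first, last]` range; `sstable_run_sketch` is a hypothetical name, and the `[[nodiscard]]` on `insert()` mirrors the commit titled "tag sstable_run::insert() with nodiscard".

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Hypothetical run of disjoint key ranges, ordered by first key.
class sstable_run_sketch {
    std::map<int, int> _by_first_key; // first key -> last key (inclusive)
public:
    // Returns false (and inserts nothing) if [first, last] would overlap
    // an existing element.
    [[nodiscard]] bool insert(int first, int last) {
        if (first > last) {
            return false;
        }
        // Successor: the first element starting at or after `first`.
        auto next = _by_first_key.lower_bound(first);
        if (next != _by_first_key.end() && next->first <= last) {
            return false;
        }
        // Predecessor: the element starting before `first`, if any.
        if (next != _by_first_key.begin()) {
            auto prev = std::prev(next);
            if (prev->second >= first) {
                return false;
            }
        }
        _by_first_key.emplace(first, last);
        return true;
    }

    std::size_t size() const { return _by_first_key.size(); }
};
```

Each insertion only inspects its two neighbours, but building a run from scratch still pays an ordered-map insertion per element, which is consistent with the commit's point that rebuilding runs on every strategy call is wasteful.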
Aleksandra Martyniuk
d799adc536 tasks: change task_manager::task::impl::is_internal()
Most of the time only the roots of the task tree should be non-internal.

Change default implementation of is_internal and delete overrides
consistent with it.

Closes scylladb/scylladb#15353
2023-09-26 14:49:49 +03:00
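One plausible reading of d799adc536 is that a task is internal by default exactly when it has a parent, so only subclasses that deviate need an override. The sketch below encodes that assumption; `task_sketch` and `user_visible_child` are hypothetical types, and the actual default in task_manager may be expressed differently.

```cpp
#include <cassert>

// Hypothetical task node; `parent == nullptr` marks a root of the task tree.
struct task_sketch {
    const task_sketch* parent = nullptr;
    virtual ~task_sketch() = default;

    // Assumed default: every non-root task is internal.
    virtual bool is_internal() const { return parent != nullptr; }
};

// A child that should still be shown to users overrides the default.
struct user_visible_child : task_sketch {
    bool is_internal() const override { return false; }
};
```

With such a default, overrides that merely restate "children are internal" become redundant and can be deleted.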
Raphael S. Carvalho
8997fe0625 compaction: Switch to strategy_control::candidates() for regular compaction
Now everything is prepared for the switch, let's do it.

Now let's wait for ICS to enjoy the set of changes.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
02f1f24f27 compaction: Allow strategy to retrieve candidates either as sstables or runs
That's needed for upcoming changes that will allow ICS to efficiently
retrieve sstable runs.

Next patch will remove candidates from compaction_strategy's interface
to retrieve candidates using this one instead.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
ff8510445d compaction: Make get_candidates() work with frozen_sstable_run too
This is done in preparation for ICS to retrieve candidates as
sstable runs.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Avi Kivity
61440d20c3 Merge 'Enable incremental compaction on off-strategy' from Raphael "Raph" Carvalho
Off-strategy suffers from a 100% space overhead, as it adopted
a sort of all or nothing approach. Meaning all input sstables,
living in maintenance set, are kept alive until they're all
reshaped according to the strategy criteria.

Input sstables in off-strategy are very likely to be mostly disjoint,
so it can greatly benefit from incremental compaction.

The incremental compaction approach is not only good for
decreasing disk usage, but also memory usage (as metadata of
input and output live in memory), and file desc count, which
takes memory away from OS.

Turns out that this approach also greatly simplifies the
off-strategy impl in compaction manager, as it no longer has
to maintain new unused sstables and mark them for
deletion on failure, and also unlink intermediary sstables
used between reshape rounds.

Fixes https://github.com/scylladb/scylladb/issues/14992.

Closes scylladb/scylladb#15400

* github.com:scylladb/scylladb:
  test: Verify that off-strategy can do incremental compaction
  compaction: Clear pending_replacement list when tombstone GC is disabled
  compaction: Enable incremental compaction on off-strategy
  compaction: Extend reshape type to allow for incremental compaction
  compaction: Move reshape_compaction in the source
  compaction: Enable incremental compaction only if replacer callback is engaged
2023-09-21 20:12:19 +03:00
Raphael S. Carvalho
42050f13a0 compaction: Enable incremental compaction on off-strategy
Off-strategy suffers from a 100% space overhead, as it adopted
a sort of all or nothing approach. Meaning all input sstables,
living in maintenance set, are kept alive until they're all
reshaped according to the strategy criteria.

Input sstables in off-strategy are very likely to be mostly disjoint,
so it can greatly benefit from incremental compaction.

The incremental compaction approach is not only good for
decreasing disk usage, but also memory usage (as metadata of
input and output live in memory), and file desc count, which
takes memory away from OS.

Turns out that this approach also greatly simplifies the
off-strategy impl in compaction manager, as it no longer has
to maintain new unused sstables and mark them for
deletion on failure, and also unlink intermediary sstables
used between reshape rounds.

Fixes #14992.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-21 11:15:46 -03:00
Botond Dénes
a56a4b6226 Merge 'compaction_backlog_tracker: do not allow moving registered trackers' from Benny Halevy
Currently, the moved-object's manager pointer is moved into the
constructed object, but without fixing the registration to
point to the moved-to object, causing #15248.

Although we could properly move the registration from
the moved-from object to the moved-to one, it is simpler
to just disallow moving a registered tracker, since it's
not needed anywhere. This way we just don't need to mess
with the trackers' registration.

The move-assignment operator has a similar problem,
therefore it is deleted in this series, and the function is
renamed to `transfer_backlog` that just doesn't deal with the
moved-from registration.  This is safe since it's only used internally
by the compaction manager.

Fixes #15248

Closes scylladb/scylladb#15445

* github.com:scylladb/scylladb:
  compaction_state: store backlog_track in std::optional
  compaction_backlog_tracker: do not allow moving registered trackers
2023-09-20 16:41:10 +03:00
Benny Halevy
7ca91d719c compaction_state: store backlog_track in std::optional
So that replacing it will destroy the previous tracker
and unregister it before assigning the new one and
then registering it.

This is safer than assigning it in place.

With that, the move assignment operator is no longer
used and can be deleted.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-19 13:59:54 +03:00
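The ordering that 7ca91d719c describes (destroy and unregister the old tracker, then construct and register the new one) falls out of `std::optional::emplace`. The sketch below uses hypothetical `tracker_sketch`, `manager_sketch` and `compaction_state_sketch` types to show it, with moves deleted as in 4ad4b632b8.

```cpp
#include <cassert>
#include <optional>
#include <set>

class tracker_sketch;

// Hypothetical registry of live trackers.
struct manager_sketch {
    std::set<const tracker_sketch*> registered;
};

// Registers itself on construction and unregisters on destruction.
// Moving is disallowed, so a registration can never point at a
// moved-from object.
class tracker_sketch {
    manager_sketch& _mgr;
public:
    explicit tracker_sketch(manager_sketch& m) : _mgr(m) {
        _mgr.registered.insert(this);
    }
    ~tracker_sketch() { _mgr.registered.erase(this); }
    tracker_sketch(tracker_sketch&&) = delete;
    tracker_sketch& operator=(tracker_sketch&&) = delete;
};

struct compaction_state_sketch {
    std::optional<tracker_sketch> backlog_tracker;

    // emplace() destroys the current tracker (unregistering it) before
    // constructing the new one in place (registering it).
    void replace_tracker(manager_sketch& m) { backlog_tracker.emplace(m); }
};
```

Since `emplace()` constructs in place, the tracker needs neither a move constructor nor a move assignment operator.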
Benny Halevy
4ad4b632b8 compaction_backlog_tracker: do not allow moving registered trackers
Currently, the moved-object's manager pointer is moved into the
constructed object, but without fixing the registration to
point to the moved-to object, causing #15248.

Although we could properly move the registration from
the moved-from object to the moved-to one, it is simpler
to just disallow moving a registered tracker, since it's
not needed anywhere. This way we just don't need to mess
with the trackers' registration.

With that in mind, when move-assigning a compaction_backlog_tracker
the existing tracker can remain registered.

Fixes scylladb/scylladb#15248

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-19 13:24:36 +03:00
Aleksandra Martyniuk
59b7a45f73 compaction: do not run stopped compaction
Before compaction_task_executor::do_run is called, the executor can
be already aborted. Check if compaction was stopped and set
_compaction_done to exceptional future.
2023-09-09 11:19:11 +02:00
Aleksandra Martyniuk
515b8d4890 compaction: modify lowest compaction tasks' run method
For compaction_task_executors, unlike for all other task manager
tasks, the run method does not encompass the operations performed in
the scope of a task, but only waits until shared_future connected with
the operations is resolved.

Apart from breaking task manager task conventions, such a run method
must consider all corner cases, not to break task manager or
compaction manager functionality.

To fix existing and prevent further bugs related to task manager
and compaction manager coexistence, call perform_task inside
run method and wait for it in a standard way.

Executors that are not going to be reflected in task manager run call
perform_task the old way.
2023-09-09 11:19:11 +02:00
Aleksandra Martyniuk
832df38d26 compaction: pass do_throw_if_stopping to compaction_task_executor
As a preparation for further changes, keep do_throw_if_stopping flag
as a member of compaction_task_executor.
2023-09-09 11:19:11 +02:00
Benny Halevy
cfecb68245 compaction_manager: stop: close compaction_state:s gates
Make sure the compaction_state:s are idle before
they are destroyed. Although all tasks are stopped
in stop_ongoing_compactions, make sure there is
no fiber holding the compaction_state gate.

compaction_manager::remove now needs to close the
compaction_state gate and to stop_ongoing_compactions
only if the gate is not closed yet.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
Benny Halevy
96055414c7 compaction_manager: gracefully handle gate close
Check if the compaction_state gate is closed
along with _state != state::enabled and return early
in this case.

At this point entering the gate is guaranteed to succeed.
So enter the gate before calling `perform_compaction`
keeping the std::optional<gate_holder> throughout
the compaction task.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
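The pattern from 96055414c7, returning early when the gate is closed and otherwise keeping an optional holder alive for the duration of the task, can be sketched with a toy single-threaded gate. `gate_sketch` below is not the Seastar gate: its `close()` does not wait for holders to drain, and `maybe_run_compaction` is a hypothetical function.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Toy gate: counts holders and refuses new ones once closed.
class gate_sketch {
    int _count = 0;
    bool _closed = false;
public:
    // RAII holder: the gate stays entered while a holder is alive.
    class holder {
        gate_sketch* _g;
    public:
        explicit holder(gate_sketch& g) : _g(&g) { ++_g->_count; }
        holder(holder&& o) noexcept : _g(std::exchange(o._g, nullptr)) {}
        holder(const holder&) = delete;
        holder& operator=(const holder&) = delete;
        holder& operator=(holder&&) = delete;
        ~holder() {
            if (_g) {
                --_g->_count;
            }
        }
    };

    bool is_closed() const { return _closed; }
    void close() { _closed = true; }
    int holders() const { return _count; }

    // Returns an engaged optional only if the gate is still open.
    std::optional<holder> try_hold() {
        if (_closed) {
            return std::nullopt;
        }
        return holder(*this);
    }
};

// Returns false without doing any work if compaction is disabled or the
// gate is closed; otherwise holds the gate for the whole (elided) task.
bool maybe_run_compaction(gate_sketch& g, bool enabled) {
    if (!enabled || g.is_closed()) {
        return false;
    }
    std::optional<gate_sketch::holder> h = g.try_hold();
    // ... the compaction task would run here while `h` is alive ...
    return h.has_value();
}
```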
Aleksandra Martyniuk
5e31ca7d20 tasks: api: show tasks' scopes
To make manual analysis of task manager tasks easier, task_status
and task_stats contain operation scope (e.g. shard, table).

Closes #15172
2023-08-29 11:32:16 +03:00
Aleksandra Martyniuk
e0ce711e4f compaction: do not swallow compaction_stopped_exception for reshape
Loop in shard_reshaping_compaction_task_impl::run relies on whether
sstables::compaction_stopped_exception is thrown from run_custom_job.
The exception is swallowed for each type of compaction
in compaction_manager::perform_task.

Rethrow the exception in perform_task for reshape compaction.

Fixes: #15058.

Closes #15067
2023-08-21 12:41:55 +03:00
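The behaviour change in e0ce711e4f can be sketched as a wrapper that swallows the stop exception for most compaction types but rethrows it for reshape. The enum, the exception type and `perform_task_sketch` below are simplified stand-ins, not the real signatures.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

enum class compaction_type { regular, reshape };

// Simplified stand-in for sstables::compaction_stopped_exception.
struct compaction_stopped_exception : std::runtime_error {
    compaction_stopped_exception() : std::runtime_error("compaction stopped") {}
};

// Returns true if the job completed, false if it was stopped and the
// exception was swallowed. For reshape the exception propagates, because
// the caller's loop uses it to detect that it must stop.
template <typename Job>
bool perform_task_sketch(compaction_type type, Job&& job) {
    try {
        std::forward<Job>(job)();
        return true;
    } catch (const compaction_stopped_exception&) {
        if (type == compaction_type::reshape) {
            throw;
        }
        return false;
    }
}
```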
Aleksandra Martyniuk
e9d94894f1 compaction: release resources of compaction executors
Before compaction task executors started inheriting from
compaction_task_impl, they were destructed immediately after
compaction finished. Destructors of executors and their
fields performed actions that affected global structures and
statistics and had impact on compaction process.

Currently, task executors are kept in memory much longer, as they
are tracked by the task manager. Thus, destructors are not called just
after the compaction, which results in compaction stats not being
updated, which causes e.g. infinite cleanup loop.

Add release_resources() method which is called at the end
of compaction process and does what destructors used to.

Fixes: #14966.
Fixes: #15030.

Closes #15005
2023-08-16 15:51:17 +03:00
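The problem and fix in e9d94894f1 can be sketched as follows: when an object outlives its work because something else keeps it alive, side effects that lived in the destructor must move to an explicit method called at the end of the work. `stats_sketch`, `executor_sketch` and `task_manager_sketch` are hypothetical, and the single counter stands in for the global structures and statistics the commit mentions.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical global statistics.
struct stats_sketch {
    int active = 0;
};

class executor_sketch {
    stats_sketch& _stats;
    bool _released = false;
public:
    explicit executor_sketch(stats_sketch& s) : _stats(s) { ++_stats.active; }

    // Idempotent: does what the destructor used to do, exactly once.
    void release_resources() noexcept {
        if (!_released) {
            --_stats.active;
            _released = true;
        }
    }

    ~executor_sketch() { release_resources(); }
};

// Keeps finished executors alive, as a task manager would for some TTL.
struct task_manager_sketch {
    std::vector<std::shared_ptr<executor_sketch>> finished;
};

void run_compaction(stats_sketch& s, task_manager_sketch& tm) {
    auto ex = std::make_shared<executor_sketch>(s);
    // ... the compaction work would run here ...
    ex->release_resources();           // stats updated immediately
    tm.finished.push_back(std::move(ex)); // object itself lives on
}
```

Keeping the destructor as a fallback means an executor that is never tracked still cleans up, while the explicit call ensures statistics do not wait for the tracked object to expire.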
Benny Halevy
9f77a32805 compaction_manager: run_offstrategy_compaction: retrieve owned_ranges from compaction_state
perform_offstrategy is called from try_perform_cleanup
when there are sstables in the maintenance set that require
cleanup.

The input sstables are inserted into the compaction_state
`sstables_requiring_cleanup` and `try_perform_cleanup`
expects offstrategy compaction to clean them up along
with reshape compaction.

Otherwise, the maintenance sstables that require cleanup
are not cleaned up by cleanup compaction, since
the reshape output sstable(s) are not analyzed again
after reshape compaction, where that would insert
the output sstable(s) into `sstables_requiring_cleanup`
and trigger their cleanup in the subsequent cleanup compaction.

The latter method is viable too, but it is less efficient
since we can do reshape+cleanup in one pass, vs.
reshape first and cleanup later.

Fixes scylladb/scylladb#15041

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15043
2023-08-14 18:37:34 +03:00
Aleksandra Martyniuk
7a28cc60ec compaction: ignore future explicitly
discard_result ignores only successful futures. Thus, if
perform_compaction<regular_compaction_task_executor> call fails,
a failure is considered abandoned, causing tests to fail.

Explicitly ignore failed future.

Fixes: #14971.

Closes #15000
2023-08-14 16:41:15 +03:00
Aleksandra Martyniuk
9ec43fd3a7 compaction: update comment in compaction_manager::submit
Closes #15023
2023-08-14 09:34:56 +03:00