Commit Graph

77 Commits

Author SHA1 Message Date
Botond Dénes
86ed627fc4 compaction: move code to namespace compaction
The namespace usage in this directory is very inconsistent, with files
and classes scattered in:
* global namespace
* namespace compaction
* namespace sstables

With cases, where all three used in the same file. This code used to
live in sstables/ and some of it still retains namespace sstables as a
heritage of that time. The mismatch between the dir (future module) and
the namespace used is confusing, so finish the migration and move all
code in compaction/ to namespace compaction too.

This patch, although large, is mechanic and only the following kind of
changes are made:
* replace namespace sstable {} with namespace compaction {}
* add namespace compaction {}
* drop/add sstables::
* drop/add compaction::
* move around forward-declarations so they are in the correct namespace
  context

This refactoring revealed some awkward leftover coupling between
sstables and compaction, in sstables/sstable_set.cc, where the
make_sstable_set() methods of compaction strategies are implemented.
2025-09-25 15:03:56 +03:00
Botond Dénes
6116f9e11b Merge 'Compaction tasks progress' from Aleksandra Martyniuk
Determine the progress of compaction tasks that have
children.

The progress of a compaction task is calculated using the default
get_progress method. If the expected_total_workload method is
implemented, the default progress is computed as:
(sum of child task progresses) / (expected total workload)

If expected_total_workload is not defined, progress is estimated based
on children progresses. However, in this case, the total progress may
increase over time as the task executes.

All compaction tasks, except for reshape tasks, implement the
expected_children_number method. To compute expected_total_workload,
iterate over all SSTables covered by the task and sum their sizes. Note
that expected_total_workload is just an approximation and the real workload
may differ if SStables set for the keyspace/table/compaction group changes.

Reshape tasks are an exception, as their scope is determined during
execution. Hence, for these tasks expected_total_workload isn't defined
and their progress (both total and completed) is determined based
on currently created children.

Fixes: https://github.com/scylladb/scylladb/issues/8392.
Fixes: https://github.com/scylladb/scylladb/issues/6406.
Fixes: https://github.com/scylladb/scylladb/issues/7845.

New feature, no backport needed

Closes scylladb/scylladb#15158

* github.com:scylladb/scylladb:
  test: add compaction task progress test
  compaction: set progress unit for compaction tasks
  compaction: find expected workload for reshard tasks
  compaction: find expected workload for global cleanup compaction tasks
  compaction: find expected workload for global major compaction tasks
  compaction: find expected workload for keyspace compaction tasks
  compaction: find expected workload for shard compaction tasks
  compaction: find expected workload for table compaction tasks
  compaction: return empty progress when compaction_size isn't set
  compaction: update compaction_data::compaction_size at once
  tasks: do not check expected workload for done task
2025-09-03 13:23:42 +03:00
Aleksandra Martyniuk
7fe1ad1f63 tasks: return task::impl from make_and_start_task
Currently, make_and_start_task returns a pointer to task_manager::task
that hides the implementation details. If we need to access
the implementation (e.g. because we want a task to "return" a value),
we need to make and start task step by step openly.

Return task_manager::task::impl from make_and_start_task. Use it
where possible.

Fixes: https://github.com/scylladb/scylladb/issues/22146.
2025-08-29 17:12:07 +02:00
Aleksandra Martyniuk
0844a057d1 compaction: use current_task_type 2025-08-29 17:08:00 +02:00
Aleksandra Martyniuk
f3ed852115 compaction: set progress unit for compaction tasks 2025-08-28 12:10:13 +02:00
Aleksandra Martyniuk
046742bd18 compaction: find expected workload for reshard tasks
Find expected workload in bytes of reshard tasks.

The workload of table_resharding_compaction_task_impl is found
at the beginning of its execution. Before that, expected_total_workload()
returns std::nullopt, which means that the progress for this task
won't be shown.
2025-08-28 12:10:13 +02:00
Aleksandra Martyniuk
a2381380f2 compaction: find expected workload for global cleanup compaction tasks
Sum bytes of all sstables of all non local vnode keyspaces.
2025-08-28 12:10:13 +02:00
Aleksandra Martyniuk
cc9e342cb6 compaction: find expected workload for global major compaction tasks
Sum bytes of all sstables of all keyspaces.
2025-08-28 12:10:13 +02:00
Aleksandra Martyniuk
b12fc50de7 compaction: find expected workload for keyspace compaction tasks
Add compaction_task_impl::get_keyspace_task_workload that sums
the bytes in all sstables of this keyspace.

This function is used to find the expected workload of the following
keyspace compaction types:
- major;
- cleanup;
- offstrategy;
- upgrade_sstables;
- scrub.
2025-08-28 12:10:07 +02:00
Aleksandra Martyniuk
82e241eb00 compaction: find expected workload for shard compaction tasks
Add compaction_task_impl::get_shard_task_workload that sums
the bytes in all sstables of this keyspace on this shard.

This function is used to find the expected workload of the following
shard compaction types:
- major;
- cleanup;
- offstrategy;
- upgrade_sstables;
- scrub.
2025-08-28 12:04:16 +02:00
Aleksandra Martyniuk
926753e8bf compaction: find expected workload for table compaction tasks
Add compaction_task_impl::get_table_task_workload that sums
the bytes in all sstables in the table.

This function is used to find the expected workload of the following
compaction types:
- major;
- cleanup;
- offstrategy;
- upgrade_sstables;
- scrub.
2025-08-28 10:41:22 +02:00
Raphael S. Carvalho
2c4a9ba70c treewide: Rename table_state to compaction_group_view
Since table_state is a view to a compaction group, it makes sense
to rename it as so.

With upcoming incremental repair, each replica::compaction_group
will be actually two compaction groups, so there will be two
views for each replica::compaction_group.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:51:28 +03:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Aleksandra Martyniuk
292d00463a tasks: add task_manager::task::is_user_task method 2024-11-25 14:21:53 +01:00
Lakshmi Narayanan Sreethar
84d06a13c7 api: compaction: add consider_only_existing_data option
Added a new parameter `consider_only_existing_data` to major compaction
API endpoints. When enabled, major compaction will:

- Force-flush all tables.
- Force a new active segment in the commit log.
- Compact all existing SSTables and garbage-collect tombstones by only
  checking the SSTables being compacted. Memtables, commit logs, and
  other SSTables not part of the compaction will not be checked, as they
  will only contain newer data that arrived after the compaction
  started.

The `consider_only_existing_data` is passed down to the compaction
descriptor's `gc_check_only_compacting_sstables` option to ensure that
only the existing data is considered for garbage collection.

The option is also passed to the `maybe_flush_commitlog` method to make
sure all the tables are flushed and a new active segment is created in
the commit log.

Fixes #19728

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-09-05 17:25:45 +05:30
Kefu Chai
e87b64b7bb compaction: not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-07-02 14:06:42 +08:00
Raphael S. Carvalho
b8bd4c51c2 replica: don't expose compaction_group to reshape task
compaction_group sits in replica layer and compaction layer is
supposed to talk to it through compaction::table_state only.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2024-06-13 12:43:14 -03:00
Aleksandra Martyniuk
cf9913b0b7 compaction: pass compaction group id to reshape_compaction_group
Pass compaction group id to
shard_reshaping_compaction_task_impl::reshape_compaction_group.
Modify table::as_table_state to return table_state of the given
compaction group.
2024-05-10 14:56:38 +02:00
Lakshmi Narayanan Sreethar
83fecc2f1f compaction: reshape sstables within compaction groups
For tables using tablet based replication strategies, the sstables
should be reshaped only within the compaction groups they belong to.
Updated shard_reshaping_compaction_task_impl to group the sstables based
on their compaction groups before reshaping them within the groups.

Fixes #16966

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-02-23 18:43:39 +05:30
Kefu Chai
5e0b3671d3 storage_service: fall back to local cleanup in cleanup_all
before this change, if no keyspaces are specified,
scylla-nodetool just enumerate all non-local keyspaces, and
call "/storage_service/keyspace_cleanup" on them one after another.
this is not quite efficient, as each this RESTful API call
force a new active commitlog segment, and flushes all tables.
so, if the target node of this command has N non-local keyspaces,
it would repeat the steps above for N times. this is not necessary.
and after a topology change, we would like to run a global
"nodetool cleanup" without specifying the keyspace, so this
is a typical use case which we do care about.

to address this performance issue, in this change, we improve
an existing RESTful API call "/storage_service/cleanup_all", so
if the topology coordinator is not enabled, we fall back to
a local cleanup to cleanup all non-local keyspaces.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-01 11:25:53 +08:00
Kefu Chai
4f90a875f6 compaction: format flush_mode without the helper
since flush_mode is moved out of major_compaction_task_impl, let's
drop the helper hosted in that class as well, and implement the
formatter witout it.

please note, the `__builtin_unreachable()` is dropped. it should
not change the behavior of the formatter. we don't put it in the
`default` branch in hope that `-Wswitch` can warn us in the case
when another enum of `flush_mode` is added, but we fail to handle
it somehow.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-01 11:25:53 +08:00
Kefu Chai
b39cc01bb3 compaction_manager: flush all tables before cleanup
according to the document "nodetool cleanup"

> Triggers removal of data that the node no longer owns

currently, scylla performs cleanup by rewriting the sstables. but
commitlog segments may still contain the mutations to the tables
which are dropped during sstable rewriting. when scylla server
restarts, the dirty mutations are replayed to the memtable. if
any of these dirty mutations changes the tables cleaned up. the
stale data are reapplied. this would lead to data resurrection.

so, in this change we following the same model of major compaction:

1. force new active segment,
2. flush all tables
3. perform cleanup using compaction, which rewrites the sstables
   of specified tables

because we already `flush()` all tables in
`cleanup_keyspace_compaction_task_impl::run()`, there is no need to
call `flush()` again, in `table::perform_cleanup_compaction()`, so
the `flush()` call is dropped in this function, and the tests using
this function are updated to call `flush()` manually to preserve
the existing behavior.

there are two callers of `cleanup_keyspace_compaction_task_impl`,

* one is `storage_service::sstable_cleanup_fiber()`, which listens
  for the events fired by topology_state_machine, which is in turn
  driven by, for instance, "/storage_service/cleanup_all" API.
  which cleanup all keyspaces in one after another.
* another is "/storage_service/keyspace_cleanup", which cleans up
  the specified keyspace.

in the first use case, we can force a new active segment for a single
time, so another parameter to the ctor of
`cleanup_keyspace_compaction_task_impl` is introduced to specify if
the `db.flush_all_tables()` call should be skiped.

please note, there are two possible optimizations,

1. force new active segment only if the mutations in it touches the
   tables being cleaned up
2. after forcing new active segment, only flush the (mem)tables
   mutated by the non-active segments

but let's leave them for following-up changes. this change is a
minimal fix for data resurrection issue.

Fixes #16757
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-01 11:25:53 +08:00
Kefu Chai
9afec2e3e7 api, compaction: promote flush_mode
so that this enum type can be shared by other task(s) as well.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-01 11:25:53 +08:00
Aleksandra Martyniuk
6b87778ef2 compaction: make regular compaction tasks internal
Regular compaction tasks are internal.

Adjust test_compaction_task accordingly: modify test_regular_compaction_task,
delete test_running_compaction_task_abort (relying on regular compaction)
which checks are already achived by test_not_created_compaction_task_abort.
Rename the latter.
2024-01-09 13:13:54 +01:00
Aleksandra Martyniuk
ceec5577d8 api: compaction: pass pointer to top level compaction tasks
As a preparation for asynchronous compaction api, from which we
cannot take values by reference, top level compaction tasks get
pointers which need to be set to nullptr when they are not needed
(like in async api).
2023-12-11 11:36:10 +01:00
Benny Halevy
b12b142232 api: add /storage_service/compact
For major compacting all tables in the database.
The advantage of this api is that `commitlog->force_new_active_segment`
happens only once in `database::flush_all_tables` rather than
once per keyspace (when `nodetool compact` translates to
a sequence of `/storage_service/keyspace_compaction` calls).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-11-28 16:37:42 +02:00
Benny Halevy
66ba983fe0 compaction_manager: flush_all_tables before major compaction
Major compaction already flushes each table to make
sure it considers any mutations that are present in the
memtable for the purpose of tombstone purging.
See 64ec1c6ec6

However, tombstone purging may be inhibited by data
in commitlog segments based on `gc_time_min` in the
`tombstone_gc_state` (See f42eb4d1ce).

Flushing all sstables in the database release
all references to commitlog segments and there
it maximizes the potential for tombstone purging,
which is typically the reason for running major compaction.

However, flushing all tables too frequently might
result in tiny sstables.  Since when flushing all
keyspaces using `nodetool flush` the `force_keyspace_compaction`
api is invoked for keyspace successively, we need a mechanism
to prevent too frequent flushes by major compaction.

Hence a `compaction_flush_all_tables_before_major_seconds` interval
configuration option is added (defaults to 24 hours).

In the case that not all tables are flushed prior
to major compaction, we revert to the old behavior of
flushing each table in the keyspace before major-compacting it.

Fixes scylladb/scylladb#15777

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-11-28 16:37:42 +02:00
Benny Halevy
1fd85bd37b api: compaction: add flush_memtables option
When flushing is done externally, e.g. by running
`nodetool flush` prior to `nodetool compact`,
flush_memtables=false can be passed to skip flushing
of tables right before they are major-compacted.

This is useful to prevent creation of small sstables
due to excessive memtable flushing.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-11-28 16:37:42 +02:00
Aleksandra Martyniuk
aa7bba2d8b compaction: abort task manager compaction tasks
Set top level compaction tasks as abortable.

Compaction tasks which have no children, i.e. compaction task
executors, have abort method overriden to stop compaction data.
2023-11-24 15:44:34 +01:00
Botond Dénes
0ae1335daa Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk"
This reverts commit 11cafd2fc8, reversing
changes made to 2bae14f743.

Reverting because this series causes frequent CI failures, and the
proposed quickfix causes other failures of its own.

Fixes: #16113
2023-11-22 17:44:07 +02:00
Aleksandra Martyniuk
599d6ebd52 compaction: abort task manager compaction tasks
Set top level compaction tasks as abortable.

Compaction tasks which have no children, i.e. compaction task
executors, have abort method overriden to stop compaction data.
2023-11-13 15:46:58 +01:00
Botond Dénes
1cccc86813 Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk"
This reverts commit 2860d43309, reversing
changes made to a3621dbd3e.

Reverting because rest_api.test_compaction_task started failing after
this was merged.

Fixes: #16005
2023-11-09 10:43:11 +01:00
Aleksandra Martyniuk
0681795417 compaction: abort task manager compaction tasks
Set top level compaction tasks as abortable.

Compaction tasks which have no children, i.e. compaction task
executors, have abort method overriden to stop compaction data.
2023-10-19 10:47:17 +02:00
Aleksandra Martyniuk
198119f737 compaction: add get_progress method to compaction_task_impl
compaction_task_impl::get_progress is used by the lowest level
compaction tasks which progress can be taken from
compaction_progress_monitor.
2023-10-12 17:16:05 +02:00
Aleksandra Martyniuk
d799adc536 tasks: change task_manager::task::impl::is_internal()
Most of the time only the roots of tasks tree should be non internal.

Change default implementation of is_internal and delete overrides
consistent with it.

Closes scylladb/scylladb#15353
2023-09-26 14:49:49 +03:00
Aleksandra Martyniuk
5e31ca7d20 tasks: api: show tasks' scopes
To make manual analysis of task manager tasks easier, task_status
and task_stats contain operation scope (e.g. shard, table).

Closes #15172
2023-08-29 11:32:16 +03:00
Aleksandra Martyniuk
1decf86d71 compaction: change sstables compaction tasks type 2023-07-28 10:51:55 +02:00
Aleksandra Martyniuk
59b838688b compaction: move table_upgrade_sstables_compaction_task_impl
Move table_upgrade_sstables_compaction_task_impl so that the related
classes where placed next to each other.
2023-07-28 10:51:55 +02:00
Aleksandra Martyniuk
77dcdd743e compaction: add shard_reshard_sstables_compaction_task_impl
Add task manager's task covering resharding compaction on one shard.
2023-07-19 17:19:10 +02:00
Aleksandra Martyniuk
7a7e287d8c compaction: add reshard_sstables_compaction_task_impl
Add task manager's task covering resharding compaction.

A struct and some functions are moved from replica/distributed_loader.cc
to compaction/task_manager_module.cc.
2023-07-19 17:15:40 +02:00
Aleksandra Martyniuk
e486f4eba6 compaction: create resharding_compaction_task_impl
resharding_compaction_task_impl serves as a base class of all
concrete resharding compaction task classes.
2023-07-19 10:41:35 +02:00
Aleksandra Martyniuk
9fdd130943 compaction: add regular_compaction_task_impl
regular_compaction_task_impl serves as a base class of all
concrete regular compaction task classes.
2023-07-17 15:54:33 +02:00
Michał Chojnowski
b511d57fc8 Revert "Merge 'Compaction resharding tasks' from Aleksandra Martyniuk"
This reverts commit 2a58b4a39a, reversing
changes made to dd63169077.

After patch 87c8d63b7a,
table_resharding_compaction_task_impl::run() performs the forbidden
action of copying a lw_shared_ptr (_owned_ranges_ptr) on a remote shard,
which is a data race that can cause a use-after-free, typically manifesting
as allocator corruption.

Note: before the bad patch, this was avoided by copying the _contents_ of the
lw_shared_ptr into a new, local lw_shared_ptr.

Fixes #14475
Fixes #14618

Closes #14641
2023-07-11 19:11:37 +03:00
Aleksandra Martyniuk
87c8d63b7a compaction: add shard_reshard_sstables_compaction_task_impl
Add task manager's task covering resharding compaction on one shard.
2023-06-28 11:43:12 +02:00
Aleksandra Martyniuk
837d77ba8c compaction: add reshard_sstables_compaction_task_impl
Add task manager's task covering resharding compaction.
2023-06-28 11:41:43 +02:00
Aleksandra Martyniuk
2b4874bbf7 compaction: create resharding_compaction_task_impl
resharding_compaction_task_impl serves as a base class of all
concrete resharding compaction task classes.
2023-06-28 11:36:53 +02:00
Botond Dénes
b23361977b Merge 'Compaction reshape tasks' from Aleksandra Martyniuk
Task manager's tasks covering resharding compaction
on top and shard level.

Closes #14112

* github.com:scylladb/scylladb:
  test: extend test_compaction_task.py to test reshaping compaction
  compaction: move reshape function to shard_reshaping_table_compaction_task_impl::run()
  compaction: add shard_reshaping_compaction_task_impl
  replica: delete unused function
  compaction: add table_reshaping_compaction_task_impl
  compaction: copy reshape to task_manager_module.cc
  compaction: add reshaping_compaction_task_impl
2023-06-26 11:56:07 +03:00
Aleksandra Martyniuk
197635b44b compaction: delete generation of new sequence number for table tasks
Compaction tasks covering table major, cleanup, offstrategy,
and upgrade sstables compaction inherit sequence number from their
parents. Thus they do not need to have a new sequence number
generated as it will be overwritten anyway.

Closes #14379
2023-06-26 10:36:10 +03:00
Aleksandra Martyniuk
1960904a72 compaction: add shard_reshaping_compaction_task_impl
shard_reshaping_compaction_task_impl covers reshaping compaction
on one shard.
2023-06-23 16:22:38 +02:00
Aleksandra Martyniuk
e3e2d6b886 compaction: add table_reshaping_compaction_task_impl 2023-06-23 15:57:37 +02:00