Commit Graph

1249 Commits

Author SHA1 Message Date
Patryk Jędrzejczak
da44d6af09 Merge 'Move some compaction manager API handlers from storage_service.cc to tasks.cc' from Pavel Emelyanov
There's a bunch of /storage_service/... endpoints that start compaction manager tasks and wait for them. Most of them have an async peer in /tasks/... that starts the very same task, but returns to the caller with the task ID.

This patch moves those handlers' code from storage_service.cc to tasks.cc, next to the corresponding async peers, to keep handlers that need compaction_manager in one place.

That's preparation for more future changes. Later all those endpoints will stop using database from http_context and will capture the compaction_manager they need from main, like it was done in #20962 for /compaction_manager/... endpoints. Even "more later", the former and the latter blocks of endpoints will be registered and unregistered together, e.g. like database endpoints were collected in one reg/unreg sequence by #25674.

Part of http_context dependencies cleanup effort, no need to backport.

Closes scylladb/scylladb#26140

* https://github.com/scylladb/scylladb:
  api: Move /storage_service/compact to tasks.cc
  api: Move /storage_service/keyspace_upgrade_sstables to tasks.cc
  api: Move /storage_service/keyspace_offstrategy_compaction to tasks.cc
  api: Move /storage_service/keyspace_cleanup to tasks.cc
  api: Move /storage_service/keyspace_compaction to tasks.cc
2025-09-23 15:08:48 +02:00
Michał Chojnowski
9e70df83ab db: get rid of sstables-format-selector
Our sstable format selection logic is weird, and hard to follow.

If I'm not misunderstanding, the pieces are:
1. There's the `sstable_format` config entry, which currently
   doesn't do anything, but in the past it used to disable
   cluster features for versions newer than the specified one.
2. There are deprecated and unused config entries for individual
   versions (`enable_sstables_mc_format`, `enable_sstables_md_format`,
   etc).
3. There is a cluster feature for each version:
   ME_SSTABLE_FORMAT, MD_SSTABLE_FORMAT, etc.
   (Currently all sstable version features have been grandfathered,
   and aren't checked by the code anymore).
4. There's an entry in `system.scylla_local` which contains the
   latest enabled sstable version. (Why? Isn't this directly derived
   from cluster features anyway?)
5. There's `sstable_manager::_format` which contains the
   sstable version to be used for new writes.
   This field is updated by `sstables_format_selector`
   based on cluster features and the `system.scylla_local` entry.

I don't see why those pieces are needed. Version selection has the
following constraints:
1. New sstables must be written with a format that supports existing
   data. For example, range tombstones with an infinite bound are only
   supported by sstables since version "mc". So if a range tombstone
   with an infinite bound exists somewhere in the dataset,
   the format chosen for new sstables has to be at least as new as "mc".
2. A new format might only be used after a corresponding cluster feature
   is enabled. (Otherwise new sstables might become unreadable if they
   are sent to another node, or if a node is downgraded).
3. The user should have a way to inhibit format upgrades if he wishes.

So far, constraint (1) has been fulfilled by never using formats older
than the newest format ever enabled on the node. (With an exception
for resharding and reshaping system tables).
Constraint (2) has been fulfilled by calling `sstable_manager::set_format`
only after the corresponding cluster feature is enabled.
Constraint (3) has been fulfilled by the ability to inhibit cluster
features by setting `sstable_format` to some fixed value.

The main thing I don't like about this whole setup is that it doesn't
let me downgrade the preferred sstable format. After a format is
enabled, there is no way to go back to writing the old format again.
That is no good -- after I make some performance-sensitive changes
in a new format, it might turn out to be a pessimization for the
particular workload, and I want to be able to go back.

This patch aims to give a way to downgrade formats without violating
the constraints. What it does is:
1. The entry in `system.scylla_local` becomes obsolete.
   After the patch we no longer update or read it.
   As far as I understand, the purpose of this entry is to prevent
   unwanted format downgrades (which is something cluster features
   are designed for) and it's updated if and only if relevant
   cluster features are updated. So there's no reason to have it,
   we can just directly use cluster features.
2. `sstables_format_selector` gets deleted.
   Without the `system.scylla_local` around, it's just a glorified
   feature listener.
3. The format selection logic is moved into `sstable_manager`.
   It already sees the `db::config` and the `gms::feature_service`.
   For the foreseeable future, the knowledge of enabled cluster features
   and current config should be enough information to pick the right formats.
4. The `sstable_format` entry in `db::config` is no longer intended to
   inhibit cluster features. Instead, it is intended to select the
   format for new sstables, and it becomes live-updatable.
5. Instead of writing new sstables with "highest supported" format,
   (which used to be set by `sstables_format_selector`) we write
   them with the "preferred" format, which is determined by
   `sstable_manager` based on the combination of enabled features
   and the current value of `sstable_format`.
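The selection rule in item 5 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `sstable_version` enum, the set of enabled features and the "cap the configured format by the newest enabled one" rule are hypothetical stand-ins, not the actual `sstable_manager` code.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <set>

// Hypothetical stand-in for the real sstable version type, ordered oldest to newest.
enum class sstable_version { mc, md, me };

// Newest format whose (hypothetical) cluster feature is enabled, if any.
inline std::optional<sstable_version> highest_enabled(const std::set<sstable_version>& enabled_features) {
    if (enabled_features.empty()) {
        return std::nullopt;
    }
    return *enabled_features.rbegin();
}

// Preferred format for new sstables: the configured value of `sstable_format`,
// capped by the newest format the cluster has enabled. Assumption: when no
// feature is enabled we fall back to the oldest format.
inline sstable_version preferred_format(const std::set<sstable_version>& enabled_features,
                                        sstable_version configured) {
    const auto highest = highest_enabled(enabled_features).value_or(sstable_version::mc);
    return std::min(configured, highest);
}
```

With all three features enabled and `md` configured, this sketch picks `md`, which is the kind of downgrade the old "highest supported" rule could not express.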

Closes scylladb/scylladb#26092

[avi: Pavel found the reason for the scylla_local entry -
      it predates stable storage for cluster features]
2025-09-19 16:17:56 +03:00
Pavel Emelyanov
d1626dfa86 api: Move /storage_service/compact to tasks.cc
This one doesn't have async peer there, but it's still a pure compaction
manager endpoint handler

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-19 13:23:59 +03:00
Pavel Emelyanov
6eaa2138ad api: Move /storage_service/keyspace_upgrade_sstables to tasks.cc
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-19 13:23:54 +03:00
Pavel Emelyanov
fe2a184713 api: Move /storage_service/keyspace_offstrategy_compaction to tasks.cc
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-19 13:23:49 +03:00
Pavel Emelyanov
607a39acbd api: Move /storage_service/keyspace_cleanup to tasks.cc
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-19 13:23:44 +03:00
Pavel Emelyanov
abd23bdd6d api: Move /storage_service/keyspace_compaction to tasks.cc
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-19 13:23:37 +03:00
Pavel Emelyanov
a1ea553fe1 code: Replace distributed<> with sharded<>
The latter is recommended in seastar, and the former was left as
compatibility alias. Latest seastar explicitly marks it as deprecated so
once the submodule is updated, compilation logs will explode.

Most of the patch is generated with

    for f in $(git grep -l '\<distributed<[A-Za-z0-9:_]*>') ; do sed -e 's/\<distributed<\([A-Za-z0-9:_]*\)>/sharded<\1>/g' -i $f; done
    for f in $(git grep -l distributed.hh); do sed -e 's/distributed.hh/sharded.hh/' -i $f ; done

and a small manual change in test/perf/perf.hh
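
For readers who want to see what the first sed expression matches, here is a rough C++ equivalent using `std::regex`. The helper name `replace_distributed` is made up for this illustration, and `\b` is used in place of sed's `\<` word-start anchor.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrites distributed<T> to sharded<T> when T consists only of identifier
// characters and "::", mirroring the character class used in the sed command.
inline std::string replace_distributed(const std::string& line) {
    static const std::regex pattern(R"(\bdistributed<([A-Za-z0-9:_]*)>)");
    return std::regex_replace(line, pattern, "sharded<$1>");
}
```

Template arguments that themselves contain `<`, such as `distributed<std::vector<int>>`, are not matched by this pattern, which is one plausible reason a manual follow-up change was still needed.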

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#26136
2025-09-19 12:22:51 +02:00
Michael Litvak
eeaa64ca0e storage_service: improve error message on repair of colocated tables
Currently, repair requests can't be added or deleted on non-base
colocated tables. Improve the error message and comments to be clearer
and more detailed.
2025-09-18 09:35:53 +02:00
Pavel Emelyanov
d69a51f42a compaction: Use function when filtering compaction tasks for stopping
The compaction_manager::stop_compaction() method internally walks the
list of tasks and compares each task's compacting_table (which is
compaction group view pointer) with the given one. In case this
stop_compaction() method is called via API for a specific table, the
method walks the list of tasks for every compaction group from the
table, thus resulting in nr_groups * nr_tasks complexity. Not terrible,
but not nice either.

The proposal is to pass filtering function into the inner
do_stop_ongoing_compactions() method. Some users will pass a simple
"return true" lambda, but those that need to stop compactions for a
specific table (e.g. the API handler) will effectively walk the
list of tasks once comparing the given compaction group's schema with
the target table one (spoiler: eventually this place will also be
simplified not to mess with replica::table at all).

One ugliness with the change is the way "scope" for logging message is
collected. If all tasks belong to the same table, then "for table ..."
is printed in logs. With the change the scope is no longer known
instantly and is evaluated dynamically while walking the list of tasks.
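
A minimal sketch of the idea, assuming simplified types: `compaction_task`, `stop_result` and `stop_ongoing` are hypothetical names, and the real code works on compaction group views rather than table-name strings.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified task record.
struct compaction_task {
    std::string table;    // table that the task's compaction group belongs to
    bool stopped = false;
};

struct stop_result {
    int stopped = 0;
    // Set only when every stopped task belongs to the same table.
    std::optional<std::string> scope;
};

// Walks the task list once, stops the tasks accepted by `filter`, and works
// out the logging scope while walking instead of knowing it up front.
inline stop_result stop_ongoing(std::vector<compaction_task>& tasks,
                                const std::function<bool(const compaction_task&)>& filter) {
    stop_result res;
    bool mixed = false;
    for (auto& t : tasks) {
        if (!filter(t)) {
            continue;
        }
        t.stopped = true;
        ++res.stopped;
        if (res.stopped == 1) {
            res.scope = t.table;
        } else if (res.scope != t.table) {
            mixed = true;
        }
    }
    if (mixed) {
        res.scope.reset();
    }
    return res;
}
```

A caller that wants to stop everything passes a lambda returning true; a caller interested in one table passes a lambda comparing `t.table` with the target, so the list is walked once either way.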

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25846
2025-09-16 23:40:47 +03:00
Aleksandra Martyniuk
55fde70f8d api: tasks: task_manager: keep children identities in chunked_{array,vector}
task_status contains a vector of children identities. If the number
of children is large, we may hit oversized allocation.

Change all types of children-related containers to chunked_vector.
Modify the children type returned from task manager API.
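
To illustrate why a chunked container avoids one oversized allocation, here is a toy version. `tiny_chunked_vector` is a made-up class for this sketch and is not the container used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy chunked storage: elements live in fixed-capacity chunks, so no single
// allocation grows with the total number of elements (only the much smaller
// vector of chunk pointers does).
template <typename T, std::size_t ChunkSize = 1024>
class tiny_chunked_vector {
    std::vector<std::unique_ptr<std::vector<T>>> _chunks;
    std::size_t _size = 0;
public:
    void push_back(T value) {
        if (_size % ChunkSize == 0) {
            // Start a new chunk instead of reallocating one big buffer.
            _chunks.push_back(std::make_unique<std::vector<T>>());
            _chunks.back()->reserve(ChunkSize);
        }
        _chunks.back()->push_back(std::move(value));
        ++_size;
    }
    const T& operator[](std::size_t i) const {
        return (*_chunks[i / ChunkSize])[i % ChunkSize];
    }
    std::size_t size() const { return _size; }
    std::size_t chunk_count() const { return _chunks.size(); }
};
```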

Fixes: scylladb#25795.

Closes scylladb/scylladb#25923
2025-09-15 08:44:16 +03:00
Pavel Emelyanov
88a01308e7 api: Move /storage_service/keyspaces handler to database module
The handler uses database service, not storage_service, and should
belong to the corresponding API module from column_family.cc

Once moved, the handler can use captured sharded<database> reference and
forget about http_context::db.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25834
2025-09-10 17:01:11 +02:00
Asias He
cb7db47ae1 repair: Add incremental_mode option for tablet repair
This patch introduces a new `incremental_mode` parameter to the tablet
repair REST API, providing more fine-grained control over the
incremental repair process.

Previously, incremental repair was on and could not be turned off. This
change allows users to select from three distinct modes:

- `regular`: This is the default mode. It performs a standard
  incremental repair, processing only unrepaired sstables and skipping
  those that are already repaired. The repair state (`repaired_at`,
  `sstables_repaired_at`) is updated.

- `full`: This mode forces the repair to process all sstables, including
  those that have been previously repaired. This is useful when a full
  data validation is needed without disabling the incremental repair
  feature. The repair state is updated.

- `disabled`: This mode completely disables the incremental repair logic
  for the current repair operation. It behaves like a classic
  (pre-incremental) repair, and it does not update any incremental
  repair state (`repaired_at` in sstables or `sstables_repaired_at` in
  the system.tablets table).
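
The three modes can be summarized in a small sketch. The names `parse_incremental_mode`, `repair_plan` and `plan_for` are hypothetical, as is the choice to treat an empty string as "parameter not given"; the behaviour per mode follows the description above.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

enum class incremental_mode { regular, full, disabled };

struct repair_plan {
    bool include_repaired_sstables;  // also process sstables already marked repaired
    bool update_repair_state;        // update repaired_at / sstables_repaired_at
};

// Maps the request parameter to a mode; returns nullopt for unknown values.
// Assumption: an empty value means the parameter was omitted, so the default applies.
inline std::optional<incremental_mode> parse_incremental_mode(std::string_view s) {
    if (s.empty() || s == "regular") { return incremental_mode::regular; }
    if (s == "full") { return incremental_mode::full; }
    if (s == "disabled") { return incremental_mode::disabled; }
    return std::nullopt;
}

inline repair_plan plan_for(incremental_mode m) {
    switch (m) {
    case incremental_mode::regular:  return {false, true};
    case incremental_mode::full:     return {true, true};
    case incremental_mode::disabled: return {true, false};
    }
    return {false, true};  // unreachable for valid enum values
}
```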

The implementation includes:

- Adding the `incremental_mode` parameter to the
  `/storage_service/repair/tablet` API endpoint.
- Updating the internal repair logic to handle the different modes.
- Adding a new test case to verify the behavior of each mode.
- Updating the API documentation and developer documentation.

Fixes #25605

Closes scylladb/scylladb#25693
2025-09-09 06:50:21 +03:00
Pavel Emelyanov
b86b4fc251 api: Simplify parse_scrub_options() helper
It no longer needs to be a coroutine, nor does it need the snapshot_ctl
reference argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-03 19:06:31 +03:00
Pavel Emelyanov
ee4197fa80 api: Take snapshot after parsing scrub options
Parsing scrub options may throw after a snapshot is taken, thus leaving
it on disk even though the operation is reported as "failed". Probably
not critical, but not nice either.
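
The ordering issue can be shown with a small sketch. Everything here is hypothetical (the `mode` parameter, its accepted values, and the `handle_scrub` helper); the point is only that validation runs before the side effect.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

struct scrub_options {
    std::string mode;
};

// Throws std::invalid_argument on an unknown mode (made-up values for the sketch).
inline scrub_options parse_scrub_options(const std::map<std::string, std::string>& params) {
    static const std::set<std::string> known_modes{"abort", "skip", "validate"};
    auto it = params.find("mode");
    std::string mode = (it == params.end()) ? std::string("abort") : it->second;
    if (known_modes.count(mode) == 0) {
        throw std::invalid_argument("unknown scrub mode: " + mode);
    }
    return scrub_options{mode};
}

// Parse first, snapshot second: a parse failure leaves `snapshots` untouched.
inline scrub_options handle_scrub(const std::map<std::string, std::string>& params,
                                  std::vector<std::string>& snapshots) {
    scrub_options opts = parse_scrub_options(params);
    snapshots.push_back("pre-scrub");
    return opts;
}
```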

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-03 19:05:50 +03:00
Pavel Emelyanov
c0808c90b0 api: Use validate_table() helper in /storage_service/tokens_endpoint handler
The handler validates if the given ks:cf pair exists in the database,
then finds the table id to process further. There's a helper that does
both.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25669
2025-09-03 11:44:50 +03:00
Pavel Emelyanov
b5610050a1 api: Make GET/storage_service/drain handler work on storage service
POSTing on the same URL launches storage_service::drain(), so GETing on
it should (not that it's restricted somehow, but still) work on the same
service. This change removes one more user of http_context::database,
which in turn will allow removing the database reference from the context
eventually.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25677
2025-09-03 11:40:39 +03:00
Radosław Cybulski
c242234552 Revert "build: add precompiled headers to CMakeLists.txt"
This reverts commit 01bb7b629a.

Closes scylladb/scylladb#25735
2025-09-03 09:46:00 +03:00
Piotr Dulikowski
78ef334333 Merge 'Move "cache" API endpoints registration closer to column_family ones ' from Pavel Emelyanov
These two "blocks" of endpoints have different URL prefixes, but work with the same "service", which is sharded<replica::database>. The latter block had already been fixed to carry the sharded<database>& around (#25467), now it's the "cache" turn. However, since these endpoints also work with the database, there's no need for dedicated top-level set/unset machinery (similarly, gossiper has two API set/unset blocks that come together, see #19425); it's enough to just set/unset them next to each other.

Ongoing http_context dependency cleanup, no need to backport

Closes scylladb/scylladb#25674

* github.com:scylladb/scylladb:
  api: Capture and use db in cache_service handlers
  api: Add sharded<database>& arg to set_cache_service()
  api: Squash (un)set_cache_service into ..._column_family
  api: Coroutinize set_server_column_family()
2025-09-02 13:59:02 +02:00
Pavel Emelyanov
840cdab627 api: Move /load and /metrics/load handlers code to column_family.cc
Both handlers need database to proceed and thus need to be registered
(and unregistered) in a group that captures database for its handlers.

Once moved, the used get_cf_stats() method can be marked local to
column_family.cc file.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25671
2025-09-01 08:11:00 +02:00
Radosław Cybulski
01bb7b629a build: add precompiled headers to CMakeLists.txt
Add precompiled header support to CMakeLists.txt and configure.py -
it improves compilation time by approximately 10%.

New header `stdafx.hh` is added, don't include it manually -
the compiler will include it for you. The header contains includes from
external libraries used by Scylla - seastar, standard library,
linux headers and zlib.

The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER`
or configure.py --disable-precompiled-header to disable.

The feature should be disabled when checking headers, otherwise
you might get false negatives on missing includes from seastar, abseil and so on.

Note: following configuration needs to be added to ccache.conf:

    sloppiness = pch_defines,time_macros

Closes #25182
2025-08-27 21:37:54 +03:00
Pavel Emelyanov
67b63768e4 api: Capture and use db in cache_service handlers
Now the sharded<database>& argument is there, so it can replace ctx one
on handlers lambdas.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-26 11:50:11 +03:00
Pavel Emelyanov
596d4640ff api: Add sharded<database>& arg to set_cache_service()
The reference is already available in set_server_column_family(), pass
it further so that "cache" handlers are able to use it (next patch).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-26 11:49:35 +03:00
Pavel Emelyanov
4e556214ba api: Squash (un)set_cache_service into ..._column_family
The set_server_column_family() registers API handlers that work with
replica::database. The set_server_cache() does the very same thing, but
registers handlers with some other prefix. Squash the latter into
former, later "cache" handlers will also make use of the database
reference argument that's already available in ..._column_family()
setter.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-26 11:46:48 +03:00
Pavel Emelyanov
1b4b539706 api: Coroutinize set_server_column_family()
To facilitate the next patches

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-26 11:46:33 +03:00
Benny Halevy
45c496c276 api: storage_service: fix token_range documentation
Note that the token_range type is used only by describe_ring.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#25609
2025-08-22 10:06:21 +03:00
Pavel Emelyanov
2510a7b488 api/column_family: Capture sharded<database> to call get_cf_stats()
Update more handlers not to get the database from context, but to capture it
directly on handlers' lambdas.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 12:04:34 +03:00
Pavel Emelyanov
dc31b68451 api: Patch get_cf_stats to get sharded<database>& argument
Currently it accepts http context and immediately gets the database from
it to pass to map_reduce_cf. Callers are updated to pass the database from
the context they already have.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 12:03:27 +03:00
Pavel Emelyanov
7933a68921 api: Drop CF map-reducers ability to work with http context
This patch finalizes the change started by the previous patch of the
similar title -- the map_reduce_cf(_raw) is switched to work only with
sharded<replica::database> reference. All callers were updated by
previous patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:45:24 +03:00
Pavel Emelyanov
ffd25f0c16 api: Patch callers of map_reduce_cf(_raw)? to use sharded<database>
There are some of them left that still pass http_context. These handlers
will eventually get their captured sharded database reference, but for
now make them explicitly use one from context. This will allow
de-templatizing the map_reduce_cf... helpers, making the code simpler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:43:55 +03:00
Pavel Emelyanov
7e0726d55b api: Use captured sharded<database> reference in handlers
Not all of them can switch from ctx to database, so in a few places both
the database and ctx are captured. However, the ctx.db reference is no
longer used by the column_family handlers.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
720a8fef4b api/column_family: Make map_reduce_cf_time_histogram() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
49cb81fb56 api/column_family: Make sum_sstable() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
3595ea7f49 api/column_family: Make get_cf_unleveled_sstables() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
d32ac35f60 api/column_family: Make get_cf_stats_count() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
cde39d3fc7 api/column_family: Make get_cf_rate_and_histogram() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
edc9e302e3 api/column_family: Make get_cf_histogram() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
422debbee2 api/column_family: Make get_cf_stats_sum() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
1c1fabc578 api/column_family: Make set_tables_tombstone_gc() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
0158743f5e api/column_family: Make set_tables_autocompaction() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
f52eb7cae2 api/column_family: Make for_tables_on_all_shards() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
818a41ccdb api: Capture sharded<database> for set_server_column_family()
Similarly to other API handlers, instead of using a database from http
context, patch the setting methods to capture the database from main
code and pass it around to handlers.
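
The capture pattern can be sketched with simplified stand-ins. The `database`, `routes` and `handler` types, the URL, and the setter names below are hypothetical, chosen only to show handlers capturing a reference passed in from main instead of reaching into a shared context.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Stand-in for sharded<replica::database>.
struct database {
    int table_count = 0;
};

using handler = std::function<std::string()>;

// Stand-in for the HTTP routing table.
struct routes {
    std::map<std::string, handler> handlers;
};

// The setter receives the database from the caller and each handler captures
// it by reference, so no global context lookup is needed inside the handler.
inline void set_column_family(routes& r, database& db) {
    r.handlers["/column_family/count"] = [&db] { return std::to_string(db.table_count); };
}

// The matching unsetter removes the handlers before the database goes away.
inline void unset_column_family(routes& r) {
    r.handlers.erase("/column_family/count");
}
```

Keeping set and unset next to each other makes the lifetime rule explicit: the captured reference must outlive every registered handler.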

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
d3d217b3c9 api: Make CF map-reducers work on sharded<database> directly
Next patches are going to change a bunch of map_reduce_cf_... callers to
pass sharded<database> reference to it, not the http context. Not to
patch all the api/ code at once, keep the ability to call it with ctx at
hand. Eventually only the sharded<database>& overload will be kept.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:53 +03:00
Pavel Emelyanov
bdb7c2b014 api: Make map_reduce_cf_time_histogram() file-local
It's not used outside of api/column_family.cc

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:53 +03:00
Pavel Emelyanov
b0db83575c api: Remove unused ctx argument from run_toppartitions_query()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:53 +03:00
Raphael S. Carvalho
20c3301a1a treewide: Futurize estimation of pending compaction tasks
This is to allow futurization of compaction_group_view method that
retrieves sstable set.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:51:29 +03:00
Raphael S. Carvalho
2c4a9ba70c treewide: Rename table_state to compaction_group_view
Since table_state is a view to a compaction group, it makes sense
to rename it accordingly.

With upcoming incremental repair, each replica::compaction_group
will be actually two compaction groups, so there will be two
views for each replica::compaction_group.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:51:28 +03:00
Avi Kivity
8164f72f6e Merge 'Separate local_effective_replication_map from vnode_effective_replication_map' from Benny Halevy
Derive both vnode_effective_replication_map
and local_effective_replication_map from
static_effective_replication_map as both are static and per-keyspace.

However, local_effective_replication_map does not need vnodes
for the mapping of all tokens to the local node.

Refs #22733

* No backport required

Closes scylladb/scylladb#25222

* github.com:scylladb/scylladb:
  locator: abstract_replication_strategy: implement local_replication_strategy
  locator: vnode_effective_replication_map: convert clone_data_gently to clone_gently
  locator: abstract_replication_map: rename make_effective_replication_map
  locator: abstract_replication_map: rename calculate_effective_replication_map
  replica: database: keyspace: rename {create,update}_effective_replication_map
  locator: effective_replication_map_factory: rename create_effective_replication_map
  locator: abstract_replication_strategy: rename vnode_effective_replication_map_ptr et. al
  locator: abstract_replication_strategy: rename global_vnode_effective_replication_map
  keyspace: rename get_vnode_effective_replication_map
  dht: range_streamer: use naked e_r_m pointers
  storage_service: use naked e_r_m pointers
  alternator: ttl: use naked e_r_m pointers
  locator: abstract_replication_strategy: define is_local
2025-08-07 12:51:43 +03:00
Benny Halevy
bd62421c05 keyspace: rename get_vnode_effective_replication_map
to get_static_effective_replication_map, in preparation
for separating local_effective_replication_map from
vnode_effective_replication_map (both are per-keyspace).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-06 13:40:43 +03:00
Benny Halevy
ec85678de1 locator: abstract_replication_strategy: define is_local
Prepare for specializing the local replication strategy,
local effective replication map, et al. by defining
an is_local() predicate, similar to uses_tablets().

Note that is_vnode_based() still applies to local replication
strategy.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-06 13:34:23 +03:00