Commit Graph

255 Commits

Author SHA1 Message Date
Avi Kivity
66173c06a3 Merge 'Eradicate the ability to create new sstables with numerical sstable generation' from Benny Halevy
Remove support for generating numerical sstable generation for new sstables.
Loading such sstables is still supported but new sstables are always created with a uuid generation.
This is possible since:
* All live versions (since 5.4 / f014ccf369) now support uuid sstable generations.
* The `uuid_sstable_identifiers_enabled` config option (that is unused from version 2025.2 / 6da758d74c) controls only the use of uuid generations when creating new sstables. SSTables with uuid generations should still be properly loaded by older versions, even if `uuid_sstable_identifiers_enabled` is set to `false`.

Fixes #24248

* Enhancement, no backport needed

Closes scylladb/scylladb#24512

* github.com:scylladb/scylladb:
  streaming: stream_blob: use the table sstable_generation_generator
  replica: distributed_loader: process_upload_dir: use the table sstable_generation_generator
  sstables: sstable_generation_generator: stop tracking highest generation
  replica: table: get rid of update_sstables_known_generation
  sstables: sstable_directory: stop tracking highest_generation
  replica: distributed_loader: stop tracking highest_generation
  sstables: sstable_generation: get rid of uuid_identifiers bool class
  sstables_manager: drop uuid_sstable_identifiers
  feature_service: move UUID_SSTABLE_IDENTIFIERS to supported_feature_set
  test: cql_query_test: add test_sstable_load_mixed_generation_type
  test: sstable_datafile_test: move copy_directory helper to test/lib/test_utils
  test: database_test: move table_dir helper to test/lib/test_utils
2025-08-14 11:54:33 +03:00
Benny Halevy
de8a199f79 replica: distributed_loader: process_upload_dir: use the table sstable_generation_generator
No need to start a local sharded generator.
Can just use the table's sstable generation generator
to make new sstables now that it's stateless and doesn't
depend on the highest generation found (including the uploaded
sstables).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Benny Halevy
b01524c5a3 replica: distributed_loader: stop tracking highest_generation
It is not needed anymore as we always generate
uuid generations.

Move highest_generation_seen(sharded<sstables::sstable_directory>& directory)
to sstables/sstable_directory module.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Benny Halevy
6cc964ef16 sstables: sstable_generation: get rid of uuid_identifiers bool class
Now that all call sites enable uuid_identifiers.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Benny Halevy
43ee9c0593 sstables_manager: drop uuid_sstable_identifiers
It is returning constant sstables::uuid_identifiers::yes now,
so let the callers just use the constant (to be dropped
in a following patch).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Benny Halevy
bd62421c05 keyspace: rename get_vnode_effective_replication_map
to get_static_effective_replication_map, in preparation
for separating local_effective_replication_map from
vnode_effective_replication_map (both are per-keyspace).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-06 13:40:43 +03:00
Robert Bindar
ca1a9c8d01 Add support for nodetool refresh --skip-reshape
This patch adds the new option in nodetool, patches the
load_new_ss_tables REST request with a new parameter and
skips the reshape step in refresh if this flag is passed.

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>

Closes scylladb/scylladb#24409
Fixes: #24365
2025-06-10 12:52:13 +03:00
Raphael S. Carvalho
2d716f3ffe replica: Fix truncate assert failure
Truncate doesn't really go well with concurrent writes. The fix (#23560) exposed
a preexisting fragility which I missed.

1) truncate gets RP mark X, truncated_at = second T
2) new sstable written during snapshot or later, also at second T (difference of MS)
3) discard_sstables() get RP Y > saved RP X, since creation time of sstable
with RP Y is equal to truncated_at = second T.

So the problem is that truncate is using a clock of second granularity for
filtering out sstables written later, and after we got low mark and truncate time,
it can happen that a sstable is flushed later within the same second, but at a
different millisecond.
By switching to a millisecond clock (db_clock), we allow sstables written later
within the same second from being filtered out. It's not perfect but
extremely unlikely a new write lands and get flushed in the same
millisecond we recorded truncated_at timepoint. In practice, truncate
will not be used concurrently to writes, so this should be enough for
our tests performing such concurrent actions.
We're moving away from gc_clock which is our cheap lowres_clock, but
time is only retrieved when creating sstable objects, which frequency of
creation is low enough for not having significant consequences, and also
db_clock should be cheap enough since it's usually syscall-less.

Fixes #23771.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#24426
2025-06-08 15:59:15 +03:00
Pavel Emelyanov
ed3ce0f6af distributed_loader: Don't create owned ranges if skip-cleanup is true
In order to make reshard compaction task run cleanup, the owner-ranges
pointer is passed to it. If it's nullptr, the cleanup is not performed.
So to do the skip-cleanup, the easiest (but not the most apparent) way
is not to initialize the pointer and keep it nullptr.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-05-13 16:52:15 +03:00
Pavel Emelyanov
4ab049ac8d code: Push bool skip_cleanup flag around
Just put the boolean into the callstack between API and distributed
loader to reduce the churn in the next patches. No functional changes,
flag is false and unused.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-05-13 16:51:21 +03:00
Pavel Emelyanov
b5a124f60c sstable_directory: Move highest_generation_seen() to distributed_loader.cc
This method is only used by the loader code (and tests). Also, There's the
highest_version_seen() peer that sits in the loader code either.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#23324
2025-04-01 09:15:14 +03:00
Kefu Chai
7ff0d7ba98 tree: Remove unused boost headers
This commit eliminates unused boost header includes from the tree.

Removing these unnecessary includes reduces dependencies on the
external Boost.Adapters library, leading to faster compile times
and a slightly cleaner codebase.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22857
2025-02-15 20:32:22 +02:00
Pavel Emelyanov
bb094cc099 Merge 'Make restore task abortable' from Calle Wilund
Fixes #20717

Enables abortable interface and propagates abort_source to all s3 objects used for reading the restore data.

Note: because restore is done on each shard, we have to maintain a per-shard abort source proxy for each, and do a background per-shard abort on abort call. This is synced at the end of "run()".

Abort source is added as an optional parameter to s3 storage and the s3 path in distributed loader.

There is no attempt to "clean up" an aborted restore. As we read on a mutation level from remote sstables, we should not cause incomplete sstables as such, even though we might end up of course with partial data restored.

Closes scylladb/scylladb#21567

* github.com:scylladb/scylladb:
  test_backup: Add restore abort test case
  sstables_loader: Make restore task abortable
  distributed_loader: Add optional abort_source to get_sstables_from_object_store
  s3_storage: Add optional abort_source to params/object
  s3::client: Make "readable_file" abortable
2024-12-19 12:23:33 +03:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Calle Wilund
6a2a18a2fc distributed_loader: Add optional abort_source to get_sstables_from_object_store 2024-12-02 12:30:24 +00:00
Kefu Chai
7bc0b64f0a replica: correct indentation after coroutinizing make_sstables_available
The previous commit (b3ebbf35e2) transformed `make_sstables_available()`
into a coroutine but left behind incorrectly indented statements
from a nested lambda. This commit restores proper indentation.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21660
2024-11-25 09:58:01 +03:00
Kefu Chai
24d14b601b treewide: s/boost::adaptors::map_values/std::views::values/
now that we are allowed to use C++23. we now have the luxury of using
`std::views::values`.

in this change, we:

- replace `boost::adaptors::map_values` with `std::views::values`
- update affected code to work with `std::views::values`
- the places where we use `boost::join()` are not changed, because
  we cannot use `std::views::concat` yet. this helper is only
  available in C++26.

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21265
2024-10-27 21:32:45 +02:00
Pavel Emelyanov
dedb9d349c sstables: Generate table::all_datadirs from db::config and storage_options
As mentioned in the previous patch, there are several places that need
to scan all datafile directories for a given table. This list is
currently stored on table.config.all_datadirs, this patch stops using
one and instead generates it from db::config::data_file_directories and
table's storage options.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-21 15:13:27 +03:00
Pavel Emelyanov
0358515118 replica: Prepare vector of fs::path-s with table dirs
Most of the time table with local storage keeps its sstables in a single
directory referenced by its storage_options::local.dir path. However,
there are two cases when code needs to check all datafile directories
that could be configured -- on boot when distributed loader loads
sstables, and when checking table snapshots.

Both those places check table.cfg.all_datadirs vector of strings and
convert strings to fs::path-s along the way. This patch prepares the
vector of fs::path-s in advance and updates the loop code to work with
path-s.

This is preparation to next patching that will generate vector of paths
for a table.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-21 15:10:53 +03:00
Benny Halevy
d34878e96c view: check_needs_view_update_path: get token_metadata_ptr
check_needs_view_update_path is async and might yield
so the token_metadata reference passed to it must be kept
alive throughout the call.

Fixes scylladb/scylladb#20979

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#20980
2024-10-09 20:56:21 +03:00
Pavel Emelyanov
aa0c20a0e7 distributed_loader: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
792c0060c7 distributed_loader: Use correct datadir to collect local sstable
Current code uses datadir it gets from table itself, which is the 0th
element in the all-datadirs config. So populating local sstables happens
several times from the same directory. Fix it by starting sstable
directory with correct datadir -- the one obtained from the all-datadirs
loop.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
bf654f45bd distributed_loader: Move all-datadirs loop to local storage collecting
It now happens in the outer loop, but it's not correct for S3 storage,
which is thus asked to collect its data twice. Also it's broken for
local storage as well, because the datadir argument is ignored. Next
patch will fix it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
fe779ab1a2 distributed_loader: Collect table subdirs based on its storage options
Collecting sstables for local storage and for S3 storage differs. First,
the populator collects sstables for each datadir configured in
scylla.yaml, but S3 storage doesn't care, so it's effectively asked to
collect the same data twice. Second, S3 collector code uses
sstable_directory simply because that class is used by reshape and
reshard code, but in fact collecting of S3 sstable can be made much
simpler (but that's for later).

Having said that, split preparation of sstables population for local
and S3 storage types.

Indentation is deliberately left broken for local storage collecting
mathod. That's because otherwise next patch will need move it back
anyway.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
ee91cae5b9 distributed_loader: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
6ae486100c distributed_loader: Squash loop of collect_subdir into one method
This prepares the gound for the next patch.
Indentation is left broken.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
4db2929afd distributed_loader: Convert map of directories into a vector
Knowledge of sstable state is no longer needed in the table_populator
start/stop methods, so the map<state, directory> can be converted into
vector<directory>.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
89e9231653 distributed_loader: Make start_subdir() method work with directory
Similarly to populate_subdir() one, it also accepts state and gets
directory out of it. Patch is the same way -- caller now passes it the
reference to directory and doesn't care about the state (in fact, the
start_subdir() doesn't care of the state either).

While at it -- rename the method to reflect what it does.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
999ec88765 distributed_loader: Drop local reference variable
Cleanup after previous patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
50024ef62a distributed_loader: Split start_subdir()
It does two things -- starts sstable_directory and prepares it. Split it
accordingly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
abdc0bb02d distributed_loader: Remove allow-offstrategy argument
This is to make populate_subdir() be self-contained in a way it uses
passed sstable_directory and make caller not care about the state.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
3b583b9d9f distributed_loader: Make populate() method work with directory
The populate_subdir() accepts sstable_state argument and picks the
corresponding sstable_directory object from the map. Patch it so that
caller passes it the sstable_directory reference. For now it makes
things more complicated, but next patches will simplify it back.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
051ac3a737 distributed_loader: Remove check for sstable_directory presense
In the old days the set of sstable_directory-s used by populator could
skip some of them. Now they are all present and the checks is always
false.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
4752a504cd distributed_loader: Out-line table_populator() methods
To make further patching with less indentation level.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
617b0e3ce3 distributed_loader: Print storage options, not datadir
Tables not necessarily have data in a directory, so it's more correct to
show storage options in logs, not some directory path.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
31b2271f07 distributed_loader: Print prepared message
When population throws, the catch block prepares a message to re-throw
another exception and prints the same message into logs. Presumably the
intent was to print the prepared message as well.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-07 12:04:23 +03:00
Pavel Emelyanov
6b480589fe Merge 'treewide: accept list of sstables in "restore" API ' from Kefu Chai
before this change, we enumerate the sstables tracked by the
system.sstables table, and restore them when serving
requests to "storage_service/restore" API. this works fine with
"storage_service/backup" API. but this "restore" API cannot be
used as a drop-in replacement of the rclone based API currently
used by scylla-manager.

in order to fill the gap, in this change:

* add the "prefix" parameter for specifying the shared prefix of
  sstables
* add the "sstables" parameter for specifying the list of  TOC
  components of sstables
* remove the "snapshot" parameter, as we don't encode the prefix
  on scylla's end anymore.
* make the "table" parameter mandatory.

Fixes https://github.com/scylladb/scylladb/issues/20461

----

this change is a part of the efforts to bring the native backup/restore to scylla, no need to backprt.

Closes scylladb/scylladb#20685

* github.com:scylladb/scylladb:
  treewide: accept list of sstables in "restore" API
  sstable: pass get_storage_option to sstable_directory::load_sstable()
  test/nodetool: add body parameter to `expected_request`
  tools/scylla-nodetool: enable nodetool to write HTTP body
2024-10-04 12:38:08 +03:00
Kefu Chai
787ea4b1d4 treewide: accept list of sstables in "restore" API
before this change, we enumerate the sstables tracked by the
system.sstables table, and restore them when serving
requests to "storage_service/restore" API. this works fine with
"storage_service/backup" API. but this "restore" API cannot be
used as a drop-in replacement of the rclone based API currently
used by scylla-manager.

in order to fill the gap, in this change:

* add the "prefix" parameter for specifying the shared prefix of
  sstables
* add the "sstables" parameter for specifying the list of  TOC
  components of sstables
* remove the "snapshot" parameter, as we don't encode the prefix
  on scylla's end anymore.
* make the "table" parameter mandatory.

Fixes scylladb/scylladb#20461
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-01 23:24:56 +08:00
Pavel Emelyanov
7f71371de1 distributed_loader: Get token metadata from e.r.m., not database
Though database can be used to get relevant token metadata, it's better
not to use one service (database) as a proxy to get another one (token
metadata). In case of tokens, there's effective replication map at hand,
which is a more correct source of such topology information.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#20894
2024-10-01 14:59:35 +03:00
Pavel Emelyanov
350f64c38b distributed_loader: Print storage options, not datadir
When populating keyspace on boot the dist. loader prints a debugging
message with ks:cf names, state and the directory from where it picks
sstables. The last one is not extremely correct, as loading sstables
from S3 happens from a bucket, not directory. So it's better to print
the storage options, not the datadir string.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-09-19 13:06:39 +03:00
Botond Dénes
f32e67cb9e Merge 'Make sstables without on-disk path' from Pavel Emelyanov
New sstables for a table are created by the table::make_sstable() method. The method then calls sstables_manager::make_sstable() and passes there a path to component files which, in turn, sits on table::config. Since some time ago having an on-disk path for an sstable had become optional, as sstables could be put on S3 storage without local paths involved. In that case the aforementioned "path" is ~~ab~~used as a key in the system.sstables registry, that references a record with information used to retrieve URLs of sstables' objects.

This PR removes the "path" argument from sstables_manager::make_sstable() and its sstable_sdirectory peer. The details of sstables' location are moved onto storage_options and depend on storage type. For now in both storage types this location is still the good-old $datadir/$keyspace/$table-$uuid string. S3 storage needs to be patched more to use more elegant "location" value.

Eventually the `table::config::{datadir|all_datadirs}` will be removed, this PR is the step towards it.

closes: #12707

Closes scylladb/scylladb#20542

* github.com:scylladb/scylladb:
  table: Use storage options to clean the storage
  sstables/storage: Re-use ocally generated vector of paths
  sstables/storage: Visit options once to initialize storage
  sstables_manager: Return table storage options when initalizing storage
  sstables/storage: Fix indentation after previous patch
  table: Move datadirs initialization parallelism to storage level
  sstables/storage: Split the visitor's overloaded functor
  restore: Don't use table_dir to construct sstable_directory
  sstable_directory: Remove table_dir field
  sstable_directory: Use options details in lister
  sstables_manager: Remove table_dir from make_sstable()
  sstables: Remove table_dir from sstable constructor
  sstables/storage: Remove sstring dir from make_storage()
  sstables/storage: Use options to construct
  tests: Properly initialize storage options with "dir"
  distributed_loader: Create S3 options with prefix for restore
  storage_options: Add special-purpose local options maker
  storage_options: Keep local path / s3 prefix onboard
  table: Get another options when initializing storage
2024-09-17 09:41:21 +03:00
Pavel Emelyanov
30c8d89f97 restore: Don't use table_dir to construct sstable_directory
Continuation of the previous patch patching the special-purpose sstable
directory constructor that's used by restore-from-s3-backup code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-09-13 16:49:50 +03:00
Pavel Emelyanov
36863d4ad0 sstables_manager: Remove table_dir from make_sstable()
It used to be passed to sstable constructor, but now it doesn't need
this argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-09-13 16:49:50 +03:00
Pavel Emelyanov
33bc9e7112 distributed_loader: Create S3 options with prefix for restore
Restore-from-backup code wants to collect sstables from remote S3. For
that it constructs S3 options, and now it needs to put prefix on it as
well.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-09-13 16:32:39 +03:00
Benny Halevy
f47b5e60bc sstable_directory: create_pending_deletion_log: place pending_delete log under the base directory
To be able to atomically delete sstables both in
base table directory and in its sub-directories,
like `staging/`, use a shared pending_delete_dir
under under the base directory.

Note that this requires loading and processing
the base directory first.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-09-10 09:28:13 +03:00
Pavel Emelyanov
11a04bfb66 code: Introduce restore API method
The method starts a task that uses sstables_loader load-and-stream
functionality to bring new sstables into the cluster. The existing
load-and-stream picks up sstables from upload/ directory, the newly
introduced task collects them from S3 bucket and given prefix (that
correspond to the path where backup API method put them).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-28 15:42:49 +03:00
Pavel Emelyanov
6a006d2255 distributed_loader: Split get_sstables_from_upload_dir()
Next patches will need this method to initialize sstable_directory
differently and then do its regular processing. For that, split the
method into two, next patch will re-use the common part it needs.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-27 16:15:41 +03:00
Benny Halevy
686a8f2939 abstract_replication_strategy: make get_ranges async
To prevent stalls due to large number of tokens.
For example, large cluster with say 70 nodes can have
more than 16K tokens.

Fixes #19757

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-08-25 10:57:34 +03:00
Benny Halevy
2bbbe2a8bc database: get_keyspace_local_ranges: get vnode_effective_replication_map_ptr param
Prepare for making the function async.
Then, it will need to hold on to the erm while getting
the token_ranges asynchronously.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-08-25 10:55:33 +03:00
Pavel Emelyanov
66d72e010c distributed_loader: Lock table via global table ptr
The lock_table() method needs database, ks and cf to find the table on
all shards. The same can be achieved with the help of global_table_ptr
thing that all the core callers already have at hand.

There's a test that doesn't have global table, but it can get one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#20139
2024-08-14 20:53:21 +03:00