Commit Graph

88 Commits

Author SHA1 Message Date
Kefu Chai
0cb842797a treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-15 22:57:18 +02:00
Pavel Emelyanov
0c7efe38e1 distributed_loader: Rename table_population_metadata
It used to be just metadata by providing the meta for population, now it
does the population by itself, so rename it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-15 20:15:04 +03:00
Pavel Emelyanov
15926b22f4 distributed_loader: Dont check for directory presense twice
Both start_subdir() and populate_subdir() check for the directory to
exist with explicit file_exists() check. That's excessive, if the
directory wasn't there in the former call, the latter can get this by
checking the _sstable_directories map.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-15 20:15:04 +03:00
Pavel Emelyanov
eb477a13ad distributed_loader: Move populate calls into metadata.start()
This makes the metadata class export even shorter API, keeps the three
sub-directories scanned in one place and allows removing the zero-shard
assertion.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-15 20:15:04 +03:00
Pavel Emelyanov
123a82adb2 distributed_loader: Remove local aliases and exporters
After previous patch all local alias references in
populate_column_family() are no longer requires. Neither are the
exporting calls from the table_population_metadata class.

Some non-obvious change is capturing 'this' instead of 'global_table' on
calls that are cross-shard. That's OK, table_population_metadata is not
sharded<> and is designed for cross-shard usage too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-15 19:57:41 +03:00
Pavel Emelyanov
16fca3fa8a distributed_loader: Move populate_column_family() into population meta
This ownership change also requires the auto& = *this alias and extra
specification where to call reshard() and reshape() from.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-15 19:57:41 +03:00
Pavel Emelyanov
40de737b36 sstable_directory: Rename remove_input_sstables_from_reshaping()
It unlinks unshared sstables filtering some of them out. Name it
according to what it does without mentioning reshape/reshard.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-08 15:00:44 +03:00
Pavel Emelyanov
a1dc251214 sstable_directory: Make use of remove_sstables() helper
Currently it's called remove_input_sstables_from_resharding() but it's
just unlinks sstables in parallel from the given list. So rename it not
to mention reshard and also make use of this "new" helper in the
remove_input_sstables_from_reshaping(), it needs exactly the same
functionality.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-08 15:00:44 +03:00
Pavel Emelyanov
cb36f5e581 sstable_directory: Merge output sstables collecting methods
There are two of them collecting sstables from resharding and reshaping.
Both doing the same job except for the latter doesn't expect the list to
contain remote sstables.

This patch merges them together with the help of an extra sanity boolean
to check for the remote sstable not in the list. And renames the method
not to mention reshape/reshard.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-08 15:00:41 +03:00
Pavel Emelyanov
73d458cf89 distributed_loader: Remove max_compaction_threshold argument from reshard()
Since the whole reshard() is local to dist. loader code now, the caller
of the reshard helper may let this method get the threshold itself

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:31:43 +03:00
Pavel Emelyanov
25aaa45256 distributed_loader: Remove compaction_manager& argument from reshard()
It can be obtained from the table&

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:31:43 +03:00
Pavel Emelyanov
15547f1b5b sstable_directory: Move the .reshard() to distributed_loader
Now all the reshading logic is accumulated in distributed loader and the
sstable_directory is just the place where sstables are collected.

The changes summary is:
- add sstable_directory as argument (used to be "this")
- replace all "this" captures with &dir ones
- remove temporary namespace gap and declaration from sst-dir class

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:31:43 +03:00
Pavel Emelyanov
ab5f48d496 sstable_directory: Add helper to load foreign sstable
This is to generalize the code duplication between .reshard() and
existing .load_foreign_sstables() (plural form).

Make it coroutinized right at once.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:31:43 +03:00
Pavel Emelyanov
e6e65c87d5 sstable_directory: Add io-prio argument to .reshard()
Now it gets one from this-> but the method is becoming static one in
distributed_loader which only has it as an argument. That's not big deal
as the current IO class is going to be derived from current sched group,
so this extra arg will go away at all some day.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:31:41 +03:00
Pavel Emelyanov
a32d2b6d6a sstable_directory: Move reshard() to distributed_loader.cc
Just move the code and create temporary namespace gap for that

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:31:12 +03:00
Pavel Emelyanov
1de8c85acd distributed_loader: Remove compaction_manager& argument from reshape()
It can be obtained from the table&

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:31:12 +03:00
Pavel Emelyanov
d734b6b7c1 sstable_directory: Move the .reshape() to distributed loader
Now all the reshaping logic is accumulated in distributed loader and the
sstable_directory is just the place where sstables are collected.

The changes summary is:
- add sstable_directory as argument (used to be "this")
- replace all "this" captures with &dir ones
- remove temporary namespace gap and declaration from sst-dir class

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:30:55 +03:00
Pavel Emelyanov
b906d34807 sstable_directory: Add helper to retrive local sstables
There are methods to retrive shared local sstables and foreign sstables,
so here's one more to the family

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:23:40 +03:00
Pavel Emelyanov
420fc8d4df sstable_directory: Add io-prio argument to .reshape()
Now it gets one from this-> but the method is becoming static one in
distributed_loader which only has it as an argument. That's not big deal
as the current IO class is going to be derived from current sched group,
so this extra arg will go away at all some day.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:22:27 +03:00
Pavel Emelyanov
a70d6017f8 sstable_directory: Move reshape() to distributed_loader.cc
Just move the code and create temporary namespace gap for that

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-07 19:21:54 +03:00
Botond Dénes
84a69b6adb db/view/view_update_check: check_needs_view_update_path(): filter out non-member hosts
We currently don't clean up the system_distributed.view_build_status
table after removed nodes. This can cause false-positive check for
whether view update generation is needed for streaming.
The proper fix is to clean up this table, but that will be more
involved, it even when done, it might not be immediate. So until then
and to be on the safe side, filter out entries belonging to unknown
hosts from said table.

Fixes: #11905
Refs: #11836

Closes #11860
2023-01-27 17:12:45 +03:00
Pavel Emelyanov
bfddfb8927 distributed_loader: Add helpers to make sstables for reshape/reshard
This kills two birds with one stone. First, it factors out (quite a lot
of) common arguments that are passed to table.make_sstable(). Second, it
makes the helpers call sstable manager with extended args making it
possible to remove those wrappers from table class later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-01-26 10:47:39 +03:00
Kamil Braun
a483915c62 db: system_keyspace: add a virtual table with raft configuration
Add a new virtual table `system.raft_state` that shows the currently
operating Raft configuration for each present group. The schema is the
same as `system.raft_snapshot_config` (the latter shows the config from
the last snapshot). In the future we plan to add more columns to this
table, showing more information (like the current leader and term),
hence the generic name.

Adding the table requires some plumbing of
`sharded<raft_group_registry>&` through function parameters to make it
accessible from `register_virtual_tables`, but it's mostly
straightforward.

Also added some APIs to `raft_group_registry` to list all groups and
find a given group (returning `nullptr` if one isn't found, not throwing
an exception).
2023-01-17 12:28:00 +01:00
Pavel Emelyanov
abd3602b10 sstable_directory: Remove sstable creation callback
It's no longer used.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-12-05 12:03:19 +03:00
Pavel Emelyanov
db657a8d1c sstable_directory: Keep error handler generator
Yet another continuation to previous patch -- IO error handlers
generator is also needed to create sstables.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-12-05 12:03:19 +03:00
Pavel Emelyanov
4281f4af42 sstable_directory: Keep schema_ptr
Continuation of one-before-previous patch. In order to create sstable
without external lambda the directory code needs schema.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-12-05 12:03:19 +03:00
Pavel Emelyanov
8df1bcb907 sstable_directory: Use directory semaphore from manager
After previous patch sstables_directory code may no longer require for
semaphore argument, because it can get one from manager. This makes the
directory API shorter and simpler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-12-05 12:03:19 +03:00
Pavel Emelyanov
4da941e159 sstable_directory: Keep reference on manager
The sstables_directly accesses /var/lib/scylla/data in two ways -- lists
files in it and opens sstables. The latter is abdtracted with the help
of lambdas passed around, but the former (listing) is done by using
directory liters from utils.

Listing sstables components with directlry lister won't work for object
storage, the directory code will need to call some abstraction layer
instead. Opening sstables with the help of a lambda is a bit of
overkill, having sstables manager at hand could make it much simpler.

Said that, this patch makes sstables_directly reference sstables_manager
on start.

This change will also simplify directory semaphore usage (next patch).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-12-05 12:03:19 +03:00
Pavel Emelyanov
be8512d7cc sstables, code: Wrap directory semaphore with concurrency
Currently this is a sharded<semaphore> started/stopped in main and
referenced by database in order to be fed into sstables code. This
semaphore always comes with the "concurrency" parameter that limits the
parallel_for_each parallelizm.

This patch wraps both together into directory_semaphore class. This
makes its usage simpler and will allow extending it in the future.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-12-05 11:59:30 +03:00
Avi Kivity
02b66bb31a Merge 'Mark sstable::<directory accessing methods> private' from Pavel Emelyanov
One of the prerequisites to make sstables reside on object-storage is not to let the rest of the code "know" the filesystem path they are located on (because sometimes they will not be on any filesystem path). This patch makes the methods that can reveal this path back private so that later they can be abstracted out.

Closes #12182

* github.com:scylladb/scylladb:
  sstable: Mark some methods private
  test: Don't get sstable dir when known
  test: Use move_to_quarantine() helper
  test: Use sstable::filename() overload without dir name
  sstables: Reimplement batch directory sync after move
  table, tests: Make use of move_to_new_dir() default arg
  sstables: Remove fsync_directory() helper
  table: Simplify take_snapshot()'s collecting sstables names
2022-12-04 17:45:37 +02:00
Pavel Emelyanov
1b42d5fce3 table, tests: Make use of move_to_new_dir() default arg
The method in question accepts boolean bit whether or not it should sync
directories at the end. It's always true but in one case, so there's the
default value for it. Make use of it.

Anticipating the suggestion to replace bool with bool_class -- next
patch will replace it with something else.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-12-02 21:07:16 +03:00
Pavel Emelyanov
71179ff5ab distributed_loader: Use coroutine::lambda in sleeping coroutine
According to seastar/doc/lambda-coroutine-fiasco.md lambda that
co_awaits once loses its capture frame. In distrobuted_loader
code there's at least one of that kind.

fixes: #12175

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #12170
2022-12-02 13:06:33 +02:00
Avi Kivity
7c66fdcad1 Merge 'Simplify sstable_directory configuration' from Pavel Emelyanov
When started the sstable_directory is constructed with a bunch of booleans that control the way its process_sstable_dir method works. It's shorter and simpler to pass these booleans into method directly, all the more so there's another flag that's already passed like this.

Closes #12005

* github.com:scylladb/scylladb:
  sstable_directory: Move all RAII booleans onto flags
  sstable_directory: Convert sort-sstables argument to flags struct
  sstable_directory: Drop default filter
2022-11-23 16:16:04 +02:00
Pavel Emelyanov
22133a3949 sstable_directory: Move all RAII booleans onto flags
There's a bunch of booleans that control the behavior of sstable
directory scanning. Currently they are described as verbose
bool_class<>-es and are put into sstable_directory construction time.

However, these are not used outside of .process_sstable_dir() method and
moving them onto recently added flags struct makes the code much
shorter (29 insertions(+), 121 deletions(-))

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-22 18:30:00 +03:00
Pavel Emelyanov
7ca5e143d7 sstable_directory: Convert sort-sstables argument to flags struct
The sstable_directory::process_sstable_dir() accepts a boolean to
control its behavior when collecting sstables. Turn this boolean into a
structure of flags. The intention is to extend this flags set in the
future (next patch).

This boolean is true all the time, but one place sets it to true in a
"verbose" manner, like this:

        bool sort_sstables_according_to_owner = false;
        process_sstable_dir(directory, sort_sstables_according_to_owner).get();

the local variable is not used anymore. Using designated initializers
solves the verbosity in a nicer manner.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-22 18:19:23 +03:00
Pavel Emelyanov
7c7017d726 sstable_directory: Drop default filter
It's used as default argument for .reshape() method, but callers specify
it explicitly. At the same time the filter is simple enough and is only
used in one place so that the caller can just use explicit lambda.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-22 18:19:23 +03:00
Pavel Emelyanov
2f9b7931af sstables: Delete log file in replay_pending_delete_log()
It's natural that the replayer cleans up after itself

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-21 13:16:22 +03:00
Pavel Emelyanov
bdc47b7717 sstables: Move deletion log manipulations to sstable_directory.cc
The deletion log concept uses the fact that files are on a POSIX
filesystem. Support for another storage type will have to reimplement
this place, so keep the FS-specific code in _directory.cc file.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-21 13:16:21 +03:00
Pavel Emelyanov
a61c96a627 sstables: Use fs::path in replay_pending_delete_log()
It's called by a code that has fs::path at hand and internally uses
helpers that need fs::path too, so no need to convert it back and forth.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-21 13:15:25 +03:00
Pavel Emelyanov
bc62ca46d4 lister: Make lister::dir_entry_types an enum_set
This type is currently an unordered_set, but only consists of at most
two elements. Making it an enum_set renders it into a size_t variable
and better describes the intention.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-17 19:01:45 +03:00
Benny Halevy
4d7f0be929 distributed_loader: populate_column_family: reindent 2022-10-19 14:18:38 +03:00
Benny Halevy
030afaa934 distributed_loader: coroutinize populate_column_family
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-19 14:18:04 +03:00
Benny Halevy
0f23ee14c9 distributed_loader: table_population_metadata: start: reindent 2022-10-19 14:16:59 +03:00
Benny Halevy
39cec4f304 distributed_loader: table_population_metadata: coroutinize start_subdir
Calling it in a seastar thread was done to reduce code churn
and facilitate backporting.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-19 14:16:59 +03:00
Benny Halevy
5749a54cab distributed_loader: table_population_metadata: start_subdir: reindent 2022-10-19 14:16:59 +03:00
Benny Halevy
119c0f3983 distributed_loader: pre-load all sstables metadata for table before populating it
We should scan all sstables in the table directory and its
subdirectories to determine the highest sstable version and generation
before using it for creating new sstables (via reshard or reshape).

Fixes scylladb/scylladb#11793

Note: table_population_metadata::start_subdir is called
in a seastar thread to facilitate backporting to old versions
that do not support coroutines yet.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-19 14:16:57 +03:00
Pavel Emelyanov
1938412d7a system_keyspace: Make make() non-static
This helper needs system_keyspace reference and using "this" as this
looks natural. Also this de-static-ification makes it possible to put
some sense into the invoke_on_all() call from init_system_keyspace()

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 17:56:11 +03:00
Pavel Emelyanov
9f79525f8e distributed_loader: Pass sys_ks argument to init_system_keyspace()
It's final destination is virtual tabls registration code called from
init_system_keyspace() eventually

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 17:55:03 +03:00
Calle Wilund
54aca8e814 distributed_loader: Restore separate processing of keyspace init prio/normal
Fixes #11349

In 7396de7 (and refactorings before it) the set of prioritized keyspaces (and processing thereof)
was removed, due to apparent non-usage (which is true for open-source version).

This functionality is however required for certain features of the enterprise version (ear).
As such is needs to be restored and reenabled. This patch and revert before it does so, adapted
to the recent version of this file.
2022-08-23 10:39:19 +00:00
Calle Wilund
d9c391e366 Revert "distributed_loader: Remove unused load-prio manipulations"
This reverts commit 7396de72b1.

In 7396de7 (and refactorings before it) the set of prioritized keyspaces (and processing thereof)
was removed, due to apparent non-usage (which is true for open-source version).

This functionality is however required for certain features of the enterprise version (ear).
As such is needs to be restored and reenabled. This reverts the actual commit, patch after
ensures we use the prio set.
2022-08-23 10:34:05 +00:00