Currently distributed_loader starts sharded<sstable_directory> with four sharded parameters. That's quite bulky and can be made much shorter.
Closes scylladb/scylladb#15653
* github.com:scylladb/scylladb:
distributed_loader: Remove explicit sharded<erms>
distributed_loader: Brush up start_subdir()
sstable_directory: Add enlightened construction
table: Add global_table_ptr::as_sharded_parameter()
The sharded replication map was needed to provide sharded for sstable
directory. Now it gets sharded via table reference and thus the erms
thing becomes unused
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Drop some local references to class members and line-up arguments to
starting distributed sstable directory. Purely a clean up patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The existing constructor is pretty heavyweight for the distributed
loader to use -- it needs to pass it 4 sharded parameters which looks
pretty bulky in the text editor. However, 5 constructor arguments are
obtained directly from the table, so the dist. loader code with global
table pointer at hand can pass _it_ as sharded parameter and let the
sstable directory extract what it needs.
Sad news is that sstable_directory cannot be switched to just use table
reference. Tools code doesn't have table at hand, but needs the
facilities sstable_directory provides
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
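A minimal sketch of the "enlightened construction" idea described above, assuming simplified stand-in types. The `sstables_manager`, `schema` and `table` structs here are hypothetical and carry only what the illustration needs; they do not mirror the real ScyllaDB classes, and the real constructor takes more arguments. The point is only that a delegating constructor can pull its arguments out of the table, while the explicit constructor stays available for callers (such as tools) that have no table.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only.
struct sstables_manager { std::string name; };
struct schema { std::string cf_name; };

struct table {
    sstables_manager manager;
    schema table_schema;
    std::string dir;
};

class sstable_directory {
    const sstables_manager& _manager;
    const schema& _schema;
    std::string _dir;
public:
    // Explicit constructor: kept for callers that have no table at hand.
    sstable_directory(const sstables_manager& m, const schema& s, std::string dir)
        : _manager(m), _schema(s), _dir(std::move(dir)) {}

    // "Enlightened" constructor: extracts what it needs from the table
    // and delegates to the explicit one.
    explicit sstable_directory(const table& t)
        : sstable_directory(t.manager, t.table_schema, t.dir) {}

    const std::string& dir() const { return _dir; }
    const std::string& cf_name() const { return _schema.cf_name; }
    const std::string& manager_name() const { return _manager.name; }
};
```

With this shape the caller passes a single object instead of several, and the list of things extracted from the table lives in one place.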
load_truncation_times() now works only for
schema tables since the rest is not loaded
until distributed_loader::init_non_system_keyspaces.
An attempt to call cf.set_truncation_time
for non-system table just throws an exception,
which is caught and logged with debug level.
This means that the call cf.get_truncation_time in
paxos_state.cc has never worked as expected.
To fix that we move load_truncation_times()
closer to the point where the tables are loaded.
The function distributed_loader::populate_keyspace is
called for both system and non-system tables. Once
the tables are loaded, we use the 'truncated' table
to initialize _truncated_at field for them.
The truncation_time check for schema tables is also moved
into populate_keyspace since it seems like a more natural
place for it.
This is continuation of the previous patch -- when populating a table,
creating directories should be (optionally) performed by the lister
backend, not by the generic loader.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The loader code still "knows" that tables' sstables live in directories
on datadir filesystem, but that's not always so. So whether or not the
directory with sstables exists should be checked by sstable directory's
component lister, not the loader.
After this change, a potentially missing quarantine directory will be
processed by the sstable directory with an empty result, but that's OK.
Empty directories should already be handled correctly, so it makes no
difference whether the directory lister produces no sstables because it
found no files or because it skipped scanning altogether.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There is no need to hold on to the table's
shared ptr since it's held by the global table ptr
we got in the outer loop.
Simplify the code by just getting the local table reference
from `gtable`.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Currently the datadir is ignored.
Use it to construct the table's base path.
Fixes scylladb/scylladb#15418
Note that scylla still doesn't work correctly
with multiple data directories due to #15510.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Currently, mark_ready_for_writes is called too early,
after the first data dir is processed, then the next
datadir will hit an assert in `table::mark_ready_for_writes`.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is more efficient to iterate over multiple data directories
in the inner loop rather than the outer loop.
Following patch will make use of the datadir in
table_populator.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Populating of non-system keyspaces is now done by listing datadirs and assuming that each subdir found is a keyspace. For S3-backed keyspaces this is also true, but it's a bug (#13020). The loop needs to walk the list of known keyspaces instead, and try to find the keyspace storage later, based on the storage option.
Closes scylladb/scylladb#15436
* github.com:scylladb/scylladb:
distributed_loader: Indentation fix after previous patch
distributed_loader: Generalize datadir parallelizm loop
distributed_loader: Provide keyspace ref to populate_keyspace
distributed_loader: Walk list of keyspaces instead of directories
Some time ago populating of tables from sstables was reworked to use sstable states instead of full paths (#12707). Since then a few places in the populator were left that still operate on the state-based subdirectory name. This PR collects most of those dangling ends.
refs: #13020
Closes scylladb/scylladb#15421
* github.com:scylladb/scylladb:
distributed_loader: Print sstable state explicitly
distributed_loader: Move check for the missing dir upper
distributed_loader: Use state as _sstable_directories key
Population of keyspaces happens first for system keyspaces, then for
non-system ones. Both methods iterate over config datadirs to populate
from all configured directories. This patch generalizes this loop into
the populate_keyspace() method.
(indentation is deliberately left broken)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method in question tries to find keyspace reference on the database
by the given keyspace name. However, one of the callers already has the
keyspace reference at hand and can just pass it. The other callers can
find the keyspace on their own.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When populating non-system keyspaces the dist. loader lists the
directories with keyspaces in datadirs, then tries to call
populate_keyspace() with the found name. If the keyspace in question is
not found on the database, a warning is printed and population
continues.
S3-backed keyspaces are nowadays populated with this process just
because there's a bug #13020 -- even such keyspaces still create empty
directories in datadirs. When the bug gets fixed, population would omit
such keyspaces. This patch prepares this by making population walk the
known keyspaces from the database. BTW, population of system keyspaces
already works by iterating over the list of known keyspaces, not the
datadir subdirectories.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When populating from a particular directory, populator code converts
state to subdir name, then prints the path. The conversion is pretty
much artificial, it's better to provide printer for state and print
state explicitly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The quarantine directory can be missing on the datadir and that's OK. In
order to check that and skip population the populator code uses two-step
logic -- first it checks if the directory exists and, depending on the
result, puts the sstable_directory object into the map or not. Later it
checks the map and decides whether to throw or not if the directory is
missing.
Let's keep both check and throw in one place for brevity.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The populator maintains a map of path -> sstable_directory pairs, one for
each subdirectory for every sstable state. The "path" is in fact not
used by the logic as it's just a subdirectory name for the state and the
rest of the code operates on state. So it's good to make the map of
directories also be indexed by the state.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
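The two changes above (printing the state explicitly and keying the directories map by state) can be sketched roughly as follows. This is an illustration under assumptions, not the actual populator code: `directory_stub`, `describe()` and `populated()` are hypothetical names, and the enumerators simply follow the states named elsewhere in this log (normal, staging, quarantine, upload).

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

enum class sstable_state { normal, staging, quarantine, upload };

// Printer for the state, so logs can show the state itself rather than
// a subdirectory name derived from it.
std::ostream& operator<<(std::ostream& os, sstable_state s) {
    switch (s) {
    case sstable_state::normal:     return os << "normal";
    case sstable_state::staging:    return os << "staging";
    case sstable_state::quarantine: return os << "quarantine";
    case sstable_state::upload:     return os << "upload";
    }
    return os << "unknown";
}

// Hypothetical placeholder for a per-state sstable directory object.
struct directory_stub { int sstables = 0; };

class table_populator {
    // Keyed by state, not by a path string built from the state.
    std::map<sstable_state, directory_stub> _sstable_directories;
public:
    void start_subdir(sstable_state s) { _sstable_directories.try_emplace(s); }

    bool populated(sstable_state s) const {
        return _sstable_directories.count(s) > 0;
    }

    // Formats each entry as "<state>:<count> " in key order.
    std::string describe() const {
        std::ostringstream out;
        for (const auto& [state, dir] : _sstable_directories) {
            out << state << ":" << dir.sstables << " ";
        }
        return out.str();
    }
};
```

Keying by the enum removes the round trip from state to string and back, and a lookup for a state that was never started (for example a missing quarantine directory) is a plain map miss.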
We want to switch system.scylla_local table to the
schema commitlog, but load phases hamper here - schema
commitlog is initialized after phase1,
so a table which is using it should be moved to phase2,
but system.scylla_local contains features, and we need
them before schema commitlog initialization for
SCHEMA_COMMITLOG feature.
In this commit we are taking a different approach to
loading system tables. First, we load them all in
one pass in 'readonly' mode. In this mode, the table
cannot be written to and has not yet been assigned
a commit log. To achieve this we've added _readonly bool field
to the table class, it's initialized to true in table's
constructor. In addition, we changed the table constructor
to always assign nullptr to commitlog, and we trigger
an internal error if table.commitlog() property is accessed
while the table is in readonly mode. Then, after
triggering on_system_tables_loaded notifications on
feature_service and sstable_format_selector, we call
system_keyspace::mark_writable and eventually
table::mark_ready_for_writes which selects the
proper commitlog and marks the table as writable.
In sstable_compaction_test we drop several
mark_ready_for_writes calls since they are redundant,
the table has already been made writable in
env.make_table_for_tests call.
The table::commitlog function either returns the current
commitlog or causes an error if the table is readonly. This
didn't work for virtual tables, since they never called
mark_ready_for_writes. In this commit we add this
call to initialize_virtual_tables.
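A rough sketch of the readonly-then-writable lifecycle described above, assuming simplified stand-in types. The real code triggers an internal error on early commitlog access; the sketch throws `std::logic_error` instead, and the `get_commitlog()` name and the parameter of `mark_ready_for_writes()` are assumptions made for the illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the commitlog type.
struct commitlog {};

class table {
    commitlog* _commitlog = nullptr; // always null after construction
    bool _readonly = true;           // tables start out readonly
public:
    // Accessing the commitlog before the table is writable is an error.
    commitlog* get_commitlog() const {
        if (_readonly) {
            throw std::logic_error("commitlog accessed in readonly mode");
        }
        return _commitlog;
    }

    // Selects the commitlog and flips the table to writable.
    // Must be called exactly once per table.
    void mark_ready_for_writes(commitlog* cl) {
        assert(_readonly && "mark_ready_for_writes called twice");
        _commitlog = cl;
        _readonly = false;
    }

    bool readonly() const { return _readonly; }
};
```

The same shape explains the virtual tables fix: a table that never goes through `mark_ready_for_writes()` stays readonly, so any commitlog access on it fails.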
Right now, the function allows for passing the path to a file as a seastar::sstring,
which is then converted to std::filesystem::path -- implicitly to the caller.
However, the function performs I/O, and there is no reason to accept any other type
than std::filesystem::path, especially because the conversion is straightforward.
Callers can perform it on their own.
This commit introduces the more constrained API.
Closes #15266
An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. a quarantined sstable would sit in the /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory name constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state.
This PR converts "state" from constexpr char* directory names into an enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path.
refs: #13017
refs: #12707
Closes #14152
* github.com:scylladb/scylladb:
sstable/storage: Make filesystem storage with initial state
sstable: Maintain state
sstable: Make .change_state() accept state, not directory string
sstable: Construct it with state
sstables_manager: Remove state-less make_sstable()
table: Make sstables with required state
test: Make sstables with upload state in some cases
tools: Make sstables with normal state
table: Open-code sstables making streaming helpers
tests: Make sstables with normal state by default
sstable_directory: Make sstable with required state
sstable_directory: Construct with state
distributed_loader: Make sstable with desired state when populating
distributed_loader: Make sstable with upload state when uploading
sstable: Introduce state enum
sstable_directory: Merge verify and g.c. calls
distributed_loader: Merge verify and gc invocations
sstable/filesystem: Put underscores to dir members
sstable/s3: Mark make_s3_object_name() const
sstable: Remove filename(dir, ...) method
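As an illustration of the direction this series takes, the sketch below keeps the state as an enum and derives the on-disk subdirectory from it only where a filesystem path is actually needed. It is a sketch under assumptions: `state_to_dir()` and `make_path()` are hypothetical helper names, and the subdirectory strings are illustrative, taken from the quarantine example in the description rather than from sstables.hh.

```cpp
#include <cassert>
#include <filesystem>
#include <string_view>

enum class sstable_state { normal, staging, quarantine, upload };

// Maps a state to the subdirectory used by the filesystem storage.
// The normal state lives directly in the table directory.
constexpr std::string_view state_to_dir(sstable_state s) {
    switch (s) {
    case sstable_state::normal:     return "";
    case sstable_state::staging:    return "staging";
    case sstable_state::quarantine: return "quarantine";
    case sstable_state::upload:     return "upload";
    }
    return "";
}

// Builds the directory for a given state under the table's base path.
std::filesystem::path make_path(const std::filesystem::path& table_dir,
                                sstable_state s) {
    const auto sub = state_to_dir(s);
    return sub.empty() ? table_dir : table_dir / std::filesystem::path(sub);
}
```

Callers pass `sstable_state::quarantine` rather than a directory string, so a backend that has no directories (such as S3) can interpret the state in its own way.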
Pretty cosmetic change, but it will allow S3 to finally support moving
sstables between states (after this patch it still doesn't)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is to replace full path sitting on this object eventually. For now
they have to co-exist, but state will be used to make_sstable()-s from
manager with its new API
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This still needs to convert state to directory name internally as
sstable_directory instances are hashed on populator by subdir string.
Also the full string path is printed in logs. All this is now internal
to populate method and will be fixed later
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Both, init_system_keyspace() and init_non_system_keyspaces() populate
the keyspaces with the help of distributed_loader::populate_keyspace().
That method, in turn, walks the list of keyspaces' tables to load
sstables from disk and attach to them.
After it both init_...-s take the 2nd pass over keyspaces' tables to
call the table::mark_ready_for_writes() on each. This marking can be
moved into populate_keyspace(), that's much easier and shorter because
that method already has the shard-wide table pointer and can just call
whatever it needs on the table.
This changes the initialization sequence: before the patch all tables
were populated before any of them was marked as ready for write. This
looks safe however, as marking a table for write means resetting its
generation generator and different tables' generators are independent
from each other.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #15026
Skip over verification of owner and mode of the snapshots
sub-directory as this might race with scylla-manager
trying to delete old snapshots concurrently.
Fixes #12010
Closes #14892
* github.com:scylladb/scylladb:
distributed_loader: process_sstable_dir: do not verify snapshots
utils/directories: verify_owner_and_mode: add recursive flag
Skip over verification of owner and mode of the snapshots
sub-directory as this might race with scylla-manager
trying to delete old snapshots concurrently.
Fixes #12010
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
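A hedged sketch of the traversal this change implies, using only `std::filesystem`. It assumes that the table directory and its other subdirectories are still checked while the `snapshots` subtree is skipped entirely. `paths_to_verify()` is a hypothetical helper that only collects the paths; the actual owner and mode checks in utils/directories are not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Collects the paths that would be verified under a table directory,
// skipping the snapshots subtree, which may be deleted concurrently.
std::vector<fs::path> paths_to_verify(const fs::path& table_dir) {
    std::vector<fs::path> out;
    out.push_back(table_dir);
    for (const auto& entry : fs::directory_iterator(table_dir)) {
        if (entry.is_directory() && entry.path().filename() == "snapshots") {
            continue; // do not descend: entries here may vanish under us
        }
        out.push_back(entry.path());
        if (entry.is_directory()) {
            // Other subdirectories are still walked recursively.
            for (const auto& sub : fs::recursive_directory_iterator(entry.path())) {
                out.push_back(sub.path());
            }
        }
    }
    std::sort(out.begin(), out.end());
    return out;
}
```

Splitting the walk into a non-recursive top-level pass plus per-subdirectory recursion is what makes it possible to exclude a single subtree without special-casing it deep inside the recursive walk.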
As a preparation for ensuring access safety for column families
related maps, add tables_metadata, access to members of which
would be protected by rwlock.
Add task manager's task covering resharding compaction.
A struct and some functions are moved from replica/distributed_loader.cc
to compaction/task_manager_module.cc.
instead of accessing the `feature_service`'s member variable, use
the accessor provided by sstable_manager. so we always access
this setting via a single channel. this should help with
readability.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14658
This reverts commit 2a58b4a39a, reversing
changes made to dd63169077.
After patch 87c8d63b7a,
table_resharding_compaction_task_impl::run() performs the forbidden
action of copying a lw_shared_ptr (_owned_ranges_ptr) on a remote shard,
which is a data race that can cause a use-after-free, typically manifesting
as allocator corruption.
Note: before the bad patch, this was avoided by copying the _contents_ of the
lw_shared_ptr into a new, local lw_shared_ptr.
Fixes #14475
Fixes #14618
Closes #14641
This reverts commit 562087beff.
The regressions introduced by the reverted change have been fixed.
So let's revert this revert to resurrect the
uuid_sstable_identifier_enabled support.
Fixes #10459
schema::get_sharder() does not use the correct sharder for
tablet-based tables. Code which is supposed to work with all kinds of
tables should obtain the sharder from erm::get_sharder().
This reverts commit d1dc579062, reversing
changes made to 3a73048bc9.
Said commit caused regressions in dtests. We need to investigate and fix
those, but in the meanwhile let's revert this to reduce the disruption
to our workflows.
Refs: #14283
Take references to services which are initialized earlier. The
references to `gossiper`, `storage_service` and `raft_group0_registry`
are no longer needed.
This will allow us to move the `make` step right after starting
`system_keyspace`.