Commit Graph

4972 Commits

Author SHA1 Message Date
Dawid Medrek
3c347cc196 db/hints/manager: Explicitly delete copy constructor
This commit explicitly deletes the copy constructor of
db::hints::manager and its copy assignment. They're not
used in the code, and they should not be.
2023-10-06 11:54:15 +02:00
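As a rough illustration of what 3c347cc196 describes, the sketch below deletes the copy operations of a class. `manager_example` is a hypothetical stand-in; the real `db::hints::manager` has many more members that are not shown here.

```cpp
#include <type_traits>

// Hypothetical stand-in for db::hints::manager; only the special members matter.
class manager_example {
public:
    manager_example() = default;

    // Copying is rejected at compile time instead of being silently generated.
    manager_example(const manager_example&) = delete;
    manager_example& operator=(const manager_example&) = delete;
};

// Compile-time checks that the copy operations are really gone.
static_assert(!std::is_copy_constructible_v<manager_example>);
static_assert(!std::is_copy_assignable_v<manager_example>);
static_assert(std::is_default_constructible_v<manager_example>);
```

With the operations deleted, an accidental `manager_example b = a;` fails to compile rather than producing a shallow copy.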
Dawid Medrek
ee5a5c1661 db/hints: Capitalize constants
This is a common convention. Follow it for readability.
2023-10-06 11:54:15 +02:00
Dawid Medrek
fd30bac7b1 db/hints/manager: Hide declarations 2023-10-06 11:54:15 +02:00
Dawid Medrek
4b03cba1bf db/hints/manager: Move the definitions of static members to the header
If the variables are accessible from the outside, it makes
sense to also expose their initial values to the user.
This commit moves them to the header and marks them as inline.
2023-10-06 11:54:15 +02:00
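A minimal sketch of the technique named in 4b03cba1bf, assuming C++17 inline variables. The struct name, member names, and values are hypothetical and are not taken from manager.hh.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical configuration holder; names and values are illustrative only.
struct hints_config_example {
    // An inline static member is defined right here in the header, so its
    // initial value is visible to every user and no definition is needed in a .cc file.
    static inline const std::chrono::seconds FLUSH_PERIOD{10};

    // constexpr static data members are implicitly inline since C++17.
    static constexpr std::size_t MAX_SEGMENTS = 32;
};

// Small usage example: both values can be read without any out-of-class definition.
inline void check_hints_config_example() {
    assert(hints_config_example::FLUSH_PERIOD.count() == 10);
    assert(hints_config_example::MAX_SEGMENTS == 32);
}
```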
Dawid Medrek
c3ab28f5e9 db/hints: Move make_dummy() to the header
The function is trivial. It can also be marked as noexcept.
2023-10-06 11:54:15 +02:00
Dawid Medrek
5e333f0a52 db/hints: Don't explicitly define ~directory_initializer()
The destructor is the default destructor, and it is safe
to drop it altogether.
2023-10-06 11:53:02 +02:00
Dawid Medrek
9f215d3cf1 db/hints: Change the order of logging in ensure_created_and_verified()
The new logging order seems to make more sense, i.e.
we first log that we're creating and validating directories,
and only then do we start doing that.
The previous order when those actions were reversed didn't
match the log's message because the action was already
done when we informed the user of it.
2023-10-06 11:14:41 +02:00
Dawid Medrek
4ad3e8d37b db/hints: Coroutinize ensure_rebalanced() 2023-10-06 11:14:41 +02:00
Dawid Medrek
672cdb5c05 db/hints: Coroutinize ensure_created_and_verified() 2023-10-06 11:14:41 +02:00
Dawid Medrek
a5f14cb130 db/hints: Improve formatting of directory_initializer::impl
The implementation class has been divided into clear sections.
The indentation has also been adjusted to what is commonly
used in the codebase.
2023-10-06 11:14:41 +02:00
Dawid Medrek
500175d738 db/hints: Do not rely on the values of enums
These changes move away from relying on specific
values of enum variants. The code based on the arithmetic
of them is trivial, and there is no reason not to use operator==
and operator!= instead. This should make the code less error
prone and easier to understand.
2023-10-06 11:14:41 +02:00
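The idea behind 500175d738 can be sketched as follows. The enum and its enumerators are hypothetical; the actual enum in db/hints may have different names and states.

```cpp
// Hypothetical state enum; not the real one from db/hints.
enum class init_state_example { uninitialized, created_and_verified, rebalanced };

// Fragile style the commit moves away from: it only works while the
// enumerators keep their current numeric order.
//   static_cast<int>(s) < static_cast<int>(init_state_example::rebalanced)

// Explicit comparisons do not depend on the underlying values.
constexpr bool is_rebalanced(init_state_example s) {
    return s == init_state_example::rebalanced;
}

constexpr bool needs_rebalancing(init_state_example s) {
    return s != init_state_example::rebalanced;
}

static_assert(is_rebalanced(init_state_example::rebalanced));
static_assert(needs_rebalancing(init_state_example::created_and_verified));
```

Reordering or inserting enumerators leaves both helpers correct, which is not true of the arithmetic variant.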
Dawid Medrek
d0b4d9f14f db/hints: Move the implementation of directory_initializer
This commit moves said code to the top of manager.cc
to match its position in the header file. That should
make navigation easier.
2023-10-06 11:14:41 +02:00
Dawid Medrek
b516fe1fc0 db/hints: Prefer nested namespaces
This reduces the amount of boilerplate.
2023-10-06 11:14:41 +02:00
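For reference, a small sketch of the C++17 nested namespace syntax that b516fe1fc0 prefers. The innermost namespace `example` and both functions are hypothetical.

```cpp
// C++17 nested namespace definition: one line instead of three nested blocks.
namespace db::hints::example {
constexpr int answer() { return 42; }
}

// The pre-C++17 spelling opens the very same namespace, with more boilerplate.
namespace db { namespace hints { namespace example {
constexpr int doubled() { return 2 * answer(); }
}}}

static_assert(db::hints::example::answer() == 42);
static_assert(db::hints::example::doubled() == 84);
```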
Dawid Medrek
75a85b224b db/hints: Remove an unused alias from manager.hh 2023-10-06 11:14:41 +02:00
Dawid Medrek
fc80c57bec db/hints: Reorder includes in manager.hh and .cc
These changes improve the readability of the included headers.
2023-10-06 11:14:41 +02:00
Petr Gusev
a6087a10bd system_keyspace: drop truncation_record
This is a refactoring commit without observable
changes in behaviour.

The only usage was in get_truncation_records
method which can be inlined.
2023-10-05 15:19:59 +04:00
Petr Gusev
9d350e7532 system_keyspace: remove get_truncated_at method
The only usage is in batchlog_manager, and it
can be replaced with cf.get_truncation_time().

std::optional<std::reference_wrapper<canonical_mutation>>
is replaced with canonical_mutation* since it is
semantically the same but with less type boilerplate.
2023-10-05 15:19:59 +04:00
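The type replacement described in 9d350e7532 can be illustrated with a sketch like the one below. `canonical_mutation_example` is a hypothetical placeholder for `canonical_mutation`, and both functions are made up for the comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Hypothetical placeholder for canonical_mutation.
struct canonical_mutation_example {
    std::string payload;
};

// Before: an optional, non-owning reference spelled with two wrappers.
inline std::size_t payload_size_optional(
        std::optional<std::reference_wrapper<canonical_mutation_example>> m) {
    return m ? m->get().payload.size() : 0;
}

// After: a plain pointer expresses the same "maybe a reference" with less typing.
inline std::size_t payload_size_pointer(canonical_mutation_example* m) {
    return m ? m->payload.size() : 0;
}

// Usage: both forms give the same answers for present and absent values.
inline void check_optional_vs_pointer() {
    canonical_mutation_example m{"abc"};
    assert(payload_size_optional(std::ref(m)) == payload_size_pointer(&m));
    assert(payload_size_optional(std::nullopt) == payload_size_pointer(nullptr));
}
```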
Petr Gusev
32a19fd61b database: add_column_family: rename readonly parameter to is_new
We want to make table::_truncated_at optional, so that in
get_truncation_time we can assert that it is initialized.
For existing tables this initialisation will happen in
load_truncation_times function, and for new tables we
want to initialize it in add_column_family like we do
with mark_ready_for_writes.

Now add_column_family function has parameter 'readonly', which is
set by the callers to false if we are creating a fresh new table
and not loading it from sstables. In this commit we rename this
parameter to is_new and invert the passed values.
This will allow us in the next commit to initialize _truncated_at field
for new tables.
2023-10-05 15:19:59 +04:00
Petr Gusev
b70bca71bc system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace
load_truncation_times() now works only for
schema tables since the rest is not loaded
until distributed_loader::init_non_system_keyspaces.
An attempt to call cf.set_truncation_time
for non-system table just throws an exception,
which is caught and logged with debug level.
This means that the call cf.get_truncation_time in
paxos_state.cc has never worked as expected.

To fix that we move load_truncation_times()
closer to the point where the tables are loaded.
The function distributed_loader::populate_keyspace is
called for both system and non-system tables. Once
the tables are loaded, we use the 'truncated' table
to initialize _truncated_at field for them.

The truncation_time check for schema tables is also moved
into populate_keyspace since it seems like a more natural
place for it.
2023-10-05 15:19:52 +04:00
Pavel Emelyanov
96651e0ddb sstables: Do not keep directory, keyspace and table names on descriptor
Now no code uses those strings. Even worse -- there are some places that
need to provide some strings but don't have real values at hand, so just
hard-code the empty strings there (because they are really not used).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-05 12:21:01 +03:00
David Garcia
1121a4df04 docs: add groups to reference docs
fix: comment

Closes scylladb/scylladb#15592
2023-10-04 11:42:36 +03:00
Petr Gusev
9711bfde11 commitlog_replayer: refactor commitlog_replayer::impl::init
We don't need map_reduce here since get_truncated_positions returns
the same result on all shards.

We remove 'finally' semantics in this commit since it doesn't seem we
really need it. There is no code that relies on the state of this
data structure in case of exception. An exception will propagate
to scylla_main() and the program will just exit.
2023-10-03 17:11:40 +04:00
Petr Gusev
c94946d566 system_keyspace: drop redundant typedef 2023-10-03 17:11:40 +04:00
Petr Gusev
f7d2300cf9 system_keyspace: drop redundant save_truncation_record overload 2023-10-03 17:11:40 +04:00
Petr Gusev
da1e6751e9 table: rename cache_truncation_record -> set_truncation_time
This is a refactoring commit without observable
changes in behaviour.

There is a truncation_record struct, but in this method we
only care about time, so rename it (and other related methods)
appropriately to avoid confusion.
2023-10-03 17:11:35 +04:00
Botond Dénes
ecceb554c3 Merge 'db/hints: Clean up hint_storage.cc' from Dawid Mędrek
This PR is the second step in refactoring the Hinted Handoff module. It cleans up the contents of the file `hint_storage.cc`. The biggest change is the transition from continuations to coroutines.

Refs #15358

Closes scylladb/scylladb#15496

* github.com:scylladb/scylladb:
  db/hints: Alias segment list in hint_storage.cc
  db/hints: Rename rebalance to rebalance_hints
  db/hints: Clean up rebalance() in hint_storage.cc
  db/hints: Coroutinize hint_storage.cc
  db/hints: Clean up remove_irrelevant_shards_directories() in hint_storage.cc
  db/hints: Clean up rebalance_segments() in hint_storage.cc
  db/hints: Clean up rebalance_segments_for() in hint_storage.cc
  db/hints: Clean up get_current_hints_segments() in hint_storage.cc
  db/hints: Rename scan_for_hints_dirs to scan_shard_hint_directories
  db/hints: Clean up scan_for_hints_dirs() in hint_storage.cc
  db/hints: Wrap hint_storage.cc in an anonymous namespace
2023-09-29 08:55:38 +03:00
Petr Gusev
1b2e0d0cc9 system_keyspace: get_truncated_position -> get_truncated_positions
This method can return many replay_positions, so
the plural form is more appropriate.
2023-09-28 12:25:40 +04:00
Dawid Medrek
a870eeb2ab db/hints: Alias segment list in hint_storage.cc
Naming the type should improve readability.
2023-09-27 18:49:08 +02:00
Dawid Medrek
aba85c9c98 db/hints: Rename rebalance to rebalance_hints
The new name conveys the idea clearly.
2023-09-27 18:49:08 +02:00
Dawid Medrek
64f4b825d3 db/hints: Clean up rebalance() in hint_storage.cc
This commit fixes indentation and formatting after
recent changes in the file.
2023-09-27 18:49:04 +02:00
Dawid Medrek
b662756256 db/hints: Coroutinize hint_storage.cc 2023-09-27 18:47:38 +02:00
Dawid Medrek
17e763a83a db/hints: Clean up remove_irrelevant_shards_directories() in hint_storage.cc
This commit makes the function abide by the limit of 120 characters
per line and stops unnecessarily calling c_str() on seastar::sstring.
2023-09-27 18:45:01 +02:00
Dawid Medrek
73d02cfcef db/hints: Clean up rebalance_segments() in hint_storage.cc
This commit makes the function less compact and turns overly
long lines into shorter ones to improve the readability of
the code.
2023-09-27 18:45:01 +02:00
Dawid Medrek
479f4d1ad3 db/hints: Clean up rebalance_segments_for() in hint_storage.cc
This commit makes the function less compact and abides by the limit
of 120 characters per line; that makes the code more readable.
We start using fmt::to_string instead of seastar::format("{:d}")
to convert integers to strings -- the new way is the preferred one.
The changes also name variables in a more descriptive way.
2023-09-27 18:45:01 +02:00
Dawid Medrek
a1df8dbf1c db/hints: Clean up get_current_hints_segments() in hint_storage.cc
This commit makes the function less compact and abides by the limit
of 120 characters per line. That makes the code more readable.
It also doesn't unnecessarily call c_str() on seastar::sstring.
2023-09-27 18:45:01 +02:00
Dawid Medrek
1fccd34dba db/hints: Rename scan_for_hints_dirs to scan_shard_hint_directories
The new name better conveys which directories the function should scan.
2023-09-27 18:45:01 +02:00
Dawid Medrek
8e94074b85 db/hints: Clean up scan_for_hints_dirs() in hint_storage.cc
There is no need to call c_str() on the name of the directory entry.
In fact, the std::stoi() overload in use takes an std::string as its
argument. Providing seastar::sstring instead of const char* is more
efficient because we can allocate just the right amount of memory
and std::memcpy it, i.e. call std::string(const char*, std::size_t).
Using the overload std::string(const char*) would need to first
traverse the string to find the null byte.

This is a small change, all the more because paths don't tend to
be long, but it's some gain nonetheless.

The commit also inserts a few empty lines to make the code less
compact and improve readability as a result.
2023-09-27 18:45:01 +02:00
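The argument in 8e94074b85 can be sketched with standard types only. Here `std::string_view` stands in for `seastar::sstring` (an assumption made for this sketch: both know their own length), and `parse_shard_id` is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// `name` stands in for a seastar::sstring directory entry name.
inline int parse_shard_id(std::string_view name) {
    // std::string(const char*, size) copies exactly name.size() bytes.
    // std::string(const char*) would first scan for the terminating '\0'.
    return std::stoi(std::string(name.data(), name.size()));
}

// Usage: a shard directory named "12" parses to the integer 12.
inline void check_parse_shard_id() {
    assert(parse_shard_id("12") == 12);
}
```

As the commit message notes, the saving is small for short paths; the sketch only shows where the extra traversal would come from.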
Dawid Medrek
7c68882578 db/hints: Wrap hint_storage.cc in an anonymous namespace
An anonymous namespace is a safer mechanism than the static
keyword. When adding a new piece of code, it's easy to
forget about adding the static. In that case, that code
might undergo external linkage. However, when code is put
in an anonymous namespace (when it should not be), the linker
will immediately detect it (in most cases), and
the programmer will be able to spot and fix their mistake
right away.
2023-09-27 18:41:41 +02:00
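A minimal sketch of the two mechanisms compared in 7c68882578. All function names are hypothetical and do not come from hint_storage.cc.

```cpp
#include <cassert>

namespace {
// Everything inside the anonymous namespace gets internal linkage
// without any per-declaration keyword.
int helper_in_anonymous_namespace() { return 1; }
int another_helper() { return 2; }
} // anonymous namespace

// With `static`, the keyword has to be repeated on every helper;
// forgetting it silently gives the function external linkage.
static int helper_marked_static() { return 3; }

// The only function meant to be visible to other translation units.
int exported_sum() {
    return helper_in_anonymous_namespace() + another_helper() + helper_marked_static();
}
```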
Piotr Dulikowski
caf1d4938e topology_state_machine: add supported_features to replica_state
The `service::topology_features` struct was introduced in #14955. Its
purpose was to make it possible to load cluster features from
`system.topology` before schema commitlog replay. It contains a map from
host ID to supported feature set for every normal node.

In order not to duplicate logic for loading features,
the `service::topology`'s `replica_state`s do not hold a set of
supported features and users are supposed to refer to the features
in `topology_features`, which is a field in the `topology` struct.
However, accessing features is quite awkward now.

This commit adds `supported_features` field back to the `replica_state`
struct and the `load_topology_state` function initializes them properly.
The logic duplication needed to initialize them is quite small and the
drawbacks that come with it are outweighed by the fact that we now can
refer to node's supported features in a more natural way.

The `topology_features` struct is no longer a field of `topology`, but
it still exists for the purpose of the feature check that happens before
commitlog replay.
2023-09-26 15:56:52 +02:00
Pavel Emelyanov
becd960ae8 view_update_generator: Add logging to do_abort()
Just tell the logs that the guy is aborting
refs: #10941

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-21 13:34:21 +03:00
Pavel Emelyanov
967ebacaa4 view_update_generator: Move abort kicking to do_abort()
When v.u.g. stops, it first aborts the generation background fiber by
requesting abort on the internal abort source and signalling the fiber
in case it's waiting. Right now v.u.g.::stop() is defer-scheduled last
in main(), so this move doesn't change much -- when stop_signal fires,
it will kick the v.u.g.::do_abort() just a bit earlier, there's nothing
that would happen after it before real ::stop() is called that depends
on it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-21 13:32:45 +03:00
Pavel Emelyanov
e34220ebb7 view_update_generator: Add early abort subscription
Subscribe v.u.g. to the main's stop_signal. For now a no-op callback.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-21 13:32:45 +03:00
Tomasz Grabiec
3d4398d1b2 Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun
When performing a schema change through group 0, extend the schema mutations with a version that's persisted and then used by the nodes in the cluster in place of the old schema digest, which becomes horribly slow as we perform more and more schema changes (#7620).

If the change is a table create or alter, also extend the mutations with a version for this table to be used for `schema::version()`s instead of having each node calculate a hash which is susceptible to bugs (#13957).

When performing a schema change in Raft RECOVERY mode we also extend schema mutations which forces nodes to revert to the old way of calculating schema versions when necessary.

We can only introduce these extensions if all of the cluster understands them, so protect this code by a new cluster/schema feature, `GROUP0_SCHEMA_VERSIONING`.

Fixes: #7620
Fixes: #13957

Closes scylladb/scylladb#15331

* github.com:scylladb/scylladb:
  test: add test for group 0 schema versioning
  test/pylib: log_browsing: fix type hint
  feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode
  schema_tables: don't delete `version` cell from `scylla_tables` mutations from group 0
  migration_manager: add `committed_by_group0` flag to `system.scylla_tables` mutations
  schema_tables: use schema version from group 0 if present
  migration_manager: store `group0_schema_version` in `scylla_local` during schema changes
  migration_manager: migration_request handler: assume `canonical_mutation` support
  system_keyspace: make `get/set_scylla_local_param` public
  feature_service: add `GROUP0_SCHEMA_VERSIONING` feature
  schema_tables: refactor `scylla_tables(schema_features)`
  migration_manager: add `std::move` to avoid a copy
  schema_tables: remove default value for `reload` in `merge_schema`
  schema_tables: pass `reload` flag when calling `merge_schema` cross-shard
  system_keyspace: fix outdated comment
2023-09-20 10:43:40 +02:00
Botond Dénes
111cdce2e1 Merge 'db/hints: Modularize manager.hh' from Dawid Mędrek
This PR modularizes `manager.{hh, cc}` by dividing the files into separate smaller units. The changes improve overall readability of code and help reason about it. Each file has a specific purpose now.

This is the first step in refactoring the Hinted Handoff module.

Refs scylladb/scylla#15358

Closes scylladb/scylladb#15378

* github.com:scylladb/scylladb:
  db/hints: Remove unused aliases from manager.hh
  db/hints: Rename end_point_hints_manager
  db/hints: Rename sender to hint_sender
  db/hints: Move the rebalancing logic to hint_storage
  db/hints: Move the implementation of sender
  db/hints: Move the declaration of sender to hint_sender.hh
  db/hints: Move sender::replay_allowed() to the source file
  db/hints: Put end_point_hints_manager in internal namespace
  db/hints: Move the implementation of end_point_hints_manager
  db/hints: Move the declaration of end_point_hints_manager
  db/hints: Move definitions of functions using shard hint manager
  db/hints: Introduce hint_storage.hh
  db/hints: Extract the logger from manager.cc
  db/hints: Extract common types from manager.hh
2023-09-19 10:56:16 +03:00
Michael Huang
62a8a31be7 cdc: use chunked_vector for topology_description entries
Lists can grow very big. Let's use a chunked vector to prevent large contiguous
allocations.
Fixes: #15302.

Closes scylladb/scylladb#15428
2023-09-18 23:17:01 +03:00
Kamil Braun
bc6f7d1b20 Merge 'raft topology: add garbage collection for internal CDC generations table' from Patryk Jędrzejczak
We add garbage collection for the `CDC_GENERATIONS_V3` table to prevent
it from endlessly growing. This mechanism is especially needed because
we send the entire contents of `CDC_GENERATIONS_V3` as a part of the
group 0 snapshot.

The solution is to keep a clean-up candidate, which is one of the
already published CDC generations. The CDC generation publisher
introduced in #15281 continually uses this candidate to remove all
generations with timestamps not exceeding the candidate's and sets a new
candidate when needed.

We also add `test_cdc_generation_clearing.py` that verifies this new
mechanism.

Fixes #15323

Closes scylladb/scylladb#15413

* github.com:scylladb/scylladb:
  test: add test_cdc_generation_clearing
  raft topology: remove obsolete CDC generations
  raft topology: set CDC generation clean-up candidate
  topology_coordinator: refactor publish_oldest_cdc_generation
  system_keyspace: introduce decode_cdc_generation_id
  system_keyspace: add cleanup_candidate to CDC_GENERATIONS_V3
2023-09-18 11:30:10 +02:00
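The clean-up rule described in bc6f7d1b20 (remove all generations with timestamps not exceeding the candidate's) might be sketched as below. This is not the code from the PR: the map-based storage, the type aliases, and `remove_obsolete_generations` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Hypothetical in-memory model: generation timestamp -> generation data.
using timestamp_example = std::int64_t;
using generations_example = std::map<timestamp_example, std::string>;

// Erases every generation whose timestamp is <= candidate_ts and
// returns how many entries were removed.
inline std::size_t remove_obsolete_generations(generations_example& gens,
                                               timestamp_example candidate_ts) {
    // upper_bound yields the first entry with a timestamp strictly greater
    // than the candidate's, so [begin, last) is exactly the obsolete range.
    auto last = gens.upper_bound(candidate_ts);
    auto removed = static_cast<std::size_t>(std::distance(gens.begin(), last));
    gens.erase(gens.begin(), last);
    return removed;
}

// Usage: with candidate timestamp 2, generations 1 and 2 go and 3 stays.
inline void check_remove_obsolete_generations() {
    generations_example gens{{1, "a"}, {2, "b"}, {3, "c"}};
    assert(remove_obsolete_generations(gens, 2) == 2);
    assert(gens.size() == 1 && gens.begin()->first == 3);
}
```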
Kamil Braun
947c419421 schema_tables: don't delete version cell from scylla_tables mutations from group 0
As explained in the previous commit, we use the new
`committed_by_group0` flag attached to each row of a `scylla_tables`
mutation to decide whether the `version` cell needs to be deleted or
not.

The rest of #13957 is solved by pre-existing code -- if the `version`
column is present in the mutation, we don't calculate a hash for
`schema::version()`, but take the value from the column:

```
table_schema_version schema_mutations::digest(db::schema_features sf)
const {
    if (_scylla_tables) {
        auto rs = query::result_set(*_scylla_tables);
        if (!rs.empty()) {
            auto&& row = rs.row(0);
            auto val = row.get<utils::UUID>("version");
            if (val) {
                return table_schema_version(*val);
            }
        }
    }

    ...
```

The issue will therefore be fixed once we enable
`GROUP0_SCHEMA_VERSIONING`.
2023-09-15 14:32:52 +02:00
Kamil Braun
ce68ee0950 migration_manager: add committed_by_group0 flag to system.scylla_tables mutations
As described in #13957, when creating or altering a table in group 0
mode, we don't want each node to calculate `schema::version()`s
independently using a hash algorithm. Instead, we want all nodes to
use a single version for that table, committed by the group 0 command.

There's even a column ready for this in `system.scylla_tables` --
`version`. This column is currently being set for system tables, but
it's not being used for user tables.

Similarly to what we did with global schema version in earlier commits,
the obvious thing to do would be to include a live cell for the `version`
column in the `system.scylla_tables` mutation when we perform the schema
change in Raft mode, and to include a tombstone when performing it
outside of Raft mode, for the RECOVERY case.

But it's not that simple because as it turns out, we're *already*
sending a `version` live cell (and also a tombstone, with timestamp
decremented by 1) in all `system.scylla_tables` mutations. But then we
delete that cell when doing schema merge (which begs the question
why were we sending it in the first place? but I digress):
```
        // We must force recalculation of schema version after the merge, since the resulting
        // schema may be a mix of the old and new schemas.
        delete_schema_version(mutation);
```
the above function removes the `version` cell from the mutation.

So we need another way of distinguishing the cases of schema change
originating from group 0 vs outside group 0 (e.g. RECOVERY).

The method I chose is to extend `system.scylla_tables` with a boolean
column, `committed_by_group0`, and extend schema mutations to set
this column.

In the next commit we'll decide whether or not the `version` cell should
be deleted based on the value of this new column.
2023-09-15 14:32:52 +02:00
Kamil Braun
59912ca3b0 schema_tables: use schema version from group 0 if present
As promised in the previous commit, if we persisted a schema version
through a group 0 command, use it after a schema merge instead of
calculating a digest.

Ref: #7620

The above issue will be fixed once we enable the
`GROUP0_SCHEMA_VERSIONING` feature.
2023-09-15 14:32:52 +02:00
Kamil Braun
7ab7588d59 migration_manager: store group0_schema_version in scylla_local during schema changes
We extend schema mutations with an additional mutation to the
`system.scylla_local` table which:
- in Raft mode, stores a UUID under the `group0_schema_version` key.
- outside Raft mode, stores a tombstone under that key.

As we will see in later commits, nodes will use this after applying
schema mutations. If the key is absent or has a tombstone, they'll
calculate the global schema digest on their own -- using the old way. If
the key is present, they'll take the schema version from there.

The Raft-mode schema version is equal to the group 0 state ID of this
schema command.

The tombstone is necessary for the case of performing a schema change in
RECOVERY mode. It will force a revert to the old digest-based way.

Note that extending schema mutations with a `system.scylla_local`
mutation is possible thanks to earlier commits which moved
`system.scylla_local` to schema commitlog, so all mutations in the
schema mutations vector still go to the same commitlog domain.
2023-09-15 14:32:45 +02:00
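The decision rule described in 7ab7588d59 (use the persisted `group0_schema_version` if present, otherwise fall back to the digest) could look roughly like this. It is a sketch under stated assumptions: the version is modeled as a `std::string` rather than a UUID, an absent key and a tombstone are both modeled as an empty `std::optional`, and `choose_schema_version` is a hypothetical name.

```cpp
#include <cassert>
#include <optional>
#include <string>

// group0_version: value stored under the `group0_schema_version` key,
//                 or std::nullopt if the key is absent or tombstoned.
// calculate_digest: the old, hash-based way of computing the schema version.
template <typename DigestFn>
std::string choose_schema_version(const std::optional<std::string>& group0_version,
                                  DigestFn&& calculate_digest) {
    if (group0_version) {
        return *group0_version;   // Raft mode: take the persisted version.
    }
    return calculate_digest();    // RECOVERY or pre-Raft: compute the digest.
}

// Usage: the digest is only calculated when no group 0 version is available.
inline void check_choose_schema_version() {
    auto digest = [] { return std::string("digest"); };
    std::optional<std::string> persisted = std::string("state-id");
    assert(choose_schema_version(persisted, digest) == "state-id");
    assert(choose_schema_version(std::nullopt, digest) == "digest");
}
```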