Commit Graph

3555 Commits

Author SHA1 Message Date
Kefu Chai
5c0484cb02 db: add formatter for db::operation_type
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define a formatter for db::operation_type, and
remove its operator<<().

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16832
2024-01-19 10:16:41 +02:00
Kefu Chai
0ae81446ef ./: not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16766
2024-01-17 16:30:14 +02:00
Botond Dénes
f22fc88a64 Merge 'Configure service levels interval' from Michał Jadwiszczak
The service level controller updates itself at a fixed interval. However, the interval is hardcoded in main to 10 seconds, which leads to long sleeps in some of the tests.

This patch moves this value to `service_levels_interval_ms` command line option and sets this value to 0.5s in cql-pytest.

Closes scylladb/scylladb#16394

* github.com:scylladb/scylladb:
  test:cql-pytest: change service levels intervals in tests
  configure service levels interval
2024-01-17 12:24:49 +02:00
Calle Wilund
af0772d605 commitlog: Add wait_for_pending_deletes
Refs #16757

Allows waiting for all previous and pending segment deletes to finish.
Useful if a caller of `discard_completed_segments` (i.e. a memtable
flush target) not only wants to ensure segments are clean and released,
but thoroughly deleted/recycled, and hence no threat of resurrecting
data on crash+restart.

Test included.

Closes scylladb/scylladb#16801
2024-01-17 09:30:55 +02:00
Tomasz Grabiec
3d76aefb98 Merge "Enhance topology request status tracking" from Gleb
Currently to figure out if a topology request is complete a submitter
checks the topology state and tries to figure out from that the status
of the request. This is not exact. Let's look at rebuild handling for
instance. To figure out if request is completed the code waits for
request object to disappear from the topology, but if another rebuild
starts between the end of the previous one and the code noticing that
it completed the code will continue waiting for the next rebuild.
Another problem is that in case of operation failure there is no way to
pass an error back to the initiator.

This series solves those problems by assigning an id for each request and
tracking the status of each request in a separate table. The initiator
can query the request status from the table and see if the request was
completed successfully or if it failed with an error, which is also
available from the table.

The schema for the table is:

    CREATE TABLE system.topology_requests (
        id timeuuid PRIMARY KEY,

        initiating_host uuid,
        start_time timestamp,

        done boolean,
        error text,
        end_time timestamp,
    );

and all entries have TTL of one month.
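
The request-tracking idea above can be sketched as follows. This is a minimal in-memory model, not Scylla's implementation: the `RequestTable` class and its method names are hypothetical, a Python dict stands in for `system.topology_requests`, and `uuid.uuid1()` stands in for a timeuuid. TTL handling is left out.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class RequestStatus:
    """One row, mirroring the columns of the schema above."""
    initiating_host: uuid.UUID
    start_time: datetime
    done: bool = False
    error: str = ""
    end_time: Optional[datetime] = None


class RequestTable:
    """Hypothetical in-memory stand-in for system.topology_requests."""

    def __init__(self) -> None:
        self._rows: Dict[uuid.UUID, RequestStatus] = {}

    def submit(self, host: uuid.UUID) -> uuid.UUID:
        # Each request gets its own id, so two consecutive rebuilds
        # cannot be confused with each other.
        request_id = uuid.uuid1()  # time-based, standing in for a timeuuid
        self._rows[request_id] = RequestStatus(host, datetime.now(timezone.utc))
        return request_id

    def complete(self, request_id: uuid.UUID, error: str = "") -> None:
        # An empty error string means the request succeeded.
        row = self._rows[request_id]
        row.done = True
        row.error = error
        row.end_time = datetime.now(timezone.utc)

    def status(self, request_id: uuid.UUID) -> RequestStatus:
        return self._rows[request_id]


table = RequestTable()
host = uuid.uuid4()
first = table.submit(host)
second = table.submit(host)
table.complete(first, error="rebuild failed")
```

The initiator of `first` sees its own failure through `table.status(first)`, while `second` remains pending under a different id.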
2024-01-17 00:37:19 +01:00
Gleb Natapov
84197ff735 storage_service: topology coordinator: check topology operation completion using status in topology_requests table
Instead of trying to guess whether a request completed by looking into the
topology state (which can sometimes be error prone), look at the
request status in the new topology_requests table. If the request failed, report
the reason for the failure from the table.
2024-01-16 17:02:54 +02:00
Avi Kivity
a9844ed69a Merge 'view: revert cleanup filter that doesn't work with tablets' from Nadav Har'El
The goal of this PR is to fix Scylla so that the dtest test_mvs_populating_from_existing_data, which starts to fail when enabling tablets, will pass.

The main fix (the second patch) is reverting code which doesn't work with tablets, and I explain why I think this code was not necessary in the first place.

Fixes #16598

Closes scylladb/scylladb#16670

* github.com:scylladb/scylladb:
  view: revert cleanup filter that doesn't work with tablets
  mv: sleep a bit before view-update-generator restart
2024-01-16 16:42:20 +02:00
Gleb Natapov
584551f849 topology coordinator: add request_id to the topology state machine
Provide a unique ID for each topology request and store it in the topology
state machine. It will be used to index the new topology requests table in
order to retrieve request status.
2024-01-16 13:57:27 +02:00
Gleb Natapov
ecb8778950 system keyspace: introduce local table to store topology requests status
The table has the following schema and will be managed by raft:

CREATE TABLE system.topology_requests (
    id timeuuid PRIMARY KEY,

    initiating_host uuid,
    start_time timestamp,

    done boolean,
    error text,
    end_time timestamp,
);

If a request completes with an error, the "error" field will be non-empty when "done" is set to true.
2024-01-16 13:57:16 +02:00
Gleb Natapov
a4ac64a652 system_keyspace: raft topology: load ignore nodes parameter together with removenode topology request
Next patch will need ignore nodes list while processing removenode
request. Load it.
2024-01-14 14:44:07 +02:00
Gleb Natapov
cc54796e23 raft topology: add cleanup state to the topology state machine
The patch adds a cleanup state to the persistent and in-memory state and
handles the loading. The state can be "clean", which means no cleanup is
needed; "needed", which means the node is dirty and needs to run cleanup
at some point; or "running", which means that cleanup is currently running
on the node and, once it completes, the state will be reset to "clean".
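
A rough sketch of the three states described above. Only the "running" to "clean" transition is stated in this message; the other two transitions in `ALLOWED` are an assumption made for illustration, and the names `CleanupState` and `advance` are hypothetical.

```python
from enum import Enum


class CleanupState(Enum):
    CLEAN = "clean"      # no cleanup needed
    NEEDED = "needed"    # node is dirty and must run cleanup at some point
    RUNNING = "running"  # cleanup is in progress on the node


# Assumed transition table: only RUNNING -> CLEAN is taken from the text above.
ALLOWED = {
    CleanupState.CLEAN: {CleanupState.NEEDED},
    CleanupState.NEEDED: {CleanupState.RUNNING},
    CleanupState.RUNNING: {CleanupState.CLEAN},
}


def advance(current: CleanupState, target: CleanupState) -> CleanupState:
    """Move to `target`, rejecting transitions outside the assumed table."""
    if target not in ALLOWED[current]:
        raise ValueError(
            f"illegal cleanup transition: {current.value} -> {target.value}"
        )
    return target


# Loading a persisted value is a lookup by string.
loaded = CleanupState("needed")
```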
2024-01-14 13:30:54 +02:00
Nadav Har'El
1bcaeb89c7 view: revert cleanup filter that doesn't work with tablets
This patch reverts commit 10f8f13b90 from
November 2022. That commit added to the "view update generator", the code
which builds view updates for staging sstables, a filter that ignores
ranges that do not belong to this node. However,

1. I believe this filter was never necessary, because the view update
   code already silently ignores base updates which do not belong to
   this replica (see get_view_natural_endpoint()). After all, the view
   update needs to know that this replica is the Nth owner of the base
   update to send its update to the Nth view replica, but if no such
   N exists, no view update is sent.

2. The code introduced for that filter used a per-keyspace replication
   map, which was ok for vnodes but no longer works for tablets, and
   causes the operation using it to fail.

3. The filter was used every time the "view update generator" was used,
   regardless of whether any cleanup is necessary or not, so every
   such operation would fail with tablets. So for example the dtest
   test_mvs_populating_from_existing_data fails with tablets:
     * This test has view building in parallel with automatic tablet
       movement.
     * Tablet movement is streaming.
     * When streaming happens before view building has finished, the
       streamed sstables get "view update generator" run on them.
       This causes the problematic code to be called.
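
The pairing rule from point 1 can be illustrated with a simplified sketch. This is not the logic of get_view_natural_endpoint(), which handles more cases; the function name `paired_view_replica` and the plain string node names are hypothetical.

```python
from typing import Optional, Sequence


def paired_view_replica(me: str,
                        base_replicas: Sequence[str],
                        view_replicas: Sequence[str]) -> Optional[str]:
    """Return the view replica paired with `me`, or None if there is none."""
    if me not in base_replicas:
        # This node does not own the base update: no N exists,
        # so no view update is sent.
        return None
    n = base_replicas.index(me)
    # The Nth base owner sends its update to the Nth view replica.
    return view_replicas[n] if n < len(view_replicas) else None


base = ["node-a", "node-b", "node-c"]
view = ["node-x", "node-y", "node-z"]
target_for_b = paired_view_replica("node-b", base, view)
target_for_d = paired_view_replica("node-d", base, view)
```

Under this simplification a non-owner already produces no view update, which is why an extra ownership filter in front of it adds nothing.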

Before this patch, the dtest test_mvs_populating_from_existing_data
fails when tablets are enabled. After this patch, it passes.

Fixes #16598

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-01-14 13:24:44 +02:00
Nadav Har'El
0fe40f729e mv: sleep a bit before view-update-generator restart
The "view update generator" is responsible for generating view updates
for staging sstables (such as coming from repair). If the processing
fails, the code retries - immediately. If there is some persistent bug,
such as issue #16598, we will have a tight loop of error messages,
potentially a gigabyte of identical messages every second.

In this patch we simply add a sleep of one second after view update
generation fails before retrying. We can still get many identical
error messages if there is some bug, but not more than one per second.
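
The retry-with-sleep behaviour described above, as a minimal sketch. The function name `run_with_retry` and its parameters are hypothetical; the actual change is in Scylla's C++ view update generator and is not reproduced here.

```python
import time
from typing import Callable, List


def run_with_retry(step: Callable[[], None],
                   retry_delay: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep,
                   log: Callable[[str], None] = print) -> int:
    """Run `step` until it succeeds; return the number of failed attempts."""
    failures = 0
    while True:
        try:
            step()
            return failures
        except Exception as exc:
            failures += 1
            log(f"view update generation failed: {exc}; "
                f"retrying in {retry_delay}s")
            # Sleeping here caps the log rate at one message per delay period.
            sleep(retry_delay)


# Usage: a step that fails twice, with a fake sleep so the example runs instantly.
attempts = {"count": 0}
delays: List[float] = []


def flaky_step() -> None:
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RuntimeError("persistent bug")


failures = run_with_retry(flaky_step, sleep=delays.append)
```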

Refs #16598.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-01-14 13:13:52 +02:00
Michał Jadwiszczak
f6a464ad81 configure service levels interval
So far the service levels interval, responsible for updating SL configuration,
was hardcoded in main.
Now it's extracted to `service_levels_interval_ms` option.
2024-01-12 10:28:24 +01:00
Kefu Chai
344ea25ed8 db: add fmt::format for db::consistency_level
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we

* define a formatter for `db::consistency_level`
* drop its `operator<<`, as it is not used anymore

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16755
2024-01-12 10:49:00 +02:00
Kefu Chai
54d49c04e0 db, sstable: bump up default sstable format to "md"
before this change, we default to the "mc" sstable format, and
switch to "md" if the cluster agrees on using it, and to
"me" if the cluster agrees on using this. the cluster feature
is used to get the consensus across the members in the cluster,
if any of the existing nodes in the cluster has its `sstable_format`
configured to, for instance, "mc", then the cluster is stuck with
"mc".

but we disabled the "mc" sstable format back in 3d345609, and the first LTS
release including that change was scylla v5.2.0. this means that a
cluster running the last major version of Scylla should be using "md" or
"me". per our upgrade documentation, see docs/upgrade/index.rst,

> You should perform the upgrades consecutively - to each
> successive X.Y version, without skipping any major or minor version.
>
> Before you upgrade to the next version, the whole cluster (each
> node) must be upgraded to the previous version.

we can assume that, a 6.x node will only join a cluster
with 5.x or 6.x nodes. (joining a 7.x cluster should work, but
this is not relevant to this change). in both cases, since
5.x and up scylla can only be configured with "md" `sstable_format`,
there is no need to switch from "mc" to "md" anymore. so we can
ditch the code supporting it.

Refs #16551
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-01-11 22:43:05 +08:00
Kefu Chai
be364d30fd db: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16664
2024-01-09 11:44:19 +02:00
Kefu Chai
34259a03d0 treewide: use consteval string as format string when formatting log message
seastar::logger is using the compile-time format checking by default if
compiled using {fmt} 8.0 and up. and it requires the format string to be
consteval string, otherwise we have to use `fmt::runtime()` explicitly.

so, to adapt to this change, let's use the consteval string when formatting
logging messages.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16612
2024-01-02 19:08:47 +02:00
Benny Halevy
c520fc23f0 system_keyspace: update_peer_info: drop single-column overloads
They are no longer used.
Instead, all callers now pass peer_info.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:34 +02:00
Benny Halevy
7670f60b83 system_keyspace: load_tokens/peers/host_ids: enforce presence of host_id
Skip rows that have no host_id to make
sure the node state we load always has a valid host_id.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:34 +02:00
Benny Halevy
74159bb5ae system_keyspace: drop update_tokens(endpoint, tokens) overload
It is unused now after the previous patch
to update_peer_info in one call.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:34 +02:00
Benny Halevy
b2735d47f7 system_keyspace: update_peer_info: use struct peer_info for all optional values
Define struct peer_info holding optional values
for all system.peers columns, allowing the caller to
update any column.

Pass the values as std::vector<std::optional<data_value>>
to query_processor::execute_internal.
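
The idea of a struct of optional column values can be sketched like this. It is an illustration only: the `PeerInfo` fields are a hypothetical subset of the `system.peers` columns, and building a statement from only the engaged fields is an assumption, not a description of how `execute_internal` treats disengaged values.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass
class PeerInfo:
    # Hypothetical subset of system.peers columns; every field is optional.
    data_center: Optional[str] = None
    rack: Optional[str] = None
    release_version: Optional[str] = None
    host_id: Optional[str] = None


def build_update(peer: str, info: PeerInfo) -> Tuple[str, List[str]]:
    """Build a parameterized statement touching only the engaged columns."""
    names = [f.name for f in fields(info) if getattr(info, f.name) is not None]
    if not names:
        raise ValueError("nothing to update")
    assignments = ", ".join(f"{name} = ?" for name in names)
    values = [getattr(info, name) for name in names] + [peer]
    return f"UPDATE system.peers SET {assignments} WHERE peer = ?", values


stmt, values = build_update("10.0.0.2", PeerInfo(rack="r1", host_id="abc"))
```

A single call can then update any combination of columns, which is what makes the single-column overloads unnecessary.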

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:30 +02:00
Benny Halevy
328ce23c78 types: add data_value_list
data_value_list is a wrapper around std::initializer_list<data_value>.
Use it for passing values to `cql3::query_processor::execute_internal`
and friends.

A following patch will add a std::variant for data_value_or_unset
and extend data_value_list to support unset values.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:17:27 +02:00
Benny Halevy
85b3232086 system_keyspace: get rid of update_cached_values
It's a no-op.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 10:10:51 +02:00
Benny Halevy
f64ecc2edf storage_service: do not update peer info for this node
system_keyspace had a hack to skip update_peer_info
for the local node, and then to remove an entry for
the local node in system.peers if `update_tokens(endpoint, ...)`
was called for this node.

This change unhacks system_keyspace by considering
update of system.peers with the local address as
an internal error and fixing the call sites that do that.

Fixes #16425

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 10:10:51 +02:00
Avi Kivity
6394854f04 Merge 'Some cleanups in tests for tablets + MV ' from Nadav Har'El
This small series improves two things in the multi-node tests for tablet support in materialized views:

1. The test for Alternator LSI, which "sometimes" could reproduce the bug by creating 10-node cluster with a random tablet distribution, is replaced by a reliable 2-node cluster which controls the tablet distribution. The new test also confirms that tablets are actually enabled in Alternator (reviewers of the original test noted it would be easy to pass the test if tablets were accidentally not enabled... :-)).
2. Simplify the tablet lookup code in the test to not go through a "table id", and look up the table's (or view's) name directly (requires a full-table scan of the tablets table, but that's entirely reasonable in a test).

The third patch in this series also fixes a comment typo discovered in a previous review.

Closes scylladb/scylladb#16440

* github.com:scylladb/scylladb:
  materialized views: fix typo in comment
  test_mv_tablets: simplify lookup of tablets
  alternator, tablets: improve Alternator LSI tablets test
2023-12-27 20:18:14 +02:00
Pavel Emelyanov
129196db98 schema_tables: Use new_keyspace() sugar
The create_keyspace_from_schema_partition code creates ks metadata
without schemas and user-types. There's new_keyspace() convenience
helper for such cases.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-26 13:26:58 +03:00
Pavel Emelyanov
ffdafe4024 keyspace_metadata: Add default value for new_keyspace's durable_writes
Almost all callers call new_keyspace with durable writes ON, so it's
worth having default value for it

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-26 11:47:37 +03:00
Pavel Emelyanov
c43501d973 locator,schema: Move initial tablets from r.s. options to params
The option is kept in DDL, but is _not_ stored in
system_schema.keyspaces. Instead, it's removed from the provided options
and kept in scylla_keyspaces table in its own column. All the places
that had optional initial_tablets disengaged now set this value up the
way they find appropriate.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-25 16:07:10 +03:00
Pavel Emelyanov
30e7273658 schema_tables: Relax extract_scylla_specific_ks_info() check
Nowadays reading scylla-specific info from schema happens under
respective schema feature. However (at least in raft case) when a new
node joins the cluster merging schema for the first time may happen
_before_ features are merged and enabled. Thus merging schema can go the
wrong way by erroneously skipping the scylla-specific info.

On the other hand, if system_schema.scylla_keyspaces is there it's
there, there's no reason _not_ to pick this data up in that case.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-25 16:05:01 +03:00
Pavel Emelyanov
a67c535539 keyspace_metadata: Carry optional<initial_tablets> on board
The object in question fully describes the keyspace to be created and,
among other things, contains replication strategy options. Next patches
move the "initial_tablets" option out of those options and keep it
separately, so the ks metadata should also carry this option separately.

This patch is _just_ extending the metadata creation API, in fact the
new field is unused (write-only) so all the places that need to provide
this data keep it disengaged and are explicitly marked with FIXME
comment. Next patches will fix that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-25 15:58:05 +03:00
Benny Halevy
060b16f987 view: apply_to_remote_endpoints: fix use-after-free
b815aa021c added a yield before
the trace point, causing the moved `frozen_mutation_and_schema`
(and `inet_address_vector_topology_change`) to drop out of scope
and be destroyed, as the rvalue-referenced objects aren't moved
onto the coroutine frame.

This change passes them by value rather than by rvalue-reference
so they will be stored in the coroutine frame.

Fixes #16540

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#16541
2023-12-24 21:43:48 +02:00
Nadav Har'El
6640278aa7 materialized views: fix typo in comment
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-24 10:12:44 +02:00
Tomasz Grabiec
9c7e5f6277 Merge 'Fix secondary index feature with tablets' from Nadav Har'El
Before this series, materialized views already work correctly on keyspaces with tablets, but secondary indexes do not. The goal of this series is to make CQL secondary indexes fully supported on tablets:

1. First we need to make CREATE INDEX work with tablets (it didn't before this series). Fixes #16396.
2. Then we need to keep the promise that our documentation makes - that **local** secondary index should be synchronously updated - Fixes #16371.

As you can see in the patches below, and as was expected already in the design phase, the code changes needed to make indexes support tablets were minimal. But writing reliable tests for these issues was the biggest effort that went into this series.

Closes scylladb/scylladb#16436

* github.com:scylladb/scylladb:
  secondary-index, tablets: ensure that LSI are synchronous
  test: add missing "tags" schema extension to cql_test_env
  mv, test: fix delay_before_remote_view_update injection point
  secondary index: fix view creation when using tablets
2023-12-21 23:37:00 +01:00
Avi Kivity
2853f79f96 virtual_tables: scope virtual tables registry in system_keyspace
Virtual tables are kept in a thread_local registry for deduplication
purposes. The problem is that thread_local variables are destroyed late,
possibly after the schema registry and the reactor are destroyed.
Currently this isn't a problem, but after a seastar change to
destroy the reactor after termination [1], things break.

Fix by moving the registry to system_keyspace. system_keyspace was chosen
since it was the birthplace of virtual tables.

Pimpl is used to avoid increasing dependencies.

[1] 101b245ed7
2023-12-21 16:19:42 +02:00
Nadav Har'El
7c5092cb8f test: add missing "tags" schema extension to cql_test_env
One of the unfortunate anti-features of cql_test_env (the framework used
in our CQL tests that are written in C++) is that it needs to repeat
various bizarre initialization steps done in main.cc, otherwise various
requests work incorrectly. One of these steps that main.cc performs is to initialize
various "schema extensions" which some of the Scylla features need to work
correctly.

We remembered to initialize some schema extensions in cql_test_env, but
forgot others. The one I will need in the following patch is the "tags"
extension, which we need to mark materialized views used by local
secondary indexes as "synchronous_updates" - without this patch the LSI
tests in secondary_index_test.cc will crash.

In addition to adding the missing extension, this patch also replaces
the segmentation-fault crash when it's missing (caused by a dynamic
cast failure) by a clearer on_internal_error() - so if we ever have
this bug again, it will be easier to debug.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-21 11:44:50 +02:00
Nadav Har'El
b815aa021c mv, test: fix delay_before_remote_view_update injection point
The "delay_before_remote_view_update" is a recently-added injection
point which should add a delay before remote view updates, but NOT
force the writer to wait for it (whether the writer waits for it or
not depends on whether the view is configured as synchronous or not).

Unfortunately, the delay was added at the WRONG place, which caused
it to sometimes be done even on asynchronous views, breaking (with
false-negative) the tests that need this delay to reproduce bugs of
missing synchronous updates (Refs #16371).

The fix here is even simpler than the (wrong) old code - we just add
the sleep to the existing function apply_to_remote_endpoints() instead
of making the caller even more complex.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-21 11:44:50 +02:00
Nadav Har'El
8181e28731 secondary index: fix view creation when using tablets
In commit 88a5ddabce, we fixed materialized
view creation to support tablets. We added to the function called to
create materialized views in CQL, prepare_new_view_announcement()
a missing call to the on_before_create_column_family() notifier that
creates tablets for this new view.

Unfortunately, we have the same problem when creating a secondary index,
because it does not use prepare_new_view_announcement(), and instead uses
a generic function to "update" the base table, which in some cases ends
up creating new views when a new index is requested. In this path, the
notifier did not get called, so we must add it here too.
Unfortunately, the notifiers must run in a Seastar thread, which means
that yet another function now needs to run in a Seastar thread.

Before this patch, creating a secondary index in a table using tablets
fails with "Tablet map not found for table <uuid>". With this patch,
it works.

The patch also includes tests for creating a regular and local secondary
index. Both tests fail (with the aforementioned error) before this
patch, and pass with it.

Fixes #16396

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-21 11:44:50 +02:00
Kamil Braun
6fcaec75db Merge 'Add maintenance socket' from Mikołaj Grzebieluch
It enables interaction with the node through CQL protocol without authentication. It gives full-permission access.
The maintenance socket is available by Unix domain socket with file permissions `755`, thus it is not accessible from outside of the node and from other POSIX groups on the node.
It is created before the node joins the cluster.

To set up the maintenance socket, use the `maintenance-socket` option when starting the node.

* If set to `ignore` maintenance socket will not be created.
* If set to `workdir` maintenance socket will be created in `<node's workdir>/cql.m`.
* Otherwise maintenance socket will be created in the specified path.

The default value is `ignore`.
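
The three cases above amount to a small path-resolution rule. A minimal sketch, assuming a hypothetical helper named `maintenance_socket_path` (this is not Scylla's code, which is C++):

```python
import os
from typing import Optional


def maintenance_socket_path(option: str, workdir: str) -> Optional[str]:
    """Resolve the `maintenance-socket` option to a socket path, or None."""
    if option == "ignore":
        return None  # the maintenance socket is not created
    if option == "workdir":
        return os.path.join(workdir, "cql.m")
    return option  # any other value is taken as an explicit path


default_path = maintenance_socket_path("ignore", "/var/lib/scylla")
workdir_path = maintenance_socket_path("workdir", "/var/lib/scylla")
custom_path = maintenance_socket_path("/run/scylla/cql.m", "/var/lib/scylla")
```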

* With python driver

```python
from cassandra.cluster import Cluster
from cassandra.connection import UnixSocketEndPoint
from cassandra.policies import HostFilterPolicy, RoundRobinPolicy

socket = "<node's workdir>/cql.m"
cluster = Cluster([UnixSocketEndPoint(socket)],
                  # Driver tries to connect to other nodes in the cluster, so we need to filter them out.
                  load_balancing_policy=HostFilterPolicy(RoundRobinPolicy(), lambda h: h.address == socket))
session = cluster.connect()
```

Merge note: apparently cqlsh does not support unix domain sockets; it
will have to be fixed in a follow-up.

Closes scylladb/scylladb#16172

* github.com:scylladb/scylladb:
  test.py: add maintenance socket test
  test.py: enable maintenance socket in tests by default
  docs: add maintenance socket documentation
  main: add maintenance socket
  main: refactor initialization of cql controller and auth service
  auth/service: don't create system_auth keyspace when used by maintenance socket
  cql_controller: maintenance socket: fix indentation
  cql_controller: add option to start maintenance socket
  db/config: add maintenance_socket_enabled bool class
  auth: add maintenance_socket_role_manager
  db/config: add maintenance_socket variable
2023-12-20 19:04:40 +02:00
Kamil Braun
ffb6ae917f Merge 'Add support for tablets in Alternator' from Nadav Har'El
This pull request adds support for tablets in Alternator, and particularly focuses on getting Alternator's GSI and LSI (i.e., materialized views) to work.

After this series, support for tablets in Alternator _mostly_ works, but not completely:
1. CDC doesn't yet work with tablets, and Alternator needs to provide CDC (known as "DynamoDB Streams").
2. Alternator's TTL feature was not tested with tablets, and probably doesn't work because it assumes the replication map belongs to a keyspace.

For these reasons, Alternator does not yet use tablets by default, and tablets must be enabled explicitly by adding an experimental tag to the new table. This will allow us to test Alternator with tablets even before it is ready for the limelight.

Fixes #16203
Fixes #16313

Closes scylladb/scylladb#16353

* github.com:scylladb/scylladb:
  mv, tablets, alternator: test for Alternator LSI with tablets
  mv: coroutinize wait code for remote view updates
  mv, test: add injection point to delay remove view update
  alternator: explicitly request synchronous updates for LSI
  alternator: fix view creation when using tablets
  alternator: add experimental method to create a table with tablets
2023-12-20 10:00:31 +01:00
Mikołaj Grzebieluch
cf43787295 db/config: add maintenance_socket_enabled bool class
2023-12-18 11:42:40 +01:00
Mikołaj Grzebieluch
e682e362a3 db/config: add maintenance_socket variable
If set to "ignore", maintenance socket will be disabled.
If set to "workdir", maintenance socket will be opened on <scylla's
workdir>/cql.m.
Otherwise it will be opened on path provided by maintenance_socket
variable.

It is set by default to 'ignore'.
2023-12-18 11:42:05 +01:00
Kamil Braun
3b108f2e31 Merge 'db: config: make consistent_cluster_management mandatory' from Patryk Jędrzejczak
We make `consistent_cluster_management` mandatory in 5.5. This
option will be always unused and assumed to be true.

Additionally, we make `override_decommission` deprecated, as this option
has been supported only with `consistent_cluster_management=false`.

Making `consistent_cluster_management` mandatory also simplifies
the code. Branches that execute only with
`consistent_cluster_management` disabled are removed.

We also update documentation by removing information irrelevant in 5.5.

Fixes scylladb/scylladb#15854

Note about upgrades: this PR does not introduce any more limitations
to the upgrade procedure than there are already. As in
scylladb/scylladb#16254, we can upgrade from the first version of Scylla
that supports the schema commitlog feature, i.e. from 5.1 (or
corresponding Enterprise release) or later. Assuming this PR ends up in
5.5, the documented upgrade support is from 5.4. For corresponding
Enterprise release, it's from 2023.x (based on 5.2), so all requirements
are met.

Closes scylladb/scylladb#16334

* github.com:scylladb/scylladb:
  docs: update after making consistent_cluster_management mandatory
  system_keyspace, main, cql_test_env: fix indendations
  db: config: make consistent_cluster_management mandatory
  test: boost: schema_change_test: replace disable_raft_schema_config
  db: config: make override_decommission deprecated
  db: config: make force_schema_commit_log deprecated
2023-12-18 09:44:52 +01:00
Nadav Har'El
37b5c03865 mv: coroutinize wait code for remote view updates
In the previous patch we added a delay injection point (for testing)
in the view update code. Because the code was using continuation style,
this resulted in increased indentation and ugly repetition of captures.

So in this patch we coroutinize the code that waits for remote view
updates, making it simpler, shorter, and less indented.

Note that this function still uses continuations in one place:
The remote view update is still composed of two steps that need
to happen one after another, but we don't necessarily need to wait
for them to happen. This is easiest to do with chaining continuations,
and then either waiting or not waiting for the resulting future.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-17 20:15:08 +02:00
Nadav Har'El
bf6848d277 mv, test: add injection point to delay remove view update
It's difficult to write a test (as we plan to do in the next patch)
that verifies that synchronous view updates are indeed synchronous, i.e.,
that write with CL=QUORUM on the base-table write returns only after
CL=QUORUM was also achieved in the view table. The difficulty is that in a
fast test machine, even if the synchronous-view-update is completely buggy,
it's likely that by the time the test reads from the view, all view updates
will have been completed anyway.

So in this patch we introduce an injection point, for testing, named
"delay_before_remote_view_update", which adds a delay before the base
replica sends its update to the remote view replica (in case the view
replica is indeed remote). As usual, this injection point isn't
configurable - when enabled it adds a fixed (0.5 second) delay, on all
view updates on all tables.
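
The general shape of such an injection point can be sketched as below. This is a toy model with hypothetical names (`enable_injection`, `inject_delay`); it does not mirror Scylla's error-injection API, only the behaviour described above: a named point that adds a fixed 0.5 second delay when enabled and does nothing otherwise.

```python
import time
from typing import Callable, List, Set

_enabled: Set[str] = set()


def enable_injection(name: str) -> None:
    """Turn on a named injection point (tests only)."""
    _enabled.add(name)


def inject_delay(name: str,
                 delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Sleep for the fixed delay if the point is enabled; report whether it fired."""
    if name not in _enabled:
        return False
    sleep(delay)
    return True


# Usage with a fake sleep so the example runs instantly.
slept: List[float] = []
fired_before = inject_delay("delay_before_remote_view_update", sleep=slept.append)
enable_injection("delay_before_remote_view_update")
fired_after = inject_delay("delay_before_remote_view_update", sleep=slept.append)
```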

The existing code used continuation-style Seastar programming, and the
addition of the injection point in this patch made it even uglier, so
in the next patch we will coroutine-ize this code.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-17 20:15:08 +02:00
Kefu Chai
81d5c4e661 db/system_keyspace: explicitly instantiate used template
future<std::optional<utils::UUID>>
system_keyspace::get_scylla_local_param_as<utils::UUID>(const sstring&)
is used by db/schema_tables.cc. so let's instantiate this template
explicitly.
otherwise we'd have following link failure:

```
: && /home/kefu/.local/bin/clang++ -ffunction-sections -fdata-sections -O3 -g -gz -Xlinker --build-id=sha1 -fuse-ld=lld -dynamic-linker=/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////lib64/ld-linux-x86-64.so.2 -Xlinker --gc-sections CMakeFiles/scylla_version.dir/Release/release.cc.o CMakeFiles/scylla.dir/Release/main.cc.o -o Release/scylla  Release/libscylla-main.a  api/Release/libapi.a  alternator/Release/libalternator.a  db/Release/libdb.a  cdc/Release/libcdc.a  compaction/Release/libcompaction.a  cql3/Release/libcql3.a  data_dictionary/Release/libdata_dictionary.a  gms/Release/libgms.a  index/Release/libindex.a  lang/Release/liblang.a  message/Release/libmessage.a  mutation/Release/libmutation.a  mutation_writer/Release/libmutation_writer.a  raft/Release/libraft.a  readers/Release/libreaders.a  redis/Release/libredis.a  repair/Release/librepair.a  replica/Release/libreplica.a  schema/Release/libschema.a  service/Release/libservice.a  sstables/Release/libsstables.a  streaming/Release/libstreaming.a  test/perf/Release/libtest-perf.a  thrift/Release/libthrift.a  tools/Release/libtools.a  transport/Release/libtransport.a  types/Release/libtypes.a  utils/Release/libutils.a  seastar/Release/libseastar.a  /usr/lib64/libboost_program_options.so.1.81.0  test/lib/Release/libtest-lib.a  Release/libscylla-main.a  -Xlinker --push-state -Xlinker --whole-archive  auth/Release/libscylla_auth.a  -Xlinker --pop-state  /usr/lib64/libcrypt.so  cdc/Release/libcdc.a  compaction/Release/libcompaction.a  mutation_writer/Release/libmutation_writer.a  
-Xlinker --push-state -Xlinker --whole-archive  dht/Release/libscylla_dht.a  -Xlinker --pop-state  index/Release/libindex.a  -Xlinker --push-state -Xlinker --whole-archive  locator/Release/libscylla_locator.a  -Xlinker --pop-state  message/Release/libmessage.a  gms/Release/libgms.a  sstables/Release/libsstables.a  readers/Release/libreaders.a  schema/Release/libschema.a  -Xlinker --push-state -Xlinker --whole-archive  tracing/Release/libscylla_tracing.a  -Xlinker --pop-state  service/Release/libservice.a  node_ops/Release/libnode_ops.a  service/Release/libservice.a  node_ops/Release/libnode_ops.a  raft/Release/libraft.a  repair/Release/librepair.a  streaming/Release/libstreaming.a  replica/Release/libreplica.a  /usr/lib64/libabsl_raw_hash_set.so.2308.0.0  /usr/lib64/libabsl_hash.so.2308.0.0  /usr/lib64/libabsl_city.so.2308.0.0  /usr/lib64/libabsl_bad_variant_access.so.2308.0.0  /usr/lib64/libabsl_low_level_hash.so.2308.0.0  /usr/lib64/libabsl_bad_optional_access.so.2308.0.0  /usr/lib64/libabsl_hashtablez_sampler.so.2308.0.0  /usr/lib64/libabsl_exponential_biased.so.2308.0.0  /usr/lib64/libabsl_synchronization.so.2308.0.0  /usr/lib64/libabsl_graphcycles_internal.so.2308.0.0  /usr/lib64/libabsl_kernel_timeout_internal.so.2308.0.0  /usr/lib64/libabsl_stacktrace.so.2308.0.0  /usr/lib64/libabsl_symbolize.so.2308.0.0  /usr/lib64/libabsl_malloc_internal.so.2308.0.0  /usr/lib64/libabsl_debugging_internal.so.2308.0.0  /usr/lib64/libabsl_demangle_internal.so.2308.0.0  /usr/lib64/libabsl_time.so.2308.0.0  /usr/lib64/libabsl_strings.so.2308.0.0  /usr/lib64/libabsl_int128.so.2308.0.0  /usr/lib64/libabsl_strings_internal.so.2308.0.0  /usr/lib64/libabsl_string_view.so.2308.0.0  /usr/lib64/libabsl_throw_delegate.so.2308.0.0  /usr/lib64/libabsl_base.so.2308.0.0  /usr/lib64/libabsl_spinlock_wait.so.2308.0.0  /usr/lib64/libabsl_civil_time.so.2308.0.0  /usr/lib64/libabsl_time_zone.so.2308.0.0  /usr/lib64/libabsl_raw_logging_internal.so.2308.0.0  
/usr/lib64/libabsl_log_severity.so.2308.0.0  -lsystemd  /usr/lib64/libz.so  /usr/lib64/libdeflate.so  types/Release/libtypes.a  utils/Release/libutils.a  /usr/lib64/libcryptopp.so  /usr/lib64/libboost_regex.so.1.81.0  /usr/lib64/libicui18n.so  /usr/lib64/libicuuc.so  /usr/lib64/libboost_unit_test_framework.so.1.81.0  seastar/Release/libseastar_perf_testing.a  /usr/lib64/libjsoncpp.so.1.9.5  interface/Release/libinterface.a  /usr/lib64/libthrift.so  db/Release/libdb.a  data_dictionary/Release/libdata_dictionary.a  cql3/Release/libcql3.a  transport/Release/libtransport.a  cql3/Release/libcql3.a  transport/Release/libtransport.a  lang/Release/liblang.a  /usr/lib64/liblua-5.4.so  -lm  rust/Release/libwasmtime_bindings.a  rust/librust_combined.a  /usr/lib64/libsnappy.so.1.1.10  mutation/Release/libmutation.a  seastar/Release/libseastar.a  /usr/lib64/libboost_program_options.so  /usr/lib64/libboost_thread.so  /usr/lib64/libboost_chrono.so  /usr/lib64/libboost_atomic.so  /usr/lib64/libcares.so  /usr/lib64/libcryptopp.so  /usr/lib64/libfmt.so.10.0.0  /usr/lib64/liblz4.so  -ldl  /usr/lib64/libgnutls.so  -latomic  /usr/lib64/libsctp.so  /usr/lib64/libyaml-cpp.so  /usr/lib64/libhwloc.so  //usr/lib64/liburing.so  /usr/lib64/libnuma.so  /usr/lib64/libxxhash.so && :
ld.lld: error: undefined symbol: seastar::future<std::optional<utils::UUID>> db::system_keyspace::get_scylla_local_param_as<utils::UUID>(seastar::basic_sstring<char, unsigned int, 15u, true> const&)
>>> referenced by schema_tables.cc:981 (./build/./db/schema_tables.cc:981)
>>>               schema_tables.cc.o:(db::schema_tables::merge_schema(seastar::sharded<db::system_keyspace>&, seastar::sharded<service::storage_proxy>&, gms::feature_service&, std::vector<mutation, std::allocator<mutation>>, bool)::$_1::operator()()) in archive db/Release/libdb.a
>>> referenced by schema_tables.cc:981 (./build/./db/schema_tables.cc:981)
>>>               schema_tables.cc.o:(db::schema_tables::recalculate_schema_version(seastar::sharded<db::system_keyspace>&, seastar::sharded<service::storage_proxy>&, gms::feature_service&)::$_0::operator()() const) in archive db/Release/libdb.a
>>> referenced by schema_tables.cc:981 (./build/./db/schema_tables.cc:981)
>>>               schema_tables.cc.o:(db::schema_tables::merge_schema(seastar::sharded<db::system_keyspace>&, seastar::sharded<service::storage_proxy>&, gms::feature_service&, std::vector<mutation, std::allocator<mutation>>, bool)::$_1::operator()() (.resume)) in archive db/Release/libdb.a
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
```

it seems that, without the explicit instantiation, clang-18
just inlines the body of the instantiated template function at the
caller site.
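The explicit-instantiation technique mentioned above can be sketched in a single file. This is only an illustration: `get_param_as`, `param_store.hh` and `param_store.cc` are hypothetical names, not the actual `system_keyspace` code.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// --- would live in a header (hypothetical param_store.hh) ---
// Callers in other translation units only ever see this declaration.
template <typename T>
std::optional<T> get_param_as(const std::string& raw);

// --- would live in one source file (hypothetical param_store.cc) ---
// The definition is invisible to other translation units, so they cannot
// instantiate the template themselves and must link against a symbol.
template <typename T>
std::optional<T> get_param_as(const std::string& raw) {
    std::istringstream in(raw);
    T value{};
    if (in >> value) {
        return value;
    }
    return std::nullopt;
}

// Explicit instantiation definitions: ask the compiler to emit these
// specializations as out-of-line symbols in this object file, whether or
// not it also inlines them at local call sites.
template std::optional<int> get_param_as<int>(const std::string&);
template std::optional<double> get_param_as<double>(const std::string&);

// Usage example for the sketch.
void example_usage() {
    assert(get_param_as<int>("42").value() == 42);
    assert(!get_param_as<int>("not a number").has_value());
}
```

If the two `template ...;` lines were missing, a caller in another translation unit would reference a symbol that no object file is guaranteed to define, which is the kind of `undefined symbol` error shown in the log.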

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16434
2023-12-17 15:12:05 +02:00
Kamil Braun
6a4106edf3 migration_manager: don't attach empty system.scylla_local mutation in migration request handler
In effb9fb3cb, the migration request handler
(called when a node requests schema pull) was extended with a
`system.scylla_local` mutation:
```
        cm.emplace_back(co_await self._sys_ks.local().get_group0_schema_version());
```

This mutation is empty if the GROUP0_SCHEMA_VERSIONING feature is
disabled.

Nevertheless, it turned out to cause problems during upgrades.
The following scenario shows the problem:

We upgrade from 5.2 to an enterprise version with the aforementioned patch.
In 5.2, `system.scylla_local` does not use schema commitlog.
After the first node upgrades to the enterprise version, it creates,
immediately on boot, a new enterprise-only table
(`system_replicated_keys.encrypted_keys`) -- the specific table is not
important, only the fact that a schema change is performed.
This happens before the restarting node notices other nodes being UP, so
the schema change is not immediately pushed to the other nodes.
Instead, soon after boot, the other non-upgraded nodes pull the schema
from the upgraded node.
The upgraded node attaches a `system.scylla_local` mutation to the
vector of returned mutations.
The non-upgraded nodes try to apply this vector of mutations. Because
some of these mutations are for tables that already use schema
commitlog, while the `system.scylla_local` table does not use schema
commitlog, this triggers the following error (even though the mutation
is empty):
```
    Cannot apply atomically across commitlog domains: system.scylla_local, system_schema.keyspaces
```
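The failure mode can be modelled with a small sketch. Everything here is an assumption made for illustration (`commitlog_domain`, `table_mutation`, `check_single_domain`); it is not the real apply path, it only shows why an empty mutation can still trip the check.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model: each table writes to exactly one commitlog.
enum class commitlog_domain { regular, schema };

struct table_mutation {
    std::string table;
    commitlog_domain domain;
    bool empty;  // an empty mutation still names its table
};

// Rejects a batch whose mutations span more than one commitlog domain.
// Emptiness is deliberately not consulted, mirroring the behaviour
// described in the scenario above.
void check_single_domain(const std::vector<table_mutation>& batch) {
    for (const auto& m : batch) {
        if (m.domain != batch.front().domain) {
            throw std::runtime_error(
                "Cannot apply atomically across commitlog domains: " +
                m.table + ", " + batch.front().table);
        }
    }
}

// Usage example: the batch as a non-upgraded node would see it, where
// system.scylla_local is not yet on the schema commitlog.
void example_usage() {
    std::vector<table_mutation> batch{
        {"system_schema.keyspaces", commitlog_domain::schema, false},
        {"system.scylla_local", commitlog_domain::regular, true},
    };
    bool rejected = false;
    try {
        check_single_domain(batch);
    } catch (const std::runtime_error&) {
        rejected = true;
    }
    assert(rejected);
}
```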

Fortunately, the fix is simple -- instead of attaching an empty
mutation, do not attach a mutation at all if the handler of migration
request notices that group0_schema_version is not present.

Note that group0_schema_version is only present if the
GROUP0_SCHEMA_VERSIONING feature is enabled, which happens only after
the whole upgrade finishes.
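The shape of the fix can be sketched as follows. The types and helpers below (`mutation`, `lookup_group0_schema_version`, `build_migration_reply`) are hypothetical stand-ins rather than the actual handler code; the point is only that an absent version contributes no element to the reply, instead of an empty one.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mutation: the table it targets plus a payload.
struct mutation {
    std::string table;
    std::string payload;
};

// Hypothetical lookup: yields a mutation only when a group 0 schema
// version is actually stored, and nothing otherwise.
std::optional<mutation> lookup_group0_schema_version(
        const std::optional<std::string>& stored_version) {
    if (!stored_version) {
        return std::nullopt;
    }
    return mutation{"system.scylla_local", *stored_version};
}

// Builds the reply to a schema pull. The extra mutation is appended only
// when it exists, so receivers never see an empty placeholder.
std::vector<mutation> build_migration_reply(
        std::vector<mutation> schema_mutations,
        const std::optional<std::string>& stored_version) {
    if (auto m = lookup_group0_schema_version(stored_version)) {
        schema_mutations.push_back(std::move(*m));
    }
    return schema_mutations;
}

// Usage example for the sketch.
void example_usage() {
    std::vector<mutation> base{mutation{"system_schema.keyspaces", "ks"}};
    // Feature not enabled yet: nothing extra is attached.
    assert(build_migration_reply(base, std::nullopt).size() == 1);
    // Feature enabled: the version mutation is attached.
    assert(build_migration_reply(base, std::string("v1")).size() == 2);
}
```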

Refs: scylladb/scylladb#16414

Not using "Fixes" because the issue will only be fixed once this PR is
merged to `master` and the commit is cherry-picked onto next-enterprise.

Closes scylladb/scylladb#16416
2023-12-14 22:58:13 +01:00
Patryk Jędrzejczak
dced4bb924 system_keyspace, main, cql_test_env: fix indentation
Broken in the previous patch.
2023-12-14 16:54:04 +01:00
Patryk Jędrzejczak
5ebfbf42bc db: config: make consistent_cluster_management mandatory
Code that executed only when consistent_cluster_management=false is
removed. In particular, after this patch:
- raft_group0 and raft_group_registry are always enabled,
- raft_group0::status_for_monitoring::disabled becomes unused,
- topology tests can only run with consistent_cluster_management.
2023-12-14 16:54:04 +01:00
Patryk Jędrzejczak
a54f9052fc db: config: make override_decommission deprecated
The override_decommission option is supported only when
consistent_cluster_management is disabled. In the following commit,
we make consistent_cluster_management mandatory, which makes
override_decommission unusable.
2023-12-14 16:54:04 +01:00