Commit Graph

39169 Commits

Author SHA1 Message Date
Botond Dénes
91e62413c8 tools/scylla-nodetool: implement version command 2023-10-04 05:27:09 -04:00
Botond Dénes
5ad9b1424c test/nodetool: introduce utils.check_nodetool_fails_with()
Checking that nodetool fails with a given message turned out to be a
common pattern, so extract the logic for checking this into a method of
its own. Refactor the existing tests to use it, instead of the
hand-coded equivalent.
2023-10-04 05:27:09 -04:00
Botond Dénes
644d91fe95 test/nodetool: return stdout of nodetool invokation
So the test can inspect it.
2023-10-04 05:09:49 -04:00
Botond Dénes
dd62299355 test/nodetool/rest_api_mock.py: fix request param matching
Turns out expected request params were dropped on the floor, so any
expected param matched any actual params.
2023-10-04 05:09:41 -04:00
Botond Dénes
4f66e0208b tools/scylla-nodetool: compact: remove --partition argument
This argument is not recognized by the current nodetool either. It is
mentioned only in our documentation, but it should be removed from there
too.
2023-10-04 05:08:33 -04:00
Botond Dénes
2ddf28b8e5 tools/scylla-nodetool: scylla_rest_client: add support delete method 2023-10-04 05:07:03 -04:00
Botond Dénes
7dc77d03af tools/scylla-nodetool: get rid of check_json_type()
This check is redundant. Originally it was intended to work around by
rapidjson using an assert by default to check that the fields have the
expected type. But turns out we already configure rapidjson to use a
plain exception in utils/rjson.hh, so check_json_type() is not needed
for graceful error handling.
2023-10-03 02:05:30 -04:00
Botond Dénes
fdecea5480 tools/scylla-nodetool: log more details for failed requests
Instead of the unhelpful "Unexpected reply status", log what the request
was and what is the response status code.
2023-10-03 02:05:30 -04:00
Botond Dénes
adb65e18a1 tools/scylla-*: use operation_option for positional options
Use operation_option to describe positional options. The structure used
before -- app_template::positional_option -- was not a good fit for
this, as it was designed to store a description that is immediately
passed to the boost::program_options subsystem and then discarded.
As such, it had a raw pointer member, which was expected to be
immediately wrapped by boost::shared_ptr<> by boost::program_options.
This produced memory leaks for tools, for options that ended up not
being used. To avoid this altogether, use operation_option, converting
to the app_template::positional_option at the last moment.
2023-10-03 02:05:30 -04:00
Botond Dénes
c252ff4f03 tools/utils: add support for operation aliases
Some operations may have additional names, beyond their "main". Add
support for this.
2023-10-03 02:05:30 -04:00
David Garcia
d543b96d18 docs: download iam csv files
docs: automate generation

docs: rm _data dir

fix: windows build

Closes scylladb/scylladb#15276
2023-10-02 12:28:56 +03:00
Botond Dénes
3e74432dbf Merge 'Sanitize storage_proxy API handlers' from Pavel Emelyanov
Registering API handlers for services need to
- happen next to the corresponding service's start
- use only the provided service, not any other ones (if needed, the handler's service can use its internal dependencies to do its job)
- get the service to handle requests via argument, not from http context (http context, in turn, is going _not_ to depend on anything)

The storage proxy handlers don't follow any of that rules, this PR fixes them

Closes scylladb/scylladb#15584

* github.com:scylladb/scylladb:
  api: Make storage_proxy handlers use proxy argument
  api: Change some static helpers to use proxy instead of ctx
  api: Pass sharded<storage_proxy> reference to storage_proxy handlers
  api: Start (and stop) storage_proxy API earlier
  api: Remove storage_service argument from storage_proxy setup
  api: Move storage_proxy/ endpoint using storage_service
  api: Remove storage_proxy.hh from storage_service.cc
  main: Initialize API server early
2023-10-02 12:28:56 +03:00
Benny Halevy
6dc1ac768d cql-pytest/test_select_from_mutation_fragments: disable compaction on test_table
Use NullCompactionStrategy for the test_table fixture
rather than using the `no_autocompaction_context`.

Besides being simpler, as regular compaction just comes in
the way for all tests that use `SELECT MUTATION_FRAGMENTS`

The latter would be problematic when we start run cql-pytest
test cases in parallel rather than in serial since it
will inadvertantly affect other test cases.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#15574
2023-10-02 10:28:59 +03:00
Michael Huang
1640f83fdc raft: Store snapshot update and truncate log atomically
In case the snapshot update fails, we don't truncate commit log.

Fixes scylladb/scylladb#9603

Closes scylladb/scylladb#15540
2023-09-29 17:57:49 +02:00
Pavel Emelyanov
2603605cd5 api: Make storage_proxy handlers use proxy argument
And stop using proxy reference from http context. After a while the
proxy dependency will be removed from http context

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-29 14:10:09 +03:00
Pavel Emelyanov
fc4335387a api: Change some static helpers to use proxy instead of ctx
There are some helpers in storage_proxy.cc that get proxy reference from
passed http context argument. Next patch will stop using ctx for that
purpose, so prepare in advance by making the helpers use proxy reference
argument directly

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-29 14:10:09 +03:00
Pavel Emelyanov
4910b4d5b7 api: Pass sharded<storage_proxy> reference to storage_proxy handlers
The goals is to make handlers use proxy argument instead of keeping
proxt as dependency on http context (other handlers are mostly such
already)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-29 14:10:09 +03:00
Pavel Emelyanov
bbba691931 api: Start (and stop) storage_proxy API earlier
Handlers can register as early as the used service is started

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-29 14:10:09 +03:00
Pavel Emelyanov
7ef7b05397 api: Remove storage_service argument from storage_proxy setup
The code setting up storage_proxy/ endpoints no longer needs
storage_service and related decoration

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-29 14:10:09 +03:00
Pavel Emelyanov
0eea513663 api: Move storage_proxy/ endpoint using storage_service
The storage_proxy/get_schema_version is served by storage_service, so it
should be in storage_service.cc instead

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-29 14:10:08 +03:00
Pavel Emelyanov
b5eb474d95 api: Remove storage_proxy.hh from storage_service.cc
Proxy is not used in storage service handlers

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-29 14:09:01 +03:00
Pavel Emelyanov
abf541cf29 main: Initialize API server early
Surprisingly, but the dependency-less API server context is initialized
somewhere in the middle of main. By that time some "real" services had
already started and should have the ability to register their endpoints,
so API context should be initialized way ahead. This patch places its
initialization next to prometheus init.

One thing that's not nice here is that API port listening remains where
it was before the patch, so for the external ... observer API
initialization doesn't change. Likely API should start listening for
connection early as well, but that's left for future patching.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-29 14:09:01 +03:00
Botond Dénes
ecceb554c3 Merge 'db/hints: Clean up hint_storage.cc' from Dawid Mędrek
This PR is the second step in refactoring the Hinted Handoff module. It cleans up the contents of the file `hint_storage.cc`. The biggest change is the transition from continuations to coroutines.

Refs #15358

Closes scylladb/scylladb#15496

* github.com:scylladb/scylladb:
  db/hints: Alias segment list in hint_storage.cc
  db/hints: Rename rebalance to rebalance_hints
  db/hints: Clean up rebalance() in hint_storage.cc
  db/hints: Coroutinize hint_storage.cc
  db/hints: Clean up remove_irrelevant_shards_directories() in hint_storage.cc
  db/hints: Clean up rebalance_segments() in hint_storage.cc
  db/hints: Clean up rebalance_segments_for() in hint_storage.cc
  db/hints: Clean up get_current_hints_segments() in hint_storage.cc
  db/hints: Rename scan_for_hints_dirs to scan_shard_hint_directories
  db/hints: Clean up scan_for_hints_dirs() in hint_storage.cc
  db/hints: Wrap hint_storage.cc in an anonymous namespace
2023-09-29 08:55:38 +03:00
Botond Dénes
5d8384eff0 Merge 'Fix test_fencing.py::test_fence_hints flakiness' from Kamil Braun
Add a REST API to reload Raft topology state without having to restart a node and use it in `test_fence_hints`. Restarting the node has undesired side effects which cause test flakiness; more details provided in commit messages.

Refactor the test a bit while at it.

Fixes: #15285

Closes scylladb/scylladb#15523

* github.com:scylladb/scylladb:
  test: test_fencing.py: enable hints_manager=trace logs in `test_fence_hints`
  test: test_fencing.py: reload topology through REST API in `test_fence_hints`
  test: refactor test_fencing.py
  api: storage_service: add REST API to reload topology state
2023-09-28 16:30:23 +03:00
Benny Halevy
3709a43ccc cql-pytest.nodetool: no_autocompaction_context: support ks.tbl syntax
Allow disabling auto-compaction for given table(s)
using either the ks.table syntax or ks:table (as the
api suggests).

The first syntax would likely be more common since
the test tables we automatically create are named
as test_keyspace.test_table so we can pass that name
to `no_autocompaction_context` as is.

test_tools.system_scylla_local_sstable_prepared was
modified to disable auto-compaction only only
the `system.scylla_local` table rather than
the whole `system` keyspace, since it only relies
on this table. Plus, it helps test this change :)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#15575
2023-09-28 13:59:48 +03:00
Kamil Braun
53c01b121a test: test_fencing.py: enable hints_manager=trace logs in test_fence_hints
Enable TRACE level logging on the server that's supposed to send the
hints. Should make it easier to debug failures in the future, if any
happen again.
2023-09-28 11:59:17 +02:00
Kamil Braun
706734f76c test: test_fencing.py: reload topology through REST API in test_fence_hints
Restarting a node in order to reload topology may have side effects that
lead to test flakiness. While the node is shutting down, it gives up
leadership. Before it finishes shutting down, another node may become
Raft group 0 leader, then topology coordinator, then send a topology
command, triggering topology state reload on the shutting down node,
causing its topology version to get updated, allowing it to send a
successful hint before it shuts down and restarts. After it restarts, no
more hints will be sent, so the metrics condition we're waiting for (for
a hint to be sent) will never become true (metrics are not persisted
between restarts).

Instead of restarting, reload topology state through the new REST API.
This also makes the test a bit faster.

Fixes #15285
2023-09-28 11:59:17 +02:00
Kamil Braun
02dd297ba1 test: refactor test_fencing.py
- use `manager.get_cql()` to silence mypy (`manager.cql` is `Optional`)
- extract `metrics.lines_by_prefix('scylla_hints_manager_')` to a helper
  function
- when waiting for conditions on metrics, split the condition into
  safety and liveness part, and fail early if the safety part does not
  hold
- in `exactly_one_hint` send, don't check that `send_errors_metric` is
  `0` (it won't be after the next commit)
2023-09-28 11:59:17 +02:00
Kamil Braun
992f1327d3 api: storage_service: add REST API to reload topology state
Some tests may want to modify system.topology table directly. Add a REST
API to reload the state into memory. An alternative would be restarting
the server, but that's slower and may have other side effects undesired
in the test.

The API can also be called outside tests, it should not have any
observable effects unless the user modifies `system.topology` table
directly (which they should never do, outside perhaps some disaster
recovery scenarios).
2023-09-28 11:59:16 +02:00
Kamil Braun
060f2de14e Merge 'Cluster features on raft: new procedure for joining group 0' from Piotr Dulikowski
This PR implements a new procedure for joining nodes to group 0, based on the description in the "Cluster features on Raft (v2)" document. This is a continuation of the previous PRs related to cluster features on raft (https://github.com/scylladb/scylladb/pull/14722, https://github.com/scylladb/scylladb/pull/14232), and the last piece necessary to replace cluster feature checks in gossip.

Current implementation relies on gossip shadow round to fetch the set of enabled features, determine whether the node supports all of the enabled features, and joins only if it is safe. As we are moving management of cluster features to group 0, we encounter a problem: the contents of group 0 itself may depend on features, hence it is not safe to join it unless we perform the feature check which depends on information in group 0. Hence, we have a dependency cycle.

In order to solve this problem, the algorithm for joining group 0 is modified, and verification of features and other parameters is offloaded to an existing node in group 0. Instead of directly asking the discovery leader to unconditionally add the node to the configuration with `GROUP0_MODIFY_CONFIG`, two different RPCs are added: `JOIN_NODE_REQUEST` and `JOIN_NODE_RESPONSE`. The main idea is as follows:

- The new node sends `JOIN_NODE_REQUEST` to the discovery leader. It sends a bunch of information describing the node, including supported cluster features. The discovery leader verifies some of the parameters and adds the node in the `none` state to `system.topology`.
- The topology coordinator picks up the request for the node to be joined (i.e. the node in `none` state), verifies its properties - including cluster features - and then:
	- If the node is accepted, the coordinator transitions it to `boostrap`/`replace` state and transitions the topology to `join_group0` state. The node is added to group 0 and then `JOIN_NODE_RESPONSE` is sent to it with information that the node was accepted.
	- Otherwise, the node is moved to `left` state, told by the coordinator via `JOIN_NODE_RESPONSE` that it was rejected and it shuts down.

The procedure is not retryable - if a node fails to do it from start to end and crashes in between, it will not be allowed to retry it with the same host_id - `JOIN_NODE_REQUEST` will fail. The data directory must be cleared before attempting to add it again (so that a new host_id is generated).

More details about the procedure and the RPC are described in `topology-over-raft.md`.

Fixes: #15152

Closes scylladb/scylladb#15196

* github.com:scylladb/scylladb:
  tests: mark test_blocked_bootstrap as skipped
  storage_service: do not check features in shadow round
  storage_service: remove raft_{boostrap,replace}
  topology_coordinator: relax the check in enable_features
  raft_group0: insert replaced node info before server setup
  storage_service: use join node rpc to join the cluster
  topology_coordinator: handle joining nodes
  topology_state_machine: add join_group0 state
  storage_service: add join node RPC handlers
  raft: expose current_leader in raft::server
  storage_service: extract wait_for_live_nodes_timeout constant
  raft_group0: abstract out node joining handshake
  storage_service: pass raft_topology_change_enabled on rpc init
  rpc: add new join handshake verbs
  docs: document the new join procedure
  topology_state_machine: add supported_features to replica_state
  storage_service: check destination host ID in raft verbs
  group_state_machine: take reference to raft address map
  raft_group0: expose joined_group0
2023-09-28 11:45:09 +02:00
Dawid Medrek
a870eeb2ab db/hints: Alias segment list in hint_storage.cc
Naming the type should improve readability.
2023-09-27 18:49:08 +02:00
Dawid Medrek
aba85c9c98 db/hints: Rename rebalance to rebalance_hints
The new name conveys the idea clearly.
2023-09-27 18:49:08 +02:00
Dawid Medrek
64f4b825d3 db/hints: Clean up rebalance() in hint_storage.cc
This commit fixes indentation and formatting after
recent changes in the file.
2023-09-27 18:49:04 +02:00
Dawid Medrek
b662756256 db/hints: Coroutinize hint_storage.cc 2023-09-27 18:47:38 +02:00
Dawid Medrek
17e763a83a db/hints: Clean up remove_irrelevant_shards_directories() in hint_storage.cc
This commit makes the function abide by the limit of 120 characters
per line and stops unnecessarily calling c_str() on seastar::sstring.
2023-09-27 18:45:01 +02:00
Dawid Medrek
73d02cfcef db/hints: Clean up rebalance_segments() in hint_storage.cc
This commit makes the function less compact and turns overly
long lines into shorter ones to improve the readability of
the code.
2023-09-27 18:45:01 +02:00
Dawid Medrek
479f4d1ad3 db/hints: Clean up rebalance_segments_for() in hint_storage.cc
This commit makes the function less compact and abides by the limit
of 120 characters per line; that makes the code more readable.
We start using fmt::to_string instead of seastar::format("{:d"})
to convert strings to integers -- the new way is the preferred one.
The changes also name variables in a more descriptive way.
2023-09-27 18:45:01 +02:00
Dawid Medrek
a1df8dbf1c db/hints: Clean up get_current_hints_segments() in hint_storage.cc
This commit makes the function less compact and abides by the limit
of 120 characters per line. That makes the code more readable.
It also doesn't unnecessarily call c_str() on seastar::sstring.
2023-09-27 18:45:01 +02:00
Dawid Medrek
1fccd34dba db/hints: Rename scan_for_hints_dirs to scan_shard_hint_directories
The new name better conveys which directories the function should scan.
2023-09-27 18:45:01 +02:00
Dawid Medrek
8e94074b85 db/hints: Clean up scan_for_hints_dirs() in hint_storage.cc
There is no need to call c_str() on the name of the directory entry.
In fact, the used overload std::stoi() takes an std::string as its
argument. Providing seastar::sstring instead of const char* is more
efficient because we can allocate just the right amount of memory
and std::memcpy it, i.e. call std::string(const char*, std::size_t).
Using the overload std::string(const char*) would need to first
traverse the string to find the null byte.

This is a small change, all the more because paths don't tend to
be long, but it's some gain nonetheless.

The commit also inserts a few empty lines to make the code less
compact and improve readability as a result.
2023-09-27 18:45:01 +02:00
Dawid Medrek
7c68882578 db/hints: Wrap hint_storage.cc in an anonymous namespace
An anonymous namespace is a safer mechanism than the static
keyword. When adding a new piece of code, it's easy to
forget about adding the static. In that case, that code
might undergo external linkage. However, when code is put
in an anonymous namespace (when it should not), the linker
will immediately detect it (in most cases), and
the programmer will be able to spot and fix their mistake
right away.
2023-09-27 18:41:41 +02:00
Piotr Dulikowski
2c17f81f44 tests: mark test_blocked_bootstrap as skipped
With the new procedure to join nodes, testing the scenario in
`test_blocked_bootstrap` becomes very tricky. To recap, the test does
the following:

- Starts a 3-node cluster,
- Shuts down node 1,
- Tries to replace node 1 with node 4, but an error injection is
  triggered which causes node 4 to fail after it joins group 0. Note
  that pre-JOIN_NODE handshake, this would only result in node 4 being
  added to group 0 config, but no modification to the group 0 state
  itself is being done - the joining node is supposed to write a request
  to join.
- Tries to replace node 1 again with node 5, which should succeed.

The bug that this regression test was supposed to check for was that
node 5 would try to resolve all IPs of nodes added to group 0 config.
Because node 4 shuts down before advertising itself in gossip, the node
5 would get stuck.

The new procedure to join group 0 complicates the situation because a
request to join is written first to group 0 and only then the topology
coordinator modifies the group 0 config. It is possible to add an error
injection to the topology coordinator code so that it doesn't change the
group 0 state and proceeds with bootstrapping the node, but it will only
get stuck trying to add the node. If node 5 tries to join in the
meantime, the topology coordinator may switch to it and try to bootstrap
it instead, but this is basically a 50% chance because it depends on the
order of node 4 and node 5's host IDs in the topology_state_machine
struct.

It should be possible to fix the test with error recovery, but until
then it is marked as skipped.
2023-09-27 15:53:15 +02:00
Piotr Dulikowski
11ab7c3853 storage_service: do not check features in shadow round
The new joining procedure safely checks compatibility of
supported/enabled features, therefore there is no longer any need to do
it in the gossip shadow round.
2023-09-27 15:53:15 +02:00
Piotr Dulikowski
bf5059e83c storage_service: remove raft_{boostrap,replace}
The functionality of `raft_bootstrap` and `raft_replace` is handled by
the new handshake, so those functions can be removed.
2023-09-27 15:53:15 +02:00
Piotr Dulikowski
9a829ddf97 topology_coordinator: relax the check in enable_features
Currently, `enable_features` requires that there are no topology in
progress and there are no nodes waiting to be joined. Now, after the new
handshake is implemented, we can drop the second condition because nodes
in `none` state are not a part of group 0 yet.

Additionally, the comments inside `enable_features` are clarified so
that they explain why it's safe to only include normal features when
doing the barrier and calculating features to enable.
2023-09-27 15:53:15 +02:00
Piotr Dulikowski
3ee3699a9c raft_group0: insert replaced node info before server setup
Currently, information about replaced node is put into the raft address
map after joining group 0 via `join_group0`. However, the new handshake
which happens when joining group 0 needs to read the group 0 state (so
that it can wait until it sees all normal nodes as UP). Loading the
topology state to memory involves resolving IP addresses of the normal
nodes, so the information about replaced node needs to be inserted
before the handshake happens.

This commit moves insertion of the replace node's data before the call
to `join_group0`.
2023-09-27 15:53:15 +02:00
Piotr Dulikowski
41a22f6e3b storage_service: use join node rpc to join the cluster
Now, the storage_service uses new RPCs to join the cluster. A new
handshaker is implemented and passed to group0 in order to make it
happen.
2023-09-27 15:53:15 +02:00
Piotr Dulikowski
862b6e61a4 topology_coordinator: handle joining nodes
The topology coordinator is updated to perform verification of joining
nodes and to send `JOIN_NODE_RESPONSE` RPC back to the joining node.
2023-09-27 15:53:15 +02:00
Piotr Dulikowski
5ba2bfa015 topology_state_machine: add join_group0 state
Currently, when the topology coordinator notices a request to join or
replace a node, the node is transitioned to an appropriate state and the
topology is moved to commit_new_generation/write_both_read_old, in a
single group 0 operation. In later commits, the topology coordinator
will accept/reject nodes based on the request, so we would like to have
a separate step - topology coordinator accepts, transitions to bootstrap
state, tells the node that it is accepted, and only then continues with
the topology transition.

This commits adds a new `join_group0` transition state that precedes
`commit_cdc_generation`.
2023-09-27 15:53:15 +02:00
Piotr Dulikowski
bb40c2a8b8 storage_service: add join node RPC handlers 2023-09-27 15:53:13 +02:00