Commit Graph

7372 Commits

Author SHA1 Message Date
Avi Kivity
7197d280b0 Merge 'scylla-gdb.py: lazy-evaluate the constants ' from Kefu Chai
instead of evaluating the constants in-class, accessing them via
a cached class property.

it would be handy if we could source `scylla-gdb.py` in `.gdbinit`,
but this script accesses some symbols which are not available without
a file being debugged. what's why gdb fails to load the init script:

```
Traceback (most recent call last):
  File "/home/kefu/dev/scylladb/scylla-gdb.py", line 167, in <module>
    class intrusive_slist:
  File "/home/kefu/dev/scylladb/scylla-gdb.py", line 168, in intrusive_slist
    size_t = gdb.lookup_type('size_t')
             ^^^^^^^^^^^^^^^^^^^^^^^^^
gdb.error: No type named size_t.
```

so we have to `file path/to/scylla` and *then*
`source scylla-gdb.py` every time when we debug scylla or a seastar
application, instead of loading `scylla-gdb.py` in `.gdbinit`.

the reason is that the script accesses the debug symbols like
`gdb.lookup_type('size_t')` in-class. so when the python interpreter
reads the script, it evaluates this statement, but at that moment,
the debug symbols are not loaded, so `source scylla-gdb.py` fails
in `.gdbinit`.

in this change, we transform all these class variables to cached
properties, so that they

* are evaluated on-demand
* are evaluated only once at most

this addresses the pain at the expense of verbosity.

---

this change intends to improve the developer's user experience, and has no impacts on product, so no need to backport.

Closes scylladb/scylladb#20334

* github.com:scylladb/scylladb:
  test/scylla_gdb: test the .gdb init use case
  scylla-gdb.py: lazy-evaluate the constants
2024-09-01 20:00:53 +03:00
Pavel Emelyanov
7df43312ac test: Remove sstable making helpers from table_for_tests
All users of it have sstable_test_env at hand (in fact -- they call env
method to get table_for_test). And since sstable_test_env already has a
bunch of methods to create sstable, the table_for_test wrapper doesn't
need to duplicate this code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#20360
2024-09-01 19:58:15 +03:00
Kefu Chai
e431b90145 test/boost/view_build_test: include used header
before this change, when building the test of `view_build_test` with
clang-20, we can have following build failure:

```
FAILED: test/boost/CMakeFiles/view_build_test.dir/Debug/view_build_test.cc.o
/home/kefu/.local/bin/clang++ -DBOOST_ALL_DYN_LINK -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TESTING_MAIN -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/build=. -march=westmere -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT test/boost/CMakeFiles/view_build_test.dir/Debug/view_build_test.cc.o -MF test/boost/CMakeFiles/view_build_test.dir/Debug/view_build_test.cc.o.d -o test/boost/CMakeFiles/view_build_test.dir/Debug/view_build_test.cc.o -c /home/kefu/dev/scylladb/test/boost/view_build_test.cc
/home/kefu/dev/scylladb/test/boost/view_build_test.cc:998:5: error: unknown type name 'simple_schema'
  998 |     simple_schema ss;
      |     ^
```

apparently, `simple_schema`'s declaration is not available in this
translation unit.

in this change

* we include the header where `simple_schema` is defined, so that
  the build passes with clang-20.
* also take this opportunity to reorder the header a little bit,
  so the testing headers are grouped together.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20367
2024-09-01 18:58:23 +03:00
Kefu Chai
753188c33d test: include seastar/testing/random.hh when appropriate
in a recent seastar change (644bb662), we do not include
`seastar/testing/random.hh` in `seastar/testing/test_runner.hh` anymore,
as the latter is not a facade of the former, and neither does it use the
former. as a sequence, some tests which take the advantage of the
included `seastar/testing/random.hh` do not build with the latest
seastar:

```
FAILED: test/lib/CMakeFiles/test-lib.dir/key_utils.cc.o
/usr/bin/clang++ -DBOOST_REGEX_DYN_LINK -DBOOST_REGEX_NO_LIB -DBOOST_UNIT_TEST_FRAMEWORK_DYN_LINK -DBOOST_UNIT_TEST_FRAMEWORK_NO_LIB -DDEVEL -DFMT_SHARED -DSCYLLA_BUILD_MODE=dev -DSCYLLA_ENABLE_ERROR_INJECTION -DSCYLLA_ENABLE_PREEMPTION_SOURCE -DSEASTAR_API_LEVEL=7 -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/__w/scylladb/scylladb -I/__w/scylladb/scylladb/build/gen -I/__w/scylladb/scylladb/seastar/include -I/__w/scylladb/scylladb/build/seastar/gen/include -I/__w/scylladb/scylladb/build/seastar/gen/src -I/__w/scylladb/scylladb/build -isystem /__w/scylladb/scylladb/abseil -isystem /__w/scylladb/scylladb/build/rust -O2 -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/__w/scylladb/scylladb/build=. -march=westmere -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -MD -MT test/lib/CMakeFiles/test-lib.dir/key_utils.cc.o -MF test/lib/CMakeFiles/test-lib.dir/key_utils.cc.o.d -o test/lib/CMakeFiles/test-lib.dir/key_utils.cc.o -c /__w/scylladb/scylladb/test/lib/key_utils.cc
In file included from /__w/scylladb/scylladb/test/lib/key_utils.cc:11:
/__w/scylladb/scylladb/test/lib/random_utils.hh:25:30: error: no member named 'local_random_engine' in namespace 'seastar::testing'
   25 |     return seastar::testing::local_random_engine;
      |            ~~~~~~~~~~~~~~~~~~^
1 error generated.
```

in this change, we include `seastar/testing/random.hh` when the random
facility is used, so that they can be compiled with the latest seastar
library.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20368
2024-09-01 18:57:07 +03:00
Kamil Braun
e01cef01a6 Merge 'Ignore seed name resolution errors during the restart of a cluster member node.' from Sergey Zolotukhin
All seeds hostname resolution errors will be ignored during a node
restart in case the node had already joined a cluster.  This will
prevent restart errors if some seed names are not resolvable.

Fixes scylladb/scylladb#14945

Closes scylladb/scylladb#20292

* github.com:scylladb/scylladb:
  Ignore seed name resolution errors on restart.
  Add a test for starting with a wrong seed.
2024-08-30 11:33:44 +02:00
Kamil Braun
292ef0d1f9 Merge 'Fix node replace with inter-dc encryption enabled.' from Gleb Natapov
Currently if a coordinator and a node being replaced are in the same DC
while inter-dc encryption is enabled (connections between nodes in the
same DC should not be encrypted) the replace operation will fail. It
fails because a coordinator uses non encrypted connection to push raft
data to the new node, but the new node will not accept such connection
until it knows which DC the coordinator belongs to and for that the raft
data needs to be transferred.

The series adds the test for this scenario and the fix for the
chicken&egg problem above.

The series (or at least the fix itself) needs to be backported because
this is a serious regression.

Fixes: scylladb/scylladb#19025

Closes scylladb/scylladb#20290

* github.com:scylladb/scylladb:
  topology coordinator: fix indentation after the last patch
  topology coordinator: do not add replacing node without a ring to topology
  test: add test for replace in clusters with encryption enabled
  test.py: add server encryption support to cluster manager
  .gitignore: fix pattern for resources to match only one specific directory
2024-08-30 11:29:05 +02:00
Kefu Chai
82fbe317ec test/scylla_gdb: test the .gdb init use case
before this change, we run all the tests in a single pytest session,
with scylladb debug symbols loaded. but we want to test another use
case, where the scylladb debug symbols are missing.

in this change,

* we do not check for the existence of debug symbols until necessary
* add a mark named "without_scylla"
* run the tests in two pytest sessions
  - one with "without_scylla" mark
  - one with "not without_scylla" mark
* add a test which is marked with the "without_scylla" mark. the test
  verify that the scylla-gdb.py script can be loaded even without
  scylladb debug symbols.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-08-30 17:05:29 +08:00
Pavel Emelyanov
cec4d207f6 Merge 'repair: throw if batchlog manager isn't initialized' from Aleksandra Martyniuk
repair_service::repair_flush_hints_batchlog_handler may access batchlog
manager while it is uninitialized.

Throw if batchlog manager isn't initialized.

Fixes:  #20236.

Needs backport to 6.0 and 6.1 as they suffer from the uninitialized bm access.

Closes scylladb/scylladb#20251

* github.com:scylladb/scylladb:
  test: add test to ensure repair won't fail with uninitialized bm
  repair: throw if batchlog manager isn't initialized
2024-08-30 11:37:24 +03:00
Botond Dénes
9f9346fc59 Merge 'nodetool: tasks: add nodetool commands to track task manager tasks' from Aleksandra Martyniuk
Add nodetool commands to manage task manager tasks:
- tasks abort - aborts the task
- tasks list - lists all tasks in the module
- tasks modules - lists all modules
- tasks set-ttl - sets task ttl
- tasks status - gets status of the task
- tasks tree - gets statuses of the task and all its desendent's
- tasks ttl - gets task ttl
- tasks wait - waits for the task and gets its status

Fixes: https://github.com/scylladb/scylladb/issues/19201.

Closes scylladb/scylladb#19614

* github.com:scylladb/scylladb:
  test: nodetool: add tests for tasks commands
  nodetool: tasks: add nodetool commands to track task manager tasks
  api: task_manager: return status 403 if a task is not abortable
  api: task_manager: return none instead of empty task id
  api: task_manager: add timeout to wait_task
  api: task_manager: add operation to get ttl
  nodetool: add suboperations support
  nodetool: change operations_with_func type
  nodetool: prepare operation related classes for suboperations
2024-08-30 07:37:37 +03:00
Avi Kivity
67b24859bc Merge 'generic_server: convert connection tracking to seastar::gate' from Laszlo Ersek
~~~
generic_server: convert connection tracking to seastar::gate

If we call server::stop() right after "server" construction, it hangs:

With the server never listening (never accepting connections and never
serving connections), nothing ever calls server::maybe_stop().
Consequently,

    co_await _all_connections_stopped.get_future();

at the end of server::stop() deadlocks.

Such a server::stop() call does occur in controller::do_start_server()
[transport/controller.cc], when

- cserver->start() (sharded<cql_server>::start()) constructs a
  "server"-derived object,

- start_listening_on_tcp_sockets() throws an exception before reaching
  listen_on_all_shards() (for example because it fails to set up client
  encryption -- certificate file is inaccessible etc.),

- the "deferred_action"

      cserver->stop().get();

  is invoked during cleanup.

(The cserver->stop() call exposing the connection tracking problem dates
back to commit ae4d5a60ca ("transport::controller: Shut down distributed
object on startup exception", 2020-11-25), and it's been triggerable
through the above code path since commit 6b178f9a4a
("transport/controller: split configuring sockets into separate
functions", 2024-02-05).)

Tracking live connections and connection acceptances seems like a good fit
for "seastar::gate", so rewrite the tracking with that. "seastar::gate"
can be closed (and the returned future can be waited for) without anyone
ever having entered the gate.

NOTE: this change makes it quite clear that neither server::stop() nor
server::shutdown() must be called multiple times. The permitted sequences
are:

- server::shutdown() + server::stop()

- or just server::stop().

Fixes #10305

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
~~~

Fixes #10305.

I think we might want to backport this -- it fixes a hang-on-misconfiguration which affects `scylla-6.1.0-0.20240804.abbf0b24a60c.x86_64` minimally. Basically every release that contains commit ae4d5a60ca has a theoretical chance for the hang, and every release that contains commit 6b178f9a4a has a practical chance for the hang.

Focusing on the more practical symptom (i.e., releases containing commit 6b178f9a4a), `git tag --contains 6b178f9a4a90` gives us (ignoring candidates and release candidates):
- scylla-6.0.0
- scylla-6.0.1
- scylla-6.0.2
- scylla-6.1.0

Closes scylladb/scylladb#20212

* github.com:scylladb/scylladb:
  generic_server: make server::stop() idempotent
  generic_server: coroutinize server::shutdown()
  generic_server: make server::shutdown() idempotent
  test/generic_server: add test case
  configure, cmake: sort the lists of boost unit tests
  generic_server: convert connection tracking to seastar::gate
2024-08-29 19:45:48 +03:00
Aleksandra Martyniuk
1f46cad5de test: nodetool: add tests for tasks commands 2024-08-29 17:37:13 +02:00
Avi Kivity
7da3314deb Merge 'Integrated restore' from Ernest Zaslavsky
Handed over from https://github.com/scylladb/scylladb/pull/20149

This adds minimal implementation of the start-restore API call.

The method starts a task that runs load-and-stream functionality against sstables from S3 bucket. Arguments are:

```
endpoint -- the ID in object_store.yaml config file
bucket -- the target bucket to get objects from
keyspace -- the keyspace to work on
table -- the table to work on
snapshot -- the name of the snapshot from which the backup was taken
```
The task runs in the background, its task_id is returned from the method once it's spawned and it should be used via /task_manager API to track the task execution and completion.

Remote sstables components are scanned as if they were placed in local upload/ directory. Then colelcted sstables are fed into load-and-stream.

This branch has https://github.com/scylladb/scylladb/pull/19890 (Integrated backup), https://github.com/scylladb/scylladb/pull/20120 (S3 lister) and few more minor PRs merged in. The restore branch itself starts with [utils: Introduce abstract (directory) lister](29c867b54d) commit.

refs: https://github.com/scylladb/scylladb/issues/18392

Closes scylladb/scylladb#20305

* github.com:scylladb/scylladb:
  tools/scylla-nodetool: add restore integration
  test/object_store: Add simple restore test
  test/object_store: Generalize prepare_snapshot_for_backup()
  code: Introduce restore API method
  sstable_loader: Add sstables::storage_manager dependency
  sstable_loader: Maintain task manager module
  sstable_loader: Out-line constructor
  distributed_loader: Split get_sstables_from_upload_dir()
  sstables/storage: Compose uploaded sstable path simpler
  sstable_directory: Prepare FS lister to scan files on S3
  sstable_directory: Parse sstable component without full path
  s3-client: Add support for lister::filter
  utils: Introduce abstract (directory) lister
2024-08-29 18:25:30 +03:00
Kamil Braun
9574c399ce Merge 'add support for zero-token nodes' from Patryk Jędrzejczak
We revive the `join_ring` option. We support it only in the
Raft-based topology, as we plan to remove the gossip-based topology
when we fix the last blocker - the implementation of the manual
recovery tool. In the Raft-based topology, a node can be assigned
tokens only once when it joins the cluster. Hence, we disallow
joining the ring later, which is possible in Cassandra.

The main idea behind the solution is simple. We make the unsupported
special case of zero tokens a supported normal case. Nodes with zero
tokens assigned are called "zero-token nodes" from now on.

From the topology point of view, zero-token nodes are the same as
token-owning nodes. They can be in the same states, etc. From the
data point of view, they are different. They are not members of
the token ring, so they are not present in
`token_metadata::_normal_token_owners`. Hence, they are ignored in
all non-local replication strategies. The tablet load balancer also
ignores them.

Zero-token nodes can be used as coordinator-only nodes, just like in
Cassandra. They can handle requests just like token-owning nodes.

The main motivation behind zero-token nodes is that they can prevent
the Raft majority loss efficiently. Zero-token nodes are group 0
voters, but they can run on much weaker and cheaper machines because
they do not replicate data and handle client requests by default
(drivers ignore them). For example, if there are two DCs, one with 4
nodes and one with 5 nodes, if we add a DC with 2 zero-token nodes,
every DC will contain less than half of the nodes, so we won't lose
the majority when any DC dies.

Another way of preventing the Raft majority loss is changing the
voter set, which is tracked by scylladb/scylladb#18793. That approach
can be used together with zero-token nodes. In the example above, if
we choose equal numbers of voters in both DCs, then a DC with one
zero-token node will be sufficient. However, in the typical setup of
2 DCs with the same number of nodes it is enough to add a DC with
only one zero-token node without changing the voter set.

Zero-token nodes could also be used as load balancers in the
Alternator.

Additionally, this PR fixes scylladb/scylladb#11087, which turned out to
be a blocker.

This PR introduced a new feature. There is no need to backport it.

Fixes scylladb/scylladb#6527
Fixes scylladb/scylladb#11087
Fixes scylladb/scylladb#15360

Closes scylladb/scylladb#19684

* github.com:scylladb/scylladb:
  docs: raft: document using zero-token nodes to prevent majority loss
  test: test recovery mode in the presence of zero-token nodes
  test: topology: util.py: add cqls parameter to check_system_topology_and_cdc_generations_v3_consistency
  test: topology: util.py: accept zero tokens in check_system_topology_and_cdc_generations_v3_consistency
  treewide: support zero-token nodes in the recovery mode
  storage_proxy: make TRUNCATE work locally for local tables
  test: topology: util.py: document that check_token_ring_and_group0_consistency fails with zero-token nodes
  test: test zero-token nodes
  test: test_topology_ops: move helpers to topology/util.py
  feature_service: introduce the ZERO_TOKEN_NODES feature
  storage_service: rename join_token_ring to join_topology
  storage_service: raft_topology_cmd_handler: improve warnings
  topology_coordinator: fix indentation after the previous patch
  treewide: introduce support for zero-token nodes in Raft topology
  system_keyspace: load_topology_state: remove assertion impossible to hit
  treewide: distinguish all nodes from all token owners
  gossip topology: make a replacing node remove the replaced node from topology
  locator: topology: add_or_update_endpoint: use none as the default node state
  test: boost: tablets tests: ensure all nodes are normal token owners
  token_metadata: rename get_all_endpoints and get_all_ips
  network_topology_strategy: reallocate_tablets: remove unused dc_rack_nodes
  virtual_tables: cluster_status_table: execute: set dc regardless of the token ownership
2024-08-29 16:26:21 +02:00
Gleb Natapov
17f4a151ce topology coordinator: do not add replacing node without a ring to topology
When only inter dc encryption is enabled a non encrypted connection
between two nodes is allowed only if both nodes are in the same dc.
If a nodes that initiates the connection knows that dst is in the same
dc and hence use non encrypted connection, but the dst not yet knows the
topology of the src such connection will not be allowed since dst cannot
guaranty that dst is in the same dc.

Currently, when topology coordinator is used, a replacing node will
appear in the coordinator's topology immediately after it is added to the
group0. The coordinator will try to send raft message to the new node
and (assuming only inter dc encryption is enabled and replacing node and
the coordinator are in the same dc) it will try to open regular, non encrypted,
connection to it. But the replacing node will not have the coordinator
in it's topology yet (it needs to sync the raft state for that). so it
will reject such connection.

To solve the problem the patch does not add a replacing node that was
just added to group0 to the topology. It will be added later, when
tokens will be assigned to it. At this point a replacing node will
already make sure that its topology state is up-to-date (since it will
execute a raft barrier in join_node_response_params handler) and it knows
coordinator's topology. This aligns replace behaviour with bootstrap
since bootstrap also does not add a node without a ring to the topology.

The patch effectively reverts b8ee8911ca

Fixes: scylladb/scylladb#19025
2024-08-29 17:14:09 +03:00
Gleb Natapov
2f1b1fd45e test: add test for replace in clusters with encryption enabled 2024-08-29 17:14:09 +03:00
Gleb Natapov
b98282a976 test.py: add server encryption support to cluster manager 2024-08-29 17:14:09 +03:00
Aleksandra Martyniuk
627fc46ca7 api: task_manager: return status 403 if a task is not abortable 2024-08-29 13:53:40 +02:00
Aleksandra Martyniuk
10ab60f32b api: task_manager: return none instead of empty task id
If a user requests a status of a task that does not have a parent,
show "none" instead of an empty parent_id.
2024-08-29 13:53:40 +02:00
Aleksandra Martyniuk
fb160afaf6 nodetool: add suboperations support
Modify nodetool methods so that it support suboperations.
2024-08-29 13:53:39 +02:00
Patryk Jędrzejczak
e027ffdffc test: test recovery mode in the presence of zero-token nodes
We modify existing tests to verify that the recovery mode works
correctly in the presence of zero-token nodes.

In `test_topology_recovery_basic`, we test the case when a
zero-token node is live. In particular, we test that the
gossip-based restart of such a node works.

In `test_topology_recovery_after_majority_loss`, we test the case
when zero-token nodes are unrecoverable. In particular, we test
that the gossip-based removenode of such nodes works.

Since zero-token nodes are ignored by the Python driver if it also
connects to other nodes, we use different CQL sessions for a
zero-token node in `test_topology_recovery_basic`.
2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak
fb1e060c4c test: topology: util.py: add cqls parameter to check_system_topology_and_cdc_generations_v3_consistency
In the following commit, we modify `test_topology_recovery_basic`
to test the recovery mode in the presence of live zero-token nodes.
Unfortunately, it requires a bit ugly workaround. Zero-token nodes
are ignored by the Python driver if it also connects to other
nodes because of empty tokens in the `system.peers` table. In that
test, we must connect to a zero-token node to enter the recovery
mode and purge the Raft data. Hence, we use different CQL sessions
for different nodes.

In the future, we may change the Python driver behavior and revert
this workaround. Moreover, the recovery tests will be removed or
significantly changed when we implement the manual recovery tool.
Therefore, we shouldn't worry about this workaround too much.
2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak
54905fc179 test: topology: util.py: accept zero tokens in check_system_topology_and_cdc_generations_v3_consistency
Before we use `check_system_topology_and_cdc_generations_v3_consistency`
in a test with a zero-token node, we must ensure it doesn't fail
because of zero tokens in a row of the `system.topology` table.
2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak
21c8409fa4 test: topology: util.py: document that check_token_ring_and_group0_consistency fails with zero-token nodes 2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak
95e14ae44b test: test zero-token nodes
We add tests to verify the basic properties of zero-token nodes.

`test_zero_token_nodes_no_replication` and
`test_not_enough_token_owners` are more or less deterministic tests.
Running them only in the dev mode is sufficient.

`test_zero_token_nodes_topology_ops` is quite slow, as expected,
considering parameterization and the number of topology operations.
In the future we can think of making it faster or skipping in the
debug mode. For now, our priority is to test zero-token nodes
thoroughly.
2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak
d43d67c525 test: test_topology_ops: move helpers to topology/util.py
In one of the following patches, we reuse the helper functions from
`test_topology_ops` in a new test, so we move them to `util.py`.

Also, we add the `cl` parameter to `start_writes`, as the new test
will use `cl=2`.
2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak
ed55261650 treewide: distinguish all nodes from all token owners
In one of the following patches, we introduce support for zero-token
nodes. From that point, getting all nodes and getting all token
owners isn't equivalent. In this patch, we ensure that we consider
only token owners when we want to consider only token owners (for
example, in the replication logic), and we consider all nodes when
we want to consider all nodes (for example, in the topology logic).

The main purpose of this patch is to make the PR introducing
zero-token nodes easier to review. The patch that introduces
zero-token nodes is already complicated. We don't want trivial
changes from this patch to make noise there.

This patch introduces changes needed for zero-token nodes only in the
Raft-based topology and in the recovery mode. Zero-token nodes are
unsupported in the gossip-based topology outside recovery.

Some functions added to `token_metadata` and `topology` are
inefficient because they compute a new data structure in every call.
They are never called in the hot path, so it's not a serious problem.
Nevertheless, we should improve it somehow. Note that it's not
obvious how to do it because we don't want to make `token_metadata`
store topology-related data. Similarly, we don't want to make
`topology` store token-related data. We can think of an improvement
in a follow-up.

We don't remove unused `topology::get_datacenter_rack_nodes` and
`topology::get_datacenter_nodes`. These function can be useful in the
future. Also, `topology::_dc_nodes` is used internally in `topology`.
2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak
c7016dedb3 locator: topology: add_or_update_endpoint: use none as the default node state
In one of the following patches, we change the gossiper to work the
same for zero-token nodes and token-owning nodes. We replace
occurrences of `is_normal_token_owner` with topology-based
conditions. We want to rely on the invariant that token-owning nodes
own tokens if and only if they are in the normal or leaving state.
However, this invariant can be broken in the gossip-based topology
when a new node joins the cluster. When a boostrapping node starts
gossiping, other nodes add it to their topology in
`storage_service::on_alive`. Surprisingly, the state of the new node
is set to `normal`, as it's the default value used by
`add_or_update_endpoint`. Later, the state will be set to
`bootstrapping` or `replacing`, and finally it will be set again to
`normal` when the join operation finishes. We fix this strange
behavior by setting the node state to `none` in
`storage_service::on_alive` for nodes not present in the topology.
Note that we must add such nodes to the topology. Other code needs
their Host ID, IP, and location.

We change the default node state from `normal` to `none` in
`add_or_update_endpoint` to prevent bugs like the one in
`storage_service::on_alive`. Also, we ensure that nodes in the `none`
state are ignored in the getters of `locator::topology`.
2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak
6adaf85634 test: boost: tablets tests: ensure all nodes are normal token owners
In one of the following patches, we make NetworkTopologyStrategy
and the tablet load balancer consider only normal token owners to
ensure they ignore zero-token nodes. Some unit tests would start
failing after this change because they do not ensure that all
nodes are normal token owners. This patch prevents it.

Judging by the logic in the test cases in
`network_topology_strategy_test`, `point++` was probably intended
anyway.
2024-08-29 10:37:07 +02:00
Kefu Chai
ecfe0aace6 perf: perf_mutation_readers: break memtable class down
before this change, memtable serves as the fixture for 6 test cases,
actually these 6 test cases can be categorized into a matrix of 3 x 2:
{ single_row, multi_row, large_partition } x { single_partition, multi_paritition }.

in this change, we break memtable into 3 different fixtures, to reflect
this fact. more readable this way. and a benefit is that each test does
not have to pay for the overhead of setup it does not use at all.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20177
2024-08-29 08:54:17 +03:00
Nadav Har'El
6391550bbc test/alternator: add another check to test_stream_list_tables
The test test_streams.py::test_stream_list_tables reproduces a bug where
enabling streams added a spurious result to ListTables. A reviewer of
that patch asked to also add a check that name of the table itself
doesn't disappear from ListTables when a stream is enabled, so this is
what this patch adds.

This theoretical scenario (a table's name disappearing from ListTables)
never happened, so the new check doesn't reproduce any known bug, but
I guess it never hurts to make the test stronger for regression testing.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19934
2024-08-29 08:45:22 +03:00
Kefu Chai
03ab80501f tools/scylla-nodetool: add restore integration
as we have an API for restore a keyspace / table, let's expose this feature
with nodetool. so we can exercise it without the help of scylla-manager
or 3rd-party tools with a user-friendly interface.

in this change:

* add a new subcommand named "restore" to nodetool
* add test to verify its interaction with the API server
* update the document accordingly.
* the bash completion script is updated accordingly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-08-28 15:42:49 +03:00
Pavel Emelyanov
41b9eda398 test/object_store: Add simple restore test
The test shows how to restore previously backed up table:

- backup
- truncate to get rid of existing sstables
- start restore with the new API method
- wait for the task to finish

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-28 15:42:49 +03:00
Pavel Emelyanov
f5a22a94c6 test/object_store: Generalize prepare_snapshot_for_backup()
Give it snapshot-name argument. Next test will want custom snapshot
name.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-28 15:42:49 +03:00
Pavel Emelyanov
11a04bfb66 code: Introduce restore API method
The method starts a task that uses sstables_loader load-and-stream
functionality to bring new sstables into the cluster. The existing
load-and-stream picks up sstables from upload/ directory, the newly
introduced task collects them from S3 bucket and given prefix (that
correspond to the path where backup API method put them).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-28 15:42:49 +03:00
Sergey Zolotukhin
65f37f3ba6 Ignore seed name resolution errors on restart.
Gossiper seeds host name resolution failures are ignored during restart if
a node is already boostrapped (i.e. it has successfully joined the cluster).

Fixes scylladb/scylladb#14945
2024-08-28 14:01:04 +02:00
Patryk Jędrzejczak
08cb3a5e2c test: test_raft_recovery_basic: add raft=trace logs
It could help when we hit scylladb/scylladb#17918 again.

This PR only changes log levels in a test, no need to backport it.

Refs scylladb/scylladb#17918

Closes scylladb/scylladb#20318
2024-08-28 13:50:09 +02:00
Sergey Zolotukhin
fc5e683d02 Add a test for starting with a wrong seed.
The test checks a bootstrapped node start with a wrong host name in the
seeds config.

Test for scylladb/scylladb#14945
2024-08-28 11:34:37 +02:00
Laszlo Ersek
dbc0ca6354 test/generic_server: add test case
Check whether we can stop a generic server without first asking it to
listen.

The test fails currently; the failure mode is a hang, which triggers the 5
minute timeout set in the test:

> unknown location(0): fatal error: in "stop_without_listening":
> seastar::timed_out_error: timedout
> seastar/src/testing/seastar_test.cc(43): last checkpoint
> test/boost/generic_server_test.cc(34): Leaving test case
> "stop_without_listening"; testing time: 300097447us

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-28 10:59:44 +02:00
Laszlo Ersek
931f2f8d73 configure, cmake: sort the lists of boost unit tests
Both lists were obviously meant to be sorted originally, but by today
we've introduced many instances of disorder -- thus, inserting a new test
in the proper place leaves the developer scratching their head. Sort both
lists.

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-28 10:59:44 +02:00
Avi Kivity
94d5507237 Merge 'select from mutation_fragments() + tablets: handle reads for non-owned partitions' from Botond Dénes
Attempting to read a partition via `SELECT * FROM MUTATION_FRAGMENTS()`, which the node doesn't own, from a table using tablets causes a crash.
This is because when using tablets, the replica side simply doesn't handle requests for un-owned tokens and this triggers a crash.
We should probably improve how this is handled (an exception is better than a crash), but this is outside the scope of this PR.
This PR fixes this and also adds a reproducer test.

Fixes: https://github.com/scylladb/scylladb/issues/18786

Fixes a regression introduced in 6.0, so needs backport to 6.0 and 6.1

Closes scylladb/scylladb#20109

* github.com:scylladb/scylladb:
  test/tablets: Test that reading tablets' mutations from MUTATION_FRAGMENTS works
  replica/mutation_dump: enfore pinning of effective replication map
  replica/mutation_dump: handle un-owned tokens (with tablets)
2024-08-27 20:46:10 +03:00
Avi Kivity
b13ab90448 Merge 'alternator/executor: Use native reversed format' from Łukasz Paszkowski
When executing reversed queries, a native revered format shall be used. Therefore, the table schema and the clustering key bounds are reversed before a partition slice and a read command are constructed.

It is, however, possible to run a reversed query passing a table schema but only when there are no restrictions on the clustering keys. In this particular situation, the query returns correct results. Since the current alternator tests in test.py do not imply any restrictions, this situation was not caught during development of https://github.com/scylladb/scylladb/pull/18864.

Hence, additional tests are provided that add clustering keys restrictions when executing reversed queries to capture such errors earlier than in dtests.

Additional manual tests were performed to test a mixed-node cluster (with alternator API enabled in Scylla on each node):

1. 2-node cluster with one node upgraded: reverse read queries performed on an old node
2. 2-node cluster with one node upgraded: reverse read queries performed on a new node
3. 2-node cluster with one node upgraded and all its sstable files deleted to trigger repair: reverse read queries performed on an old node
4. 2-node cluster with one node upgraded and all its sstable files deleted to trigger repair: reverse read queries performed on a new node

All reverse read queries above consists of:

- single-partition reverse reads with no clustering key restrictions, with single column restrictions and multi column restrictions both with and without paging turned on

The exact same tests were also performed on a fully upgraded cluster.

Fixes https://github.com/scylladb/scylladb/issues/20191

No backport is required as this is a complementary patch for the series https://github.com/scylladb/scylladb/pull/18864 that did not require backporting.

Closes scylladb/scylladb#20205

* github.com:scylladb/scylladb:
  test_query.py: Test reverse queries with clustering key bounds
  alternator::do_query Add additional trace log
  alternator::do_query: Use native reversed format
  alternator::do_query Rename schema with table_schema
2024-08-27 20:40:49 +03:00
Pavel Emelyanov
86bc5b11fe s3-client: Add support for lister::filter
Directory lister comes with a filter function that tells lister which
entries to skip by its .get() method. For uniformity, add the same to
S3 bucket_lister.

After this change the lister reports shorter name in the returned
directory entry (with the prefix cut), so also need to tune up the unit
test respectively.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-27 16:15:40 +03:00
Piotr Dulikowski
da5f4faac1 Merge 'mv: reject user requests by coordinator when a replica is overloaded by MVs' from Wojciech Mitros
Currently, when a view update backlog of one replica is full, the write is still sent by the coordinator to all replicas. Because of the backlog, the write fails on the replica, causing inconsistency that needs to be fixed by repair. To avoid these inconsistencies, this patch adds a check on the coordinator for overloaded replicas. As a result, a write may be rejected before being sent to any replicas and later retried by the user, when the replica is no longer overloaded.

This patch does not remove the replica write failures, because we still may reach a full backlog when more view updates are generated after the coordinator check is performed and before the write reaches the replica.

Fixes scylladb/scylladb#17426

Closes scylladb/scylladb#18334

* github.com:scylladb/scylladb:
  mv: test the view update behavior
  mv: add test for admission control
  storage_proxy: return overloaded_exception instead of throwing
  mv: reject user requests by coordinator when a replica is overloaded by MVs
2024-08-27 12:50:34 +02:00
Aleksandra Martyniuk
f38bb6483a test: add test to ensure repair won't fail with uninitialized bm 2024-08-27 11:37:50 +02:00
Botond Dénes
5c0f6d4613 Merge 'Make Summary support histogram with infinite bucket vlaues' from Amnon Heiman
This series fixes an issue where histogram Summaries return an infinite value.

It updated the quantile calculation logic to address cases where values fall into the infinite bucket of a histogram.
Now, instead of returning infinite (max int), the calculation will return the last bucket limit, ensuring finite outputs in all cases.

The series adds a test for summaries with a specific test case for this scenario.

Fixes #20255
Need backport to 6.0, 6.1 and 2023.1 and above

Closes scylladb/scylladb#20257

* github.com:scylladb/scylladb:
  test/estimated_histogram_test Add summary tests
  utils/histogram.hh: Make summary support inifinite bucket.
2024-08-27 10:33:54 +03:00
Avi Kivity
0acfa4a00d Merge 'abstract_replication_strategy: make get_ranges async' from Benny Halevy
To prevent stalls due to large number of tokens.
For example, large cluster with say 70 nodes can have
more than 16K tokens.

Fixes #19757

Closes scylladb/scylladb#19758

* github.com:scylladb/scylladb:
  abstract_replication_strategy: make get_ranges async
  database: get_keyspace_local_ranges: get vnode_effective_replication_map_ptr param
  compaction: task_manager_module: open code maybe_get_keyspace_local_ranges
  alternator: ttl: token_ranges_owned_by_this_shard: let caller make the ranges_holder
  alternator: ttl: can pass const gms::gossiper& to ranges_holder
  alternator: ttl: ranges_holder_primary: unconstify _token_ranges member
  alternator: ttl: refactor token_ranges_owned_by_this_shard
2024-08-26 16:56:18 +03:00
Botond Dénes
b2c07c9b6f Merge 'compaction: change compaction stop reason ' from Aleksandra Martyniuk
Currently "table removal" is logged as a reason of compaction stop for table drop,
tablet cleanup and tablet split. Modify log to reflect the reason.

Closes scylladb/scylladb#20042

* github.com:scylladb/scylladb:
  test: add test to check compaction stop log
  compaction: fix compaction group stop reason
2024-08-26 13:40:07 +03:00
Kefu Chai
8ef26a9c8c build: cmake: add "test" target
before this change, none of the target generated by CMake-based
building system runs `test.py`. but `build.ninja` generated directly
by `configure.py` provides a target named `test`, which runs the
`test.py` with the options passed to `configure.py`.

to be more compatible with the rules generated by `configure.py`,
in this change

* do not include "CTest" module, as we are not using CTest for
  driving tests. we use the homebrew `test.py` for this purpose.
  more importantly, the target named "test" is provided by "CTest".
  so in order to add our own "test" target, we cannot use "CTest"
  module.
* add a target named "test" to run "test.py".
* add two CMake options so we can customize the behavior of "test.py",
  this is to be compatible with the existing behavior of `configure.py`.

Refs scylladb/scylladb#2717

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20263
2024-08-25 21:45:13 +03:00
Avi Kivity
72a85e3812 Merge 'Integrated backup' from Pavel Emelyanov
This adds minimal implementation of the start-backup API call.

The method starts a task that uploads all files from the given keyspace's snapshot to the requested endpoint/bucket. Arguments are:
- endpoint -- the ID in object_store.yaml config file
- bucket -- the target bucket to put objects into
- keyspace -- the keyspace to work on
- snapshot -- the method assumes that the snapshot had been already taken and only copies sstables from it

The task runs in the background, its task_id is returned from the method once it's spawned and it should be used via /task_manager API to track the task execution and completion (hint: it's good to have non-zero TTL value to make sure fast backups don't finish before the caller manages to call wait_task API).

Sstables components are scanned for all tables in the keyspace and are uploaded into the /bucket/${cf_name}/${snapshot_name}/ path.

refs: #18391

Closes scylladb/scylladb#19890

* github.com:scylladb/scylladb:
  tools/scylla-nodetool: add backup integration
  docs: Document the new backup method
  test/object_store: Test that backup task is abortable
  test/object_store: Add simple backup test
  test/object_store: Move format_tuples()
  test/pylib: Add more methods to rest client
  backup-task: Make it abortable (almost)
  code: Introduce backup API method
  database: Export parse_table_directory_name() helper
  database: Introduce format_table_directory_name() helper
  snapshot-ctl: Add config to snapshot_ctl
  snapshot-ctl: Add sstables::storage_manager dependency
  snapshot-ctl: Maintain task manager module
  snapshot-ctl: Add "snapshots" logger
  snapshot-ctl: Outline stop() method and constructor
  snapshot-ctl: Inline run_snapshot_list<>
  test/cql_test_env: Export task manager from cql test env
  task_manager: Print task ttl on start (for debugging)
  docs: Update object_storage.md with AWS_ environment
  docs: Restructure object_storage.md
2024-08-25 20:19:10 +03:00
Andrei Chekun
f54b7f5427 test.py: Increase pool size
Increase pool size changes were recently reverted because of the flakiness for the test_gossip_boot test. Test started
to fail on adding the node to the cluster without any issues in the Scylla log file. In test logs it looked like the
installation process for the new node just hanged. After investigating the problem, I've found out that the issue is that
test.py was draining the io_executor pool for cleaning the directory during install that was set to eight workers. So
to fix the issue, io_executor pool should be increased to more or less the same ratio as it was: doubled cluster pool size.

Closes scylladb/scylladb#20276
2024-08-25 19:59:18 +03:00