38529 Commits

Author SHA1 Message Date
Pavel Emelyanov
7597663ef5 cql_test_env: Use table.find_row() shortcut
The require_column_has_value() finds the cell in three steps -- finds
partition, then row, then cell. The class table already has a method to
facilitate row finding by partition and clustering key

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-29 15:37:27 +03:00
Benny Halevy
5afc242814 token_metadata: get_endpoint_to_host_id_map_for_reading: just inform that normal node has null host_id
It is too early to require that all nodes in normal state
have a non-null host_id.

The assertion was added in 44c14f3e2b
but unfortunately there are several call sites where
we add the node as normal, but without a host_id
and we patch it in later on.

In the future we should be able to require that
once we identify nodes by host_id over gossiper
and in token_metadata.

Fixes scylladb/scylladb#15181

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15184
2023-08-28 21:40:55 +03:00
Botond Dénes
47ce69e9bf Merge 'paxos_response_handler: carry effective replication map' from Benny Halevy
`create_write_response_handler` on this path accepts
an `inet_address_vector_replica_set` that corresponds to the
effective_replication_map_ptr in the paxos_response_handler,
but currently the function retrieves a new
effective_replication_map_ptr
that may not hold all the said endpoints.

Fixes scylladb/scylladb#15138

Closes #15141

* github.com:scylladb/scylladb:
  storage_proxy: create_write_response_handler: carry effective_replication_map_ptr from paxos_response_handler
  storage_proxy: send_to_live_endpoints: throw on_internal_error if node not found
2023-08-28 11:42:38 +03:00
Kefu Chai
86e8be2dcd replica:database: log if endpoint not found
if the endpoint specified when creating a KEYSPACE is not found,
when flushing a memtable, we would throw an `std::out_of_range`
exception when looking up the client in `storage_manager::_s3_endpoints`
by the name of endpoint. and scylla would crash because of it. so
far, we don't have a good way to error out early. since the
storage option for keyspace is still experimental, we can live
with this, but would be better if we can spot this error in logging
messages when testing this feature.

also, in this change, `std::invalid_argument` is thrown instead of
`std::out_of_range`. it's more appropriate in this circumstance.

Refs #15074
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15075
2023-08-28 10:51:19 +03:00
Avi Kivity
fb8375e1e7 Merge 'storage_proxy: mutate_atomically_result: carry effective replication map down to create_write_response_handler' from Benny Halevy
The effective_replication_map_ptr passed to
`create_write_response_handler` by `send_batchlog_mutation`
must be synchronized with the one used to calculate
_batchlog_endpoints to ensure they use the same topology.

Fixes scylladb/scylladb#15147

Closes #15149

* github.com:scylladb/scylladb:
  storage_proxy: mutate_atomically_result: carry effective_replication_map down to create_write_response_handler
  storage_proxy: mutate_atomically_result: keep schema of batchlog mutation in context
2023-08-27 16:34:34 +03:00
Benny Halevy
a5d5b6ded1 gossiper: remove_endpoint: call on_dead notifications if endpoint was alive
Since 75d1dd3a76
gossiper::convict will no longer call `mark_dead`
(e.g. when called from the failure detection loop
after a node is stopped following decommission)
and therefore the on_dead notification won't get called.

To make that explicit, if the node was alive before
remove_endpoint erased it from _live_endpoint,
and it has an endpoint_state, call the on_dead notifications.
These are important to clean up after the node is dead
e.g. in storage_proxy::on_down which cancels all
respective write handlers.

This is preferred over going through `mark_dead` as the latter
marks the endpoint as unreachable, which is wrong in this
case as the node left the cluster.

Fixes #15178

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15179
2023-08-27 16:18:27 +03:00
Takuya ASADA
ae25a216bc scylla_fstrim_setup: stop disabling fstrim.timer
Disabling fstrim.timer was meant to avoid running fstrim on /var/lib/scylla from
both scylla-fstrim.timer and fstrim.timer, but fstrim.timer actually never does
that, since it only looks at fstab entries, not our systemd unit.

To run fstrim correctly on rootfs and other filesystems not related to
scylla, we should stop disabling fstrim.timer.

Fixes #15176

Signed-off-by: Takuya ASADA <syuu@scylladb.com>

Closes #15177
2023-08-27 14:56:37 +03:00
Kefu Chai
83ceedb18b storage_service: do not cast a string to string_view before formatting
seastar::format() just forwards the parameters to be formatted to
`fmt::format_to()`, which is able to format `std::string`, so there is
no need to cast the `std::string` instance to `std::string_view` for
formatting it.

in this change, the cast is dropped. simpler this way.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15143
2023-08-25 16:43:38 +03:00
Mikołaj Grzebieluch
a031a14249 tests: add asynchronous log browsing functionality
Add a class that handles log file browsing with the following features:
* mark: returns "a mark" to the current position of the log.
* wait_for: asynchronously checks if the log contains the given message.
* grep: returns a list of lines matching the regular expression in the log.

Add a new endpoint in `ManagerClient` to obtain the scylla logfile path.
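
A minimal sketch of how such a log-browsing helper could look, assuming a plain text log file. The class name `LogBrowser`, its constructor arguments, and the polling interval are hypothetical and are not taken from the patch:

```python
import asyncio
import re
import tempfile


class LogBrowser:
    """Hypothetical helper; not the class added by the patch."""

    def __init__(self, path):
        self.path = path

    def mark(self):
        # A "mark" is simply the current size of the log in bytes.
        with open(self.path, "rb") as f:
            return f.seek(0, 2)

    def grep(self, pattern, from_mark=0):
        # Return every line after from_mark that matches the regular expression.
        rx = re.compile(pattern)
        with open(self.path, "rb") as f:
            f.seek(from_mark)
            text = f.read().decode(errors="replace")
        return [line for line in text.splitlines() if rx.search(line)]

    async def wait_for(self, pattern, from_mark=0, timeout=5.0, poll=0.05):
        # Poll the log until a matching line appears or the timeout expires.
        loop = asyncio.get_running_loop()
        deadline = loop.time() + timeout
        while True:
            if self.grep(pattern, from_mark):
                return
            if loop.time() >= deadline:
                raise TimeoutError(f"{pattern!r} not found in {self.path}")
            await asyncio.sleep(poll)


def demo():
    # Write one line, take a mark, append another line, then search past the mark.
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as f:
        f.write("INFO starting\n")
        path = f.name
    log = LogBrowser(path)
    mark = log.mark()
    with open(path, "a") as f:
        f.write("INFO init - serving\n")
    asyncio.run(log.wait_for("serving", from_mark=mark))
    return log.grep(r"INFO", from_mark=mark)
```

Taking a mark before an action and searching only past it keeps a test from matching messages logged earlier in the run.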

Fixes #14782

Closes #14834
2023-08-25 14:19:09 +02:00
Raphael S. Carvalho
a22f74df00 table: Introduce storage snapshot for upcoming tablet streaming
New file streaming for tablets will require integration with compaction
groups. So this patch introduces a way for streaming to take a storage
snapshot of a given tablet using its token range. Memtable is flushed
first, so all data of a tablet can be streamed through its sstables.
The interface is compaction group / tablet agnostic, but user can
easily pick data from a single tablet by using the range in tablet
metadata for a given tablet.

E.g.:

	auto erm = table.get_effective_replication_map();
	auto& tm = erm->get_token_metadata();
	auto tablet_map = tm.tablets().get_tablet_map(table.schema()->id());

	for (auto tid : tablet_map.tablet_ids()) {
		auto tr = tablet_map.get_token_range(tid);

		auto ssts = co_await table.take_storage_snapshot(tr);

		...
	}

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #15128
2023-08-25 13:06:02 +02:00
Patryk Jędrzejczak
9806bddf75 test: fix a test case in raft_address_map_test
The test didn't test what it was supposed to test. It would pass
even if set_nonexpiring() didn't insert a new entry.

Closes #15157
2023-08-25 12:11:33 +02:00
Kefu Chai
d2d1141188 sstables: writer: delegate flush() in checksummed_file_data_sink_impl
before this change, `checksummed_file_data_sink_impl` just inherits the
`data_sink_impl::flush()` from its parent class. but as a wrapper around
the underlying `_out` data_sink, this is not only an unusual design
decision in a layered design of an I/O system, but also could be
problematic. to be more specific, the typical user of `data_sink_impl`
is a `data_sink`, whose `flush()` member function is called when
the user of `data_sink` wants to ensure that the data sent to the sink
is pushed to the underlying storage / channel.

this in general works, as the typical user of `data_sink` is in turn
`output_stream`, which calls `data_sink.flush()` before closing the
`data_sink` with `data_sink.close()`. and the operating system will
eventually flush the data after application closes the corresponding
fd. to be more specific, almost none of the popular local filesystems
implement the file_operations.op, hence, it's safe even if the
`output_stream` does not flush the underlying data_sink after writing
to it. this is the use case when we write to sstables stored on local
filesystem. but as explained above, if the data_sink is backed by a
network filesystem, a layered filesystem or a storage connected via
a buffered network device, then it is crucial to flush in a timely
manner, otherwise we could risk data loss if the application / machine /
network breaks when the data is considered persisted but it is
_not_!

but the `data_sink` returned by `client::make_upload_jumbo_sink` is
a little bit different. multipart upload is used under the hood, and
we have to finalize the upload once all the parts are uploaded by
calling `close()`. but if the caller fails / chooses to close the
sink before flushing it, the upload is aborted, and the partially
uploaded parts are deleted.

the default-implemented `checksummed_file_data_sink_impl::flush()`
breaks `upload_jumbo_sink` which is the `_out` data_sink being
wrapped by `checksummed_file_data_sink_impl`. as the `flush()`
calls are short-circuited by the wrapper, the `close()` call
always aborts the upload. that's why the data and index components
just fail to upload with the S3 backend.

in this change, we just delegate the `flush()` call to the
wrapped class.
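
The failure mode can be sketched in a few lines. This is a Python analogy, not the C++ code from the patch: all class names are hypothetical, and the sketch assumes that the inherited `flush()` is a no-op and that closing an unflushed upload sink aborts the upload:

```python
class UploadSink:
    """Stand-in for a multipart-upload sink: close() without flush() aborts."""

    def __init__(self):
        self.flushed = False
        self.state = "open"

    def put(self, data):
        # New data means the sink is no longer fully flushed.
        self.flushed = False

    def flush(self):
        self.flushed = True

    def close(self):
        self.state = "completed" if self.flushed else "aborted"


class BaseSink:
    def flush(self):
        # Default implementation (assumed): does nothing.
        pass


class BrokenWrapper(BaseSink):
    """Inherits the no-op flush(), so the wrapped sink never sees it."""

    def __init__(self, out):
        self.out = out

    def put(self, data):
        self.out.put(data)

    def close(self):
        self.out.close()


class FixedWrapper(BrokenWrapper):
    def flush(self):
        # Delegate to the wrapped sink, as the change describes.
        self.out.flush()


def write_all(wrapper_cls):
    # Write, flush and close through the wrapper; report the upload's fate.
    out = UploadSink()
    sink = wrapper_cls(out)
    sink.put(b"data")
    sink.flush()
    sink.close()
    return out.state
```

With `BrokenWrapper` the upload ends up aborted even though the caller did call `flush()`; with `FixedWrapper` it completes.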

Fixes #15079
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15134
2023-08-24 18:03:10 +03:00
Raphael S. Carvalho
d6cc752718 test: Fix flakiness in sstable_compaction_test.autocompaction_control_test
It's possible that compaction task is preempted after completion and
before reevaluation, causing pending_tasks to be > 1.

Let's only exit the loop if there are no pending tasks, and also
reduce 100ms sleep which is an eternity for this test.

Fixes #14809.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #15059
2023-08-24 13:37:06 +03:00
Benny Halevy
4a2e367e92 storage_proxy: create_write_response_handler: carry effective_replication_map_ptr from paxos_response_handler
`create_write_response_handler` on this path accepts
an `inet_address_vector_replica_set` that corresponds to the
effective_replication_map_ptr in the paxos_response_handler,
but currently the function retrieves a new
effective_replication_map_ptr
that may not hold all the said endpoints.

Fixes scylladb/scylladb#15138

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-24 11:45:13 +03:00
Benny Halevy
6af0b281a6 storage_proxy: mutate_atomically_result: carry effective_replication_map down to create_write_response_handler
The effective_replication_map_ptr passed to
`create_write_response_handler` by `send_batchlog_mutation`
must be synchronized with the one used to calculate
_batchlog_endpoints to ensure they use the same topology.

Fixes scylladb/scylladb#15147

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-24 10:43:40 +03:00
Benny Halevy
098dd5021a storage_proxy: mutate_atomically_result: keep schema of batchlog mutation in context
The batchlog mutation is for system.batchlog.
Rather than looking the schema up in multiple places
do that once and keep it in the context object.

It will be used in the next patch to get a respective
effective_replication_map_ptr.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-24 10:43:23 +03:00
Benny Halevy
27c33015a5 storage_proxy: send_to_live_endpoints: throw on_internal_error if node not found
Return error in production rather than crashing
as in https://github.com/scylladb/scylladb/issues/15138

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-24 08:59:38 +03:00
Kefu Chai
2f17b76df7 docs/operating-scylla/admin-tools: add note on deprecating sstabledump
sstabledump is deprecated in favor of `scylla sstable` commands. so
let's reflect this in the document.

Fixes #15020
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15021
2023-08-24 08:31:29 +03:00
Botond Dénes
1609c76d62 tools/scylla-sstable: scrub: don't quarantine sstables after validate
Scylla sstable promises to *never* mutate its input sstables. This
promise was broken by `scylla sstable scrub --scrub-mode=validate`,
because validate moves invalid input sstables into quarantine. This is
unexpected and caused occasional failures in the scrub tests in
test_tools.py. Fix by propagating a flag down to
`scrub_sstables_validate_mode()` in `compaction.cc`, specifying whether
validate should quarantine invalid sstables, then set this flag to false
in `scylla-sstable.cc`. The existing test for validate-mode scrub is
amended to check that the sstable is not mutated. The test now fails
before the fix and passes afterwards.

Fixes: #14309

Closes #15139
2023-08-23 21:53:12 +03:00
Kamil Braun
93be4c0cb0 Merge 'Base node liveliness consistently on gossiper::is_alive' from Benny Halevy
Currently the gossiper marks endpoint_state objects as alive/dead.
In some cases the endpoint_state::is_alive function is checked but in many other cases
gossiper::is_alive(endpoint) is used to determine if the endpoint is alive.

This series removes the endpoint_state::is_alive state and moves all the logic to gossiper::is_alive
that bases its decision on the endpoint having an endpoint_state and being in the _live_endpoints set.

For that, the _live_endpoints is made sure to be replicated to all shards when changed
and the endpoint_state changes are serialized under lock_endpoint, and also making sure that the
endpoint_state in the _endpoint_states_map is never updated in place, but rather a temporary copy is changed
and then safely replicated using gossiper::replicate

Refs https://github.com/scylladb/scylladb/issues/14794

Closes #14801

* github.com:scylladb/scylladb:
  gossiper: mark_alive: remove local_state param
  endpoint_state: get rid of _is_alive member and methods
  gossiper: is_alive: use _live_endpoints
  gossiper: evict_from_membership: erase endpoint from _live_endpoints
  gossiper: replicate_live_endpoints_on_change: use _live_endpoints_version to detect change
  gossiper: run: no need to replicate live_endpoints
  gossiper: fold update_live_endpoints_version into replicate_live_endpoints_on_change
  gossiper: add mutate_live_and_unreachable_endpoints
  gossiper: reset_endpoint_state_map: clear also shadow endpoint sets
  gossiper: reset_endpoint_state_map: clear live/unreachable endpoints on all shards
  gossiper: functions that change _live_endpoints must be called on shard 0
  gossiper: add lock_endpoint_update_semaphore
  gossiper: make _live_endpoints an unordered_set
  endpoint_state: use gossiper::is_alive externally
2023-08-23 17:18:05 +02:00
Gleb Natapov
d1654ccdda storage_service: register schema version observer before joining group0 and starting gossiper
The schema version is updated by group0, so if group0 starts before
schema version observer is registered some updates may be missed. Since
the observer is used to update node's gossiper state the gossiper may
contain wrong schema version.

Fix by registering the observer before starting group0 and even before
starting gossiper to avoid a theoretical case that something may pull
schema after start of gossiping and before the observer is registered.
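
The ordering problem can be illustrated with a small sketch. The `Observable` class and the `run` helper are hypothetical stand-ins, not Scylla code; they only show that an update delivered before the observer is registered never reaches it:

```python
class Observable:
    """Hypothetical value holder that notifies observers on each update."""

    def __init__(self, value):
        self.value = value
        self._observers = []

    def observe(self, callback):
        self._observers.append(callback)

    def set(self, value):
        self.value = value
        for callback in self._observers:
            callback(value)


def run(register_first):
    schema_version = Observable("v0")
    gossip_state = {"schema": schema_version.value}

    def on_change(version):
        # The observer copies the new version into the gossiped state.
        gossip_state["schema"] = version

    if register_first:
        schema_version.observe(on_change)
    # An update arrives while startup is still in progress.
    schema_version.set("v1")
    if not register_first:
        schema_version.observe(on_change)
    return gossip_state["schema"]
```

Registering late leaves the gossiped state at the stale "v0"; registering first picks up "v1".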

Fixes: #15078

Message-Id: <ZOYZWhEh6Zyb+FaN@scylladb.com>
2023-08-23 17:11:51 +02:00
Patryk Jędrzejczak
ef2eac9941 raft topology: make every type in request_param a named struct
We make every alternative type in the request_param variant
a named struct to make the code more readable. Additionally, this
change will make extending request parameters easier if we decide
to do so in the future.

Closes #15132
2023-08-23 16:56:00 +02:00
Patryk Jędrzejczak
7eab9f8a02 raft_removenode: remove "raft topology" from errors
Some runtime errors thrown in storage_service::raft_removenode
start with the "raft topology " prefix. Since "raft topology" is
an implementation detail, we don't want to throw this information
through the user API. Only logs should contain it.

Closes #15136
2023-08-23 16:20:14 +02:00
Nadav Har'El
5530c529c2 test/cql-pytest: regression test for old bug with CAST(f AS TEXT) precision
When casting a float or double column to a string with `CAST(f AS TEXT)`,
Scylla is expected to print the number with enough digits so that reading
that string back to a float or double restores the original number
exactly. This expectation isn't documented anywhere, but makes sense,
and is what Cassandra does.

Before commit 71bbd7475c, this wasn't the
case in Scylla: `CAST(f AS TEXT)` always printed 6 digits of precision,
which was a bit under enough for a float (which can have 7 decimal digits
of precision), but very much not enough for a double (which can need 15
digits). The origin of this magic "6 digits" number was that Scylla uses
seastar::to_sstring() to print the float and double values, and before
the aforementioned commit those functions used sprintf with the "%g"
format - which always prints 6 decimal digits of precision! After that
commit, to_sstring() now uses a different approach (based on fmt) to
print the float and double values, that prints all significant digits.

This patch adds a regression test for this bug: We write float and double
values to the database, cast them to text, and then recover the float
or double number from that text - and check that we get back exactly the
same float or double object. The test *fails* before the aforementioned
commit, and passes after it. It also passes on Cassandra.
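
The effect of the "%g" format is easy to reproduce outside Scylla. The sketch below uses Python's own "%g" formatting and `repr()` as stand-ins for the old and new behavior; the helper names are made up for illustration, and `struct` is used only to round a value to a 32-bit float:

```python
import struct


def as_float32(x):
    # Round a Python float (a double) to the nearest 32-bit float.
    return struct.unpack("f", struct.pack("f", x))[0]


def six_digit_text(x):
    # "%g" prints 6 significant digits by default.
    return "%g" % x


def roundtrips_double(text, original):
    return float(text) == original


def roundtrips_float32(text, original):
    return as_float32(float(text)) == original


d = 0.1234567890123456
f = as_float32(1.0 / 3.0)
```

Here `six_digit_text(d)` loses digits and does not read back as `d`, while `repr(d)` (the shortest text that round-trips a double) does. For the 32-bit value, nine significant digits ("%.9g") are always enough to round-trip.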

Refs #15127

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #15131
2023-08-23 16:06:52 +03:00
Botond Dénes
e7af2a7de8 Merge 'token_metadata::get_endpoint_to_host_id_map_for_reading: restrict to token owners' from Benny Halevy
And verify that the returned host_id isn't null.
Call on_internal_error_noexcept in that case
since all token owners are expected to have their
host_id set. Aborting in testing would help fix
issues in this area.

Fixes scylladb/scylladb#14843
Refs scylladb/scylladb#14793

Closes #14844

* github.com:scylladb/scylladb:
  api: storage_service: improve description of /storage_service/host_id
  token_metadata: get_endpoint_to_host_id_map_for_reading: restrict to token owners
2023-08-23 13:55:14 +03:00
Botond Dénes
139ba553b8 Merge 'sstable, test: log sstable name and pk when capping local_deletion_time ' from Kefu Chai
in this series, we also print the sstable name and pk when writing a tombstone whose local_deletion_time (ldt for short) is greater than INT32_MAX which cannot be represented by an uint32_t.

Fixes #15015

Closes #15107

* github.com:scylladb/scylladb:
  sstable/writer: log sstable name and pk when capping ldt
  test: sstable_compaction_test: add a test for capped tombstone ldt
2023-08-23 09:29:54 +03:00
Botond Dénes
f7505405f0 scylla-gdb.py: use for_each_table() everywhere
scylla-gdb.py has two methods for iterating over all tables:
* all_tables()
* for_each_table()

Despite this, many places in the code iterate over the column family map
directly. This patch leaves just a single method (for_each_table()) and
migrates all the codebase to use it, instead of iterating over the raw
map. While at it, the access to the map is made backward compatible with
pre 52afd9d42d code, said commit wrapped database::_column_families in
tables_metadata object. This broke scylla-gdb.py for older versions.

Closes #15121
2023-08-22 20:39:31 +03:00
Kamil Braun
169d19e5b0 Merge 'raft topology: support --ignore-dead-nodes in removenode and replace' from Patryk Jędrzejczak
We add support for `--ignore-dead-nodes` in `raft_removenode` and
`--ignore-dead-nodes-for-replace` in `raft_replace`. For now, we allow
passing only host ids of the ignored nodes. Supporting IPs is currently
impossible because `raft_address_map` doesn't provide a mapping from IP
to a host id.

The main steps of the implementation are as follows:
- add the `ignore_nodes` column to `system.topology`,
- set the `ignore_nodes` value of the topology mutation in `raft_removenode` and `raft_replace`,
- extend `service::request_param` with alternative types that allow storing a set of ids of the ignored nodes,
- load `ignore_nodes` from `system.topology` into `request_param` in `system_keyspace::load_topology_state`,
- add `ignore_nodes` to `exclude_nodes` in `topology_coordinator::exec_global_command`,
- pass `ignore_nodes` to `replace_with_repair` and `remove_with_repair` in `storage_service::raft_topology_cmd_handler`.

Additionally, we add `test_raft_ignore_nodes.py` with two tests that verify the added changes.

Fixes #15025

Closes #15113

* github.com:scylladb/scylladb:
  test: add test_raft_ignore_nodes
  test: ManagerClient.remove_node: allow List[HostId] for ignore_dead
  raft topology: pass ignore_nodes to {replace, remove}_with_repair
  raft topology: exec_global_command: add ignore_nodes to exclude_nodes
  raft topology: exec_global_command: change type of exclude_nodes
  topology_state_machine: extend request_param with a set of raft ids
  raft topology: set ignore_nodes in raft_removenode and raft_replace
  utils: introduce split_comma_separated_list
  raft topology: add the ignore_nodes column to system.topology
2023-08-22 18:04:59 +02:00
Kamil Braun
cdc3cd2b79 Merge 'raft: add fencing tests' from Petr Gusev
In this PR a simple test for fencing is added. It exercises the data
plane, meaning if it somehow happens that the node has a stale topology
version, then requests from this node will get an error 'stale
topology'. The test just decrements the node version manually through
CQL, so it's quite artificial. To test a more real-world scenario we
need to allow the topology change fiber to sometimes skip unavailable
nodes. Now the algorithm fails and retries indefinitely in this case.

The PR also adds some logs, and removes one seemingly redundant topology
version increment, see the commit messages for details.

Closes #14901

* github.com:scylladb/scylladb:
  test_fencing: add test_fence_hints
  test.py: output the skipped tests
  test.py: add skip_mode decorator and fixture
  test.py: add mode fixture
  hints: add debug log for dropped hints
  hints: send_one_hint: extend the scope of file_send_gate holder
  pylib: add ScyllaMetrics
  hints manager: add send_errors counter
  token_metadata: add debug logs
  fencing: add simple data plane test
  random_tables.py: add counter column type
  raft topology: don't increment version when transitioning to node_state::normal
2023-08-22 16:28:21 +02:00
Piotr Grabowski
17e3e367ca test: use more frequent reconnection policy
The default reconnection policy in Python Driver is an exponential
backoff (with jitter) policy, which starts at 1 second reconnection
interval and ramps up to 600 seconds.

This is a problem in tests (refs #15104), especially in tests that restart
or replace nodes. In such a scenario, a node can be unavailable for an
extended period of time and the driver will try to reconnect to it
multiple times, eventually reaching very long reconnection interval
values, exceeding the timeout of a test.

Fix the issue by using an exponential reconnection policy with a maximum
interval of 4 seconds. A smaller value was not chosen, as each retry
clutters the logs with reconnection exception stack trace.
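
To see why the cap matters, here is a sketch of a capped exponential schedule. It is not the driver's implementation: jitter is omitted, the function names are hypothetical, and the 1 second base delay is assumed from the description of the default policy above:

```python
import itertools


def capped_exponential_delays(base_delay, max_delay):
    """Yield base, 2*base, 4*base, ... capped at max_delay (jitter omitted)."""
    delay = base_delay
    while True:
        yield min(delay, max_delay)
        delay = min(delay * 2, max_delay)


def first_delays(base_delay, max_delay, n):
    # Materialize the first n delays of the schedule.
    return list(itertools.islice(capped_exponential_delays(base_delay, max_delay), n))
```

With a 600 second cap the eleventh attempt already waits 600 seconds, while a 4 second cap keeps every later attempt at 4 seconds. In the Python driver this kind of schedule is configured through `cassandra.policies.ExponentialReconnectionPolicy`; the exact arguments used by the patch are not shown here.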

Fixes #15104

Closes #15112
2023-08-22 15:40:39 +02:00
Avi Kivity
d944872d19 Merge 'Prevent reactor stalls in to_repair_rows_list' from Benny Halevy
This short series deals with two stall sources in row-level repair `to_repair_rows_list`:
1. Freeing the input `repair_rows_on_wire` in one shot on return (as seen in https://github.com/scylladb/scylladb/issues/14537)
2. Freeing the result `row_list` in one shot on error. This hasn't been seen in testing, but I have no reason to believe it is not susceptible to stalls exactly like repair_rows_on_wire with the same number of rows and mutations.

Fixes https://github.com/scylladb/scylladb/issues/14537

Closes #15102

* github.com:scylladb/scylladb:
  repair: reindent to_repair_rows_list
  repair: to_repair_rows_list: clear_gently on error
  repair: to_repair_rows_list: consume frozen rows gently
2023-08-22 15:29:37 +03:00
Patryk Jędrzejczak
b044ee535f test: add test_raft_ignore_nodes
We add two tests verifying that --ignore-dead-nodes in
raft_removenode and --ignore-dead-nodes-for-replace in
raft_replace are handled correctly.

We need a 7-node cluster to have a Raft majority. Therefore, these
tests are quite slow, and we want to run them only in the dev mode.
2023-08-22 14:19:21 +02:00
Patryk Jędrzejczak
6818d13f7d test: ManagerClient.remove_node: allow List[HostId] for ignore_dead
ManagerClient.remove_node allows passing ignore_dead only as
List[IPAddress]. However, raft_removenode currently supports
only host ids. To write a test that passes ignore_dead to
ManagerClient.remove_node in the Raft topology mode, we allow
passing ignore_dead as List[HostId].

Note that we don't want to use List[IPAddress | HostId] because
mixing IP addresses and host ids fails anyway. See
ss::remove_node.set(...) in api::set_storage_service.
2023-08-22 14:19:09 +02:00
Patryk Jędrzejczak
26ad527666 raft topology: pass ignore_nodes to {replace, remove}_with_repair
To properly stream ranges during the removenode or replace
operation in the Raft topology mode, we pass IPs of the ignored
nodes to replace_with_repair and remove_with_repair in
storage_service::raft_topology_cmd_handler.
2023-08-22 14:18:39 +02:00
Patryk Jędrzejczak
e685182290 raft topology: exec_global_command: add ignore_nodes to exclude_nodes
We add ignore_nodes to exclude_nodes in exec_global_command
to ignore nodes marked as dead by --ignore-dead-nodes for
raft_removenode and --ignore-dead-nodes-for-replace for
raft_replace.
2023-08-22 14:18:37 +02:00
Patryk Jędrzejczak
5ebee35f99 raft topology: exec_global_command: change type of exclude_nodes
We extend exclude_nodes in exec_global_command with ignore_nodes
in the next commit. Since we already use std::unordered_set to
store ids of the ignored nodes and their number is unknown, we
change the type of exclude_nodes from utils::small_vector to
std::unordered_set.
2023-08-22 14:17:55 +02:00
Patryk Jędrzejczak
1f57d80ba1 topology_state_machine: extend request_param with a set of raft ids
We add two new alternative types to service::request_param:
removenode_param and replace_param. They allow storing the list
of ignored nodes loaded from the ignore_nodes column of
system.topology. We also remove the raft::server_id type because
it has been only used by the replace operation.
2023-08-22 14:17:37 +02:00
Patryk Jędrzejczak
7d3dc306eb raft topology: set ignore_nodes in raft_removenode and raft_replace
To handle --ignore-dead-nodes in raft_removenode and
--ignore-dead-nodes-for-replace in raft_replace, we set the
ignore_nodes value of the topology mutation in these functions. In
the following commits, we ensure that the topology coordinator
properly makes use of it.
2023-08-22 14:13:51 +02:00
Petr Gusev
1ddc76ffd1 test_fencing: add test_fence_hints
The test makes a write through the first node with
the third node down. This causes a hint to be stored on the
first node for the second. We increment the version
and fence_version on the third node, restart it,
and expect to see a hint delivery failure
because of a version mismatch. Then we update the versions
of the first node and expect the hint to be successfully
delivered.
2023-08-22 15:48:40 +04:00
Petr Gusev
3ccd2abad4 test.py: output the skipped tests
The pytest option -rs makes it print
all the skipped tests along with
the reasons. Without this option we
can't tell why certain tests were skipped;
maybe some of them should no longer be skipped.
2023-08-22 15:48:40 +04:00
Petr Gusev
c434d26b36 test.py: add skip_mode decorator and fixture
Syntactic sugar for marking tests to be
skipped in a particular mode.

There is skip_in_debug/skip_in_release in suite.yaml,
but they can be applied only on the entire file,
which is unnatural and inconvenient. Also, they
don't allow specifying a reason why the test is skipped.

Separate dictionary skipped_funcs is needed since
we can't use pytest fixtures in decorators.
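
A minimal sketch of the idea, assuming the decorator only records the request and something else (for example a fixture that knows the current mode) acts on it later. Apart from `skip_mode` and `skipped_funcs`, which are named above, every name and signature here is hypothetical:

```python
# Maps (test function, mode) to the reason the test is skipped in that mode.
skipped_funcs = {}


def skip_mode(mode, reason):
    """Mark a test to be skipped in the given mode, with a reason."""
    def decorator(func):
        # The decorator cannot know the current mode, so it only records.
        skipped_funcs[(func, mode)] = reason
        return func
    return decorator


def skip_reason(func, current_mode):
    # What a fixture could consult at run time before deciding to skip.
    return skipped_funcs.get((func, current_mode))


@skip_mode("debug", "too slow in debug mode")
def test_slow_thing():
    pass
```

A fixture could then look up `skip_reason(test_function, mode)` and call `pytest.skip(reason)` when it returns a value.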
2023-08-22 15:48:40 +04:00
Petr Gusev
a639d161e6 test.py: add mode fixture
Sometimes a test wants to know what mode
it is running in so that e.g. it can skip
itself in some of them.
2023-08-22 15:48:40 +04:00
Petr Gusev
439c91851f hints: add debug log for dropped hints
Dropping data is a rather important event,
let's log it at least at the debug level.
It'll help in debugging tests.
2023-08-22 15:48:40 +04:00
Petr Gusev
9fd3df13a2 hints: send_one_hint: extend the scope of file_send_gate holder
The problem was that the holder in with_gate
call was released too early. This happened
before the possible call to on_hint_send_failure
in then_wrapped. As a result, the effects of
on_hint_send_failure (segment_replay_failed flag)
were not visible in send_one_file after
ctx_ptr->file_send_gate.close(), so we could decide
that the segment was sent in full and delete
it even if sending of some hints led to errors.

Fixes #15110
2023-08-22 15:48:40 +04:00
Petr Gusev
0b7a90dff6 pylib: add ScyllaMetrics
This patch adds facilities to work
with Scylla metrics from test.py tests.
The new metrics property was added to
ManagerClient, its query method
sends a request to Scylla metrics
endpoint and returns an object
to conveniently access the result.
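
As a rough illustration of what such a result object might do, the sketch below parses metrics in the Prometheus text format and sums the samples of one metric. `MetricsSnapshot`, its `get` method, and the sample metric names are hypothetical; the real ScyllaMetrics interface may differ:

```python
class MetricsSnapshot:
    """Hypothetical container; the real ScyllaMetrics API may differ."""

    def __init__(self, text):
        # Keep sample lines only; drop blanks and "# HELP" / "# TYPE" comments.
        self.lines = [l for l in text.splitlines() if l and not l.startswith("#")]

    def get(self, name, shard=None):
        # Sum all samples of a metric, optionally restricted to one shard.
        total = None
        for line in self.lines:
            metric, _, value = line.rpartition(" ")
            metric_name = metric.split("{", 1)[0]
            if metric_name != name:
                continue
            if shard is not None and f'shard="{shard}"' not in metric:
                continue
            total = (total or 0.0) + float(value)
        return total


# Made-up sample data in the Prometheus text exposition format.
SAMPLE = """\
# TYPE example_send_errors counter
example_send_errors{shard="0"} 2
example_send_errors{shard="1"} 1
example_other{shard="0"} 7
"""
```

Returning `None` for an unknown metric lets a test distinguish "metric missing" from "metric is zero".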

ScyllaMetrics is copy-pasted from
test_shedding.py. It's difficult
to reuse code between 'new' and 'old'
styles of tests, we can't just import
pylib in 'old' tests because of some
problems with python search directories.
A past commit of mine that attempted
to solve this problem was rejected on review.
2023-08-22 14:31:04 +04:00
Petr Gusev
1b7603af23 hints manager: add send_errors counter
There was no indication of problems
in the hints manager metrics before.
We need this counter for fencing tests
in a later commit, but it seems to be
useful on its own.
2023-08-22 14:31:04 +04:00
Petr Gusev
fa25e6d63e token_metadata: add debug logs
We log the new version when the new token
metadata is set.

Also, the log for fence_version is moved
in shared_token_metadata from storage_service
for uniformity.
2023-08-22 14:31:04 +04:00
Petr Gusev
360453fd87 fencing: add simple data plane test
The test starts a three node cluster
and manually decrements the version on
the last node. It then tries to write
some data through the last node and
expects to get a 'stale topology' exception.
2023-08-22 14:31:01 +04:00
Benny Halevy
801987ab19 gossiper: mark_alive: remove local_state param
It is not used anymore.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 12:06:45 +03:00
Benny Halevy
75d1dd3a76 endpoint_state: get rid of _is_alive member and methods
Now that gossiper bases its is_alive status on _live_endpoints.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 12:06:45 +03:00