Commit Graph

39 Commits

Author SHA1 Message Date
Kefu Chai
372a4d1b79 treewide: do not define FMT_DEPRECATED_OSTREAM
since we do not rely on FMT_DEPRECATED_OSTREAM to define the
fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`.

in this change,

* utils: drop the range formatters in to_string.hh and to_string.c, as
  we don't use them anymore. and the tests for them in
  test/boost/string_format_test.cc are removed accordingly.
* utils: use fmt to print chunk_vector and small_vector. as
  we are not able to print the elements using operator<< anymore
  after switching to {fmt} formatters.
* test/boost: specialize fmt::details::is_std_string_like<bytes>
  due to a bug in {fmt} v9, {fmt} fails to format a range whose
  element type is `basic_sstring<uint8_t>`, as it considers it
  as a string-like type, but `basic_sstring<uint8_t>`'s char type
  is signed char, not char. this issue does not exist in {fmt} v10,
  so, in this change, we add a workaround to explicitly specialize
  the type trait to assure that {fmt} format this type using its
  `fmt::formatter` specialization instead of trying to format it
  as a string. also, {fmt}'s generic ranges formatter calls the
  pair formatter's `set_brackets()` and `set_separator()` methods
  when printing the range, but operator<< based formatter does not
  provide these method, we have to include this change in the change
  switching to {fmt}, otherwise the change specializing
  `fmt::details::is_std_string_like<bytes>` won't compile.
* test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends
  for comparing values. but without the operator<< based formatters,
  Boost.Test would not be able to print them. after removing
  the homebrew formatters, we need to use the generic
  `boost_test_print_type()` helper to do this job. so we are
  including `test_utils.hh` in tests so that we can print
  the formattable types.
* treewide: add "#include "utils/to_string.hh" where
  `fmt::formatter<optional<>>` is used.
* configure.py: do not define FMT_DEPRECATED_OSTREAM
* cmake: do not define FMT_DEPRECATED_OSTREAM

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:57:36 +08:00
Patryk Jędrzejczak
df2034ebd7 server, raft_group0_client: remove the default nullptr values
The previous commit has fixed 5 bugs of the same type - incorrectly
passing the default nullptr to one of the changed functions. At
least some of these bugs wouldn't appear if there was no default
value. It's much harder to make this kind of a bug if you have to
write "nullptr". It's also much easier to detect it in review.

Moreover, these default values are rarely used outside tests.
Keeping them is just not worth the time spent on debugging.
2024-01-05 18:45:50 +01:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Kamil Braun
2fea2fc19c raft: replication test: don't hang if _seen overshots _apply_entries
As in the previous commit, if a command gets doubly applied due to
`commit_status_unknown`, this will could lead to hard-to-debug failures;
one of them was the test hanging because we would never call
`_done.set_value()` in `state_machine::apply` due to `_seen`
overshooting `_apply_entries`.

Fix the problem and print a warning if we apply too many commands.

Fixes: #14072
2023-06-07 14:17:23 +02:00
Kamil Braun
43b48c59fd raft: replication test: print a warning when handling commit_status_unknown
`commit_status_unknown` may lead to double application and then a
hard-to-debug failure. But some tests actually rely on retrying it, so
print a warning and leave a FIXME for maybe a better future solution.

Ref: #14029
2023-06-07 14:17:20 +02:00
Kefu Chai
3ae11de204 treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-28 21:56:53 +08:00
Avi Kivity
e2f6e0b848 utils: move hashing related files to utils/ module
Closes #12884
2023-02-17 07:19:52 +02:00
Konstantin Osipov
990c7a209f raft: change the API of conf change notifications
Pass a change diff into the notification callback,
rather than add or remove servers one by one, so that
if we need to persist the state, we can do it once per
configuration change, not for every added or removed server.

For now still pass added and removed entries in two separate calls
per a single configuration change. This is done mainly to fulfill the
library contract that it never sends messages to servers
outside the current configuration. The group0 RPC
implementation doesn't need the two calls, since it simply
marks the removed servers as expired: they are not removed immediately
anyway, and messages can still be delivered to them.
However, there may be test/mock implementations of RPC which
could benefit from this contract, so we decided to keep it.
2022-11-17 12:07:31 +03:00
Petr Gusev
bc50b7407f raft replication_test, make backpressure test to do actual backpressure
Before this patch this test didn't actually experience any backpressure since all the commands were executed sequentially.
2022-09-27 12:04:14 +04:00
Petr Gusev
c57238d3d6 raft server, check aborted state on public server public api's
Fix: #11352
2022-09-12 10:16:40 +04:00
Kamil Braun
daf9c53bb8 raft: split can_vote field from server_address to separate struct
Whether a server can vote in a Raft configuration is not part of the
address. `server_address` was used in many context where `can_vote` is
irrelevant.

Split the struct: `server_address` now contains only `id` and
`server_info` as it did before `can_vote` was introduced. Instead we
have a `config_member` struct that contains a `server_address` and the
`can_vote` field.

Also remove an "unsafe" constructor from `server_address` where `id` was
provided but `server_info` was not. The constructor was used for tests
where `server_info` is irrelevant, but it's important not to forget
about the info in production code. The constructor was used for two
purposes:
- Invoking set operations such as `contains`. To solve this we use C++20
  transparent hash and comparator functions, which allow invoking
  `contains` and similar functions by providing a different key type (in
  this case `raft::server_id` in set of addresses, for example).
- constructing addresses without `info`s in tests. For this we provide
  helper functions in the test helpers module and use them.
2022-07-18 18:22:10 +02:00
Avi Kivity
4b53af0bd5 treewide: replace parallel_for_each with coroutine::parallel_for_each in coroutines
coroutine::parallel_for_each avoids an allocation and is therefore preferred. The lifetime
of the function object is less ambiguous, and so it is safer. Replace all eligible
occurences (i.e. caller is a coroutine).

One case (storage_service::node_ops_cmd_heartbeat_updater()) needed a little extra
attention since there was a handle_exception() continuation attached. It is converted
to a try/catch.

Closes #10699
2022-05-31 09:06:24 +03:00
Gleb Natapov
a1604aa388 raft: make raft requests abortable
This patch adds an ability to pass abort_source to raft request APIs (
add_entry, modify_config) to make them abortable. A request issuer not
always want to wait for a request to complete. For instance because a
client disconnected or because it no longer interested in waiting
because of a timeout. After this patch it can now abort waiting for such
requests through an abort source. Note that aborting a request only
aborts the wait for it to complete, it does not mean that the request
will not be eventually executed.

Message-Id: <YjHivLfIB9Xj5F4g@scylladb.com>
2022-03-16 18:38:01 +01:00
Gleb Natapov
579dcf187a raft: allow an option to persist commit index
Raft does not need to persist the commit index since a restarted node will
either learn it from an append message from a leader or (if entire cluster
is restarted and hence there is no leader) new leader will figure it out
after contacting a quorum. But some users may want to be able to bring
their local state machine to a state as up-to-date as it was before restart
as soon as possible without any external communication.

For them this patch introduces new persistence API that allows saving
and restoring last seen committed index.

Message-Id: <YfFD53oS2j1My0p/@scylladb.com>
2022-01-26 14:06:39 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
4118f2d8be treewide: replace deprecated seastar::later() with seastar::yield()
seastar::later() was recently deprecated and replaced with two
alternatives: a cheap seastar::yield() and an expensive (but more
powerful) seastar::check_for_io_immediately(), that corresponds to
the original later().

This patch replaces all later() calls with the weaker yield(). In
all cases except one, it's unambiguously correct. In one case
(test/perf scheduling_latency_measurer::stop()) it's not so ambiguous,
since check_for_io_immediately() will additionally force a poll and
so will cause more work to be done (but no additional tasks to be
executed). However, I think that any measurement that relies on
the measuring the work on the last tick to be inaccurate (you need
thousands of ticks to get any amount of confidence in the
measurement) that in the end it doesn't matter what we pick.

Tests: unit (dev)

Closes #9904
2022-01-12 12:19:19 +01:00
Konstantin Osipov
65e549946f raft: add a test case for adding entries on follower 2021-11-25 11:50:38 +03:00
Konstantin Osipov
e3751068fe raft: (server) allow adding entries/modify config on a follower
Implement an RPC to forward add_entry calls from the follower
to leader. Bounce & retry in case of not_a_leader.
Do not retry in case of uncertainty - this can lead to adding
duplicate entries.

The feature is added to core Raft since it's needed by
all current clients - both topology and schema changes.

When forwarding an entry to a remote leader we may get back
a term/index pair that conflicts (has the same index, but is with
a higher term) with a local entry we're still waiting on.

This can happen, e.g. because there was a leader change and the
log was truncated, but we still haven't got the append_entries
RPC from the new leader, still haven't truncated the log locally,
still haven't aborted all the local waits for truncated entries.

Only remove the offending entry from the wait list and abort it.
There may be entries labeled with an older term to the right (with
higher commit index) of the conflicting entry. However, finding them,
would require a linear scan. If we allow it, we may end up doing this
linear scan for *every* conflicting entry during the transition
period, which brings us to N^2 complexity of this step. At the
same time, as soon as append_entries that commits a higher-term
entry with the same index reaches the follower, the waits
for the respective truncated entry will be aborted anyway (see
notify_waiters() which sets dropped_entry exception), so the scan
is unnecessary.

Similarly to being able to add entries, allow to modify
Raft group configuration on a follower. The implementation
works the same way as adding entries - forwards the command
to the leader.

Now that add_entry() or modify_config never throws not_a_leader,
it's more likely to throw timed_out_error, e.g. in case the
network is partitioned. Previously it was only possible due to a
semaphore wait timeout, and this scenario was not tested.
Handle timed_out_error on RPC level to let the existing tests
(specifically the randomized nemesis test) pass.
2021-11-25 11:50:38 +03:00
Konstantin Osipov
ae5dc8e980 raft: (test) replace virtual with override in derived class
Clang 12 complains if use of override is inconsistent,
so stick to it everywhere.
2021-11-25 11:50:38 +03:00
Avi Kivity
daf028210b build: enable -Winconsistent-missing-override warning
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.

Enable the warning and fix numerous violations.

Closes #9347
2021-09-15 12:55:54 +03:00
Gleb Natapov
ce40b01b07 raft: rename snapshot into snapshot_descriptor
The snapshot structure does not contain the snapshot itself but only
refers to it trough its id. Rename it to snapshot_descriptor for clarity.
2021-08-29 12:53:03 +03:00
Gleb Natapov
80a392a444 raft: replication_test: store multiple snapshots in a state machine
State machine should be able to store more then one snapshot at a time
(one may be the currently used one and another is transferred from a
leader but not applied yet).
2021-08-29 12:53:03 +03:00
Gleb Natapov
3ff6f76cef raft: test: add read_barrier test to replication_test 2021-08-25 08:57:13 +03:00
Gleb Natapov
03a266d73b raft: make read_barrier work on a follower as well as on a leader
This patch implements RAFT extension that allows to perform linearisable
reads by accessing local state machine. The extension is described
in section 6.4 of the PhD. To sum it up to perform a read barrier on
a follower it needs to asks a leader the last committed index that it
knows about. The leader must make sure that it is still a leader before
answering by communicating with a quorum. When follower gets the index
back it waits for it to be applied and by that completes read_barrier
invocation.

The patch adds three new RPC: read_barrier, read_barrier_reply and
execute_read_barrier_on_leader. The last one is the one a follower uses
to ask a leader about safe index it can read. First two are used by a
leader to communicate with a quorum.
2021-08-25 08:57:13 +03:00
Alejo Sanchez
87a03a3485 raft: replication test: remove unused tick_all
Tests now wait for normal ticks for election, remove deprecated tick_all
helper.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-24 13:09:01 +02:00
Alejo Sanchez
14c214d73e raft: replication test: delays
Allow test supplied delays for rpc communication.

Allow supplying network delay, local delay (nodes within the same
server), how many nodes are local, and an extra small delay simulating
local load.

Modify rpc class to support delays. If delays are enabled, it no longer
directly calls the other node's server code but it schedules it to be
called later. This makes the test more realistic as in the previous
version the first candidate was always going to get to all followers
first, preventing a dueling candidates scenario.

Previously, tickers were all scheduled at the same time, so there was no
spread of them across the tick time. Now these tickers are scheduled
with a uniform spread across this time (tick delta).

Also previously, for custom free elections used tick_all() which
traversed _in_configuration sequentially and ticked each. This, combined
with rpc outbound directly calling methods in the other server without
yielding, caused free elections to be unrealistic with same order
determined and first candidate always winning. This patch changes this
behavior. The free election uses normal tickers (now uniformly
distributed in tick delay time) and its loop waits for tick delay time
(yielding) and checks if there's a new leader. Also note the order might
not be the same in debug mode if more than one tick is scheduled.

As rpc messages are sent delayed, network connectivity needs to be
checked again before calling the function on the remote side.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-24 13:05:53 +02:00
Alejo Sanchez
db23823c77 raft: replication test: packet drop rpc helper
Add a helper to check if a packet should be dropped.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
497af3167f raft: replication test: connectivity configuration
Pass packet drops within connectivity configuration struct.
Default to no packet drops.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
e4d5428e8a raft: replication test: rpc network map in raft_cluster
Move rpc network map to raft cluster, no longer as static in rpc class.
2021-08-23 17:50:16 +02:00
Alejo Sanchez
5cfe6c1ca2 raft: replication test: minor: rename local to int ids
For clarity, name 0-based integer ids as int ids not local.
This is in contrast with 1-based UUID ids.
2021-08-23 17:50:16 +02:00
Alejo Sanchez
27d90f0165 raft: replication test: fix restart_tickers when partitioning
When partitioning, elect_new_leader restarts tickers, so don't
re-restart them in this case.

When leader is dropped and no new leader is specified, restart tickers
before free election.

If no change of leader, restart tickers.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
e4262291f2 raft: replication test: partition ranges
Allow specifying ranges within partition to handle large number of
nodes.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
56a110d42f raft: replication test: isolate one server
Support disconnection of one server with the rest.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
6b3327c753 raft: replication test: move objects out of header
Use a separate cc file for definitions and objects.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
cea18e6830 raft: replication test: make dummy command const
Make dummy command const in header.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
2db3192ac3 raft: replication test: template clock type
Templetize clock type.

Use a struct for run_test to work around
https://bugs.llvm.org/show_bug.cgi?id=50345

With help from @kbr-

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
cb35588fb1 raft: replication test: tick delta inside raft_cluster
Store tick delta inside raft_cluster.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00
Alejo Sanchez
49cb040037 raft: replication test: style - member initializer
Fix raft_cluster constructor member initializer list.
2021-08-23 17:50:16 +02:00
Alejo Sanchez
6e2ab657b3 raft: replication test: move common code out
Common replication test code moved to header.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-08-23 17:50:16 +02:00