scylladb

Author	SHA1	Message	Date
Kefu Chai	372a4d1b79	treewide: do not define FMT_DEPRECATED_OSTREAM since we do not rely on FMT_DEPRECATED_OSTREAM to define the fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`. in this change, * utils: drop the range formatters in to_string.hh and to_string.c, as we don't use them anymore. and the tests for them in test/boost/string_format_test.cc are removed accordingly. * utils: use fmt to print chunk_vector and small_vector. as we are not able to print the elements using operator<< anymore after switching to {fmt} formatters. * test/boost: specialize fmt::details::is_std_string_like<bytes> due to a bug in {fmt} v9, {fmt} fails to format a range whose element type is `basic_sstring<uint8_t>`, as it considers it as a string-like type, but `basic_sstring<uint8_t>`'s char type is signed char, not char. this issue does not exist in {fmt} v10, so, in this change, we add a workaround to explicitly specialize the type trait to ensure that {fmt} formats this type using its `fmt::formatter` specialization instead of trying to format it as a string. also, {fmt}'s generic ranges formatter calls the pair formatter's `set_brackets()` and `set_separator()` methods when printing the range, but the operator<< based formatter does not provide these methods, so we have to include this change in the change switching to {fmt}, otherwise the change specializing `fmt::details::is_std_string_like<bytes>` won't compile. * test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends for comparing values. but without the operator<< based formatters, Boost.Test would not be able to print them. after removing the homebrew formatters, we need to use the generic `boost_test_print_type()` helper to do this job. so we are including `test_utils.hh` in tests so that we can print the formattable types. * treewide: add `#include "utils/to_string.hh"` where `fmt::formatter<optional<>>` is used. * configure.py: do not define FMT_DEPRECATED_OSTREAM * cmake: do not define FMT_DEPRECATED_OSTREAM Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:57:36 +08:00
Patryk Jędrzejczak	df2034ebd7	server, raft_group0_client: remove the default nullptr values The previous commit has fixed 5 bugs of the same type - incorrectly passing the default nullptr to one of the changed functions. At least some of these bugs wouldn't appear if there was no default value. It's much harder to make this kind of a bug if you have to write "nullptr". It's also much easier to detect it in review. Moreover, these default values are rarely used outside tests. Keeping them is just not worth the time spent on debugging.	2024-01-05 18:45:50 +01:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Kamil Braun	2fea2fc19c	raft: replication test: don't hang if `_seen` overshoots `_apply_entries` As in the previous commit, if a command gets doubly applied due to `commit_status_unknown`, this could lead to hard-to-debug failures; one of them was the test hanging because we would never call `_done.set_value()` in `state_machine::apply` due to `_seen` overshooting `_apply_entries`. Fix the problem and print a warning if we apply too many commands. Fixes: #14072	2023-06-07 14:17:23 +02:00
Kamil Braun	43b48c59fd	raft: replication test: print a warning when handling `commit_status_unknown` `commit_status_unknown` may lead to double application and then a hard-to-debug failure. But some tests actually rely on retrying it, so print a warning and leave a FIXME for maybe a better future solution. Ref: #14029	2023-06-07 14:17:20 +02:00
Kefu Chai	3ae11de204	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:53 +08:00
Avi Kivity	e2f6e0b848	utils: move hashing related files to utils/ module Closes #12884	2023-02-17 07:19:52 +02:00
Konstantin Osipov	990c7a209f	raft: change the API of conf change notifications Pass a change diff into the notification callback, rather than add or remove servers one by one, so that if we need to persist the state, we can do it once per configuration change, not for every added or removed server. For now still pass added and removed entries in two separate calls per a single configuration change. This is done mainly to fulfill the library contract that it never sends messages to servers outside the current configuration. The group0 RPC implementation doesn't need the two calls, since it simply marks the removed servers as expired: they are not removed immediately anyway, and messages can still be delivered to them. However, there may be test/mock implementations of RPC which could benefit from this contract, so we decided to keep it.	2022-11-17 12:07:31 +03:00
Petr Gusev	bc50b7407f	raft replication_test, make backpressure test to do actual backpressure Before this patch this test didn't actually experience any backpressure since all the commands were executed sequentially.	2022-09-27 12:04:14 +04:00
Petr Gusev	c57238d3d6	raft server, check aborted state on public server public api's Fix: #11352	2022-09-12 10:16:40 +04:00
Kamil Braun	daf9c53bb8	raft: split `can_vote` field from `server_address` to separate struct Whether a server can vote in a Raft configuration is not part of the address. `server_address` was used in many contexts where `can_vote` is irrelevant. Split the struct: `server_address` now contains only `id` and `server_info` as it did before `can_vote` was introduced. Instead we have a `config_member` struct that contains a `server_address` and the `can_vote` field. Also remove an "unsafe" constructor from `server_address` where `id` was provided but `server_info` was not. The constructor was used for tests where `server_info` is irrelevant, but it's important not to forget about the info in production code. The constructor was used for two purposes: - Invoking set operations such as `contains`. To solve this we use C++20 transparent hash and comparator functions, which allow invoking `contains` and similar functions by providing a different key type (in this case `raft::server_id` in set of addresses, for example). - constructing addresses without `info`s in tests. For this we provide helper functions in the test helpers module and use them.	2022-07-18 18:22:10 +02:00
Avi Kivity	4b53af0bd5	treewide: replace parallel_for_each with coroutine::parallel_for_each in coroutines coroutine::parallel_for_each avoids an allocation and is therefore preferred. The lifetime of the function object is less ambiguous, and so it is safer. Replace all eligible occurrences (i.e. caller is a coroutine). One case (storage_service::node_ops_cmd_heartbeat_updater()) needed a little extra attention since there was a handle_exception() continuation attached. It is converted to a try/catch. Closes #10699	2022-05-31 09:06:24 +03:00
Gleb Natapov	a1604aa388	raft: make raft requests abortable This patch adds an ability to pass abort_source to raft request APIs (add_entry, modify_config) to make them abortable. A request issuer does not always want to wait for a request to complete. For instance because a client disconnected or because it is no longer interested in waiting because of a timeout. After this patch it can now abort waiting for such requests through an abort source. Note that aborting a request only aborts the wait for it to complete, it does not mean that the request will not be eventually executed. Message-Id: <YjHivLfIB9Xj5F4g@scylladb.com>	2022-03-16 18:38:01 +01:00
Gleb Natapov	579dcf187a	raft: allow an option to persist commit index Raft does not need to persist the commit index since a restarted node will either learn it from an append message from a leader or (if entire cluster is restarted and hence there is no leader) new leader will figure it out after contacting a quorum. But some users may want to be able to bring their local state machine to a state as up-to-date as it was before restart as soon as possible without any external communication. For them this patch introduces new persistence API that allows saving and restoring last seen committed index. Message-Id: <YfFD53oS2j1My0p/@scylladb.com>	2022-01-26 14:06:39 +01:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	4118f2d8be	treewide: replace deprecated seastar::later() with seastar::yield() seastar::later() was recently deprecated and replaced with two alternatives: a cheap seastar::yield() and an expensive (but more powerful) seastar::check_for_io_immediately(), that corresponds to the original later(). This patch replaces all later() calls with the weaker yield(). In all cases except one, it's unambiguously correct. In one case (test/perf scheduling_latency_measurer::stop()) it's not so unambiguous, since check_for_io_immediately() will additionally force a poll and so will cause more work to be done (but no additional tasks to be executed). However, I think that any measurement that relies on measuring the work on the last tick is inaccurate enough (you need thousands of ticks to get any amount of confidence in the measurement) that in the end it doesn't matter what we pick. Tests: unit (dev) Closes #9904	2022-01-12 12:19:19 +01:00
Konstantin Osipov	65e549946f	raft: add a test case for adding entries on follower	2021-11-25 11:50:38 +03:00
Konstantin Osipov	e3751068fe	raft: (server) allow adding entries/modify config on a follower Implement an RPC to forward add_entry calls from the follower to leader. Bounce & retry in case of not_a_leader. Do not retry in case of uncertainty - this can lead to adding duplicate entries. The feature is added to core Raft since it's needed by all current clients - both topology and schema changes. When forwarding an entry to a remote leader we may get back a term/index pair that conflicts (has the same index, but is with a higher term) with a local entry we're still waiting on. This can happen, e.g. because there was a leader change and the log was truncated, but we still haven't got the append_entries RPC from the new leader, still haven't truncated the log locally, still haven't aborted all the local waits for truncated entries. Only remove the offending entry from the wait list and abort it. There may be entries labeled with an older term to the right (with higher commit index) of the conflicting entry. However, finding them, would require a linear scan. If we allow it, we may end up doing this linear scan for every conflicting entry during the transition period, which brings us to N^2 complexity of this step. At the same time, as soon as append_entries that commits a higher-term entry with the same index reaches the follower, the waits for the respective truncated entry will be aborted anyway (see notify_waiters() which sets dropped_entry exception), so the scan is unnecessary. Similarly to being able to add entries, allow to modify Raft group configuration on a follower. The implementation works the same way as adding entries - forwards the command to the leader. Now that add_entry() or modify_config never throws not_a_leader, it's more likely to throw timed_out_error, e.g. in case the network is partitioned. Previously it was only possible due to a semaphore wait timeout, and this scenario was not tested. Handle timed_out_error on RPC level to let the existing tests (specifically the randomized nemesis test) pass.	2021-11-25 11:50:38 +03:00
Konstantin Osipov	ae5dc8e980	raft: (test) replace virtual with override in derived class Clang 12 complains if use of override is inconsistent, so stick to it everywhere.	2021-11-25 11:50:38 +03:00
Avi Kivity	daf028210b	build: enable -Winconsistent-missing-override warning This warning can catch a virtual function that thinks it overrides another, but doesn't, because the two functions have different signatures. This isn't very likely since most of our virtual functions override pure virtuals, but it's still worth having. Enable the warning and fix numerous violations. Closes #9347	2021-09-15 12:55:54 +03:00
Gleb Natapov	ce40b01b07	raft: rename snapshot into snapshot_descriptor The snapshot structure does not contain the snapshot itself but only refers to it through its id. Rename it to snapshot_descriptor for clarity.	2021-08-29 12:53:03 +03:00
Gleb Natapov	80a392a444	raft: replication_test: store multiple snapshots in a state machine State machine should be able to store more than one snapshot at a time (one may be the currently used one and another is transferred from a leader but not applied yet).	2021-08-29 12:53:03 +03:00
Gleb Natapov	3ff6f76cef	raft: test: add read_barrier test to replication_test	2021-08-25 08:57:13 +03:00
Gleb Natapov	03a266d73b	raft: make read_barrier work on a follower as well as on a leader This patch implements a RAFT extension that allows performing linearisable reads by accessing the local state machine. The extension is described in section 6.4 of the PhD. To sum it up, to perform a read barrier a follower needs to ask the leader for the last committed index that it knows about. The leader must make sure that it is still a leader before answering by communicating with a quorum. When the follower gets the index back it waits for it to be applied and by that completes the read_barrier invocation. The patch adds three new RPCs: read_barrier, read_barrier_reply and execute_read_barrier_on_leader. The last one is the one a follower uses to ask a leader about the safe index it can read. The first two are used by a leader to communicate with a quorum.	2021-08-25 08:57:13 +03:00
Alejo Sanchez	87a03a3485	raft: replication test: remove unused tick_all Tests now wait for normal ticks for election, remove deprecated tick_all helper. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-24 13:09:01 +02:00
Alejo Sanchez	14c214d73e	raft: replication test: delays Allow test-supplied delays for rpc communication. Allow supplying network delay, local delay (nodes within the same server), how many nodes are local, and an extra small delay simulating local load. Modify rpc class to support delays. If delays are enabled, it no longer directly calls the other node's server code but it schedules it to be called later. This makes the test more realistic as in the previous version the first candidate was always going to get to all followers first, preventing a dueling candidates scenario. Previously, tickers were all scheduled at the same time, so there was no spread of them across the tick time. Now these tickers are scheduled with a uniform spread across this time (tick delta). Also previously, custom free elections used tick_all(), which traversed _in_configuration sequentially and ticked each. This, combined with rpc outbound directly calling methods in the other server without yielding, caused free elections to be unrealistic, with the same order determined and the first candidate always winning. This patch changes this behavior. The free election uses normal tickers (now uniformly distributed in tick delay time) and its loop waits for tick delay time (yielding) and checks if there's a new leader. Also note the order might not be the same in debug mode if more than one tick is scheduled. As rpc messages are sent delayed, network connectivity needs to be checked again before calling the function on the remote side. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-24 13:05:53 +02:00
Alejo Sanchez	db23823c77	raft: replication test: packet drop rpc helper Add a helper to check if a packet should be dropped. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	497af3167f	raft: replication test: connectivity configuration Pass packet drops within connectivity configuration struct. Default to no packet drops. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	e4d5428e8a	raft: replication test: rpc network map in raft_cluster Move rpc network map to raft cluster, no longer as static in rpc class.	2021-08-23 17:50:16 +02:00
Alejo Sanchez	5cfe6c1ca2	raft: replication test: minor: rename local to int ids For clarity, name 0-based integer ids as int ids not local. This is in contrast with 1-based UUID ids.	2021-08-23 17:50:16 +02:00
Alejo Sanchez	27d90f0165	raft: replication test: fix restart_tickers when partitioning When partitioning, elect_new_leader restarts tickers, so don't re-restart them in this case. When leader is dropped and no new leader is specified, restart tickers before free election. If no change of leader, restart tickers. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	e4262291f2	raft: replication test: partition ranges Allow specifying ranges within partition to handle large number of nodes. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	56a110d42f	raft: replication test: isolate one server Support disconnection of one server with the rest. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	6b3327c753	raft: replication test: move objects out of header Use a separate cc file for definitions and objects. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	cea18e6830	raft: replication test: make dummy command const Make dummy command const in header. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	2db3192ac3	raft: replication test: template clock type Templatize clock type. Use a struct for run_test to work around https://bugs.llvm.org/show_bug.cgi?id=50345 With help from @kbr- Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	cb35588fb1	raft: replication test: tick delta inside raft_cluster Store tick delta inside raft_cluster. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	49cb040037	raft: replication test: style - member initializer Fix raft_cluster constructor member initializer list.	2021-08-23 17:50:16 +02:00
Alejo Sanchez	6e2ab657b3	raft: replication test: move common code out Common replication test code moved to header. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00

39 Commits