Commit Graph

17 Commits

Author SHA1 Message Date
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Gleb Natapov
ce40b01b07 raft: rename snapshot into snapshot_descriptor
The snapshot structure does not contain the snapshot itself but only
refers to it trough its id. Rename it to snapshot_descriptor for clarity.
2021-08-29 12:53:03 +03:00
Kamil Braun
7533c84e62 raft: sometimes become a candidate even if outside the configuration
There are situations where a node outside the current configuration is
the only node that can become a leader. We become candidates in such
cases. But there is an easy check for when we don't need to; a comment was
added explaining that.
2021-08-06 13:18:32 +02:00
Kamil Braun
c6563220b0 raft: store cluster configuration when taking snapshots
We add a function `log_last_conf_before(index_t)` to `fsm` which, given
an index greater than the last snapshot index, returns the configuration
at this index, i.e. the configuration of the last configuration entry
before this index.

This function is then used in `applier_fiber` to obtain the correct
configuration to be stored in a snapshot.

In order to ensure that the configuration can be obtained, i.e. the
index we're looking at is not smaller than the last snapshot index, we
strengthen the conditions required for taking a snapshot: we check that
`_fsm` has not yet applied a snapshot at a larger index (which it may
have due to a remote snapshot install request). This also causes fewer
unnecessary snapshots to be taken in general.
2021-08-06 12:00:32 +02:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Konstantin Osipov
0295163f6f raft: always return a non-zero configuration index from the log
Return snapshot index for last configuration index if there
is no configuration in the log.
2021-03-24 14:05:55 +03:00
Konstantin Osipov
4083026b65 raft: consistently use configuration from the log 2021-02-18 16:04:44 +03:00
Konstantin Osipov
51c968bcb4 raft: rename log::non_snapshoted_length() to log::in_memory_size()
The old name was incorrect, in case apply_snapshot() was called with
non-zero trailing entries, the total log length is greater than the
length of the part that is not stored in a snapshot.

Fix spelling in related comments.

Rename fsm::wait() to fsm::wait_max_log_size(), it's a more
specific name. Rename max_log_length to max_log_size to use
'size' rather than 'length' consistently for log size.
2021-02-18 16:04:44 +03:00
Konstantin Osipov
cfe407b402 raft: inline raft::log::truncate_tail()
It's the core of apply_snapshot() work and is only used in it.

Now that truncate_tail is inline, rename truncate_head()
to truncate_uncommitted().
2021-02-18 16:04:44 +03:00
Konstantin Osipov
805d52eb16 raft: remove log::start_idx()
Replace it with a private _first_idx, which is maintained
along with the rest of class log state.
_first_idx is a name consistent with counterpart last_idx().

Do not use a function since going forward we may want
to remove Raft index from struct log_entry, so should rely
less on it.

This fixes a bug when _last_conf_idx was not reset
after apply_snapshot() because start_idx() was pointing
to a non-existent entry.
2021-02-18 16:04:44 +03:00
Konstantin Osipov
af8770da63 raft: return a correct last term on an empty log
If the log is empty, we must use snapshot's term,
since the log could be right after taking a snapshot
when no trailing entries were kept.

This fixes a rare possible bug when a log matching
rule could be violated during elections by a follower
with a log which was just truncated after a snapshot.

A separate unit test for the issue will follow.
2021-02-18 16:04:43 +03:00
Konstantin Osipov
cb035a7c8d raft: do not use raft::log::start_idx() outside raft::log()
raft::log::start_idx() is currently not meaningful
in case the log is empty.

Avoid using it in fsm::replicate_to() and avoid manual search for
previous log term, instead encapsulate the search in log::term_for().

As a side effect we currently return a correct term (0)
when log matching rule is exercised for an empty log
and the very first snapshot with term 0. Update raft_etcd_test.cc
accordingly.

This change happens to reduce the overall line count.

While at it, improve the comments in raft::replicate_to().
2021-02-18 16:04:43 +03:00
Konstantin Osipov
b29181875c raft: joint consensus, keep track of the last confchange index in the log
When initializing the log, find the most recent configuration
change index, if present.
Maintain the most recent configuration change index when
the log is truncated or entries are appended to it.
The last configuration change index will be used by FSM when it enters
candidate or leader state to fetch the current configuration.

We never truncate beyond a single in-progress configuration
change, so storing the previous value of last_conf_idx
helps avoid log backward scan on truncation in 100% of cases.

Remove all unused log constructors.
2021-01-29 22:07:07 +03:00
Gleb Natapov
6d47a535b9 raft: combine append_request _receive and _send
Combine structs for append request send and receive into a single
struct.

Author:    Gleb Natapov <gleb@scylladb.com>
Date:      Mon Nov 23 14:33:14 2020 +0200
2021-01-18 12:24:13 -04:00
Gleb Natapov
8d9b6f588e raft: stop accepting requests on a leader after the log reaches the limit
To prevent the log to take too much memory introduce a mechanism that
limits the log to a certain size. If the size is reached no new log
entries can be submitted until previous entries are committed and
snapshotted.
2020-11-18 19:14:37 +01:00
Gleb Natapov
7fdfa32dbd raft: preserve trailing raft log entries during snapshotting
This patch allows to leave snapshot_trailing amount of entries
when a state machine is snapshotted and raft log entries are dropped.
Those entries can be used to catch up nodes that are slow without
requiring snapshot transfer. The value is part of the configuration
and can be changed.
2020-10-15 11:50:27 +03:00
Gleb Natapov
e1ac1a61c9 raft: Implement log replication and leader election
This patch introduces partial RAFT implementation. It has only log
replication and leader election support. Snapshotting and configuration
change along with other, smaller features are not yet implemented.

The approach taken by this implementation is to have a deterministic
state machine coded in raft::fsm. What makes the FSM deterministic is
that it does not do any IO by itself. It only takes an input (which may
be a networking message, time tick or new append message), changes its
state and produce an output. The output contains the state that has
to be persisted, messages that need to be sent and entries that may
be applied (in that order). The input and output of the FSM is handled
by raft::server class. It uses raft::rpc interface to send and receive
messages and raft::storage interface to implement persistence.
2020-10-01 14:30:59 +03:00