A snapshot transfer may take a lot of time and meanwhile a leader doing
it may lose the leadership. If that happens the ongoing snapshot transfer
becomes obsolete since the snapshot will be rejected by the receiving
node as coming from an old leader. Make snapshot transfer abortable and
abort them when leader changes.
As we are now serially adding commands with consecutive integers there
is no need to build vectors of commands. Remove helper.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Add support for configuration change on leader.
Keep track of servers in config in test.
Add a dummy entry to confirm configuration changed. If the add fails,
because the old leader was not in the new config and stepped down, the
config is considered changed, too.
Add a test with some configuration changes.
Add a test cycling every scenario for 1 of 4 nodes removed.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Use a special value as dummy entry to be ignored when seen in state
machine input.
Ignore dummy entries for count.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Before this change the default was prevote enabled.
With this change each test is run with and without prevote.
This duplicates the number of test cases.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
The test suite requires an initial leader and at the moment it's always
just 0. Make it default and simplify code.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
If a leader was already disconnected the election of a new leader could
re-connect. Save original connectivity and restore it when done electing
new leader.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Use the new specific connectivity to manage old leader disconnection
more specifically.
This fixes having elections where the vote of the old leader is required
for quorum. For example {A,B} and we want to switch leader. For B to
become candidate it has to see A as down. Then A has to see B's request
for vote, and vote for A.
So to make the general case old leader needs to be first disconnected
from all nodes, make the desired node candidate, then have the old
leader connected only to the desired candidate (else, other nodes would
see the new candidate as disrupting a live leader).
Also, there might be stray messages from the former leader. These could
revert the candidate to follower. To handle this this patch retries
the process until the desired node becomes leader.
The helper function elect_me_leader() is split and renamed to
wait_until_candidate() and wait_election_done(). The former ticks until
the node is a candidate and the later waits until a candidate either
becomes a leader or reverts to follower
The existing etcd test workaround of incrementing from n=2 to n=3 nodes
is corrected back to original n=2.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Add 2 helper functions for making nodes reach timeout threshold and to
elect a specific node.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Replace simple full disconnect of a node with specific from -> to
disconnection tracking.
This will help electing new leaders.
Say there are {A,B,C} with A leader and we want to elect B.
Before this patch, we would disconnect A, run an election with just
{B,C}, and then re-connect A.
If we have {A,B} and want to elect B, this won't work as B needs 2/2+1
votes and A is disconnected. Even if we made A stepped down. This patch
corrects this shortcoming. (@gleb-cloudius)
With this patch, we can specify other followers (not the previous or
next leader) to not see the old leader, but the new and old leaders see
each other just fine. In the example {A,B,C} above we can cut A<->B
specifcally.
Also, this is closer to etcd testing and should help porting cases.
NOTE: in the current test implementation failure_detector reports
node.is_alive(other_node) if there is a connection both ways.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Checksum was removed so undo support for multiple versions added in:
test: add support for different state machines
43dc5e7dc2
NOTE: as there is a test with custom total_values, expected value cannot
be static const anymore. (line 630)
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Previously, entries were added in parallel and we needed to check if
order was broken. Using a simple checksum was better than a hash as you
could easily find the position it broke (we add consecutive numbers).
Now order of entries is forced so it's not useful. This patch removes
it.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
When probes are sent over a slow network, the leader would send
multiple probes to a lagging follower before it would get a
reject response to the first probe back. After getting a reject, the
leader will be able to correctly position `next_idx` for that
follower and switch to pipeline mode. Then, an out of order reject
to a now irrelevant probe could crash the leader, since it would
effectively request it to "rewind" its `match_idx` for that
follower, and the code asserts this never happens.
We fix the problem by strengthening `is_stray_reject`. The check that
was previously only made in `PIPELINE` case
(`rejected.non_matching_idx <= match_idx`) is now always performed and
we add a new check: `rejected.last_idx < match_idx`. We also strengthen
the assert.
The commit improves the documentation by explaining that
`is_stray_reject` may return false negatives. We also precisely state
the preconditions and postconditions of `is_stray_reject`, give a more
precise definition of `progress.match_idx`, argue how the
postconditions of `is_stray_reject` follow from its preconditions
and Raft invariants, and argue why the (strengthened) assert
must always pass.
Message-Id: <20210423173117.32939-1-kbraun@scylladb.com>
Add the test for the case where C_new entry is not the last one in a
leader that is been removed from a cluster. In this case a leader will
continue replication even after committing C_new and will start stepdown
process later, when at least one follower is fully synchronized.
TestProposal
For multiple scenarios, check proposal handling.
Note, instead of expecting an explicit result for each specified case,
the test automatically checks for expected behavior when quorum is
reached or not.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
TestSingleNodePreCandidate
Checks a single node configuration with precandidate on works to
automatically elect the node.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
TestDuelingPreCandidates
In a configuration of 3 nodes, two nodes don't see each other and they
compete for leadership. Loser (3) should revert to follower when prevote
is rejected and revert to term 1.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
TestDuelingCandidates
In a configuration of 3 nodes, two nodes don't see each other and they
compete for leadership. Once reconnected, loser should not disrupt.
But note it will remain candidate with current algorithm without
prevoting and other fsms will not bump term.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
TestCannotCommitWithoutNewTermEntry tests the entries cannot be
committed when leader changes, no new proposal comes in and ChangeTerm
proposal is filtered.
NOTE: this doesn't check committed but it's implicit for next round;
this could also use communicate() providing committed output map
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Port etcd TestSingleNodeCommit
In a single node configuration elect the node, add 2 entries and check
number of committed entries.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Make test_leader_election_overwrite_newer_logs use newer communicate()
and other new helpers.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Make implementation follow closer to original test.
Use newer boost test helpers.
NOTE: in etcd it seems a leader's self progress is in PIPELINE state.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Two helper functions to compare logs. For now only index, term, and data
type are used. Data content comparison does not seem to be necessary for now.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Current election_timeout() helper might bump the term twice.
It's convenient and less error prone to have a more fine grained helper
that stops right when candidate state is reached.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>