mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 11:36:54 +00:00

Files

Tomasz Grabiec f08a3e3fd8 Merge "raft: test fixes, etcd tests, simplification" from Alejo

This patch set adds etcd unit tests for raft.

It also includes a fix for replication test in debug mode and a
simplification for append_request.

Tests: unit ({dev}), unit ({debug}), unit ({release})

*  https://github.com/alecco/scylla/tree/raft-ale-tests-09b:
  raft: etcd unit tests: test log replication
  raft: boost test etcd: test fsm can vote from any state
  raft: boost test etcd: port TestLeaderElectionOverwriteNewerLogs
  raft: replication test: add etcd test for cycling leaders
  raft: testing: provide primitives to wait for log propagation
  raft: etcd unit tests: initial boost tests
  raft: combine append_request _receive and _send

2021-01-21 10:41:33 +02:00

fsm.cc

Merge "raft: test fixes, etcd tests, simplification" from Alejo

2021-01-21 10:41:33 +02:00

fsm.hh

Merge "raft: test fixes, etcd tests, simplification" from Alejo

2021-01-21 10:41:33 +02:00

internal.hh

…

log.cc

raft: combine append_request _receive and _send

2021-01-18 12:24:13 -04:00

log.hh

raft: combine append_request _receive and _send

2021-01-18 12:24:13 -04:00

logical_clock.hh

raft: Implement log replication and leader election

2020-10-01 14:30:59 +03:00

progress.cc

raft: Get back probe_sent logic in progress::PROBE state.

2020-10-05 15:04:44 +02:00

progress.hh

raft: Get back probe_sent logic in progress::PROBE state.

2020-10-05 15:04:44 +02:00

raft.cc

raft: Implement log replication and leader election

2020-10-01 14:30:59 +03:00

raft.hh

Merge "raft: test fixes, etcd tests, simplification" from Alejo

2021-01-21 10:41:33 +02:00

README.md

raft: rename storage to persistence

2021-01-20 10:23:43 +02:00

server.cc

Merge "raft: test fixes, etcd tests, simplification" from Alejo

2021-01-21 10:41:33 +02:00

server.hh

Merge "raft: test fixes, etcd tests, simplification" from Alejo

2021-01-21 10:41:33 +02:00

README.md

Raft consensus algorithm implementation for Seastar

Seastar is a high performance server-side application framework written in C++. Please read more about Seastar at http://seastar.io/

This library provides an efficient, extensible, implementation of Raft consensus algorithm for Seastar. For more details about Raft see https://raft.github.io/

Implementation status

log replication, including throttling for unresponsive servers
leader election

Usage

In order to use the library the application has to provide implementations for RPC, persistence and state machine APIs, defined in raft/raft.hh. The purpose of these interfaces is:

provide a way to communicate between Raft protocol instances
persist the required protocol state on disk, a pre-requisite of the protocol correctness,
apply committed entries to the state machine.

While comments for these classes provide an insight into expected guarantees they should provide, in order to provide a complying implementation it's necessary to understand the expectations of the Raft consistency protocol on its environment:

RPC should implement a model of asynchronous, unreliable network, in which messages can be lost, reordered, retransmitted more than once, but not corrupted. Specifically, it's an error to deliver a message to a Raft server which was not sent to it.
persistence should provide a durable persistent storage, which survives between state machine restarts and does not corrupt its state.
Raft library calls state_machine::apply_entry() for entries reliably committed to the replication log on the majority of servers. While apply_entry() is called in the order entries are serialized in the distributed log, there is no guarantee that apply_entry() is called exactly once. E.g. when a protocol instance restart from persistent state, it may re-apply some already applied log entries.

Seastar's execution model is that every object is safe to use within a given shard (physical OS thread). Raft library follows the same pattern. Calls to Raft API are safe when they are local to a single shard. Moving instances of the library between shards is not supported.

First usage.

For an example of first usage see replication_test.cc in test/raft/.

In a nutshell:

create instances of RPC, persistence, and state machine
pass them to an instance of Raft server - the facade to the Raft cluster on this node
repeat the above for every node in the cluster
use server::add_entry() to submit new entries on a leader, state_machine::apply_entries() is called after the added entry is committed by the cluster.

Subsequent usages

Similar to the first usage, but persistence::load_term_and_vote() persistence::load_log(), persistence::load_snapshot() are expected to return valid protocol state as persisted by the previous incarnation of an instance of class server.