Zach Brown bb3db7e272 Send quorum heartbeats while fencing
Quorum members will try to elect a new leader when they don't receive
heartbeats from the currently elected leader.  This timeout is short to
encourage restoring service promptly.

Heartbeats are sent from the quorum worker thread and are delayed while
it synchronously starts up the server, which includes fencing previous
servers.  If fence requests take too long then heartbeats will be
delayed long enough for remaining quorum members to elect a new leader
while the recently elected server is still busy fencing.

To fix this we decouple server startup from the quorum main thread.
Server starting and stopping becomes asynchronous so the quorum thread
is able to send heartbeats while the server work is off starting up and
fencing.

The server used to call into quorum to clear a flag as it exited.  We
remove that mechanism and have the server maintain a running status that
quorum can query.

We add some state to the quorum work to track the asynchronous state of
the server.  This lets the quorum protocol change roles immediately as
needed while remembering that there is a server running that needs to be
acted on.
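
A minimal sketch of this decoupling, written as a deterministic
single-threaded model rather than kernel code.  Every name here
(`server_start_async`, `server_query_status`, `quorum_tick`, the tick
counts) is hypothetical and is not taken from the scoutfs source; the
only point is that the quorum loop keeps sending heartbeats and keeps
track of a server it still has to act on, without ever waiting on it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names come from scoutfs. */
enum server_status { SRV_STOPPED, SRV_STARTING, SRV_RUNNING, SRV_STOPPING };

struct server {
	enum server_status status;
	int work_left;		/* ticks of startup/fencing or shutdown work left */
	int fence_ticks;	/* how long startup and fencing take */
	int stop_ticks;		/* how long shutdown takes */
};

/* Request a start and return immediately instead of waiting on fencing. */
static void server_start_async(struct server *srv)
{
	srv->status = SRV_STARTING;
	srv->work_left = srv->fence_ticks;
}

/* Request a stop and return immediately. */
static void server_stop_async(struct server *srv)
{
	srv->status = SRV_STOPPING;
	srv->work_left = srv->stop_ticks;
}

/* Stand-in for the server's own work context making progress. */
static void server_work_tick(struct server *srv)
{
	assert(srv->work_left >= 0);
	if (srv->status != SRV_STARTING && srv->status != SRV_STOPPING)
		return;
	if (srv->work_left > 0)
		srv->work_left--;
	if (srv->work_left == 0)
		srv->status = (srv->status == SRV_STARTING) ?
			      SRV_RUNNING : SRV_STOPPED;
}

/* Quorum polls this; the server never calls back into quorum. */
static enum server_status server_query_status(const struct server *srv)
{
	return srv->status;
}

struct quorum {
	bool leader;
	bool server_started;	/* a server exists that still needs acting on */
	int heartbeats_sent;
};

/* One pass of the quorum work; it never blocks on the server. */
static void quorum_tick(struct quorum *q, struct server *srv)
{
	enum server_status st = server_query_status(srv);

	if (q->leader) {
		q->heartbeats_sent++;	/* sent even while fencing is in flight */
		if (!q->server_started && st == SRV_STOPPED) {
			server_start_async(srv);
			q->server_started = true;
		}
	} else if (q->server_started) {
		/* The role already changed; the server is cleaned up lazily. */
		if (st == SRV_STARTING || st == SRV_RUNNING)
			server_stop_async(srv);
		else if (st == SRV_STOPPED)
			q->server_started = false;
	}
}

/* Interleave quorum and server progress for a number of ticks. */
static void simulate(struct quorum *q, struct server *srv, int ticks)
{
	for (int i = 0; i < ticks; i++) {
		quorum_tick(q, srv);
		server_work_tick(srv);
	}
}
```

In this model a leader whose fencing takes five ticks still sends one
heartbeat per tick, and a member that loses leadership stops sending
heartbeats at once while `server_started` stays set until the server
reports that it has stopped.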

The server used to also call into quorum to update quorum blocks.  This
is a read-modify-write operation that has to be serialized.  Now that
the server startup and the quorum work run concurrently, they can't both
perform these read-modify-write cycles.  Instead we have the
quorum work own all the block updates and it queries the server status
to determine when it should update the quorum block to indicate that the
server has fenced or shut down.
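
The single-writer arrangement can be sketched as follows.  This is an
illustration under stated assumptions, not the scoutfs block format: the
`quorum_block` fields, the flag names, and `quorum_update_block` are all
hypothetical, and a static variable stands in for the on-disk block.
The server only publishes a status; the quorum work is the sole caller
that reads, modifies, and writes the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags and layout; not the real quorum block format. */
#define QB_FLAG_FENCED	(1u << 0)
#define QB_FLAG_STOPPED	(1u << 1)

struct quorum_block {
	uint32_t flags;
	uint64_t write_nr;	/* incremented on every write */
};

/* Stand-in for the block on the shared device. */
static struct quorum_block disk_block;

/* Status the server publishes; the server never touches the block. */
struct server_status {
	bool fenced;
	bool shut_down;
};

/*
 * Called only from the quorum work, so the read-modify-write cycle has
 * a single owner and needs no extra serialization against the server.
 * Returns true if the block was rewritten.
 */
static bool quorum_update_block(const struct server_status *st)
{
	struct quorum_block blk;
	uint32_t flags;

	assert(st != NULL);

	blk = disk_block;			/* read */

	flags = blk.flags;			/* modify */
	if (st->fenced)
		flags |= QB_FLAG_FENCED;
	if (st->shut_down)
		flags |= QB_FLAG_STOPPED;

	if (flags == blk.flags)
		return false;			/* nothing new to record */

	blk.flags = flags;
	blk.write_nr++;
	disk_block = blk;			/* write */
	return true;
}
```

Because the block is only rewritten when the polled status adds a flag,
repeated polling while the server state is unchanged costs no extra
writes.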

Signed-off-by: Zach Brown <zab@versity.com>
2022-03-31 10:29:43 -07:00

Introduction

scoutfs is a clustered in-kernel Linux filesystem designed to support large archival systems. It features additional interfaces and metadata so that archive agents can perform their maintenance workflows without walking all the files in the namespace. Its cluster support lets deployments add nodes to satisfy archival tier bandwidth targets.

The design goal is to reach file populations in the trillions, with the archival bandwidth to match, while remaining operational and responsive.

Highlights of the design and implementation include:

  • Fully consistent POSIX semantics between nodes
  • Atomic transactions to maintain consistent persistent structures
  • Integrated archival metadata replaces syncing to external databases
  • Dynamic separation of resources lets nodes write in parallel
  • 64-bit throughout; no limits on file or directory sizes or counts
  • Open GPLv2 implementation

Community Mailing List

Please join us on the open scoutfs-devel@scoutfs.org mailing list, hosted on Google Groups.
