mirror of
https://github.com/versity/scoutfs.git
synced 2026-04-20 13:30:29 +00:00
ca526e2bc0d7b2d1e4b48e460492c9a9003476cd
When a new server starts up it rebuilds its view of all the granted locks with lock recovery messages. Clients give the server their granted lock modes, which the server then uses to process all the resent lock requests from clients.

The lock invalidation work in the client is responsible for transitioning an old granted mode to a new invalidated mode in response to an unsolicited message from the server. It has to process any client state that would be incompatible with the new mode (write dirty data, drop caches). While it is doing this work, as an implementation shortcut, it sets the granted lock mode to the new mode so that users that are compatible with the new invalidated mode can use the lock while it is being invalidated. Picture readers reading data while a write lock is invalidating and writing dirty data.

A problem arises when a lock recover request is processed during lock invalidation. The client lock recover request handler sends a response with the current granted mode. The server takes this to mean that the invalidation is done, but the client invalidation worker might still be writing data, dropping caches, etc. The server will allow the state machine to advance, which can send grants for pending client requests on the belief that the invalidation was done.

All of this can lead to a grant response handler in the client tripping the assertion that a grant from the server cannot find cached items that were incompatible with the old mode. Invalidation might still be invalidating caches.

Hitting this bug is very rare. It requires a new server starting up while a client has both a request outstanding and an invalidation being processed when the lock recover request arrives.

The fix is to record the old mode during invalidation and send that in lock recover responses. This can lead the lock server to resend invalidation requests to the client. The client already safely handles duplicate invalidation requests from other failover cases.

Signed-off-by: Zach Brown <zab@versity.com>
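
The fix described in the commit message above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the scoutfs code: the `client_lock` struct, the `lock_mode` values, and all three function names are hypothetical and were chosen for readability. It shows the client recording the old mode when invalidation begins and reporting that old mode, rather than the already-switched granted mode, if a lock recover request arrives before invalidation finishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock modes; these are not the scoutfs definitions. */
enum lock_mode { MODE_NULL, MODE_READ, MODE_WRITE };

/* Hypothetical per-lock client state. */
struct client_lock {
	enum lock_mode granted;      /* mode that local users may acquire */
	enum lock_mode invalidating; /* old mode, valid while inval_pending */
	bool inval_pending;          /* invalidation work is still running */
};

/*
 * Start invalidating down to new_mode.  The granted mode is switched
 * right away so compatible users can keep using the lock, but the old
 * mode is remembered until the invalidation work completes.
 */
void begin_invalidation(struct client_lock *lck, enum lock_mode new_mode)
{
	assert(!lck->inval_pending);
	lck->invalidating = lck->granted;
	lck->inval_pending = true;
	lck->granted = new_mode;
}

/* Called once dirty data is written and caches are dropped. */
void finish_invalidation(struct client_lock *lck)
{
	lck->inval_pending = false;
}

/*
 * Mode to report in a lock recover response.  While invalidation is
 * in flight the old mode is reported, so the server does not conclude
 * that the invalidation has already finished.
 */
enum lock_mode recover_response_mode(const struct client_lock *lck)
{
	return lck->inval_pending ? lck->invalidating : lck->granted;
}
```

With this shape, a recovering server that sees the old mode may resend the invalidation request, which matches the commit message's note that duplicate invalidation requests are already handled safely.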
Introduction
scoutfs is a clustered in-kernel Linux filesystem designed to support large archival systems. It features additional interfaces and metadata so that archive agents can perform their maintenance workflows without walking all the files in the namespace. Its cluster support lets deployments add nodes to satisfy archival tier bandwidth targets.
The design goal is to reach file populations in the trillions, with the archival bandwidth to match, while remaining operational and responsive.
Highlights of the design and implementation include:
- Fully consistent POSIX semantics between nodes
- Atomic transactions to maintain consistent persistent structures
- Integrated archival metadata replaces syncing to external databases
- Dynamic separation of resources lets nodes write in parallel
- 64bit throughout; no limits on file or directory sizes or counts
- Open GPLv2 implementation
Community Mailing List
Please join us on the open scoutfs-devel@scoutfs.org mailing list, hosted on Google Groups.