mirror of https://github.com/versity/scoutfs.git synced 2026-07-20 06:52:20 +00:00

T

Zach Brown 73bf916182 Return ENOSPC as space gets low

Returning ENOSPC is challenging because we have clients working on
allocators which are a fraction of the whole and we use COW transactions
so we need to be able to allocate to free.  This adds support for
returning ENOSPC to client posix allocators as free space gets low.

For metadata, we reserve a number of free blocks for making progress
with client and server transactions which can free space.  The server
sets the low flag in a client's allocator if we start to dip into
reserved blocks.  In the client we add an argument to entering a
transaction which indicates if we're allocating new space (as opposed to
just modifying existing data or freeing).  When an allocating
transaction runs low and the server low flag is set then we return
ENOSPC.

Adding an argument to transaciton holders and having it return ENOSPC
gave us the opportunity to clean it up and make it a little clearer.
More work is done outside the wait_event function and it now
specifically waits for a transaction to cycle when it forces a commit
rather than spinning until the transaction worker acquires the lock and
stops it.

For data the same pattern applies except there are no reserved blocks
and we don't COW data so it's a simple case of returning the hard ENOSPC
when the data allocator flag is set.

The server needs to consider the reserved count when refilling the
client's meta_avail allocator and when swapping between the two
meta_avail and meta_free allocators.

We add the reserved metadata block count to statfs_more so that df can
subtract it from the free meta blocks and make it clear when enospc is
going to be returned for metadata allocations.

We increase the minimum device size in mkfs so that small testing
devices provide sufficient reserved blocks.

And finally we add a little test that makes sure we can fill both
metadata and data to ENOSPC and then recover by deleting what we filled.

Signed-off-by: Zach Brown <zab@versity.com>

2021-07-07 14:13:14 -07:00

kmod

Return ENOSPC as space gets low

2021-07-07 14:13:14 -07:00

tests

Return ENOSPC as space gets low

2021-07-07 14:13:14 -07:00

utils

Return ENOSPC as space gets low

2021-07-07 14:13:14 -07:00

.gitignore

Initial commit

2020-12-07 09:47:12 -08:00

Makefile

Add simple top-level Makefile

2020-12-07 10:39:20 -08:00

README.md

Update repo README.md for quorum slots

2021-02-22 13:28:38 -08:00

README.md

Introduction

scoutfs is a clustered in-kernel Linux filesystem designed and built from the ground up to support large archival systems.

Its key differentiating features are:

Integrated consistent indexing accelerates archival maintenance operations
Commit logs allow nodes to write concurrently without contention

It meets best of breed expectations:

Fully consistent POSIX semantics between nodes
Rich metadata to ensure the integrity of metadata references
Atomic transactions to maintain consistent persistent structures
First class kernel implementation for high performance and low latency
Open GPLv2 implementation

Learn more in the white paper.

Current Status

Alpha Open Source Development

scoutfs is under heavy active development. We're developing it in the open to give the community an opportunity to affect the design and implementation.

The core architectural design elements are in place. Much surrounding functionality hasn't been implemented. It's appropriate for early adopters and interested developers, not for production use.

In that vein, expect significant incompatible changes to both the format of network messages and persistent structures. Since the format hash-checking has now been removed in preparation for release, if there is any doubt, mkfs is strongly recommended.

The current kernel module is developed against the RHEL/CentOS 7.x kernel to minimize the friction of developing and testing with partners' existing infrastructure. Once we're happy with the design we'll shift development to the upstream kernel while maintaining distro compatibility branches.

Community Mailing List

Please join us on the open scoutfs-devel@scoutfs.org mailing list hosted on Google Groups for all discussion of scoutfs.

Quick Start

This following a very rough example of the procedure to get up and running, experience will be needed to fill in the gaps. We're happy to help on the mailing list.

The requirements for running scoutfs on a small cluster are:

One or more nodes running x86-64 CentOS/RHEL 7.4 (or 7.3)
Access to two shared block devices
IPv4 connectivity between the nodes

The steps for getting scoutfs mounted and operational are:

Get the kernel module running on the nodes
Make a new filesystem on the devices with the userspace utilities
Mount the devices on all the nodes

In this example we use three nodes. The names of the block devices are the same on all the nodes. Two of the nodes will be quorum members. A majority of quorum members must be mounted to elect a leader to run a server that all the mounts connect to. It should be noted that two quorum members results in a majority of one, each member itself, so split brain elections are possible but so unlikely that it's fine for a demonstration.

Get the Kernel Module and Userspace Binaries

Either use snapshot RPMs built from git by Versity:

rpm -i https://scoutfs.s3-us-west-2.amazonaws.com/scoutfs-repo-0.0.1-1.el7_4.noarch.rpm
yum install scoutfs-utils kmod-scoutfs

Or use the binaries built from checked out git repositories:

yum install kernel-devel
git clone git@github.com:versity/scoutfs.git
make -C scoutfs
modprobe libcrc32c
insmod scoutfs/kmod/src/scoutfs.ko
alias scoutfs=$PWD/scoutfs/utils/src/scoutfs

Make a New Filesystem (destroys contents)

We specify quorum slots with the addresses of each of the quorum member nodes, the metadata device, and the data device.
```
scoutfs mkfs -Q 0,$NODE0_ADDR,12345 -Q 1,$NODE1_ADDR,12345 /dev/meta_dev /dev/data_dev
```
Mount the Filesystem

First, mount each of the quorum nodes so that they can elect and start a server for the remaining node to connect to. The slot numbers were specified with the leading "0,..." and "1,..." in the mkfs options above.
```
mount -t scoutfs -o quorum_slot_nr=$SLOT_NR,metadev_path=/dev/meta_dev /dev/data_dev /mnt/scoutfs
```
Then mount the remaining node which can now connect to the running server.
```
mount -t scoutfs -o metadev_path=/dev/meta_dev /dev/data_dev /mnt/scoutfs
```

For Kicks, Observe the Metadata Change Index

The meta_seq index tracks the inodes that are changed in each transaction.

scoutfs walk-inodes meta_seq 0 -1 /mnt/scoutfs
touch /mnt/scoutfs/one; sync
scoutfs walk-inodes meta_seq 0 -1 /mnt/scoutfs
touch /mnt/scoutfs/two; sync
scoutfs walk-inodes meta_seq 0 -1 /mnt/scoutfs
touch /mnt/scoutfs/one; sync
scoutfs walk-inodes meta_seq 0 -1 /mnt/scoutfs

Languages

C 86.3%

Shell 10%

Roff 2.5%

TeX 0.8%

Makefile 0.4%