Commit Graph

10 Commits

Author SHA1 Message Date
Mark Fasheh
3a5093c6ae scoutfs: replace trace_printk in alloc.c
Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-09-28 13:59:49 -07:00
Zach Brown
ff5a094833 scoutfs: store allocator regions in btree
Convert the segment allocator to store its free region bitmaps in the
btree.

This is a very straightforward mechanical transformation.  We split the
allocator region into a big-endian index key and the bitmap value
payload.  We're careful to operate on aligned copies of the bitmaps so
that they're long aligned.
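
The key/value split and the aligned-copy discipline described here can be sketched in userspace C.  All names (`region_key`, `load_region_bitmap`, `REGION_BITS`) are illustrative placeholders, not scoutfs's actual structures:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch of splitting an allocator region into a big-endian index
 * key and a bitmap value payload.  Sizes and names are assumptions. */

#define REGION_BITS 4096
#define REGION_LONGS (REGION_BITS / (8 * sizeof(unsigned long)))

struct region_key {
	uint64_t index_be;	/* region index, big-endian on disk */
};

/* portable big-endian conversion for the index key */
static uint64_t cpu_to_be64_sketch(uint64_t x)
{
	uint8_t out[8];
	int i;

	for (i = 0; i < 8; i++)
		out[i] = x >> (56 - 8 * i);
	memcpy(&x, out, 8);
	return x;
}

/* Copy the btree value payload into a long-aligned buffer before
 * using bitmap operations on it, as the commit says the code is
 * careful to do. */
static void load_region_bitmap(const void *value, size_t value_len,
			       unsigned long *aligned)
{
	memset(aligned, 0, REGION_LONGS * sizeof(unsigned long));
	memcpy(aligned, value, value_len);
}
```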

We can remove all the funky functions that were needed when writing the
ring.  All we're left with is a call to apply the pending allocations to
dirty btree blocks before writing the btree.

Signed-off-by: Zach Brown <zab@versity.com>
2017-07-08 10:59:40 -07:00
Zach Brown
8d59e6d071 scoutfs: fix alloc eio for free region
It's possible for the next segno to fall at the end of an allocation
region that doesn't have any bits set.  The code shouldn't return -EIO
in that case, it should carry on to the next region.
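
The shape of the fix can be sketched as follows; `find_first_free` and `next_free_segno` are hypothetical stand-ins for the real scan:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the fix: a region whose bitmap has no bits set is
 * skipped, not treated as an I/O error.  One uint64_t per region
 * here for brevity; names are illustrative. */

#define REGION_BITS 64
#define NO_FREE_BIT REGION_BITS

/* returns the offset of the first set bit, or REGION_BITS if none */
static int find_first_free(uint64_t bitmap)
{
	int i;

	for (i = 0; i < REGION_BITS; i++)
		if (bitmap & ((uint64_t)1 << i))
			return i;
	return NO_FREE_BIT;
}

static long next_free_segno(const uint64_t *regions, int nr_regions,
			    int start_region)
{
	int r, bit;

	for (r = start_region; r < nr_regions; r++) {
		bit = find_first_free(regions[r]);
		if (bit == NO_FREE_BIT)
			continue;	/* empty region: carry on, not -EIO */
		return (long)r * REGION_BITS + bit;
	}
	return -1;	/* genuinely no free segments anywhere */
}
```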

Signed-off-by: Zach Brown <zab@versity.com>
2017-06-27 14:04:38 -07:00
Zach Brown
cec3f9468a Further isolate rings and compaction
Each mount was still loading the manifest and allocator rings and
starting compaction, even if they were coordinating segment reads
and writes with the server.

This moves ring and compaction setup and teardown from mount and
unmount to server startup and shutdown.  Now only the server has the
rings resident and is running compaction.

We had to null some of the super info fields so that we can repeatedly
load and destroy the ring indices over the lifetime of a mount.

We also have to be careful not to call between item transactions and
compaction.  We'll restore this functionality with the server in the
future.

Signed-off-by: Zach Brown <zab@versity.com>
2017-04-18 13:51:10 -07:00
Zach Brown
5e0e9ac12e Move to much simpler manifest/alloc storage
Using the treap to be able to incrementally read and write the manifest
and allocation storage from all nodes wasn't quite ready for prime time.
The biggest problem is that invalidating cached nodes which are the
target of native pointers, either for consistency or memory pressure, is
problematic.  This was getting in the way of adding shared support as
readers and writers try to use as much of their treap caches as they
can.  There were other serious problems that we'd run into eventually:
memory pressure from duplicate caching in native nodes and the page
cache, small IOs from reading a page at a time, the risk of
pathologically imbalanced treaps, and the ring being corrupted if the
migration balancing doesn't work (the model assumed you could always
dirty an individual node in a transaction, but you actually have to
dirty all of its parents in each new transaction).

Let's back off to a much simpler mechanism while we build the rest of
the system around it.  We can revisit aggressively optimizing this when
it's our worst problem.

We'll store the indexes that the manifest server needs in simple
preallocated rings with log entries.  The server has to read the index
in its entirety into a native rbtree before it can work on it.  We won't
access the physical ring from mounts anymore, they'll send messages to
the server.
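
The replay model can be sketched like this: the server walks every log entry in the ring and applies it to a native in-memory index before serving lookups.  The entry format, ops, and the flat-array index are all illustrative, not the real rbtree-backed code:

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical ring log entry: either adds/updates a key or deletes it */
enum { RING_ADD, RING_DEL };

struct ring_entry {
	int op;
	uint64_t key;
	uint64_t val;
};

/* toy stand-in for the server's native index (really an rbtree) */
struct index {
	struct ring_entry items[64];
	int nr;
};

/* apply one log entry to the in-memory index */
static void replay_entry(struct index *ind, const struct ring_entry *ent)
{
	int i;

	for (i = 0; i < ind->nr; i++) {
		if (ind->items[i].key == ent->key) {
			if (ent->op == RING_DEL)
				ind->items[i] = ind->items[--ind->nr];
			else
				ind->items[i].val = ent->val;
			return;
		}
	}
	if (ent->op == RING_ADD)
		ind->items[ind->nr++] = *ent;
}

/* the server must replay the ring in its entirety before serving */
static void load_ring(struct index *ind, const struct ring_entry *ring,
		      int nr)
{
	int i;

	ind->nr = 0;
	for (i = 0; i < nr; i++)
		replay_entry(ind, &ring[i]);
}
```

Later entries win over earlier ones, which is what lets a simple append-only log describe the current state of the index.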

The ring callers are now working with a pinned tree in memory so the
interface can be a bit simpler.  By storing the indexes in their own
rings, the code and write path become a lot simpler: we have an IO
submission path for each index instead of "dirtying" calls per index and
then a writing call.

All this is much more robust and much less likely to get in our way as
we stand up the rest of the system around it.

Signed-off-by: Zach Brown <zab@versity.com>
2017-04-18 13:51:10 -07:00
Zach Brown
6516ce7d57 Report free blocks in statfs
Our statfs callback was still using the old buddy allocator.

We add a free segments field to the super and have it track the number
of free segments in the allocator.  We then use that to calculate the
number of free blocks for statfs.
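
The derivation is a shift from segments to blocks; the field name and the 64-blocks-per-segment shift below are assumptions for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define SEGMENT_BLOCK_SHIFT 6	/* assumed: 64 blocks per segment */

/* sketch of the super field the commit adds */
struct super_sketch {
	uint64_t free_segments;	/* maintained by the allocator */
};

/* what the statfs callback reports as free blocks */
static uint64_t statfs_free_blocks(const struct super_sketch *super)
{
	return super->free_segments << SEGMENT_BLOCK_SHIFT;
}
```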

Signed-off-by: Zach Brown <zab@versity.com>
2017-04-18 13:44:54 -07:00
Zach Brown
0a5fb7fd83 Add some counters
Signed-off-by: Zach Brown <zab@versity.com>
2017-04-18 13:44:53 -07:00
Zach Brown
a333c507fb Fix how dirty treap is tracked
The transaction writing thread tests if the manifest and alloc treaps
are dirty.  It did this by testing if there were any dirty nodes in the
treap.

But this misses the case where the treap has been modified and all nodes
have been removed.  In that case the root references no dirty nodes but
needs to be written.

Instead let's specifically mark the treap dirty when it's modified.
From then on sync will always try to write it out.  We also integrate
updating the persistent root as part of writing the dirty nodes to the
persistent ring.  It's required and every caller did it so it was silly
to make it a separate step.
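
The before/after of the dirty test can be sketched minimally; the struct and helpers are illustrative, not the real treap code:

```c
#include <assert.h>
#include <stdbool.h>

/* sketch: explicit dirty flag instead of inferring from dirty nodes */
struct treap_sketch {
	bool dirty;
	int nr_dirty_nodes;
};

static void treap_modify(struct treap_sketch *t)
{
	/* even a modification that removes every node (leaving no
	 * dirty nodes) means the root must be written */
	t->dirty = true;
}

static bool treap_needs_write(const struct treap_sketch *t)
{
	/* old buggy test was: t->nr_dirty_nodes > 0 */
	return t->dirty;
}

/* writing out the dirty nodes also updates the persistent root */
static void treap_written(struct treap_sketch *t)
{
	t->dirty = false;
	t->nr_dirty_nodes = 0;
}
```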

Signed-off-by: Zach Brown <zab@versity.com>
2017-04-18 13:44:53 -07:00
Zach Brown
db9f2be728 Switch to indexed manifest using treap ring
The first pass manifest and allocator storage used a simple ring log
that was entirely replayed into memory to be used.  That risked the
manifest being too large to fit in memory, especially with large keys
and large volumes.

So we move to using an indexed persistent structure that can be read on
demand and cached.  We use a treap of byte-referenced nodes stored in a
circular ring.

The code interface is modeled a bit on the in-memory rbtree interface,
except that we can get IO errors and manage allocation, so we return
data pointers to the item payload instead of item structs and we can
return errors.
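
Returning either a payload pointer or an error through one return value is conventionally done in the kernel with `ERR_PTR`/`IS_ERR`; a userspace sketch of that pattern, with a hypothetical lookup, might look like:

```c
#include <assert.h>
#include <errno.h>

/* ERR_PTR-style encoding: small negative errnos live in the top page
 * of the address space, so one pointer carries payload or error */
static inline void *ERR_PTR_sketch(long err)
{
	return (void *)err;
}

static inline long PTR_ERR_sketch(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR_sketch(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-4095;
}

/* hypothetical lookup: return a pointer to the item payload, or an
 * encoded error if reading the node failed */
static void *treap_lookup_sketch(void *payload, int io_error)
{
	if (io_error)
		return ERR_PTR_sketch(-EIO);
	return payload;
}
```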

The manifest and allocator are converted over and the old ring code is
removed entirely.

Signed-off-by: Zach Brown <zab@versity.com>
2017-04-18 13:44:53 -07:00
Zach Brown
c4954eb6f4 Add initial LSM write implementation
Add all the core structural components to be able to modify metadata.  We
modify items in fs write operations, track dirty items in the cache,
allocate free segment block regions, stream dirty items into segments,
write out the segments, update the manifest to reference the written
segments, and write out a new ring that has the new manifest.

Signed-off-by: Zach Brown <zab@versity.com>
2017-04-18 13:42:30 -07:00