Commit Graph

24 Commits

Author SHA1 Message Date
Zach Brown
867d717d2b scoutfs: item offsets need to skip block headers
The value offset allocation knew to skip block headers at the start of
each segment block but, weirdly, the item offset allocation didn't.

We make item offset calculation skip the header and we add some tracing
to help see the problem.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-25 19:28:21 -07:00
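
The header skipping described in 867d717d2b can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: the block and header sizes, the function name, and the rule that an allocation never straddles a block boundary are hypothetical and are not taken from the scoutfs source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes; the real values would come from format.h. */
#define SKETCH_BLOCK_SIZE 4096u
#define SKETCH_HDR_SIZE   32u

/*
 * Return the first offset at or after 'off' where 'len' bytes fit
 * inside a single block without overlapping that block's header.
 */
static uint32_t sketch_skip_header(uint32_t off, uint32_t len)
{
	uint32_t in_block = off % SKETCH_BLOCK_SIZE;

	/* An allocation larger than a block's payload can never fit. */
	assert(len <= SKETCH_BLOCK_SIZE - SKETCH_HDR_SIZE);

	/* Landed inside a header: move to the first byte after it. */
	if (in_block < SKETCH_HDR_SIZE) {
		off += SKETCH_HDR_SIZE - in_block;
		in_block = SKETCH_HDR_SIZE;
	}

	/* Would spill into the next block: start after its header. */
	if (in_block + len > SKETCH_BLOCK_SIZE)
		off += (SKETCH_BLOCK_SIZE - in_block) + SKETCH_HDR_SIZE;

	return off;
}
```

Applying the same helper to both item and value offsets is one way to keep the two allocators from disagreeing about where headers live.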
Zach Brown
6834100251 scoutfs: free our dentry info
Stop leaking dentry_info allocations by adding a dentry_op with a
d_release that frees our dentry info allocation.  rmmod tests no longer
fail when dmesg screams that we have slab caches that still have
allocated objects.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-25 11:08:20 -07:00
Zach Brown
434cbb9c78 scoutfs: create dirty items for inode updates
Inode updates weren't persistent because they were being stored in clean
segments in memory.  This was triggered by the new hashed dirent
mechanism returning -ENOENT when the inode still had a 0 max dirent hash
nr.

We make sure that there is a dirty item in the dirty segment at the
start of inode modification so that later updates will store in the
dirty segment.  Nothing ensures that the dirty segment won't be written
out today but that will be added soon.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-25 10:08:34 -07:00
Zach Brown
3bb00fafdc scoutfs: require sparse builds
Now that we know that it's easy to fix sparse build failures against
RHEL kernel headers we can require sparse builds when developing.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-24 21:45:08 -07:00
Zach Brown
fbbfac1b27 scoutfs: fix sparse errors
I was building against a RHEL tree that broke sparse builds.  With that
fixed I can now see and fix sparse errors.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-24 21:44:42 -07:00
Zach Brown
3755adddd5 scoutfs: store dirents at multiple hash values
Previously we dealt with colliding dirent hash values by storing all the
dirents that share a hash value in a big item with multiple dirents.

This complicated the code and strongly encouraged resizing items as
dirents come and go.  Resizing items isn't very easy with our simple log
segment item creation mechanism.

Instead let's deal with collisions by allowing a dirent to be stored at
multiple hash values.  The code is much simpler.

Lookup has to iterate over all possible hash values.  We can track the
greatest hash iteration stored in the directory inode to limit the
overhead of negative lookups in small directories.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-24 20:11:58 -07:00
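
The collision scheme in 3755adddd5 can be sketched in userspace as follows. This is only a toy model under stated assumptions: the slot table, the byte-sum hash, the iteration limit, and all `sketch_` names are hypothetical and do not reflect the actual scoutfs item format or hash function. Names are assumed to be non-empty.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_SLOTS    16u	/* tiny hash space so collisions are easy */
#define SKETCH_MAX_ITER 4u	/* hypothetical limit on hash iterations */
#define SKETCH_NAME_MAX 32u

struct sketch_dir {
	char names[SKETCH_SLOTS][SKETCH_NAME_MAX];	/* "" means empty */
	uint32_t max_iter;	/* greatest iteration any dirent was stored at */
};

/* Toy hash: byte sum plus the iteration number. */
static uint32_t sketch_hash(const char *name, uint32_t iter)
{
	uint32_t sum = iter;

	while (*name)
		sum += (unsigned char)*name++;
	return sum % SKETCH_SLOTS;
}

/* Store the name at the first hash iteration whose slot is free. */
static int sketch_insert(struct sketch_dir *dir, const char *name)
{
	for (uint32_t iter = 0; iter < SKETCH_MAX_ITER; iter++) {
		char *slot = dir->names[sketch_hash(name, iter)];

		if (slot[0] != '\0')
			continue;	/* collision: try the next hash value */
		strncpy(slot, name, SKETCH_NAME_MAX - 1);
		slot[SKETCH_NAME_MAX - 1] = '\0';
		if (iter > dir->max_iter)
			dir->max_iter = iter;
		return 0;
	}
	return -1;	/* every candidate slot was taken */
}

/* Check every iteration up to the greatest one recorded in the dir. */
static int sketch_lookup(const struct sketch_dir *dir, const char *name)
{
	for (uint32_t iter = 0; iter <= dir->max_iter; iter++) {
		uint32_t h = sketch_hash(name, iter);

		if (strcmp(dir->names[h], name) == 0)
			return (int)h;
	}
	return -1;
}
```

Lookup cannot stop at the first empty slot because a removed dirent may have left a hole, so the recorded `max_iter` is what bounds negative lookups in small directories.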
Zach Brown
1270553f1f scoutfs: mega item access omnibus commit 9000
Initially items were stored in memory with an rbtree.  That let us build
up the API above items without worrying about their storage.  That gave
us dirty items in memory and we could start working on writing them to
and reading them from the log segment blocks.

Now that we have the code on either side we can get rid of the item
cache in between.  It had some nice properties but it's fundamentally
duplicating the item storage in cached log segment blocks.  We'd also
have to teach it to differentiate between negative cache entries and
missing entries that need to be filled from blocks.  And the giant item
index becomes a bottleneck.

We have to index items in log segments anyway so we rewrite the item
APIs to read and write the items in the log segments directly.  Creation
writes to dirty blocks in memory and reading and iteration walk through
the cached blocks in the buffer cache.

I've tried to comment the files and functions appropriately so most of
the commentary for the new methods is in the body of the commit.

The overall theme is making it relatively efficient to operate on
individual items in log segments.  Previously we could only walk all the
items in an existing segment or write all the dirty items to a new
segment.  Now we have bloom filters and sorted item headers to let us
test for the presence of an item's key with progressively more expensive
methods.  We hold on to a dirty segment and fill it as we create new
items.

This needs more fleshing out and testing but this is a solid first pass
and it passes our existing tests.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-24 17:40:14 -07:00
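
The "progressively more expensive methods" mentioned in 1270553f1f can be pictured as a cheap bloom filter test followed by a binary search of sorted keys. The sketch below is a userspace illustration only: the 64-bit keys, the 256-bit filter, the two multiplicative hashes, and every `sketch_` name are assumptions, not the scoutfs on-disk layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOOM_BITS 256u

struct sketch_segment {
	uint8_t bloom[SKETCH_BLOOM_BITS / 8];
	const uint64_t *keys;	/* sorted ascending, like sorted item headers */
	size_t nr;
};

/* Toy hash: pick the i'th bloom bit for a key. */
static uint32_t sketch_bit(uint64_t key, unsigned int i)
{
	uint64_t h = (key + i) * 0x9E3779B97F4A7C15ull;

	return (uint32_t)(h >> 32) % SKETCH_BLOOM_BITS;
}

static void sketch_bloom_add(struct sketch_segment *seg, uint64_t key)
{
	for (unsigned int i = 0; i < 2; i++) {
		uint32_t b = sketch_bit(key, i);

		seg->bloom[b / 8] |= (uint8_t)(1u << (b % 8));
	}
}

/* False means definitely absent; true means the key might be present. */
static bool sketch_bloom_maybe(const struct sketch_segment *seg, uint64_t key)
{
	for (unsigned int i = 0; i < 2; i++) {
		uint32_t b = sketch_bit(key, i);

		if (!(seg->bloom[b / 8] & (1u << (b % 8))))
			return false;
	}
	return true;
}

static bool sketch_has_key(const struct sketch_segment *seg, uint64_t key)
{
	size_t lo = 0, hi = seg->nr;

	/* Cheap test first: most misses end here. */
	if (!sketch_bloom_maybe(seg, key))
		return false;

	/* More expensive: binary search of the sorted keys. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (seg->keys[mid] == key)
			return true;
		if (seg->keys[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return false;
}
```

The binary search makes the final answer exact, so a bloom false positive only costs time and never produces a wrong result.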
Zach Brown
12d5d3d216 scoutfs: add next item reading
Add code to walk all the block segments that intersect a key range to
find the next item after that key value.

It is easier to just return failure from the next item reader and have
the caller retry the searches so we change the specific item reading
path to use the same convention to keep the caller consistent.

This still warns as it falls off the last block but that's fine for now.
We're going to be changing all this in the next few commits.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-18 17:30:39 -07:00
Zach Brown
af492a9f27 scoutfs: add scoutfs_inc_key()
Add a quick inline function for incrementing a key value across the
inode>type>offset sorted key space.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-18 17:24:12 -07:00
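
A minimal sketch of what incrementing across an inode>type>offset key space could look like. The struct layout, field widths, and names below are assumptions for illustration; the actual scoutfs key struct and scoutfs_inc_key() body are not shown in this log.

```c
#include <stdint.h>

/* Hypothetical key: sorted by inode, then type, then offset. */
struct sketch_key {
	uint64_t inode;
	uint8_t type;
	uint64_t offset;
};

/*
 * Advance to the next key in sort order.  The least significant
 * field is incremented first and a wrap to zero carries into the
 * next more significant field.
 */
static inline void sketch_inc_key(struct sketch_key *key)
{
	if (++key->offset == 0) {
		if (++key->type == 0)
			key->inode++;
	}
}
```

A helper like this lets a caller resume a search at "the key just past the one I found" without caring which field overflowed.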
Zach Brown
96b8a6da46 scoutfs: update created inode times in mknod
In mknod the newly created inode's times are set down in the new inode
creation path instead of up in the mknod path to match the parent dir's
ctime and mtime.

This is strictly legal but it's easier to test that all the times have
been set in the mknod by having them equal.  This stops mkdir-interface
test failures when enough time passes between inode creation and parent
dir timestamp updates to have them differ.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-18 17:21:12 -07:00
Zach Brown
0c0f2b19d5 scoutfs: update dirty inode items
Wire up the code to update dirty inode items as inodes are modified in
memory.  We had a bit of the code but it wasn't being called.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-17 19:12:49 -07:00
Zach Brown
edf3c8a5d4 scoutfs: add initial item block writing
Add a sync_fs method that writes dirty items into level 0 item blocks.

Add chunk allocator code to allocate new item blocks in free chunks.  As
the allocator bitmap is modified it adds bitmap entries to the ring.

As new item blocks are allocated we create manifest entries that
describe their block location and keys.  The entry is added to the
in-memory manifest and to entries in the ring.

This isn't complete and there are still bugs but this is enough to start
building on.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-17 17:47:32 -07:00
Zach Brown
d2ead58ce4 scoutfs: translate d_type in readdir
I had forgotten to translate from the scoutfs types in items to the vfs
types for filldir() so userspace was seeing garbage d_type values.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-16 14:04:20 -07:00
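
The translation fixed in d2ead58ce4 amounts to a small lookup table. In the sketch below the `SKETCH_TYPE_*` enum is hypothetical and does not reflect the real scoutfs type values; the `SKETCH_DT_*` constants are local copies of the Linux `DT_*` values so the example builds in userspace without kernel headers.

```c
#include <stddef.h>

/* Hypothetical on-disk file types stored in dirent items. */
enum sketch_file_type {
	SKETCH_TYPE_UNKNOWN = 0,
	SKETCH_TYPE_REG,
	SKETCH_TYPE_DIR,
	SKETCH_TYPE_LNK,
	SKETCH_TYPE_FIFO,
};

/* Local copies of the Linux DT_* values that filldir() expects. */
#define SKETCH_DT_UNKNOWN 0
#define SKETCH_DT_FIFO    1
#define SKETCH_DT_DIR     4
#define SKETCH_DT_REG     8
#define SKETCH_DT_LNK     10

/* Map an on-disk type to a d_type; unknown values stay unknown. */
static unsigned char sketch_dt_from_type(unsigned int type)
{
	static const unsigned char dt[] = {
		[SKETCH_TYPE_REG]  = SKETCH_DT_REG,
		[SKETCH_TYPE_DIR]  = SKETCH_DT_DIR,
		[SKETCH_TYPE_LNK]  = SKETCH_DT_LNK,
		[SKETCH_TYPE_FIFO] = SKETCH_DT_FIFO,
	};

	if (type >= sizeof(dt) / sizeof(dt[0]))
		return SKETCH_DT_UNKNOWN;
	return dt[type];
}
```

Passing the on-disk value straight through would only work if the two numbering schemes happened to match, which is how userspace ends up seeing garbage d_type values.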
Zach Brown
c46fb0be78 scoutfs: fix sense of filldir return in readdir
The migration from the new iterator interface in upstream to the old
readdir interface in rhel7 got the sense of the filldir return code
wrong.  Any readdir would livelock as the dot entry was
returned at offset 0 without advancing f_pos.

Signed-off-by: Zach Brown <zab@versity.com>
2016-03-14 19:26:47 -07:00
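
The return convention behind c46fb0be78 is easy to get backwards: the old filldir callback returns 0 when the entry was accepted and nonzero when the caller should stop, while the newer dir_emit() helper returns true on success. The userspace mock below only illustrates the old convention; the callback type, buffer struct, and all `sketch_` names are hypothetical stand-ins and not kernel code.

```c
#include <stddef.h>

/* Mock of the old-style callback: 0 means accepted, nonzero means stop. */
typedef int (*sketch_filldir_t)(void *buf, const char *name, long pos);

struct sketch_buf {
	size_t room;	/* how many entries the caller's buffer can hold */
	size_t filled;
};

static int sketch_fill(void *buf, const char *name, long pos)
{
	struct sketch_buf *b = buf;

	(void)name;
	(void)pos;
	if (b->filled == b->room)
		return -1;	/* buffer full: tell readdir to stop */
	b->filled++;
	return 0;
}

/*
 * Emit entries starting at 'pos' and return the new position.  The
 * position only advances past an entry once filldir accepted it.
 */
static long sketch_readdir(const char *const *names, size_t nr, long pos,
			   void *buf, sketch_filldir_t filldir)
{
	while ((size_t)pos < nr) {
		if (filldir(buf, names[pos], pos) != 0)
			break;
		pos++;
	}
	return pos;
}
```

With the test inverted, an accepted entry is treated as a stop request, the position never moves, and each subsequent call hands back the same entry again.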
Zach Brown
4b182c7759 scoutfs: insert manifest nodes into blkno radix
We had forgotten to actually insert manifest nodes into the blkno
radix.  This hasn't mattered yet because there's only been one manifest
in the level 0 list.

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-29 18:21:54 -08:00
Zach Brown
16abddb46a scoutfs: add basic segment reading
Add the most basic ability to read items from log segment blocks.  If
an item isn't in the cache then we walk segments in the manifest and
check for the item in each one.

This is just the core fundamental code.  There's still a lot to do:
basic corruption validation, multi-block segments, bloom filters and
arrays to optimize segment misses, and some day the ability to read file
data items directly into page cache pages.  The manifest locking is also
super broken.

But this is enough to let us mount and stat the root inode!

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-28 17:45:44 -08:00
Zach Brown
8604c85486 scoutfs: add basic ring replay on mount
Read the ring described by the super block and replay its entries to
rebuild the in-memory state of the chunk allocator and log segment
manifest.

We add just enough of the chunk allocator to set the free bits to the
contents of the ring bitmap entries.

We start to build out the basic manifest data structure.  It'll
certainly evolve when we later add code to actually query it.

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-26 17:00:19 -08:00
Zach Brown
28521e8c45 scoutfs: add block read helper
Add a trivial helper function which verifies the block header in
metadata blocks.

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-23 21:13:56 -08:00
Zach Brown
71df879f07 scoutfs: update format.h to remove bricks
Update to the format.h from the recent -utils changes that moved from
the clumsy 'brick' terminology to the more reasonable
'block/chunk/segment' terminology.

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-23 19:39:02 -08:00
Zach Brown
6686ca191a scoutfs: remove the prototype log writing
The sync implementation was a quick demonstration of packing items into
large log blocks.  We'll be doing things very differently in the actual
system.  So tear this code out so we can build up more functional
structures.  It'll still be in revision control so we'll be able
to reuse the parts that make sense in the new code.

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-23 19:33:56 -08:00
Zach Brown
3483133cdf Read super brick instead of mkfs
Now that we have a working userspace mkfs we can read the supers on
mount instead of always initializing a new file system.  We still don't
know how to read items from blocks so mount fails when it can't find the
root dir inode.

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-12 19:37:48 -08:00
Zach Brown
82ec91d1e0 Update format to recent utils changes
The format was updated while implementing mkfs and print in
scoutfs-utils.  Bring the kernel code up to speed.

For some reason I changed the name of the item length in the item header
struct.  Who knows.

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-12 19:32:53 -08:00
Zach Brown
eb4694e401 Add simple message printing
Add a message printing function whose output includes the device and
major:minor and which handles the kernel level string prefix.

Signed-off-by: Zach Brown <zab@versity.com>
2016-02-12 19:28:03 -08:00
Zach Brown
25a1e8d1b7 Initial commit
This is the initial commit of the repo that will track development
against distro kernels.

This is an import of a prototype branch in the upstream kernel that only
had a few initial commits.  It needed to move to the old readdir
interface and use find_or_create_page() instead of pagecache_get_page()
to build in older distro kernels.
2016-02-05 14:12:14 -08:00