Our item cache protocol is tied to holding DLM locks which cover a
region of the item namespace. We want locks to cover all the data
associated with an inode and other locks to cover the indexes. So we
sort the items first by major zone (index, fs), then by inode, then by
type (inode, dirent, etc.).
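Something like this sketch, with hypothetical zone values and field
names standing in for the real format definitions:

    /* Hypothetical key layout; the real fields live in the format header. */
    struct key {
        unsigned char zone;         /* SORT_ZONE_INDEX, SORT_ZONE_FS */
        unsigned long long ino;
        unsigned char type;         /* inode, dirent, etc. */
    };

    enum { SORT_ZONE_INDEX = 1, SORT_ZONE_FS = 2 };

    static int cmp(unsigned long long a, unsigned long long b)
    {
        return a < b ? -1 : a > b ? 1 : 0;
    }

    /* Zone first so index items cluster, then inode so all of an
     * inode's items fall under one lock, then type within the inode. */
    static int compare_keys(const struct key *a, const struct key *b)
    {
        int c;

        c = cmp(a->zone, b->zone);
        if (c)
            return c;
        c = cmp(a->ino, b->ino);
        if (c)
            return c;
        return cmp(a->type, b->type);
    }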
Signed-off-by: Zach Brown <zab@versity.com>
Manifest entries and segment allocation bitmap regions are now stored in
btree items instead of the ring log. This lets us work with them
incrementally and share them between nodes.
Signed-off-by: Zach Brown <zab@versity.com>
Just lift the key printer from the kernel and use it to print
item keys in segments and in manifest entries.
Signed-off-by: Zach Brown <zab@versity.com>
Add support for the inode index items which are replacing the seq walks
from the old btree structures. We create the index items for the root
inode, can print out the items, and add a command to walk the indices.
Signed-off-by: Zach Brown <zab@versity.com>
Recent kernel headers have leaked __bitwise into userspace. Rename our
use of __bitwise in userspace sparse builds to avoid the collision.
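A sketch of the shape of the rename; the SCOUTFS_BITWISE name is
hypothetical:

    /* Kernel uapi headers now define __bitwise, so use our own name
     * for sparse's bitwise attribute in userspace builds. */
    #ifdef __CHECKER__
    #define SCOUTFS_BITWISE __attribute__((bitwise))
    #else
    #define SCOUTFS_BITWISE
    #endif

    typedef unsigned int SCOUTFS_BITWISE sparse_le32;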
Signed-off-by: Zach Brown <zab@versity.com>
It's a bit confusing to always see both the old and current super block.
Let's only print the first one. We could add an argument to print all
of them.
Signed-off-by: Zach Brown <zab@versity.com>
Add mkfs and print support for the simpler rings that the segment bitmap
allocator and manifest are now using. Some other recent format header
updates come along for the ride.
Signed-off-by: Zach Brown <zab@versity.com>
The segment item struct used to have fiddly packed offsets and lengths.
Now they're just normal fields, so we can work with them directly and get
rid of the native item indirection.
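Roughly the shape of the change, with hypothetical field names (the
on-disk fields would be little-endian types):

    #include <stdint.h>

    /* Before: offsets and lengths were split across packed, shared
     * bytes that needed helpers to encode and decode.
     *
     * After: ordinary fields we can read and write directly. */
    struct segment_item {
        uint32_t key_off;
        uint16_t key_len;
        uint32_t val_off;
        uint16_t val_len;
    };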
Signed-off-by: Zach Brown <zab@versity.com>
We were using a bitmap to record segments during manifest printing and
then walking that bitmap to print segments. It's a little silly to have
a second data structure record the referenced segments when we could
just walk the manifest again to print the segments.
So refactor node printing into a treap walker that calls a function for
each node. Then we can have functions that print the node data
structures for each treap and then one that prints the segments that are
referenced by manifest nodes.
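A sketch of the walker shape, assuming a simple left/right child
layout:

    struct treap_node {
        struct treap_node *left;
        struct treap_node *right;
        /* per-treap payload follows */
    };

    typedef void (*node_fn)(struct treap_node *node, void *arg);

    /* Visit every node in order, calling fn on each.  Printing a
     * treap's payload and printing the segments referenced by
     * manifest nodes are then just different callbacks. */
    static void treap_walk(struct treap_node *node, node_fn fn, void *arg)
    {
        if (!node)
            return;
        treap_walk(node->left, fn, arg);
        fn(node, arg);
        treap_walk(node->right, fn, arg);
    }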
Signed-off-by: Zach Brown <zab@versity.com>
We had changed the manifest keys to fully cover the space around the
segments in the hopes that it'd let item reading easily find negative
cached regions around items.
But that makes compaction think that segments intersect with items when
they really don't. We'd much rather avoid unnecessary compaction by
having the manifest entries precisely reflect the keys in the segment.
Item reading can do more work at run time to find the bounds of the key
space that are around the edges of the segments it works with.
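With precise first/last keys, the intersection test that compaction
relies on no longer sees false overlap; a sketch with u64 stand-ins
for real keys:

    #include <stdint.h>

    struct entry {
        uint64_t first;     /* stand-in for the entry's first key */
        uint64_t last;      /* stand-in for the entry's last key */
    };

    /* Ranges intersect only if neither ends before the other begins. */
    static int entries_intersect(const struct entry *a, const struct entry *b)
    {
        return a->last >= b->first && b->last >= a->first;
    }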
Signed-off-by: Zach Brown <zab@versity.com>
Make sure that the manifest entries for a given level fully
cover the possible key space. This helps item reading describe
cached key ranges that extend around items.
Signed-off-by: Zach Brown <zab@versity.com>
Update mkfs and print to describe the ring blocks with a starting index
and number of blocks instead of a head and tail index.
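A hedged sketch of the new description, with hypothetical field names:

    #include <stdint.h>

    /* A starting index and count replace the old head and tail. */
    struct ring_desc {
        uint64_t start;         /* index of the first active block */
        uint64_t nr_blocks;     /* number of active blocks */
        uint64_t total;         /* ring capacity in blocks */
    };

    /* The i'th active block wraps around the ring's capacity. */
    static uint64_t ring_block(const struct ring_desc *rd, uint64_t i)
    {
        return (rd->start + i) % rd->total;
    }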
Signed-off-by: Zach Brown <zab@versity.com>
Make a new file system by writing a root inode in a segment and storing
a manifest entry in the ring that references the segment.
Signed-off-by: Zach Brown <zab@versity.com>
We updated the code to use the new iteration of the data_version ioctl,
but we forgot to update the ioctl definition, so it didn't actually work.
Signed-off-by: Zach Brown <zab@versity.com>
mkfs was setting free blk bits starting from 0 instead of from the
blkno offset of the first free block. This resulted in the highest
order above a used blkno being marked free. Freeing that blkno would
then set its lowest order bit, so that blkno could be allocated from
two orders. That, eventually, can lead to blocks being doubly
allocated and users trampling on each other.
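A sketch of the fix, simplified to a single order; first_free and the
bit helper are hypothetical:

    #include <stdint.h>

    static void set_free_bit(uint64_t *bits, uint64_t blkno)
    {
        bits[blkno / 64] |= 1ULL << (blkno % 64);
    }

    /* Only blocks from the first free blkno onward are free.  Starting
     * at 0 also marked bits over used blknos, which is how a blkno
     * could end up allocatable from two orders. */
    static void mark_free_bits(uint64_t *bits, uint64_t first_free,
                               uint64_t total)
    {
        uint64_t b;

        for (b = first_free; b < total; b++)
            set_free_bit(bits, b);
    }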
While auditing the code to chase this bug down I also noticed that
write_buddy_blocks() was using a min() that makes no sense at all:
'blk' is inclusive there, so the modulo math works on its own.
Signed-off-by: Zach Brown <zab@versity.com>
The btree block now has a le16 nr_items field to make room for the
number of items that larger blocks can hold.
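A hedged sketch of the header shape; the surrounding fields are
hypothetical:

    #include <stdint.h>

    /* nr_items is now 16 bits (__le16 on disk) so larger blocks can
     * count more than 255 items. */
    struct btree_block_hdr {
        uint64_t blkno;
        uint64_t seq;
        uint16_t nr_items;
        uint8_t level;
    } __attribute__((packed));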
Signed-off-by: Zach Brown <zab@versity.com>
Update mkfs and print for the full radix buddy allocators. mkfs has to
calculate the number of blocks and the height of the tree and has to
initialize the paths down the left and right side of the tree.
Print needs to dump the new radix block and super block fields.
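A sketch of the height calculation, assuming a fixed per-block fanout
of child references:

    #include <stdint.h>

    /* Height is the number of interior levels needed for
     * fanout^height leaf slots to reach every allocator bitmap
     * block; a single leaf needs none in this sketch. */
    static int radix_height(uint64_t nr_leaf_blocks, uint64_t fanout)
    {
        uint64_t reach = 1;
        int height = 0;

        while (reach < nr_leaf_blocks) {
            reach *= fanout;
            height++;
        }
        return height;
    }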
Signed-off-by: Zach Brown <zab@versity.com>