Now that we have the interval tree, we can use it to store the
manifest. Instead of keeping a separate index for each level we store
all the levels in one index.
This simplifies the code quite a bit. In particular, we won't have to
special-case merging between levels 0 and 1 quite as much because level
0 is no longer a special list.
We have a strong motivation to keep the manifest small, so we get rid
of the blkno radix. It wasn't wise to trade more manifest storage for
a slightly smaller ring. We can store full manifests in the ring
instead of just the block numbers.
We rework the new_manifest interface that adds a final manifest entry
and logs it. The ring entry addition and manifest update are atomic.
We're about to implement merging which will permute the manifest. Read
methods won't be able to iterate over levels while racing with merging.
We change the manifest key search interface to return the full set of
segments that intersect the key.
The next item interface now knows how to restart the search if it hits
the end of a segment on one level and the next least key is in another
segment and greater than the end of that completed segment.
There was also a very crazy cut+paste bug where next item tested
whether the item was past the last search key with a while instead of
an if. It'd spin throwing list_del_init() and brelse() debugging
warnings.
Signed-off-by: Zach Brown <zab@versity.com>
Manifests for newly written segments can be inserted at the highest
level that doesn't have segments they intersect. This avoids ring and
merging churn.
The change cleans up the code a little bit, which is nice, and adds
tracepoints for manifests entering and leaving the in-memory structures.
Signed-off-by: Zach Brown <zab@versity.com>
Initially items were stored in memory with an rbtree. That let us build
up the API above items without worrying about their storage. That gave
us dirty items in memory and we could start working on writing them to
and reading them from the log segment blocks.
Now that we have the code on either side we can get rid of the item
cache in between. It had some nice properties, but it fundamentally
duplicates the item storage in cached log segment blocks. We'd also
have to teach it to differentiate between negative cache entries and
missing entries that need to be filled from blocks. And the giant item
index becomes a bottleneck.
We have to index items in log segments anyway, so we rewrite the item
APIs to read and write the items in the log segments directly.
Creation writes to dirty blocks in memory, while reading and iteration
walk the cached blocks in the buffer cache.
I've tried to comment the files and functions appropriately so most of
the commentary for the new methods is in the body of the commit.
The overall theme is making it relatively efficient to operate on
individual items in log segments. Previously we could only walk all the
items in an existing segment or write all the dirty items to a new
segment. Now we have bloom filters and sorted item headers to let us
test for the presence of an item's key with progressively more expensive
methods. We hold on to a dirty segment and fill it as we create new
items.
This needs more fleshing out and testing, but it's a solid first pass
and it passes our existing tests.
Signed-off-by: Zach Brown <zab@versity.com>
Add a sync_fs method that writes dirty items into level 0 item blocks.
Add chunk allocator code to allocate new item blocks in free chunks.
As the allocator bitmap is modified, the allocator adds bitmap entries
to the ring.
As new item blocks are allocated we create manifest entries that
describe their block location and keys. Each entry is added to the
in-memory manifest and logged as an entry in the ring.
This isn't complete and there are still bugs, but it's enough to start
building on.
Signed-off-by: Zach Brown <zab@versity.com>
Read the ring described by the super block and replay its entries to
rebuild the in-memory state of the chunk allocator and log segment
manifest.
We add just enough of the chunk allocator to set the free bits to the
contents of the ring bitmap entries.
We start to build out the basic manifest data structure. It'll
certainly evolve when we later add code to actually query it.
Signed-off-by: Zach Brown <zab@versity.com>