scoutfs

mirror of https://github.com/versity/scoutfs.git synced 2026-07-22 07:52:38 +00:00

Author	SHA1	Message	Date
Zach Brown	cd6cd000ce	Add ifdefed out quick treap printer This was pretty handy for debugging weird failure cases. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	eb94092f2f	Add kvec big endian inc and dec Add helpers that increment or decrement kvec vectors as theough they're big endian values. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	a15d37783e	Set read node parent through ref The code was using parent_ref() to set the parent ref's node pointer. But parent_ref() uses the parent's left node pointer to determine which ref points to the node. If we were setting the left it would return the right because the left isn't set yet. This messed up the tree shape and all hell broke loose. Just set it through the ref, we have it anyway. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	157b9294fa	Be sure not to overfill a segment with items The segment writing loop was assuming that the currently dirty items will fit in a segment. That's not true. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	be497f3fcf	Make sure to bubble the node aug bits up the treap We forgot to or in a node's children's augmentation bits when setting the augmentation bits up in the parent's ref. This stopped ring dirtying from finding all the dirty nodes in the treap. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	3333d89f82	Assign next seg seq from super We hadn't yet assigned real sequence numbers to the segments. Let's track the next sequence in the super block and assign it to segments as we write the first new item in each. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	db9f2be728	Switch to indexed manifest using treap ring The first pass manifest and allocator storage used a simple ring log that was entirely replayed into memory to be used. That risked the manifest being too large to fit in memory, especially with large keys and large volumes. So we move to using an indexed persistent structure that can be read on demand and cached. We use a treap of byte referenced nodoes stored in a circular ring. The code interface is modeled a bit on the in-memory rbtree interface. Except that we can get IO errors and manage allocation so we return data pointers to the item payload istead of item structs and we can return errors. The manifest and allocator are converted over and the old ring code is removed entirely. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	3dfe8e10df	Add little cmp u64 helper We have use of this little u64 comparison function in a few more places so let's make it a proper usable inline function in a header. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	b8d7e04262	Add negative caching of item ranges The item cache only knew about present items in the rbtree. Attempts to read items that didn't exist would always trigger expensive manfifest and segment searches. This reworks the item cache and item reading code to support the notion of cached ranges of keys. When we read items we also communicate the range of keys that we searched. This lets the cache return negative lookups for key values in the search that don't have items. The item cache gets an rbtree of key ranges. Each item lookup method now uses it to determine if a missing item needs to trigger a read. Item reading is now performed in batches instead of one at a time. This lets us specify the cache range along with the batch and apply them all atomically under the lock. The item range code is much more robust now that it has to track the range of keys that it searches. The read items call now takes a range. It knows to look for all level0 segments that interesect that range, not just the first key. The manifest segment references now include the min and max keys for the segment so we can use those to define the item search range. Since the refs now include keys we no longer have them as a dumb allocated array but instead have a list of alloced ref structs. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	cbb4282429	Add kvec key helpers Add a suite of simple kvec functions to work with kvecs that point to file system keys. The ones worth mentioning format keys into strings. They're used to add formatted strings for the keys to tracepoints. They're still a little rough but this is a functional first step. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	57a6ff087f	Add max key and max key size to format We're going to use these to support tracking cached item ranges. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	1b4bab3217	Fix ring read nr/part confusion Some parts of the ring reading were still using the old 'nr' for the number of blocks to read, but it's now the total number of blocks in the ring. Use part instead. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	c8d61c2e01	Manifest item reading tracked wrong key When iterating over items the manifest would always insert whatever values it found at the caller's key, instead of the key that it found in the segment. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	c74787a848	Harden, simplify, and shrink the kvecs The initial kvec code was a bit wobbly. It had raw loops, some weird constructs, and had more elements than we need. Add some iterator helpers that make it less likely that we'll screw up iterating over different length vectors. Get rid of reliance on a tailing null pointer and always use the count of elements to stop iterating. With that in place we can shrink the number of elements to just the greatest user. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	51a84447dd	Fix calculation of number of dirty segments The estimate for the number of dirty segment bytes was wildly over calculating the number of segment headers by confusing the length of the segment header with the length of segments. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	acee97ba2a	Fix tail ring entry zeroing The space calculation didn't include the terminating zero entry. That ensured that the space for the netry would never be consumed. But the remaining space was used to zero the end of the block so the final entry wasn't being zeroed. So have the space remaining include the terminating entry and factor that into the space consumption of each entry being appended. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	967e90e5ef	Fix kvec overlapping comparison The comparisons were a bit wrong when comparing overlaping kvec endpoints. We want to compare the starts and ends with the ends and starts, respectively. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	b251f91842	Fix updating parent dirty bits We were trying to propagate dirty bits from a node itself when its dirty bit is set. But it's bits are consistent so it stops immediately. We need to propagate from the parent of the node that changed. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	b7b43de8c7	Queue trans work in our work queue We went to the trouble of allocating a work queue with one work in flight but then didn't use it. We could have concurrent trans write func execution. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	a5cac107a1	Set END_IO on allocated segs A reader that hits an allocated segment would wait on IO forever. Setting the end_io bit lets readers use written segments. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	f9ca1885f9	Specify ring blocks with index,nr Specifying the ring blocks with a head and tail index lead to pretty confusing code to figure out how many blocks to read and if we had passed the tail. Instead specify the ring with a starting index and number of blocks. The code to read and write the ring blocks naturally falls out and is a lot more clear. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	b598bf855d	Fix ring appending and writing The ring appending next entry header cursor assignment was pointing past the caller's src header, not at the next header to write to in the block. The writing block index and blkno calculations were just bad. Pretend they never happened. And finally we need to point the dirty super at the ring index for the commit and we need to reset the append state for the next commit. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:53 -07:00
Zach Brown	b8ede1f6ee	Fix ring block tail zeroing The ring block tail zeroing memset treated the value as the offset to zero from, not the number of bytes at the tail to zero. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	fd7b09b4e4	Fix written manifest entry length Manifest entries were being written with the size of their in-memory nodes, not the smaller persistent add_manifest structure size. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	e418629bea	Remove items from trees before freeing Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	454767e992	Stop first dirty search looping first_dirty() forgot to stop if the tree had no dirty items at all. It'd just spin forever. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	a8a6d3697b	Fix dirty item counter calculations Unfortunately the generic augmented callbacks don't work for our augmented node bits which specifically reflect the left and right nodes. We need our own rotation callback and then we have boilerplate for the other two copy and propagate callbacks. Once we have to provide .propagate we can call it instead of our own update_dirty_parents() equivalent. In addition some callers messed up marking and clearing dirty. We only want to mark dirty item insertions, not all inserted items. And if we update an item's keys and values we need to clear and mark it to keep the counters consistent. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	fbd12b4dda	Fix existing item insertion Inserting an item over an existing key was super broken. Now that we're not replacing we can't stop descent if we find an existing item. We need to keep descending and then insert. And the caller needs to, you know, actually remove the existing item when it's found -- not the item it just inserted :P. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	5eb388ae6e	Fix seg item filling The two functions that added to items had little bugs. They initialized the item vectors incorrectly and didn't actually store the keys and values. Appending was always overwriting the first segment. Have it call 'nr' 'pos' like the rest of the code to make it more clear. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	07ba01f6b0	Iniitialize segment header when writing item Initialize the segment header as the items are written. This isn't a great place to do it. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	48f9be8455	Free key and value in the right order! Always key then value! twitch Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	9e02573e06	Rename scoutfs_seg_manfest_add Rename scoutfs_seg_add_ment to _manifest_add as that makes it a lot more clear that it's a wrapper around scoutfs_manifest_add() that gets its arguments from the segment. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	471405f8cd	Fix kvec iterators A few of the kvec iterators that work with byte offsets forgot to reset the offsets as they advanced to the next vec. These should probably be refactored into a set of iterator helpers. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	be98c4dfd8	Fix up manifest key use The manifest entries were changed to be a single contiguous allocation. The calculation of the vec that points to the last key vec was adding the key length in units of the add manifest struct. Adding the manifest wasn't setting the key lengths nor copying the keys into their position in the entry alloc. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:44:52 -07:00
Zach Brown	c4954eb6f4	Add initial LSM write implementation Add all the core strutural components to be able to modify metadata. We modify items in fs write operations, track dirty items in the cache, allocate free segment block reagions, stream dirty items into segments, write out the segments, update the manifest to reference the written segments, and write out a new ring that has the new manifest. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	b45ec8824b	Add a bunch of trace_printk()s There's nothing particularly coherent about these, they're what I added while debugging. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	641aae50ed	Fix ring block replay The ring block replay walk messed up the blkno it read from and its exit condition. It needed to test for having just replayed the tail before moving on. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	21d313e0f6	Corretly wait on submitted segment reads The segment waiting loop was rewritten to use n to iterate up to i, but the body of the loop still had i. Take that as a signal to Always Iterate With 'i' and store the last i and then iterate towards it with i again. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	d1f36e2165	Correctly store last manifest key A copy+paste error led us to overwrite the first key in the manifest with the last, leaving the last uninitialized. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	f3288f27c6	Declare full kvecs in manifest The manifest had silly single kvecs instead of the macros that define our maximal kvecs. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	6957c73aba	Have _lookup_exact return 0 scoutfs_item_lookup_exact() exists to only return one size. Have it just return 0 on success so callers don't have to remember that it returns > 0. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	3f27de0b2c	Fix hilarious BLOCK_SIZE typo Turns out BLOCK_SIZE is a thing and confused scoutfs into thinking it had many more blocks per segment then it did. Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	a201cff5ad	Read supers with bios instead of bl blocks Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	f7f840a342	Fix bio read completion init Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:42:30 -07:00
Zach Brown	43d0d44e48	Add initial LSM implementation Add the initial core components of the LSM implementation to be able to read the root inode: - bio.c: read big block regions - seg.c: cache logical segments - ring.c: read the manifest from storage - manifest.c: organize segments into an LSM - kvec.c: work with arbitrary memory vectors - item.c: cache fs metadata items read from segments Signed-off-by: Zach Brown <zab@versity.com>	2017-04-18 13:38:50 -07:00
Zach Brown	c6b688c2bf	Add staging ioctl This adds the ioctl for writing archived file contents back into the file if the data_version still matches. Signed-off-by: Zach Brown <zab@versity.com> Reviewed-by: Mark Fasheh <mfasheh@versity.com>	2016-11-16 14:45:08 -08:00
Zach Brown	df561bbd19	Add offline extent flag and release ioctl Add the _OFFLINE flag to indicate offline extents. The release ioctl frees extents within the release range and sets their _OFFLINE flag if the data_version still matches. We tweak the existing truncate item function just a bit to support making extents offline. We make it take an explicit range of blocks to remove instead of just giving it the size and it learns to mark extents offline and update them instead of always deleting them. Reads from offline extents return zeros like reading from a sparse region (later it will trigger demand staging) and writing to offline extents clears the offline flag (later only staging can do that). Signed-off-by: Zach Brown <zab@versity.com> Reviewed-by: Mark Fasheh <mfasheh@versity.com>	2016-11-16 14:45:08 -08:00
Zach Brown	5d87418925	Add ioctl for sampling inode data version Add an ioctl that samples the inode's data_version. Signed-off-by: Zach Brown <zab@versity.com> Reviewed-by: Mark Fasheh <mfasheh@versity.com>	2016-11-16 14:45:08 -08:00
Zach Brown	f86fab1162	Add an inode data_version field The data_version field is changed every time the contents of the file could have changed. Signed-off-by: Zach Brown <zab@versity.com> Reviewed-by: Mark Fasheh <mfasheh@versity.com>	2016-11-16 14:45:08 -08:00
Mark Fasheh	467801de73	scoutfs: use extents for file data We're very basic here at this stage and simply put a single-block extent item where we would have previously had a multi-block bmap item. Multi-block extents will come in future patches. Signed-off-by: Mark Fasheh <mfasheh@versity.com> Signed-off-by: Zach Brown <zab@versity.com>	2016-11-16 14:45:08 -08:00

1 2 3 4

186 Commits