Commit Graph

407 Commits

Author SHA1 Message Date
Zach Brown
826bd7f7bf scoutfs: ifdef out some unused dlmglue functions
Some dlmglue functions are unused by the current ifdefery.  They're
throwing warnings that obscure other warnings in the build.  This
broadens the ifdef coverage so that we don't get warnings.  The unused
code will either be promoted to an interface or removed as dlmglue
evolves into a reusable component.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
Zach Brown
8735d319a3 scoutfs: fix inode lock inversions
We lock multiple inodes by order of their inode number.  This fixes
the directory entry paths that hold parent dir and target inode locks.

Link and unlink are easy because they just acquire the existing parent
dir and target inode locks.

Lookup is a little squirrely because we don't want to try and order
the parent dir lock with locks down in iget.  It turns out that it's
safe to drop the dir lock before calling iget as long as iget handles
racing the inode cache instantiation with inode deletion.

Creation is the remaining pattern and it's a little weird because we
want to lock the newly created inode before we create it and the items
that store it.  We add a function that correctly orders the locks,
transaction, and inode cache instantiation.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
Zach Brown
d2ea247ab9 scoutfs: remove scoutfs_item_delete_many()
It looked like it was easier to have a helper dirty and delete items.
But now that we also have to pass in locks the interface gets messy
enough that it's easier to have the caller take care of it.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
Zach Brown
f634a5b598 scoutfs: implement scoutfs_rename()
Previously we had lots of inode creation callers that used a function to
create the dirent items and we had unlink remove entries by hand.
Rename is different because it wants to remove and add multiple links as
it does its work, including recreating links that it has deleted.

We rework add_entry_item() so that it gets the specific fields it needs
instead of getting them from the vfs structs.  This makes it clear that
callers are responsible for the source of the fields.  Specifically we
need to be able to add entries during failed rename cleanup without
allocating a new readdir pos from the parent dir.

With callers now responsible for the inputs to add_entry_items() we move
some of its code out into all callers: checking name length, dirtying
the parent dir inode, and allocating a readdir pos from the parent.

We then refactor most of _unlink() into a del_entry_items() to match
addition.  This removes the last user of scoutfs_item_delete_many() and
it will be removed in a future commit.

With the entry item helpers taking specific fields all the helpers they
use also need to use specific fields instead of the vfs structs.

To make rename cluster safe we need to get cluster locks for all the
inodes that we work with.  We also have to check that the locally cached
vfs input is still valid after acquiring the locks.  We only check the
basic structural correctness of the args: that parent dirs don't violate
ancestor rules to create loops and that the entries assumed by the
rename arguments still exist, or not.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
Zach Brown
3233ab47e8 scoutfs: add global lock names
Add a lock name that has a global scope in a given lockspace.  It's not
associated with any file system items.  We add a scope to the lock name
to indicate if a lock is global or not and set that in other lock naming
initialization.  We permit lock allocation to accept null start and end
keys.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
Zach Brown
e47d66ddd3 scoutfs: add scoutfs_lock_inodes()
Add a function that can lock multiple inodes in order of their inode
numbers.  It handles nulls and duplicate inodes.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
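
The ordering rule described in e47d66ddd3 can be illustrated with a small userspace sketch. This is not the scoutfs implementation: `struct demo_inode`, `cmp_inodes()` and `demo_lock_order()` are hypothetical names, and the sketch only computes the order in which locks would be taken (ascending inode number, NULLs ignored, duplicates collapsed) without acquiring anything.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an inode; only the inode number matters here. */
struct demo_inode {
	uint64_t ino;
};

/* qsort comparator: NULLs sort last, otherwise ascending inode number. */
static int cmp_inodes(const void *a, const void *b)
{
	const struct demo_inode *x = *(const struct demo_inode * const *)a;
	const struct demo_inode *y = *(const struct demo_inode * const *)b;

	if (!x || !y)
		return (x == NULL) - (y == NULL);
	return (x->ino > y->ino) - (x->ino < y->ino);
}

/*
 * Sort the caller's array into locking order and return how many
 * distinct, non-NULL inodes lead the array afterwards.  A real
 * implementation would acquire a cluster lock for each of those
 * entries in turn; this sketch only computes the order.
 */
size_t demo_lock_order(struct demo_inode **inodes, size_t nr)
{
	size_t i, out = 0;

	qsort(inodes, nr, sizeof(inodes[0]), cmp_inodes);

	for (i = 0; i < nr && inodes[i]; i++) {
		/* Skip duplicates so the same lock is never taken twice. */
		if (out && inodes[out - 1]->ino == inodes[i]->ino)
			continue;
		inodes[out++] = inodes[i];
	}
	return out;
}
```

Because every caller sorts by the same key, two tasks that need overlapping sets of inodes always request the shared locks in the same order, which is what rules out the inversions.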
Mark Fasheh
c4e7b5a6e9 scoutfs: provide cluster safe ->llseek
Without this we return -ESPIPE when a process tries to seek on a regular
file.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
[zab: adapted to new lock call]
Signed-off-by: Zach Brown <zab@zabbo.net>
2017-08-30 10:38:00 -07:00
Mark Fasheh
1bcad2e9cc scoutfs: provide ->permission
We need to lock and refresh the VFS inode before it checks permissions in
system calls, otherwise we risk checking against stale inode metadata.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
[zab: adapted to newer lock call]
Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
Mark Fasheh
c0d3f99a6e scoutfs: Cluster coherent read/write
With trylock implemented we can add locking in readpage. After that it's
pretty easy to implement our own read/write functions which at this
point are more or less wrapping the kernel helpers in the correct
cluster locking.

Data invalidation is a bit interesting. If the lock we are invalidating
is an inode group lock, we use the lock boundaries to incrementally
search our inode cache. When an inode struct is found, we sync and
(optionally) truncate pages.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
[zab: adapted to newer lock call, fixed some error handling]
Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
Zach Brown
ceccc56c8f scoutfs: add inode locking flags to callers
Now that we have the inode refreshing flags let's add them to the
callers that want to have a current inode after they have their lock.
Callers locking newly created items use the new inode flag to reset the
refresh gen.

A few inode tests are moved down to after locking so that they can test
the current refreshed inode.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:38:00 -07:00
Zach Brown
a08530a24e scoutfs: add LKF_TRYLOCK
Add a flag that tells locking to return -EAGAIN if it hits contention.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:37:59 -07:00
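
The trylock behaviour from a08530a24e can be sketched in userspace as follows. This is a minimal sketch and not the scoutfs code: `DEMO_LKF_TRYLOCK`, `struct demo_lock`, `demo_lock()` and `demo_unlock()` are hypothetical, the flag value is made up, and a C11 `atomic_flag` spin stands in for the real cluster lock, which would sleep rather than spin.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical flag value; the real LKF_TRYLOCK definition is not shown here. */
#define DEMO_LKF_TRYLOCK 0x1

/* Hypothetical lock: a single atomic flag that is set while the lock is held. */
struct demo_lock {
	atomic_flag held;
};

/*
 * Acquire the lock.  With DEMO_LKF_TRYLOCK the caller gets -EAGAIN
 * as soon as contention is seen instead of waiting for the holder.
 */
int demo_lock(struct demo_lock *lck, int flags)
{
	while (atomic_flag_test_and_set(&lck->held)) {
		if (flags & DEMO_LKF_TRYLOCK)
			return -EAGAIN;
		/* Blocking callers retry here; a real lock would sleep. */
	}
	return 0;
}

/* Release the lock so that the next acquire succeeds. */
void demo_unlock(struct demo_lock *lck)
{
	atomic_flag_clear(&lck->held);
}
```

A caller that must not block, such as a page read path, can pass the trylock flag and back off or retry when it sees -EAGAIN.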
Zach Brown
fdbe0de8e9 scoutfs: add flag to refresh inode after locking
Lock callers can specify that they want inode fields reread from items
after the lock is acquired.   dlmglue sets a refresh_gen in the locks
that we store in inodes to track when they were last refreshed and if
they need a refresh.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:37:59 -07:00
Zach Brown
d2a1b915fc scoutfs: publish refresh_gen from dlmglue
In addition to setting NEEDS_REFRESH when locks are acquired out of NL,
we now also give them a refresh_gen counter that is increased by
incrementing a long lived counter in the super.

This gives callers a strictly increasing read-only indication that the
lock has changed.  They don't have to serialize users to clear
NEEDS_REFRESH and transfer it to some other serialized state.

scoutfs will use this with the multiple inodes that are refreshed with
respect to the lock's refresh_gen.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:37:59 -07:00
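
The generation scheme described in d2a1b915fc and fdbe0de8e9 can be sketched like this. All names (`struct gen_super`, `struct gen_lock`, `struct gen_inode`, `gen_lock_granted()`, `gen_refresh_inode()`) are hypothetical, and the inequality test used to decide whether a refresh is needed is an assumption, not something the commit messages spell out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical long-lived counter kept in the super. */
struct gen_super {
	uint64_t next_refresh_gen;
};

/* Hypothetical lock carrying the generation it was last granted with. */
struct gen_lock {
	uint64_t refresh_gen;
};

/* Hypothetical inode remembering the generation it was last refreshed at. */
struct gen_inode {
	uint64_t last_refreshed;
	unsigned int reads;	/* counts simulated rereads from items */
};

/* Called when a lock is acquired out of NL: hand it a fresh generation. */
void gen_lock_granted(struct gen_super *sup, struct gen_lock *lck)
{
	lck->refresh_gen = ++sup->next_refresh_gen;
}

/*
 * Reread the inode only if it has not yet been refreshed under the
 * lock's current generation.  Returns true when a reread happened.
 */
bool gen_refresh_inode(struct gen_inode *inode, const struct gen_lock *lck)
{
	if (inode->last_refreshed == lck->refresh_gen)
		return false;

	inode->reads++;		/* stands in for rereading fields from items */
	inode->last_refreshed = lck->refresh_gen;
	return true;
}
```

Since each inode compares its own stored value against the lock's counter, several inodes covered by one lock can be refreshed independently, and nobody has to clear a shared NEEDS_REFRESH bit.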
Zach Brown
51e03dcb7a scoutfs: refactor inode locking function
This is based on Mark Fasheh <mfasheh@versity.com>'s series that
introduced inode refreshing after locking and a trylock for readpage.

Rework the inode locking function so that it's more clearly named and
takes flags and the inode struct.

We have callers that want to lock the logical inode but aren't doing
anything with the vfs inode so we provide that specific entry point.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-30 10:37:59 -07:00
Mark Fasheh
e2befc8736 scoutfs: silence dlmglue mlog()
These debug prints are spamming the console, send them to the trace
buffer instead.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-29 18:45:02 -05:00
Mark Fasheh
0c1a81621b scoutfs: #if 0 out lockdep code in dlmglue
This portion of the port needs a bit of work before we can use
it in scoutfs. In the meantime, disable it so that we can build
on debug kernels.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-28 16:44:22 -05:00
Mark Fasheh
3a54b413d5 scoutfs: remove some #ifdef'd out definitions in dlmglue.h
These make it hard to read the header and are very ocfs2-specific
functions that would get moved when we merge this upstream anyway.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-28 16:37:15 -05:00
Mark Fasheh
dc15c610ca scoutfs: fix null pointer deref in get_manifest_refs()
When we're not the server node, 'mani' is NULL, so derefing it in our
loop causes a crash. That said, we don't need it anyway - the loop will
eventually end when our btree walk (via btree_prev_overlap_or_next())
ends.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-24 16:41:35 -05:00
Mark Fasheh
0011c185a9 scoutfs: plug the rest of our locking into dlmglue
We move struct ocfs2_lock_res_ops and flags to dlmglue.c so that
locks.c can get access to it. Similarly, we export
ocfs2_lock_res_init_common() so that locks.c can initialize each lockres
before use. Also, free_lock_tree() now has to happen before we shut
down the dlm - this gives dlmglue the opportunity to unlock their
underlying dlm locks before we go off freeing the structures.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-24 11:45:15 -05:00
Mark Fasheh
00f5ebf38c scoutfs: use dlmglue for lockspace bringup/shutdown
Ultimately the direct dlm lock calls will go away. For now though we
grab the lockspace off our cluster connection object. In order to get
this going, I stubbed out our recovery callbacks which now gets us a
print when a node goes down.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-24 02:27:24 -05:00
Mark Fasheh
6308f347c0 scoutfs: provide a function to init and uninit our dlmglue context
Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-24 00:07:22 -05:00
Mark Fasheh
4fb011ca71 scoutfs: export ocfs2_cluster_(un)lock from dlmglue.c
This is what we'll want to build our scoutfs locks on.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 22:18:42 -05:00
Mark Fasheh
b1084bee8f scoutfs: enable ocfs2_dlm_init/ocfs2_dlm_shutdown
These work with little modification. We comment out a couple of
ocfs2-specific lines and decouple a few more variables from the osb
structure. As it stands, ocfs2 could also use these init/shutdown
functions with little modification.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 22:12:24 -05:00
Mark Fasheh
72a8e9e171 scoutfs: pull in some of ocfs2 stackglue
Dlmglue is built on top of this. Bring in the portions we need which
includes the stackglue API as well as most of the fs/dlm implementation.
I left off the Ocfs2 specific version and connection handling. Also
left out is the old Ocfs2 dlm support which we'll never want.

Like dlmglue, we keep as much of the generic stackglue code intact
here. This will make translating to/from upstream patches much easier.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 21:40:20 -05:00
Mark Fasheh
960f8e08bb scoutfs: copy in DLM_LVB_LEN from fs/ocfs2/dlm/dlmapi.h
Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 19:06:18 -05:00
Mark Fasheh
114760365c scoutfs: fix up ocfs2_log_dlm_error()
We're still referencing some ocfs2 specific lock names here,
take them out.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 19:00:05 -05:00
Mark Fasheh
61499c5d30 scoutfs: pull in struct ocfs2_dlm_debug from fs/ocfs2/ocfs2.h
We need this for the dlmglue global context.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 18:52:49 -05:00
Mark Fasheh
1b59ed99fb scoutfs: remove ocfs2_lock_res->l_type
We don't need it - this is the only ocfs2-ism in struct ocfs2_lock_res.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 18:51:38 -05:00
Mark Fasheh
bb100356d9 scoutfs: pull in some fields from ocfs2_super for dlmglue
This is all the dlmglue global context needed.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 18:37:06 -05:00
Mark Fasheh
1831014c24 scoutfs: remove usage of ocfs2_lock_type_string()
This only leaked into the bast function. I retained the debug print -
it'll be turned off in our build anyway, and that's what we'd
want to do upstream.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 18:16:00 -05:00
Mark Fasheh
13963d22e3 scoutfs: pull in OCFS2_LOCK_ID_MAX_LEN
We need this for the lockres name. It also turns out to be the only
thing we need from fs/ocfs2/ocfs2_lockid.h.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 18:12:54 -05:00
Mark Fasheh
9bfb9c059d scoutfs: copy struct ocfs2_lock_res
Grab this from fs/ocfs2/ocfs2.h and put it in dlmglue.h.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 18:07:52 -05:00
Mark Fasheh
99d00a5a2f scoutfs: dlmglue needs to #include "dlmglue.h"
Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 18:07:31 -05:00
Mark Fasheh
2142648906 scoutfs: include linux/dlm.h
dlmglue needs this as we're no longer hooking it into the stackglue
component.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 17:59:10 -05:00
Mark Fasheh
498a2f3721 scoutfs: ifdef out usage of OCFS2_LOCK_TYPE_DENTRY
Some of this leaks through even after the big #ifdef'ing - ocfs2 had
to special case printing the name of dentry locks. We don't have such
a need so it's easy to drop those calls.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 17:57:34 -05:00
Mark Fasheh
bf6020c22b scoutfs: hide lockdep_keys in dlmglue for now
This belongs behind #ifdef CONFIG_DEBUG_LOCK_ALLOC in the
upstream code too.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 17:47:50 -05:00
Mark Fasheh
d4a89a5fbc scoutfs: dlmglue ifdef out ocfs2_build_lock_name()
This was missed in the initial #ifdef patch.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 17:46:55 -05:00
Mark Fasheh
500baca533 scoutfs: wrap some mlog calls in dlmglue
Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 17:15:23 -05:00
Mark Fasheh
eae932e0fe scoutfs: dlmglue fix sched.h header
Upstream moved linux/sched.h to linux/sched/signal.h. Centos still uses
the old header location.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 16:05:34 -05:00
Mark Fasheh
bc2fef7fc8 scoutfs: ifdef out ocfs2 specific callbacks and functions
We only want the generic stuff. Long term the Ocfs2 specific code would be
what's left in fs/ocfs2/dlmglue.[ch].

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 16:05:24 -05:00
Mark Fasheh
fc21a0253c scoutfs: Hook dlmglue into our build system
Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-23 15:54:08 -05:00
Mark Fasheh
f7e3f6f9e6 scoutfs: import fs/ocfs2/dlmglue.[ch] from Linux v4.13-rc6
Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-22 19:07:53 -05:00
Mark Fasheh
021404bb6a scoutfs: remove inode ctime index
Like the mtime index, this index is unused. Removing it is a near
identical task. Running the same createmany test from our last
patch gives us the following:

 $ createmany -o '/scoutfs/file_%lu' 10000000

 total: 10000000 creates in 598.28 seconds: 16714.59 creates/second

 real    9m58.292s
 user    0m7.420s
 sys     5m44.632s

So after both indices are gone, we go from a 12m56s run time to 9m58s,
saving almost 3 minutes, which translates into about 23% less elapsed
time for the test.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-22 15:59:13 -07:00
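
The percentages quoted in the two index-removal commits (021404bb6a and d59367262d) can be rechecked from the createmany figures in their messages. The helpers below are a hypothetical sketch, not part of scoutfs; they show that the change in elapsed time and the change in creates/second are different ratios.

```c
/* Percent reduction in elapsed time going from before_s to after_s. */
double pct_time_saved(double before_s, double after_s)
{
	return 100.0 * (before_s - after_s) / before_s;
}

/* Percent increase in throughput going from before_rate to after_rate. */
double pct_rate_gain(double before_rate, double after_rate)
{
	return 100.0 * (after_rate - before_rate) / before_rate;
}
```

With the reported numbers, removing only the mtime index (776.54 s to 691.92 s, 12877.56 to 14452.46 creates/second) is roughly 10.9% less elapsed time and roughly 12.2% more creates/second. Removing both indices (776.54 s to 598.28 s, 12877.56 to 16714.59 creates/second) is roughly 23.0% less elapsed time and roughly 29.8% more creates/second.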
Mark Fasheh
d59367262d scoutfs: remove inode mtime index
This index is unused - we can gain some create performance by removing it.

To verify this, I ran createmany for 10 million files:

 $ createmany -o '/scoutfs/file_%lu' 10000000

Before this patch:
 total: 10000000 creates in 776.54 seconds: 12877.56 creates/second

 real    12m56.557s
 user    0m7.861s
 sys     6m56.986s

After this patch:
 total: 10000000 creates in 691.92 seconds: 14452.46 creates/second

 real    11m31.936s
 user    0m7.785s
 sys     6m19.328s

So removing the index gained us about a minute and a half on the test or a
12% increase in creates per second.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-22 15:59:13 -07:00
Zach Brown
8135b18c76 scoutfs: start truncate from first block
Truncation updates extents that intersect with the input range.  It
starts with the first block in the range and iterates until it has
searched for all the extents that could cover the range.

Extents are stored in items at their final block location so that we can
use _next to find intersections.  Truncation was searching for the next
extent after the full extent that it was still searching for.  That
means it was starting the search at the last block in the extent, not
the first.  It would miss all the extents that didn't overlap with the
last block it was searching for.

This is fixed by searching from a temporary single block extent at the
start of the search range.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-17 15:29:08 -07:00
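
The search problem fixed in 8135b18c76 can be reproduced with a small model. This is a sketch under assumptions: extents live in an array sorted by their last block, `demo_next()` is a hypothetical stand-in for the item `_next` lookup, `demo_count_intersecting()` is a hypothetical caller that only counts what truncation would visit, and the range length is assumed to be non-zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent covering blocks [start, start + len - 1]. */
struct demo_extent {
	uint64_t start;
	uint64_t len;
};

/* Extents are keyed by their final block. */
static uint64_t ext_last(const struct demo_extent *e)
{
	return e->start + e->len - 1;
}

/*
 * Stand-in for a _next lookup: return the first extent whose last
 * block is >= key.  The array is assumed sorted by last block with
 * no overlapping extents.
 */
const struct demo_extent *demo_next(const struct demo_extent *exts,
				    size_t nr, uint64_t key)
{
	for (size_t i = 0; i < nr; i++) {
		if (ext_last(&exts[i]) >= key)
			return &exts[i];
	}
	return NULL;
}

/*
 * Count the extents that intersect [start, start + len - 1].  The
 * search key begins at the first block of the range and advances
 * past each extent that is found.
 */
size_t demo_count_intersecting(const struct demo_extent *exts, size_t nr,
			       uint64_t start, uint64_t len)
{
	uint64_t end = start + len - 1;
	uint64_t pos = start;
	const struct demo_extent *e;
	size_t count = 0;

	while (pos <= end && (e = demo_next(exts, nr, pos)) != NULL) {
		if (e->start > end)
			break;	/* next extent begins past the range */
		count++;
		pos = ext_last(e) + 1;
	}
	return count;
}
```

For extents covering blocks 0-3, 4-7, 10-11 and 20-24, a range of blocks 2-11 visits three extents when the search starts at block 2. Starting the lookup at block 11, the last block of the range, lands directly on the 10-11 extent and never sees the two before it, which is the failure the commit describes.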
Mark Fasheh
d1ae486d83 scoutfs: provide ->llseek
Without this we return -ESPIPE when a process tries to seek on a regular
file.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
2017-08-14 19:57:13 -07:00
Zach Brown
07bbc418c3 scoutfs: merge offline extents
Offline extents weren't being merged because they all had their physical
blkno set to 0 and all the extent calculations didn't treat them
specially.  They would only merge if the physical blocks of two extents
were contiguous.  Instead of special casing offline extents everywhere
we store them with a physical blkno set to the logical blk_off.  This
lets all the current extent calculations work as expected.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-14 09:19:03 -07:00
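
The reasoning in 07bbc418c3 can be illustrated with a generic contiguity check. This is a sketch, not the scoutfs extent code: `struct off_extent`, `off_can_merge()` and `off_make_offline()` are hypothetical, and the requirement that both extents share the same offline state is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extent: logical offset, physical block, length, state. */
struct off_extent {
	uint64_t blk_off;	/* logical block offset in the file */
	uint64_t blkno;		/* physical block number */
	uint64_t count;		/* number of blocks */
	bool offline;
};

/*
 * Generic merge test with no special case for offline extents: the
 * right extent must continue the left one both logically and
 * physically.
 */
bool off_can_merge(const struct off_extent *l, const struct off_extent *r)
{
	return l->offline == r->offline &&
	       l->blk_off + l->count == r->blk_off &&
	       l->blkno + l->count == r->blkno;
}

/* Build an offline extent whose physical blkno mirrors its logical offset. */
struct off_extent off_make_offline(uint64_t blk_off, uint64_t count)
{
	struct off_extent e = { blk_off, blk_off, count, true };

	return e;
}
```

Two logically adjacent offline extents that both store blkno 0 fail the physical half of the test, while the same two extents built with `off_make_offline()` pass it without any change to the merge logic.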
Zach Brown
7cc09761f5 scoutfs: release item cleanup needs transaction
Release tries to re-instate extents if it sees an error during release.
Those item manipulations need to be covered by the transaction.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-14 09:19:03 -07:00
Zach Brown
c7ad9fe772 scoutfs: make release block granular
The existing release interface specified byte regions to release but
that didn't match what the underlying file data mapping structure is
capable of.  What happens if you specify a single byte to release?  Does
it release the whole block?  Does it release nothing?  Does it return an
error?

By making the interface match the capability of the operation we make
the functioning of the system that much more predictable.  Callers are
forced to think about implementing their desires in terms of block
granular releasing.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-14 09:19:03 -07:00
Zach Brown
87ab27beb1 scoutfs: add statfs network message
The ->statfs method was still using the super_block in the super_info
that was read during mount.  This will get progressively more out
of date.

We add a network message to ask the server for the current fields that
impact statfs.  This is always racy and the fields are mostly nonsense,
but we try our best.

Signed-off-by: Zach Brown <zab@versity.com>
2017-08-11 10:43:35 -07:00