Use mounted_client item as sign of farewell

As clients unmount they send a farewell request that cleans up
persistent state associated with the mount.  The client needs to be sure
that it gets processed, and we must maintain a majority of quorum
members mounted to be able to elect a server to process farewell
requests.

We had a mechanism using the unmount_barrier fields in the greeting and
super_block to let the final unmounting quorum majority know that their
farewells have been processed and that they didn't need to keep trying
to reconnect.

But we missed that non-quorum member clients need this out-of-band
farewell handling signal as well.  The server can send farewell
responses to a non-member client and to the final majority and then
tear down all the connections before the non-quorum client can see its
farewell response.  The non-quorum client also needs to know that its
farewell has been processed before the server lets the final majority
unmount.

We can remove the custom unmount_barrier method and instead have all
unmounting clients check for their mounted_client item in the server's
btree.  This item is removed as the last step of farewell processing so
if the client sees that it has been removed it knows that it doesn't
need to resend the farewell and can finish unmounting.
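
The check described above can be sketched roughly as follows.  This is
an illustrative model only, not the code from this commit: the
mounted_set array stands in for the server's btree, and every name here
(mounted_set, mounted_client_exists, process_farewell,
farewell_needs_resend, farewell_demo) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the mounted_client items in the server's btree. */
struct mounted_set {
	uint64_t rids[8];
	size_t nr;
};

/* Return true if an item for this client id is still present. */
static bool mounted_client_exists(const struct mounted_set *set, uint64_t rid)
{
	size_t i;

	for (i = 0; i < set->nr; i++) {
		if (set->rids[i] == rid)
			return true;
	}
	return false;
}

/* Model the last step of farewell processing: remove the client's item. */
static void process_farewell(struct mounted_set *set, uint64_t rid)
{
	size_t i;

	for (i = 0; i < set->nr; i++) {
		if (set->rids[i] == rid) {
			/* Order does not matter, so fill the hole with the last entry. */
			set->rids[i] = set->rids[--set->nr];
			return;
		}
	}
}

/*
 * Unmount-side decision.  A client that never saw a farewell response
 * only needs to resend while its item still exists; once the item is
 * gone the farewell is known to have been processed.
 */
static bool farewell_needs_resend(const struct mounted_set *set, uint64_t rid,
				  bool saw_response)
{
	if (saw_response)
		return false;
	return mounted_client_exists(set, rid);
}

/* Walk through the race: the response is lost but the item was removed. */
static void farewell_demo(void)
{
	struct mounted_set set = { .rids = { 10, 20, 30 }, .nr = 3 };

	assert(farewell_needs_resend(&set, 20, false));
	process_farewell(&set, 20);
	assert(!farewell_needs_resend(&set, 20, false));
}
```

The point of the sketch is that the decision depends only on state the
server has already committed, so a lost response or a torn-down
connection cannot leave the client retrying forever.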

This fixes a bug where a non-quorum unmount could hang if it raced with
the final majority unmounting.  I was able to trigger this hang in our
tests with 5 mounts and 3 quorum members.

Signed-off-by: Zach Brown <zab@versity.com>
Zach Brown
2021-02-04 16:25:15 -08:00
parent 79f6878355
commit 57f34e90e9
4 changed files with 57 additions and 59 deletions


@@ -870,7 +870,6 @@ static void print_super_block(struct scoutfs_super_block *super, u64 blkno)
printf(" next_ino %llu next_trans_seq %llu\n"
" total_meta_blocks %llu first_meta_blkno %llu last_meta_blkno %llu\n"
" total_data_blocks %llu first_data_blkno %llu last_data_blkno %llu\n"
-" unmount_barrier %llu\n"
" meta_alloc[0]: "ALCROOT_F"\n"
" meta_alloc[1]: "ALCROOT_F"\n"
" data_alloc: "ALCROOT_F"\n"
@@ -891,7 +890,6 @@ static void print_super_block(struct scoutfs_super_block *super, u64 blkno)
le64_to_cpu(super->total_data_blocks),
le64_to_cpu(super->first_data_blkno),
le64_to_cpu(super->last_data_blkno),
-le64_to_cpu(super->unmount_barrier),
ALCROOT_A(&super->meta_alloc[0]),
ALCROOT_A(&super->meta_alloc[1]),
ALCROOT_A(&super->data_alloc),