Compare commits

...

13 Commits

Author SHA1 Message Date
Zach Brown
a49584739a Use count/scan objects shrinking interface
Move to the more recent interfaces for counting and scanning cached
objects to shrink.

Signed-off-by: Zach Brown <zab@versity.com>
2022-08-02 15:29:48 -07:00
Zach Brown
fd1c4777c2 Use more modern bio interfaces
Move towards modern bio intefaces, while unfortunately carrying along a
bunch of compat functions that let us still work with the old
incompatible interfaces.

Signed-off-by: Zach Brown <zab@versity.com>
2022-08-01 14:10:40 -07:00
Zach Brown
0b0beb2830 Use memalloc_nofs_save
memalloc_nofs_save() was introduced as preferential to trying to use GFP
flags to indicate that a task should not recurse during reclaim.  We use
it instead of the _noio_ we were using before.

Signed-off-by: Zach Brown <zab@versity.com>
2022-08-01 09:25:17 -07:00
Zach Brown
bb006191e0 Use percpu_counter_add_batch
__percpu_counter_add_batch was renamed to make it clear that the __
doesn't mean it's less safe, as it means in other calls in the API, but
just that it takes an additional parameter.

Signed-off-by: Zach Brown <zab@versity.com>
2022-07-21 11:23:02 -07:00
Zach Brown
89b64ae1f7 Merge pull request #97 from versity/zab/v1_6_release
v1.6 Release
2022-07-07 14:54:26 -07:00
Zach Brown
fc8a5a1b5c v1.6 Release
Finish the release notes for the 1.6 release.

Signed-off-by: Zach Brown <zab@versity.com>
2022-07-07 13:07:55 -07:00
Zach Brown
d4c793e010 Merge pull request #94 from versity/zab/mem_free_fixes
Zab/mem free fixes
2022-07-07 13:07:04 -07:00
Zach Brown
8a3058818c Merge pull request #95 from versity/zab/skip_likely_huge
Add skip-likely-huge print option
2022-07-07 10:27:50 -07:00
Zach Brown
ba9a106f72 Free send attempts to disconnected clients
Callers who send to specific client connections can get -ENOTCONN if
their client has gone away.   We forgot to free the send tracking struct
in that case.

Signed-off-by: Zach Brown <zab@versity.com>
2022-07-06 15:16:20 -07:00
Zach Brown
310725eb72 Free omap rid list as server exits
The omap code keeps track of rids that are connected to the server.  It
only freed the tracked rids as the server told it that rids were being
removed.   But that removal only happened as clients were evicted.  If
the server shutdown it'd leave the old rid entries around.   They'd be
leaked as the mount was unmounted and could linger and crate duplicate
entries if the server started back up and the same clients reconnected.

The fix is to free the tracking rids as the server shuts down.   They'll
be rebuilt as clients reconnect if the server restarts.

Signed-off-by: Zach Brown <zab@versity.com>
2022-07-06 15:16:19 -07:00
Zach Brown
51a8236316 Fix missed partial fill_super teardown
If we return an error from .fill_super without having set sb->s_root
then the vfs won't call our put_super.  Our fill_super is careful to
call put_super so that it can tear down partial state, but we weren't
doing this with a few very early errors in fill_super.  This tripped
leak detection when we weren't freeing the sbi when returning errors
from bad option parsing.

Signed-off-by: Zach Brown <zab@versity.com>
2022-07-06 15:16:19 -07:00
Zach Brown
f3dd00895b Don't allocate zero size net info
Clients don't use the net conn info and specified that it has 0 size.
The net layer would try and allocate a zero size region which returns
the magic ZERO_SIZE_PTR, which it would then later try and free.  While
that works, it's a little goofy.   We can avoid the allocation when the
size is 0.  The pointer will remain null which kfree also accepts.

Signed-off-by: Zach Brown <zab@versity.com>
2022-07-06 15:16:19 -07:00
Zach Brown
49df98f5a8 Add skip-likely-huge print option
Add an option to skip printing structures that are likely to be so huge
that the print output becomes completely unwieldly on large systems.

Signed-off-by: Zach Brown <zab@versity.com>
2022-07-06 15:07:57 -07:00
12 changed files with 304 additions and 73 deletions

View File

@@ -1,6 +1,22 @@
Versity ScoutFS Release Notes
=============================
---
v1.6
\
*Jul 7, 2022*
* **Fix memory leaks in rare corner cases**
\
Analysis tools found a few corner cases that leaked small structures,
generally around error handling or startup and shutdown.
* **Add --skip-likely-huge scoutfs print command option**
\
Add an option to scoutfs print to reduce the size of the output
so that it can be used to see system-wide metadata without being
overwhelmed by file-level details.
---
v1.5
\

View File

@@ -46,6 +46,10 @@ scoutfs-y += \
volopt.o \
xattr.o
ifdef KC_BUILD_KERNELCOMPAT
scoutfs-y += kernelcompat.o
endif
#
# The raw types aren't available in userspace headers. Make sure all
# the types we use in the headers are the exported __ versions.

View File

@@ -34,3 +34,52 @@ endif
ifneq (,$(shell grep 'FMODE_KABI_ITERATE' include/linux/fs.h))
ccflags-y += -DKC_FMODE_KABI_ITERATE
endif
#
# v4.11-12447-g104b4e5139fe
#
# Renamed __percpu_counter_add to percpu_counter_add_batch to clarify
# that the __ wasn't less safe, just took an extra parameter.
#
ifneq (,$(shell grep 'percpu_counter_add_batch' include/linux/percpu_counter.h))
ccflags-y += -DKC_PERCPU_COUNTER_ADD_BATCH
endif
#
# v4.11-4550-g7dea19f9ee63
#
# Introduced memalloc_nofs_{save,restore} preferred instead of _noio_.
#
ifneq (,$(shell grep 'memalloc_nofs_save' include/linux/sched/mm.h))
ccflags-y += -DKC_MEMALLOC_NOFS_SAVE
endif
#
# v4.7-12414-g1eff9d322a44
#
# Renamed bi_rw to bi_opf to force old code to catch up. We use it as a
# single switch between old and new bio structures.
#
ifneq (,$(shell grep 'bi_opf' include/linux/blk_types.h))
ccflags-y += -DKC_BIO_BI_OPF
endif
#
# v4.12-rc2-201-g4e4cbee93d56
#
# Moves to bi_status BLK_STS_ API instead of having a mix of error
# end_io args or bi_error.
#
ifneq (,$(shell grep 'bi_status' include/linux/blk_types.h))
ccflags-y += -DKC_BIO_BI_STATUS
endif
#
# v3.11-8765-ga0b02131c5fc
#
# Remove the old ->shrink() API, ->{scan,count}_objects is preferred.
#
ifneq (,$(shell grep '(*shrink)' include/linux/shrinker.h))
ccflags-y += -DKC_SHRINKER_SHRINK
KC_BUILD_KERNELCOMPAT=1
endif

View File

@@ -21,6 +21,7 @@
#include <linux/blkdev.h>
#include <linux/rhashtable.h>
#include <linux/random.h>
#include <linux/sched/mm.h>
#include "format.h"
#include "super.h"
@@ -57,7 +58,7 @@ struct block_info {
atomic64_t access_counter;
struct rhashtable ht;
wait_queue_head_t waitq;
struct shrinker shrinker;
KC_DEFINE_SHRINKER(shrinker);
struct work_struct free_work;
struct llist_head free_llist;
};
@@ -128,7 +129,7 @@ static __le32 block_calc_crc(struct scoutfs_block_header *hdr, u32 size)
static struct block_private *block_alloc(struct super_block *sb, u64 blkno)
{
struct block_private *bp;
unsigned int noio_flags;
unsigned int nofs_flags;
/*
* If we had multiple blocks per page we'd need to be a little
@@ -156,9 +157,9 @@ static struct block_private *block_alloc(struct super_block *sb, u64 blkno)
* spurious reclaim-on dependencies and warnings.
*/
lockdep_off();
noio_flags = memalloc_noio_save();
nofs_flags = memalloc_nofs_save();
bp->virt = __vmalloc(SCOUTFS_BLOCK_LG_SIZE, GFP_NOFS | __GFP_HIGHMEM, PAGE_KERNEL);
memalloc_noio_restore(noio_flags);
memalloc_nofs_restore(nofs_flags);
lockdep_on();
if (!bp->virt) {
@@ -436,11 +437,10 @@ static void block_remove_all(struct super_block *sb)
* possible. Final freeing, verifying checksums, and unlinking errored
* blocks are all done by future users of the blocks.
*/
static void block_end_io(struct super_block *sb, int rw,
static void block_end_io(struct super_block *sb, unsigned int opf,
struct block_private *bp, int err)
{
DECLARE_BLOCK_INFO(sb, binf);
bool is_read = !(rw & WRITE);
if (err) {
scoutfs_inc_counter(sb, block_cache_end_io_error);
@@ -450,7 +450,7 @@ static void block_end_io(struct super_block *sb, int rw,
if (!atomic_dec_and_test(&bp->io_count))
return;
if (is_read && !test_bit(BLOCK_BIT_ERROR, &bp->bits))
if (!op_is_write(opf) && !test_bit(BLOCK_BIT_ERROR, &bp->bits))
set_bit(BLOCK_BIT_UPTODATE, &bp->bits);
clear_bit(BLOCK_BIT_IO_BUSY, &bp->bits);
@@ -463,13 +463,13 @@ static void block_end_io(struct super_block *sb, int rw,
wake_up(&binf->waitq);
}
static void block_bio_end_io(struct bio *bio, int err)
static void KC_DECLARE_BIO_END_IO(block_bio_end_io, struct bio *bio)
{
struct block_private *bp = bio->bi_private;
struct super_block *sb = bp->sb;
TRACE_BLOCK(end_io, bp);
block_end_io(sb, bio->bi_rw, bp, err);
block_end_io(sb, kc_bio_get_opf(bio), bp, kc_bio_get_errno(bio));
bio_put(bio);
}
@@ -477,7 +477,7 @@ static void block_bio_end_io(struct bio *bio, int err)
* Kick off IO for a single block.
*/
static int block_submit_bio(struct super_block *sb, struct block_private *bp,
int rw)
unsigned int opf)
{
struct scoutfs_sb_info *sbi = SCOUTFS_SB(sb);
struct bio *bio = NULL;
@@ -510,8 +510,9 @@ static int block_submit_bio(struct super_block *sb, struct block_private *bp,
break;
}
bio->bi_sector = sector + (off >> 9);
bio->bi_bdev = sbi->meta_bdev;
kc_bio_set_opf(bio, opf);
kc_bio_set_sector(bio, sector + (off >> 9));
bio_set_dev(bio, sbi->meta_bdev);
bio->bi_end_io = block_bio_end_io;
bio->bi_private = bp;
@@ -528,18 +529,18 @@ static int block_submit_bio(struct super_block *sb, struct block_private *bp,
BUG();
if (!bio_add_page(bio, page, PAGE_SIZE, 0)) {
submit_bio(rw, bio);
submit_bio(bio);
bio = NULL;
}
}
if (bio)
submit_bio(rw, bio);
submit_bio(bio);
blk_finish_plug(&plug);
/* let racing end_io know we're done */
block_end_io(sb, rw, bp, ret);
block_end_io(sb, opf, bp, ret);
return ret;
}
@@ -640,7 +641,7 @@ static struct block_private *block_read(struct super_block *sb, u64 blkno)
if (!test_bit(BLOCK_BIT_UPTODATE, &bp->bits) &&
test_and_clear_bit(BLOCK_BIT_NEW, &bp->bits)) {
ret = block_submit_bio(sb, bp, READ);
ret = block_submit_bio(sb, bp, REQ_OP_READ);
if (ret < 0)
goto out;
}
@@ -939,7 +940,7 @@ int scoutfs_block_writer_write(struct super_block *sb,
/* retry previous write errors */
clear_bit(BLOCK_BIT_ERROR, &bp->bits);
ret = block_submit_bio(sb, bp, WRITE);
ret = block_submit_bio(sb, bp, REQ_OP_WRITE);
if (ret < 0)
break;
}
@@ -1039,6 +1040,17 @@ u64 scoutfs_block_writer_dirty_bytes(struct super_block *sb,
return wri->nr_dirty_blocks * SCOUTFS_BLOCK_LG_SIZE;
}
static unsigned long block_count_objects(struct shrinker *shrink, struct shrink_control *sc)
{
struct block_info *binf = container_of(shrink, struct block_info, shrinker);
struct super_block *sb = binf->sb;
scoutfs_inc_counter(sb, block_cache_scan_objects);
return min_t(u64, (u64)atomic_read(&binf->total_inserted) * SCOUTFS_BLOCK_LG_PAGES_PER,
ULONG_MAX / 2); /* magic numbers as we approach ~0UL :/ */
}
/*
* Remove a number of cached blocks that haven't been used recently.
*
@@ -1059,23 +1071,19 @@ u64 scoutfs_block_writer_dirty_bytes(struct super_block *sb,
* atomically remove blocks when the only references are ours and the
* hash table.
*/
static int block_shrink(struct shrinker *shrink, struct shrink_control *sc)
static unsigned long block_scan_objects(struct shrinker *shrink, struct shrink_control *sc)
{
struct block_info *binf = container_of(shrink, struct block_info,
shrinker);
struct block_info *binf = container_of(shrink, struct block_info, shrinker);
struct super_block *sb = binf->sb;
struct rhashtable_iter iter;
struct block_private *bp;
unsigned long freed = 0;
unsigned long nr;
u64 recently;
nr = sc->nr_to_scan;
if (nr == 0)
goto out;
scoutfs_inc_counter(sb, block_cache_scan_objects);
scoutfs_inc_counter(sb, block_cache_shrink);
nr = DIV_ROUND_UP(nr, SCOUTFS_BLOCK_LG_PAGES_PER);
nr = DIV_ROUND_UP(sc->nr_to_scan, SCOUTFS_BLOCK_LG_PAGES_PER);
restart:
recently = accessed_recently(binf);
@@ -1118,6 +1126,7 @@ restart:
if (block_remove_solo(sb, bp)) {
scoutfs_inc_counter(sb, block_cache_shrink_remove);
TRACE_BLOCK(shrink, bp);
freed++;
nr--;
}
block_put(sb, bp);
@@ -1126,9 +1135,8 @@ restart:
rhashtable_walk_stop(&iter);
rhashtable_walk_exit(&iter);
out:
return min_t(u64, (u64)atomic_read(&binf->total_inserted) * SCOUTFS_BLOCK_LG_PAGES_PER,
INT_MAX);
return freed;
}
struct sm_block_completion {
@@ -1136,11 +1144,11 @@ struct sm_block_completion {
int err;
};
static void sm_block_bio_end_io(struct bio *bio, int err)
static void KC_DECLARE_BIO_END_IO(sm_block_bio_end_io, struct bio *bio)
{
struct sm_block_completion *sbc = bio->bi_private;
sbc->err = err;
sbc->err = kc_bio_get_errno(bio);
complete(&sbc->comp);
bio_put(bio);
}
@@ -1155,9 +1163,8 @@ static void sm_block_bio_end_io(struct bio *bio, int err)
* only layer that sees the full block buffer so we pass the calculated
* crc to the caller for them to check in their context.
*/
static int sm_block_io(struct super_block *sb, struct block_device *bdev, int rw, u64 blkno,
struct scoutfs_block_header *hdr, size_t len,
__le32 *blk_crc)
static int sm_block_io(struct super_block *sb, struct block_device *bdev, unsigned int opf,
u64 blkno, struct scoutfs_block_header *hdr, size_t len, __le32 *blk_crc)
{
struct scoutfs_block_header *pg_hdr;
struct sm_block_completion sbc;
@@ -1171,7 +1178,7 @@ static int sm_block_io(struct super_block *sb, struct block_device *bdev, int rw
return -EIO;
if (WARN_ON_ONCE(len > SCOUTFS_BLOCK_SM_SIZE) ||
WARN_ON_ONCE(!(rw & WRITE) && !blk_crc))
WARN_ON_ONCE(!op_is_write(opf) && !blk_crc))
return -EINVAL;
page = alloc_page(GFP_NOFS);
@@ -1180,7 +1187,7 @@ static int sm_block_io(struct super_block *sb, struct block_device *bdev, int rw
pg_hdr = page_address(page);
if (rw & WRITE) {
if (op_is_write(opf)) {
memcpy(pg_hdr, hdr, len);
if (len < SCOUTFS_BLOCK_SM_SIZE)
memset((char *)pg_hdr + len, 0,
@@ -1194,8 +1201,9 @@ static int sm_block_io(struct super_block *sb, struct block_device *bdev, int rw
goto out;
}
bio->bi_sector = blkno << (SCOUTFS_BLOCK_SM_SHIFT - 9);
bio->bi_bdev = bdev;
bio->bi_opf = opf | REQ_SYNC;
kc_bio_set_sector(bio, blkno << (SCOUTFS_BLOCK_SM_SHIFT - 9));
bio_set_dev(bio, bdev);
bio->bi_end_io = sm_block_bio_end_io;
bio->bi_private = &sbc;
bio_add_page(bio, page, SCOUTFS_BLOCK_SM_SIZE, 0);
@@ -1203,12 +1211,12 @@ static int sm_block_io(struct super_block *sb, struct block_device *bdev, int rw
init_completion(&sbc.comp);
sbc.err = 0;
submit_bio((rw & WRITE) ? WRITE_SYNC : READ_SYNC, bio);
submit_bio(bio);
wait_for_completion(&sbc.comp);
ret = sbc.err;
if (ret == 0 && !(rw & WRITE)) {
if (ret == 0 && !op_is_write(opf)) {
memcpy(hdr, pg_hdr, len);
*blk_crc = block_calc_crc(pg_hdr, SCOUTFS_BLOCK_SM_SIZE);
}
@@ -1222,14 +1230,14 @@ int scoutfs_block_read_sm(struct super_block *sb,
struct scoutfs_block_header *hdr, size_t len,
__le32 *blk_crc)
{
return sm_block_io(sb, bdev, READ, blkno, hdr, len, blk_crc);
return sm_block_io(sb, bdev, REQ_OP_READ, blkno, hdr, len, blk_crc);
}
int scoutfs_block_write_sm(struct super_block *sb,
struct block_device *bdev, u64 blkno,
struct scoutfs_block_header *hdr, size_t len)
{
return sm_block_io(sb, bdev, WRITE, blkno, hdr, len, NULL);
return sm_block_io(sb, bdev, REQ_OP_WRITE, blkno, hdr, len, NULL);
}
int scoutfs_block_setup(struct super_block *sb)
@@ -1254,7 +1262,8 @@ int scoutfs_block_setup(struct super_block *sb)
atomic_set(&binf->total_inserted, 0);
atomic64_set(&binf->access_counter, 0);
init_waitqueue_head(&binf->waitq);
binf->shrinker.shrink = block_shrink;
KC_INIT_SHRINKER_FUNCS(struct block_info, shrinker,
&binf->shrinker, block_count_objects, block_scan_objects);
binf->shrinker.seeks = DEFAULT_SEEKS;
register_shrinker(&binf->shrinker);
INIT_WORK(&binf->free_work, block_free_work);

View File

@@ -30,6 +30,8 @@
EXPAND_COUNTER(block_cache_free) \
EXPAND_COUNTER(block_cache_free_work) \
EXPAND_COUNTER(block_cache_remove_stale) \
EXPAND_COUNTER(block_cache_count_objects) \
EXPAND_COUNTER(block_cache_scan_objects) \
EXPAND_COUNTER(block_cache_shrink) \
EXPAND_COUNTER(block_cache_shrink_next) \
EXPAND_COUNTER(block_cache_shrink_recent) \
@@ -235,12 +237,12 @@ struct scoutfs_counters {
#define SCOUTFS_PCPU_COUNTER_BATCH (1 << 30)
#define scoutfs_inc_counter(sb, which) \
__percpu_counter_add(&SCOUTFS_SB(sb)->counters->which, 1, \
SCOUTFS_PCPU_COUNTER_BATCH)
percpu_counter_add_batch(&SCOUTFS_SB(sb)->counters->which, 1, \
SCOUTFS_PCPU_COUNTER_BATCH)
#define scoutfs_add_counter(sb, which, cnt) \
__percpu_counter_add(&SCOUTFS_SB(sb)->counters->which, cnt, \
SCOUTFS_PCPU_COUNTER_BATCH)
percpu_counter_add_batch(&SCOUTFS_SB(sb)->counters->which, cnt, \
SCOUTFS_PCPU_COUNTER_BATCH)
void __init scoutfs_init_counters(void);
int scoutfs_setup_counters(struct super_block *sb);

23
kmod/src/kernelcompat.c Normal file
View File

@@ -0,0 +1,23 @@
#include "kernelcompat.h"
#ifdef KC_SHRINKER_SHRINK
#include <linux/shrinker.h>
/*
* If a target doesn't have that .{count,scan}_objects() interface then
* we have a .shrink() helper that performs the shrink work in terms of
* count/scan.
*/
int kc_shrink_wrapper(struct shrinker *shrink, struct shrink_control *sc)
{
struct kc_shrinker_funcs *funcs = KC_SHRINKER_FUNCS(shrink);
unsigned long nr;
if (sc->nr_to_scan != 0)
funcs->scan_objects(shrink, sc);
nr = funcs->count_objects(shrink, sc);
return min_t(unsigned long, nr, INT_MAX);
}
#endif

View File

@@ -46,4 +46,81 @@ static inline int dir_emit_dots(struct file *file, void *dirent,
}
#endif
#ifndef KC_DIR_EMIT_DOTS
#define percpu_counter_add_batch __percpu_counter_add
#endif
#ifndef KC_MEMALLOC_NOFS_SAVE
#define memalloc_nofs_save memalloc_noio_save
#define memalloc_nofs_restore memalloc_noio_restore
#endif
#ifdef KC_BIO_BI_OPF
#define kc_bio_get_opf(bio) \
({ \
(bio)->bi_opf; \
})
#define kc_bio_set_opf(bio, opf) \
do { \
(bio)->bi_opf = opf; \
} while (0)
#define kc_bio_set_sector(bio, sect) \
do { \
(bio)->bi_iter.bi_sector = sect;\
} while (0)
#else
#define kc_bio_get_opf(bio) \
({ \
(bio)->bi_rw; \
})
#define kc_bio_set_opf(bio, opf) \
do { \
(bio)->bio_rw = opf; \
} while (0)
#define kc_bio_set_sector(bio, sect) \
do { \
(bio)->bi_sector = sect; \
} while (0)
#endif
#ifdef KC_BIO_BI_STATUS
#define KC_DECLARE_BIO_END_IO(name, bio) name(bio)
#define kc_bio_get_errno(bio) ({ blk_status_to_errno((bio)->bi_status); })
#else
#define KC_DECLARE_BIO_END_IO(name, bio) name(bio, int _error_arg)
#define kc_bio_get_errno(bio) ({ (int)((void)(bio), _error_arg); })
#endif
#ifndef KC_SHRINKER_SHRINK
#define KC_DEFINE_SHRINKER(name) struct shrinker name
#define KC_INIT_SHRINKER_FUNCS(type, name, shrink, count, scan) do { \
__typeof__(shrink) _shrink = (shrink); \
_shrink->count_objects = count; \
_shrink->scan_objects = scan; \
} while (0)
#else
#include <linux/shrinker.h>
struct kc_shrinker_funcs {
unsigned long (*count_objects)(struct shrinker *, struct shrink_control *sc);
unsigned long (*scan_objects)(struct shrinker *, struct shrink_control *sc);
};
/* using adjacent member of an unnamed struct */
#define KC_DEFINE_SHRINKER(name) \
{ \
struct kc_shrinker_funcs shrinker_funcs; \
struct shinker name; \
}
#define KC_SHRINKER_FUNCS(shrinker) \
((void *)((long)(shrink) - sizeof(struct kc_shrinker_funcs)))
#define KC_INIT_SHRINKER_FUNCS(type, name, shrink, count, scan) do { \
BUILD_BUG_ON(offsetof(cont, shrink_funcs) + sizeof(struct kc_shrinker_funcs)) != \
offsetof(cont, name) + sizeof(struct kc_shrinker_funcs); \
struct kc_shrinker_funcs *_funcs = KC_SHRINKER_FUNCS(shrink) \
__typeof__(shrink) _shrink = (shrink); \
_funcs->count_objects = count; \
_funcs->scan_objects = scan; \
_shrink->shrink = kc_shrink_wrapper; \
} while (0)
#endif
#endif

View File

@@ -355,6 +355,7 @@ static int submit_send(struct super_block *sb,
}
if (rid != 0) {
spin_unlock(&conn->lock);
kfree(msend);
return -ENOTCONN;
}
}
@@ -1345,10 +1346,12 @@ scoutfs_net_alloc_conn(struct super_block *sb,
if (!conn)
return NULL;
conn->info = kzalloc(info_size, GFP_NOFS);
if (!conn->info) {
kfree(conn);
return NULL;
if (info_size) {
conn->info = kzalloc(info_size, GFP_NOFS);
if (!conn->info) {
kfree(conn);
return NULL;
}
}
conn->workq = alloc_workqueue("scoutfs_net_%s",

View File

@@ -157,6 +157,15 @@ static int free_rid(struct omap_rid_list *list, struct omap_rid_entry *entry)
return nr;
}
static void free_rid_list(struct omap_rid_list *list)
{
struct omap_rid_entry *entry;
struct omap_rid_entry *tmp;
list_for_each_entry_safe(entry, tmp, &list->head, head)
free_rid(list, entry);
}
static int copy_rids(struct omap_rid_list *to, struct omap_rid_list *from, spinlock_t *from_lock)
{
struct omap_rid_entry *entry;
@@ -804,6 +813,10 @@ void scoutfs_omap_server_shutdown(struct super_block *sb)
llist_for_each_entry_safe(req, tmp, requests, llnode)
kfree(req);
spin_lock(&ominf->lock);
free_rid_list(&ominf->rids);
spin_unlock(&ominf->lock);
synchronize_rcu();
}
@@ -864,6 +877,10 @@ void scoutfs_omap_destroy(struct super_block *sb)
rhashtable_walk_stop(&iter);
rhashtable_walk_exit(&iter);
spin_lock(&ominf->lock);
free_rid_list(&ominf->rids);
spin_unlock(&ominf->lock);
rhashtable_destroy(&ominf->group_ht);
rhashtable_destroy(&ominf->req_ht);
kfree(ominf);

View File

@@ -496,7 +496,7 @@ static int scoutfs_fill_super(struct super_block *sb, void *data, int silent)
ret = assign_random_id(sbi);
if (ret < 0)
return ret;
goto out;
spin_lock_init(&sbi->next_ino_lock);
spin_lock_init(&sbi->data_wait_root.lock);
@@ -505,7 +505,7 @@ static int scoutfs_fill_super(struct super_block *sb, void *data, int silent)
/* parse options early for use during setup */
ret = scoutfs_options_early_setup(sb, data);
if (ret < 0)
return ret;
goto out;
scoutfs_options_read(sb, &opts);
ret = sb_set_blocksize(sb, SCOUTFS_BLOCK_SM_SIZE);

View File

@@ -597,7 +597,7 @@ format.
.PD
.TP
.BI "print META-DEVICE"
.BI "print {-S|--skip-likely-huge} META-DEVICE"
.sp
Prints out all of the metadata in the file system. This makes no effort
to ensure that the structures are consistent as they're traversed and
@@ -607,6 +607,20 @@ output.
.PD 0
.TP
.sp
.B "-S, --skip-likely-huge"
Skip printing structures that are likely to be very large. The
structures that are skipped tend to be global and whose size tends to be
related to the size of the volume. Examples of skipped structures include
the global fs items, srch files, and metadata and data
allocators. Similar structures that are not skipped are related to the
number of mounts and are maintained at a relatively reasonable size.
These include per-mount log trees, srch files, allocators, and the
metadata allocators used by server commits.
.sp
Skipping the larger structures limits the print output to a relatively
constant size rather than being a large multiple of the used metadata
space of the volume making the output much more useful for inspection.
.TP
.B "META-DEVICE"
The path to the metadata device for the filesystem whose metadata will be
printed. Since this command reads via the host's buffer cache, it may not

View File

@@ -8,6 +8,7 @@
#include <errno.h>
#include <string.h>
#include <stdarg.h>
#include <stdbool.h>
#include <ctype.h>
#include <uuid/uuid.h>
#include <sys/socket.h>
@@ -989,9 +990,10 @@ static void print_super_block(struct scoutfs_super_block *super, u64 blkno)
struct print_args {
char *meta_device;
bool skip_likely_huge;
};
static int print_volume(int fd)
static int print_volume(int fd, struct print_args *args)
{
struct scoutfs_super_block *super = NULL;
struct print_recursion_args pa;
@@ -1041,23 +1043,26 @@ static int print_volume(int fd)
ret = err;
}
for (i = 0; i < array_size(super->meta_alloc); i++) {
snprintf(str, sizeof(str), "meta_alloc[%u]", i);
err = print_btree(fd, super, str, &super->meta_alloc[i].root,
if (!args->skip_likely_huge) {
for (i = 0; i < array_size(super->meta_alloc); i++) {
snprintf(str, sizeof(str), "meta_alloc[%u]", i);
err = print_btree(fd, super, str, &super->meta_alloc[i].root,
print_alloc_item, NULL);
if (err && !ret)
ret = err;
}
err = print_btree(fd, super, "data_alloc", &super->data_alloc.root,
print_alloc_item, NULL);
if (err && !ret)
ret = err;
}
err = print_btree(fd, super, "data_alloc", &super->data_alloc.root,
print_alloc_item, NULL);
if (err && !ret)
ret = err;
err = print_btree(fd, super, "srch_root", &super->srch_root,
print_srch_root_item, NULL);
if (err && !ret)
ret = err;
err = print_btree(fd, super, "logs_root", &super->logs_root,
print_log_trees_item, NULL);
if (err && !ret)
@@ -1065,19 +1070,23 @@ static int print_volume(int fd)
pa.super = super;
pa.fd = fd;
err = print_btree_leaf_items(fd, super, &super->srch_root.ref,
print_srch_root_files, &pa);
if (err && !ret)
ret = err;
if (!args->skip_likely_huge) {
err = print_btree_leaf_items(fd, super, &super->srch_root.ref,
print_srch_root_files, &pa);
if (err && !ret)
ret = err;
}
err = print_btree_leaf_items(fd, super, &super->logs_root.ref,
print_log_trees_roots, &pa);
if (err && !ret)
ret = err;
err = print_btree(fd, super, "fs_root", &super->fs_root,
print_fs_item, NULL);
if (err && !ret)
ret = err;
if (!args->skip_likely_huge) {
err = print_btree(fd, super, "fs_root", &super->fs_root,
print_fs_item, NULL);
if (err && !ret)
ret = err;
}
out:
free(super);
@@ -1098,7 +1107,7 @@ static int do_print(struct print_args *args)
return ret;
}
ret = print_volume(fd);
ret = print_volume(fd, args);
close(fd);
return ret;
};
@@ -1108,6 +1117,9 @@ static int parse_opt(int key, char *arg, struct argp_state *state)
struct print_args *args = state->input;
switch (key) {
case 'S':
args->skip_likely_huge = true;
break;
case ARGP_KEY_ARG:
if (!args->meta_device)
args->meta_device = strdup_or_error(state, arg);
@@ -1125,8 +1137,13 @@ static int parse_opt(int key, char *arg, struct argp_state *state)
return 0;
}
static struct argp_option options[] = {
{ "skip-likely-huge", 'S', NULL, 0, "Skip large structures to minimize output size"},
{ NULL }
};
static struct argp argp = {
NULL,
options,
parse_opt,
"META-DEV",
"Print metadata structures"