mirror of https://github.com/scylladb/scylladb.git (synced 2026-05-12 19:02:12 +00:00)
database: Make soft-pressure memtable flusher not consider already flushed memtables

The flusher picks the memtable list which contains the largest region according to region_impl::evictable_occupancy().total_space(), which follows region::occupancy().total_space(). But only the latest memtable in the list can start flushing. It can happen that the memtable corresponding to the largest region was already flushed to an sstable (flush permit released), but not yet fsynced or moved to cache, so it's still in the memtable list.

The latest memtable in the winning list may be small, or empty, in which case the soft pressure flusher will not be able to make much progress. There could be other memtable lists with non-empty (flushable) latest memtables. This can lead to writes unnecessarily blocking on dirty.

I observed this for the system memtable group, where it's easy for the memtables to overshoot small soft pressure limits. The flusher kept trying to flush empty memtables, while the previous non-empty memtable was still in the group.

The CPU scheduler makes this worse, because it runs memtable_to_cache in a separate scheduling group, so it further defers in time the removal of the flushed memtable from the memtable list.

This patch fixes the problem by making regions corresponding to memtables which started flushing report evictable_occupancy() as 0, so that they're picked by the flusher last.

Fixes #3716.

Message-Id: <1535040132-11153-2-git-send-email-tgrabiec@scylladb.com>
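
The selection problem described above can be reproduced with a small toy model. This is only an illustrative sketch, not the Scylla code: `toy_region`, `toy_memtable_list`, `make_list` and `pick_list_to_flush` are hypothetical names, sizes are plain integers, and the real flusher works on region groups rather than on a vector of lists. Only `evictable_occupancy()`, `ground_evictable_occupancy()` and the mask idea are taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the LSA region backing one memtable.
struct toy_region {
    std::size_t evictable_space = 0;
    // All ones by default, so the AND below passes the value through.
    std::size_t evictable_space_mask = std::numeric_limits<std::size_t>::max();

    // Branch-free: yields 0 once the mask has been cleared.
    std::size_t evictable_occupancy() const {
        return evictable_space & evictable_space_mask;
    }
    void ground_evictable_occupancy() { evictable_space_mask = 0; }
};

// Hypothetical stand-in for a memtable list. Only memtables.back()
// (the active memtable) can start flushing.
struct toy_memtable_list {
    std::string name;
    std::vector<toy_region> memtables;

    std::size_t largest_region() const {
        std::size_t best = 0;
        for (const toy_region& r : memtables) {
            best = std::max(best, r.evictable_occupancy());
        }
        return best;
    }

    // Models sealing: the old memtable stays in the list until it is
    // moved to cache, and a new empty memtable becomes the active one.
    void seal_active_memtable(bool ground_old) {
        if (ground_old) {
            memtables.back().ground_evictable_occupancy();
        }
        memtables.emplace_back();
    }
};

// Builds a list with a single active memtable of the given size.
toy_memtable_list make_list(std::string name, std::size_t active_size) {
    toy_memtable_list l;
    l.name = std::move(name);
    l.memtables.emplace_back();
    l.memtables.back().evictable_space = active_size;
    return l;
}

// Picks the list that contains the largest region, as the flusher does.
const toy_memtable_list* pick_list_to_flush(const std::vector<toy_memtable_list>& lists) {
    const toy_memtable_list* winner = nullptr;
    for (const toy_memtable_list& l : lists) {
        if (!winner || l.largest_region() > winner->largest_region()) {
            winner = &l;
        }
    }
    return winner;
}
```

In this model, sealing the larger list without grounding leaves it the winner even though its active memtable is empty, which is the stall described above. Sealing with grounding lets the other list, whose active memtable is still flushable, win instead.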
committed by Avi Kivity
parent 364418b5c5
commit 1e50f85288
@@ -969,6 +969,11 @@ table::seal_active_memtable(flush_permit&& permit) {
     }
     _memtables->add_memtable();
     _stats.memtable_switch_count++;
+    // This will set evictable occupancy of the old memtable region to zero, so that
+    // this region is considered last for flushing by dirty_memory_manager::flush_when_needed().
+    // If we don't do that, the flusher may keep picking up this memtable list for flushing after
+    // the permit is released even though there is not much to flush in the active memtable of this list.
+    old->region().ground_evictable_occupancy();
     auto previous_flush = _flush_barrier.advance_and_await();
     auto op = _flush_barrier.start();
 
@@ -1140,6 +1140,9 @@ private:
     // occupancy. We could actually just present this as a scalar as well and never use occupancies,
     // but consistency is good.
     size_t _evictable_space = 0;
+    // This is a mask applied to _evictable_space with bitwise-and before it's returned from evictable_space().
+    // Used for forcing the result to zero without using conditionals.
+    size_t _evictable_space_mask = std::numeric_limits<size_t>::max();
     bool _evictable = false;
     region_sanitizer _sanitizer;
     uint64_t _id;
@@ -1349,8 +1352,16 @@ public:
     }
 
     occupancy_stats evictable_occupancy() const {
-        return occupancy_stats(0, _evictable_space);
+        return occupancy_stats(0, _evictable_space & _evictable_space_mask);
     }
+
+    void ground_evictable_occupancy() {
+        _evictable_space_mask = 0;
+        if (_group) {
+            _group->decrease_evictable_usage(_heap_handle);
+        }
+    }
+
     //
     // Returns true if this region can be compacted and compact() will make forward progress,
     // so that this will eventually stop:
@@ -1739,6 +1750,10 @@ void region::make_evictable(eviction_fn fn) {
     get_impl().make_evictable(std::move(fn));
 }
 
+void region::ground_evictable_occupancy() {
+    get_impl().ground_evictable_occupancy();
+}
+
 const eviction_fn& region::evictor() const {
     return get_impl().evictor();
 }
@@ -295,8 +295,12 @@ public:
         update(delta);
     }
 
-    void decrease_usage(region_heap::handle_type& r_handle, ssize_t delta) {
+    void decrease_evictable_usage(region_heap::handle_type& r_handle) {
         _regions.decrease(r_handle);
+    }
+
+    void decrease_usage(region_heap::handle_type& r_handle, ssize_t delta) {
+        decrease_evictable_usage(r_handle);
         update(delta);
     }
 
@@ -621,6 +625,9 @@ public:
         return allocator().invalidate_counter();
     }
 
+    // Will cause subsequent calls to evictable_occupancy() to report empty occupancy.
+    void ground_evictable_occupancy();
+
     // Makes this region an evictable region. Supplied function will be called
     // when data from this region needs to be evicted in order to reclaim space.
     // The function should free some space from this region.
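
The diff also has ground_evictable_occupancy() notify the region group through decrease_evictable_usage(), not just clear the mask. A plausible reading is that the group orders its regions by evictable occupancy, so lowering one region's key has to be followed by a re-ordering step. The sketch below illustrates that idea with standard containers only. It is an assumption-laden toy: `toy_region`, `toy_region_group`, `by_evictable_occupancy` and `ground()` are hypothetical names, and a `std::multiset` with erase and re-insert stands in for the heap handle and `decrease()` call used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <set>

// Hypothetical region with the same mask trick as in the patch.
struct toy_region {
    int id = 0;
    std::size_t evictable_space = 0;
    std::size_t evictable_space_mask = std::numeric_limits<std::size_t>::max();

    std::size_t evictable_occupancy() const {
        return evictable_space & evictable_space_mask;
    }
    void ground_evictable_occupancy() { evictable_space_mask = 0; }
};

// Orders regions so that the largest evictable occupancy comes first.
struct by_evictable_occupancy {
    bool operator()(const toy_region* a, const toy_region* b) const {
        return a->evictable_occupancy() > b->evictable_occupancy();
    }
};

// Hypothetical group that keeps regions sorted by evictable occupancy.
struct toy_region_group {
    using region_set = std::multiset<toy_region*, by_evictable_occupancy>;
    using handle_type = region_set::iterator;

    region_set regions;

    handle_type add(toy_region& r) { return regions.insert(&r); }

    // The key must not change while the element sits in the ordered
    // container, so take it out, lower the key, and put it back.
    handle_type ground(handle_type h) {
        toy_region* r = *h;
        regions.erase(h);
        r->ground_evictable_occupancy();
        return regions.insert(r);
    }

    const toy_region* largest() const {
        return regions.empty() ? nullptr : *regions.begin();
    }
};
```

Unlike this sketch, the patch changes the key first and then calls decrease() on the existing handle. The erase and re-insert form is used here only because `std::multiset` offers no in-place key update.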