mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-25 11:00:35 +00:00
commitlog: fix total_size_on_disk accounting after segment file removal
Currently, segment file removal first calls `f.remove_file()` and does `total_size_on_disk -= f.known_size()` later. However, `remove_file()` resets `known_size` to 0, so in effect the freed space in not accounted for. `total_size_on_disk` is not just a metric. It is also responsible for deciding whether a segment should be recycled -- it is recycled only if `total_size_on_disk - known_size < max_disk_size`. Therefore this bug has dire performance consequences: if `total_size_on_disk - known_size` ever exceeds `max_disk_size`, the recycling of commitlog segments will stop permanently, because `total_size_on_disk - known_size` will never go back below `max_disk_size` due to the accounting bug. All new segments from this point will be allocated from scratch. The bug was uncovered by a QA performance test. It isn't easy to trigger -- it took the test 7 hours of constant high load to step into it. However, the fact that the effect is permanent, and degrades the performance of the cluster silently, makes the bug potentially quite severe. The bug can be easily spotted with Prometheus as infinitely rising `commitlog_total_size_on_disk` on the affected shards. Fixes #12645 Closes #12646
This commit is contained in:
committed by
Botond Dénes
parent
5d914adcef
commit
fa7e904cd6
@@ -2116,6 +2116,9 @@ future<> db::commitlog::segment_manager::do_pending_deletes() {
|
||||
clogger.debug("Discarding segments {}", ftd);
|
||||
|
||||
for (auto& [f, mode] : ftd) {
|
||||
// `f.remove_file()` resets known_size to 0, so remember the size here,
|
||||
// in order to subtract it from total_size_on_disk accurately.
|
||||
size_t size = f.known_size();
|
||||
try {
|
||||
if (f) {
|
||||
co_await f.close();
|
||||
@@ -2132,7 +2135,6 @@ future<> db::commitlog::segment_manager::do_pending_deletes() {
|
||||
}
|
||||
}
|
||||
|
||||
auto size = f.known_size();
|
||||
auto usage = totals.total_size_on_disk;
|
||||
auto next_usage = usage - size;
|
||||
|
||||
@@ -2165,7 +2167,7 @@ future<> db::commitlog::segment_manager::do_pending_deletes() {
|
||||
// or had such an exception that we consider the file dead
|
||||
// anyway. In either case we _remove_ the file size from
|
||||
// footprint, because it is no longer our problem.
|
||||
totals.total_size_on_disk -= f.known_size();
|
||||
totals.total_size_on_disk -= size;
|
||||
}
|
||||
|
||||
// #8376 - if we had an error in recycling (disk rename?), and no elements
|
||||
|
||||
Reference in New Issue
Block a user