mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-22 09:30:45 +00:00
STCS currently treats the smallest bucket, among those containing more than min_threshold sstables, as the most interesting one to compact. In effect, larger tiers are compacted only once the smaller ones are done.

That is problematic under heavy load: larger tiers cannot be compacted in a timely manner even though they contribute the most to read amplification. For example, if sstables are produced in the smaller tiers at roughly the same rate at which they can be compacted, larger tiers may not be compacted at all even though new sstables keep being pushed to them. The backlog is then not reduced satisfactorily, and read latency suffers.

By picking the bucket with the largest fan-in instead, we choose the most efficient compaction, as we target the buckets that remove the most backlog once compacted.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211103215110.135633-1-raphaelsc@scylladb.com>
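
The difference between the two selection policies can be illustrated with a minimal sketch. This is not the actual ScyllaDB code: the bucket type, the function names, and the tie-break are hypothetical. It also assumes that "smallest bucket" means the tier with the lowest average sstable size, and that a bucket is eligible once it holds at least min_threshold sstables.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical model: a bucket is just the list of its sstable sizes in bytes.
using bucket = std::vector<std::uint64_t>;

static std::uint64_t average_size(const bucket& b) {
    return b.empty() ? 0 : std::accumulate(b.begin(), b.end(), std::uint64_t{0}) / b.size();
}

// Assumption: a bucket is eligible once it holds at least min_threshold sstables.
static bool eligible(const bucket& b, std::size_t min_threshold) {
    return b.size() >= min_threshold;
}

// Previous policy: the eligible bucket of the smallest tier (lowest average size).
// Returns nullptr when no bucket is eligible.
const bucket* pick_smallest_tier(const std::vector<bucket>& buckets, std::size_t min_threshold) {
    const bucket* best = nullptr;
    for (const auto& b : buckets) {
        if (eligible(b, min_threshold) && (!best || average_size(b) < average_size(*best))) {
            best = &b;
        }
    }
    return best;
}

// New policy: the eligible bucket with the largest fan-in (most sstables).
// Ties go to the smaller tier; that tie-break is an assumption of this sketch.
const bucket* pick_largest_fan_in(const std::vector<bucket>& buckets, std::size_t min_threshold) {
    const bucket* best = nullptr;
    for (const auto& b : buckets) {
        if (!eligible(b, min_threshold)) {
            continue;
        }
        if (!best || b.size() > best->size()
                || (b.size() == best->size() && average_size(b) < average_size(*best))) {
            best = &b;
        }
    }
    return best;
}
```

With a small tier holding 4 sstables and a large tier holding 8, the first function keeps returning the small tier for as long as it stays eligible, while the second returns the large tier, whose compaction removes more sstables from the backlog in one pass.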