needs_cleanup() returns true if a sstable needs cleanup.
Turns out it's very slow because it iterates through all the local
ranges for all sstables in the set, making its complexity:
O(num_sstables * local_ranges)
We can optimize it by taking into account that abstract_replication_strategy
documents that get_ranges() will return a list of ranges that is sorted
and non-overlapping. Compaction for cleanup already takes advantage of that
when checking if a given partition can be actually purged.
So needs_cleanup() can be optimized into O(num_sstables * log(local_ranges)).
With num_sstables=1000, RF=3, then local_ranges=256(num_tokens)*3, it means
the max # of checks performed will go from 768000 to ~9584.
Fixes#6730.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200629171355.45118-2-raphaelsc@scylladb.com>