scylladb/tools
Botond Dénes f453b5bfa3 Merge '[Backport 2025.1] sstables: Fix quadratic space complexity in partitioned_sstable_set' from Scylladb[bot]
An interval map is highly susceptible to quadratic space usage when it is flooded with many entries that overlap all (or most) of its intervals, since each such entry is present on every interval it overlaps.
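The blow-up can be illustrated with a small, simplified model. This is only a sketch and not the interval map used by the sstable set: it assumes closed integer intervals as stand-ins for token ranges, cuts the range into elementary segments at every interval endpoint, and counts one reference per (segment, entry) pair. The names `token_interval`, `count_references` and `make_wide_entries` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for an sstable's token range: closed interval [first, second].
using token_interval = std::pair<int64_t, int64_t>;

// Counts the (segment, entry) references a split-on-overlap interval map
// would hold. Segments are delimited by every interval endpoint, and an
// entry is referenced once per segment it covers.
size_t count_references(const std::vector<token_interval>& entries) {
    std::set<int64_t> cuts;
    for (const auto& e : entries) {
        assert(e.first <= e.second);
        cuts.insert(e.first);
        cuts.insert(e.second + 1);
    }
    std::vector<int64_t> bounds(cuts.begin(), cuts.end());
    size_t refs = 0;
    for (size_t i = 0; i + 1 < bounds.size(); ++i) {
        const int64_t seg_first = bounds[i];
        const int64_t seg_last = bounds[i + 1] - 1;
        for (const auto& e : entries) {
            // Segments are elementary, so an entry either covers one fully or not at all.
            if (e.first <= seg_first && seg_last <= e.second) {
                ++refs;
            }
        }
    }
    return refs;
}

// Builds n nested intervals, each covering almost all of [0, 1000000],
// mimicking many small sstables that span nearly the whole token range.
std::vector<token_interval> make_wide_entries(int64_t n) {
    assert(n >= 0 && n <= 500000);
    std::vector<token_interval> v;
    for (int64_t i = 0; i < n; ++i) {
        v.emplace_back(i, 1000000 - i);
    }
    return v;
}
```

In this model, `n` nested wide entries yield `n * n` references (for example 10000 for `n = 100`), whereas `n` disjoint entries yield only `n`.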

One trigger we observed was a memtable flush storm, which creates many small "L0" sstables that each span roughly the entire token range.

Since we cannot rely on insertion order, the solution is to store sstables with such wide ranges in a separate (unleveled) vector instead of the interval map.
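A minimal sketch of that routing decision follows. It is not the code from this series: the closed integer `token_range`, the `sstable_stub` type, the `partitioned_set_sketch` class, the signature of `overlap_ratio()` and the threshold value are all assumptions made for illustration, and a plain vector stands in for the interval map.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical closed token range [first, last].
struct token_range {
    int64_t first;
    int64_t last;
};

// Fraction of `owner` that is covered by `r`, in [0, 1].
// Sizes are computed in double to avoid integer overflow on very wide ranges.
double overlap_ratio(const token_range& owner, const token_range& r) {
    assert(owner.first <= owner.last && r.first <= r.last);
    const int64_t lo = std::max(owner.first, r.first);
    const int64_t hi = std::min(owner.last, r.last);
    if (lo > hi) {
        return 0.0;
    }
    const double covered = static_cast<double>(hi) - static_cast<double>(lo) + 1.0;
    const double total = static_cast<double>(owner.last) - static_cast<double>(owner.first) + 1.0;
    return covered / total;
}

// Hypothetical sstable descriptor: an id plus the token range it spans.
struct sstable_stub {
    int id;
    token_range range;
};

// Sketch of a set that keeps wide sstables out of the interval map.
class partitioned_set_sketch {
    token_range _owner;                    // range spanned by the compaction group
    double _threshold;                     // assumed cut-off, not taken from the patch
    std::vector<sstable_stub> _unleveled;  // wide sstables, stored flat
    std::vector<sstable_stub> _leveled;    // stand-in for the interval map
public:
    partitioned_set_sketch(token_range owner, double threshold)
        : _owner(owner), _threshold(threshold) {}

    void insert(const sstable_stub& sst) {
        if (overlap_ratio(_owner, sst.range) >= _threshold) {
            _unleveled.push_back(sst);
        } else {
            _leveled.push_back(sst);
        }
    }

    size_t unleveled_count() const { return _unleveled.size(); }
    size_t leveled_count() const { return _leveled.size(); }
};
```

Because the decision depends only on each sstable's own range relative to the owner range, it does not depend on insertion order.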

There should be no consequence for single-key reads, since the upper layer applies additional filtering based on the token of the key being queried.
For range scans, memory usage can increase, but not significantly, because these sstables span a wide range and would have been selected by the combined reader anyway whenever the scan range overlaps them.
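The read-side reasoning can be sketched as follows, under the assumption that the unleveled sstables are always offered as candidates and narrowed afterwards. The types and the helpers `filter_for_key` and `select_for_scan` are hypothetical and do not mirror the actual reader or selector code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical closed token range [first, last].
struct token_range {
    int64_t first;
    int64_t last;
};

// Hypothetical sstable descriptor: an id plus the token range it spans.
struct sstable_stub {
    int id;
    token_range range;
};

bool contains(const token_range& r, int64_t token) {
    return r.first <= token && token <= r.last;
}

bool overlaps(const token_range& a, const token_range& b) {
    return a.first <= b.last && b.first <= a.last;
}

// Single-key read: the upper layer drops candidates whose range does not
// contain the queried token, so extra unleveled candidates cost nothing more.
std::vector<int> filter_for_key(const std::vector<sstable_stub>& candidates, int64_t token) {
    std::vector<int> ids;
    for (const auto& sst : candidates) {
        if (contains(sst.range, token)) {
            ids.push_back(sst.id);
        }
    }
    return ids;
}

// Range scan: a wide sstable is read only if the scan range overlaps it,
// which is also when an interval-map lookup would have returned it.
std::vector<int> select_for_scan(const std::vector<sstable_stub>& unleveled, const token_range& scan) {
    std::vector<int> ids;
    for (const auto& sst : unleveled) {
        if (overlaps(sst.range, scan)) {
            ids.push_back(sst.id);
        }
    }
    return ids;
}
```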

In any case, this is a protection against a storm of memtable flushes, which shouldn't be the common scenario.

It works with both tablets and vnodes, by adjusting the token range spanned by the compaction group accordingly.

Fixes #23634.

We can backport this to 2024.2 and 2025.1, but we should let it cook in master for a month or so.

- (cherry picked from commit 494ed6b887)

- (cherry picked from commit 59dad2121f)

- (cherry picked from commit 21d1e78457)

- (cherry picked from commit c77f710a0c)

- (cherry picked from commit d5bee4c814)

Parent PR: #23806

Closes scylladb/scylladb#24012

* github.com:scylladb/scylladb:
  test: Verify partitioned set store split and unsplit correctly
  sstables: Fix quadratic space complexity in partitioned_sstable_set
  compaction: Wire table_state into make_sstable_set()
  compaction: Introduce token_range() to table_state
  dht: Add overlap_ratio() for token range
2025-08-06 09:56:43 +03:00