scylladb/compaction
Raphael S. Carvalho da04fea71e compaction: Fix key estimation per sstable to produce efficient filters
When estimating the number of partitions for each output sstable, the
estimation takes the size of all sstable components into account.
However, sstables are split according to the data file size, so the
size of the other files is irrelevant to the estimation.

With certain data models, such as single-row partitions containing small
values, the index can be even larger than the data.
For example, if the index is as large as the data, the estimation
predicts twice as many sstables as will actually be generated, and as a
result each sstable is estimated to hold half as many keys as it will.

Fix it by accounting only for the size of the data file.
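
The arithmetic behind the example can be sketched as follows. This is an
illustration only, not the ScyllaDB code: the function name, its
parameters, and the sizes used are all hypothetical, and it assumes the
estimate is simply the total key count divided by the expected number of
output sstables.

```python
import math


def estimate_keys_per_sstable(total_keys, size_used_for_estimate, max_sstable_data_size):
    """Hypothetical estimator: keys per output sstable.

    The expected number of output sstables is derived from the size passed
    in, so passing a size larger than the data file size inflates the
    sstable count and deflates the per-sstable key estimate.
    """
    estimated_sstables = max(1, math.ceil(size_used_for_estimate / max_sstable_data_size))
    return math.ceil(total_keys / estimated_sstables)


# Illustrative numbers (sizes in MB); the index is as large as the data.
total_keys = 1_000_000
data_size = 1000
index_size = 1000
max_sstable_data_size = 100  # output is split on data file size

# Before the fix: index size is counted too, so 20 sstables are expected.
keys_before_fix = estimate_keys_per_sstable(
    total_keys, data_size + index_size, max_sstable_data_size)

# After the fix: only the data file size is counted, so 10 sstables are expected.
keys_after_fix = estimate_keys_per_sstable(
    total_keys, data_size, max_sstable_data_size)

print(keys_before_fix, keys_after_fix)
```

With these numbers the output is actually split into 10 sstables, so the
estimate made from data size alone matches the real key count per sstable,
while the estimate that also counts the index is half of it.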

Fixes #15726.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15727
2023-10-17 11:21:11 +03:00