mirror of
https://github.com/scylladb/scylladb.git
synced 2026-06-05 06:23:03 +00:00
The SSTable summary is one of the components that is fully loaded into memory, and it may have a significant footprint. This series reduces the summary footprint by reducing the amount of token information that we need to keep in memory for each summary entry.

The benefit of this size optimization is proportional to the number of summary entries, which in turn is proportional to the number of partitions in an SSTable. The tables that benefit the most are therefore those with a very large number of small partitions, since those produce big summaries.

Results:

```
BEFORE

[1000000 pkeys] data size: 4035888890, summary -> memory footprint: 5843232, entries: 88158
[10000000 pkeys] data size: 40368888890, summary -> memory footprint: 55787128, entries: 844925

AFTER

[1000000 pkeys] data size: 4035888890, summary -> memory footprint: 4351536, entries: 88158
[10000000 pkeys] data size: 40368888890, summary -> memory footprint: 42211984, entries: 844925
```

That is roughly a 25% reduction in summary footprint for both the 1 million and the 10 million pkey cases.

Closes #13447

* github.com:scylladb/scylladb:
  sstables: Store raw token into summary entries
  sstables: Don't store token data into summary's memory pool
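The reduction quoted in the results can be re-derived from the reported figures. The snippet below is only an arithmetic sketch, not code from the series: the names `summary_measurement`, `reduction_percent` and `bytes_saved_per_entry` are made up for illustration, and the numbers are copied verbatim from the BEFORE/AFTER output.

```cpp
#include <cstdint>

// Hypothetical helper type: one line of the BEFORE/AFTER output.
struct summary_measurement {
    std::uint64_t footprint_bytes; // "summary -> memory footprint"
    std::uint64_t entries;         // "entries"
};

// Figures copied from the results in the commit message.
constexpr summary_measurement before_1m{5843232, 88158};
constexpr summary_measurement after_1m{4351536, 88158};
constexpr summary_measurement before_10m{55787128, 844925};
constexpr summary_measurement after_10m{42211984, 844925};

// Percentage by which the summary footprint shrank between two runs.
constexpr double reduction_percent(summary_measurement before,
                                   summary_measurement after) {
    return 100.0 *
           static_cast<double>(before.footprint_bytes - after.footprint_bytes) /
           static_cast<double>(before.footprint_bytes);
}

// Average number of bytes saved per summary entry.
// Assumes both runs report the same number of entries, as they do here.
constexpr double bytes_saved_per_entry(summary_measurement before,
                                       summary_measurement after) {
    return static_cast<double>(before.footprint_bytes - after.footprint_bytes) /
           static_cast<double>(before.entries);
}
```

With these inputs, `reduction_percent` comes out at about 25.5% for the 1 million pkey run and about 24.3% for the 10 million pkey run, and `bytes_saved_per_entry` at roughly 16 to 17 bytes. The per-entry figure is an inference from the reported totals, not something the series states directly.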