scylladb/sstables at 17c66224f70ddf73d29a75286e76267e793b685f

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 04:26:48 +00:00

Raphael S. Carvalho 91260cf91b sstables: Fix Incremental Compaction Efficiency

Compaction prevents data resurrection by making sure that data shadowed by a
GC'able tombstone can never survive on its own, for example after a failure.

Consider the following scenario:
We have two runs, A and B, each divided into 5 fragments: A1..A5 and B1..B5.

They have the following token ranges:

 A:  A1=[0,3]    A2=[4,7]    A3=[8,11]    A4=[12,15]    A5=[16,18]

B has the same ranges as A, offset by 1:

 B:  B1=[1,4]    B2=[5,8]    B3=[9,12]    B4=[13,16]    B5=[17,19]

Say the compaction has flushed its output up to position 10.
We are currently working on A3 and B3, so those cannot be deleted.
Because B2 overlaps with A3, we cannot delete B2 either: B2 could hold a
GC'able tombstone that shadows data in A3, and once B2 is gone, dead data in
A3 could be resurrected *on failure*.
A2 overlaps with B2, which we could not delete, so we can't delete A2.
A2 in turn overlaps with B1, so we can't delete B1, and B1 overlaps with A1,
so we can't delete A1. In the end, no fragment can be deleted.
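
The dependency chain in this scenario can be reproduced with a toy model. This is only a sketch: `overlaps`, `pinned_fragments` and the `FRAGMENTS` table are hypothetical names made up for illustration, not Scylla code, and token ranges are treated as closed integer intervals.

```python
from collections import deque

# Fragment name -> (run, (first_token, last_token)); values taken from the
# scenario in the commit message. The table layout itself is an assumption.
FRAGMENTS = {
    "A1": ("A", (0, 3)),   "A2": ("A", (4, 7)),   "A3": ("A", (8, 11)),
    "A4": ("A", (12, 15)), "A5": ("A", (16, 18)),
    "B1": ("B", (1, 4)),   "B2": ("B", (5, 8)),   "B3": ("B", (9, 12)),
    "B4": ("B", (13, 16)), "B5": ("B", (17, 19)),
}

def overlaps(a, b):
    """Two closed ranges overlap if neither ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]

def pinned_fragments(fragments, position):
    """Return the set of fragments that cannot be released.

    A fragment is pinned if it is not fully consumed yet (its last token is
    at or past `position`), or if it overlaps a pinned fragment of another run.
    """
    pinned = {name for name, (_, rng) in fragments.items() if rng[1] >= position}
    queue = deque(pinned)
    while queue:
        cur_run, cur_rng = fragments[queue.popleft()]
        for name, (run, rng) in fragments.items():
            if name not in pinned and run != cur_run and overlaps(rng, cur_rng):
                pinned.add(name)
                queue.append(name)
    return pinned

pinned = pinned_fragments(FRAGMENTS, 10)
print(sorted(set(FRAGMENTS) - pinned))  # prints [] : nothing can be released
```

Starting from A3 and B3, the overlap at each fragment boundary carries the pin all the way back to A1, which is why every fragment ends up retained.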

The problem with this approach is clear: data dependencies between fragments
can prevent any of them from being released, so incremental compaction
efficiency is severely reduced.
To fix it, let's not purge GC'able tombstones right away in the mutation
compactor step. Instead, let's have compaction write them to a separate
sstable run that is deleted at the end of compaction.
Because tombstone information from all compacting sstables is no longer
lost, incremental compaction does not need to impose so many restrictions
on which fragments can be released. Now, any sstable whose data is safe in a
new sstable can be released right away. In addition, incremental compaction
will only take place if the compaction procedure is working with at least
one multi-fragment sstable run.
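
As a rough illustration of the relaxed rule, the sketch below applies it to the A/B scenario from this message. It assumes that "safe in a new sstable" can be modeled as "every token of the fragment lies below the flushed position"; that reading, and the name `releasable_fragments`, are assumptions of this sketch rather than anything taken from the Scylla source.

```python
# Fragment name -> (first_token, last_token), as in the scenario above.
FRAGMENTS = {
    "A1": (0, 3),  "A2": (4, 7),  "A3": (8, 11),  "A4": (12, 15), "A5": (16, 18),
    "B1": (1, 4),  "B2": (5, 8),  "B3": (9, 12),  "B4": (13, 16), "B5": (17, 19),
}

def releasable_fragments(fragments, position):
    """Return the fragments whose whole range precedes `position`.

    Overlap with other fragments is ignored on purpose: in this model the
    GC'able tombstones are preserved elsewhere until compaction finishes.
    """
    return sorted(name for name, (_, last) in fragments.items() if last < position)

print(releasable_fragments(FRAGMENTS, 10))  # prints ['A1', 'A2', 'B1', 'B2']
```

With the output flushed up to position 10, the four fully consumed fragments can go immediately, where the overlap-based rule retained all ten.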

Fixes #4531.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

2019-10-12 21:36:03 -03:00

sstables: writer: Validate that partition is closed when the input mutation stream ends

2019-08-02 11:13:54 +02:00

binary_search.hh

…

checksum_utils.hh

sstables: Use fast_crc_combine() in the default checksummer

2018-12-03 14:40:35 +01:00

column_name_helper.hh

Update seastar submodule

2018-11-21 00:01:44 +02:00

column_translation.hh

sstables: store column name in column_translation::column_info

2018-11-30 08:59:00 +01:00

compaction_backlog_manager.hh

backlog: add level to write progress monitor

2018-05-31 21:09:38 -04:00

compaction_manager.cc

sstables/compaction_manager: Don't perform upgrade on shared SSTables

2019-09-25 11:18:40 +03:00

compaction_manager.hh

sstables: compaction_manager: #include seastarx.hh

2019-09-23 16:12:49 +02:00

compaction_strategy_impl.hh

sstables/compaction_manager: Fix logic for filtering out partial sstable runs

2019-08-08 14:11:35 +03:00

compaction_strategy.cc

sstables/compaction_manager: Fix logic for filtering out partial sstable runs

2019-08-08 14:11:35 +03:00

compaction_weight_registration.hh

database: rename column_family to table

2018-06-24 14:54:46 +03:00

compaction.cc

sstables: Fix Incremental Compaction Efficiency

2019-10-12 21:36:03 -03:00

compaction.hh

compaction: introduce constants for compaction descriptor

2019-07-15 23:39:44 -03:00

component_type.hh

sstable::component_type: add operator<<

2018-04-24 11:30:26 +02:00

compress.cc

Merge "Implement sstable_info API command (info on sstables)" from Calle

2019-08-12 21:16:08 +03:00

compress.hh

Merge "Implement sstable_info API command (info on sstables)" from Calle

2019-08-12 21:16:08 +03:00

consumer.hh

sstable: pass full length of buffer to vint deserialiser

2019-03-14 13:37:06 +00:00

data_consume_context.hh

treewide: silence discarded future warnings for questionable discards

2019-08-26 19:28:43 +03:00

date_tiered_compaction_strategy.hh

Replace std::experimental types with C++17 std version.

2019-01-08 13:16:36 +02:00

disk_types.hh

atomic_cell: introduce fragmented buffer value interface

2018-05-31 15:51:11 +01:00

downsampling.hh

…

exceptions.hh

sstables: convert sprint() to format()

2018-11-01 13:16:17 +00:00

filter.hh

…

hyperloglog.hh

…

index_entry.hh

sstables: Refactor predicates on bound_kind_m

2019-01-02 17:50:44 -08:00

index_reader.hh

sstable: index_reader: close index_reader::reader more robustly

2019-07-26 14:26:04 +02:00

integrity_checked_file_impl.cc

Switch to the the CMake-ified Seastar

2019-01-30 11:17:38 +02:00

integrity_checked_file_impl.hh

…

key.hh

Update seastar submodule

2018-11-21 00:01:44 +02:00

leveled_compaction_strategy.cc

sstables/LCS: Fix increased write amplification due to incorrect SSTable demotion

2019-09-22 10:46:38 +03:00

leveled_compaction_strategy.hh

sstables: Move leveled_compaction_strategy implementation to source file

2019-08-26 16:49:48 +08:00

leveled_manifest.hh

sstables/LCS: Fix increased write amplification due to incorrect SSTable demotion

2019-09-22 10:46:38 +03:00

liveness_info.hh

sstables: mc: use proper gc_clock types for local_deletion_time and ttl

2019-01-22 15:34:32 +02:00

m_format_read_helpers.cc

vint: drop deserialize_type structure

2019-03-14 13:37:06 +00:00

m_format_read_helpers.hh

gc_clock: make 64 bit

2019-01-22 15:34:32 +02:00

metadata_collector.hh

sstables/metadata_collector: move the default values to the global tracker

2019-02-07 10:16:50 +00:00

mp_row_consumer.cc

sstables: Remove unused variables in make_counter_cell

2019-05-21 12:07:31 +02:00

mp_row_consumer.hh

sstable: mc: reader: Do not stop parsing across partitions

2019-06-19 14:29:02 +02:00

mutation_fragment_filter.hh

sstables: mutation_fragment_filter: Drop unnecessary calls to _walker.out_of_range()

2019-06-19 14:29:02 +02:00

partition.cc

treewide: silence discarded future warnings for questionable discards

2019-08-26 19:28:43 +03:00

prepended_input_stream.cc

…

prepended_input_stream.hh

…

progress_monitor.hh

…

random_access_reader.hh

treewide: silence discarded future warnings for legit discards

2019-08-26 18:54:44 +03:00

remove.hh

…

row.hh

sstable: mc: reader: Do not stop parsing across partitions

2019-06-19 14:29:02 +02:00

segmented_compress_params.hh

…

shareable_components.hh

sstables: move shareable_components def to its own header

2019-03-26 16:05:08 +02:00

shared_index_lists.hh

sstables: Switch index_list to chunked_vector to avoid large allocations

2018-07-11 16:55:20 +02:00

shared_sstable.hh

…

size_tiered_backlog_tracker.hh

STCS_backlog: allow users to query for the total bytes managed

2018-05-22 13:40:15 -04:00

size_tiered_compaction_strategy.hh

sstables/size_tiered_compaction_strategy.hh: make self-sufficient

2019-05-14 13:27:30 +03:00

sstable_set.hh

sstables: Include dht/i_partitioner.hh for dht::partition_range

2019-08-26 16:35:18 +08:00

sstable_version_k_l.hh

sstable: Make component_map version dependent

2018-04-24 11:30:26 +02:00

sstable_version_m.hh

sstable: Make component_map version dependent

2018-04-24 11:30:26 +02:00

sstable_version.cc

sstable: Make component_map version dependent

2018-04-24 11:30:26 +02:00

sstable_version.hh

sstable: Make component_map version dependent

2018-04-24 11:30:26 +02:00

sstables_manager.cc

sstables: provide large_data_handler to constructor

2019-03-26 16:24:19 +02:00

sstables_manager.hh

sstables: provide large_data_handler to constructor

2019-03-26 16:24:19 +02:00

sstables.cc

sstables: Fix partition key count estimation for a range

2019-09-28 19:36:43 +03:00

sstables.hh

sstables: Fix partition key count estimation for a range

2019-09-28 19:36:43 +03:00

stats.hh

sstables: add capped_tombstone_deletion_time stats counter

2019-01-22 15:34:32 +02:00

time_window_compaction_strategy.hh

TWCS: implement add_interposer_consumer()

2019-06-26 18:45:36 +03:00

types.hh

sstables: Add a feature for empty counters in Scylla.db.

2019-05-23 10:10:24 +02:00

version.hh

Add sstable format helper methods

2019-04-12 09:33:40 +02:00

writer_impl.hh

sstables: Extract sstable_writer_impl to a header

2018-12-12 12:07:31 +01:00

writer.hh

sstables: checksummed_file_writer: fix dma alignment

2019-02-21 21:26:56 +01:00