scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 15:33:15 +00:00

Files

Raphael S. Carvalho 2d716f3ffe replica: Fix truncate assert failure

Truncate doesn't really go well with concurrent writes. The fix (#23560) exposed
a preexisting fragility which I missed.

1) truncate gets RP mark X, truncated_at = second T
2) new sstable written during snapshot or later, also at second T (difference of MS)
3) discard_sstables() get RP Y > saved RP X, since creation time of sstable
with RP Y is equal to truncated_at = second T.

So the problem is that truncate is using a clock of second granularity for
filtering out sstables written later, and after we got low mark and truncate time,
it can happen that a sstable is flushed later within the same second, but at a
different millisecond.
By switching to a millisecond clock (db_clock), we allow sstables written later
within the same second from being filtered out. It's not perfect but
extremely unlikely a new write lands and get flushed in the same
millisecond we recorded truncated_at timepoint. In practice, truncate
will not be used concurrently to writes, so this should be enough for
our tests performing such concurrent actions.
We're moving away from gc_clock which is our cheap lowres_clock, but
time is only retrieved when creating sstable objects, which frequency of
creation is low enough for not having significant consequences, and also
db_clock should be cheap enough since it's usually syscall-less.

Fixes #23771.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#24426

2025-06-08 15:59:15 +03:00

CMakeLists.txt

…

compaction_group.hh

replica: Fix take_storage_snapshot() running concurrently to merge completion

2025-05-09 14:07:06 +03:00

data_dictionary_impl.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

database_fwd.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00