scylladb/cql3
Avi Kivity c80dc57156 Merge 'batchlog replay: bypass tombstones generated by past replays' from Botond Dénes
The `system.batchlog` table has a partition for each batch that failed to complete. Once the batch is finally applied, its partition is deleted. Although the table has gc_grace_seconds = 0, tombstones can still accumulate in memory, because we don't purge partition tombstones from either the memtable or the cache. As a result, the cache and memtable of this table can accumulate many thousands or even millions of tombstones, making batchlog replay very slow. We didn't notice this before, because we would only replay all failed batches on unbootstrap, which is rare and already a heavy and slow operation in its own right.
With repair-based tombstone-gc, however, we do a full batchlog replay at the beginning of each repair, so this extra delay is now noticeable.
Fix this by making sure batchlog replays don't have to scan through all the tombstones generated by previous replays:
* flush the `system.batchlog` memtable at the end of each batchlog replay, so it is cleared of tombstones
* bypass the cache
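
To illustrate why the flush matters, here is a minimal toy model, not ScyllaDB code: `toy_batchlog`, `entry`, and `last_scan_cost` are hypothetical names, the memtable is reduced to a vector of partitions, and the cache and sstables are not modelled at all. It only shows the scan cost of a replay growing with the tombstones left behind by earlier replays, and staying flat when each replay ends with a flush.

```cpp
// Toy model only: none of these types exist in ScyllaDB.
#include <cassert>
#include <cstddef>
#include <vector>

// One partition of the modelled batchlog table, as held in the memtable.
struct entry {
    bool tombstone;  // true once the batch was replayed and its partition deleted
};

struct toy_batchlog {
    std::vector<entry> memtable;

    // A batch failed to complete: it gets a live partition.
    void add_failed_batch() { memtable.push_back(entry{false}); }

    // Models a memtable flush: everything, tombstones included, leaves memory.
    void flush() { memtable.clear(); }

    // Replays every live batch and returns how many entries the scan visited.
    std::size_t replay(bool flush_after) {
        std::size_t scanned = 0;
        for (entry& e : memtable) {
            ++scanned;
            if (!e.tombstone) {
                e.tombstone = true;  // apply the batch, then delete its partition
            }
        }
        if (flush_after) {
            flush();
        }
        return scanned;
    }
};

// Runs `rounds` replays, adding `batches_per_round` failed batches before each
// one, and returns the scan cost of the final replay.
std::size_t last_scan_cost(bool flush_after, int rounds, int batches_per_round) {
    assert(rounds > 0 && batches_per_round >= 0);
    toy_batchlog log;
    std::size_t cost = 0;
    for (int r = 0; r < rounds; ++r) {
        for (int b = 0; b < batches_per_round; ++b) {
            log.add_failed_batch();
        }
        cost = log.replay(flush_after);
    }
    return cost;
}
```

In this model, with 10 failed batches before each of 100 replays, the last replay scans 1000 entries without the flush and 10 entries with it.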

Fixes: https://github.com/scylladb/scylladb/issues/19376

Although this is not a regression (replay has always worked this way), now that repair calls into batchlog replay, every release that uses repair-based tombstone-gc should get this fix.

Closes scylladb/scylladb#19377

* github.com:scylladb/scylladb:
  db/batchlog_manager: bypass cache when scanning batchlog table
  db/batchlog_manager: replace open-coded paging with internal one
  db/batchlog_manager: implement cleanup after all batchlog replay
  cql3/query_processor: for_each_cql_result(): move func to the coro frame
2024-06-25 16:11:01 +03:00