mirror of
https://github.com/scylladb/scylladb.git
synced 2026-05-12 19:02:12 +00:00
repair/row_level: remove reader timeout
This timeout was added to catch reader-related deadlocks. We have not seen such deadlocks for a long time, but we did see false timeouts caused by it (see the explanation below). Since the cost now outweighs the benefit, remove the timeout altogether.

The false timeout happens during mixed-shard repair. `reader_permit::set_timeout()` is called on the top-level permit, which is the one repair has a handle on. In mixed-shard repair, this permit belongs to the multishard reader. Calling `set_timeout()` on the multishard reader has no effect on the actual shard readers, except in one case: when a shard reader is created, it inherits the multishard reader's current timeout. As a shard reader can stay alive for a long time, this inherited timeout is never refreshed; it eventually expires and fails the repair.

Refs: #18269
Closes scylladb/scylladb#20703
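The failure mode described above can be illustrated with a toy model. This is only a sketch: `toy_permit`, `toy_shard_reader`, `toy_multishard_reader`, and `stale_shard_deadline_demo()` are hypothetical stand-ins invented for illustration, not ScyllaDB's classes, and the only behaviour modelled is the one the commit message states: a shard reader copies the top-level deadline once at creation and never sees later refreshes.

```cpp
#include <chrono>
#include <optional>

using toy_clock = std::chrono::steady_clock;
// std::nullopt models "no timeout" (an assumption of this sketch).
using toy_deadline = std::optional<toy_clock::time_point>;

// Hypothetical stand-in for a reader permit: it only stores a deadline.
struct toy_permit {
    toy_deadline deadline;
    void set_timeout(toy_deadline d) { deadline = d; }
    bool expired(toy_clock::time_point now) const {
        return deadline && now >= *deadline;
    }
};

// Hypothetical shard reader: copies the parent's deadline once, at creation.
struct toy_shard_reader {
    toy_permit permit;
    explicit toy_shard_reader(const toy_permit& parent)
        : permit{parent.deadline} {}
};

// Hypothetical multishard reader: owns the top-level permit.
struct toy_multishard_reader {
    toy_permit permit;
    toy_shard_reader make_shard_reader() const {
        return toy_shard_reader(permit);
    }
};

// Returns true when the shard reader's inherited deadline has expired
// even though the top-level deadline was refreshed in the meantime.
inline bool stale_shard_deadline_demo() {
    const auto t0 = toy_clock::time_point{};
    const auto window = std::chrono::minutes(30);

    toy_multishard_reader top;
    top.permit.set_timeout(t0 + window);              // first read: deadline t0 + 30 min
    toy_shard_reader shard = top.make_shard_reader(); // inherits t0 + 30 min

    const auto later = t0 + std::chrono::minutes(45);
    top.permit.set_timeout(later + window);           // refresh reaches only the top level

    return !top.permit.expired(later) && shard.permit.expired(later);
}
```

In this model the top-level permit is always fresh, so the expiry on the shard permit is a false timeout, not a sign of a deadlock.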
committed by
Kamil Braun
parent
e67016540c
commit
3ebb124eb2
@@ -349,11 +349,6 @@ repair_reader::repair_reader(
 future<mutation_fragment_opt>
 repair_reader::read_mutation_fragment() {
     ++_reads_issued;
-    // Use a very long timeout for the reader to break out any eventual
-    // deadlock within the reader. Thirty minutes should be more than
-    // enough to read a single mutation fragment.
-    auto timeout = db::timeout_clock::now() + std::chrono::minutes(30);
-    _reader.set_timeout(timeout); // reset to db::no_timeout in pause()
     return _reader().then_wrapped([this] (future<mutation_fragment_opt> f) {
         try {
             auto mfopt = f.get();
@@ -397,7 +392,6 @@ void repair_reader::check_current_dk() {
 }
 
 void repair_reader::pause() {
-    _reader.set_timeout(db::no_timeout);
     if (_reader_handle) {
         _reader_handle->pause();
     }