mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-20 00:20:47 +00:00
When encrypted_data_source::get() caches a trailing block in _next, the next call takes it directly — bypassing input_stream::read(), which checks _eof. It then calls input_stream::read_exactly() on the already-drained stream. Unlike read(), read_up_to(), and consume(), read_exactly() does not check _eof when the buffer is empty, so it calls _fd.get() on a source that already returned EOS.
In production this manifested as stuck encrypted SSTable component downloads during tablet restore: the underlying chunked_download_source hung forever on the post-EOS get(), causing 4 tablets to never complete. The stuck files were always block-aligned sizes (8k, 12k) where _next gets populated and the source is fully consumed in the same call.
Fix by checking _input.eof() before calling read_exactly(). When the stream already reached EOF, buf2 is known to be empty, so the call is skipped entirely.
A comprehensive test is added that uses a strict_memory_source which fails on post-EOS get(), reproducing the exact code path that caused the production deadlock.
Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1128
Backport to 2025.3/4 and 2026.1 is needed since it fixes a bug that may bite us in production, to be on the safe side
Closes scylladb/scylladb#29110
* github.com:scylladb/scylladb:
encryption: fix deadlock in encrypted_data_source::get()
test_lib: mark `limiting_data_source_impl` as not `final`
Fix formatting after previous patch
Fix indentation after previous patch
test_lib: make limiting_data_source_impl available to tests
(cherry picked from commit 3b9398dfc8)
Closes scylladb/scylladb#29198
50 lines
1.5 KiB
C++
50 lines
1.5 KiB
C++
/*
|
|
* Copyright (C) 2017-present ScyllaDB
|
|
*/
|
|
|
|
/*
|
|
* SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
|
|
*/
|
|
|
|
#include "limiting_data_source.hh"
|
|
|
|
using namespace seastar;
|
|
|
|
future<temporary_buffer<char>> limiting_data_source_impl::do_get() {
|
|
uint64_t size = std::min(_limit_generator(), _buf.size());
|
|
auto res = _buf.share(0, size);
|
|
_buf.trim_front(size);
|
|
return make_ready_future<temporary_buffer<char>>(std::move(res));
|
|
}
|
|
|
|
limiting_data_source_impl::limiting_data_source_impl(data_source&& src, seastar::noncopyable_function<size_t()>&& limit_generator) : _src(std::move(src)), _limit_generator(std::move(limit_generator)) {
|
|
}
|
|
|
|
future<temporary_buffer<char>> limiting_data_source_impl::get() {
|
|
if (_buf.empty()) {
|
|
_buf.release();
|
|
return _src.get().then([this](auto&& buf) {
|
|
_buf = std::move(buf);
|
|
return do_get();
|
|
});
|
|
}
|
|
return do_get();
|
|
}
|
|
|
|
future<temporary_buffer<char>> limiting_data_source_impl::skip(uint64_t n) {
|
|
if (n < _buf.size()) {
|
|
_buf.trim_front(n);
|
|
return do_get();
|
|
}
|
|
n -= _buf.size();
|
|
_buf.release();
|
|
return _src.skip(n).then([this](auto&& buf) {
|
|
_buf = std::move(buf);
|
|
return do_get();
|
|
});
|
|
}
|
|
|
|
data_source make_limiting_data_source(data_source&& src, seastar::noncopyable_function<size_t()>&& limit_generator) {
|
|
return data_source{std::make_unique<limiting_data_source_impl>(std::move(src), std::move(limit_generator))};
|
|
}
|