Currently, parsers work with temporary_buffer<char>. This is unsafe when they are invoked by bsearch_clustered_cursor, which reuses some of the parsers and passes a temporary_buffer<char> that is a view onto an LSA buffer coming from the index file page cache. This view is stable only around consume(). If parsing requires more than one page, it will continue with a different input buffer. The old buffer will be invalid, and it's unsafe for the parser to store and access it.

Unfortunately, the temporary_buffer API allows sharing the buffer via the share() method, which shares the underlying memory area. This is not correct when the underlying memory is managed by LSA, because the storage may move. The parser uses this sharing when parsing blobs, e.g. clustering key components. When parsing resumes in the next page, the parser will try to access the stored shared buffers pointing to the previous page, which may result in a use-after-free on the memory area.

In preparation for fixing the problem, parametrize parsers to work with different kinds of buffers. This will allow us to instantiate them with a buffer kind which supports safe sharing of LSA buffers.

This is not purely mechanical work. Some parts of the parsing state machine still work with temporary_buffer<char> and allocate buffers internally when reading into a linearized destination buffer. They used to store this destination in the _read_bytes vector, the same field which is used to store the shared buffers. That is no longer possible, since the shared buffer type may differ from temporary_buffer<char>. So those paths were changed to use a new field: _read_bytes_buf.
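The split between the two fields can be illustrated with a simplified sketch. It is not the actual parser code: owning_fragment, blob_parser and parse_across_pages are hypothetical names, and only the field names _read_bytes and _read_bytes_buf come from the description above. The types chosen for them here (a std::vector of the buffer kind and a std::string) are assumptions made to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical buffer kind. Shared ownership keeps the bytes alive after the
// producer drops its own reference; it stands in for an LSA-aware buffer type.
class owning_fragment {
    std::shared_ptr<const std::string> _data;
    std::size_t _pos = 0;
    std::size_t _len = 0;
public:
    owning_fragment() = default;
    explicit owning_fragment(std::string_view s)
        : _data(std::make_shared<const std::string>(s)), _len(s.size()) {}
    // Returns a fragment covering [pos, pos + len) that co-owns the storage.
    owning_fragment share(std::size_t pos, std::size_t len) const {
        assert(pos <= _len && len <= _len - pos);
        owning_fragment f = *this;
        f._pos += pos;
        f._len = len;
        return f;
    }
    const char* get() const { return _data ? _data->data() + _pos : nullptr; }
    std::size_t size() const { return _len; }
};

// Simplified parser, parametrized on the buffer kind.
template <typename Buffer>
class blob_parser {
    std::vector<Buffer> _read_bytes;  // zero-copy fragments shared from input pages
    std::string _read_bytes_buf;      // linearized destination, owned by the parser
    std::size_t _remaining;
public:
    explicit blob_parser(std::size_t blob_size) : _remaining(blob_size) {}

    // Consumes one input page; returns true once the whole blob has been seen.
    bool consume(const Buffer& page) {
        std::size_t n = std::min(_remaining, page.size());
        _read_bytes.push_back(page.share(0, n));
        _remaining -= n;
        return _remaining == 0;
    }

    // Copies the stored fragments into the parser-owned linearized buffer.
    std::string_view linearize() {
        _read_bytes_buf.clear();
        for (const Buffer& f : _read_bytes) {
            _read_bytes_buf.append(f.get(), f.size());
        }
        return _read_bytes_buf;
    }
};

// Parses an 8-byte blob that spans two pages.
std::string parse_across_pages() {
    blob_parser<owning_fragment> parser(8);
    {
        owning_fragment page1("abcd");
        parser.consume(page1);
    }  // page1 is destroyed here; the stored fragment still owns its bytes
    owning_fragment page2("efghXYZ");
    bool done = parser.consume(page2);
    assert(done);
    return std::string(parser.linearize());
}
```

With a buffer kind whose share() only produced a non-owning view of the page, the fragment stored from the first page would dangle by the time the second page is consumed. That is the failure mode described above.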
/*
 * Copyright (C) 2024-present ScyllaDB
 */

/*
 * SPDX-License-Identifier: AGPL-3.0-or-later
 */

#pragma once

#include <concepts>
#include <memory>

// A contiguous buffer of char objects which can be trimmed and
// supports zero-copy sharing of its underlying memory.
template<typename T>
concept ContiguousSharedBuffer = std::movable<T>
    && std::default_initializable<T>
    && requires(T& obj, size_t pos, size_t len) {

        // Creates a new buffer that shares the memory of the original buffer.
        // The lifetime of the new buffer is independent of the original buffer.
        { obj.share() } -> std::same_as<T>;

        // Like share() but the new buffer represents a sub-range of the original buffer.
        { obj.share(pos, len) } -> std::same_as<T>;

        // Trims the suffix of a buffer so that 'len' is the index of the first removed byte.
        { obj.trim(len) } -> std::same_as<void>;

        // Trims the prefix of the buffer so that `pos` is the index of the first byte after the trim.
        { obj.trim_front(pos) } -> std::same_as<void>;

        { obj.begin() } -> std::same_as<const char*>;
        { obj.get() } -> std::same_as<const char*>;
        { obj.get_write() } -> std::same_as<char*>;
        { obj.end() } -> std::same_as<const char*>;
        { obj.size() } -> std::same_as<size_t>;
        { obj.empty() } -> std::same_as<bool>;
};