mirror of https://github.com/scylladb/scylladb.git (synced 2026-04-26 11:30:36 +00:00)
commitlog: Handle oversized entries
Refs #18161

Yet another approach to dealing with large commitlog submissions. We handle an oversized single mutation by adding yet another entry type: fragmented. In this case we only add a fragment (aha) of the data that needs storing into each entry, along with metadata to correlate and reconstruct the full entry on replay.

Because these fragmented entries are spread over N segments, we also need to add references from the first segment in a chain to the subsequent ones. These are released once we clear the relevant cf_id count in the base.

* This approach has the downside that, due to how serialization etc. works w.r.t. mutations, we need to create an intermediate buffer to hold the full serialized target entry. This is then incrementally written into entries of < max_mutation_size, successively requesting more segments.

On replay, when encountering a fragment chain, the fragment is added to a "state", i.e. a mapping of the fragment chains currently being processed. Once we have found all fragments and concatenated the buffers into a single fragmented one, we can issue a replay callback as usual. Note that a replay caller will need to create and provide such a state object. The old-signature replay function remains for tests and such.

This approach bumps the file format version (docs to come).

To ensure "atomicity" we both force synchronization and, should the whole op fail, restore segment state (rewinding), thus discarding all data we wrote.

v2:
* Improve some bookkeeping; ensure we keep track of segments and flush properly, to get counters correct.
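The fragment-and-reassemble scheme described above can be sketched roughly as follows. All names here (`fragment`, `fragment_entry`, `reassemble`, the metadata fields) are illustrative assumptions, not ScyllaDB's actual types; the point is only the correlate-and-reconstruct metadata each fragment carries.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical fragment record: an oversized serialized entry is cut into
// pieces no larger than max_fragment_size, each tagged with a chain id,
// its index, and the total count, so the chain can be correlated and
// reassembled on replay.
struct fragment {
    uint64_t chain_id;   // correlates all fragments of one oversized entry
    uint32_t index;      // position of this piece within the chain
    uint32_t total;      // number of fragments in the chain
    std::string data;    // this piece of the serialized entry
};

std::vector<fragment> fragment_entry(uint64_t chain_id,
                                     const std::string& serialized,
                                     size_t max_fragment_size) {
    std::vector<fragment> out;
    auto total = uint32_t((serialized.size() + max_fragment_size - 1) / max_fragment_size);
    for (uint32_t i = 0; i < total; ++i) {
        // substr clamps the last piece to whatever remains
        out.push_back({chain_id, i, total,
                       serialized.substr(size_t(i) * max_fragment_size,
                                         max_fragment_size)});
    }
    return out;
}

std::string reassemble(const std::vector<fragment>& frags) {
    // On replay, fragments may be discovered out of order across segments;
    // concatenate by index (O(n^2) scan, fine for a sketch).
    std::string out;
    for (uint32_t i = 0; i < frags.front().total; ++i) {
        for (const auto& f : frags) {
            if (f.index == i) {
                out += f.data;
            }
        }
    }
    return out;
}
```

A 10000-byte entry with a 4096-byte fragment limit splits into three fragments that reassemble back to the original buffer.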
@@ -111,6 +111,7 @@ public:
     bool use_o_dsync = false;
     bool warn_about_segments_left_on_disk_after_shutdown = true;
     bool allow_going_over_size_limit = true;
+    bool allow_fragmented_entries = false;
 
     // The base segment ID to use.
     // The segment IDs of newly allocated segments will be issued sequentially
@@ -136,7 +137,8 @@ public:
     static inline constexpr uint32_t segment_version_1 = 1u;
     static inline constexpr uint32_t segment_version_2 = 2u;
     static inline constexpr uint32_t segment_version_3 = 3u;
-    static inline constexpr uint32_t current_version = segment_version_3;
+    static inline constexpr uint32_t segment_version_4 = 4u;
+    static inline constexpr uint32_t current_version = segment_version_4;
 
     descriptor(descriptor&&) noexcept = default;
     descriptor(const descriptor&) = default;
@@ -378,7 +380,7 @@ public:
     // (Re-)set data mix lifetime.
     void update_max_data_lifetime(std::optional<uint64_t> commitlog_data_max_lifetime_in_seconds);
 
-    typedef std::function<future<>(buffer_and_replay_position)> commit_load_reader_func;
+    using commit_load_reader_func = std::function<future<>(buffer_and_replay_position)>;
 
     class segment_error : public std::exception {};
 
@@ -424,7 +426,18 @@ public:
         const char* what() const noexcept override;
     };
 
+    class replay_state {
+    public:
+        replay_state();
+        ~replay_state();
+    private:
+        friend class commitlog;
+        class impl;
+        std::unique_ptr<impl> _impl;
+    };
+
     static future<> read_log_file(sstring filename, sstring prefix, commit_load_reader_func, position_type = 0, const db::extensions* = nullptr);
+    static future<> read_log_file(const replay_state&, sstring filename, sstring prefix, commit_load_reader_func, position_type = 0, const db::extensions* = nullptr);
 private:
     commitlog(config);
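The opaque `replay_state` added in the diff hides its bookkeeping behind an `impl` pointer. A rough standalone sketch of what such a replay "state" does per the commit message — fragments accumulated per chain id, with the replay callback fired once a chain completes — might look like this (the class and member names below are hypothetical, not the actual implementation):

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical replay-state sketch: maps each fragment-chain id to the
// fragments collected so far. When every fragment of a chain has been
// seen, the concatenated buffer is handed to the replay callback, as a
// non-fragmented entry would be.
class replay_state_sketch {
    struct chain {
        uint32_t total = 0;
        std::map<uint32_t, std::string> parts; // fragment index -> data
    };
    std::map<uint64_t, chain> _chains;
public:
    // Returns true (and invokes cb with the full buffer) when the chain
    // becomes complete; false while fragments are still missing.
    bool add_fragment(uint64_t chain_id, uint32_t index, uint32_t total,
                      std::string data,
                      const std::function<void(const std::string&)>& cb) {
        auto& c = _chains[chain_id];
        c.total = total;
        c.parts.emplace(index, std::move(data));
        if (c.parts.size() != c.total) {
            return false;
        }
        std::string full;
        for (const auto& [idx, part] : c.parts) { // map iterates in index order
            full += part;
        }
        _chains.erase(chain_id);
        cb(full);
        return true;
    }
};
```

Because fragments of different chains can interleave across segments, keying by chain id lets several partially-read chains coexist in one state object, which is why the caller supplies the state to `read_log_file` rather than the reader keeping it per file.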