This patch changes the sstable read APIs from "push" style to "pull"
style.
The sstable read code has two APIs:
1. An API for sequentially consuming low-level sstable items - sstable
row's beginning and end, cells, tombstones, etc.
2. An API for sequentially consuming entire sstable rows in our "mutation"
format.
Before this patch, both APIs were in "push style": The user supplies
callback functions, and the sstable read code "pushes" to these functions
the desired items (low-level sstable parts, or whole mutations).
However, a push API is very inconvenient for users, like the query
processing code, or the compaction code, which both iterate over mutations.
Such code wants to control its own progression through the iteration:
it prefers to "pull" the next mutation when it wants it. Moreover, it
wants to be able to *stop* pulling mutations at any point, without
worrying about continuations that are still scheduled in the background
(a concern that was especially problematic in the "push" design).
The modified APIs are:
1. The functions for iterating over mutations, sstable::read_rows() et al.,
now return a "mutation_reader" object which can be used for iterating
over the mutations: mutation_reader::read() asks for the next mutation,
and returns a future to it (or an unassigned value on EOF).
You can see an example of how it is used in sstable_mutation_test.cc.
2. The functions for consuming low-level sstable items (row begin, cell,
etc.) are still partially push-style - the items are still fed into
the consumer object - but consumption now *stops* (instead of deferring
and continuing later, as in the old code) when the consumer asks it to.
The caller can resume the consumption later when it wishes to (in
this sense, this is a "pull" API, because the user asks for more
input when it wants to).
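
The pull-style iteration described in item 1 can be sketched roughly as
follows. This is a simplified, synchronous illustration and not the code
in this patch: toy_mutation, toy_mutation_reader and read_at_most are
hypothetical names, and read() here returns a std::optional directly,
whereas the real mutation_reader::read() returns a future.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real mutation type.
struct toy_mutation {
    std::string partition_key;
};

// Hypothetical, synchronous stand-in for mutation_reader. The real read()
// returns a future; this one returns the value directly so that the sketch
// has no Seastar dependency.
class toy_mutation_reader {
    std::vector<toy_mutation> _rows;
    std::size_t _next = 0;
public:
    explicit toy_mutation_reader(std::vector<toy_mutation> rows)
        : _rows(std::move(rows)) {}

    // Returns the next mutation, or a disengaged optional on EOF.
    std::optional<toy_mutation> read() {
        if (_next == _rows.size()) {
            return std::nullopt;
        }
        return _rows[_next++];
    }
};

// The caller drives the iteration and may stop early; whatever it did not
// pull simply stays unread in the reader.
std::vector<std::string> read_at_most(toy_mutation_reader& reader, std::size_t limit) {
    std::vector<std::string> keys;
    while (keys.size() < limit) {
        auto m = reader.read();
        if (!m) {
            break; // EOF
        }
        keys.push_back(std::move(m->partition_key));
    }
    return keys;
}
```

Because the caller decides when to call read(), stopping after "limit"
mutations requires no cooperation from the reader and leaves nothing
running in the background.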
This patch does *not* remove input_stream's feature of a consumer
function returning a non-ready future. However, this feature is no longer
used anywhere in our code - the new sstable reader code stops the
consumption in the places where the old sstable reader code paused it
temporarily with a non-ready future.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
/*
 * Copyright (C) 2015 Cloudius Systems, Ltd.
 *
 */

#pragma once

#include "bytes.hh"
#include "key.hh"
#include "core/temporary_buffer.hh"

// sstables::data_consume_row feeds the contents of a single row into a
// row_consumer object:
//
// * First, consume_row_start() is called, with some information about the
//   whole row: The row's key, timestamp, etc.
// * Next, consume_cell() is called once for every column.
// * Finally, consume_row_end() is called. A consumer written for a single
//   column will likely not want to do anything here.
//
// Important note: the row key, column name and column value, passed to the
// consume_* functions, are passed as a "bytes_view" object, which points to
// internal data held by the feeder. This internal data is only valid for the
// duration of the single consume function it was passed to. If the object
// wants to hold these strings longer, it must make a copy of the bytes_view's
// contents. [Note, in reality, because our implementation reads the whole
// row into one buffer, the bytes_views remain valid until consume_row_end()
// is called.]
class row_consumer {
public:
    enum class proceed { yes, no };

    // Consume the row's key and deletion_time. The latter determines if the
    // row is a tombstone, and if so, when it has been deleted.
    // Note that the key is in serialized form, and should be deserialized
    // (according to the schema) before use.
    // As explained above, the key object is only valid during this call, and
    // if the implementation wishes to save it, it must copy the *contents*.
    virtual void consume_row_start(sstables::key_view key, sstables::deletion_time deltime) = 0;

    // Consume one cell (column name and value). Both are serialized, and need
    // to be deserialized according to the schema.
    // When a cell is set with an expiration time, "ttl" is the time to live
    // (in seconds) originally set for this cell, and "expiration" is the
    // absolute time (in seconds since the UNIX epoch) when this cell will
    // expire. Typical cells, not set to expire, will get expiration = 0.
    virtual void consume_cell(bytes_view col_name, bytes_view value,
            int64_t timestamp,
            int32_t ttl, int32_t expiration) = 0;

    // Consume a deleted cell (i.e., a cell tombstone).
    virtual void consume_deleted_cell(bytes_view col_name, sstables::deletion_time deltime) = 0;

    // Consume one range tombstone.
    virtual void consume_range_tombstone(
            bytes_view start_col, bytes_view end_col,
            sstables::deletion_time deltime) = 0;

    // Called at the end of the row, after all cells.
    // Returns a flag saying whether the sstable consumer should stop now, or
    // proceed consuming more data.
    virtual proceed consume_row_end() = 0;

    virtual ~row_consumer() { }
};
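
As a rough illustration of how a consumer can use the proceed flag to stop
the feeder after each row, here is a minimal sketch. It is not part of
this header: toy_row_consumer is a reduced, hypothetical stand-in for
row_consumer (std::string_view replaces bytes_view and key_view, and
deletion times and tombstones are left out), and toy_row and toy_feeder
are hypothetical in-memory substitutes for the sstable reading code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Reduced, hypothetical stand-in for row_consumer.
class toy_row_consumer {
public:
    enum class proceed { yes, no };
    virtual void consume_row_start(std::string_view key) = 0;
    virtual void consume_cell(std::string_view col_name, std::string_view value) = 0;
    virtual proceed consume_row_end() = 0;
    virtual ~toy_row_consumer() { }
};

// Collects one row at a time and asks the feeder to stop after each row.
class one_row_collector : public toy_row_consumer {
public:
    std::string key;                 // copied: the view is only valid during the call
    std::vector<std::string> cells;  // "name=value" strings, also copied

    void consume_row_start(std::string_view k) override {
        key = std::string(k);
        cells.clear();
    }
    void consume_cell(std::string_view col_name, std::string_view value) override {
        cells.push_back(std::string(col_name) + "=" + std::string(value));
    }
    proceed consume_row_end() override {
        return proceed::no;
    }
};

// Hypothetical in-memory row and feeder.
struct toy_row {
    std::string key;
    std::vector<std::pair<std::string, std::string>> cells;
};

class toy_feeder {
    std::vector<toy_row> _rows;
    std::size_t _next = 0;
public:
    explicit toy_feeder(std::vector<toy_row> rows) : _rows(std::move(rows)) {}

    // Feeds rows until the consumer returns proceed::no or the input ends.
    // Returns false if there was nothing left to feed.
    bool feed(toy_row_consumer& c) {
        if (_next == _rows.size()) {
            return false;
        }
        while (_next < _rows.size()) {
            const toy_row& r = _rows[_next++];
            c.consume_row_start(r.key);
            for (const auto& [name, value] : r.cells) {
                c.consume_cell(name, value);
            }
            if (c.consume_row_end() == toy_row_consumer::proceed::no) {
                break;
            }
        }
        return true;
    }
};
```

Each call to feed() resumes where the previous one stopped, so the caller
gets one row per call and decides when, or whether, to ask for the next
one.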