Mirror of https://github.com/scylladb/scylladb.git (synced 2026-04-21 17:10:35 +00:00)
The previous implementation could read either one sstable row or several, but only when all the data was read in advance into a contiguous memory buffer.

This patch changes the row read implementation into a state machine, which can work on either a pre-read buffer, or data streamed via the input_stream::consume() function:

The sstable::data_consume_rows_at_once() method reads the given byte range into memory and then processes it, while the sstable::data_consume_rows() method reads the data piecemeal, not trying to fit all of it into memory. The first function is (or will be...) optimized for reading one row, and the second function for iterating over all rows - although both can be used to read any number of rows.

The state-machine implementation is unfortunately a bit ugly (and much longer than the code it replaces), and could probably be improved in the future. But the focus was parsing performance: when we use large buffers (the default is 8192 bytes), most of the time we don't need to read byte-by-byte, and can efficiently read entire integers at once, or even larger chunks. For strings (like column names and values), we even avoid copying them if they don't cross a buffer boundary.

To test the rare boundary-crossing case despite having a small sstable, the code includes in "#if 0" a hack to split one buffer into many tiny buffers (1 byte, or any other number) and process them one by one. The tests still pass with this hack turned on.

This implementation of sstable reading also adds a feature not present in the previous version: reading range tombstones. An sstable with an INSERT of a collection always has a range tombstone (to delete all old items from the collection), so we need this feature to read collections. A test for this is included in this patch.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
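The state-machine idea described in the commit message can be illustrated with a much smaller parser. The following is only a sketch under stated assumptions, not the code from this patch: the class `u32_stream_parser` and the helper `parse_in_chunks()` are hypothetical names, and the input format (a plain sequence of big-endian 32-bit integers) is invented for the example. It shows the two paths the message mentions: a fast path that reads a whole integer when it lies inside the current buffer, and a byte-by-byte path that carries state across buffer boundaries. The helper mimics the "#if 0" testing hack by feeding the same data in chunks of any size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical illustration (not the Scylla parser): parses a stream of
// big-endian 32-bit integers from buffers that may split an integer anywhere.
class u32_stream_parser {
    std::uint32_t _partial = 0;  // bytes of the current integer seen so far
    unsigned _have = 0;          // number of bytes held in _partial (0..3)
public:
    std::vector<std::uint32_t> values;  // fully parsed integers, in order

    // Consume one buffer. Parser state survives between calls, so the
    // caller may pass buffers of any size.
    void consume(std::string_view buf) {
        std::size_t pos = 0;
        while (pos < buf.size()) {
            if (_have == 0 && buf.size() - pos >= 4) {
                // Fast path: the whole integer lies inside this buffer.
                std::uint32_t v = 0;
                for (std::size_t i = 0; i < 4; ++i) {
                    v = (v << 8) | static_cast<unsigned char>(buf[pos + i]);
                }
                values.push_back(v);
                pos += 4;
            } else {
                // Slow path: the integer crosses a buffer boundary, so
                // accumulate it one byte at a time.
                _partial = (_partial << 8) | static_cast<unsigned char>(buf[pos++]);
                if (++_have == 4) {
                    values.push_back(_partial);
                    _partial = 0;
                    _have = 0;
                }
            }
        }
    }
};

// Feed `data` to a fresh parser in pieces of at most `chunk` bytes, in the
// spirit of the buffer-splitting test hack described above.
std::vector<std::uint32_t> parse_in_chunks(std::string_view data, std::size_t chunk) {
    assert(chunk > 0);
    u32_stream_parser parser;
    for (std::size_t off = 0; off < data.size(); off += chunk) {
        parser.consume(data.substr(off, chunk));
    }
    return parser.values;
}
```

Calling `parse_in_chunks(data, data.size())` exercises only the fast path, while `parse_in_chunks(data, 1)` forces every integer through the boundary-crossing path; both should yield the same values.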
52 lines
2.1 KiB
C++
/*
 * Copyright (C) 2015 Cloudius Systems, Ltd.
 *
 */

#pragma once

#include "bytes.hh"
#include "core/temporary_buffer.hh"

// sstables::data_consume_row feeds the contents of a single row into a
// row_consumer object:
//
//  * First, consume_row_start() is called, with some information about the
//    whole row: The row's key, timestamp, etc.
//  * Next, consume_cell() is called once for every column.
//  * Finally, consume_row_end() is called. A consumer written for a single
//    column will likely not want to do anything here.
//
// Important note: the row key, column name and column value, passed to the
// consume_* functions, are passed as a "bytes_view" object, which points to
// internal data held by the feeder. This internal data is only valid for the
// duration of the single consume function it was passed to. If the object
// wants to hold these strings longer, it must make a copy of the bytes_view's
// contents. [Note, in reality, because our implementation reads the whole
// row into one buffer, the bytes_view objects remain valid until
// consume_row_end() is called.]
class row_consumer {
public:
    // Consume the row's key and deletion_time. The latter determines if the
    // row is a tombstone, and if so, when it has been deleted.
    // Note that the key is in serialized form, and should be deserialized
    // (according to the schema) before use.
    // As explained above, the key object is only valid during this call, and
    // if the implementation wishes to save it, it must copy the *contents*.
    virtual void consume_row_start(bytes_view key, sstables::deletion_time deltime) = 0;

    // Consume one cell (column name and value). Both are serialized, and need
    // to be deserialized according to the schema.
    virtual void consume_cell(bytes_view col_name, bytes_view value, uint64_t timestamp) = 0;

    // Consume one range tombstone.
    virtual void consume_range_tombstone(
            bytes_view start_col, bytes_view end_col,
            sstables::deletion_time deltime) = 0;

    // Called at the end of the row, after all cells.
    virtual void consume_row_end() = 0;

    virtual ~row_consumer() { }
};
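To make the lifetime rule in the header comment concrete, here is a minimal sketch of a consumer that copies everything it wants to keep. It is written to compile on its own, so several things in it are assumptions rather than project code: `bytes` and `bytes_view` are replaced by `std::string` and `std::string_view`, `sstables::deletion_time` is a placeholder struct whose fields are only illustrative, the `row_consumer` interface is repeated from the header above, and `collecting_consumer` and `feed_one_row()` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Stand-ins (assumptions) so the sketch is self-contained; the real types
// come from the project's own headers.
using bytes = std::string;
using bytes_view = std::string_view;
namespace sstables {
struct deletion_time {                  // placeholder fields, illustrative only
    std::int32_t local_deletion_time;
    std::int64_t marked_for_delete_at;
};
}

// The interface from the header above, repeated for self-containment.
class row_consumer {
public:
    virtual void consume_row_start(bytes_view key, sstables::deletion_time deltime) = 0;
    virtual void consume_cell(bytes_view col_name, bytes_view value, std::uint64_t timestamp) = 0;
    virtual void consume_range_tombstone(bytes_view start_col, bytes_view end_col,
                                         sstables::deletion_time deltime) = 0;
    virtual void consume_row_end() = 0;
    virtual ~row_consumer() { }
};

// Hypothetical consumer: stores owning copies, never the views themselves.
class collecting_consumer : public row_consumer {
public:
    struct cell { bytes name; bytes value; std::uint64_t timestamp; };
    bytes key;
    std::vector<cell> cells;
    std::vector<std::pair<bytes, bytes>> range_tombstones;
    bool finished = false;

    void consume_row_start(bytes_view k, sstables::deletion_time) override {
        key = bytes(k);                 // copy the contents, not the view
        cells.clear();
        range_tombstones.clear();
        finished = false;
    }
    void consume_cell(bytes_view name, bytes_view value, std::uint64_t ts) override {
        assert(!finished);
        cells.push_back(cell{bytes(name), bytes(value), ts});
    }
    void consume_range_tombstone(bytes_view start_col, bytes_view end_col,
                                 sstables::deletion_time) override {
        range_tombstones.emplace_back(bytes(start_col), bytes(end_col));
    }
    void consume_row_end() override { finished = true; }
};

// Hypothetical feeder: reuses one scratch buffer between calls, so any view
// a consumer kept from an earlier call would no longer be valid.
void feed_one_row(row_consumer& c) {
    std::string scratch = "key1";
    c.consume_row_start(scratch, sstables::deletion_time{0, 0});
    scratch = "col1=val1";
    bytes_view whole(scratch);
    c.consume_cell(whole.substr(0, 4), whole.substr(5), 42);
    scratch = "XXXXXXXXX";              // earlier views are now stale
    c.consume_row_end();
}
```

Because `collecting_consumer` copies the bytes inside each callback, its `key` and `cells` stay intact after the feeder overwrites its scratch buffer; a consumer that stored the `bytes_view` arguments directly would be left pointing at stale data.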