16 Commits

Author SHA1 Message Date
Tomasz Grabiec
27d86dfe18 sstables: Enable skipping to cells at data_consume_context level 2017-03-28 18:10:39 +02:00
Paweł Dziepak
5905729c4a sstables: read counter cells 2017-02-02 10:35:14 +00:00
Paweł Dziepak
25b91c51e2 sstables: add data_consume_rows_context::reset()
reset() is going to be used to restore valid state after fast forwarding
the reader.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-10-19 15:29:08 +01:00
Paweł Dziepak
9e8db53c46 sstables: allow row consumer to stop at any point
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:50 +01:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Glauber Costa
8e4bf025ae sstables: wire priority for read path
The entire SSTable read path can now take an io_priority. The public functions will
take a default parameter which is Seastar's default priority.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-01-25 15:20:38 -05:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Glauber Costa
f45b807f34 row consumer: move proceed class to a separate class
Continuing the work of decoupling the prestate and state parts of the NSM
so we can reuse it, move the proceed class to a different holding class.
Proceeding or not has nothing to do with "rows".

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-28 17:18:06 -05:00
Nadav Har'El
d42c05b6ad sstable: Pull-style read interface
This patch changes the sstable read APIs from "push" style
to "pull" style.

The sstable read code has two APIs:
 1. An API for sequentially consuming low-level sstable items - sstable
    row's beginning and end, cells, tombstones, etc.
 2. An API for sequentially consuming entire sstable rows in our "mutation"
    format.

Before this patch, both APIs were in "push style": The user supplies
callback functions, and the sstable read code "pushes" to these functions
the desired items (low-level sstable parts, or whole mutations).
However, a push API is very inconvenient for users, like the query
processing code, or the compaction code, which both iterate over mutations.
Such code wants to control its own progression through the iteration -
the user prefers to "pull" the next mutation when it wants it; Moreover,
the user wants to *stop* pulling more mutations if it wants, without
worrying about various continuations that are still scheduled in the
background (the latter concern was especially problematic in the "push"
design).

The modified APIs are:

1. The functions for iterating over mutations, sstable::read_rows() et al.,
   now return a "mutation_reader" object which can be used for iterating
   over the mutation: mutation_reader::read() asks for the next mutation,
   and returns a future to it (or an unassigned value on EOF).
   You can see an example on how it is used in sstable_mutation_test.cc.

2. The functions for consuming low-level sstable items (row begin, cell,
   etc.) are still partially push-style - the items are still fed into
   the consume object - but consumption now *stops* (instead of deferring
   and continuing later, as in the old code) when the consumer asks to.
   The caller can resume the consumption later when it wishes to (in
   this sense, this is a "pull" API, because the user asks for more
   input when it wants to).

This patch does *not* remove input_stream's feature of a consumer
function returning a non-ready future. However, this feature is no longer
used anywhere in our code - the new sstable reader code stops the
consumption where the old sstable reader code paused it temporarily with
a non-ready future.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-06-03 10:55:34 +03:00
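
The pull-style mutation interface described in d42c05b6ad can be sketched roughly as follows. This is a simplified illustration, not the actual code: the `mutation` struct and the `read_up_to()` helper are hypothetical, and `read()` returns a `std::optional` directly where the real `mutation_reader::read()` returns a future, so the sketch has no Seastar dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mutation; the real type is far richer.
struct mutation {
    std::string key;
};

// Pull-style reader: the caller asks for the next item when it wants one.
// Assumption: the value is returned directly instead of through a future.
class mutation_reader {
    std::vector<mutation> _items;
    std::size_t _next = 0;
public:
    explicit mutation_reader(std::vector<mutation> items)
        : _items(std::move(items)) {}

    // Returns the next mutation, or an empty optional on EOF.
    std::optional<mutation> read() {
        assert(_next <= _items.size());
        if (_next == _items.size()) {
            return std::nullopt;
        }
        return _items[_next++];
    }
};

// Hypothetical caller: it controls the iteration and stops early simply by
// not calling read() again, so nothing is left running in the background.
inline std::vector<std::string> read_up_to(mutation_reader& r, std::size_t limit) {
    std::vector<std::string> keys;
    while (keys.size() < limit) {
        auto m = r.read();
        if (!m) {
            break;  // EOF
        }
        keys.push_back(m->key);
    }
    return keys;
}
```

With a push API the equivalent early stop would require the callback to signal the producer somehow; here the loop condition is enough.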
Nadav Har'El
33270efc39 sstables: make consume_row_end() a future
After commit 3ae81e68a0, we already support
in input_stream::consume() the possibility of the consumer blocking by
returning a future. But the code for sstable consumption had no way to
use this capability. This patch adds a future<> return code for
consume_row_end(), allowing the consumer to pause after reading each
sstable row (but not, currently, after each cell in the row).

We also need to use this capability in read_range_rows(), which wrongly
ignored the future<> returned by the "walker" function - now this future<>
is returned to the sstable reader, and causes it to pause reading until
the future is fulfilled.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-05-28 18:53:32 +03:00
Glauber Costa
2fba948ad8 sstables: move timestamps to signed integer
This is to follow Origin

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-05-13 17:14:02 -04:00
Glauber Costa
590abb800e sstables: pass a key_view instead of bytes_view to consume_row_start
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-05-13 17:14:02 -04:00
Nadav Har'El
8e6d11df1b sstable read: support deleted cells
This patch adds support for reading deleted cells (a.k.a. cell tombstones)
from the SSTable.

The way deleted cells are encoded in the sstable is explained in the
"Cell tombstone" section of
https://github.com/cloudius-systems/urchin/wiki/SSTables-interpretation-in-Urchin

This more-or-less completes the low-level SSTable row reading code - the
only remaining untreated case is counters, which we agreed to leave to
later. If counters are found in the SSTable, we'll throw an exception.

This patch adds a new callback, consume_deleted_cell, taking the name of
the cell and its deletion_time (as usual, deletion_time includes both a
64-bit timestamp, for ordering events, and a 32-bit "local_deletion_time"
used to schedule gc of old tombstones).

This patch also adds a test SSTable with a deleted cell, created by the
following Cassandra commands:

	CREATE TABLE deleted (
		name text,
		age int,
		PRIMARY KEY (name)
	);
	INSERT INTO deleted (name, age) VALUES ('nadav', 40);
	<flush table - the second table is what we're after>
	DELETE age FROM deleted WHERE name = 'nadav';

We test our ability to read this sstable, and see the deleted cell
and its expected deletion time.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-04-28 14:56:04 +03:00
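
As a rough illustration of the callback added in 8e6d11df1b, the sketch below models a deletion time and a consumer that records deleted cells. The field name `timestamp`, the `recording_consumer` type, and the `can_gc()` helper with its `gc_grace_seconds` parameter are assumptions made for this example, not names taken from the patch; the gc rule shown is the usual Cassandra-style grace-period check and is only meant to show why the 32-bit value is kept.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Deletion metadata as described in the commit message: a 32-bit wall-clock
// time in seconds used for gc, and a 64-bit timestamp used for ordering.
// The member name "timestamp" is an assumption of this sketch.
struct deletion_time {
    std::int32_t local_deletion_time;
    std::int64_t timestamp;
};

// Hypothetical consumer that records the tombstones it is given.
struct recording_consumer {
    struct deleted_cell {
        std::string name;
        deletion_time when;
    };
    std::vector<deleted_cell> deleted;

    // Sketch of the new callback: the cell name plus its deletion time.
    void consume_deleted_cell(std::string_view col_name, deletion_time dt) {
        deleted.push_back({std::string(col_name), dt});
    }
};

// Assumed gc rule: a tombstone may be purged once it is older than the
// grace period. Both "now" and "gc_grace_seconds" are in seconds.
inline bool can_gc(const deletion_time& dt, std::int32_t now,
                   std::int32_t gc_grace_seconds) {
    assert(gc_grace_seconds >= 0);
    return static_cast<std::int64_t>(dt.local_deletion_time) + gc_grace_seconds < now;
}
```

The two fields serve different purposes: ordering between writes uses only `timestamp`, while purging uses only `local_deletion_time`.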
Nadav Har'El
0adc4812ba sstable read: support cell expiration time
This patch adds support for reading sstable cells with expiration time.

It adds two more parameters to the row_consumer::consume_cell() - "ttl"
and "expiration". The "ttl" is the original TTL set on the cell in seconds,
the "expiration" is the absolute time (in seconds since the Unix epoch) when
this cell is set to expire. I don't know why both values are needed...

When a cell has no expiration time set (most cells will be like that), the
callback will be called with expiration==0 (and ttl==0).

This patch also adds a test SSTable with cells with set TTL, created by
the following Cassandra commands:

	CREATE TABLE ttl (
		name text,
		age int,
		PRIMARY KEY (name)
	);
	INSERT INTO ttl (name, age) VALUES ('nadav', 40) USING TTL 3600;

And tests our ability to read the resulting sstable, and get the expected
expiration time.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-04-28 14:56:01 +03:00
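
The relationship between the two values introduced in 0adc4812ba can be sketched as below, under the assumption (not stated in the commit message) that the expiration is simply the write time plus the TTL. The `cell_expiry` struct, the helper names, and the 32-bit signed types are hypothetical choices for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Expiration metadata as passed to consume_cell() in this sketch.
struct cell_expiry {
    std::int32_t ttl;         // original TTL in seconds, 0 if none
    std::int32_t expiration;  // absolute expiry in seconds since the Unix epoch, 0 if none
};

// Assumption: expiration = write time + TTL. A cell without a TTL gets (0, 0).
inline cell_expiry make_expiry(std::int32_t write_time, std::int32_t ttl) {
    assert(ttl >= 0);
    if (ttl == 0) {
        return {0, 0};
    }
    return {ttl, write_time + ttl};
}

// A cell is an expiring cell only if an expiration time was set.
inline bool is_expiring(const cell_expiry& e) {
    return e.expiration != 0;
}

// True once "now" (seconds since the epoch) has reached the expiry.
inline bool has_expired(const cell_expiry& e, std::int32_t now) {
    return is_expiring(e) && now >= e.expiration;
}

// Under the assumption above, keeping both values lets a reader
// recover the time at which the TTL was applied.
inline std::int32_t write_time_of(const cell_expiry& e) {
    return e.expiration - e.ttl;
}
```

For the test table above, a cell written with `USING TTL 3600` would carry ttl==3600 and an expiration one hour after the write.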
Nadav Har'El
486e6271a1 sstables: data file row reading and streaming
The previous implementation could read either one sstable row or several,
but only when all the data was read in advance into a contiguous memory
buffer.

This patch changes the row read implementation into a state machine,
which can work on either a pre-read buffer, or data streamed via the
input_stream::consume() function:

The sstable::data_consume_rows_at_once() method reads the given byte range
into memory and then processes it, while the sstable::data_consume_rows()
method reads the data piecemeal, not trying to fit all of it into
memory. The first function is (or will be...) optimized for reading one
row, and the second function for iterating over all rows - although both
can be used to read any number of rows.

The state-machine implementation is unfortunately a bit ugly (and much
longer than the code it replaces), and could probably be improved in the
future. But the focus was parsing performance: when we use large buffers
(the default is 8192 bytes), most of the time we don't need to read
byte-by-byte, and efficiently read entire integers at once, or even larger
chunks. For strings (like column names and values), we even avoid copying
them if they don't cross a buffer boundary.

To test the rare boundary-crossing case despite having a small sstable,
the code includes in "#if 0" a hack to split one buffer into many tiny
buffers (1 byte, or any other number) and process them one by one.
The tests still pass with this hack turned on.

This implementation of sstable reading also adds a feature not present
in the previous version: reading range tombstones. An sstable with an
INSERT of a collection always has a range tombstone (to delete all old
items from the collection), so we need this feature to read collections.
A test for this is included in this patch.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-04-13 17:40:46 +03:00
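
The state-machine idea in 486e6271a1 can be illustrated with a much smaller parser that reads big-endian 32-bit integers from a stream of buffers. This is a sketch only: `u32_stream_parser` and `parse_in_chunks()` are hypothetical names, and the real code handles many more item types. It shows the two paths the message describes: a fast path that reads a whole integer when it fits in the current buffer, and a byte-by-byte path for values that cross a buffer boundary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal state machine parsing big-endian 32-bit integers that arrive
// in arbitrarily sized buffers.
class u32_stream_parser {
    std::uint32_t _partial = 0;  // bytes accumulated for the current value
    unsigned _have = 0;          // number of bytes held in _partial
    std::vector<std::uint32_t> _out;

    static std::uint32_t read_be32(const std::uint8_t* p) {
        return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
               (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
    }
public:
    // Feed one buffer; state is kept between calls.
    void consume(const std::uint8_t* data, std::size_t size) {
        std::size_t pos = 0;
        while (pos < size) {
            if (_have == 0 && size - pos >= 4) {
                // Fast path: the whole integer is inside this buffer.
                _out.push_back(read_be32(data + pos));
                pos += 4;
                continue;
            }
            // Slow path: the value crosses a buffer boundary.
            _partial = (_partial << 8) | data[pos++];
            if (++_have == 4) {
                _out.push_back(_partial);
                _partial = 0;
                _have = 0;
            }
        }
    }

    const std::vector<std::uint32_t>& values() const { return _out; }
};

// Helper in the spirit of the "#if 0" hack: split one buffer into pieces of
// "chunk" bytes and process them one by one.
inline std::vector<std::uint32_t> parse_in_chunks(const std::vector<std::uint8_t>& buf,
                                                  std::size_t chunk) {
    assert(chunk > 0);
    u32_stream_parser p;
    for (std::size_t off = 0; off < buf.size(); off += chunk) {
        std::size_t n = std::min(chunk, buf.size() - off);
        p.consume(buf.data() + off, n);
    }
    return p.values();
}
```

Parsing the same bytes with a chunk size of 1 and with a chunk size of 8192 should give identical results, which is what running the tests with the hack turned on checks.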
Nadav Har'El
43b93058a5 sstables: add row consuming function
Add a function sstables::data_consume_row() which reads an entire row
(or several consecutive rows) at a given byte range in the data file,
and feeds them into a "row_consumer" implementation which the user provides.

The row_consumer's consume_row_start() method is called at the
beginning of the (or each) row with its key and deletion information,
then the consume_cell() method is called for each of the row's cells,
and after all cells of the row, consume_row_end() is called.

The current implementation only supports regular cells, and not other
special cases like range tombstones and counters (see
https://github.com/cloudius-systems/urchin/wiki/SSTables%20Data%20File)
as I did not yet have sstables to test those on; The current
implementation will abort upon seeing these unsupported features.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-04-03 12:37:44 +03:00
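
The callback order described in 43b93058a5 can be sketched as follows. The signatures are simplified assumptions: the real consume_row_start() also receives deletion information, and the `cell` and `row` structs, `feed_rows()` (standing in for data_consume_row() reading from the data file), and `tracing_consumer` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Simplified stand-in for the row_consumer interface.
class row_consumer {
public:
    virtual ~row_consumer() = default;
    virtual void consume_row_start(std::string_view key) = 0;
    virtual void consume_cell(std::string_view name, std::string_view value,
                              std::int64_t timestamp) = 0;
    virtual void consume_row_end() = 0;
};

// Hypothetical in-memory rows, used here instead of a byte range on disk.
struct cell {
    std::string name;
    std::string value;
    std::int64_t timestamp;
};
struct row {
    std::string key;
    std::vector<cell> cells;
};

// Hypothetical driver: feeds each row to the consumer in the order
// row start, every cell, row end.
inline void feed_rows(const std::vector<row>& rows, row_consumer& consumer) {
    for (const auto& r : rows) {
        consumer.consume_row_start(r.key);
        for (const auto& c : r.cells) {
            consumer.consume_cell(c.name, c.value, c.timestamp);
        }
        consumer.consume_row_end();
    }
}

// Example consumer that records the sequence of callbacks it receives.
class tracing_consumer final : public row_consumer {
    bool _in_row = false;
public:
    std::vector<std::string> events;

    void consume_row_start(std::string_view key) override {
        assert(!_in_row);
        _in_row = true;
        events.push_back("start " + std::string(key));
    }
    void consume_cell(std::string_view name, std::string_view, std::int64_t) override {
        assert(_in_row);
        events.push_back("cell " + std::string(name));
    }
    void consume_row_end() override {
        assert(_in_row);
        _in_row = false;
        events.push_back("end");
    }
};
```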