scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 17:40:34 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	27d86dfe18	sstables: Enable skipping to cells at data_consume_context level	2017-03-28 18:10:39 +02:00
Paweł Dziepak	5905729c4a	sstables: read counter cells	2017-02-02 10:35:14 +00:00
Paweł Dziepak	25b91c51e2	sstables: add data_consume_rows_context::reset() reset() is going to be used to restore valid state after fast forwarding the reader. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-10-19 15:29:08 +01:00
Paweł Dziepak	9e8db53c46	sstables: allow row consumer to stop at any point Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:50 +01:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Glauber Costa	8e4bf025ae	sstables: wire priority for read path All the SSTable read path can now take an io_priority. The public functions will take a default parameter which is Seastar's default priority. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Glauber Costa	f45b807f34	row consumer: move proceed class to a separate class Continuing the work of decoupling the prestate and state parts of the NSM so we can reuse it, move the proceed class to a different holding class. Proceeding or not has nothing to do with "rows". Signed-off-by: Glauber Costa <glommer@cloudius-systems.com> Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-08-28 17:18:06 -05:00
Nadav Har'El	d42c05b6ad	sstable: Pull-style read interface This patch replaces the sstable read APIs from having "push" style, to having "pull style". The sstable read code has two APIs: 1. An API for sequentially consuming low-level sstable items - sstable row's beginning and end, cells, tombstones, etc. 2. An API for sequentially consuming entire sstable rows in our "mutation" format. Before this patch, both APIs were in "push style": The user supplies callback functions, and the sstable read code "pushes" to these functions the desired items (low-level sstable parts, or whole mutations). However, a push API is very inconvenient for users, like the query processing code, or the compaction code, which both iterate over mutations. Such code wants to control its own progression through the iteration - the user prefers to "pull" the next mutation when it wants it; Moreover, the user wants to stop pulling more mutations if it wants, without worrying about various continuations that are still scheduled in the background (the latter concern was especially problematic in the "push" design). The modified APIs are: 1. The functions for iterating over mutations, sstable::read_rows() et al., now return a "mutation_reader" object which can be used for iterating over the mutation: mutation_reader::read() asks for the next mutation, and returns a future to it (or an unassigned value on EOF). You can see an example on how it is used in sstable_mutation_test.cc. 2. The functions for consuming low-level sstable items (row begin, cell, etc.) are still partially push-style - the items are still fed into the consume object - but consumption now stops (instead of deferring and continuing later, as in the old code) when the consumer asks to. The caller can resume the consumption later when it wishes to (in this sense, this is a "pull" API, because the user asks for more input when it wants to).
This patch does not remove input_stream's feature of a consumer function returning a non-ready future. However, this feature is no longer used anywhere in our code - the new sstable reader code stops the consumption when old sstable reader code paused it temporarily with a non-ready future. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-06-03 10:55:34 +03:00
Nadav Har'El	33270efc39	sstables: make consume_row_end() a future After commit `3ae81e68a0`, we already support in input_stream::consume() the possibility of the consumer blocking by returning a future. But the code for sstable consumption had no way to use this capability. This patch adds a future<> return code for consume_row_end(), allowing the consumer to pause after reading each sstable row (but not, currently, after each cell in the row). We also need to use this capability in read_range_rows(), which wrongly ignored the future<> returned by the "walker" function - now this future<> is returned to the sstable reader, and causes it to pause reading until the future is fulfilled. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-05-28 18:53:32 +03:00
Glauber Costa	2fba948ad8	sstables: move timestamps to signed integer This is to follow Origin Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-05-13 17:14:02 -04:00
Glauber Costa	590abb800e	sstables: pass a key_view instead of bytes_view to consume_row_start Signed-off-by: Glauber Costa <glommer@cloudius-systems.com> Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-05-13 17:14:02 -04:00
Nadav Har'El	8e6d11df1b	sstable read: support deleted cells This patch adds support for reading deleted cells (a.k.a. cell tombstones) from the SSTable. The way deleted cells are encoded in the sstable is explained in the "Cell tombstone" section of https://github.com/cloudius-systems/urchin/wiki/SSTables-interpretation-in-Urchin This more-or-less completes the low-level SSTable row reading code - the only remaining untreated case is counters, which we agreed to leave to later. If counters are found in the SSTable, we'll throw an exception. This patch adds a new callback, consume_deleted_cell, taking the name of the cell and its deletion_time (as usual, deletion_time includes both a 64-bit timestamp, for ordering events, and a 32-bit "local_deletion_time" used to schedule gc of old tombstones). This patch also adds a test SSTable with deleted cell, created by the following Cassandra Commands: CREATE TABLE deleted ( name text, age int, PRIMARY KEY (name) ); INSERT INTO deleted (name, age) VALUES ('nadav', 40); <flush table - the second table is what we're after> DELETE age FROM deleted WHERE name = 'nadav'; We test our ability to read this sstable, and see the deleted cell and its expected deletion time. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-04-28 14:56:04 +03:00
Nadav Har'El	0adc4812ba	sstable read: support cell expiration time This patch adds support for reading sstable cells with expiration time. It adds two more parameters to the row_consumer::consume_cell() - "ttl" and "expiration". The "ttl" is the original TTL set on the cell in seconds, the "expiration" is the absolute time (in seconds since the Unix epoch) when this cell is set to expire. I don't know why both values are needed... When a cell has no expiration time set (most cells will be like that), the callback will be called with expiration==0 (and ttl==0). This patch also adds a test SSTable with cells with set TTL, created by the following Cassandra commands: CREATE TABLE ttl ( name text, age int, PRIMARY KEY (name) ); INSERT INTO ttl (name, age) VALUES ('nadav', 40) USING TTL 3600; And tests our ability to read the resulting sstable, and get the expected expiration time. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-04-28 14:56:01 +03:00
Nadav Har'El	486e6271a1	sstables: data file row reading and streaming The previous implementation could read either one sstable row or several, but only when all the data was read in advance into a contiguous memory buffer. This patch changes the row read implementation into a state machine, which can work on either a pre-read buffer, or data streamed via the input_stream::consume() function: The sstable::data_consume_rows_at_once() method reads the given byte range into memory and then processes it, while the sstable::data_consume_rows() method reads the data piecemeal, not trying to fit all of it into memory. The first function is (or will be...) optimized for reading one row, and the second function for iterating over all rows - although both can be used to read any number of rows. The state-machine implementation is unfortunately a bit ugly (and much longer than the code it replaces), and could probably be improved in the future. But the focus was parsing performance: when we use large buffers (the default is 8192 bytes), most of the time we don't need to read byte-by-byte, and efficiently read entire integers at once, or even larger chunks. For strings (like column names and values), we even avoid copying them if they don't cross a buffer boundary. To test the rare boundary-crossing case despite having a small sstable, the code includes in "#if 0" a hack to split one buffer into many tiny buffers (1 byte, or any other number) and process them one by one. The tests still pass with this hack turned on. This implementation of sstable reading also adds a feature not present in the previous version: reading range tombstones. An sstable with an INSERT of a collection always has a range tombstone (to delete all old items from the collection), so we need this feature to read collections. A test for this is included in this patch. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-04-13 17:40:46 +03:00
Nadav Har'El	43b93058a5	sstables: add row consuming function Add a function sstables::data_consume_row() which reads an entire row (or several consecutive rows) at a given byte range in the data file, and feeds them into a "row_consumer" implementation which the user provides. The row_consumer's consume_row_start() method is called at the beginning of the (or each) row with its key and deletion information, then the consume_cell() method is called for each of the row's cells, and after all cells of the row, consume_row_end() is called. The current implementation only supports regular cells, and not other special cases like range tombstones and counters (see https://github.com/cloudius-systems/urchin/wiki/SSTables%20Data%20File) as I did not yet have sstables to test those on; The current implementation will abort upon seeing these unsupported features. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-04-03 12:37:44 +03:00

16 Commits
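
The commit messages above describe a row consumer that receives consume_row_start(), consume_cell() and consume_row_end() callbacks, and a later change under which consumption stops when the consumer asks and can be resumed by the caller. The following is a minimal sketch of that idea only, not ScyllaDB's code: the types cell, row, consume_context and one_row_collector, the callback signatures, and the in-memory input are all hypothetical simplifications (the real code parses a data file and works with futures).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the real ScyllaDB types or signatures.
enum class proceed { no, yes };

struct cell { std::string name; std::string value; std::int64_t timestamp; };
struct row { std::string key; std::vector<cell> cells; };

// Consumer interface modelled on the callback names in the commit messages.
struct row_consumer {
    virtual ~row_consumer() = default;
    virtual proceed consume_row_start(const std::string& key) = 0;
    virtual proceed consume_cell(const cell& c) = 0;
    virtual proceed consume_row_end() = 0;
};

// Small state machine that feeds in-memory rows to a consumer and stops
// as soon as the consumer returns proceed::no. Its position is kept in
// member variables, so a later read() call resumes where it stopped.
class consume_context {
    enum class state { row_start, cells, row_end };
    const std::vector<row>& _rows;
    row_consumer& _consumer;
    std::size_t _row = 0;
    std::size_t _cell = 0;
    state _state = state::row_start;
public:
    consume_context(const std::vector<row>& rows, row_consumer& consumer)
        : _rows(rows), _consumer(consumer) {}

    bool eof() const { return _row == _rows.size(); }

    // Runs until the consumer asks to stop or the input is exhausted.
    // Returns true if there is more input left to consume.
    bool read() {
        while (!eof()) {
            proceed p = proceed::yes;
            const row& r = _rows[_row];
            switch (_state) {
            case state::row_start:
                p = _consumer.consume_row_start(r.key);
                _cell = 0;
                _state = state::cells;
                break;
            case state::cells:
                if (_cell < r.cells.size()) {
                    p = _consumer.consume_cell(r.cells[_cell++]);
                } else {
                    _state = state::row_end;
                }
                break;
            case state::row_end:
                p = _consumer.consume_row_end();
                ++_row;
                _state = state::row_start;
                break;
            }
            if (p == proceed::no) {
                break;
            }
        }
        return !eof();
    }
};

// Example consumer: records every callback and stops at each row end,
// so each read() call delivers exactly one row.
struct one_row_collector : row_consumer {
    std::vector<std::string> events;
    proceed consume_row_start(const std::string& key) override {
        events.push_back("start:" + key);
        return proceed::yes;
    }
    proceed consume_cell(const cell& c) override {
        events.push_back("cell:" + c.name);
        return proceed::yes;
    }
    proceed consume_row_end() override {
        events.push_back("end");
        return proceed::no;
    }
};
```

In this sketch the caller decides when to call read() again, which is the sense in which the low-level interface is described as "pull" style even though items are still pushed into the consumer object.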