scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 11:55:15 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	15246f31f7	sstables: fix incorrect sstable size when compression is enabled Size of uncompressed sstable was being unconditionally used to determine when to stop writing a table. When compression is enabled, compressed size should be used instead. Problem affected Scylla when compression and leveled strategy were used. Fixes #1177. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <d9bf26def41fb33ca297f4127ce042b7f67adf96.1460484529.git.raphaelsc@scylladb.com>	2016-04-13 09:01:01 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Nadav Har'El	2f56577794	sstables: more efficient read of compressed data file Before this patch, reading large ranges from a compressed data file involved two inefficiencies: 1. The compressed data file was read one compressed chunk at a time. Such a chunk is around 30 KB in size, well below our desired sstable read-ahead size (sstable_buffer_size = 128 KB). 2. Because the compressed chunks have variable length (the uncompressed chunk has a fixed length) they are not aligned to disk blocks, so consecutive chunks have overlapping blocks which were unnecessarily read twice. The fix for both issues is to build the compressed_file_input_stream on an existing file_input_stream, instead of using direct file IO to read the individual chunks. file_input_stream takes care of doing the appropriate amount of read-ahead, and the compressed_file_input_stream layer does the decompression of the data read from the underlying layer. Fixes #992. Historical note: Implementing compressed_file_input_stream on top of file_input_stream was already tried in the past, and rejected. The problem at that time was that compressed_file_input_stream's constructor did not specify the end of the range to read, so that when we wanted to read only a small range we got too much read-ahead beyond the exactly one compressed chunk that we needed to read. Following the fix to issue #964, we now know on every streaming read also the intended end of the stream, so we can now use this to stop reading at the end of the last required chunk, even when we use a read-ahead buffer much larger than a chunk. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1457304335-8507-1-git-send-email-nyh@scylladb.com>	2016-03-09 10:14:15 +02:00
Glauber Costa	8e4bf025ae	sstables: wire priority for read path All the SSTable read path can now take an io_priority. The public functions will take a default parameter which is Seastar's default priority. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Nadav Har'El	4edf7fe206	clean up uses of lw_shared_ptr<file> recently, "file" started to use a shared_ptr internally, and is already copy-able and reference counted, and there is no reason to use lw_shared_ptr<file>. This patch cleans up a few remaining places where lw_shared_ptr<file> was used. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-07-22 11:51:40 +03:00
Raphael S. Carvalho	113d3b1001	sstables: update compression ratio stats If compression is used, we should provide both uncompressed and compressed length to metadata collector, so as for the ratio to be computed. Stats metadata stores compression ratio. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-21 08:14:07 +03:00
Raphael S. Carvalho	f17f3b197a	sstables: add initial support to compression lz4 is the unique compressor algorithm supported so far. missing deflate and snappy algorithms. Adding them should be relatively easy though. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-16 12:42:00 -03:00
Raphael S. Carvalho	3bfb86f541	sstables: add compress_max_size to compression used to return maximum size which compressor may output. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-06-16 09:48:00 -03:00
Raphael S. Carvalho	d1ed0744f0	schema: add sstable compressor property The field compressor is about saying which compressor algorithm must be used in compression of sstable data file. This is a small step towards compressed sstable data file. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-09 11:18:56 +03:00
Glauber Costa	2dbd2b408a	sstables: change describe_type's return type to auto We always return a future, but with the threaded writer, we can get rid of that. So while reads will still return a future, the writer will be able to return void. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-06-08 15:25:35 +03:00
Pekka Enberg	a9d08438cd	sstable: Inline adler32 checksum functions They're called in the fast-path so inline the functions to avoid an extra function call. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-06-04 15:48:37 +03:00
Raphael S. Carvalho	bdd3fe61c5	sstables: add initial support to generation of CRC component CRC component is composed of chunk size, and a vector of checksums for each chunk (at most chunk size bytes) composing the data file. The implementation is about computing the checksum every time the output stream of data file gets written. A write to output stream may cross the chunk boundary, so that must be handled properly. Note that CRC component will only be created if compression isn't being used. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-01 12:25:01 -03:00
Glauber Costa	cd001d208c	sstable: calculate data size It would be useful in some situations to know where does the data ends. If the file is uncompressed, this is equivalent to the file length. If the data file is compressed, this information needs to come from the compression structure. Provide a method that encodes that. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-04-30 16:04:19 -04:00
Raphael S. Carvalho	fdf50ef643	sstables: add initial support to compression Starting with LZ4, the default compressor. Stub functions were added to other compression algorithms, which should eventually be replaced with an actual implementation. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-04-19 10:07:29 +03:00
Nadav Har'El	f80ac5a629	sstables: rework compression metadata to fix test. Previously we had both a "compression" structure (read from the Compression Info file on disk) and a "compression_metadata" class with additional information, which std::move()ed parts of the compression structure. This caused problems for the simplistic sstable-writing test (which does the non-interesting thing of writing a previously-read sstable). I'm ashamed to say, fixing this was very hard, because all this code is built like a house of cards - try to change one thing, and everything falls apart. After many failed attempts in trying to improve this code, what I ended up doing is simply extending the "compression" structure - the extended part isn't read or written, but it is in the structure. We also no longer move a shared pointer to the compression structure, but rather just an ordinary pointer; The assumption is that the user will already make sure that the sstable structure will live for the durations of any processing on it - and the compression structure is just one part of this sstable structure. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-03-29 16:14:53 +03:00
Nadav Har'El	c6eb2a87ea	Move compress.{cc,hh} to sstables/ Move compress.{cc,hh} from db/ to sstables/. This makes more sense, as this code is only used for sstables (un)compression. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-03-24 16:54:58 +02:00

17 Commits
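
Note on bdd3fe61c5 (CRC component): the commit message describes computing a checksum per chunk each time the data file's output stream is written, where a single write may cross a chunk boundary. The following is a minimal sketch of that idea, not the code from the commit. The names `adler32_update` and `chunked_checksummer` are hypothetical. Adler-32 is written out by hand so the example has no zlib dependency. It also assumes that each chunk's checksum starts from the Adler-32 initial value, which the log does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Plain Adler-32 update step (modulus 65521). Hand-written stand-in for a
// library routine; pass 1 as the initial value.
inline uint32_t adler32_update(uint32_t adler, const unsigned char* data, size_t len) {
    uint32_t a = adler & 0xffff;
    uint32_t b = (adler >> 16) & 0xffff;
    for (size_t i = 0; i < len; ++i) {
        a = (a + data[i]) % 65521;
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}

// Hypothetical helper: collects one checksum per chunk of at most
// chunk_size bytes, regardless of how the writes are split.
class chunked_checksummer {
    size_t _chunk_size;
    size_t _filled = 0;       // bytes already accounted for in the current chunk
    uint32_t _current = 1;    // running checksum of the current chunk
    std::vector<uint32_t> _checksums;

    void close_chunk() {
        _checksums.push_back(_current);
        _current = 1;         // assumption: every chunk is checksummed independently
        _filled = 0;
    }
public:
    explicit chunked_checksummer(size_t chunk_size) : _chunk_size(chunk_size) {
        assert(chunk_size > 0);
    }

    // Feed one write; it is split wherever it crosses a chunk boundary.
    void write(const unsigned char* data, size_t len) {
        while (len > 0) {
            size_t n = std::min(len, _chunk_size - _filled);
            _current = adler32_update(_current, data, n);
            _filled += n;
            data += n;
            len -= n;
            if (_filled == _chunk_size) {
                close_chunk();
            }
        }
    }

    // Close the last, possibly short, chunk.
    void finish() {
        if (_filled > 0) {
            close_chunk();
        }
    }

    const std::vector<uint32_t>& checksums() const { return _checksums; }
};
```

With a chunk size of 4, writing 9 bytes yields three checksums (4 + 4 + 1 bytes), and the result is the same whether the bytes arrive in one write or in several smaller ones.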