Before this patch, reading large ranges from a compressed data file involved
two inefficiencies:
1. The compressed data file was read one compressed chunk at a time.
Such a chunk is around 30 KB in size, well below our desired sstable
read-ahead size (sstable_buffer_size = 128 KB).
2. Because the compressed chunks have variable length (the uncompressed
chunk has a fixed length) they are not aligned to disk blocks, so
consecutive chunks have overlapping blocks which were unnecessarily
read twice.
The fix for both issues is to build the compressed_file_input_stream on
an existing file_input_stream, instead of using direct file IO to read the
individual chunks. file_input_stream takes care of doing the appropriate
amount of read-ahead, and the compressed_file_input_stream layer
decompresses the data read from the underlying layer.
Fixes #992.
Historical note: Implementing compressed_file_input_stream on top of
file_input_stream was already tried in the past, and rejected. The problem
at that time was that compressed_file_input_stream's constructor did not
specify the *end* of the range to read, so when we wanted to read only a
small range, read-ahead fetched far more than the single compressed chunk
we actually needed. Following the fix to issue #964, every streaming read
now also specifies the intended *end* of the stream, so we can stop reading
at the end of the last required chunk, even when the read-ahead buffer is
much larger than a chunk.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1457304335-8507-1-git-send-email-nyh@scylladb.com>