2f565777945801f99622b3155cdbbae702d52bbd
Before this patch, reading large ranges from a compressed data file involved
two inefficiencies:
1. The compressed data file was read one compressed chunk at a time.
Such a chunk is around 30 KB in size, well below our desired sstable
read-ahead size (sstable_buffer_size = 128 KB).
2. Because the compressed chunks have variable length (the uncompressed
chunk has a fixed length) they are not aligned to disk blocks, so
consecutive chunks have overlapping blocks which were unnecessarily
read twice.
The fix for both issues is to build the compressed_file_input_stream on
an existing file_input_stream, instead of using direct file IO to read the
individual chunks. file_input_stream takes care of doing the appropriate
amount of read-ahead, and the compressed_file_input_stream layer does the
decompression of the data read from the underlying layer.
Fixes #992.
Historical note: Implementing compressed_file_input_stream on top of
file_input_stream was already tried in the past, and rejected. The problem
at that time was that compressed_file_input_stream's constructor did not
specify the *end* of the range to read, so that when we wanted to read
only a small range we got too much read-ahead beyond the exactly one
compressed chunk that we needed to read. Following the fix to issue #964,
we now know on every streaming read also the intended *end* of the stream,
so we can now use this to stop reading at the end of the last required
chunk, even when we use a read-ahead buffer much larger than a chunk.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1457304335-8507-1-git-send-email-nyh@scylladb.com>
#Scylla
##Building Scylla
In addition to required packages by Seastar, the following packages are required by Scylla.
Submodules
Scylla uses submodules, so make sure you pull the submodules first by doing:
git submodule init
git submodule update --recursive
Building and Running Scylla on Fedora
- Installing required packages:
sudo yum install yaml-cpp-devel lz4-devel zlib-devel snappy-devel jsoncpp-devel thrift-devel antlr3-tool antlr3-C++-devel libasan libubsan gcc-c++ gnutls-devel ninja-build ragel libaio-devel cryptopp-devel xfsprogs-devel numactl-devel hwloc-devel libpciaccess-devel libxml2-devel python3-pyparsing
- Build Scylla
./configure.py --mode=release --with=scylla --disable-xen
ninja-build build/release/scylla -j2 # you can use more cpus if you have tons of RAM
- Run Scylla
./build/release/scylla
- run Scylla with one CPU and ./tmp as data directory
./build/release/scylla --datadir tmp --commitlog-directory tmp --smp 1
- For more run options:
./build/release/scylla --help
Building Fedora RPM
As a pre-requisite, you need to install Mock on your machine:
# Install mock:
sudo yum install mock
# Add user to the "mock" group:
usermod -a -G mock $USER && newgrp mock
Then, to build an RPM, run:
./dist/redhat/build_rpm.sh
The built RPM is stored in /var/lib/mock/<configuration>/result directory.
For example, on Fedora 21 mock reports the following:
INFO: Done(scylla-server-0.00-1.fc21.src.rpm) Config(default) 20 minutes 7 seconds
INFO: Results and/or logs in: /var/lib/mock/fedora-21-x86_64/result
Building Fedora-based Docker image
Build a Docker image with:
cd dist/docker
docker build -t <image-name> .
Run the image with:
docker run -p $(hostname -i):9042:9042 -i -t <image name>
Contributing to Scylla
Do not send pull requests.
Send patches to the mailing list address scylladb-dev@googlegroups.com. Be sure to subscribe.
In order for your patches to be merged, you must sign the Contributor's License Agreement, protecting your rights and ours. See http://www.scylladb.com/opensource/cla/.
Description
Languages
C++
73.5%
Python
25.3%
CMake
0.3%
GAP
0.3%
Shell
0.3%