As Nadav notes, we use the chunk length as the buffer size for the compressed
stream too.
Fix by using it only for the outer (uncompressed) stream; the inner
(compressed) stream uses the sstable buffer size, 128 KiB.
Fixes #1402.
Message-Id: <1467910556-5759-1-git-send-email-avi@scylladb.com>
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
All variants of write_component now take an io_priority. The public
interfaces are by default set to Seastar's default priority.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Arguments buffer_size and true were accidentally inverted.
GCC wasn't complaining because implicit conversion of bool to
int, and vice versa, is valid.
However, this conversion is not very safe because we could
accidentally invert parameters.
This should fix the last problem with sstable_test.
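To illustrate the class of bug, here is a minimal sketch. The names make_options, make_options_strict, stream_options and trim are hypothetical and are not the functions touched by this patch. It shows how a swapped (size, flag) pair compiles silently, and how a scoped enum would reject the swap at compile time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical option bundle: a buffer size plus a boolean flag.
struct stream_options {
    std::size_t buffer_size;
    bool trim_to_size;
};

// Loose signature: bool converts implicitly to an integer and an integer
// converts implicitly to bool, so swapped arguments are still accepted.
inline stream_options make_options(std::size_t buffer_size, bool trim_to_size) {
    return {buffer_size, trim_to_size};
}

// Stricter alternative: a scoped enum has no implicit conversion from bool
// or from integers, so make_options_strict(trim::yes, 4096) does not compile.
enum class trim { no, yes };

inline stream_options make_options_strict(std::size_t buffer_size, trim t) {
    return {buffer_size, t == trim::yes};
}
```

With the loose signature, make_options(true, 4096) yields a buffer size of 1 and a flag of true, which is the kind of silent inversion described above.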
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <9478cd266006fdf8a7bd806f1c612ec9d1297c1f.1453301866.git.raphaelsc@scylladb.com>
buf is a stack variable, so it may be destroyed by the time it's
used by output_stream::write().
Spotted while auditing the code.
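A minimal sketch of the lifetime problem, assuming a hypothetical deferred_sink that stands in for an asynchronous output stream (it is not Seastar's output_stream). The write is only recorded at call time and the bytes are read later, so the buffer must be owned by the pending operation rather than live on the caller's stack.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical asynchronous sink: write calls only queue work, and the
// bytes are consumed later, when run_pending() is called.
class deferred_sink {
    std::vector<std::function<void()>> _pending;
    std::string _out;
public:
    // Unsafe for stack buffers: only the pointer is stored, so the caller
    // must keep the data alive until run_pending() has finished.
    void write(const char* data, std::size_t len) {
        _pending.push_back([this, data, len] { _out.append(data, len); });
    }
    // Safe variant: the queued operation owns its own copy of the buffer.
    void write_owned(std::string buf) {
        _pending.push_back([this, b = std::move(buf)] { _out.append(b); });
    }
    void run_pending() {
        for (auto& op : _pending) {
            op();
        }
        _pending.clear();
    }
    const std::string& contents() const { return _out; }
};

// The local buffer is moved into the sink before this function returns,
// so nothing dangles when the queued write finally runs.
inline void enqueue_size(deferred_sink& sink, unsigned char hi, unsigned char lo) {
    std::string buf;
    buf.push_back(static_cast<char>(hi));
    buf.push_back(static_cast<char>(lo));
    sink.write_owned(std::move(buf));
}
```

Calling sink.write() with a local array and then returning would leave the queued operation pointing at a dead stack frame, which is the pattern this patch removes.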
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
- no need to mark us as a friend of file_writer
- should be initializing the fields directly instead of in the constructor's body.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
What we are doing now is computing the checksum at every write() operation, possibly
over a small number of bytes, like 2 or 4, since we write those a lot as sizes.
While adler32 allows those computations and makes them very easy, that doesn't mean
they are efficient. It is a lot more efficient to compute the checksum on larger
buffers.
We can do that by doing it at put() time in a data_sink_impl, instead of
keeping that in the file abstraction. The code for the checksum itself now also
becomes remarkably simpler - since there is no need anymore to keep state:
we'll always be presented with full buffers.
The data sink implementation and the file_writer share the full_checksum and
the checksum struct variables; with that in place, the file writer can
still expose the final results of the computation in the same way it does at
present.
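The idea can be sketched as follows. This is an illustration under assumptions, not the code in this patch: adler32 here is a plain textbook implementation of the RFC 1950 checksum, and checksummed_sink with its put() method is a hypothetical stand-in for the data_sink_impl. The checksum is updated once per buffer handed to put(), not once per small write().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Textbook Adler-32 (RFC 1950). Pass 1 as the initial value; passing the
// previous result continues a running checksum over more data.
inline std::uint32_t adler32(std::uint32_t adler, const char* data, std::size_t len) {
    std::uint32_t a = adler & 0xffff;
    std::uint32_t b = (adler >> 16) & 0xffff;
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + static_cast<unsigned char>(data[i])) % 65521;
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}

// Hypothetical sink: it only ever sees whole buffers, so it does one
// checksum pass per buffer and keeps no partial-chunk state.
struct checksummed_sink {
    std::uint32_t full_checksum = 1;          // running checksum of all data
    std::vector<std::uint32_t> per_buffer;    // one checksum per buffer seen
    std::string out;                          // stands in for the file

    void put(const std::string& buf) {
        per_buffer.push_back(adler32(1, buf.data(), buf.size()));
        full_checksum = adler32(full_checksum, buf.data(), buf.size());
        out += buf;
    }
};
```

The result is the same whether the data is checksummed in one pass or piecewise; the gain comes from making fewer, larger calls.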
Benchmarked with:
perf_sstable_g --smp 1 --iterations 30 --parallelism 1 --mode write --num_columns 5 --partitions 500000
Before:
178829.07 +- 141.28 partitions / sec (30 runs, 1 concurrent ops)
After:
199744.71 +- 201.64 partitions / sec (30 runs, 1 concurrent ops)
gain: 11.70 %
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Recently, "file" started to use a shared_ptr internally, so it is already
copyable and reference counted, and there is no reason to use
lw_shared_ptr<file>. This patch cleans up a few remaining places where
lw_shared_ptr<file> was used.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Following Nadav's discovery of the problem with large writes to output stream,
it turns out that compressed_file_output_stream also needs the trim_to_size
option enabled. Otherwise, a write to compressed_file_output_stream larger than
_size would result in a buffer larger than the chunk size being flushed, which is
definitely wrong.
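A simplified model of the behaviour, assuming a hypothetical chunked_writer class (this is not Seastar's output_stream implementation): with trimming enabled, no flushed buffer exceeds the configured size; without it, a large write passes through as a single oversized buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical buffered writer that records the size of every flushed buffer.
class chunked_writer {
    std::size_t _size;
    bool _trim_to_size;
    std::string _buf;
    std::vector<std::size_t> _flushed;
public:
    chunked_writer(std::size_t size, bool trim_to_size)
        : _size(size), _trim_to_size(trim_to_size) {}

    void write(const std::string& data) {
        if (!_trim_to_size && _buf.empty() && data.size() > _size) {
            // Without trimming, a large write is passed through unsplit.
            _flushed.push_back(data.size());
            return;
        }
        std::size_t pos = 0;
        while (pos < data.size()) {
            // Fill the current buffer up to _size, then flush it.
            std::size_t n = std::min(_size - _buf.size(), data.size() - pos);
            _buf.append(data, pos, n);
            pos += n;
            if (_buf.size() == _size) {
                flush();
            }
        }
    }

    void flush() {
        if (!_buf.empty()) {
            _flushed.push_back(_buf.size());
            _buf.clear();
        }
    }

    const std::vector<std::size_t>& flushed_sizes() const { return _flushed; }
};
```

For a compressed stream the flushed buffer is the unit of compression, so in this model only the trimmed variant keeps every unit within the chunk size.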
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
A base class with virtual functions should also have a virtual destructor,
so if someone deletes it by the base class pointer, the concrete class's
destructor will be called.
I thought this missing virtual destructor was to blame for a bug I was
hunting. It is not, but the missing definition is still worth adding.
The silly "default" definition of the move constructor is also necessary,
because when you declare the destructor explicitly, the compiler no longer
defines the move constructor implicitly for you.
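Both points can be checked with a small sketch. The class names below are hypothetical and not taken from the tree. Deleting through a base pointer reaches the derived destructor only because the base destructor is virtual, and a class with a move-only member stops being movable once a destructor is declared, unless the move constructor is defaulted explicitly.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical base class with a move-only member.
struct writer_base {
    std::unique_ptr<int> state;
    writer_base() = default;
    // Declaring the destructor suppresses the implicit move constructor,
    // so it is requested explicitly here.
    writer_base(writer_base&&) = default;
    virtual ~writer_base() = default;
    virtual int id() const { return 0; }
};

// Same shape without the defaulted move constructor: neither movable
// nor copyable, because the unique_ptr member cannot be copied.
struct writer_base_no_move {
    std::unique_ptr<int> state;
    virtual ~writer_base_no_move() = default;
};

static_assert(std::is_move_constructible<writer_base>::value,
              "defaulted move constructor keeps the class movable");
static_assert(!std::is_move_constructible<writer_base_no_move>::value,
              "a declared destructor suppresses the implicit move constructor");

// Derived class that counts how many times its destructor runs.
struct counting_writer : writer_base {
    int* destroyed;
    explicit counting_writer(int* d) : destroyed(d) {}
    ~counting_writer() override { ++*destroyed; }
    int id() const override { return 1; }
};
```

Destroying a counting_writer through a std::unique_ptr<writer_base> increments the counter, showing that the concrete class's destructor is reached.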
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
If compression is used, we should provide both the uncompressed and
compressed lengths to the metadata collector, so that the ratio can
be computed. Stats metadata stores the compression ratio.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
lz4 is the only compression algorithm supported so far;
the deflate and snappy algorithms are still missing.
Adding them should be relatively easy though.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
writer.hh includes sstables.hh, which includes writer.hh.
We can remove the reference if we include core/fstream.hh into writer.hh instead.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Optimize checksum calculation by deferring "full checksum" update until
we've computed a full per-chunk checksum.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
That was needed because new methods of the sstable class will have a
file writer as a parameter, and thus the definition of the file
writer must be available from the sstables header.