Instead of lengthy license blurbs, switch to single-line, machine-readable,
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the last case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes were applied mechanically with a script, except to
licenses/README.md.
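For illustration, a dual-licensed C++ source file then carries a single
header comment like this (mirroring the expression chosen above):

    // SPDX-License-Identifier: (AGPL-3.0-or-later and Apache-2.0)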
Closes #9937
This is a bit simpler, as we don't have to pass in the options, and it
moves the calls to make_file_output_stream to places where we can
handle futures.
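A minimal sketch of the new call shape, assuming a coroutine context at
the call site (everything except make_file_output_stream itself is
illustrative):

    #include <seastar/core/fstream.hh>
    #include <seastar/core/coroutine.hh>

    seastar::future<> write_component(seastar::file f) {
        // make_file_output_stream returns a future; we resolve it here
        // instead of threading stream options through the callers.
        seastar::output_stream<char> out =
                co_await seastar::make_file_output_stream(std::move(f));
        co_await out.write("data", 4);
        co_await out.close();
    }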
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
This removes the need to include reactor.hh, a source of compile
time bloat.
In some places, the call is qualified with seastar:: in order
to resolve ambiguities with a local name.
Includes are adjusted to make everything compile. We end up
having 14 translation units including reactor.hh, primarily for
deprecated things like reactor::at_exit().
Ref #1
"
Refs #4726
Implement the API portion of a "describe sstables" command.
Adds REST types for collecting both fixed and dynamic attributes, some
grouped. Allows extensions to add attributes as well. (Hint hint)
"
* 'sstabledesc' of https://github.com/elcallio/scylla:
api/storage_service: Add "sstable_info" command
sstables/compress: Make compressor pointer accessible from compression info
sstables.hh: Add attribute description API to file extension
sstables.hh: Add compression component accessor
sstables.hh: Make "has_component" public
This adds the option to compress sstables using the Zstandard algorithm
(https://facebook.github.io/zstd/).
To use, pass 'sstable_compression': 'org.apache.cassandra.io.compress.ZstdCompressor'
to the 'compression' argument when creating a table.
You can also specify a 'compression_level'. See Zstd documentation for the available
compression levels.
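For example (keyspace, table and level below are illustrative):

    CREATE TABLE ks.t (pk int PRIMARY KEY, v text)
        WITH compression = {
            'sstable_compression': 'org.apache.cassandra.io.compress.ZstdCompressor',
            'compression_level': 3
        };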
Resolves #2613.
Signed-off-by: Kamil Braun <kbraun@scylladb.com>
Previously, compressed_output_stream used to calculate checksum of the
supplied chunk and pass it to the 'compression' object to combine with
the full checksum calculated on prior writes.
Now, all the checksum calculation happens inside
compressed_output_stream and 'compression' only stores the result.
This is done to loosen the ties between the two classes and to simplify
customisation of compressed_output_stream with various checksum algorithms.
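A minimal sketch of the new division of labour (all names hypothetical;
the placeholder checksum stands in for the real algorithm):

    #include <cstdint>
    #include <cstddef>

    // Placeholder only; the real code would plug in e.g. Adler32 or CRC32.
    static uint32_t update_checksum(uint32_t state, const char* p, size_t n) {
        while (n--) state = state * 31 + static_cast<unsigned char>(*p++);
        return state;
    }

    struct compression_info {
        uint32_t full_checksum = 0;   // 'compression' only stores the result
    };

    class compressed_output_stream {
        compression_info& _info;
        uint32_t _checksum = 0;
    public:
        explicit compressed_output_stream(compression_info& info) : _info(info) {}
        void write_chunk(const char* data, size_t len) {
            // All checksum calculation now lives in the stream itself, so
            // swapping in another checksum algorithm touches only one class.
            _checksum = update_checksum(_checksum, data, len);
            _info.full_checksum = _checksum;
            // ... compress and emit the chunk ...
        }
    };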
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Make a "compressor" an actual class, that can be implemented and
registered via class registry.
For "common" compressors, the objects will be shared, but complex
implementors can be semi-stateful.
sstable compression is split into two parts: The "static" config
which is shared across shards, and a "local" one, which holds
a compressor pointer. The latter is encapsulated, along with
actual compressed data writers, in sstables/compress.cc.
For compression (write), the compression writer is instantiated
with the settings active in the table metadata.
For decompression (read), the compression reader is instantiated
with the settings stored in the sstable metadata, which can
differ from the currently active table metadata.
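A rough sketch of the resulting shape (hypothetical names; the real
class registry works differently in detail):

    #include <cstddef>
    #include <functional>
    #include <map>
    #include <memory>
    #include <string>

    // The virtual compressor interface; implementations register by name.
    class compressor {
    public:
        virtual ~compressor() = default;
        virtual size_t compress(const char* in, size_t in_len,
                                char* out, size_t out_len) const = 0;
        virtual size_t uncompress(const char* in, size_t in_len,
                                  char* out, size_t out_len) const = 0;
    };

    // Stand-in for the class registry: "common" compressors can hand out
    // one shared instance, while complex implementors construct
    // semi-stateful objects on demand.
    std::map<std::string, std::function<std::shared_ptr<compressor>()>>&
    compressor_registry() {
        static std::map<std::string,
                        std::function<std::shared_ptr<compressor>()>> registry;
        return registry;
    }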
v2:
* Structured patch sets differently (dependencies)
* Added more comments/API descriptions
* Added patch to move all sstable compression into compress.cc,
effectively separating top-level virtual compressor object
from sstable io knowledge
v3:
* Rebased
v4:
* Moved all sstable compression logic/knowledge into
compress.cc (local compression). Merged the two patches
(the separation just confused the reader).
The race condition was introduced by commit 028c7a0888, which introduced chunk
offset compression: reading state is kept in the compress structure, which is
supposed to be immutable and can be shared among shards owning the same sstable.
So it may happen that shard A updates the state while shard B relies on information
previously set, which leads to incorrect decompression, which in turn leads to
reads misbehaving.
We could serialize access to at(), but that would only lead to contention for
shared sstables; instead, the state is moved out of the compress structure,
which is expected to be immutable after the sstable is loaded and fed to the
shards that own it. A sequential accessor (wrapping the state and a reference
to the segmented_offset) is added to prevent the at() and push_back()
interfaces from being polluted.
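A hedged sketch of the idea (illustrative types; the real storage is the
bit-packed structure described elsewhere in this series):

    #include <cstdint>
    #include <cstddef>
    #include <vector>

    // The shared offset storage stays immutable in the compress structure;
    // each reader keeps its own cursor here, so shards owning the same
    // sstable no longer race on shared reading state.
    class sequential_accessor {
        const std::vector<uint64_t>& _offsets;  // shared, never mutated
        size_t _pos = 0;                        // per-reader state
    public:
        explicit sequential_accessor(const std::vector<uint64_t>& offsets)
            : _offsets(offsets) {}
        uint64_t next() { return _offsets[_pos++]; }
    };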
Tests: release mode.
Fixes#3148.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180205192432.23405-1-raphaelsc@scylladb.com>
To reduce the memory footprint of compression-info, n offsets are
grouped together into segments, where each segment stores a base
absolute offset into the file, the other offsets in the segments being
relative offsets (and thus of reduced size). Also offsets are
allocated only just enough bits to store their maximum value. The
offsets are thus packed in a buffer like so:
arrrarrrarrr...
where n is 4, a is an absolute offset and r are offsets relative to a.
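A toy model of this layout, ignoring the bucket level introduced below
and the bit packing (plain vectors are used for clarity):

    #include <cstdint>
    #include <cstddef>
    #include <vector>

    struct segmented_offsets {
        static constexpr size_t n = 4;      // offsets per segment
        std::vector<uint64_t> absolute;     // one "a" per segment
        std::vector<uint32_t> relative;     // n - 1 "r"s per segment

        uint64_t at(size_t i) const {
            size_t seg = i / n, k = i % n;
            return k == 0 ? absolute[seg]
                          : absolute[seg] + relative[seg * (n - 1) + (k - 1)];
        }
    };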
The optimal value of n can be calculated for a given file_size (f) and
chunk_size (c) by finding the minimum of the following function:
f(n) = (f/c)/n * (log2(f) + (n - 1)*log2((n - 1)*(c + 64)))
This is done in an empirical way, using a script (see below).
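For illustration, the same search can be done by direct evaluation of
f(n) (example sizes assumed; the script does this over many f/c pairs):

    #include <cmath>
    #include <cstdio>

    int main() {
        double f = 1e9, c = 4096;       // file size and chunk size in bytes
        double best_n = 2, best = 1e300;
        for (double n = 2; n <= 64; ++n) {
            double bits = (f / c) / n
                    * (std::log2(f) + (n - 1) * std::log2((n - 1) * (c + 64)));
            if (bits < best) { best = bits; best_n = n; }
        }
        std::printf("best n = %g\n", best_n);
    }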
Furthermore segments are stored in buckets, where each bucket has its
own base offset. Each bucket therefore can address an equal chunk of the
file and furthermore each segment in a bucket can address an equal
sub-chunk of this area.
The value of a given offset i is thus:
bucket_base_offset_for(i) + segment_base_offset_for(i) + offset(i)
To account for the bucketed storage we calculate a local_f, which is
optimized so that a bucketful of segmented offsets can address the
largest possible chunk of f. As the value of this local_f only depends on
the bucket_size (b) and c, the value of n can be made independent of f
and therefore depends only on one dynamic value, c. This makes life much
simpler as we don't need to know the size of the file up-front, we can
just append buckets to the storage on demand, while the required storage
is still less than a third [1] of the original storage requirements
(std::deque<uint64>).
The table with the minima of f(n) for different f and c values is
pre-computed by gen_segmented_compress_params.py and
stored in sstables/segmented_compress_params.hh. This script also
creates a table with the best values of local_f for the given
bucket_size. At runtime we only select the best params based on c.
[1] This was calculated for c=4K and b=4K
Before this patch, reading large ranges from a compressed data file involved
two inefficiencies:
1. The compressed data file was read one compressed chunk at a time.
Such a chunk is around 30 KB in size, well below our desired sstable
read-ahead size (sstable_buffer_size = 128 KB).
2. Because the compressed chunks have variable length (the uncompressed
chunk has a fixed length) they are not aligned to disk blocks, so
consecutive chunks have overlapping blocks which were unnecessarily
read twice.
The fix for both issues is to build the compressed_file_input_stream on
an existing file_input_stream, instead of using direct file IO to read the
individual chunks. file_input_stream takes care of doing the appropriate
amount of read-ahead, and the compressed_file_input_stream layer does the
decompression of the data read from the underlying layer.
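A simplified sketch of the layering, to the best of my understanding of
the Seastar API (the decompressing layer on top is elided):

    #include <cstdint>
    #include <seastar/core/fstream.hh>
    #include <seastar/core/iostream.hh>

    // The underlying file_input_stream covers exactly the compressed byte
    // range [start, end) of the chunks we need; it does the read-ahead,
    // and the layer above only decompresses what it is handed.
    seastar::input_stream<char>
    make_compressed_range_stream(seastar::file f, uint64_t start, uint64_t end) {
        seastar::file_input_stream_options options;
        options.buffer_size = 128 * 1024;   // sstable_buffer_size
        return seastar::make_file_input_stream(std::move(f), start,
                                               end - start, options);
    }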
Fixes#992.
Historical note: Implementing compressed_file_input_stream on top of
file_input_stream was already tried in the past, and rejected. The problem
at that time was that compressed_file_input_stream's constructor did not
specify the *end* of the range to read, so that when we wanted to read
only a small range we got too much read-ahead beyond the exactly one
compressed chunk that we needed to read. Following the fix to issue #964,
we now know on every streaming read also the intended *end* of the stream,
so we can now use this to stop reading at the end of the last required
chunk, even when we use a read-ahead buffer much larger than a chunk.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1457304335-8507-1-git-send-email-nyh@scylladb.com>
The entire SSTable read path can now take an io_priority. The public
functions take a default parameter, which is Seastar's default priority.
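In signature form, the pattern is roughly this (function name is
illustrative; io_priority_class and default_priority_class() are Seastar's):

    // Public read entry point with a defaulted priority parameter.
    future<temporary_buffer<char>> read_component(
            uint64_t pos, size_t len,
            const io_priority_class& pc = default_priority_class());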
Signed-off-by: Glauber Costa <glauber@scylladb.com>
recently, "file" started to use a shared_ptr internally, and is already
copy-able and reference counted, and there is no reason to use
lw_shared_ptr<file>. This patch cleans up a few remaining places where
lw_shared_ptr<file> was used.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
If compression is used, we should provide both the uncompressed and
compressed lengths to the metadata collector, so that the ratio can
be computed. Stats metadata stores the compression ratio.
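i.e. (a trivial sketch; the collector method names are not shown):

    // With both lengths available the collector can derive
    //   ratio = compressed_length / uncompressed_length
    double compression_ratio(uint64_t compressed, uint64_t uncompressed) {
        return double(compressed) / double(uncompressed);
    }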
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
lz4 is the only compressor algorithm supported so far; the deflate and
snappy algorithms are still missing.
Adding them should be relatively easy though.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
The compressor field says which compression algorithm must be used for
compressing the sstable data file.
This is a small step towards a compressed sstable data file.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
We always return a future, but with the threaded writer, we can get rid of
that. So while reads will still return a future, the writer will be able to
return void.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
The CRC component is composed of the chunk size and a vector of checksums,
one for each chunk (of at most chunk-size bytes) composing the data file.
The implementation computes the checksum every time the data file's
output stream gets written. A write to the output stream may cross a
chunk boundary, so that must be handled properly.
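A hedged sketch of the boundary handling (names and the placeholder
checksum are illustrative; the real component uses a proper CRC):

    #include <algorithm>
    #include <cstdint>
    #include <cstddef>
    #include <vector>

    static uint32_t update_crc(uint32_t state, const char* p, size_t n) {
        while (n--) state = state * 31 + static_cast<unsigned char>(*p++);
        return state;  // placeholder for the real CRC update
    }

    struct crc_component {
        size_t chunk_size;
        std::vector<uint32_t> checksums;    // one per data-file chunk
    };

    // Split each write at chunk boundaries, updating the running checksum
    // and sealing a chunk's checksum whenever the chunk fills up.
    void on_write(crc_component& crc, size_t& in_chunk, uint32_t& state,
                  const char* data, size_t len) {
        while (len > 0) {
            size_t take = std::min(len, crc.chunk_size - in_chunk);
            state = update_crc(state, data, take);
            in_chunk += take; data += take; len -= take;
            if (in_chunk == crc.chunk_size) {
                crc.checksums.push_back(state);
                state = 0;
                in_chunk = 0;
            }
        }
    }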
Note that CRC component will only be created if compression isn't
being used.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
It would be useful in some situations to know where the data ends. If the
file is uncompressed, this is equivalent to the file length. If the data file
is compressed, this information needs to come from the compression structure.
Provide a method that encodes that.
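i.e. something along these lines (types and names are stand-ins):

    #include <cstdint>
    #include <optional>

    struct compression_info {               // stand-in for the real struct
        uint64_t uncompressed_file_length;
    };

    // Where the data ends, per the rule above: the file length when
    // uncompressed, otherwise taken from the compression structure.
    uint64_t data_end_position(uint64_t file_length,
                               const std::optional<compression_info>& c) {
        return c ? c->uncompressed_file_length : file_length;
    }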
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Starting with LZ4, the default compressor.
Stub functions were added to other compression algorithms, which should
eventually be replaced with an actual implementation.
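For reference, a minimal round-trip through the modern lz4 C API (error
handling reduced to asserts):

    #include <lz4.h>
    #include <cassert>
    #include <cstring>
    #include <vector>

    int main() {
        const char in[] = "the quick brown fox jumps over the lazy dog";
        const int in_len = sizeof(in);

        std::vector<char> comp(LZ4_compressBound(in_len));
        int comp_len = LZ4_compress_default(in, comp.data(), in_len, comp.size());
        assert(comp_len > 0);

        std::vector<char> out(in_len);
        int out_len = LZ4_decompress_safe(comp.data(), out.data(),
                                          comp_len, out.size());
        assert(out_len == in_len && std::memcmp(in, out.data(), in_len) == 0);
    }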
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
Previously we had both a "compression" structure (read from the Compression
Info file on disk) and a "compression_metadata" class with additional
information, which std::move()ed parts of the compression structure.
This caused problems for the simplistic sstable-writing test (which does
the non-interesting thing of writing a previously-read sstable).
I'm ashamed to say, fixing this was very hard, because all this code is
built like a house of cards - try to change one thing, and everything
falls apart. After many failed attempts in trying to improve this code, what
I ended up doing is simply *extending* the "compression" structure - the
extended part isn't read or written, but it is in the structure.
We also no longer move a shared pointer to the compression structure,
but rather just an ordinary pointer; the assumption is that the user
will already make sure that the sstable structure lives for the
duration of any processing on it - and the compression structure is just
one part of this sstable structure.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Move compress.{cc,hh} from db/ to sstables/. This makes more sense, as
this code is only used for sstable (un)compression.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>