scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 01:20:39 +00:00

Author	SHA1	Message	Date
Kefu Chai	eb7c41767b	treewide: add braces around subobject this change helps to silence the warning of `-Wmissing-braces` from clang. in general, we can initialize an object without constructor with braces. this is called aggregate initialization. but the standard does allow us to initialize each element using either copy-initialization or direct-initialization. but in our case, neither of them applies, so the clang warns like ``` suggest braces around initialization of subobject [-Werror,-Wmissing-braces] options.elements.push_back({bytes(k.begin(), k.end()), bytes(v.begin(), v.end())}); ^~~~~~~~~~~~~~~~~~~~~~~~~ { } ``` in this change, also, take the opportunity to use structured binding to simplify the related code. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-28 16:59:29 +08:00
Benny Halevy	77b70dbdb7	sstables: compressed_file_data_source_impl: get: throw malformed_sstable_exception on premature eof Currently, the reader might dereference a null pointer if the input stream reaches eof prematurely, and read_exactly returns an empty temporary_buffer. Detect this condition before dereferencing the buffer and sstables::malformed_sstable_exception. Fixes #13599 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13600	2023-04-21 07:56:58 +03:00
Botond Dénes	dd9a0a16e6	reader_concurrency_semaphore: tracked_file_impl: use request_memory() Use the recently added `request_memory()` to aquire the memory units for the I/O. This allows blocking all but one readers when memory consumption grows too high.	2023-01-17 05:27:04 -05:00
Botond Dénes	c4688563e3	sstables: track decompressed buffers Convert decompressed temporary buffers into tracked buffers just before returning them to the upper layer. This ensures these buffers are known to the reader concurrency semaphore and it has an accurate view of the actual memory consumption of reads. Fixes: #12448 Closes #12454	2023-01-08 15:34:28 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	daf028210b	build: enable -Winconsistent-missing-override warning This warning can catch a virtual function that thinks it overrides another, but doesn't, because the two functions have different signatures. This isn't very likely since most of our virtual functions override pure virtuals, but it's still worth having. Enable the warning and fix numerous violations. Closes #9347	2021-09-15 12:55:54 +03:00
Avi Kivity	331eb57e17	Revert "compression: define 'class' attribute for compression and deprecate 'sstable_compression'" This reverts commit `5571ef0d6d`. It causes rolling upgrade failures. Fixes #9055. Reopens #8948.	2021-07-28 14:14:22 +03:00
Avi Kivity	5571ef0d6d	compression: define 'class' attribute for compression and deprecate 'sstable_compression' Cassandra 3.0 deprecated the 'sstable_compression' attribute and added 'class' as a replacement. Follow by supporting both. The SSTABLE_COMPRESSION variable is renamed to SSTABLE_COMPRESSION_DEPRECATED to detect all uses and prevent future misuse. To prevent old-version nodes from seeing the new name, the compression_parameters class preserves the key name when it is constructed from an options map, and emits the same key name when asked to generate an options map. Existing unit tests are modified to use the new name, and a test is added to ensure the old name is still supported. Fixes #8948. Closes #8949	2021-07-07 19:15:20 +02:00
Raphael S. Carvalho	c7424ca6e6	sstables: Make log more useful when compressed chunk fails checksum The current log is useless when checksum fails, you don't know which compressed chunk failed checksum, the expected and the actual checksum, the size of chunk and so on. Before: compressed chunk failed checksum Now: compressed chunk of size 3735 at file offset 406491 failed checksum, expected=0, actual=1422312584 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-06-28 12:25:51 -03:00
Raphael S. Carvalho	9daf5d1ab8	sstables: Use correct exception when compressed chunk fails checksum Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-06-28 12:25:44 -03:00
Piotr Jastrzebski	10228b35c5	compress: Remove unused make_compressed_file_k_l_format_output_stream Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2021-06-27 15:12:31 +02:00
Avi Kivity	a55b434a2b	treewide: extent copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Emelyanov	01b979beca	sstables: Report buffer size from compressed file sink This change just moves the place from which the output_stream knows the compression::uncompressed_chunk_length() value. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-04 12:01:27 +03:00
Pavel Emelyanov	812eed27fe	code: Force formatting of pointer in .debug and .trace ... and tests. Printin a pointer in logs is considered to be a bad practice, so the proposal is to keep this explicit (with fmt::ptr) and allow it for .debug and .trace cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Rafael Ávila de Espíndola	13282b3d4c	sstables: Pass an output_stream to make_compressed_file_.*_format_output_stream This is a bit simpler as we don't have to pass in the options and moves the calls to make_file_output_stream to places where we can handle futures. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-06-03 10:32:46 -07:00
Avi Kivity	a4c44cab88	treewide: update concepts language from the Concepts TS to C++20 Seastar recently lost support for the experimental Concepts Technical Specification (TS) and gained support for C++20 concepts. Re-enable concepts in Scylla by updating our use of concepts to the C++20 standard. This change: - peels off uses of the GCC6_CONCEPT macro - removes inclusions of <seastar/gcc6-concepts.hh> - replaces function-style concepts (no longer supported) with equation-style concepts - semicolons added and removed as needed - deprecated std::is_pod replaced by recommended replacement - updates return type constraints to use concepts instead of type names (either std::same_as or std::convertible_to, with std::same_as chosen when possible) No attempt is made to improve the concepts; this is a specification update only. Message-Id: <20200531110254.2555854-1-avi@scylladb.com>	2020-06-02 09:12:21 +03:00
Avi Kivity	0d0ee20f76	Merge "Implement `sstable_info` API command (info on sstables)" from Calle " Refs #4726 Implement the api portion of a "describe sstables" command. Adds rest types for collecting both fixed and dynamic attributes, some grouped. Allows extensions to add attributes as well. (Hint hint) " * 'sstabledesc' of https://github.com/elcallio/scylla: api/storage_service: Add "sstable_info" command sstables/compress: Make compressor pointer accessible from compression info sstables.hh: Add attribute description API to file extension sstables.hh: Add compression component accessor sstables.hh: Make "has_component" public	2019-08-12 21:16:08 +03:00
Calle Wilund	95a8ff12e7	sstables/compress: Make compressor pointer accessible from compression info	2019-08-06 07:07:44 +00:00
Kamil Braun	7a61bcb021	Fix the value of the chunk length parameter passed to compressors This commit also fixes a bug in sstables/compress.cc, where chunk length in bytes was passed to the compressor as chunk length in kilobytes. Fortunately, none of the compressors implemented until now used this parameter. Signed-off-by: Kamil Braun <kbraun@scylladb.com>	2019-08-05 14:31:33 +02:00
Duarte Nunes	fa2b0384d2	Replace std::experimental types with C++17 std version. Replace stdx::optional and stdx::string_view with the C++ std counterparts. Some instances of boost::variant were also replaced with std::variant, namely those that called seastar::visit. Scylla now requires GCC 8 to compile. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20190108111141.5369-1-duarte@scylladb.com>	2019-01-08 13:16:36 +02:00
Juliana Oliveira	5eb76c9bc6	compress: add support for Cassandra's compression parameter This patch adds compatibility for Cassandra's "chunk_size_in_kb", as well as it keeps Scylla's "chunk_size_kb" compression parameter. Fixes #3669 Tests: unit (release) v2: use variable instead of array v3: fix commited files Signed-off-by: Juliana Oliveira <juliana@scylladb.com> Message-Id: <20181211215840.GA7379@shenzou.localdomain>	2018-12-11 23:33:27 +00:00
Tomasz Grabiec	743cf43847	sstables: Avoid checksum_combine() for the crc32 checksummer checksum_combine() is much slower than re-feeding the buffer to checksum() for the zlib CRC32 checksummer. Introduce Checksum::prefer_combine() to determine this and select more optimal behavior for given checksummer. Improves performance of memtable flush with compression enabled by 30%.	2018-11-26 18:57:33 +01:00
Tomasz Grabiec	88cf1c61ba	sstables: compress: Avoid unnecessary checksum_combine()	2018-11-26 14:31:38 +01:00
Avi Kivity	738e713edf	sstables: fix bad format string syntax Some sprint() calls use the fmt language instead of the printf syntax. Convert them all the way to format().	2018-11-01 13:16:17 +00:00
Calle Wilund	054514a47a	sstables::compress: Ensure unqualified compressor name if possible Fixes #3546 Both older origin and scylla writes "known" compressor names (i.e. those in origin namespace) unqualified (i.e. LZ4Compressor). This behaviour was not preserved in the virtualization change. But probably should be. Message-Id: <20180627110930.1619-1-calle@scylladb.com>	2018-06-27 14:16:50 +03:00
Vladimir Krivopalov	cc62ad3b69	sstables: Make compressed streams customizable on checksumming. Use either Adler32 or CRC32 while writing to or reading from a compressed stream. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-19 20:52:07 +03:00
Vladimir Krivopalov	5183294676	sstables: Move checksum calculation logic to compressed_output_stream. Previously, compressed_output_stream used to calculate checksum of the supplied chunk and pass it to the 'compression' object to combine with the full checksum calculated on prior writes. Now, all the checksum calculation happens inside compressed_output_stream and 'compression' only stores the result. This is done to loosen ties between two classes and simplify compressed_output_stream customisation with various checksum algorithms. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-19 20:52:07 +03:00
Vladimir Krivopalov	adb43959d1	sstables: Move adler32 routines under the scope of a class. This is a step towards making digest algorithm customizable at compile time. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-13 12:38:25 -07:00
Calle Wilund	74758c87cd	sstables::compress/compress: Make compression a virtual object Make a "compressor" an actual class, that can be implemented and registered via class registry. For "common" compressors, the objects will be shared, but complex implementors can be semi-stateful. sstable compression is split into two parts: The "static" config which is shared across shards, and a "local" one, which holds a compressor pointer. The latter is encapsulated, along with actual compressed data writers, in sstables/compress.cc. For compression (write), compression writer is instansiated with the settings active in table metadata. For decompression (read), compression reader is instansiated with the settings stored in sstable metadata, which can differ from the currently active table metadata. v2: * Structured patch sets differently (dependencies) * Added more comments/api descs * Added patch to move all sstable compression into compress.cc, effectively separating top-level virtual compressor object from sstable io knowledge v3: * Rebased v4: * Moved all sstable compression logic/knowledge into compress.cc (local compression). Merged the two patches (separation just confuses reader).	2018-02-07 10:11:45 +00:00
Raphael S. Carvalho	09f4ee808f	sstables/compress: Fix race condition in segmented offset reading of shared sstable Race condition was introduced by commit `028c7a0888`, which introduces chunk offset compression, because a reading state is kept in the compress structure which is supposed to be immutable and can be shared among shards owning the same sstable. So it may happen that shard A updates state while shard B relies on information previously set which leads to incorrect decompression, which in turn leads to read misbehaving. We could serialize access to at() which would only lead to contention issues for shared sstables, but that can be avoided by moving state out of compress structure which is expected to be immutable after sstable is loaded and feeded to shards that own it. Sequential accessor (wraps state and reference to segmented_offset) is added to prevent at() and push_back() interfaces from being polluted. Tests: release mode. Fixes #3148. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20180205192432.23405-1-raphaelsc@scylladb.com>	2018-02-06 12:10:10 +02:00
Michael Munday	aea5f3bd1c	sstables: fix compression on big endian systems The encoding logic was incorrect for big endian systems (shift needed to be in the opposite direction). Rather than fix that issue I have re-written the relevant code to restrict the storage format to little endian byte order on all systems. My hope is that this will be a bit easier to maintain. Message-Id: <20171211124454.77488-1-mike.munday@ibm.com>	2017-12-11 14:54:22 +02:00
Tomasz Grabiec	5def901a92	sstables: Don't register logger with the same name twice There can be one logger with given name. This was causing --logger-log-level sstable=trace to not work for the majority of log points. Message-Id: <1505902259-4561-1-git-send-email-tgrabiec@scylladb.com>	2017-09-20 16:40:06 +03:00
Botond Dénes	eec451bcf8	segmented_offsets: use _current_bucket_segment_index consistently Previously _current_bucket_segment_index was used differently depending on whether update_position_trackers() is used in a random or sequential access. In the former case was used as the absolute index of the segment (independent of the buckets) and in the latter as the relative index of the segment within its bucket. This caused problems when there was a switch between random and sequential access, meaning one could get different results for an at() call depending on what was the previous at() call. Fix this by consistently using _current_bucket_segment_index as - like its name suggest - the bucket relative segment index. Ref #1946. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <7f68ac1d32c80e8dea6dfa11be02acaa961bce2a.1503924927.git.bdenes@scylladb.com>	2017-08-28 16:14:25 +03:00
Avi Kivity	fa8d0fe4d0	Revert "Revert "Revert "Revert "Merge "Compress in-memory compression-info" from Botond"""" This reverts commit `238877a0c6`. A fix was found and will be committed shortly.	2017-08-28 16:14:13 +03:00
Avi Kivity	238877a0c6	Revert "Revert "Revert "Merge "Compress in-memory compression-info" from Botond""" This reverts commit `9d27455744`. It's still broken. To reproduce: ./tools/bin/cassandra-stress write -schema compression=LZ4Compressor (on a clean database) .0 0x00007ffff32aa69b in raise () from /lib64/libc.so.6 .1 0x00007ffff32ac4a0 in abort () from /lib64/libc.so.6 .2 0x000000000054a0e8 in seastar::memory::abort_on_underflow (size=<optimized out>) at core/memory.cc:1189 .3 seastar::memory::allocate_large (size=<optimized out>) at core/memory.cc:1194 .4 0x000000000054b305 in seastar::memory::allocate (size=size@entry=18446744073702885265) at core/memory.cc:1227 .5 0x000000000054b45e in malloc (n=n@entry=18446744073702885265) at core/memory.cc:1452 .6 0x00000000006013e4 in seastar::temporary_buffer<char>::temporary_buffer (this=0x6010195fc800, size=18446744073702885265) at /home/avi/urchin/seastar/core/temporary_buffer.hh:72 .7 0x0000000000a3908b in seastar::input_stream<char>::read_exactly (this=0x6010053d0248, n=18446744073702885265) at /home/avi/urchin/seastar/core/iostream-impl.hh:189 .8 0x0000000000a9c77f in compressed_file_data_source_impl::get (this=0x6010053d0240) at sstables/compress.cc:499 .9 0x0000000000aa1b01 in seastar::data_source::get (this=<optimized out>) at /home/avi/urchin/seastar/core/iostream.hh:63 .10 seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context>(sstables::data_consume_rows_context&)::{lambda()#1}::operator()() const (__closure=__closure@entry=0x6010195fcab0) at /home/avi/urchin/seastar/core/iostream-impl.hh:204 .11 0x0000000000aa22f0 in seastar::futurize<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >::apply<seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context>(sstables::data_consume_rows_context&)::{lambda()#1}&>(sstables::data_consume_rows_context&&) (func=...) at /home/avi/urchin/seastar/core/future.hh:1312 .12 seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context>(sstables::data_consume_rows_context&)::{lambda()#1}>(sstables::data_consume_rows_context&&) (action=...) at /home/avi/urchin/seastar/core/future-util.hh:203 .13 0x0000000000a9e730 in seastar::input_stream<char>::consume<sstables::data_consume_rows_context> (consumer=..., this=<optimized out>) at /home/avi/urchin/seastar/core/iostream-impl.hh:237 .14 data_consumer::continuous_data_consumer<sstables::data_consume_rows_context>::consume_input<sstables::data_consume_rows_context> (c=..., this=<optimized out>) at sstables/consumer.hh:226 .15 sstables::data_consume_context::impl::read (this=<optimized out>) at sstables/row.cc:411 .16 sstables::data_consume_context::read (this=<optimized out>) at sstables/row.cc:437 .17 0x0000000000aafbae in sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const (__closure=<optimized out>) at sstables/partition.cc:843 .18 seastar::apply_helper<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#2}&&, std::tuple) (args=..., func=...) at ./seastar/core/apply.hh:36 .19 seastar::apply<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) (args=..., func=...) at ./seastar/core/apply.hh:44 .20 seastar::futurize<seastar::future<> >::apply<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) (args=..., func=...) at ./seastar/core/future.hh:1302 .21 seastar::future<>::then<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}, seastar::future<> >(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}&&) ( this=this@entry=0x6010195fcbb0, func=...) at ./seastar/core/future.hh:890 .22 0x0000000000ac273f in sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const (__closure=0x6010195fcc28) at sstables/partition.cc:843 .23 seastar::do_until_continued<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}, sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#1}>(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#1}&&, sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}&&, seastar::promise<>) (stop_cond=..., action=..., p=...) at /home/avi/urchin/seastar/core/future-util.hh:155 .24 0x0000000000ac29c3 in seastar::do_until<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}, sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#1}>(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#1}&&, sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}&&) (action=..., stop_cond=..., this=<optimized out>) at /home/avi/urchin/seastar/core/future-util.hh:330 .25 sstables::sstable_streamed_mutation::fill_buffer (this=<optimized out>) at sstables/partition.cc:844 .26 0x0000000000ad3d2b in streamed_mutation::fill_buffer (this=0x6010195fcd10) at ./streamed_mutation.hh:489 .27 consume_flattened_in_thread<stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer> >, std::function<bool (streamed_mutation const&)> >(mutation_reader&, stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer> >&, std::function<bool (streamed_mutation const&)>&&) ( (gdb) p addr $1 = { chunk_start = 13330037, chunk_len = 18446744073702885265, offset = 0 }	2017-08-27 13:32:37 +03:00
Avi Kivity	9d27455744	Revert "Revert "Merge "Compress in-memory compression-info" from Botond"" This reverts commit `9656fd79a0`. A fix is now available.	2017-08-24 13:37:35 +03:00
Tomasz Grabiec	9656fd79a0	Revert "Merge "Compress in-memory compression-info" from Botond" This reverts commit `ef85cf1cb3`, reversing changes made to `de011ece52`. Vlad reports that this causes SIGSEGV on cluster restarts. seastar::backtrace_buffer::append_backtrace() at /home/vladz/work/urchin/seastar/core/reactor.cc:274 (inlined by) print_with_backtrace at /home/vladz/work/urchin/seastar/core/reactor.cc:289 seastar::print_with_backtrace(char const) at /home/vladz/work/urchin/seastar/core/reactor.cc:296 sigsegv_action at /home/vladz/work/urchin/seastar/core/reactor.cc:3512 (inlined by) operator() at /home/vladz/work/urchin/seastar/core/reactor.cc:3498 (inlined by) _FUN at /home/vladz/work/urchin/seastar/core/reactor.cc:3494 ?? ??:0 operator()<seastar::temporary_buffer<char> > at /home/vladz/work/urchin/sstables/sstables.cc:870 (inlined by) apply at /home/vladz/work/urchin/seastar/core/apply.hh:36 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>::<lambda(auto:104)>, seastar::temporary_buffer<char> > at /home/vladz/work/urchin/seastar/core/apply.hh:44 (inlined by) do_void_futurize_apply_tuple<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>::<lambda(auto:104)>, seastar::temporary_buffer<char> > at /home/vladz/work/urchin/seastar/core/future.hh:1270 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>::<lambda(auto:104)>, seastar::temporary_buffer<char> > at /home/vladz/work/urchin/seastar/core/future.hh:1290 (inlined by) then<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>::<lambda(auto:104)> > at /home/vladz/work/urchin/seastar/core/future.hh:890 (inlined by) operator() at /home/vladz/work/urchin/sstables/sstables.cc:873 (inlined by) do_until_continued<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>, sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>&> at /home/vladz/work/urchin/seastar/core/future-util.hh:155 do_until<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>, sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>&> at /home/vladz/work/urchin/seastar/core/future-util.hh:330 (inlined by) operator() at /home/vladz/work/urchin/sstables/sstables.cc:874 (inlined by) apply at /home/vladz/work/urchin/seastar/core/apply.hh:36 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()> > at /home/vladz/work/urchin/seastar/core/apply.hh:44 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()> > at /home/vladz/work/urchin/seastar/core/future.hh:1302 then<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()> > at /home/vladz/work/urchin/seastar/core/future.hh:890 (inlined by) operator() at /home/vladz/work/urchin/sstables/sstables.cc:875 (inlined by) apply at /home/vladz/work/urchin/seastar/core/apply.hh:36 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()> > at /home/vladz/work/urchin/seastar/core/apply.hh:44 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()> > at /home/vladz/work/urchin/seastar/core/future.hh:1302 operator()<seastar::future_state<> > at /home/vladz/work/urchin/seastar/core/future.hh:900 (inlined by) run at /home/vladz/work/urchin/seastar/core/future.hh:395 seastar::reactor::run_tasks(seastar::circular_buffer<std::unique_ptr<seastar::task, std::default_delete<seastar::task> >, std::allocator<std::unique_ptr<seastar::task, std::default_delete<seastar::task> > > >&) at /home/vladz/work/urchin/seastar/core/reactor.cc:2317 seastar::reactor::run() at /home/vladz/work/urchin/seastar/core/reactor.cc:2775 seastar::app_template::run_deprecated(int, char*, std::function<void ()>&&) at /home/vladz/work/urchin/seastar/core/app-template.cc:142	2017-08-24 11:44:14 +02:00
Botond Dénes	028c7a0888	Optimise the storage of compression chunk offsets To reduce the memory footprint of compression-info, n offsets are grouped together into segments, where each segment stores a base absolute offset into the file, the other offsets in the segments being relative offsets (and thus of reduced size). Also offsets are allocated only just enough bits to store their maximum value. The offsets are thus packed in a buffer like so: arrrarrrarrr... where n is 4, a is an absolute offset and r are offsets relative to a. The optimal value of n can be calculated for a given file_size (f) and chunk_size (c), by finding the minima of the following function: f(n) = (f/c)/n * (log2(f) + (n - 1)log2((n-1)(c + 64))) This is done in an empirical way, using a script (see below). Furthermore segments are stored in buckets, where each bucket has its own base offset. Each bucket therefore can address an equal chunk of the file and furthermore each segment in a bucket can address an equal sub-chunk of this area. The value of a given offset i is thus: bucket_base_offset_for(i) + segment_base_offset_for(i) + offset(i) To account for the bucketed storage we calculate a local_f, which is optimized so that a bucketful of segmented offsets can address the largest possible chunk of f. As value of this local_f only depends on the bucket_size (b) and c the value of n can be made independent of f and therefore only depend on one dynamic value, c. This makes life much simpler as we don't need to know the size of the file up-front, we can just append buckets to the storage on demand, while the required storage is still less than a third [1] of the original storage requirements (std::deque<uint64>). The table with the minima(f(n)) for different f and c values is pre-computed by gen_segmented_compress_params.py and stored in sstables/segmented_compress_params.hh. This script also creates a table with the best values of local_f for the given bucket_size. At runtime we only select the best params based on c. [1] This was calculated for c=4K and b=4K	2017-08-21 17:06:12 +03:00
Avi Kivity	ebaeefa02b	Merge seatar upstream (seastar namespace) - introcduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Nadav Har'El	506e074ba4	sstable decompression: fix skip() to end of file The skip() implementation for the compressed file input stream incorrectly handled the case of skipping to the end of file: In that case we just need to update the file pointer, but not skip anywhere in the compressed disk file; In particular, we must NOT call locate() to find the relevant on-disk compressed chunk, because there is none - locate() can only be called on actual positions of bytes, not on the one-past-end-of-file position. Fixes #2143 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20170308100057.23316-1-nyh@scylladb.com>	2017-03-08 12:35:05 +02:00
Raphael S. Carvalho	c26090a6b2	sstables/compress: fix error message for snappy uncompression Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <898ad07db705355bdbf780afdb3aa982b8ca3823.1482364125.git.raphaelsc@scylladb.com>	2016-12-22 09:08:34 +01:00
Duarte Nunes	cc3f26c993	lz4: Conditionally use LZ4_compress_default() Since not all distributions have a version of LZ4 with LZ4_compress_default(), we use it conditionally. This is specially important beginning with version 1.7.3 of LZ4, which deprecates the LZ4_compress() function in favour of LZ4_compress_default() and thus prevents Scylla from compiling due to the deprecated warning. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20161124092339.23017-1-duarte@scylladb.com>	2016-11-24 11:25:03 +02:00
Paweł Dziepak	2124d08b88	sstables: add skip() to compressed_file_data_source Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-10-19 15:29:08 +01:00
Avi Kivity	106e3703d9	sstables: stop using unaligned_cast unaligned_cast violates strict aliasing, and causes code misgeneration on gcc 6. Replace it with read_be/write_be, which are nicer anyway. Message-Id: <1469122850-7511-1-git-send-email-avi@scylladb.com>	2016-07-22 07:03:08 +01:00
Paweł Dziepak	4513f8b52c	sstables: add compressed_file_data_source_impl::close() compressed_file_data_source_impl should close the underlying data source properly when asked to. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-30 13:07:07 +01:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Nadav Har'El	02ba8ffbe8	Allow uncompression at end of file Asking to read from byte 100 when a file has 50 bytes is an obvious error. But what if we ask to read from byte 50? What if we ask to read 0 bytes at byte 50? :-) Before this patch, code which asked to read from the EOF position would get an exception. After this patch, it would simply read nothing, without error. This allows, for example, reading 0 bytes from position 0 on a file with 0 bytes, which apparently happened in issue #1039... A read which starts at a position higher than the EOF position still generates an exception. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1458137867-10998-1-git-send-email-nyh@scylladb.com>	2016-03-16 17:50:23 +02:00
Nadav Har'El	73297c7872	Fix out-of-range exception when uncompressing 0 bytes The uncompression code reads the compressed chunks containing the bytes pos through pos + len - 1. This, however, is not correct when len==0, and pos + len - 1 may even be -1, causing an out-of-range exception when calling locate() to find the chunks containing this byte position. So we need to treat len==0 specially, and in this case we don't read anything, and don't need to locate() the chunks to read. Refs #1039. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1458135987-10200-1-git-send-email-nyh@scylladb.com>	2016-03-16 15:54:48 +02:00
Nadav Har'El	1b4f8842ee	sstable: fix compressed data file overread Since commit `2f56577` ("sstables: more efficient read of compressed data file"), the compressed_file_input_stream uses a file_input_stream to efficiently read the compressed data at chunks some desired size (128 KB is our default) instead of at smaller compressed chunks. However, I had a bug where I mis-calculated the desired length of the read (giving the end byte instead of the length!) and as a result file_input_stream did not know where the read was supposed to stop, and always read 128 KB buffers. The results were not incorrect, because the sstable reader stops when it needs to, even if given too much data. But it was inefficient because too much data was read in the last buffer. With this patch, the length is correctly given to the input stream, and it can read a much smaller buffer at the end of the read, not the full 128 KB. I tested that this actually happens. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1457633616-15193-1-git-send-email-nyh@scylladb.com>	2016-03-11 09:17:50 +02:00
Nadav Har'El	2f56577794	sstables: more efficient read of compressed data file Before this patch, reading large ranges from a compressed data file involved two inefficiencies: 1. The compressed data file was read one compressed chunk at a time. Such a chunk is around 30 KB in size, well below our desired sstable read-ahead size (sstable_buffer_size = 128 KB). 2. Because the compressed chunks have variable length (the uncompressed chunk has a fixed length) they are not aligned to disk blocks, so consecutive chunks have overlapping blocks which were unnecessarily read twice. The fix for both issues is to build the compressed_file_input_stream on an existing file_input_stream, instead of using direct file IO to read the individual chunks. file_input_stream takes care of doing the appropriate amount of read-ahead, and the compressed_file_input_stream layer does the decompression of the data read from the underlying layer. Fixes #992. Historical note: Implementing compressed_file_input_stream on top of file_input_stream was already tried in the past, and rejected. The problem at that time was that compressed_file_input_stream's constructor did not specify the end of the range to read, so that when we wanted to read only a small range we got too much read-ahead beyond the exactly one compressed chunk that we needed to read. Following the fix to issue #964, we now know on every streaming read also the intended end of the stream, so we can now use this to stop reading at the end of the last required chunk, even when we use a read-ahead buffer much larger than a chunk. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1457304335-8507-1-git-send-email-nyh@scylladb.com>	2016-03-09 10:14:15 +02:00

1 2

65 Commits