scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 11:30:36 +00:00

Author	SHA1	Message	Date
Calle Wilund	43f7eecf9e	compress: move compress.cc/hh to sstables/compressor Fixes #22106 Moves the shared compress components to sstables, and rename to match class type. Adjust includes, removing redundant/unneeded ones where possible. Closes scylladb/scylladb#25103	2025-07-31 13:10:41 +03:00
Ernest Zaslavsky	8d49bb8af2	sstables: Start using `make_data_or_index_source` in `sstable` Convert all necessary methods to be awaitable. Start using `make_data_or_index_source` when creating data_source for data and index components. For proper working of compressed/checksummed input streams, start passing stream creator functors to `make_(checksummed/compressed)_file_(k_l/m)_format_input_stream`.	2025-07-15 10:10:23 +03:00
Michał Chojnowski	cee504f66f	sstables/compress: discard hidden compression options after the decompressor is created Dictionary contents are kept in the list of "compression options" in the header of `CompressionInfo.db`, and they are loaded from disk into memory when the `sstable::compression` object is populated. After the decompressor for the SSTable is created based on those dict contents, they are not needed in RAM anymore. And since they take up a sizeable amount of memory, we would like to free them. In this patch, we discard all "hidden compression options" (currently: only the dictionary contents) from the `sstable::compression` object right after the decompressor is created. (Those options are not supposed to be used for anything else anyway).	2025-04-01 00:07:30 +02:00
Michał Chojnowski	10fa4abde7	compress: change compressor_ptr from shared_ptr to unique_ptr Cleanup patch. After we moved the ownership of compressors to sstables, compressor objects never have shared lifetime. `unique_ptr` is more appropriate for them than `shared_ptr` now. (And besides expressing the intent better, using `unique_ptr` prevents an accidental cross-shard `shared_ptr` copy).	2025-04-01 00:07:29 +02:00
Michał Chojnowski	006c631642	sstables/compress: remove get_sstable_compressor() Following up on the previous commit, we avoid constructing a compressor in the `sstable_info` API call, and we instead read the compression options from the `sstable::compression`.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	8e611536b0	sstables/compress: move ownership of `compressor` to `sstable::compression` SSTable readers and writers use `compressor` objects to compress and decompress chunks of SSTable data files. `compressor` objects are read-only, so only one of them is needed for each SSTable. Before this commit, each reader and writer has its own `compressor` object. This isn't necessary, but it's okay. But later in this series it will stop being okay, because the creation of a `compressor` will become an expensive cross-shard operation (because it might require sharing a compression dictionary from another shard). So we have to adjust the code so that there is only one `compressor` per sstable, not one per reader/writer. We stuff the ownership of this compressor into `sstable::compression`. To make the ownership clear, we remove `compression_ptr` shared pointers from readers and writers, and make them access the compressor via the `sstable::compression` instead.	2025-04-01 00:07:27 +02:00
Michał Chojnowski	cfe69e057f	sstables/compress: break the dependency of `compression_parameters` on `compressor` Note: this commit is meant to be a code refactoring only and is not intended to change the observable behaviour. Today `schema` contains a `compression_parameters`. `compression_parameters` contains an instance of `compressor`, and SSTable writers just share that instance. This is fine because `compressor` is a stateless object, functionally dependent on the schema. But in later parts of the series, we will break this functional dependency by adding dictionaries to compressors. Two writers for the same schema might have different dictionaries, so they won't be able to just share a single instance contained in the schema. And when that happens, having a `compressor` instance in the `schema`/`compression_parameters` will become awkward, since it won't be actually used. It will be only a container for options. In addition, for performance reasons, we will want to share some pieces of compressors across shards, which will require -- in the general case -- a construction of a compressor to be asynchronous, and therefore not possible inside the constructor of `compression_parameters`. This commit modifies `compression_parameters` so that it doesn't hold or construct instances of `compressor`. Before this patch, the `compressor` instance constructed in `compression_parameters` has an additional role of validating and holding compressor-specific options. (Today the only such option is the zstd compression level). This means that the pieces of logic responsible for compressor-specific options have to be rewritten. That ends up being the bulk of this commit.	2025-04-01 00:07:27 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Nikos Dragazis	c893f06409	sstables: Add digest check in compressed data source Following the addition of digest check in the checksummed data source, add the same feature to the compressed data source as well. This ensures consistent behavior across any type of SSTable. This is added as an optional feature so that we can preserve the current behavior, that is verify only the per-chunk checksums during normal user reads. To ensure zero cost at runtime when disabled, we introduce the on/off switch as a template parameter. The digest calculation for compressed SSTables depends on the SSTable format, hence the new template argument for the checksum mode. This is consistent with the compressed data sink. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-10-03 18:09:01 +03:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
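The idea behind aa1270a00c, an assertion that stays enabled whether or not NDEBUG is defined, can be sketched roughly as follows. `ALWAYS_ASSERT` and `checked_div` are hypothetical names used only for illustration; the actual `SCYLLA_ASSERT()` definition in utils/assert.hh is not shown on this page and may differ.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical always-on assertion: unlike assert(), it is not compiled out
// when NDEBUG is defined, so release builds still check the condition.
#define ALWAYS_ASSERT(cond) \
    do { \
        if (!(cond)) { \
            std::fprintf(stderr, "%s:%d: assertion failed: %s\n", \
                         __FILE__, __LINE__, #cond); \
            std::abort(); \
        } \
    } while (0)

// Illustrative caller: the divisor check survives in every build mode.
inline int checked_div(int a, int b) {
    ALWAYS_ASSERT(b != 0);
    return a / b;
}
```

Because the macro never consults NDEBUG, defining NDEBUG later (to silence third-party assert() calls) would not change the behaviour of code that uses it.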
Kefu Chai	9318d21a22	sstables: change const_iterator::value_type to uint64_t in general, the value_type of a `const_iterator` is `T` instead of `const T`, what has the const specifier is `reference`. because, when dereferencing an iterator, the value type does not matter any more, as it is always a copy. and GCC-14 points this out: ``` /home/kefu/dev/scylladb/sstables/compress.hh:224:13: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers] 224 | value_type operator*() const { | ^~~~~~~~~~ /home/kefu/dev/scylladb/sstables/compress.hh:228:13: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers] 228 | value_type operator[](ssize_t i) const { | ^~~~~~~~~~ ``` so, in this change, let's change the value_type to `uint64_t`. please note, it's not typical to return `value_type` from `operator*` or `operator[]` of an iterator. but due to the design of segmented_offsets, we cannot return a reference, so let's keep it this way. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19186	2024-06-09 19:21:16 +03:00
Botond Dénes	37fd568139	sstables/compress.hh: remove unused forward declaration struct compress is forward declared right before its definition. At some point in the past there was probably some code there using it, but now it's gone so remove it. Closes scylladb/scylladb#19168	2024-06-09 17:52:05 +03:00
Lakshmi Narayanan Sreethar	de6570e1ec	serializer_impl, sstables: fix build failure due to missing includes When building scylla with cmake, it fails due to missing includes in serializer_impl.hh and sstables/compress.hh files. Fix that by adding the appropriate include files. Fixes #18343 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb/scylladb#18344	2024-04-23 12:03:51 +03:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Kefu Chai	f5b05cf981	treewide: use defaulted operator!=() and operator==() in C++20, the compiler generates operator!=() if the corresponding operator==() is already defined, the language now understands that the comparison is symmetric in the new standard. fortunately, our operator!=() is always equivalent to `! operator==()`, this matches the behavior of the default generated operator!=(). so, in this change, all `operator!=` are removed. in addition to the defaulted operator!=, C++20 also brings to us the defaulted operator==() -- it is able to generate the operator==() as a member-wise lexicographical comparison. under some circumstances, this is exactly what we need. so, in this change, if the operator==() is also implemented as a lexicographical comparison of all member variables of the class/struct in question, it is implemented using the default generated one by removing its body and marking the function as `default`. moreover, if the class happens to have other comparison operators which are implemented using lexicographical comparison, the default generated `operator<=>` is used in place of the defaulted `operator==`. sometimes, we fail to mark the operator== with the `const` specifier, in this change, to fulfil the need of C++ standard, and to be more correct, the `const` specifier is added. also, to generate the defaulted operator==, the operand should be `const class_name&`, but it is not always the case, in the class of `version`, we use `version` as the parameter type, to fulfill the need of the C++ standard, the parameter type is changed to `const version&` instead. this does not change the semantic of the comparison operator. and is a more idiomatic way to pass non-trivial struct as function parameters. please note, because in C++20, both operator== and operator<=> are symmetric, some of the operators in `multiprecision` are removed. they are the symmetric form of another variant. 
if they were not removed, the compiler would, for instance, find ambiguous overloaded operator '=='. this change is a cleanup to modernize the code base with C++20 features. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13687	2023-04-27 10:24:46 +03:00
Kefu Chai	df63e2ba27	types: move types.{cc,hh} into types they are part of the CQL type system, and are "closer" to types. let's move them into "types" directory. the building systems are updated accordingly. the source files referencing `types.hh` were updated using following command: ``` find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} + ``` the source files under sstables include "types.hh", which is indeed the one located under "sstables", so include "sstables/types.hh" instead, so it's more explicit. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12926	2023-02-19 21:05:45 +02:00
Botond Dénes	c4688563e3	sstables: track decompressed buffers Convert decompressed temporary buffers into tracked buffers just before returning them to the upper layer. This ensures these buffers are known to the reader concurrency semaphore and it has an accurate view of the actual memory consumption of reads. Fixes: #12448 Closes #12454	2023-01-08 15:34:28 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Piotr Jastrzebski	10228b35c5	compress: Remove unused make_compressed_file_k_l_format_output_stream Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2021-06-27 15:12:31 +02:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Piotr Jastrzebski	bacda100ec	sstables: remove std::iterator from const_iterator std::iterator is deprecated since C++17 so define all the required iterator_traits directly. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-17 16:53:20 +01:00
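As a rough illustration of bacda100ec (and of the later `value_type` fix in 9318d21a22), an iterator can declare the five traits member types itself instead of inheriting from the deprecated `std::iterator`. The class `const_offset_iterator` and the helper `sum_offsets` below are hypothetical and only loosely modelled on the description above; they are not the code from sstables/compress.hh.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <numeric>
#include <type_traits>
#include <vector>

// Hypothetical read-only iterator that returns offsets by value.
class const_offset_iterator {
    const std::vector<std::uint64_t>* _data;
    std::size_t _pos;
public:
    // The member types std::iterator used to provide, declared directly.
    using iterator_category = std::input_iterator_tag;
    using value_type = std::uint64_t;          // plain T, not const T
    using difference_type = std::ptrdiff_t;
    using pointer = const std::uint64_t*;
    using reference = std::uint64_t;           // dereference yields a copy

    const_offset_iterator(const std::vector<std::uint64_t>& data, std::size_t pos)
        : _data(&data), _pos(pos) {}

    value_type operator*() const {
        assert(_pos < _data->size());
        return (*_data)[_pos];
    }
    const_offset_iterator& operator++() { ++_pos; return *this; }
    const_offset_iterator operator++(int) { auto tmp = *this; ++_pos; return tmp; }
    bool operator==(const const_offset_iterator& o) const { return _pos == o._pos; }
    bool operator!=(const const_offset_iterator& o) const { return !(*this == o); }
};

// std::iterator_traits picks the member types up without any base class.
static_assert(std::is_same<std::iterator_traits<const_offset_iterator>::value_type,
                           std::uint64_t>::value,
              "value_type is unqualified");

// Illustrative use with a standard algorithm.
inline std::uint64_t sum_offsets(const std::vector<std::uint64_t>& offsets) {
    return std::accumulate(const_offset_iterator(offsets, 0),
                           const_offset_iterator(offsets, offsets.size()),
                           std::uint64_t{0});
}
```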
Rafael Ávila de Espíndola	13282b3d4c	sstables: Pass an output_stream to make_compressed_file_.*_format_output_stream This is a bit simpler as we don't have to pass in the options and moves the calls to make_file_output_stream to places where we can handle futures. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-06-03 10:32:46 -07:00
Avi Kivity	88ade3110f	treewide: replace calls to engine().some_api() with some_api() This removes the need to include reactor.hh, a source of compile time bloat. In some places, the call is qualified with seastar:: in order to resolve ambiguities with a local name. Includes are adjusted to make everything compile. We end up having 14 translation units including reactor.hh, primarily for deprecated things like reactor::at_exit(). Ref #1	2020-04-05 12:46:04 +03:00
Avi Kivity	0d0ee20f76	Merge "Implement `sstable_info` API command (info on sstables)" from Calle " Refs #4726 Implement the api portion of a "describe sstables" command. Adds rest types for collecting both fixed and dynamic attributes, some grouped. Allows extensions to add attributes as well. (Hint hint) " * 'sstabledesc' of https://github.com/elcallio/scylla: api/storage_service: Add "sstable_info" command sstables/compress: Make compressor pointer accessible from compression info sstables.hh: Add attribute description API to file extension sstables.hh: Add compression component accessor sstables.hh: Make "has_component" public	2019-08-12 21:16:08 +03:00
Calle Wilund	95a8ff12e7	sstables/compress: Make compressor pointer accessible from compression info	2019-08-06 07:07:44 +00:00
Kamil Braun	f14e6e73bb	Add ZStandard compression This adds the option to compress sstables using the Zstandard algorithm (https://facebook.github.io/zstd/). To use, pass 'sstable_compression': 'org.apache.cassandra.io.compress.ZstdCompressor' to the 'compression' argument when creating a table. You can also specify a 'compression_level'. See Zstd documentation for the available compression levels. Resolves #2613. Signed-off-by: Kamil Braun <kbraun@scylladb.com>	2019-08-05 14:55:53 +02:00
Benny Halevy	0e3f9c25e4	sstables: compress.hh: add missing include Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-26 16:05:08 +02:00
Vladimir Krivopalov	cc62ad3b69	sstables: Make compressed streams customizable on checksumming. Use either Adler32 or CRC32 while writing to or reading from a compressed stream. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-19 20:52:07 +03:00
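One way to picture a stream whose checksum algorithm is chosen at compile time, as cc62ad3b69 and the two commits before it describe, is a class template parameterized by a checksum policy. This is only a sketch: `adler32_policy`, `crc32_policy` and `checksumming_sink` are hypothetical names, and the checksums are written out by hand here rather than taken from the checksum utilities in the repository.

```cpp
#include <cstdint>
#include <string_view>

// Adler-32 (RFC 1950): two running sums modulo 65521.
struct adler32_policy {
    static constexpr std::uint32_t init = 1;
    static std::uint32_t update(std::uint32_t sum, std::string_view data) {
        std::uint32_t a = sum & 0xffffu;
        std::uint32_t b = sum >> 16;
        for (unsigned char c : data) {
            a = (a + c) % 65521u;
            b = (b + a) % 65521u;
        }
        return (b << 16) | a;
    }
};

// CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit.
struct crc32_policy {
    static constexpr std::uint32_t init = 0;
    static std::uint32_t update(std::uint32_t crc, std::string_view data) {
        crc = ~crc;
        for (unsigned char c : data) {
            crc ^= c;
            for (int bit = 0; bit < 8; ++bit) {
                crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
            }
        }
        return ~crc;
    }
};

// Hypothetical sink: the checksum algorithm is fixed by the template argument,
// and the running checksum is updated as chunks are written.
template <typename Checksum>
class checksumming_sink {
    std::uint32_t _sum = Checksum::init;
public:
    void write(std::string_view chunk) { _sum = Checksum::update(_sum, chunk); }
    std::uint32_t checksum() const { return _sum; }
};
```

Instantiating `checksumming_sink<adler32_policy>` or `checksumming_sink<crc32_policy>` selects the algorithm with no runtime branch, which is the property the commit messages point at.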
Vladimir Krivopalov	5183294676	sstables: Move checksum calculation logic to compressed_output_stream. Previously, compressed_output_stream used to calculate checksum of the supplied chunk and pass it to the 'compression' object to combine with the full checksum calculated on prior writes. Now, all the checksum calculation happens inside compressed_output_stream and 'compression' only stores the result. This is done to loosen ties between two classes and simplify compressed_output_stream customisation with various checksum algorithms. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-19 20:52:07 +03:00
Vladimir Krivopalov	adb43959d1	sstables: Move adler32 routines under the scope of a class. This is a step towards making digest algorithm customizable at compile time. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-13 12:38:25 -07:00
Vladimir Krivopalov	4e4030676f	sstables: Move checksum utils into separate header. Checksummed writer doesn't need to include all compression stuff. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-13 12:38:25 -07:00
Calle Wilund	74758c87cd	sstables::compress/compress: Make compression a virtual object Make a "compressor" an actual class, that can be implemented and registered via class registry. For "common" compressors, the objects will be shared, but complex implementors can be semi-stateful. sstable compression is split into two parts: The "static" config which is shared across shards, and a "local" one, which holds a compressor pointer. The latter is encapsulated, along with actual compressed data writers, in sstables/compress.cc. For compression (write), compression writer is instantiated with the settings active in table metadata. For decompression (read), compression reader is instantiated with the settings stored in sstable metadata, which can differ from the currently active table metadata. v2: * Structured patch sets differently (dependencies) * Added more comments/api descs * Added patch to move all sstable compression into compress.cc, effectively separating top-level virtual compressor object from sstable io knowledge v3: * Rebased v4: * Moved all sstable compression logic/knowledge into compress.cc (local compression). Merged the two patches (separation just confuses reader).	2018-02-07 10:11:45 +00:00
Raphael S. Carvalho	09f4ee808f	sstables/compress: Fix race condition in segmented offset reading of shared sstable Race condition was introduced by commit `028c7a0888`, which introduces chunk offset compression, because a reading state is kept in the compress structure which is supposed to be immutable and can be shared among shards owning the same sstable. So it may happen that shard A updates state while shard B relies on information previously set which leads to incorrect decompression, which in turn leads to read misbehaving. We could serialize access to at() which would only lead to contention issues for shared sstables, but that can be avoided by moving state out of compress structure which is expected to be immutable after sstable is loaded and fed to shards that own it. Sequential accessor (wraps state and reference to segmented_offset) is added to prevent at() and push_back() interfaces from being polluted. Tests: release mode. Fixes #3148. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20180205192432.23405-1-raphaelsc@scylladb.com>	2018-02-06 12:10:10 +02:00
Botond Dénes	d1209c548a	Fix -Wreturn-type warnings Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <99f7a006daaa78eb87720ac51c394093398bc868.1504013915.git.bdenes@scylladb.com>	2017-08-29 16:41:09 +03:00
Avi Kivity	fa8d0fe4d0	Revert "Revert "Revert "Revert "Merge "Compress in-memory compression-info" from Botond"""" This reverts commit `238877a0c6`. A fix was found and will be committed shortly.	2017-08-28 16:14:13 +03:00
Avi Kivity	238877a0c6	Revert "Revert "Revert "Merge "Compress in-memory compression-info" from Botond""" This reverts commit `9d27455744`. It's still broken. To reproduce: ./tools/bin/cassandra-stress write -schema compression=LZ4Compressor (on a clean database) .0 0x00007ffff32aa69b in raise () from /lib64/libc.so.6 .1 0x00007ffff32ac4a0 in abort () from /lib64/libc.so.6 .2 0x000000000054a0e8 in seastar::memory::abort_on_underflow (size=<optimized out>) at core/memory.cc:1189 .3 seastar::memory::allocate_large (size=<optimized out>) at core/memory.cc:1194 .4 0x000000000054b305 in seastar::memory::allocate (size=size@entry=18446744073702885265) at core/memory.cc:1227 .5 0x000000000054b45e in malloc (n=n@entry=18446744073702885265) at core/memory.cc:1452 .6 0x00000000006013e4 in seastar::temporary_buffer<char>::temporary_buffer (this=0x6010195fc800, size=18446744073702885265) at /home/avi/urchin/seastar/core/temporary_buffer.hh:72 .7 0x0000000000a3908b in seastar::input_stream<char>::read_exactly (this=0x6010053d0248, n=18446744073702885265) at /home/avi/urchin/seastar/core/iostream-impl.hh:189 .8 0x0000000000a9c77f in compressed_file_data_source_impl::get (this=0x6010053d0240) at sstables/compress.cc:499 .9 0x0000000000aa1b01 in seastar::data_source::get (this=<optimized out>) at /home/avi/urchin/seastar/core/iostream.hh:63 .10 seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context>(sstables::data_consume_rows_context&)::{lambda()#1}::operator()() const (__closure=__closure@entry=0x6010195fcab0) at /home/avi/urchin/seastar/core/iostream-impl.hh:204 .11 0x0000000000aa22f0 in seastar::futurize<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >::apply<seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context>(sstables::data_consume_rows_context&)::{lambda()#1}&>(sstables::data_consume_rows_context&&) (func=...) 
at /home/avi/urchin/seastar/core/future.hh:1312 .12 seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context>(sstables::data_consume_rows_context&)::{lambda()#1}>(sstables::data_consume_rows_context&&) (action=...) at /home/avi/urchin/seastar/core/future-util.hh:203 .13 0x0000000000a9e730 in seastar::input_stream<char>::consume<sstables::data_consume_rows_context> (consumer=..., this=<optimized out>) at /home/avi/urchin/seastar/core/iostream-impl.hh:237 .14 data_consumer::continuous_data_consumer<sstables::data_consume_rows_context>::consume_input<sstables::data_consume_rows_context> (c=..., this=<optimized out>) at sstables/consumer.hh:226 .15 sstables::data_consume_context::impl::read (this=<optimized out>) at sstables/row.cc:411 .16 sstables::data_consume_context::read (this=<optimized out>) at sstables/row.cc:437 .17 0x0000000000aafbae in sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const (__closure=<optimized out>) at sstables/partition.cc:843 .18 seastar::apply_helper<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#2}&&, std::tuple) (args=..., func=...) at ./seastar/core/apply.hh:36 .19 seastar::apply<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) (args=..., func=...) at ./seastar/core/apply.hh:44 .20 seastar::futurize<seastar::future<> >::apply<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) (args=..., func=...) 
at ./seastar/core/future.hh:1302 .21 seastar::future<>::then<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}, seastar::future<> >(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const::{lambda()#1}&&) ( this=this@entry=0x6010195fcbb0, func=...) at ./seastar/core/future.hh:890 .22 0x0000000000ac273f in sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}::operator()() const (__closure=0x6010195fcc28) at sstables/partition.cc:843 .23 seastar::do_until_continued<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}, sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#1}>(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#1}&&, sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}&&, seastar::promise<>) (stop_cond=..., action=..., p=...) at /home/avi/urchin/seastar/core/future-util.hh:155 .24 0x0000000000ac29c3 in seastar::do_until<sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}, sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#1}>(sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#1}&&, sstables::sstable_streamed_mutation::fill_buffer()::{lambda()#2}&&) (action=..., stop_cond=..., this=<optimized out>) at /home/avi/urchin/seastar/core/future-util.hh:330 .25 sstables::sstable_streamed_mutation::fill_buffer (this=<optimized out>) at sstables/partition.cc:844 .26 0x0000000000ad3d2b in streamed_mutation::fill_buffer (this=0x6010195fcd10) at ./streamed_mutation.hh:489 .27 consume_flattened_in_thread<stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer> >, std::function<bool (streamed_mutation const&)> >(mutation_reader&, stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer> >&, std::function<bool (streamed_mutation const&)>&&) ( (gdb) p addr $1 = { chunk_start = 13330037, chunk_len = 18446744073702885265, offset = 0 
}	2017-08-27 13:32:37 +03:00
Avi Kivity	9d27455744	Revert "Revert "Merge "Compress in-memory compression-info" from Botond"" This reverts commit `9656fd79a0`. A fix is now available.	2017-08-24 13:37:35 +03:00
Tomasz Grabiec	9656fd79a0	Revert "Merge "Compress in-memory compression-info" from Botond" This reverts commit `ef85cf1cb3`, reversing changes made to `de011ece52`. Vlad reports that this causes SIGSEGV on cluster restarts. seastar::backtrace_buffer::append_backtrace() at /home/vladz/work/urchin/seastar/core/reactor.cc:274 (inlined by) print_with_backtrace at /home/vladz/work/urchin/seastar/core/reactor.cc:289 seastar::print_with_backtrace(char const) at /home/vladz/work/urchin/seastar/core/reactor.cc:296 sigsegv_action at /home/vladz/work/urchin/seastar/core/reactor.cc:3512 (inlined by) operator() at /home/vladz/work/urchin/seastar/core/reactor.cc:3498 (inlined by) _FUN at /home/vladz/work/urchin/seastar/core/reactor.cc:3494 ?? ??:0 operator()<seastar::temporary_buffer<char> > at /home/vladz/work/urchin/sstables/sstables.cc:870 (inlined by) apply at /home/vladz/work/urchin/seastar/core/apply.hh:36 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>::<lambda(auto:104)>, seastar::temporary_buffer<char> > at /home/vladz/work/urchin/seastar/core/apply.hh:44 (inlined by) do_void_futurize_apply_tuple<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>::<lambda(auto:104)>, seastar::temporary_buffer<char> > at /home/vladz/work/urchin/seastar/core/future.hh:1270 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>::<lambda(auto:104)>, seastar::temporary_buffer<char> > at /home/vladz/work/urchin/seastar/core/future.hh:1290 (inlined by) then<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>::<lambda(auto:104)> > at /home/vladz/work/urchin/seastar/core/future.hh:890 (inlined by) operator() at /home/vladz/work/urchin/sstables/sstables.cc:873 (inlined by) 
do_until_continued<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>, sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>&> at /home/vladz/work/urchin/seastar/core/future-util.hh:155 do_until<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>, sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()>::<lambda()>&> at /home/vladz/work/urchin/seastar/core/future-util.hh:330 (inlined by) operator() at /home/vladz/work/urchin/sstables/sstables.cc:874 (inlined by) apply at /home/vladz/work/urchin/seastar/core/apply.hh:36 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()> > at /home/vladz/work/urchin/seastar/core/apply.hh:44 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()> > at /home/vladz/work/urchin/seastar/core/future.hh:1302 then<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()>::<lambda()> > at /home/vladz/work/urchin/seastar/core/future.hh:890 (inlined by) operator() at /home/vladz/work/urchin/sstables/sstables.cc:875 (inlined by) apply at /home/vladz/work/urchin/seastar/core/apply.hh:36 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()> > at /home/vladz/work/urchin/seastar/core/apply.hh:44 (inlined by) apply<sstables::parse(sstables::random_access_reader&, sstables::compression&)::<lambda()> > at /home/vladz/work/urchin/seastar/core/future.hh:1302 operator()<seastar::future_state<> > at /home/vladz/work/urchin/seastar/core/future.hh:900 (inlined by) run at /home/vladz/work/urchin/seastar/core/future.hh:395 seastar::reactor::run_tasks(seastar::circular_buffer<std::unique_ptr<seastar::task, std::default_delete<seastar::task> >, 
std::allocator<std::unique_ptr<seastar::task, std::default_delete<seastar::task> > > >&) at /home/vladz/work/urchin/seastar/core/reactor.cc:2317 seastar::reactor::run() at /home/vladz/work/urchin/seastar/core/reactor.cc:2775 seastar::app_template::run_deprecated(int, char*, std::function<void ()>&&) at /home/vladz/work/urchin/seastar/core/app-template.cc:142	2017-08-24 11:44:14 +02:00
Botond Dénes	028c7a0888	Optimise the storage of compression chunk offsets To reduce the memory footprint of compression-info, n offsets are grouped together into segments, where each segment stores a base absolute offset into the file, the other offsets in the segments being relative offsets (and thus of reduced size). Also offsets are allocated only just enough bits to store their maximum value. The offsets are thus packed in a buffer like so: arrrarrrarrr... where n is 4, a is an absolute offset and r are offsets relative to a. The optimal value of n can be calculated for a given file_size (f) and chunk_size (c), by finding the minima of the following function: f(n) = (f/c)/n * (log2(f) + (n - 1)log2((n-1)(c + 64))) This is done in an empirical way, using a script (see below). Furthermore segments are stored in buckets, where each bucket has its own base offset. Each bucket therefore can address an equal chunk of the file and furthermore each segment in a bucket can address an equal sub-chunk of this area. The value of a given offset i is thus: bucket_base_offset_for(i) + segment_base_offset_for(i) + offset(i) To account for the bucketed storage we calculate a local_f, which is optimized so that a bucketful of segmented offsets can address the largest possible chunk of f. As value of this local_f only depends on the bucket_size (b) and c the value of n can be made independent of f and therefore only depend on one dynamic value, c. This makes life much simpler as we don't need to know the size of the file up-front, we can just append buckets to the storage on demand, while the required storage is still less than a third [1] of the original storage requirements (std::deque<uint64>). The table with the minima(f(n)) for different f and c values is pre-computed by gen_segmented_compress_params.py and stored in sstables/segmented_compress_params.hh. This script also creates a table with the best values of local_f for the given bucket_size. 
At runtime we only select the best params based on c. [1] This was calculated for c=4K and b=4K	2017-08-21 17:06:12 +03:00
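The cost function quoted in the commit message above can be explored with a few lines of Python. This is only a rough sketch of the formula as written in the message. The names `storage_bits`, `best_segment_size` and the search bound `max_n` are illustrative assumptions and are not taken from gen_segmented_compress_params.py, which may compute things differently. Treating the relative-offset term as zero for n = 1 is also an assumption, made because the formula's logarithm is undefined there.

```python
import math


def storage_bits(n, file_size, chunk_size):
    """Estimated bits needed for all chunk offsets with n offsets per segment.

    Follows the formula from the commit message:
        (f/c)/n * (log2(f) + (n - 1) * log2((n - 1) * (c + 64)))
    """
    segments = (file_size / chunk_size) / n
    absolute_bits = math.log2(file_size)  # one absolute offset per segment
    if n == 1:
        # Assumption: with a single offset per segment there are no
        # relative offsets, so their contribution is zero.
        relative_bits = 0.0
    else:
        relative_bits = (n - 1) * math.log2((n - 1) * (chunk_size + 64))
    return segments * (absolute_bits + relative_bits)


def best_segment_size(file_size, chunk_size, max_n=64):
    """Brute-force search for the n that minimises storage_bits.

    max_n is a hypothetical search bound chosen for this sketch.
    """
    return min(
        range(1, max_n + 1),
        key=lambda n: storage_bits(n, file_size, chunk_size),
    )


# Example: a 1 GiB file with 4 KiB chunks.
print(best_segment_size(1 << 30, 4096))
```

A brute-force search is enough here because n is small; the commit message describes the real parameters as being pre-computed into a table and selected at runtime based on the chunk size alone.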
Raphael S. Carvalho	15246f31f7	sstables: fix incorrect sstable size when compression is enabled Size of uncompressed sstable was being unconditionally used to determine when to stop writing a table. When compression is enabled, compressed size should be used instead. Problem affected Scylla when compression and leveled strategy were used. Fixes #1177. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <d9bf26def41fb33ca297f4127ce042b7f67adf96.1460484529.git.raphaelsc@scylladb.com>	2016-04-13 09:01:01 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Nadav Har'El	2f56577794	sstables: more efficient read of compressed data file Before this patch, reading large ranges from a compressed data file involved two inefficiencies: 1. The compressed data file was read one compressed chunk at a time. Such a chunk is around 30 KB in size, well below our desired sstable read-ahead size (sstable_buffer_size = 128 KB). 2. Because the compressed chunks have variable length (the uncompressed chunk has a fixed length) they are not aligned to disk blocks, so consecutive chunks have overlapping blocks which were unnecessarily read twice. The fix for both issues is to build the compressed_file_input_stream on an existing file_input_stream, instead of using direct file IO to read the individual chunks. file_input_stream takes care of doing the appropriate amount of read-ahead, and the compressed_file_input_stream layer does the decompression of the data read from the underlying layer. Fixes #992. Historical note: Implementing compressed_file_input_stream on top of file_input_stream was already tried in the past, and rejected. The problem at that time was that compressed_file_input_stream's constructor did not specify the end of the range to read, so that when we wanted to read only a small range we got too much read-ahead beyond the exactly one compressed chunk that we needed to read. Following the fix to issue #964, we now know on every streaming read also the intended end of the stream, so we can now use this to stop reading at the end of the last required chunk, even when we use a read-ahead buffer much larger than a chunk. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1457304335-8507-1-git-send-email-nyh@scylladb.com>	2016-03-09 10:14:15 +02:00
Glauber Costa	8e4bf025ae	sstables: wire priority for read path All the SSTable read path can now take an io_priority. The public functions will take a default parameter which is Seastar's default priority. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Nadav Har'El	4edf7fe206	clean up uses of lw_shared_ptr<file> Recently, "file" started to use a shared_ptr internally, so it is already copyable and reference counted, and there is no reason to use lw_shared_ptr<file>. This patch cleans up a few remaining places where lw_shared_ptr<file> was used. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-07-22 11:51:40 +03:00
Raphael S. Carvalho	113d3b1001	sstables: update compression ratio stats If compression is used, we should provide both the uncompressed and the compressed length to the metadata collector, so that the ratio can be computed. Stats metadata stores the compression ratio. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-21 08:14:07 +03:00
Raphael S. Carvalho	f17f3b197a	sstables: add initial support to compression lz4 is the only compression algorithm supported so far. The deflate and snappy algorithms are still missing, though adding them should be relatively easy. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-16 12:42:00 -03:00
Raphael S. Carvalho	3bfb86f541	sstables: add compress_max_size to compression Used to return the maximum size that the compressor may output. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-06-16 09:48:00 -03:00
Raphael S. Carvalho	d1ed0744f0	schema: add sstable compressor property The compressor field specifies which compression algorithm must be used to compress the sstable data file. This is a small step towards compressed sstable data files. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-09 11:18:56 +03:00
Glauber Costa	2dbd2b408a	sstables: change describe_type's return type to auto We always return a future, but with the threaded writer, we can get rid of that. So while reads will still return a future, the writer will be able to return void. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-06-08 15:25:35 +03:00

56 Commits