Large deques require contiguous storage, which may not be available (or may
be expensive to obtain). Switch to a new custom container instead, which
allocates less contiguous storage.
Allocation problems were observed with the summary and compression info. While
there is work to reduce compression info contiguous space use, this solves
all std::deque problems (and should not conflict with that work).
Fixes #2708
* tag '2708/v6' of https://github.com/avikivity/scylla:
sstables: switch std::deque to chunked_vector
tests: add test for chunked_vector
utils: add a new container type chunked_vector
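For illustration, a minimal sketch of the idea behind such a container
follows. This is not Scylla's actual utils::chunked_vector, only the shape
of the technique: storage is split into fixed-size chunks, so no single
allocation is ever larger than one chunk.

    // Minimal sketch of a chunked vector: elements live in fixed-size
    // chunks, so no allocation is ever larger than one chunk.
    // Illustrative only; not Scylla's utils::chunked_vector.
    #include <cstddef>
    #include <memory>
    #include <utility>
    #include <vector>

    template <typename T, std::size_t chunk_size = 1024>
    class chunked_vector_sketch {
        std::vector<std::unique_ptr<T[]>> _chunks; // small vector of chunk pointers
        std::size_t _size = 0;
    public:
        void push_back(T value) {
            if (_size % chunk_size == 0) {
                // Grow by one chunk-sized allocation, never one big
                // contiguous block.
                _chunks.push_back(std::make_unique<T[]>(chunk_size));
            }
            _chunks[_size / chunk_size][_size % chunk_size] = std::move(value);
            ++_size;
        }
        T& operator[](std::size_t i) {
            return _chunks[i / chunk_size][i % chunk_size];
        }
        std::size_t size() const { return _size; }
    };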
Previously _current_bucket_segment_index was used differently depending on
whether update_position_trackers() was called for random or sequential
access. In the former case it was used as the absolute index of the segment
(independent of the buckets) and in the latter as the relative index of
the segment within its bucket. This caused problems when there was a
switch between random and sequential access, meaning one could get different
results for an at() call depending on what the previous at() call was.
Fix this by consistently using _current_bucket_segment_index as - like its
name suggests - the bucket-relative segment index.
Ref #1946.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <7f68ac1d32c80e8dea6dfa11be02acaa961bce2a.1503924927.git.bdenes@scylladb.com>
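As an illustration of the invariant this establishes (member names other
than _current_bucket_segment_index are hypothetical), both access paths
can derive the bucket-relative index the same way:

    // Sketch of the invariant after the fix: the tracker always stores
    // the segment index relative to its bucket, for both access paths.
    // Names other than _current_bucket_segment_index are illustrative.
    #include <cstddef>

    struct position_trackers_sketch {
        std::size_t _current_bucket_index = 0;
        std::size_t _current_bucket_segment_index = 0; // always bucket-relative

        void update(std::size_t absolute_segment_index,
                    std::size_t segments_per_bucket) {
            _current_bucket_index = absolute_segment_index / segments_per_bucket;
            // Previously the random-access path stored the absolute index
            // here; storing the remainder keeps both paths consistent.
            _current_bucket_segment_index =
                absolute_segment_index % segments_per_bucket;
        }
    };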
The problem happens for the following sequence of events:
1) the reader stops in the middle of some partition before it
skips to another partition range
2) the reader is fast forwarded to a partition range which has no data
in the sstable. There are some partitions between the previous
partition range and the one we skip to
3) the reader is asked for the next partition
The problem was that mutation_reader::fast_forward_to() was putting
the reader in the _read_enabled == false state in step 2, but
data_consume_context was not fast forwarded to the range. When in step
3 we were asked for the next partition, we attempted to skip using the
index (because of 1). The result of the skip was some position
outside of the current range of data_consume_context, which caused
it to abort. To fix, add a check for _read_enabled before we try to
skip.
If the index was used to skip to the next partition (because the
current partition wasn't consumed in full) and the reader's partition
range ends before the data file ends, we did not detect that we're out
of range before returning a streamed_mutation. Fix by checking
_context.eof() before doing that.
Refs #2733.
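The two guards, in simplified form (a sketch only; _read_enabled and
_context.eof() are the members named above, everything else is
illustrative):

    // Simplified sketch of the two fixes. _read_enabled and
    // _context.eof() are from the commit messages; the rest is
    // illustrative.
    struct reader_sketch {
        bool _read_enabled = true; // cleared by fast_forward_to() in step 2
        struct context { bool eof() const { return _eof; } bool _eof = false; };
        context _context;

        bool can_skip_to_next_partition() const {
            if (!_read_enabled) {
                // _context was never forwarded to the new range; an
                // index-based skip would land outside its range and abort.
                return false;
            }
            if (_context.eof()) {
                // The skip moved past the end of the reader's partition
                // range; signal end-of-stream instead of returning a
                // streamed_mutation.
                return false;
            }
            return true;
        }
    };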
"Overly large metadata can hog memory which especially hurts in setups
with bad disk/memory ratio. To ease the pain compress the in-memory
compression-info.
The compression is implemented based on Avi's idea which is to group n
offsets together into segments, where each segment stores a base
absolute offset into the file, the other offsets in the segments being
relative offsets (and thus of reduced size). Also offsets are allocated
only just enough bits to store their maximum value. The offsets are thus
packed in a buffer like so:
arrrarrrarrr...
where n is 4, a is an absolute offset and r are offsets relative to a.
This of course means that stored offsets will not be aligned, not even
on a byte boundary, but the size reduction pretty convincing.
In addition, segments are stored in buckets, where each bucket has its
own base offset. In addition, segments in a buckets are optimized to
address as large of a chunk of the data as possible for a given chunk
size."
Ref #1946.
* 'bdenes/compress-compression-v3' of https://github.com/denesb/scylla:
Add unit test for compress::offsets
Optimise the storage of compression chunk offsets
Add script to precompute segmented compression parameters
To reduce the memory footprint of compression-info, n offsets are
grouped together into segments, where each segment stores a base
absolute offset into the file, the other offsets in the segment being
relative offsets (and thus of reduced size). Also, offsets are
allocated only just enough bits to store their maximum value. The
offsets are thus packed in a buffer like so:
arrrarrrarrr...
where n is 4, a is an absolute offset and r are offsets relative to a.
The optimal value of n can be calculated for a given file_size (f) and
chunk_size (c), by finding the minimum of the following function:
f(n) = (f/c)/n * (log2(f) + (n - 1)*log2((n - 1)*(c + 64)))
This is done in an empirical way, using a script (see below).
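For example, a brute-force search over n (a sketch; the real parameters
come from gen_segmented_compress_params.py, and the file and chunk sizes
below are made-up inputs) could look like:

    // Sketch: numerically find the n that minimizes f(n) for an example
    // file size and chunk size. The formula follows the commit message;
    // the real tables are produced by gen_segmented_compress_params.py.
    #include <cmath>
    #include <cstdio>

    int main() {
        const double f = 1024.0 * 1024 * 1024 * 1024; // example: 1 TiB file
        const double c = 4096.0;                      // example: 4 KiB chunks
        double best_n = 1;
        double best_bits = (f / c) * std::log2(f);    // n = 1: absolute offsets only
        for (int n = 2; n <= 64; ++n) {
            // (f/c)/n segments, each holding one log2(f)-bit absolute
            // offset and n-1 relative offsets of log2((n-1)*(c+64)) bits.
            double bits = (f / c) / n *
                (std::log2(f) + (n - 1) * std::log2((n - 1) * (c + 64)));
            if (bits < best_bits) {
                best_bits = bits;
                best_n = n;
            }
        }
        std::printf("best n = %.0f, total size ~ %.2f MiB\n",
                    best_n, best_bits / 8 / 1024 / 1024);
    }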
Furthermore segments are stored in buckets, where each bucket has its
own base offset. Each bucket therefore can address an equal chunk of the
file and furthermore each segment in a bucket can address an equal
sub-chunk of this area.
The value of a given offset i is thus:
bucket_base_offset_for(i) + segment_base_offset_for(i) + offset(i)
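A self-contained sketch of reading an offset back out of the bit-packed
storage (bit widths and helper names here are illustrative; the real code
also adds the bucket base, per the formula above):

    // Sketch of reading offset i back out of the packed "arrr..." layout.
    // Bit-by-bit access is deliberately naive for clarity; bucket
    // handling would add bucket_base_offset_for(i) on top, per the
    // formula above.
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct bit_reader {
        const std::vector<uint64_t>& words;
        uint64_t read(std::size_t bit_pos, unsigned bits) const {
            uint64_t v = 0;
            for (unsigned i = 0; i < bits; ++i, ++bit_pos) {
                v |= uint64_t((words[bit_pos / 64] >> (bit_pos % 64)) & 1u) << i;
            }
            return v;
        }
    };

    uint64_t read_offset(const bit_reader& buf, std::size_t i, unsigned n,
                         unsigned abs_bits, unsigned rel_bits) {
        std::size_t segment = i / n;
        std::size_t within = i % n;
        std::size_t segment_bits = abs_bits + (n - 1) * std::size_t(rel_bits);
        std::size_t base_pos = segment * segment_bits;
        uint64_t a = buf.read(base_pos, abs_bits); // absolute offset
        if (within == 0) {
            return a;
        }
        // The r offsets follow a, each rel_bits wide, relative to a.
        return a + buf.read(base_pos + abs_bits + (within - 1) * rel_bits,
                            rel_bits);
    }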
To account for the bucketed storage we calculate a local_f, which is
optimized so that a bucketful of segmented offsets can address the
largest possible chunk of f. As the value of this local_f only depends
on the bucket_size (b) and c, the value of n can be made independent of
f and therefore depends on only one dynamic value, c. This makes life
much simpler as we don't need to know the size of the file up-front; we
can just append buckets to the storage on demand, while the required
storage is still less than a third [1] of the original storage
requirement (std::deque<uint64>).
The table with the minima of f(n) for different f and c values is
pre-computed by gen_segmented_compress_params.py and
stored in sstables/segmented_compress_params.hh. This script also
creates a table with the best values of local_f for the given
bucket_size. At runtime we only select the best params based on c.
[1] This was calculated for c=4K and b=4K
Limiting summary entry generation to at most one summary entry
per 64k of index data can lead to large index pages, with thousands
of index entries per summary entry. These are slow to parse, and there
is no real gain from the limit, since we already enforce a size
limit on the summary.
Remove the limit and allow summary entry generation based solely on
spanned data size.
Fixes #2711.
Message-Id: <20170819184255.14181-1-avi@scylladb.com>
Two reasons for this change:
1) every compaction should be multiplexed through the manager, which in
turn will decide when to schedule it. Improvements to it will
immediately benefit every existing compaction type.
2) the active tasks metric will now track ongoing reshard jobs.
Fixes#2671.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170817224334.6402-1-raphaelsc@scylladb.com>
Exhausted readers can be fast forwarded, so we have to keep them
around. However, if the current reader is not fast forwardable, then
we can drop those readers and their buffers.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
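Schematically (all names here are hypothetical):

    // Sketch of the rule above: keep exhausted readers only if they may
    // be fast forwarded later. All names are hypothetical.
    #include <memory>
    #include <vector>

    struct reader { /* buffers, position, ... */ };

    struct combined_reader_sketch {
        bool _fast_forwardable = false;
        std::vector<std::unique_ptr<reader>> _exhausted;

        void on_exhausted(std::unique_ptr<reader> r) {
            if (_fast_forwardable) {
                _exhausted.push_back(std::move(r)); // may be revived later
            }
            // else: let r go out of scope, freeing the reader and its buffer
        }
    };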
The quantity parameter prevents index_reader from reading all index
entries of a summary entry that spans more than min_index_interval
entries. That can happen after the introduction of size-based sampling,
and consequently the sstable will not be able to return a key whose
logical position in the summary entry is beyond min_index_interval.
It's ok to not use quantity because index_reader will read all index
entries until either the next summary entry or the end of file is
reached.
Fixes test_sstable_conforms_to_mutation_source
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170812045821.25269-1-raphaelsc@scylladb.com>
Currently, a summary entry is added after min_index_interval index
entries have been written. Not taking the size of index entries into
account becomes a problem with large partitions, which may create big
index entries due to promoted indexes. Read performance is affected as
a consequence because the index entries spanned by a summary entry are
all read from disk to serve a request.
What we want to do is to also add a summary entry after the index
reaches a size boundary. To deal with oversampling, we want to write 1
byte to the summary for every 2000 bytes written to the data file (this
will eventually be made into an option in the config file).
Both conditions must be met to avoid under- or oversampling.
That way, the amount of data needed from the index file to satisfy the
request is drastically reduced.
Fixes#1842.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
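A hedged sketch of the size-based half of the condition (the 2000:1
ratio is from this message; the function shape and names are
illustrative):

    // Sketch of size-based summary sampling. The one-summary-byte-per-
    // 2000-data-bytes ratio is from the commit message; the names and
    // the entry-size estimate are illustrative.
    #include <cstdint>

    constexpr uint64_t data_bytes_per_summary_byte = 2000;

    bool size_based_boundary_reached(uint64_t data_bytes_since_last_entry,
                                     uint64_t estimated_summary_entry_size) {
        // Allow a new entry only once enough data has been written to
        // "pay" for the bytes the entry will occupy in the summary.
        return data_bytes_since_last_entry >=
               estimated_summary_entry_size * data_bytes_per_summary_byte;
    }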
A selection contains - in addition to the list of sstables - a
next_token, which is a hint as to the next best token to call select()
with. This should be the smallest token such that at the next call to
select() the least number of new sstables will be returned, without
skipping any.
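Schematically (a sketch, not the exact Scylla declaration):

    // Sketch of the selection's shape; types and names are illustrative.
    #include <cstdint>
    #include <optional>
    #include <vector>

    using token = int64_t;
    struct sstable;

    struct selection_sketch {
        std::vector<const sstable*> sstables; // sstables for the queried position
        // Smallest token such that the next select() returns the fewest
        // new sstables without skipping any.
        std::optional<token> next_token;
    };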
The index file's output stream uses write-behind, but it's not closed
when the sstable write fails, which may lead to a crash.
This happened before for the data file (for which it is obviously
easier to reproduce) and was fixed by 0977f4fdf8.
Fixes#2673.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170807171146.10243-1-raphaelsc@scylladb.com>
"This series ensures the always write correct cell names to promoted
index cell blocks, taking into account the eoc of range tombstones.
Fixes#2333"
* 'pi-cell-name/v1' of github.com:duarten/scylla:
tests/sstable_mutation_test: Test promoted index blocks are monotonic
sstables: Consider eoc when flushing pi block
sstables: Extract out converting bound_kind to eoc
This allows a queued flush to start while we fsync the current
sstable, which helps reduce the overall time new writes are blocked on
dirty memory.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
The write_monitor provides callbacks to inform an observer of the
state of the ongoing sstable write.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
When flushing a promoted index block using a range tombstone cell name
as a bound, use the right eoc value instead of always writing
composite::eoc::none.
Fixes #2333
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
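The mapping, derived from composite sort semantics (a sketch; the
enumerators mirror Scylla/Cassandra naming, but this is not the
extracted function verbatim):

    // Sketch of converting a range tombstone bound to a composite eoc.
    // eoc::start (-1) sorts before all names with the same prefix,
    // eoc::end (+1) sorts after. Enumerators mirror Scylla/Cassandra
    // naming; this is not the extracted function verbatim.
    #include <cstdint>

    enum class bound_kind { incl_start, excl_start, incl_end, excl_end };
    enum class eoc : int8_t { start = -1, none = 0, end = 1 };

    eoc to_eoc(bound_kind k) {
        switch (k) {
        case bound_kind::incl_start: // include the prefix: start before it
        case bound_kind::excl_end:   // exclude the prefix: end before it
            return eoc::start;
        case bound_kind::excl_start: // exclude the prefix: start after it
        case bound_kind::incl_end:   // include the prefix: end after it
            return eoc::end;
        }
        return eoc::none; // unreachable
    }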
When an sstable reader is fast forwarded, some index entries may be read
(and compared) multiple times. This patch makes sure that once a token
is computed we keep it around and reuse it if the entry is accessed again.
Each sstable index lookup involves a binary search in the summary, and
each time a partition key of a summary entry is compared with anything,
its token needs to be calculated.
Since we keep the summary in memory all the time, it is better to also
keep the tokens around.
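The caching idea, schematically (the real summary entry layout and token
computation differ; the stand-in hash is only to keep the example
self-contained):

    // Sketch of memoizing the token of a summary entry's key. The real
    // token function (murmur3-based) differs; std::hash is a stand-in.
    #include <cstdint>
    #include <functional>
    #include <optional>
    #include <string>

    using token = int64_t;

    struct summary_entry_sketch {
        std::string key;
        mutable std::optional<token> _token; // computed at most once

        token get_token() const {
            if (!_token) {
                _token = token(std::hash<std::string>{}(key)); // stand-in
            }
            return *_token;
        }
    };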
streamed_mutation::impl::fill_buffer() is supposed to either push
mutation fragments to the buffer or set the EOS flag. However, it was
possible for mp_row_consumer to return proceed::no when a skip was
needed without satisfying either of these conditions.
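The contract, schematically (a sketch; the real code is future-based and
this shows only the loop shape):

    // Sketch of the fill_buffer() contract: never return with an empty
    // buffer unless end-of-stream is set.
    #include <deque>

    struct fragment;

    struct streamed_mutation_sketch {
        std::deque<fragment*> _buffer;
        bool _end_of_stream = false;

        // Stand-in consumer: the real one parses sstable data and may
        // perform a skip that pushes nothing.
        void consume_one_step() { _end_of_stream = true; }

        void fill_buffer() {
            // A consumer-requested skip alone (proceed::no) must not cause
            // an early return that satisfies neither condition; keep going.
            while (_buffer.empty() && !_end_of_stream) {
                consume_one_step();
            }
        }
    };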
"This patchset restricts background writers - such as compactions,
streaming flushes and memtable flushes to a maximum amount of CPU usage
through a seastar::thread_scheduling_group.
The said maximum is recommended to be set 50 % - it is default
disabled, but can be adjusted through a configuration option until we
are able to auto-tune this.
The second patch in this series provides a preview on how such auto-tune
would look like. By implementing a simple controller we automatically
adjust the quota for the memtable writer processes, so that the rate at
which bytes come in is equal to the rates at which bytes are flushed.
Tail latencies are greatly reduced by this series, and heavy spikes that
previously appeared on CPU-bound workloads are no more."
* 'memtable-controller-v5' of https://github.com/glommer/scylla:
simple controller for memtable/streaming writer shares.
restrict background writers to 50 % of CPU.
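The controller idea, in sketch form (the update rule and clamping bounds
here are illustrative, not the actual series):

    // Sketch of a simple controller in the spirit described above:
    // adjust the flush writers' CPU quota until flush throughput tracks
    // the incoming write rate.
    #include <algorithm>

    struct flush_quota_controller_sketch {
        float quota = 0.5f; // fraction of CPU for the flush scheduling group

        void tick(double bytes_in_per_sec, double bytes_flushed_per_sec) {
            if (bytes_flushed_per_sec <= 0 || bytes_in_per_sec <= 0) {
                return; // nothing to balance this period
            }
            // If writes outpace flushes (ratio > 1), grant more CPU; if
            // flushes outpace writes, return CPU to foreground work.
            double ratio = bytes_in_per_sec / bytes_flushed_per_sec;
            quota = std::clamp(float(quota * ratio), 0.05f, 1.0f);
        }
    };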
We are using a C* 3.x compatible layout in the schema tables but want to
keep using the 1.7 layout in memory for compatibility during a rolling
upgrade. This patch switches the schema and schema_builder classes
back to the old layout. Translation between layouts happens when
converting to/from schema mutations.
Notable changes:
1) Includes a revert of commit 6260f31e08
"thrift: Update CQL mapping of static CFs".
2) Brings back the "default_validation_class" schema attribute. In v3
it can be derived from column definitions, but in v2 it can't, so
we have to store it.
3) legacy_schema_migrator and schema_builder don't have to do
conversions to v3; this is now handled by the v3_columns
class. schema_builder works with the same layout as schema, that
is v2.
4) Includes a revert of commit 66991a7ccb
"v3 schema test fixes"
Fixes #2555.