scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 11:36:54 +00:00

Author	SHA1	Message	Date
Piotr Sarna	00e59a9823	sstables: disambiguate boost::find There are multiple functions named `find` in boost, so to avoid future clashes, this one is explicitly marked as belonging to boost::range.	2021-05-10 11:48:14 +02:00
Raphael S. Carvalho	8480839932	LCS/reshape: Don't reshape single sstable in level 0 with strict mode With strict mode, it could happen that a sstable alone in level 0 is selected for offstrategy compaction, which means that we could run into an infinite reshape process. This is fixed by respecting the offstrategy threshold. Unit test is added. Fixes #8573. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210506181324.49636-1-raphaelsc@scylladb.com>	2021-05-09 11:09:54 +03:00
Lauro Ramos Venancio	15f72f7c9e	TWCS: initialize _highest_window_seen The timestamp_type is an int64_t. So, it has to be explicitly initialized before using it. This missing initialization prevented the major compaction from happening when a time window finishes, as described in #8569. Fixes #8569 Signed-off-by: Lauro Ramos Venancio <lauro.venancio@incognia.com> Closes #8590	2021-05-05 17:31:05 +03:00
Benny Halevy	ead96e21c3	compaction: size_tiered_compaction_strategy: get_buckets: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-05-05 14:26:37 +03:00
Benny Halevy	c1681cb9ea	compaction: size_tiered_compaction_strategy: get_buckets: don't let the bucket average drift too high SSTables are added in increasing size order so the bucket's average might drift upwards. Don't let it drift too high, to a point where the smallest SSTable might fall out of range. For example, here's a simulation run of the algorithm for these sstable sizes: [21, 123, 252, 363, 379, 394, 407, 428, 463, 467, 470, 523, 752, 774] the simulated compaction strategy options are: min_sstable_size = 4 bucket_low = 0.66667 bucket_high = 1.5 For each bucket, the following is printed: (avg * bucket_low) avg (avg * bucket_high) UNCHANGED: buckets={ ( 14.0) 21.0 ( 31.5): [21] ( 82.0) 123.0 ( 184.5): [123] ( 276.4) 414.6 ( 621.9): [252, 363, 379, 394, 407, 428, 463, 467, 470, 523] ( 508.7) 763.0 (1144.5): [752, 774] } IMPROVED: buckets={ ( 14.0) 21.0 ( 31.5): [21] ( 82.0) 123.0 ( 184.5): [123] ( 247.0) 370.5 ( 555.8): [252, 363, 379, 394, 407, 428] ( 320.5) 480.8 ( 721.1): [463, 467, 470, 523] ( 508.7) 763.0 (1144.5): [752, 774] } Fixes #8584 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-05-05 14:26:28 +03:00
Benny Halevy	d3aa5265ab	compaction: size_tiered_compaction_strategy: get_buckets: keep bucket average size as double precision floating point number Using integer division loses accuracy by rounding down the result. Each time we calculate: ``` auto total_size = bucket.size() * old_average_size; auto new_average_size = (total_size + size) / (bucket.size() + 1); ``` We accumulate the rounding error. total_size might be too small since old_average_size was previously rounded down, and then new_average_size is rounded down again. Rather than trying to compensate for the rounding errors by e.g. adding size / 2 to the dividend, simply keep the average as a double precision number. Note that we multiply old_average_size by options.bucket_{low,high}, that are double precision too so the size comparisons are already using FP instructions implicitly. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-05-05 14:26:25 +03:00
Benny Halevy	44b094f9a5	compaction: size_tiered_compaction_strategy: get_buckets: rename old_average_size to bucket_average_size Since now it became a reference used to update the bucket's average size after a new sstable is inserted into it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-05-05 14:26:20 +03:00
Benny Halevy	336a4dc0fd	compaction: size_tiered_compaction_strategy: get_buckets: consider only current bucket for each sstable Since the sstables are sorted in increasing size order there is no need to consider all buckets to find a matching one. Instead, just consider the most recently inserted bucket. Once we see a sstable size outside the allowed range for this bucket, create a new bucket and consider this one for the next sstable. Note, `old_average_size` should be renamed since this change turns it into a reference and it's assigned with the new average_size. This patch keeps the old name to reduce the churn. The following patch will do only the rename. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-05-05 14:26:05 +03:00
Pavel Emelyanov	13b07a3c58	sstables: Make checksum sink report buffer size from lower sink The checksum sink carries another sink on board and forwards the put buffers lower, so there's no point in making these two have different buffer sizes. This is what really happens now, but this change makes this more explicit and makes the checksumming code conform to the new output stream API. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-04 12:01:30 +03:00
Pavel Emelyanov	01b979beca	sstables: Report buffer size from compressed file sink This change just moves the place from which the output_stream knows the compression::uncompressed_chunk_length() value. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-04 12:01:27 +03:00
Botond Dénes	9fc3cba055	sstables: improve error message for invalid sstable paths The error message currently complains about "invalid version" and later says the reason is that the path is not recognized. This is confusing so change the error message to start with "invalid path" instead. It is the path that is invalid not the version after all. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210429092749.52659-1-bdenes@scylladb.com>	2021-04-29 12:50:48 +03:00
Asias He	60ba8eb9b8	sstables: Add debug info when create_sharding_metadata generates zero ranges The range passed to create_sharding_metadata is supposed to be owned or at least partially owned by the shard. Log keys, range and split ranges for debugging if the range does not belong to the shard. This is helpful for debugging "Failed to generate sharding metadata for foo.db" issues reported. Refs #7056 Closes #8557	2021-04-28 11:22:06 +03:00
Benny Halevy	3e7075a739	compaction: setup: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	90a7a8ff0e	compaction: close reader when done consuming Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	7d42a71310	mutation_reader: position_reader_queue: add close method Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	3c05529329	sstables: scrub_compaction: reader: close underlying reader Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:16:10 +03:00
Benny Halevy	75eed563bc	sstables: write_components: close reader when done Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:16:10 +03:00
Benny Halevy	8c585ccb5c	sstables: sstable_mutation_reader: implement close Close both the _index_reader and _context, if they are engaged. Warn and ignore any errors from close as it may be called either from the destructor or from f_m_r close. Call close() for closing in the background if needed when destroyed and warn about it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:16:10 +03:00
Benny Halevy	6a82e9f4be	sstables: index_reader: mark close noexcept We'd like that to simplify the soon-to-be-introduced sstable_mutation_reader::close error handling path. close_index_list can be marked noexcept since parallel_for_each is, with that index_reader::close can be marked noexcept too. Note that since reader close can not fail both lower and upper bounds are closed (since closing lower_bound cannot fail). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:16:10 +03:00
Avi Kivity	350f79c8ce	Merge 'sstables: remove large allocations when parsing cells' from Wojciech Mitros sstable cells are parsed into temporary_buffers, which causes large contiguous allocations for some cells. This is fixed by storing fragments of the cell value in a fragmented_temporary_buffer instead. To achieve this, this patch also adds new methods to the fragmented_temporary_buffer(size(), ostream& operator<<()) and adds methods to the underlying parser(primitive_consumer) for parsing byte strings into fragmented buffers. Fixes #7457 Fixes #6376 Closes #8182 * github.com:scylladb/scylla: primitive_consumer: keep fragments of parsed buffer in a small_vector sstables: add parsing of cell values into fragmented buffers sstables: add non-contiguous parsing of byte strings to the primitive_consumer utils: add ostream operator<<() for fragmented_temporary_buffer::view compound_type: extend serialize_value for all FragmentedView types	2021-04-22 15:38:10 +02:00
Avi Kivity	a063173ace	Merge "Fix unbounded memory usage and high write amplification in TWCS reshape" from Raphael " Memory usage is considerably reduced by making reshape switch to partitioned set, given that input sstables are disjoint. This will benefit reshape for all strategies, not only TWCS. Write amplification is reduced a lot by compacting all input sstables at once, which is possible given that unbounded memory usage is fixed too. With both these issues fixed, TWCS reshape will be much more efficient. tests: mode(dev). " * 'twcs_reshape_fixes' of github.com:raphaelsc/scylla: tests: sstables: Check that TWCS is able to reshape disjoint sstables efficiently TWCS: Reshape all sstables in a time window at once if they're disjoint sstables: Extract code to count amount of overlapping into a function LCS: reshape: Fix overlapping check when determining if a sstable set is disjoint compaction: Make reshape compaction always use partitioned_sstable_set compaction: Allow a compaction type to override the sstable_set for input sstables	2021-04-22 11:24:49 +03:00
Raphael S. Carvalho	d5fc2f3839	TWCS: Reshape all sstables in a time window at once if they're disjoint With repair-based operations, each window will have 256 disjoint sstables due to data segregation which produces N sstables for each vnode range, where N = # of existing windows. So each window ends up with one sstable per vnode range = 256. Given that reshape now unconditionally uses partitioned set's incremental selector, all the 256 sstables can be compacted at once as compaction essentially becomes a copy operation, where only one sstable will be opened at a time, making its memory usage very efficient. By compacting all sstables at once, write amplification is a lot reduced because each byte is now only rewritten once. Previously, with the initial set of 256 sstables, write amp could be up to 8, which makes reshape for TWCS very slow. Refs #8449. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-04-21 11:03:16 -03:00
Raphael S. Carvalho	0f7774a6f8	sstables: Extract code to count amount of overlapping into a function This function will be reused by TWCS reshape when checking if all sstables in a window are disjoint and can be all compacted together. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-04-21 11:03:16 -03:00
Raphael S. Carvalho	39ecddbd34	LCS: reshape: Fix overlapping check when determining if a sstable set is disjoint Wrong comparison operator is used when checking for overlapping. It would miss overlapping when last key of a sstable is equal to the first key of another sstable that comes next in the set, which is sorted by first key. Fixes #8531. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-04-21 11:03:07 -03:00
Piotr Sarna	2ad09d0bf8	Merge 'treewide: remove inclusions of storage_proxy.hh from headers' from Avi Kivity Reduce rebuilds and build time by removing unnecessary includes. Along the way, improve header sanity. Ref #1. Test: dev-headers, unit(dev). Closes #8524 * github.com:scylladb/scylla: treewide: remove inclusions of storage_proxy.hh from headers storage_proxy: unnest coordinator_query_result treewide: make headers self-sufficient utils: intrusive_btree: add missing #pragma once	2021-04-21 08:22:52 +02:00
Benny Halevy	7130e2e7ff	sstables: harden unlink Make sure that sstable::unlink will never fail. It will terminate in the unlikely case toc_filename throws (e.g. on bad_alloc), otherwise it ignores any other error and just warns about it. Make unlink a coroutine to simplify the implementation without introducing additional allocations. Note that remove_by_toc_name and maybe_delete_large_data_entries are executed asynchronously and concurrently. Waiting for them to finish is serialized by co_await, making sure that both are being waited on so not to leave abandoned futures behind. Test: unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210420135020.102733-1-bhalevy@scylladb.com>	2021-04-21 08:22:52 +02:00
Raphael S. Carvalho	678e4c0bb9	compaction: Make reshape compaction always use partitioned_sstable_set Reshape compaction potentially works with disjoint sstables, so it will benefit a lot from using partitioned_sstable_set, which is able to incrementally open the disjoint sstables. Without it, all sstables are opened at once, which means unbounded memory usage. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-04-20 15:39:51 -03:00
Avi Kivity	14a4173f50	treewide: make headers self-sufficient In preparation for some large header changes, fix up any headers that aren't self-sufficient by adding needed includes or forward declarations.	2021-04-20 21:23:00 +03:00
Raphael S. Carvalho	ad9bc808b9	compaction: Allow a compaction type to override the sstable_set for input sstables By default, compaction will pick a implementation of sstable_set as defined by the underlying compaction strategy. However, reshape compaction potentially works with disjoint sstables and will benefit a lot from always using partitioned set. For example, when reshaping a TWCS table, it's better to use the partitioned set rather than the time window set, as the former will be much more memory efficient by incrementally selecting sstables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-04-20 12:03:44 -03:00
Benny Halevy	a57459e983	compaction: cleanup_compaction: no need to filter tokens belonging to other shards As sstables are always resharded if needed when loaded. Refs #6807 Test: unit(release,debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210419142743.265729-1-bhalevy@scylladb.com>	2021-04-19 17:22:53 +02:00
Kamil Braun	5c7ed7a83f	time_series_sstable_set: return partition start if some sstables were ck-filtered out When a particular partition exists in at least one sstable, the cache expects any single-partition query to this partition to return a `partition_start` fragment, even if the result is empty. In `time_series_sstable_set::create_single_key_sstable_reader` it could happen that all sstables containing data for the given query get filtered out and only sstables without the relevant partition are left, resulting in a reader which immediately returns end-of-stream (while it should return a `partition_start` and if not in forwarding mode, a `partition_end`). This commit fixes that. We do it by extending the reader queue (used by the clustering reader merger) with a `dummy_reader` which will be returned by the queue as the very first reader. This reader only emits a `partition_start` and, if not in forwarding mode, a `partition_end` fragment. Fixes #8447. Closes #8448	2021-04-14 13:16:00 +02:00
Raphael S. Carvalho	224120f7df	sstables: rewrite compound_sstable_set::all() Procedure is rewritten using std::partition, making it easier to maintain and it also fixes a theoretical quadratic behavior because list is entirely copied when extending it, which isn't harmful because maintenance set will be rarely populated and there are only 2 sets at most. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210409171412.57729-1-raphaelsc@scylladb.com>	2021-04-12 12:45:43 +03:00
Kamil Braun	3687757115	sstables: fix TWCS single key reader sstable filter The filter passed to `min_position_reader_queue`, which was used by `clustering_order_reader_merger`, would incorrectly include sstables as soon as they passed through the PK (bloom) filter, and would include sstables which didn't pass the PK filter (if they passed the CK filter). Fortunately this wouldn't cause incorrect data to be returned, but it would cause sstables to be opened unnecessarily (these sstables would immediately return eof), resulting in a performance drop. This commit fixes the filter and adds a regression test which uses statistics to check how many times the CK filter was invoked. Fixes #8432. Closes #8433	2021-04-08 18:03:49 +03:00
Raphael S. Carvalho	8e0a1ca866	sstable_set: Implement compound_sstable_set's create_single_key_sstable_reader() compound set isn't overriding create_single_key_sstable_reader(), so default implementation is always called. Although default impl will provide correct behavior, specialized ones which provides better perf, which currently is only available for TWCS, were being ignored. compound set impl of single key reader will basically combine single key readers of all sets managed by it. Fixes #8415. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210406205009.75020-1-raphaelsc@scylladb.com>	2021-04-07 12:36:30 +03:00
Wojciech Mitros	201b86b042	primitive_consumer: keep fragments of parsed buffer in a small_vector When we want to parse a linearized buffer of bytes, we're copying them into the first and only element of the _read_bytes vector. Thus _read_bytes often contains only one element, which makes a small_vector a better alternative. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-04-01 16:05:52 +02:00
Wojciech Mitros	599cfe586f	sstables: add parsing of cell values into fragmented buffers The entire sstable cell value is currently stored in a single temporary_buffer. Cells may be very large, so to avoid large contiguous allocations, the buffer is changed to a fragmented_temporary_buffer. Fixes #7457 Fixes #6376 Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-04-01 15:36:58 +02:00
Wojciech Mitros	b1b5bda848	sstables: add non-contiguous parsing of byte strings to the primitive_consumer Currently, the primitive_consumer parses all values in contiguous buffers. A string of bytes may be very long, so parsing it in a single buffer can cause a big allocation. This patch allows parsing into fragmented_temporary_buffers instead of temporary_buffers. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-03-31 12:09:52 +02:00
Piotr Sarna	6de2691bbd	sstables,test: remove variables depending on old features In order to maintain backward compatibility wrt. cluster features, two boolean variables were kept in sstable writers: - correctly_serialize_non_compound_range_tombstones - correctly_serialize_static_compact_in_mc Since these features are assumed to always be present now, the above variables are no longer needed and can be purged.	2021-03-30 09:37:41 +02:00
Piotr Sarna	28c9af6fa5	sstables: stop relying on CORRECT_STATIC_COMPACT_IN_MC feature The feature bit is going away because it's over 2 years old, so the code which depended on it becomes unconditional.	2021-03-30 09:37:04 +02:00
Raphael S. Carvalho	a390f4eb61	sstables: optimize LCS reshape for repair-based operations LCS reshape is currently inefficient for repair-based operation, because the disjoint run of 256 sstables is reshaped into bigger L0 files, which will be then integrated into the main sstable set. On reshape completion, LCS has to compact those big L0 files onto higher levels, until last level is reached, producing bad write amplification. A much better approach is to instead compact that disjoint run into the best possible level L, which can be figured out with: log (base fan_out) of (total_size / max_sstable_size) This compaction will be essentially a copy operation. It's important to do it rather than only mutating the level of sstables because we have to reshape the input run according to LCS parameters like sstable size. For repair-based bootstrap/replace, the input disjoint run is now efficiently reshaped into an ideal level L, so there's no compaction backlog once reshape completes. This behavior will manifest in the log as this: LeveledManifest - Reshaping 256 disjoint sstables in level 0 into level 2 For repair-based decommission/removenode though, which reshape wasn't wired on yet, level L may temporarily hold 2 disjoint runs, which overlap one another, but LCS itself will incrementally merge them through either promotion of L-1 into L, or by detecting overlapping in level L and merging the overlapping sstables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210329171826.42873-1-raphaelsc@scylladb.com>	2021-03-29 20:22:04 +03:00
Avi Kivity	a8463cfb37	Merge "reader_permit: signal leaked resources" from Botond " When a permit is destroyed we check if it still holds on to any resources in the destructor. Any resources the permit still holds on are leaked resources, as users should have released these. Currently we just invoke `on_internal_error_noexcept()` to handle this, which -- depending on the configuration -- will result in an error message or an assert. In the former case, the resources will be leaked for good. This mini-series fixes this, by signaling back these resources to the semaphore. This helps avoid an eventual complete dry-up of all semaphore resources and a subsequent complete shutdown of reads. Tests: unit(release, debug) " * 'reader-permit-signal-leaked-resources/v1' of https://github.com/denesb/scylla: reader_permit: signal leaked resources test: test_reader_lifecycle_policy: keep semaphores alive until all ops cease sstables: generate_summary(): extend the lifecycle of the reader concurrency semaphore	2021-03-29 17:57:31 +03:00
Botond Dénes	f843e3de08	sstables: generate_summary(): extend the lifecycle of the reader concurrency semaphore Used to produce the needed permits for the index reads, such that it over-lives all the permits in use.	2021-03-26 11:06:02 +02:00
Pavel Emelyanov	c6a0e0439e	files: Construct file_impls properly Constructors of classes inherited from file_impl copy alignment values by hand, but miss the overwrite one, thus on a new file it remains default-initialized. To fix this and not to forget to properly initialize future fields from file_impl, use the impl's copy constructor. tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210325104830.31923-1-xemul@scylladb.com>	2021-03-26 00:22:11 +01:00
Raphael S. Carvalho	bcbb39999b	LCS: Fix terrible write amplification when reshaping level 0 LCS reshape is basically 'major compacting' level 0 until it contains less than N sstables. That produces terrible write amplification, because any given byte will be compacted (initial # of sstables / max_threshold (32)) times. So if L0 initially contained 256 ssts, there would be a WA of about 8. This terrible write amplification can be reduced by performing STCS instead on L0, which will leave L0 in a good shape without hurting WA as it happens now. Fixes #8345. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210322150655.27011-1-raphaelsc@scylladb.com>	2021-03-24 17:48:50 +02:00
Tomasz Grabiec	9272e74e8c	sstable: writer: ka/la: Write row marker cell after row tombstone Row marker has a cell name which sorts after the row tombstone's start bound. The old code was writing the marker first, then the row tombstone, which is incorrect. This was harmless to our sstable reader, which recognized both as belonging to the current clustering row fragment, and collects both fine. However, if both atoms trigger creation of promoted index blocks, the writer will create a promoted index with entries which violate the cell name ordering. It's very unlikely to run into in practice, since to trigger promoted index entries for both atoms, the clustering key would be so large so that the size of the marker cell exceeds the desired promoted index block size, which is 64KB by default (but user-controlled via column_index_size_in_kb option). 64KB is also the limit on clustering key size accepted by the system. This was caught by one of our unit tests: sstable_conforms_to_mutation_source_test ...which runs a battery of mutation reader tests with various desired promoted index block sizes, including the target size of 1 byte, which triggers an entry for every atom. The test started to fail for some random seeds after commit `ecb6abe` inside the test_streamed_mutation_forwarding_is_consistent_with_slicing test case, reporting a mutation mismatch in the following line: assert_that(sliced_m).is_equal_to(fwd_m, slice_with_ranges.row_ranges(*m.schema(), m.key())); It compares mutations read from the same sstable using different methods, slicing using clustering key restrictions, and fast forwarding. The reported mismatch was that fwd_m contained the row marker, but sliced_m did not. The sstable does contain the marker, so both reads should return it. After reverting the commit which introduced dynamic adjustments, the test passes, but both mutations are missing the marker, both are wrong! They are wrong because the promoted index contains entries whose starting positions violate the ordering, so binary search gets confused and selects the row tombstone's position, which is emitted after the marker, thus skipping over the row marker. The explanation for why the test started to fail after dynamic adjustments is the following. The promoted index cursor works by incrementally parsing buffers fed by the file input stream. It first parses the whole block and then does a binary search within the parsed array. The entries which cursor touches during binary search depend on the size of the block read from the file. The commit which enabled dynamic adjustments causes the block size to be different for subsequent reads, which allows one of the reads to walk over the corrupted entries and read the correct data by selecting the entry corresponding to the row marker. Fixes #8324 Message-Id: <20210322235812.1042137-1-tgrabiec@scylladb.com>	2021-03-23 16:13:47 +01:00
Raphael S. Carvalho	c86dd125a1	sstables: clean up partitioned_sstable_set::insert() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210322130227.16805-2-raphaelsc@scylladb.com>	2021-03-22 15:30:32 +02:00
Raphael S. Carvalho	48d8cc261e	sstables: don't swallow exception in partitioned_sstable_set::insert() regression introduced by `02b2df1ea9` (Fri Mar 12 01:22:41 2021 -0300). Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210322130227.16805-1-raphaelsc@scylladb.com>	2021-03-22 15:30:31 +02:00
Avi Kivity	3c44445c07	Merge "Introduce off-strategy compaction for repair-based bootstrap and replace" from Raphael " Scylla suffers with aggressive compaction after repair-based operation has initiated. That translates into bad latency and slowness for the operation itself. This aggressiveness comes from the fact that: 1) new sstables are immediately added to the compaction backlog, so reducing bandwidth available for the operation. 2) new sstables are in bad shape when integrated into the main sstable set, not conforming to the strategy invariant. To solve this problem, new sstables will be incrementally reshaped, off the compaction strategy, until finally integrated into the main set. The solution takes advantage there's only one sstable per vnode range, meaning sstables generated by repair-based operations are disjoint. NOTE: off-strategy for repair-based decommission and removenode will follow this series and require little work as the infrastructure is introduced in this series. Refs #5226. " * 'offstrategy_v7' of github.com:raphaelsc/scylla: tests: Add unit test for off-strategy sstable compaction table: Wire up off-strategy compaction on repair-based bootstrap and replace table: extend add_sstable_and_update_cache() for off-strategy sstables/compaction_manager: Add function to submit off-strategy work table: Introduce off-strategy compaction on maintenance sstable set table: change build_new_sstable_list() to accept other sstable sets table: change non_staging_sstables() to filter out off-strategy sstables table: Introduce maintenance sstable set table: Wire compound sstable set table: prepare make_reader_excluding_sstables() to work with compound sstable set table: prepare discard_sstables() to work with compound sstable set table: extract add_sstable() common code into a function sstable_set: Introduce compound sstable set reshape: STCS: preserve token contiguity when reshaping disjoint sstables	2021-03-22 10:43:13 +02:00
Nadav Har'El	abab1d906c	Merge 'sstables: convert enable_if to equivalent concepts' from Avi Kivity enable_if is hard to understand, especially its error messages. Convert enable_if in sstable code to concepts. A new concept is introduced, self_describing, for the case of a type that follows the obj.describe_type() protocol. Otherwise this is quite straightforward. Closes #8315 * github.com:scylladb/scylla: sstables: vector write: convert to concepts sstables: check_truncated_and_assign: convert to concept sstables: convert write() to concepts sstables: convert write_vint() to concepts sstables: vector parse(): convert to concept sstables: convert parse() for a self-describing type to concept sstables: read_vint(): convert enable_if to concepts sstables: add concept for self-describing type	2021-03-18 23:09:34 +02:00
Avi Kivity	bf0c7d1340	sstables: vector write: convert to concepts We have an integral and a non-integral overload, each constrained with enable_if. We use std::integral to constrain the integral overload and leave the other unconstrained, as C++ will choose the more constrained version when applicable.	2021-03-18 19:26:54 +02:00
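
Illustration for the get_buckets commits above (336a4dc0fd, d3aa5265ab, c1681cb9ea): the sketch below is a reconstruction from the commit messages only, not the code from the repository. The names stcs_options, bucket and the limit_drift flag are hypothetical, and the drift rule (reject an sstable if accepting it would push the bucket's smallest member below average * bucket_low) is an assumption inferred from the text of c1681cb9ea. The min_sstable_size handling is likewise a guess at how small sstables are grouped.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical option struct; the field names follow the options quoted in
// the commit message and the defaults are the values of its simulation run.
struct stcs_options {
    double min_sstable_size = 4;
    double bucket_low = 0.66667;
    double bucket_high = 1.5;
};

struct bucket {
    double average;            // kept as a double, as described in d3aa5265ab
    std::vector<long> sizes;   // member sizes in ascending insertion order
};

// Groups ascending sstable sizes into buckets, considering only the most
// recently created bucket (as described in 336a4dc0fd). With limit_drift set,
// a size is rejected when accepting it would raise the average so far that
// the bucket's smallest member falls below average * bucket_low. That rule
// is an assumption inferred from the commit message.
std::vector<bucket> get_buckets(const std::vector<long>& sorted_sizes,
                                const stcs_options& opt, bool limit_drift) {
    assert(std::is_sorted(sorted_sizes.begin(), sorted_sizes.end()));
    std::vector<bucket> buckets;
    for (long size : sorted_sizes) {
        if (!buckets.empty()) {
            bucket& b = buckets.back();
            const double n = static_cast<double>(b.sizes.size());
            const double new_average = (b.average * n + size) / (n + 1);
            const bool both_small = size < opt.min_sstable_size
                                 && b.average < opt.min_sstable_size;
            const bool in_range = size > b.average * opt.bucket_low
                               && size < b.average * opt.bucket_high;
            const bool smallest_stays = !limit_drift
                               || b.sizes.front() > new_average * opt.bucket_low;
            if (both_small || (in_range && smallest_stays)) {
                b.sizes.push_back(size);
                b.average = new_average;
                continue;
            }
        }
        // Size does not fit the current bucket: start a new one.
        buckets.push_back(bucket{static_cast<double>(size), {size}});
    }
    return buckets;
}
```

Calling get_buckets with the sizes listed in c1681cb9ea and limit_drift set to false groups 252 through 523 into a single bucket, as in the UNCHANGED listing; with limit_drift set to true that run is split after 428, as in the IMPROVED listing.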

2454 Commits