scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-21 00:50:35 +00:00

Author	SHA1	Message	Date
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00
Botond Dénes	72a88e0257	mutation_fragment: s/as_mutable_range_tombstone/mutate_as_range_tombstone/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_range_tombstone() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	4f5ccf82cb	mutation_fragment: s/as_mutable_clustering_row/mutate_as_clustering_row/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_clustering_row() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	f2b9cad4c6	mutation_fragment: s/as_mutable_static_row/mutate_as_static_row/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_static_row() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	3fab83b3a1	flat_mutation_reader: impl: add reader_permit parameter Not used yet, this patch does all the churn of propagating a permit to each impl. In the next patch we will use it to track the memory consumption of `_buffer`.	2020-09-28 10:53:48 +03:00
Avi Kivity	2bd264ec6a	sstables: remove background_jobs(), await_background_jobs() There are no more users for registering background jobs, so remove the mechanism and the remaining calls.	2020-09-23 20:55:17 +03:00
Avi Kivity	5db96170a5	sstables: make sstables_manager take charge of closing sstables Currently, closing sstables happens from the sstable destructor. This is problematic since a destructor cannot wait for I/O, so we launch the file close process in the background. We therefore lose track of when the closing actually takes place. This patch makes sstables_manager take charge of the close process. Every sstable is linked into one of two intrusive lists in its manager: _active or _undergoing_close. When the reference count of the sstable drops to zero, we move it from _active to _undergoing_close and begin closing the files. sstables_manager remembers all closes and when sstables_manager::close() is called, it waits for all of them to complete. Therefore, sstables_manager::close() allows us to know that all files it manages are closed (and deleted if necessary). The sstables_manager also gains a destructor, which disables move construction.	2020-09-23 20:55:17 +03:00
Avi Kivity	f9aa50dcbf	test: sstables test_env: introduce manager() accessor This returns the sstables_manager carried by the test_env. We will soon retire the global test_sstables_manager, so we need to provide access to one.	2020-09-23 20:55:10 +03:00
Avi Kivity	a90a511d36	sstables_manager: introduce a stub close() sstables_manager is going to take charge of its sstables lifetimes, so it will need a close() to wait until sstables are deleted. This patch adds sstables_manager::close() so that the surrounding infrastructure can be wired to call it. Once that's done, we can make it do the waiting.	2020-09-23 20:55:04 +03:00
Avi Kivity	d19c6c0d98	sstables: size_tiered_backlog_tracker: avoid assignment of non-constexpr expression to constexpr object std::log() is not constexpr, so it cannot be assigned to a constexpr object. Make it non-constexpr and automatic. The optimizer still figures out that it's constant and optimizes it. Found by clang. Apparently gcc only checks the expression is constant, not constexpr.	2020-09-21 16:32:53 +03:00
Avi Kivity	a155b2bced	sstables: leveled_manifest: prevent benign precision loss warning Casting from the maximum int64_t to double loses precision, because int64_t has 64 bits of precision while double has only 53. Clang warns about it. Since it's not a real problem here, add an explicit cast to silence the warning.	2020-09-21 16:32:53 +03:00
Avi Kivity	aa7426bde6	sstables: index_reader: make 'index_bound' public index_reader::index_bound must be constructible by non-friend classes since it's used in std::optional (which isn't anyone's friend). This now works in gcc because gcc's inter-template access checking is broken, but clang correctly rejects it.	2020-09-21 16:32:53 +03:00
Avi Kivity	bd42bdd6b5	sstables: index_reader: disambiguate promoted_index_blocks_reader "state" type and data member promoted_index_blocks_reader has a data member called "state", and a type member called "state". Somehow gcc manages to disambiguate the two when used, but clang doesn't. I believe clang is correct here, one member should subsume the other. Change the type member to have a different name to disambiguate the two.	2020-09-21 16:32:53 +03:00
Piotr Sarna	16b4b86697	sstables: drop checks for non-compound range tombstones support Correct non-compound range tombstones have been supported for over 2 years and upgrades are only allowed from versions which already have the support, so the checks are hereby dropped.	2020-09-14 12:09:51 +02:00
Piotr Sarna	f8ed1b5b67	sstables: drop checks for correct counter order support Correct counter order has been supported for over 2 years and upgrades are only allowed from versions which already have the support, so the checks are hereby dropped.	2020-09-14 12:05:11 +02:00
Avi Kivity	64c7c81bac	Merge "Update log messages to {fmt} rules" from Pavel E " Before seastar is updated with the {fmt} engine under the logging hood, some changes are to be made in scylla to conform to {fmt} standards. Compilation and tests checked against both -- old (current) and new seastar-s. tests: unit(dev), manual " * 'br-logging-update' of https://github.com/xemul/scylla: code: Force formatting of pointer in .debug and .trace code: Format { and } as {fmt} needs streaming: Do not reveal raw pointer in info message mp_row_consumer: Provide hex-formatting wrapper for bytes_view heat_load_balance: Include fmt/ranges.h	2020-09-03 15:10:09 +03:00
Raphael S. Carvalho	adf576f769	compaction_manager: export method that returns if table has ongoing compaction A compaction strategy, that supports parallel compaction, may want to know if the table has compaction running on its behalf before making a decision. For example, a size-tiered-like strategy may not want to trigger a behavior, like cross-tier compaction, when there's ongoing compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200901134306.23961-1-raphaelsc@scylladb.com>	2020-09-02 16:46:49 +03:00
Raphael S. Carvalho	7f7f366cb5	compaction: add debug msg to inform the amount of expired ssts skipped by compaction this information is useful when debugging compaction issues that involve fully expired ssts. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200828140401.96440-1-raphaelsc@scylladb.com>	2020-08-31 17:18:47 +03:00
Pavel Emelyanov	812eed27fe	code: Force formatting of pointer in .debug and .trace ... and tests. Printing a pointer in logs is considered to be a bad practice, so the proposal is to keep this explicit (with fmt::ptr) and allow it for .debug and .trace cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Pavel Emelyanov	50e3a30dae	mp_row_consumer: Provide hex-formatting wrapper for bytes_view By default {fmt} doesn't know how to format this type (although it's a basic_string_view instantiated), and even providing formatter/operator<< does not help -- it anyway hits an earlier assertion in args mapper about the disallowance of character types mixing. The hex-wrapper with own operator<< solves the problem. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Benny Halevy	f5ffd5fc5f	sstables: Fix reactor stall in sstables::seal_summary() With relatively big summaries, reactor can be stalled for a couple of milliseconds. This patch: a. allocates positions upfront to avoid excessive reallocation. b. returns a future from seal_summary() and uses `seastar::do_for_each` to iterate over the summary entries so the loop can yield if necessary. Fixes #7108. Based on 2470aad5a389dfd32621737d2c17c7e319437692 by Raphael S. Carvalho <raphaelsc@scylladb.com> Test: unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200826091337.28530-1-bhalevy@scylladb.com>	2020-08-26 12:18:05 +03:00
Benny Halevy	78a44dda57	sstables: avoid double close in file_writer destructor If file_writer::close() fails to close the output stream closing will be retried in file_writer::~file_writer, leading to: ``` include/seastar/core/future.hh:1892: seastar::future<T ...> seastar::promise<T>::get_future() [with T = {}]: Assertion `!this->_future && this->_state && !this->_task' failed. ``` as seen in https://github.com/scylladb/scylla/issues/7085 Fixes #7085 Test: unit(dev), database_test with injected error in posix_file_impl::close() Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200826062456.661708-1-bhalevy@scylladb.com>	2020-08-26 11:33:23 +03:00
Rafael Ávila de Espíndola	5fcfbd76a9	sstables: Delete duplicated code For some reason date_tiered_compaction_strategy had its own identical copy of get_value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200819211509.106594-1-espindola@scylladb.com>	2020-08-26 11:33:23 +03:00
Pavel Emelyanov	171822cff8	compaction: Use database from options to get local ranges The cleanup compaction wants to keep local tokens on-board and gets them from storage_service.get_local_ranges(). This method is the wrapper around database.get_keyspace_local_ranges() created in previous patch, the live database reference is already available on the descriptor's options, so we can short-cut the call. This allows removing the last explicit call for global storage_service instance from compaction code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-21 14:58:40 +03:00
Pavel Emelyanov	8333fed8aa	compaction: Keep database reference on upgrade options The only place that creates them is the API upgrade_sstables call. The created options object doesn't over-survive the returned future, so it's safe to keep this reference there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-21 14:58:40 +03:00
Pavel Emelyanov	a6e6856e1f	compaction: Keep database reference on cleanup options The database is available at both places that create the options -- tests and API perform_cleanup call. Options object doesn't over-survive the returned future, so it's safe to keep the reference on it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-21 14:58:40 +03:00
Raphael S. Carvalho	a0e0195a77	sstables: Avoid excessive reallocations when creating sharding metadata Let's reserve space for sharding metadata in advance, to avoid excessive allocations in create_sharding_metadata(). With the default ignore_msb_bits=12, it was observed that the # of reallocations is frequently 11-12. With ignore_msb_bits=16, the number can easily go up to 50. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200814210250.39361-1-raphaelsc@scylladb.com>	2020-08-19 17:58:29 +03:00
Avi Kivity	6f986df458	Merge "Fix TWCS compaction aggressiveness due to data segregation" from Raphael " After data segregation feature, anything that causes out-of-order writes, like read repair, can result in small updates to past time windows. This causes compaction to be very aggressive because whenever a past time window is updated like that, that time window is recompacted into a single SSTable. Users expect that once a window is closed, it will no longer be written to, but that has changed since the introduction of the data segregation feature. We didn't anticipate the write amplification issues that the feature would cause. To fix this problem, let's perform size-tiered compaction on the windows that are no longer active and were updated because data was segregated. The current behavior where the last active window is merged into one file is kept. But thereafter, that same window will only be compacted using STCS. Fixes #6928. " * 'fix_twcs_agressiveness_after_data_segregation_v2' of github.com:raphaelsc/scylla: compaction/twcs: improve further debug messages compaction/twcs: Improve debug log which shows all windows test: Check that TWCS properly performs size-tiered compaction on past windows compaction/twcs: Make task estimation take into account the size-tiered behavior compaction/stcs: Export static function that estimates pending tasks compaction/stcs: Make get_buckets() static compact/twcs: Perform size-tiered compaction on past time windows compaction/twcs: Make strategy easier to extend by removing duplicated knowledge compaction/twcs: Make newest_bucket() non-static compaction/twcs: Move TWCS implementation into source file	2020-08-19 17:19:01 +03:00
Avi Kivity	f6b66456fd	Update seastar submodule Contains patch from Rafael to fix up includes. * seastar c872c3408c...7f7cf0f232 (9): > future: Consider result_unavailable invalid in future_state_base::ignore() > future: Consider result_unavailable invalid in future_state_base::valid() > Merge "future-util: split header" from Benny > docs: corrected some text and code-examples in streaming-rpc docs > future: Reduce nesting in future::then > demos: coroutines: include std-compat.hh > sstring: mark str() and methods using it as noexcept > tls: Add an assert > future: fix coroutine compilation	2020-08-19 17:18:57 +03:00
Rafael Ávila de Espíndola	56724d084d	sstables: Move date_tiered_compaction_strategy_options::date_tiered_compaction_strategy_options out of line Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200812232915.442564-6-espindola@scylladb.com>	2020-08-19 11:34:13 +03:00
Rafael Ávila de Espíndola	07b3ead752	sstables: Move size_tiered_compaction_strategy_options::size_tiered_compaction_strategy_options out of line Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200812232915.442564-5-espindola@scylladb.com>	2020-08-19 11:34:13 +03:00
Rafael Ávila de Espíndola	7b3946fa0e	sstables: Move compaction_strategy_impl::compaction_strategy_impl out of line Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200812232915.442564-4-espindola@scylladb.com>	2020-08-19 11:34:13 +03:00
Rafael Ávila de Espíndola	9ba765fe6f	sstables: Move compaction_strategy_impl::get_value out of line Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200812232915.442564-3-espindola@scylladb.com>	2020-08-19 11:34:13 +03:00
Rafael Ávila de Espíndola	06b15aa7e3	sstables: Move time_window_compaction_strategy_options' constructors to a .cc These are not trivial and not hot. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200812232915.442564-2-espindola@scylladb.com>	2020-08-19 11:34:13 +03:00
Raphael S. Carvalho	d601f78b4b	compaction/twcs: improve further debug messages Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-18 15:14:09 -03:00
Raphael S. Carvalho	086f277584	compaction/twcs: Improve debug log which shows all windows The current log prints one log entry for each window, it doesn't print the # of SSTs in the bucket, and the now information is copied across all the window entries. Previously, it looked like this: [shard 0] compaction - Key 1597331160000000, now 1597331160000000 [shard 0] compaction - Key 1597331100000000, now 1597331160000000 [shard 0] compaction - Key 1597331040000000, now 1597331160000000 [shard 0] compaction - Key 1597330980000000, now 1597331160000000 This made it harder to group all windows which reflect the state of the strategy at a given time. Now, it looks as follows: [shard 0] compaction - time_window_compaction_strategy::newest_bucket: now 1597331160000000 buckets = { key=1597331160000000, size=1 key=1597331100000000, size=2 key=1597331040000000, size=1 key=1597330980000000, size=1 } Also the level of this log is changed from debug to trace, given that now it's compressed and only printed once. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-18 15:14:09 -03:00
Raphael S. Carvalho	96436312be	compaction/twcs: Make task estimation take into account the size-tiered behavior The task estimation was not taking into account that TWCS does size-tiered on the windows, and it only added 1 to the estimation when there could be more tasks than that depending on the amount of SSTables in all the existing size tiers. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-18 15:14:09 -03:00
Raphael S. Carvalho	d287b1c198	compaction/stcs: Export static function that estimates pending tasks That will be useful for allowing other compaction strategies that use STCS to properly estimate the pending tasks. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-18 15:14:09 -03:00
Raphael S. Carvalho	b62737fd05	compaction/stcs: Make get_buckets() static STCS will export a static function to estimate pending tasks, and it relies on get_buckets() being static too. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-18 15:14:07 -03:00
Dejan Mircevski	fb6c011b52	everywhere: Insert space after `switch` Quoth @avikivity: "switch is not a function, and we celebrate that by putting a space after it like other control-flow keywords." https://github.com/scylladb/scylla/pull/7052#discussion_r471932710 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 14:31:04 +03:00
Raphael S. Carvalho	f9f0be9ac8	compact/twcs: Perform size-tiered compaction on past time windows After data segregation feature, anything that causes out-of-order writes, like read repair, can result in small updates to past time windows. This causes compaction to be very aggressive because whenever a past time window is updated like that, that time window is recompacted into a single SSTable. Users expect that once a window is closed, it will no longer be written to, but that has changed since the introduction of the data segregation feature. We didn't anticipate the write amplification issues that the feature would cause. To fix this problem, let's perform size-tiered compaction on the windows that are no longer active and were updated because data was segregated. The current behavior where the last active window is merged into one file is kept. But thereafter, that same window will only be compacted using STCS. Fixes #6928. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-17 12:29:34 -03:00
Raphael S. Carvalho	820b47e9a3	compaction/twcs: Make strategy easier to extend by removing duplicated knowledge TWCS is hard to extend because its knowledge on what to do with a window bucket is duplicated in two functions. Let's remove this duplication by placing the knowledge into a single function. This is important for the coming change that will perform size-tiered instead of major on windows that are no longer active. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-17 12:29:34 -03:00
Raphael S. Carvalho	f2b588cfc4	compaction/twcs: Make newest_bucket() non-static To fix #6928, newest_bucket() will have to access the class fields. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-17 12:29:34 -03:00
Raphael S. Carvalho	b95359314d	compaction/twcs: Move TWCS implementation into source file Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-17 12:29:34 -03:00
Raphael S. Carvalho	81ec49c82f	sstables/sstable_set: rename method to retrieve sstable runs select() is too generic for the method that retrieves sstable runs, and it has a completely different meaning than the former select method used to select sstables based on token range. Let's give it a more descriptive name. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200811193401.22749-1-raphaelsc@scylladb.com>	2020-08-16 17:41:16 +03:00
Raphael S. Carvalho	b07920dd1f	sstables: Fix remove_by_toc_name() on temporary toc regression caused by `55cf219c97`. remove_by_toc_name() must work both for a sealed sstable with toc, and also a partial sstable with tmp toc. so dirname() should be called conditionally on the condition of the sstable. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200813160612.101117-1-raphaelsc@scylladb.com>	2020-08-16 17:35:55 +03:00
Raphael S. Carvalho	7d7f9e1c54	sstables/LCS: increase per-level overlapping tolerance in reshape LCS can have its overlapping invariant broken after operations that can proceed in parallel to regular compaction like cleanup. That's because there could be two compactions in parallel placing data in overlapping token ranges of a given level > 0. After reshape, the whole table will be rewritten, on restart, if a given level has more than (fan_out*2)=20 overlaps. That may sound like enough, but that's not taking into account the exponential growth in # of SSTables per level, so 20 overlaps may sound like a lot for level 2 which can afford 100 sstables, but it's only 2% of level 3, and 0.2% of level 4. So let's change the overlapping tolerance from the constant of fan_out*2 to 10% of level limit on # of SSTables, or fan_out, whichever is higher. Refs #6938. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200810154510.32794-1-raphaelsc@scylladb.com>	2020-08-16 17:33:48 +03:00
Raphael S. Carvalho	11df96718a	compaction: Prevent non-regular compaction from picking compacting SSTables After `8014c7124`, cleanup can potentially pick a compacting SSTable. Upgrade and scrub can also pick a compacting SSTable. The problem is that table::candidates_for_compaction() was badly named. It misleads the user into thinking that the SSTables returned are perfect candidates for compaction, but manager still needs to filter out the compacting SSTables from the returned set. So it's being renamed. When the same SSTable is compacted in parallel, the strategy invariant can be broken like overlapping being introduced in LCS, and also some deletion failures as more than one compaction process would try to delete the same files. Let's fix scrub, cleanup and upgrade by calling the manager function which gets the correct candidates for compaction. Fixes #6938. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200811200135.25421-1-raphaelsc@scylladb.com>	2020-08-16 17:31:03 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously `count` function was often used in various ways. `contains` not only expresses the intent of the code better but also does it in a more unified way. This commit replaces all the occurrences of the `count` with the `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Benny Halevy	13f437157a	compaction_manager: register_compacting_sstables: allocate before registering sstables make all required allocations in advance to merging sstables into _compacting_sstables so it should not throw after registering some sstables, but not all. Test: database_test(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200811132440.416945-1-bhalevy@scylladb.com>	2020-08-11 18:14:58 +03:00
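
The arithmetic in the sstables/LCS commit (7d7f9e1c54) can be sketched as follows. This is only an illustration of the rule stated in the message ("10% of level limit on # of SSTables, or fan_out, whichever is higher"), not the code from the patch. The names `fan_out`, `max_sstables_in_level` and `overlap_tolerance` are hypothetical, `fan_out = 10` is inferred from the message's old constant of 20, and the per-level limit of `fan_out^level` is an assumption based on the figures quoted there (100 sstables for level 2, with 20 being 2% of level 3 and 0.2% of level 4).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical constant: the commit message implies a fan-out of 10,
// since the old fixed tolerance of twice the fan-out was 20.
constexpr std::uint64_t fan_out = 10;

// Assumed per-level SSTable limit: fan_out^level
// (100 for level 2, 1000 for level 3, 10000 for level 4).
constexpr std::uint64_t max_sstables_in_level(unsigned level) {
    std::uint64_t limit = 1;
    for (unsigned i = 0; i < level; ++i) {
        limit *= fan_out;
    }
    return limit;
}

// Tolerance as described in the message: 10% of the level limit,
// or fan_out, whichever is higher.
constexpr std::uint64_t overlap_tolerance(unsigned level) {
    return std::max(fan_out, max_sstables_in_level(level) / 10);
}
```

Under these assumptions the tolerance stays at 10 overlaps for levels 1 and 2 and grows to 100 for level 3 and 1000 for level 4, so it scales with the level size instead of shrinking to a fraction of a percent.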


2237 Commits