scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 19:35:12 +00:00

Author	SHA1	Message	Date
Avi Kivity	71398f3fb4	Merge "Cleanup sstable writer" from Benny " This series cleans up the legacy and common sstable writer code. metadata_collector::_ancestors were moved to class sstable so that the former can be moved out of sstable into file_writer_impl. Moved setting of replay position and sstable level via sstable_writer_config so that compaction won't need to access the metadata_collector via the sstable. With that, metadata_collector could be moved from class sstable to sstable_writer::writer_impl along with the column_stats. That allowed moving "generic" file_writer methods that were actually k/l format specific into sstable_writer_k_l. Eventually `file_writer` code is moved into sstables/writer.cc and sstable_writer_k_l into sstables/kl/writer.{hh,cc} A bonus cleanup is the ability to get rid of sstable::_correctly_serialize_non_compound_range_tombstones as it's now available to the writers via the writer configuration and not required to be stored in the sstable object. Fixes #3012 Test: unit(dev) " * tag 'cleanup-sstable-writer-v2' of github.com:bhalevy/scylla: sstables: move writer code away to writer.cc sstables: move sstable_writer_k_l away to kl/writer sstables: get rid of sstable::_correctly_serialize_non_compound_range_tombstones sstables: move writer methods to sstable_writer_k_l sstables: move compaction ancestors to sstable sstables: sstable_writer: optionally set sstable level via config sstables: sstable_writer: optionally set replay position via config sstables: compaction: make_sstable_writer_config sstables: open code update_stats_on_end_of_stream in sstable_writer::consume_end_of_stream sstables: fold components_writer into sstable_writer_k_l sstables: move sstable_writer_k_l definition upwards sstables: components_writer: turn _index into unique_ptr	2020-10-15 10:40:28 +03:00
Benny Halevy	279865e56c	sstables: move writer code away to writer.cc Move `file_writer` code into sstables/writer.cc Fixes #3012 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 23:41:47 +03:00
Benny Halevy	20adb96f62	sstables: move sstable_writer_k_l away to kl/writer Move the sstable_writer_k_l code into sstables/kl/writer.{hh,cc} Refs #3012 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 23:40:56 +03:00
Benny Halevy	8cd4d53643	sstables: mx/writer: fix copy-paste error in reader_semaphore name It was copied from sstables.cc in `6ca0464af5`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20201014171651.541232-1-bhalevy@scylladb.com>	2020-10-14 22:17:49 +02:00
Benny Halevy	96cd6adc71	sstables: get rid of sstable::_correctly_serialize_non_compound_range_tombstones Now it's available to the writers via the writer configuration and not required to be stored in the sstable object. Refs #3012 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 19:53:23 +03:00
Benny Halevy	97a446f9fa	sstables: move writer methods to sstable_writer_k_l They are called solely from the sstable_writer_k_l path. With that, move the metadata collector and column stats to writer_impl. They are now only used by the sstable writers. Refs #3012 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 19:52:17 +03:00
Benny Halevy	e1692bec17	sstables: move compaction ancestors to sstable Compaction needs access to the sstable's ancestors so we need to keep the ancestors for the sstable separately from the metadata collector as the latter is about to be moved to the sstable writer. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 19:51:26 +03:00
Benny Halevy	a49a5f36c1	sstables: sstable_writer: optionally set sstable level via config And use compaction::make_sstable_writer_config to pass the compaction's `_sstable_level` to the writer via sstable_writer_config, instead of via the sstable metadata_collector, which is going to move from the sstable to the writer_impl. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 19:49:36 +03:00
Benny Halevy	ac3c33ffca	sstables: sstable_writer: optionally set replay position via config And use compaction::make_sstable_writer_config to pass the compaction's replay_position (`_rp`) to the writer via sstable_writer_config, instead of via the sstable metadata_collector, which is going to move from the sstable to the writer_impl. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 19:39:46 +03:00
Benny Halevy	e314eb3f78	sstables: compaction: make_sstable_writer_config Consolidate the code to make the sstable_writer_config for sstable writers into a helper method. Following patches will add the ability to set the replay position and sstable level via that config structure. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 18:01:46 +03:00
Benny Halevy	55d73ec2bc	sstables: open code update_stats_on_end_of_stream in sstable_writer::consume_end_of_stream In preparation to moving sstable methods to sstable_writer_k_l as part of #3012. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 17:46:26 +03:00
Benny Halevy	27e3c03ce2	sstables: fold components_writer into sstable_writer_k_l It serves no purpose being a different class but being called by sstable_writer_k_l. Refs #3012. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 17:40:47 +03:00
Benny Halevy	56a6a4ff17	sstables: move sstable_writer_k_l definition upwards To facilitate consolidation of components_writer and some sstable methods into sstable_writer_k_l. Refs #3012. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 17:13:44 +03:00
Benny Halevy	8f239f8f4c	sstables: components_writer: turn _index into unique_ptr In preparation to folding components_writer into sstable_writer_k_l in a following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 17:10:31 +03:00
Avi Kivity	86bbf1763d	Merge "reader concurrency semaphore: dump permit diagnostics on timeout or queue overflow" from Botond " The reader concurrency semaphore timing out or its queue overflowing are fairly common events both in production and in testing. At the same time it is a hard to diagnose problem that often has a benign cause (especially during testing), but it is equally possible that it points to something serious. So when this error starts to appear in logs, usually we want to investigate and the investigation is lengthy... either involves looking at metrics or coredumps or both. This patch intends to jumpstart this process by dumping diagnostics on semaphore timeout or queue overflow. The diagnostics are printed to the log with debug level to avoid excessive spamming. They contain a histogram of all the permits associated with the problematic semaphore organized by table, operation and state. Example: DEBUG 2020-10-08 17:05:26,115 [shard 0] reader_concurrency_semaphore - Semaphore _read_concurrency_sem: timed out, dumping permit diagnostics: Permits with state admitted, sorted by memory memory count name 3499M 27 ks.test:data-query 3499M 27 total Permits with state waiting, sorted by count count memory name 1 0B ks.test:drain 7650 0B ks.test:data-query 7651 0B total Permits with state registered, sorted by count count memory name 0 0B total Total: permits: 7678, memory: 3499M This allows determining several things at a glance: * What are the tables involved * What are the operations involved * Where is the memory This can speed up a follow-up investigation greatly, or it can even be enough on its own to determine that the issue is benign. Tests: unit(dev, debug) " * 'dump-diagnostics-on-semaphore-timeout/v2' of https://github.com/denesb/scylla: reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow utils: add to_hr_size() reader_concurrency_semaphore: link permits into an intrusive list reader_concurrency_semaphore: move expiry_handler::operator()() out-of-line reader_concurrency_semaphore: move constructors out-of-line reader_concurrency_semaphore: add state to permits reader_concurrency_semaphore: name permits querier_cache_test: test_immediate_evict_on_insert: use two permits multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader() multishard_combining_reader: add permit parameter multishard_combining_reader: shard_reader: use multishard reader's permit	2020-10-13 12:44:23 +03:00
Botond Dénes	ff623e70b3	reader_concurrency_semaphore: name permits Require a schema and an operation name to be given to each permit when created. The schema is that of the table the read is executed against, and the operation name identifies the operation the permit is part of. Ideally this should be different for each site the permit is created at, to be able to discern not only different kinds of reads, but different code paths the read took. As not all reads can be associated with one schema, the schema is allowed to be null. The name will be used for debugging purposes, both for coredump debugging and runtime logging of permit-related diagnostics.	2020-10-13 12:32:13 +03:00
Avi Kivity	3451579d81	sstables: move component_type formatter to namespace sstables Without this, clang complains that we violate argument dependent lookup rules: note: 'operator<<' should be declared prior to the call site or in namespace 'sstables' std::ostream& operator<<(std::ostream&, const sstables::component_type&); we can't enforce the #include order, but we can easily move it to namespace sstables (where it belongs anyway), so let's do that. gcc is happy either way. Closes #7413	2020-10-12 21:49:25 +02:00
Avi Kivity	5065ae835f	sstables: move bound_kind_m formatter to namespace sstables Without this, clang complains that we violate argument dependent lookup rules: note: 'operator<<' should be declared prior to the call site or in namespace 'sstables' std::ostream& operator<<(std::ostream&, const sstables::bound_kind_m&); we can't enforce the #include order, but we can easily move it to namespace sstables (where it belongs anyway), so let's do that. gcc is happy either way.	2020-10-12 20:38:11 +03:00
Avi Kivity	a00fca1a69	sstables: move bound_kind_m formatter to its natural place Move bound_kind_m's formatter to the same header file where it is defined. This prevents cases where the compiler decays the type (an enum) to the underlying integral type because it does not see the formatter declaration, resulting in the wrong output.	2020-10-12 20:36:10 +03:00
Avi Kivity	69c3533d97	sstables: deinline bound_kind_m formatter The formatter is by no means hot code and should not be inlined.	2020-10-12 20:35:08 +03:00
Avi Kivity	4d6739c2e6	Merge "Use max_concurrent_for_each" from Benny " max_concurrent_for_each was added to seastar for replacing sstable_directory::parallel_for_each_restricted by using more efficient concurrency control that doesn't create unlimited number of continuations. The series replaces the use of sstable_directory::parallel_for_each_restricted with max_concurrent_for_each and exposes the sstable_directory::do_for_each_sstable via a static method. This method is used here by table::snapshot to limit concurrency of snapshot operations that suffer from the same unbound concurrency problem sstable_directory solved. In addition sstable_directory::_load_semaphore that was used across calls to do_for_each_sstable was replaced by a static per-shard semaphore that caps concurrency across all calls to `do_for_each_sstable` on that shard. This makes sense since the disk is a shared resource. In the future, we may want to have a load semaphore per device rather than a single global one. We should experiment with that. Test: unit(dev) " * tag 'max_concurrent_for_each-v5' of github.com:bhalevy/scylla: table: snapshot: use max_concurrent_for_each sstable_directory: use a external load_semaphore test: sstable_directory_test: extract sstable_directory creation into with_sstable_directory distributed_loader: process_upload_dir: use initial_sstable_loading_concurrency sstables: sstable_directory: use max_concurrent_for_each	2020-10-12 09:43:12 +03:00
Avi Kivity	b172e4c2ce	sstables: make index_bound a non-nested struct Due to a longstanding bug in clang[1], the compiler doesn't think that such a class is default-constructible. This causes std::optional<index_bound>::optional() not to compile. Because it depends on open_tt_marker, extract that too. [1] https://stackoverflow.com/questions/47974898/clang-5-stdoptional-instantiation-screws-stdis-constructible-trait-of-the-p Closes #7387	2020-10-11 17:40:01 +03:00
Avi Kivity	8932c4e919	compaction: allow _max_sstable_size = 0 Some tests (run_based_compaction_test at least) use _max_sstable_size = 0 in order to force one partition per sstable. That triggers an overflow when calculating the expected bloom filter size. The overflow doesn't matter for normal operation, because the result later appears in a divisor, but does trigger a ubsan error. Squelch the error by not dividing by zero here. I tried using _max_sstable_size = 1, but the test failed for other reasons. Closes #7375	2020-10-11 15:43:51 +03:00
Benny Halevy	57cc5f6ae1	sstable_directory: use a external load_semaphore Although each sstable_directory limits concurrency using max_concurrent_for_each, there could be a large number of calls to do_for_each_sstable running in parallel (e.g per keyspace X per table in the distributed_loader). To cap parallelism across sstable_directory instances and concurrent calls to do_for_each_sstable, start a sharded<semaphore> and pass a shared semaphore& to the sstable_directory:s. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-08 11:57:06 +03:00
Benny Halevy	c26c784882	sstables: sstable_directory: use max_concurrent_for_each Use max_concurrent_for_each instead of parallel_for_each in sstable_directory::parallel_for_each_restricted to avoid creating potentially thousands of continuations, one for each sstable. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-07 14:45:20 +03:00
Botond Dénes	dd372c8457	flat_mutation_reader: de-virtualize buffer_size() The main user of this method, the one which required this method to return the collective buffer size of the entire reader tree, is now gone. The remaining two users just use it to check the size of the reader instance they are working with. So de-virtualize this method and reduce its responsibility to just returning the buffer size of the current reader instance.	2020-10-06 08:22:56 +03:00
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00
Botond Dénes	72a88e0257	mutation_fragment: s/as_mutable_range_tombstone/mutate_as_range_tombstone/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_range_tombstone() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	4f5ccf82cb	mutation_fragment: s/as_mutable_clustering_row/mutate_as_clustering_row/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_clustering_row() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	f2b9cad4c6	mutation_fragment: s/as_mutable_static_row/mutation_as_static_row/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_static_row() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	3fab83b3a1	flat_mutation_reader: impl: add reader_permit parameter Not used yet, this patch does all the churn of propagating a permit to each impl. In the next patch we will use it to track the memory consumption of `_buffer`.	2020-09-28 10:53:48 +03:00
Avi Kivity	2bd264ec6a	sstables: remove background_jobs(), await_background_jobs() There are no more users for registering background jobs, so remove the mechanism and the remaining calls.	2020-09-23 20:55:17 +03:00
Avi Kivity	5db96170a5	sstables: make sstables_manager take charge of closing sstables Currently, closing sstables happens from the sstable destructor. This is problematic since a destructor cannot wait for I/O, so we launch the file close process in the background. We therefore lose track of when the closing actually takes place. This patch makes sstables_manager take charge of the close process. Every sstable is linked into one of two intrusive lists in its manager: _active or _undergoing_close. When the reference count of the sstable drops to zero, we move it from _active to _undergoing_close and begin closing the files. sstables_manager remembers all closes and when sstables_manager::close() is called, it waits for all of them to complete. Therefore, sstables_manager::close() allows us to know that all files it manages are closed (and deleted if necessary). The sstables_manager also gains a destructor, which disables move construction.	2020-09-23 20:55:17 +03:00
Avi Kivity	f9aa50dcbf	test: sstables test_env: introduce manager() accessor This returns the sstables_manager carried by the test_env. We will soon retire the global test_sstables_manager, so we need to provide access to one.	2020-09-23 20:55:10 +03:00
Avi Kivity	a90a511d36	sstables_manager: introduce a stub close() sstables_manager is going to take charge of its sstables lifetimes, so it will need a close() to wait until sstables are deleted. This patch adds sstables_manager::close() so that the surrounding infrastructure can be wired to call it. Once that's done, we can make it do the waiting.	2020-09-23 20:55:04 +03:00
Avi Kivity	d19c6c0d98	sstables: size_tiered_backlog_tracker: avoid assignment of non-constexpr expression to constexpr object std::log() is not constexpr, so it cannot be assigned to a constexpr object. Make it non-constexpr and automatic. The optimizer still figures out that it's constant and optimizes it. Found by clang. Apparently gcc only checks the expression is constant, not constexpr.	2020-09-21 16:32:53 +03:00
Avi Kivity	a155b2bced	sstables: leveled_manifest: prevent benign precision loss warning Casting from the maximum int64_t to double loses precision, because int64_t has 64 bits of precision while double has only 53. Clang warns about it. Since it's not a real problem here, add an explicit cast to silence the warning.	2020-09-21 16:32:53 +03:00
Avi Kivity	aa7426bde6	sstables: index_reader: make 'index_bound' public index_reader::index_bound must be constructible by non-friend classes since it's used in std::optional (which isn't anyone's friend). This now works in gcc because gcc's inter-template access checking is broken, but clang correctly rejects it.	2020-09-21 16:32:53 +03:00
Avi Kivity	bd42bdd6b5	sstables: index_reader: disambiguate promoted_index_blocks_reader "state" type and data member promoted_index_blocks_reader has a data member called "state", and a type member called "state". Somehow gcc manages to disambiguate the two when used, but clang doesn't. I believe clang is correct here, one member should subsume the other. Change the type member to have a different name to disambiguate the two.	2020-09-21 16:32:53 +03:00
Piotr Sarna	16b4b86697	sstables: drop checks for non-compound range tombstones support Correct non-compound range tombstones have been supported for over 2 years and upgrades are only allowed from versions which already have the support, so the checks are hereby dropped.	2020-09-14 12:09:51 +02:00
Piotr Sarna	f8ed1b5b67	sstables: drop checks for correct counter order support Correct counter order has been supported for over 2 years and upgrades are only allowed from versions which already have the support, so the checks are hereby dropped.	2020-09-14 12:05:11 +02:00
Avi Kivity	64c7c81bac	Merge "Update log messages to {fmt} rules" from Pavel E " Before seastar is updated with the {fmt} engine under the logging hood, some changes are to be made in scylla to conform to {fmt} standards. Compilation and tests checked against both -- old (current) and new seastar-s. tests: unit(dev), manual " * 'br-logging-update' of https://github.com/xemul/scylla: code: Force formatting of pointer in .debug and .trace code: Format { and } as {fmt} needs streaming: Do not reveal raw pointer in info message mp_row_consumer: Provide hex-formatting wrapper for bytes_view heat_load_balance: Include fmt/ranges.h	2020-09-03 15:10:09 +03:00
Raphael S. Carvalho	adf576f769	compaction_manager: export method that returns if table has ongoing compaction A compaction strategy, that supports parallel compaction, may want to know if the table has compaction running on its behalf before making a decision. For example, a size-tiered-like strategy may not want to trigger a behavior, like cross-tier compaction, when there's ongoing compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200901134306.23961-1-raphaelsc@scylladb.com>	2020-09-02 16:46:49 +03:00
Raphael S. Carvalho	7f7f366cb5	compaction: add debug msg to inform the amount of expired ssts skipped by compaction this information is useful when debugging compaction issues that involve fully expired ssts. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200828140401.96440-1-raphaelsc@scylladb.com>	2020-08-31 17:18:47 +03:00
Pavel Emelyanov	812eed27fe	code: Force formatting of pointer in .debug and .trace ... and tests. Printing a pointer in logs is considered to be a bad practice, so the proposal is to keep this explicit (with fmt::ptr) and allow it for .debug and .trace cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Pavel Emelyanov	50e3a30dae	mp_row_consumer: Provide hex-formatting wrapper for bytes_view By default {fmt} doesn't know how to format this type (although it's a basic_string_view instantiated), and even providing formatter/operator<< does not help -- it anyway hits an earlier assertion in args mapper about the disallowance of character types mixing. The hex-wrapper with own operator<< solves the problem. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Benny Halevy	f5ffd5fc5f	sstables: Fix reactor stall in sstables::seal_summary() With relatively big summaries, reactor can be stalled for a couple of milliseconds. This patch: a. allocates positions upfront to avoid excessive reallocation. b. returns a future from seal_summary() and uses `seastar::do_for_each` to iterate over the summary entries so the loop can yield if necessary. Fixes #7108. Based on 2470aad5a389dfd32621737d2c17c7e319437692 by Raphael S. Carvalho <raphaelsc@scylladb.com> Test: unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200826091337.28530-1-bhalevy@scylladb.com>	2020-08-26 12:18:05 +03:00
Benny Halevy	78a44dda57	sstables: avoid double close in file_writer destructor If file_writer::close() fails to close the output stream closing will be retried in file_writer::~file_writer, leading to: ``` include/seastar/core/future.hh:1892: seastar::future<T ...> seastar::promise<T>::get_future() [with T = {}]: Assertion `!this->_future && this->_state && !this->_task' failed. ``` as seen in https://github.com/scylladb/scylla/issues/7085 Fixes #7085 Test: unit(dev), database_test with injected error in posix_file_impl::close() Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200826062456.661708-1-bhalevy@scylladb.com>	2020-08-26 11:33:23 +03:00
Rafael Ávila de Espíndola	5fcfbd76a9	sstables: Delete duplicated code For some reason date_tiered_compaction_strategy had its own identical copy of get_value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200819211509.106594-1-espindola@scylladb.com>	2020-08-26 11:33:23 +03:00
Pavel Emelyanov	171822cff8	compaction: Use database from options to get local ranges The cleanup compaction wants to keep local tokens on-board and gets them from storage_service.get_local_ranges(). This method is the wrapper around database.get_keyspace_local_ranges() created in previous patch, the live database reference is already available on the descriptor's options, so we can short-cut the call. This allows removing the last explicit call for global storage_service instance from compaction code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-21 14:58:40 +03:00


2263 Commits