scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 19:46:48 +00:00

Author	SHA1	Message	Date
Avi Kivity	9c37fdaca3	Revert "dht: incremental_owned_ranges_checker: use lower_bound()" This reverts commit `d85af3dca4`. It restores the linear search algorithm, as we expect the search to terminate near the origin. In this case linear search is O(1) while binary search is O(log n). A comment is added so we don't repeat the mistake. Closes #13704	2023-05-02 08:01:44 +03:00
Kamil Braun	30cc07b40d	Merge 'Introduce tablets' from Tomasz Grabiec This PR introduces an experimental feature called "tablets". Tablets are a way to distribute data in the cluster, which is an alternative to the current vnode-based replication. Vnode-based replication strategy tries to evenly distribute the global token space shared by all tables among nodes and shards. With tablets, the aim is to start from a different side. Divide resources of replica-shard into tablets, with a goal of having a fixed target tablet size, and then assign those tablets to serve fragments of tables (also called tablets). This will allow us to balance the load in a more flexible manner, by moving individual tablets around. Also, unlike with vnode ranges, tablet replicas live on a particular shard on a given node, which will allow us to bind raft groups to tablets. Those goals are not yet achieved with this PR, but it lays the ground for this. Things achieved in this PR: - You can start a cluster and create a keyspace whose tables will use tablet-based replication. This is done by setting `initial_tablets` option: ``` CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3, 'initial_tablets': 8}; ``` All tables created in such a keyspace will be tablet-based. Tablet-based replication is a trait, not a separate replication strategy. Tablets don't change the spirit of replication strategy, it just alters the way in which data ownership is managed. In theory, we could use it for other strategies as well like EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy is augmented to support tablets. - You can create and drop tablet-based tables (no DDL language changes) - DML / DQL work with tablet-based tables Replicas for tablet-based tables are chosen from tablet metadata instead of token metadata Things which are not yet implemented: - handling of views, indexes, CDC created on tablet-based tables - sharding is done using the old method, it ignores the shard allocated in tablet metadata - node operations (topology changes, repair, rebuild) are not handling tablet-based tables - not integrated with compaction groups - tablet allocator piggy-backs on tokens to choose replicas. Eventually we want to allocate based on current load, not statically Closes #13387 * github.com:scylladb/scylladb: test: topology: Introduce test_tablets.py raft: Introduce 'raft_server_force_snapshot' error injection locator: network_topology_strategy: Support tablet replication service: Introduce tablet_allocator locator: Introduce tablet_aware_replication_strategy locator: Extract maybe_remove_node_being_replaced() dht: token_metadata: Introduce get_my_id() migration_manager: Send tablet metadata as part of schema pull storage_service: Load tablet metadata when reloading topology state storage_service: Load tablet metadata on boot and from group0 changes db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata() migration_notifier: Introduce before_drop_keyspace() migration_manager: Make prepare_keyspace_drop_announcement() return a future<> test: perf: Introduce perf-tablets test: Introduce tablets_test test: lib: Do not override table id in create_table() utils, tablets: Introduce external_memory_usage() db: tablets: Add printers db: tablets: Add persistence layer dht: Use last_token_of_compaction_group() in split_token_range_msb() locator: Introduce tablet_metadata dht: Introduce first_token() dht: Introduce next_token() storage_proxy: Improve trace-level logging locator: token_metadata: Fix confusing comment on ring_range() dht, storage_proxy: Abstract token space splitting Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries" db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms() db: Introduce get_non_local_vnode_based_strategy_keyspaces() service: storage_proxy: Avoid copying keyspace name in write handler locator: Introduce per-table replication strategy treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type locator: Introduce effective_replication_map locator: Rename effective_replication_map to vnode_effective_replication_map locator: effective_replication_map: Abstract get_pending_endpoints() db: Propagate feature_service to abstract_replication_strategy::validate_options() db: config: Introduce experimental "TABLETS" feature db: Log replication strategy for debugging purposes db: Log full exception on error in do_parse_schema_tables() db: keyspace: Remove non-const replication strategy getter config: Reformat	2023-04-27 09:40:18 +02:00
Kefu Chai	f5b05cf981	treewide: use defaulted operator!=() and operator==() in C++20, compiler generate operator!=() if the corresponding operator==() is already defined, the language now understands that the comparison is symmetric in the new standard. fortunately, our operator!=() is always equivalent to `! operator==()`, this matches the behavior of the default generated operator!=(). so, in this change, all `operator!=` are removed. in addition to the defaulted operator!=, C++20 also brings to us the defaulted operator==() -- it is able to generated the operator==() if the member-wise lexicographical comparison. under some circumstances, this is exactly what we need. so, in this change, if the operator==() is also implemented as a lexicographical comparison of all memeber variables of the class/struct in question, it is implemented using the default generated one by removing its body and mark the function as `default`. moreover, if the class happen to have other comparison operators which are implemented using lexicographical comparison, the default generated `operator<=>` is used in place of the defaulted `operator==`. sometimes, we fail to mark the operator== with the `const` specifier, in this change, to fulfil the need of C++ standard, and to be more correct, the `const` specifier is added. also, to generate the defaulted operator==, the operand should be `const class_name&`, but it is not always the case, in the class of `version`, we use `version` as the parameter type, to fulfill the need of the C++ standard, the parameter type is changed to `const version&` instead. this does not change the semantic of the comparison operator. and is a more idiomatic way to pass non-trivial struct as function parameters. please note, because in C++20, both operator= and operator<=> are symmetric, some of the operators in `multiprecision` are removed. they are the symmetric form of the another variant. if they were not removed, compiler would, for instance, find ambiguous overloaded operator '=='. this change is a cleanup to modernize the code base with C++20 features. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13687	2023-04-27 10:24:46 +03:00
Kefu Chai	5a11d67709	dht: token: s/tri_compare/operator<=>/ now that C++20 is able to generate the default-generated comparing operators for us. there is no need to define them manually. and, `std::rel_ops::*` are deprecated in C++20. also, use `foo <=> bar` instead of `tri_compare(foo, bar)` for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-26 14:09:57 +08:00
Kefu Chai	cc87e10f40	dht: print pk in decorated_key with "pk" prefix this change ensures that `dk._key` is formatted with the "pk" prefix. as in `3738fcb`, the `operator<<` for partition_key was removed. so the compiler has to find an alternative when trying to fulfill the needs when this operator<< is called. fortunately, from the compiler's perspective, `partition_key` has an `operator managed_bytes_view`, and this operator does not have the explicit specifier, and, `managed_bytes_view` does support `operator<<`. so this ends up with a change in the format of `decorated_key` when it is printed using `operator<<`. the code compiles. but unfortunately, the behavior is changed, and it breaks scylla-dtest/cdc_tracing_info_test.py where the partition_key is supposed to be printed like "pk{010203}" instead of "010203". the latter is how `managed_bytes_view` is formatted. a test is added accordingly to avoid future changes which break the dtest. Fixes scylladb#13628 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13653	2023-04-25 09:53:47 +02:00
Tomasz Grabiec	fa8ad9a585	dht: Use last_token_of_compaction_group() in split_token_range_msb()	2023-04-24 10:49:37 +02:00
Tomasz Grabiec	fceb5f8cf6	locator: Introduce tablet_metadata token_metadata now stores tablet metadata with information about tablets in the system.	2023-04-24 10:49:37 +02:00
Tomasz Grabiec	241f7febec	dht: Introduce first_token()	2023-04-24 10:49:36 +02:00
Tomasz Grabiec	462e3ffd36	dht: Introduce next_token()	2023-04-24 10:49:36 +02:00
Tomasz Grabiec	d3c9ad4ed6	locator: Rename effective_replication_map to vnode_effective_replication_map In preparation for introducing a more abstract effective_replication_map which can describe replication maps which are not based on vnodes.	2023-04-24 10:49:36 +02:00
Kamil Braun	55f43e532c	Merge 'get rid of gms/failure_detector' from Benny Halevy Move gms::arrival_window to api/failure_detector which is its only user. and get rid of the rest, which is not used, now that we use direct_failure_detector instead. TODO: integare direct_failure_detector with failure_detector api. Closes #13576 * github.com:scylladb/scylladb: gms: get rid of unused failure_detector api: failure_detector: remove false dependency on failure_detector::arrival_window test: rest_api: add test_failure_detector	2023-04-21 11:47:44 +02:00
Benny Halevy	3f1ac846d8	gms: get rid of unused failure_detector The legacy failure_detector is now unused and can be removed. TODO: integare direct_failure_detector with failure_detector api. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-21 09:08:27 +03:00
Kefu Chai	fe9f41bd84	dht: remove unnecessarily forward declaration it turns out the declaration of `operator<<(ostream&, const dht::token&)` is unnecessarily. so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 11:41:54 +08:00
Kefu Chai	53dedca8cd	dht: specialize fmt::formatter<dht::token> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `dht::token` without the help of `operator<<`. the corresponding `operator<<()` is preserved in this change, as it has lots of users in this project, we will tackle them case-by-case in follow-up changes. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 11:41:54 +08:00
Gleb Natapov	fd6d45e178	bootstrapper: Add get_random_bootstrap_tokens function Does the same as get_bootstrap_tokens() but does not consult initial token config option. Will be used later.	2023-03-21 16:06:43 +02:00
Kefu Chai	c37f4e5252	treewide: use fmt::join() when appropriate now that fmtlib provides fmt::join(). see https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view there is not need to revent the wheel. so in this change, the homebrew join() is replaced with fmt::join(). as fmt::join() returns an join_view(), this could improve the performance under certain circumstances where the fully materialized string is not needed. please note, the goal of this change is to use fmt::join(), and this change does not intend to improve the performance of existing implementation based on "operator<<" unless the new implementation is much more complicated. we will address the unnecessarily materialized strings in a follow-up commit. some noteworthy things related to this change: * unlike the existing `join()`, `fmt::join()` returns a view. so we have to materialize the view if what we expect is a `sstring` * `fmt::format()` does not accept a view, so we cannot pass the return value of `fmt::join()` to `fmt::format()` * fmtlib does not format a typed pointer, i.e., it does not format, for instance, a `const std::string`. but operator<<() always print a typed pointer. so if we want to format a typed pointer, we either need to cast the pointer to `void` or use `fmt::ptr()`. * fmtlib is not able to pick up the overload of `operator<<(std::ostream& os, const column_definition* cd)`, so we have to use a wrapper class of `maybe_column_definition` for printing a pointer to `column_definition`. since the overload is only used by the two overloads of `statement_restrictions::add_single_column_parition_key_restriction()`, the operator<< for `const column_definition*` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 20:34:18 +08:00
Kamil Braun	fe14d14ce9	Merge 'Eliminate extraneous copies of dht::token_range_vector' from Benny Halevy In several places we copy token range vectors where we could move them and eliminate unnecessary memory copies. Ref #11005 Closes #12344 * github.com:scylladb/scylladb: dht/range_streamer: stream_async: move ranges_to_stream to do_streaming streaming: stream_session: maybe_yield streaming: stream_session: prepare: move token ranges to add_transfer_ranges streaming: stream_plan: transfer_ranges: move token ranges towards add_transfer_ranges dht/range_streamer: stream_async: do_streaming: move ranges downstream dht/range_streamer: add_ranges: clear_gently ranges_for_keyspace dht/range_streamer: get_range_fetch_map: reduce copies dht/range_streamer: add_ranges: move ranges down-stream dht/boot_strapper: move ranges to add_ranges dht/range_streamer: stream_async: incrementally update _nr_ranges_remaining dht/range_streamer: stream_async: erase from range_vec only after do_streaming success	2023-03-07 13:46:33 +01:00
Kefu Chai	c5d1a69859	build: cmake: link couple libraries as whole archive turns out we are using static variables to register entries in global registries, and these variables are not directly referenced, so linker just drops them when linking the executables or shared libraries. to address this problem, we just link the whole archive. another option would be create a linker script or pass --undefined=<symbol> to linker. neither of them is straightforward. a helper function is introduced to do this, as we cannot use CMake 3.24 as yet. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-04 13:11:25 +08:00
Kefu Chai	563fbb2d11	build: cmake: extract more subsystem out into its own CMakeLists.txt namely, cdc, compaction, dht, gms, lang, locator, mutation_writer, raft, readers, replica, service, tools, tracing and transport. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	d85af3dca4	dht: incremental_owned_ranges_checker: use lower_bound() instead of using a while loop for finding the lower_bound, just use std::lower_bound() for finding if current node owns given token. this has two advantages: * better readability: as lower_bound is exactly what this loop calculates. * lower_bound uses binary search for searching the element, this algorithm should be faster than linear under most circumstances. * lower_bound uses std::advance() and prefix increment operator, this should be more performant than the postfix increment operator. as it does not create an temporary instance of iterator. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13008	2023-03-01 11:29:46 +02:00
Benny Halevy	06a0902708	dht/range_streamer: stream_async: move ranges_to_stream to do_streaming Currently the ranges_to_stream variable lives on the caller state, and do_streaming() moves its contents down to request_ranges/transfer_ranges and then calls clear() to make it ready for reuse. This works in principle but it makes it harder for an occasional reader of this code to figure out what going on. This change transfers control of the ranges_to_stream vector to do_streaming, by calling it with (std::exchange(do_streaming, {})) and with that that moved vector doesn't need to be cleared by do_streaming, and the caller is reponsible for readying the variable for reuse in its for loop. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 17:38:34 +02:00
Benny Halevy	775c6b9697	dht/range_streamer: stream_async: do_streaming: move ranges downstream The ranges can be moved rather than copied to both `request_ranges` and `transfer_ranges` as they are only cleared after this point. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:56:55 +02:00
Benny Halevy	3cd8838a09	dht/range_streamer: add_ranges: clear_gently ranges_for_keyspace After calling get_range_fetch_map, ranges_for_keyspace is not used anymore. Synchronously destroying it may potentially stall in large clusters so use utils::clear_gently to gently clear the map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:52:30 +02:00
Benny Halevy	a80c2d16dd	dht/range_streamer: get_range_fetch_map: reduce copies Use const& to refer to the input ranges and endpoints rather than copying them individually along the way more than needed to. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:52:30 +02:00
Benny Halevy	9d6e5d50d1	dht/range_streamer: add_ranges: move ranges down-stream Eliminate extraneous copy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:52:27 +02:00
Benny Halevy	c61f058aa5	dht/boot_strapper: move ranges to add_ranges Eliminate extraneous copy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:50:40 +02:00
Benny Halevy	27b382dcce	dht/range_streamer: stream_async: incrementally update _nr_ranges_remaining Rather than calling nr_ranges_to_stream() inside `do_streaming`. As nr_ranges_to_stream depends on the `_to_stream` that will be updated only later on after the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:50:40 +02:00
Benny Halevy	c3c7efffb1	dht/range_streamer: stream_async: erase from range_vec only after do_streaming success range_vec is used for calculating nr_ranges_to_stream. Currently, the ranges_to_stream that were moved out of range_vec are push back on exception, but this isn't safe, since they may have moved already to request_ranges or transfer_ranges. Instead, erase the ranges we pass to do_streaming only after it succeeds so on exception, range_vec will not need adjusting. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:50:40 +02:00
Kefu Chai	df63e2ba27	types: move types.{cc,hh} into types they are part of the CQL type system, and are "closer" to types. let's move them into "types" directory. the building systems are updated accordingly. the source files referencing `types.hh` were updated using following command: ``` find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} + ``` the source files under sstables include "types.hh", which is indeed the one located under "sstables", so include "sstables/types.hh" instea, so it's more explicit. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12926	2023-02-19 21:05:45 +02:00
Kefu Chai	0cb842797a	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:57:18 +02:00
Botond Dénes	c927eea1d5	Merge 'table: trim ranges for compaction group cleanup' from Benny Halevy This series contains the following changes for trimming the ranges passed to cleanup a compaction group to the compaction group owned token_range. table: compaction_group_for_token: use signed arithmetic Fixes #12595 table: make_compaction_groups: calculate compaction_group token ranges table: perform_cleanup_compaction: trim owned ranges on compaction_group boundaries Fixes #12594 Closes #12598 * github.com:scylladb/scylladb: table: perform_cleanup_compaction: trim owned ranges on compaction_group boundaries table: make_compaction_groups: calculate compaction_group token ranges dht: range_streamer: define logger as static	2023-01-30 13:11:28 +02:00
Benny Halevy	82011fc489	dht: incremental_owned_ranges_checker: belongs_to_current_node: mark as const Its _it member keeps state about the current range. Although it's modified by the method, this is an implementation detail that irrelevant to the caller, hence mark the belongs_to_current_node method as const (and noexcept while at it). This allows the caller, cleanup_compaction, to use it from inside a const method, without having to mark its respective member as mutable too. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12634	2023-01-25 14:52:21 +02:00
Benny Halevy	95a8e0b21d	table: make_compaction_groups: calculate compaction_group token ranges Add dht::split_token_range_msb that returns a token_range_vector with ranges split using a given number of most-significant bits. When creating the table's compaction groups, use dht::split_token_range_msb to calculate the token_range owned by each compaction_group. Refs #12594 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-22 22:54:26 +02:00
Benny Halevy	912b56ebcf	dht: range_streamer: define logger as static dht::logger can't be global in this case, as it's too generic, but should be static to range_streamer. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-22 22:54:26 +02:00
Benny Halevy	8009585e7d	table: compaction_group_for_token: use signed arithmetic Add and use dht::compaction_group_of that computes the compaction_group index by unbiasing the token, similar to dht::shard_of. This way, all tokens in `_compaction_groups[i]` are ordered before `_compaction_groups[j]` iff i < j. Fixes #12595 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12599	2023-01-22 11:27:07 +02:00
Botond Dénes	50b155e706	dht/i_partitioner.hh: ring_position_ext: add weight() accessor	2023-01-09 09:46:57 -05:00
Benny Halevy	57ff3f240f	dht: optimize subtract_ranges Take advantage of the fact that both ranges and ranges_to_subtract are deoverlapped and sorted by to reduce the calculation complexity from quadratic to linear. Fixes #11922 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-21 15:48:28 +02:00
Benny Halevy	8b81635d95	compaction: refactor dht::subtract_ranges out of get_ranges_for_invalidation The algorithm is generic and can be used elsewhere. Add a unit test for the function before it gets optimized in the following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-21 15:48:26 +02:00
Benny Halevy	10f8f13b90	db: view_update_generator: always clean up staging sstables Since they are currently not cleaned up by cleanup compaction filter their tokens, processing only tokens owned by the current node (based on the keyspace replication strategy). Refs #9559 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-09 07:38:22 +02:00
Benny Halevy	fd3e66b0cc	compaction: extract incremental_owned_ranges_checker out to dht It is currently used by cleanup_compaction partition filter. Factor it out so it can be used to filter staging sstables in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-09 07:32:56 +02:00
Botond Dénes	169a8a66f2	compatible_ring_position_or_view: make it cheap to copy This class exists for one purpose only: to serve as glue code between dht::ring_position and boost::icl::interval_map. The latter requires that keys in its intervals are: * default constructible * copyable * have standalone compare operations For this reason we have to wrap `dht::ring_position` in a class, together with a schema to provide all this. This is `compatible_ring_position`. There is one further requirement by code using the interval map: it wants to do lookups without copying the lookup key(s). To solve this, we came up with `compatible_ring_position_or_view` which is a union of a key or a key view + schema. As we recently found out, boost::icl copies its keys a lot. It seems to assume these keys are cheap to copy and carelessly copies them around even when iterating over the map. But `compatible_ring_position_or_view` is not cheap to copy as it copies a `dht::ring_position` which allocates, and it does that via an `std::optional` and `std::variant` to add insult to injury. This patch make said class cheap to copy, by getting rid of the variant and storing the `dht::ring_position` via a shared pointer. The view is stored separately and either points to the ring position stored in the shared pointer or to an outside ring position (for lookups). Fixes: #11669 Closes #11670	2022-10-04 12:00:21 +03:00
Asias He	9ed401c4b2	streaming: Add finished percentage metrics for node ops using streaming We have added the finished percentage for repair based node operations. This patch adds the finished percentage for node ops using the old streaming. Example output: scylla_streaming_finished_percentage{ops="bootstrap",shard="0"} 1.000000 scylla_streaming_finished_percentage{ops="decommission",shard="0"} 1.000000 scylla_streaming_finished_percentage{ops="rebuild",shard="0"} 0.561945 scylla_streaming_finished_percentage{ops="removenode",shard="0"} 1.000000 scylla_streaming_finished_percentage{ops="repair",shard="0"} 1.000000 scylla_streaming_finished_percentage{ops="replace",shard="0"} 1.000000 In addition to the metrics, log shows the percentage is added. [shard 0] range_streamer - Finished 2698 out of 2817 ranges for rebuild, finished percentage=0.95775646 Fixes #11600 Closes #11601	2022-09-22 14:19:34 +03:00
Pavel Emelyanov	b6fdea9a79	code: Call sort_endpoints_by_proximity() via topology The method is about to be moved from snitch to topology, this patch prepares the rest of the code to use the latter to call it. The topology's method just calls snitch, but it's going to change in the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-05 15:14:01 +03:00
Pavel Emelyanov	4184091f1c	snitch, code: Remove get_sorted_list_by_proximity() There are two sorting methods in snitch -- one sorts the list of addresses in place, the other one creates a sorted copy of the passed const list (in fact -- the passed reference is not const, but it's not modified by the method). However, both callers of the latter anyway create their own temporary list of address, so they don't really benefit from snitch generating another copy. So this patch leaves just one sorting method -- the in-place one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-05 15:11:37 +03:00
Pavel Emelyanov	6dedc69608	topology: Do not add bootstrapping nodes to topology Recent change in topology (commit `4cbe6ee9` titled "topology: Require entry in the map for update_normal_tokens()") made token_metadata::update_normal_tokens() require the entry presense in the embedded topology object. Respectively, the commit in question equipped most callers of update_normal_tokens() with preceeding topology update call to satisfy the requirement. However, tokens are put into token_metadata not only for normal state, but also for bootstrapping, and one place that added bootstrapping tokens errorneously got topology update. This is wrong -- node must not be present in the topology until switching into normal state. As the result several tests with bootstrapping nodes started to fail. The fix removes topology update for bootstrapping nodes, but this change reveals few other places that piggy-backed this mistaken update, so noy _they_ need to update topology themselves. tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2040/ update_cluster_layout_tests.py::test_simple_add_new_node_while_schema_changes_with_repair update_cluster_layout_tests.py::test_simple_kill_new_node_while_bootstrapping_with_parallel_writes_in_multidc repair_based_node_operations_test.py::test_lcs_reshape_efficiency Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220902082753.17827-1-xemul@scylladb.com>	2022-09-04 13:53:38 +03:00
Pavel Emelyanov	7305061674	replication_strategy: Accept dc-rack as get_pending_address_ranges argument The method creates a copy of token metadata and pushes an endpoint (with some tokens) into it. Next patches will require providing dc/rack info together with the endpoint, this patch prepares for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-26 09:39:44 +03:00
Pavel Emelyanov	360c4f8608	dht: Carry dc-rack over boot_strapper and range_streamer Both classes may populate (temporarly clones of) token metadata object with endpoint:tokens pairs for the endpoint they work with. Next patches will require that endpoint comes with the dc/rack info. This patch makes sure dht classes have the necessary information at hand (for now it's just empty pair of strings). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-26 09:37:02 +03:00
Benny Halevy	91ab8ee1c3	effective_replication_map: make get_range_addresses asynchronous So it may yield, preenting reactor stalls as seen in #11005. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:01 +03:00
Benny Halevy	9b2af3f542	range_streamer: add_ranges and friends: get erm as param Rather than getting it in the callee, let the caller (e.g. storage_service) hold the erm and pass it down to potentially multiple async functions. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:01 +03:00
Benny Halevy	7ee6048255	database: add get_non_local_strategy_keyspaces For node operations, we currently call get_non_system_keyspaces but really want to work on all keyspace that have non-local replication strategy as they are replicated on other nodes. Reflect that in the replica::database function name. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:01 +03:00

1 2 3 4 5 ...

455 Commits