scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 20:27:03 +00:00

Author	SHA1	Message	Date
Benny Halevy	3cee0f8bd9	shared_token_metadata: mutate_token_metadata: bump cloned copy ring_version Currently this is done only in storage_service::get_mutable_token_metadata_ptr but it needs to be done here as well for code paths calling mutate_token_metadata directly. Currently, it is only called from network_topology_strategy_test. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220130152157.2596086-1-bhalevy@scylladb.com>	2022-01-30 18:15:08 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Benny Halevy	17e006106b	token_metadata: update_normal_tokens: avoid unneeded sort when token ownership doesn't change Currently, we first delete all existing token mappings for the endpoint from _token_to_endpoint_map and then we add all updated token mappings for it and set should_sort_tokens if the token is newly inserted, but since we removed all existing mappings for the endpoint unconditionally, we will sort the tokens even if the token existed and its ownership did not change. This is worthwhile since there are scenarios where none of the token ownerships change. Searching and erasing tokens from the tokens unordered_set runs in constant time on average so doing it for n tokens is O(n), while sorting the tokens is O(n*log(n)). Test: unit(dev) DTest: replace_address_test.py::TestReplaceAddress::test_serve_writes_during_bootstrap(dev,debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220117101242.122512-2-bhalevy@scylladb.com>	2022-01-17 12:18:42 +02:00
Benny Halevy	25977db7b4	token_metadata: remove update_normal_token entry point It's currently used only by unit tests and it is dangerous to use on a populated token_metadata as update_normal_tokens assumes that the set of tokens owned by the given endpoint is complete, i.e. previous tokens owned by the endpoint are no longer owned by it, but the single-token update_normal_token interface seems cumulative (and has no documentation whatsoever). It is better to remove this interface and calculate a complete map of endpoint->tokens from the tests. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220117101242.122512-1-bhalevy@scylladb.com>	2022-01-17 12:18:42 +02:00
Benny Halevy	044e4a6b72	token_metadata: delete private constructor It is not used. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20211205174306.450536-1-bhalevy@scylladb.com>	2021-12-05 19:49:29 +02:00
Benny Halevy	9d2631daaf	token_metadata: calculate_pending_ranges_for_leaving: maybe yield We see long stalls as reported in https://github.com/scylladb/scylla/issues/8030#issuecomment-974783526 everywhere_replication_strategy::calculate_natural_endpoints is synchronous and doesn't yield, so add maybe_yield() calls when looping over many token ranges. Refs #8030 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20211121090339.3955278-1-bhalevy@scylladb.com> Message-Id: <20211121102606.76700-1-bhalevy@scylladb.com>	2021-11-22 10:48:25 +02:00
Benny Halevy	d953e7b01a	token_metadata: get rid of now-unused sync methods Now that abstract_replication_strategy methods are all async clone_only_token_map_sync, and update_normal_tokens_sync are unused. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:10:06 +03:00
Benny Halevy	cbe58345b9	abstract_replication_strategy: futurize get_*address_ranges Remaining callers of get_address_ranges and get_pending_address_ranges are all either from a seastar thread or from a coroutine so we can make the methods always async and drop the can_yield param. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:10:06 +03:00
Benny Halevy	bb0ea0b1c0	shared_token_metadata: set: check version monotonicity Setting the ring version backwards means it got out of sync. Possibly concurrent updates weren't serialized properly using token_metadata_lock / mutate_token_metadata. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 14:03:51 +03:00
Benny Halevy	43160abaec	token_metadata: use static ring version For generating unique _ring_version. Currently when we clone a mutable token_metadata_ptr it remains with the same _ring_version and the ring version is updated only when the topology changes. To be able to distinguish these transient copies from the ones that got applied, be stricter about the ring version and change it to a unique number using a static counter. Next patch will update the ring version (and consequently invalidate the cached_endpoints on the replication strategy) every time the token_metadata changes, not only when the topology changes. Note that the _cached_endpoints will go away once the transition to effective_replication_map is finished, so this will not degrade performance. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 14:03:17 +03:00
Benny Halevy	685f5e7704	token_metadata: get rid of copy constructor and assignment operator Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 14:00:55 +03:00
Benny Halevy	3393df45eb	token_metadata, storage_service: unify token_metadata_lock and merge_lock. Serialize the metadata changes with keyspace create, update, or drop. This will become necessary in the following patch when we update the effective_replication_map on all keyspaces and we want instances on all shards to end up with the same replication map. Note that storage_service::keyspace_changed is called from the scheme_merge path so it already holds the merge_lock. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 13:01:25 +03:00
Benny Halevy	a1c573e6d3	abstract_replication_strategy: make calculate_natural_endpoints_sync private And with that rename calculate_natural_endpoints(const token& search_token, const token_metadata&, can_yield) to do_calculate_natural_endpoints and make it protected, With this patch, all its external users call the async version, so rename it back to calculate_natural_endpoints, and make calculate_natural_endpoints_sync private since it's being called only within abstract_replication_strategy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:39:36 +03:00
Avi Kivity	369afe3124	treewide: use coroutine::maybe_yield() instead of co_await make_ready_future() The dedicated API shows the intent, and may be a tiny bit faster. Closes #9382	2021-09-23 12:28:56 +02:00
Benny Halevy	4ffdafe6dc	token_metadata: delete old java code We no longer need to keep it for reference. It's just causing confusion at this point. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210826095457.994834-1-bhalevy@scylladb.com>	2021-08-26 13:03:59 +03:00
Avi Kivity	222ef17305	build, treewide: enable -Wredundant-move Returning a function parameter guarantees copy elision and does not require a std::move(). Enable -Wredundant-move to warn us that the move is unneeded, and gain slightly more readable code. A few violations are trivially adjusted. Closes #9004	2021-07-11 12:53:02 +03:00
Benny Halevy	612793c2d4	locator: token_metadata: reuse utils::stall_free clear_gently helpers Use the generic clear_gently functions that were added in `eca9f45c59`. Test: unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210706090243.1776466-1-bhalevy@scylladb.com>	2021-07-06 12:06:43 +03:00
Avi Kivity	d2157dfea7	Merge 'locator: token_metadata: simplify `tokens_iterator`' from Michał Chojnowski `ring_range()`/`tokens_iterator` are more complicated than they need to be. The `include_min` parameter is not used anywhere, and `tokens_iterator` is pimplified without a good reason. Simplify that. Closes #8805 * github.com:scylladb/scylla: locator: token_metadata: depimplify tokens_iterator locator: token_metadata: remove _ring_pos from tokens_iterator_impl locator: token_metadata: remove tokens_end() locator: token_metadata: remove `include_min` from tokens_iterator_impl locator: token_metadata: remove the `include_min` parameter from `ring_range()`	2021-06-08 15:42:41 +03:00
Michał Chojnowski	3ea97e7a11	locator: token_metadata: depimplify tokens_iterator This class has no meaningful dependencies, so pimpl is unreasonable here.	2021-06-07 10:41:23 +02:00
Michał Chojnowski	baaac5bb7c	locator: token_metadata: remove _ring_pos from tokens_iterator_impl _ring_pos is slightly confusing. I thought at first that it doesn't do anything since operator== doesn't use it. This cosmetic patch tries to improve the readability, and also removes operator!= which is generated automatically in C++20.	2021-06-07 10:41:22 +02:00
Michał Chojnowski	30e5290cea	locator: token_metadata: remove tokens_end() It's an internal method of token_metadata_impl and doesn't have to exist.	2021-06-07 10:41:11 +02:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Michał Chojnowski	81c1a7f7e9	locator: token_metadata: remove `include_min` from tokens_iterator_impl `include_min` is always set to the default value. Get rid of it.	2021-06-05 17:40:35 +02:00
Michał Chojnowski	2a3bd2babe	locator: token_metadata: remove the `include_min` parameter from `ring_range()` `include_min` is always set to the default value. Remove it.	2021-06-05 17:40:35 +02:00
Nadav Har'El	fb0c4e469a	Merge 'token_metadata: Fix get_all_endpoints to return nodes in the ring' from Asias He The get_all_endpoints() should return the nodes that are part of the ring. A node inside _endpoint_to_host_id_map does not guarantee that the node is part of the ring. To fix, return from _token_to_endpoint_map. Fixes #8534 Closes #8536 * github.com:scylladb/scylla: token_metadata: Get rid of get_all_endpoints_count range_streamer: Handle everywhere_topology range_streamer: Adjust use_strict_sources_for_ranges token_metadata: Fix get_all_endpoints to return nodes in the ring	2021-05-11 18:39:10 +03:00
Asias He	5a410cb6e3	token_metadata: Get rid of get_all_endpoints_count It is now only a wrapper for count_normal_token_owners. Refs #8534	2021-05-06 15:36:20 +08:00
Asias He	ddeabba6aa	token_metadata: Fix get_all_endpoints to return nodes in the ring The get_all_endpoints() should return the nodes that are part of the ring. A node inside _endpoint_to_host_id_map does not guarantee that the node is part of the ring. To fix, return from _token_to_endpoint_map. Fixes #8534	2021-05-06 10:02:11 +08:00
Avi Kivity	cea5493cb7	storage_proxy, treewide: introduce names for vectors of inet_address storage_proxy works with vectors of inet_addresses for replica sets and for topology changes (pending endpoints, dead nodes). This patch introduces new names for these (without changing the underlying type - it's still std::vector<gms::inet_address>). This is so that the following patch, that changes those types to utils::small_vector, will be less noisy and highlight the real changes that take place.	2021-05-05 18:36:48 +03:00
Benny Halevy	322aa2f8b5	token_metadata: add clear_gently clear_gently gently clears the token_metadata members. It uses continuations to allow yielding if needed to prevent reactor stalls. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-22 11:22:21 +02:00
Benny Halevy	56aa49ca81	token_metadata: shared_token_metadata: add mutate_token_metadata mutate_token_metadata acquires the shared_token_metadata lock, clones the token_metadata (using clone_async) and calls an asynchronous functor on the cloned copy of the token_metadata to mutate it. If the functor is successful, the mutated clone is set back to the shared_token_metadata, otherwise, the clone is destroyed. With that, get rid of shared_token_metadata::clone Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-22 11:22:19 +02:00
Benny Halevy	e089c22ec1	token_metadata: futurize update_normal_tokens The function complexity is O(#tokens) in the worst case, as for each endpoint token it traverses _token_to_endpoint_map linearly to erase the endpoint mapping if it exists. This change renames the current implementation of update_normal_tokens to update_normal_tokens_sync and clones the code as a coroutine that returns a future and may yield if needed. Eventually we should futurize the whole token_metadata and abstract_replication_strategy interface and get rid of the synchronous functions. Until then the sync version is still required from call sites that are neither returning a future nor run in a seastar thread. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-22 10:35:15 +02:00
Asias He	829b4c1438	repair: Make removenode safe by default Currently removenode works like below: - The coordinator node advertises the node to be removed in REMOVING_TOKEN status in gossip - Existing nodes learn the node in REMOVING_TOKEN status - Existing nodes sync data for the range it owns - Existing nodes send notification to the coordinator - The coordinator node waits for notification and announce the node in REMOVED_TOKEN Current problems: - Existing nodes do not tell the coordinator if the data sync is ok or failed. - The coordinator can not abort the removenode operation in case of error - Failed removenode operation will make the node to be removed in REMOVING_TOKEN forever. - The removenode runs in best effort mode which may cause data consistency issues. It means if a node that owns the range after the removenode operation is down during the operation, the removenode node operation will continue to succeed without requiring that node to perform data syncing. This can cause data consistency issues. For example, Five nodes in the cluster, RF = 3, for a range, n1, n2, n3 is the old replicas, n2 is being removed, after the removenode operation, the new replicas are n1, n5, n3. If n3 is down during the removenode operation, only n1 will be used to sync data with the new owner n5. This will break QUORUM read consistency if n1 happens to miss some writes. Improvements in this patch: - This patch makes the removenode safe by default. We require all nodes in the cluster to participate in the removenode operation and sync data if needed. We fail the removenode operation if any of them is down or fails. If the user want the removenode operation to succeed even if some of the nodes are not available, the user has to explicitly pass a list of nodes that can be skipped for the operation. 
$ nodetool removenode --ignore-dead-nodes <list_of_dead_nodes_to_ignore> <host_id> Example restful api: $ curl -X POST "http://127.0.0.1:10000/storage_service/remove_node/?host_id=7bd303e9-4c7b-4915-84f6-343d0dbd9a49&ignore_nodes=127.0.0.3,127.0.0.5" - The coordinator can abort data sync on existing nodes For example, if one of the nodes fails to sync data. It makes no sense for other nodes to continue to sync data because the whole operation will fail anyway. - The coordinator can decide which nodes to ignore and pass the decision to other nodes Previously, there is no way for the coordinator to tell existing nodes to run in strict mode or best effort mode. Users will have to modify config file or run a restful api cmd on all the nodes to select strict or best effort mode. With this patch, the cluster wide configuration is eliminated. Fixes #7359 Closes #7626	2020-12-10 10:14:39 +02:00
Piotr Jastrzebski	87bf577450	token_metadata: Remove std::iterator from tokens_iterator_impl std::iterator is deprecated since C++17 so define all the required iterator_traits directly. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-17 16:53:20 +01:00
Benny Halevy	275fe30628	token_metadata_impl: set_pending_ranges: add can_yield_param To prevent a > 10 ms stall when inserting to boost::icl::interval_map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	ba31350239	abstract_replication_strategy: add can_yield param to get_pending_ranges and friends To prevent reactor stalls as seen in #7313. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	7fb489d338	token_metadata_impl: calculate_pending_ranges_for_* reindent Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	6ce2436a4c	token_metadata_impl: calculate_pending_ranges_for_* pass new_pending_ranges by ref We can use the seastar thread to keep the vector rather than creating a lw_shared_ptr for it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	0ca423dcfc	token_metadata_impl: calculate_pending_ranges_for_* call in thread The functions can be simplified as they are all now being called from a seastar thread. Make them sequential, returning void, and yielding if necessary. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	84d086dc77	token_metadata: update_pending_ranges: create seastar thread So we can yield in this path to prevent reactor stalls. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	1e6c181678	abstract_replication_strategy: add get_address_ranges method for specific endpoint Some of the callers of get_address_ranges are interested in the ranges of a specific endpoint. Rather than building a map for all endpoints and then traversing it looking for this specific endpoint, build a multimap of token ranges relating only to the specified endpoint. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	2ce6773dae	token_metadata_impl: clone_after_all_left: sort tokens only once Currently the sorted tokens are copied needlessly on this path by `clone_only_token_map` and then recalculated after calling remove_endpoint for each leaving endpoint. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	0abd8e62cd	token_metadata: futurize clone_after_all_left Call the futurized clone_only_token_map and remove the _leaving_endpoints from the cloned token_metadata_impl. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	4a622c14e1	token_metadata: futurize clone_only_token_map Does part of clone_async() using continuations to prevent stalls. Rename synchronous variant to clone_only_token_map_sync that is going to be deprecated once all its users will be futurized. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:24 +02:00
Benny Halevy	d1a73ec7b3	token_metadata: use mutable_token_metadata_ptr in calculate_pending_ranges_for_* Replacing old code using lw_shared_ptr<token_metadata> with the "modern" mutable_token_metadata_ptr alias. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:23 +02:00
Benny Halevy	4fc5997949	token_metadata: add clone_async Clone token_metadata object using async continuation to prevent reactor stalls. Refs https://github.com/scylladb/scylla/issues/7220 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:23 +02:00
Benny Halevy	fa880439c9	storage_service: use token_metadata_lock to serialize updates to token_metadata Rather than using `serialized_action`, grab a lock before mutating _token_metadata and hold it until its replicated to all shards. A following patch will use a mutable token_metadata_ptr that is updated out of line under the lock. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:23 +02:00
Benny Halevy	5a250f529f	token_metadata: get rid of unused calculate_pending_ranges_for_* methods They are only called internally by token_metadata_impl. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-09-30 23:16:23 +03:00
Benny Halevy	41e5a3a245	token_metadata: get rid of clone_after_all_settled It's unused. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-09-30 23:15:11 +03:00
Benny Halevy	105a2f5244	token_metadata_impl: remove_endpoint: do not sort tokens Call sort_tokens at the caller as all call sites from within token_metadata_impl call remove_endpoint for multiple endpoints so the tokens can be re-sorted only once, when done removing all tokens. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-09-30 23:12:32 +03:00
Benny Halevy	86303f4fdd	token_metadata_impl: always sort_tokens in place No need to return the sorted tokens vector as it's always assigned to _sorted_tokens. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-09-30 23:08:56 +03:00

134 Commits