Rather than allowing both
host_id and endpoint to be kept, keep only one of them
and provide resolve functions that use the
token_metadata to resolve the host_id into
an inet_address or vice versa.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
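The scheme above could be sketched roughly as follows. This is a standalone, simplified sketch; host_id, inet_address, token_metadata and host_id_or_endpoint here are hypothetical stand-ins, not the real Scylla classes:

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <variant>

// Hypothetical, simplified stand-ins for the real Scylla types.
struct host_id { std::string v; };
struct inet_address { std::string v; };

struct token_metadata {
    std::unordered_map<std::string, std::string> id_to_ip;

    inet_address resolve(const host_id& id) const {
        auto it = id_to_ip.find(id.v);
        if (it == id_to_ip.end()) {
            throw std::runtime_error("unknown host_id: " + id.v);
        }
        return inet_address{it->second};
    }
};

// Keep only one of host_id/endpoint; the other one is resolved
// on demand through the token_metadata.
struct host_id_or_endpoint {
    std::variant<host_id, inet_address> value;

    inet_address resolve_endpoint(const token_metadata& tm) const {
        if (auto* ep = std::get_if<inet_address>(&value)) {
            return *ep;
        }
        return tm.resolve(std::get<host_id>(value));
    }
};
```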
When topology barrier is blocked for longer than configured threshold
(2s), stale versions are marked as stalled and when they get released
they report backtrace to the logs. This should help to identify what
was holding for token metadata pointer for too long.
Example log:
token_metadata - topology version 30 held for 299.159 [s] past expiry, released at: 0x2397ae1 0x23a36b6 ...
Closes scylladb/scylladb#17427
range.hh was deprecated in bd794629f9 (2020) since its names
conflict with the C++ library concept of an iterator range. The name
::range also mapped to the dangerous wrapping_interval rather than
nonwrapping_interval.
Complete the deprecation by removing range.hh and replacing all the
aliases by the names they point to from the interval library. Note
this now exposes uses of wrapping intervals as they are now explicit.
The unit tests are renamed and range.hh is deleted.
Closes scylladb/scylladb#17428
Before this change, we used the format string
"Can't replace node {} with itself" but failed to include the host id in seastar::format()'s arguments. This fails the compile-time check of {fmt}, which is not yet merged. So if we actually ran into this problem, {fmt} would throw before the intended runtime_error was raised -- currently seastar::log formats the logging messages at runtime, which is not intended.
In this change, we pass `existing_node`, so it can be formatted, and the
intended error message can be printed in the log.
Refs 11a4908683
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#16422
Make host_id parameter non-optional and
move it to the beginning of the arguments list.
Delete unused overloads of add_or_update_endpoint.
Delete unused overload of token_metadata::update_topology
with inet_address argument.
This used to work before in the replace-with-same-IP scenario, but
with host_id-s it's no longer relevant.
base_token_metadata has been removed from topology_change_info
because the conditions needed for its creation
are no longer met.
In this commit we replace token_metadata with token_metadata2
in the erm interface and field types. To accommodate the change
some of strategy-related methods are also updated.
All the boost and topology tests pass with this change.
The token_metadata::get_normal_and_bootstrapping_token_to_endpoint_map
method was used only here. It's inlined in this
commit since it's too specific and incurs the overhead
of creating an intermediate map.
This commit fixes an inconsistency in method names:
for one direction there are both get_host_id
(on_internal_error on a miss) and get_host_id_if_known
(returns null), but there was only one method for the
opposite conversion, get_endpoint_for_host_id,
and it returns null. In this commit we change it to
on_internal_error if it can't find the argument and add
another method, get_endpoint_for_host_id_if_known,
which returns null in this case.
We can't use get_endpoint_for_host_id/get_host_id
in host_id_or_endpoint::resolve since it's called
from storage_service::parse_node_list
-> token_metadata::parse_host_id_and_endpoint,
and exceptions are caught and handled in
`storage_service::parse_node_list`.
It's a bug to use get_host_id on a non-existent endpoint,
so on_internal_error is more appropriate. Also, it's
easier to debug since it provides a backtrace.
If a missing inet_address is expected, get_host_id_if_known
should be used instead. We update one such case in
storage_service::force_remove_completion. Other
usages of get_host_id are correct.
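The naming convention described above can be sketched in a standalone, simplified form. Everything here is a hypothetical stand-in for the real code, with an exception standing in for on_internal_error:

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Simplified stand-ins for the real types.
using host_id = std::string;
using inet_address = std::string;

struct token_metadata {
    std::unordered_map<host_id, inet_address> _by_id;

    // The _if_known variant: a miss is expected, return an empty optional.
    std::optional<inet_address> get_endpoint_for_host_id_if_known(const host_id& id) const {
        auto it = _by_id.find(id);
        if (it == _by_id.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // The strict variant: a miss is a bug, treat it as an internal error.
    inet_address get_endpoint_for_host_id(const host_id& id) const {
        auto ep = get_endpoint_for_host_id_if_known(id);
        if (!ep) {
            // stand-in for on_internal_error(logger, ...)
            throw std::runtime_error("host_id not found: " + id);
        }
        return *ep;
    }
};
```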
In this commit we enhance token_metadata with a pointer to the
new host_id-based generic_token_metadata specialisation (token_metadata2).
The idea is that in the following commits we'll go over all token_metadata
modifications and make the corresponding modifications to its new
host_id-based alternative.
The pointer to token_metadata2 is stored in the
generic_token_metadata::_new_value field. The pointer can be
mutable, immutable, or absent altogether (std::monostate).
It's mutable if this generic_token_metadata owns it, meaning
it was created using the generic_token_metadata(config cfg)
constructor. It's immutable if the
generic_token_metadata(lw_shared_ptr<const token_metadata2> new_value);
constructor was used. This means this old token_metadata is a wrapper for
new token_metadata and we can only use the get_new() method on it. The field
_new_value is empty for the new host_id-based token_metadata version.
The generic_token_metadata(std::unique_ptr<token_metadata_impl<NodeId>> impl, token_metadata2 new_value);
constructor is used for clone methods. We clone both versions,
and we need to pass a cloned token_metadata2 into constructor.
There are two overloads of get_new, for mutable and immutable
generic_token_metadata. Both of them throw an exception if
they can't get the appropriate pointer. There is also a
get_new_strong method, which returns an immutable owning
pointer. This is convenient since a lot of APIs want an
owning pointer. We can't make the get_new/get_new_strong API
simpler and use get_new_strong everywhere, since it mutates the
original generic_token_metadata by incrementing the reference
counter, and this causes races when it's passed between
shards in replicate_to_all_cores.
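The three states of _new_value and the get_new/get_new_strong accessors could look roughly like this. A much-simplified hypothetical sketch; token_metadata2, config, and the constructors are stand-ins, not the real API:

```cpp
#include <memory>
#include <stdexcept>
#include <variant>

struct token_metadata2 { int version = 0; };  // stand-in for the real type

class generic_token_metadata {
    std::variant<std::monostate,                          // absent: this is the new version itself
                 std::shared_ptr<token_metadata2>,        // mutable: we own it
                 std::shared_ptr<const token_metadata2>>  // immutable: wrapper for a new version
        _new_value;
public:
    struct config {};
    generic_token_metadata() = default;  // new host_id-based version: no _new_value
    explicit generic_token_metadata(config)
        : _new_value(std::make_shared<token_metadata2>()) {}  // owner: mutable
    explicit generic_token_metadata(std::shared_ptr<const token_metadata2> v)
        : _new_value(std::move(v)) {}                         // wrapper: immutable

    token_metadata2& get_new() {
        if (auto p = std::get_if<std::shared_ptr<token_metadata2>>(&_new_value)) {
            return **p;
        }
        throw std::runtime_error("no mutable new version");
    }

    const token_metadata2& get_new() const {
        if (auto p = std::get_if<std::shared_ptr<const token_metadata2>>(&_new_value)) {
            return **p;
        }
        if (auto p = std::get_if<std::shared_ptr<token_metadata2>>(&_new_value)) {
            return **p;
        }
        throw std::runtime_error("no new version attached");
    }

    // Owning immutable pointer; note this copies the shared_ptr,
    // i.e. bumps the reference counter.
    std::shared_ptr<const token_metadata2> get_new_strong() const {
        if (auto p = std::get_if<std::shared_ptr<const token_metadata2>>(&_new_value)) {
            return *p;
        }
        if (auto p = std::get_if<std::shared_ptr<token_metadata2>>(&_new_value)) {
            return *p;
        }
        throw std::runtime_error("no new version attached");
    }
};
```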
NodeId is used in all internal token_metadata data structures that
previously used inet_address. We choose topology::key_kind based
on the value of the template parameter.
A generic_token_metadata::update_topology overload with a host_id
parameter is added to make update_topology_change_info work;
it now uses NodeId as a parameter type.
topology::remove_endpoint(host_id) is added to make
generic_token_metadata::remove_endpoint(NodeId) work.
pending_endpoints_for and endpoints_for_reading are just removed - they
are not used and not implemented. The declarations were left by mistake
from a refactoring in which these methods were moved to erm.
generic_token_metadata_base is extracted to contain declarations, common
to both token_metadata versions.
Templates are explicitly instantiated inside token_metadata.cc, since
the implementation part is also a template and is not exposed in the header.
There are no other behavioral changes in this commit, just syntax
fixes to make token_metadata a template.
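Choosing topology's key kind from the NodeId template parameter could be sketched like this. A hypothetical standalone sketch; the types and the node_key_kind helper are simplified stand-ins:

```cpp
#include <string>
#include <type_traits>

// Simplified stand-ins for the real types.
struct inet_address { std::string v; };
struct host_id { std::string v; };

enum class key_kind { inet_address, host_id };

template <typename NodeId>
struct generic_token_metadata {
    // Pick the key kind from the template parameter at compile time.
    static constexpr key_kind node_key_kind() {
        if constexpr (std::is_same_v<NodeId, host_id>) {
            return key_kind::host_id;
        } else {
            static_assert(std::is_same_v<NodeId, inet_address>);
            return key_kind::inet_address;
        }
    }
};
```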
In the next commits token_metadata will be
made a template with NodeId=inet_address|host_id
parameter. This parameter will be passed to dc_rack_fn
function, so it also should be made a template.
For the host_id-based token_metadata we want host_id
to be the main node key, meaning it should be used
in add_or_update_endpoint to find the node to update.
For the inet_address-based token_metadata version
we want to retain the old behaviour during transition period.
In this commit we introduce key_kind parameter and use
key_kind::inet_address in all current topology usages.
Later we'll use key_kind::host_id for the new token_metadata.
In the last commits of the series, when the new token_metadata
version is used everywhere, we will remove key_kind enum.
In subsequent commits we'll need the following API for token_metadata:
token_metadata(token_metadata2_ptr);
get_new() -> token_metadata2*
where token_metadata2 is the new version of token_metadata,
based on host_id.
In other words:
* token_metadata knows the new version of itself and returns a pointer
to it through get_new()
* token_metadata can be constructed based solely on the new version,
without its own implementation. In this case the only method we can
use on it is get_new.
This allows passing token_metadata2 to APIs that take token_metadata in their
method signature, if these APIs are known to only use the get_new method
on the passed token_metadata.
And back to topology_change_info: if we got it from the new token_metadata,
we want to be able to construct token_metadata from the token_metadata2 contained
in it, and this requires it to be a pointer, not a value.
It's better to pass a disengaged optional when
the caller doesn't have the information, rather than
passing the default dc_rack location, so that the latter
will never implicitly override a known endpoint's dc/rack location.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #15300
It is too early to require that all nodes in normal state
have a non-null host_id.
The assertion was added in 44c14f3e2b
but unfortunately there are several call sites where
we add the node as normal, but without a host_id
and we patch it in later on.
In the future we should be able to require that
once we identify nodes by host_id over gossiper
and in token_metadata.
Fixes scylladb/scylladb#15181
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #15184
And verify that the returned host_id isn't null.
Call on_internal_error_noexcept in that case
since all token owners are expected to have their
host_id set. Aborting in testing would help fix
issues in this area.
Fixes scylladb/scylladb#14843
Refs scylladb/scylladb#14793
Closes #14844
* github.com:scylladb/scylladb:
api: storage_service: improve description of /storage_service/host_id
token_metadata: get_endpoint_to_host_id_map_for_reading: restrict to token owners
In this PR a simple test for fencing is added. It exercises the data
plane, meaning if it somehow happens that the node has a stale topology
version, then requests from this node will get an error 'stale
topology'. The test just decrements the node version manually through
CQL, so it's quite artificial. To test a more real-world scenario we
need to allow the topology change fiber to sometimes skip unavailable
nodes. Now the algorithm fails and retries indefinitely in this case.
The PR also adds some logs, and removes one seemingly redundant topology
version increment, see the commit messages for details.
Closes #14901
* github.com:scylladb/scylladb:
test_fencing: add test_fence_hints
test.py: output the skipped tests
test.py: add skip_mode decorator and fixture
test.py: add mode fixture
hints: add debug log for dropped hints
hints: send_one_hint: extend the scope of file_send_gate holder
pylib: add ScyllaMetrics
hints manager: add send_errors counter
token_metadata: add debug logs
fencing: add simple data plane test
random_tables.py: add counter column type
raft topology: don't increment version when transitioning to node_state::normal
We log the new version when the new token
metadata is set.
Also, the log for fence_version is moved
to shared_token_metadata from storage_service
for uniformity.
And verify that the returned host_id isn't null.
Call on_internal_error_noexcept in that case
since all token owners are expected to have their
host_id set. Aborting in testing would help fix
issues in this area.
Fixes scylladb/scylladb#14843
Refs scylladb/scylladb#14793
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The motivation is that token_metadata::get_my_id() is not available
early in the bootstrap process, as raft topology is pulled later
than new tables are registered and created, and this node is added
to topology even later.
To allow creation of compaction groups to retrieve "my id" from
token metadata early, initialization will now feed local id
into topology config which is immutable for each node anyway.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
on_internal_error is wrong for a fence_version
condition violation: when the topology change
coordinator migrates to another node, we can have
the raft_topology_cmd::command::fence command from
the old coordinator running in parallel with the
fence command (or the topology-version-upgrading
raft command) from the new one.
The comment near the raft_topology_cmd::command::fence
handling describes this situation, assuming an exception
is thrown in this case.
It's stored outside of the topology table,
since it's updated not through RAFT, but
with a new 'fence' raft command.
The current value is cached in shared_token_metadata.
An initial fence version is loaded in main
during storage_service initialisation.
We use utils::phased_barrier. The new phase
is started each time the version is updated.
We track all instances of token_metadata,
when an instance is destroyed the
corresponding phased_barrier::operation is
released.
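The tracking described above could be sketched as follows. A hypothetical standalone sketch of the phased-barrier idea, without Seastar; phase, operation and version_barrier are illustrative names, not the real utils::phased_barrier API:

```cpp
#include <memory>

// Each token_metadata instance holds an "operation" on the phase that was
// current when it was created; a phase is fully released once all of its
// operations have been destroyed.
struct phase {
    int live_operations = 0;
};

class operation {
    std::shared_ptr<phase> _p;
public:
    explicit operation(std::shared_ptr<phase> p) : _p(std::move(p)) {
        ++_p->live_operations;
    }
    ~operation() { --_p->live_operations; }
    operation(const operation&) = delete;
    operation& operator=(const operation&) = delete;
};

class version_barrier {
    std::shared_ptr<phase> _current = std::make_shared<phase>();
public:
    // Called when a new token_metadata instance is created.
    operation start_operation() { return operation(_current); }

    // Starts a new phase when the version is updated; the returned old
    // phase can be polled until all of its operations are released.
    std::shared_ptr<phase> advance() {
        auto old = _current;
        _current = std::make_shared<phase>();
        return old;
    }
};
```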
It's stored as a static column in the topology table,
and will be updated at various steps of the topology
change state machine.
The initial value is 1; zero means that topology
versions are not yet supported, which will be
used in RPC handling.
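A hypothetical sketch of how the RPC side might use the zero sentinel; check_fence and its exact semantics are illustrative, not the real handler:

```cpp
#include <cstdint>
#include <stdexcept>

// Version 0 means the sender does not support topology versions yet,
// so the fence check is skipped entirely.
void check_fence(uint64_t request_version, uint64_t local_fence_version) {
    if (request_version == 0) {
        return;  // topology versions not yet supported by the sender
    }
    if (request_version < local_fence_version) {
        throw std::runtime_error("stale topology");
    }
}
```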
The function storage_service::update_pending_ranges is
turned into update_topology_changes_info.
The pending_endpoints and read_endpoints will be
computed later, when the erms are rebuilt.
We already use the new pending_endpoints from erm through
the get_pending_ranges virtual function; in this commit
we update all the remaining places to use the new
implementation in erm, as well as remove the old implementation
in token_metadata.
We plan to move pending_endpoints and read_endpoints, along
with their computation logic, from token_metadata to
vnode_effective_replication_map. The vnode_effective_replication_map
seems more appropriate for them since it contains functionally
similar _replication_map and we will be able to reuse
pending_endpoints/read_endpoints across keyspaces
sharing the same factory_key.
At present, pending_endpoints and read_endpoints are updated in the
update_pending_ranges function. The update logic comprises two
parts - preparing data common to all keyspaces/replication_strategies,
and calculating the migration_info for specific keyspaces. In this commit,
we introduce a new topology_change_info structure to hold the first
part's data and create an update_topology_change_info function to
update it. This structure will later be used in
vnode_effective_replication_map to compute pending_endpoints
and read_endpoints. This enables the reuse of topology_change_info
across all keyspaces, unlike the current update_pending_ranges
implementation, which is another benefit of this refactoring.
The update_topology_change_info implementation is mostly derived from
update_pending_ranges, there are a few differences though:
* replacing async and thread with plain co_awaits;
* adding a utils::clear_gently call for the previous value
to mitigate reactor stalls if target_token_metadata grows large;
* substituting immediately invoked lambdas with simple variables and
blocks to reduce noise, as lambdas would need to be converted into coroutines.
The original update_pending_ranges remains unchanged, and will be
removed entirely upon transitioning to the new implementation.
Meanwhile, we add an update_topology_change_info call to
storage_service::update_pending_ranges so that we can
iteratively switch the system to the new implementation.
In this patch we add the
token_metadata::set_topology_transition_state method.
If the current state is
write_both_read_new, update_pending_ranges
will compute new ranges for read requests. The default value
of topology_transition_state is null, meaning no read
ranges are computed. We will add the appropriate
set_topology_transition_state calls later.
Also, we add endpoints_for_reading method to get
read endpoints based on the computed ranges.
We are going to add a function in token_metadata to get read endpoints,
similar to pending_endpoints_for. So in this commit we extract
the maybe_migration_endpoints helper function, which will be
used in both cases.
We are going to store read_endpoints in a way similar
to pending ranges, so in this commit we add
migration_info - a container for two
boost::icl::interval_map.
Also, _pending_ranges_interval_map is renamed to
_keyspace_to_migration_info, since it captures
the meaning better.
Now update_pending_ranges is quite complex, mainly
because it tries to act efficiently and update only
the affected intervals. However, it uses the function
abstract_replication_strategy::get_ranges, which calls
calculate_natural_endpoints for every token
in the ring anyway.
Our goal is to start reading from the new replicas for
ranges in write_both_read_new state. In the current
code structure this is quite difficult to do, so
in this commit we first simplify update_pending_ranges.
The main idea of the refactoring is to build a new version
of token_metadata based on all planned changes
(join, bootstrap, replace) and then for each token
range compare the result of calculate_natural_endpoints on
the old token_metadata and on the new one.
Those endpoints that are in the new version and
are not in the old version should be added to the pending_ranges.
The add_mapping function is extracted for the
future - we are going to use it to handle read mappings.
Special care is taken when replacing with the same IP.
The coordinator employs the
get_natural_endpoints_without_node_being_replaced function,
which excludes such endpoints from its result. If we compare
the new (merged) and current token_metadata configurations, such
endpoints will also be absent from pending_endpoints since
they exist in both. To address this, we copy the current
token_metadata and remove these endpoints prior to comparison.
This ensures that nodes being replaced are treated
like those being deleted.
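The core comparison described above can be sketched as a per-range set difference. A standalone hypothetical sketch; pending_for_range and the endpoint type are illustrative, and the inputs stand for the results of calculate_natural_endpoints on the old and new (merged) token_metadata:

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

using endpoint = std::string;  // stand-in for the real endpoint type

// Endpoints present in the natural endpoints computed on the new (merged)
// token_metadata but absent from the old one become pending for that range.
std::set<endpoint> pending_for_range(const std::set<endpoint>& old_natural,
                                     const std::set<endpoint>& new_natural) {
    std::set<endpoint> pending;
    std::set_difference(new_natural.begin(), new_natural.end(),
                        old_natural.begin(), old_natural.end(),
                        std::inserter(pending, pending.begin()));
    return pending;
}
```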
token_metadata takes token_metadata_impl as unique_ptr,
so it makes sense to create it that way in the first place
to avoid unnecessary moves.
token_metadata_impl constructor with shallow_copy parameter
was made public for std::make_unique. The effective
accessibility of this constructor hasn't changed though since
shallow_copy remains private.
Currently, scans are splitting partition ranges around tokens. This
will have to change with tablets, where we should split at tablet
boundaries.
This patch introduces token_range_splitter which abstracts this
task. It is provided by effective_replication_map implementation.