scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 04:06:59 +00:00

Author	SHA1	Message	Date
Kefu Chai	cb781c0ff7	gms: add formatter for gms::versioned_value before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `gms::versioned_value`. its operator<< is preserved, as it's still being used by the homebrew generic formatter for std::unordered_map<gms::application_state, gms::versioned_value>, which is in turn used in gms/gossiper.cc. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17366	2024-02-18 19:21:54 +02:00
Kefu Chai	4812a57f71	gms: add formatter for gms::gossip_* before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for - gms::gossip_digest - gms::gossip_digest_ack - gms::gossip_digest_syn and drop their operator<< overloads. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17379	2024-02-18 19:21:53 +02:00
Kamil Braun	2e81f045cc	Merge 'transport: controller: do_start_server: do not set_cql_read for maintenance port' from Benny Halevy RPC is not ready yet at this point, so we should not set this application state yet. Also, simplify add_local_application_state as it contains dead code that will never generate an internal error after `1d07a596bf`. Fixes #16932 Closes scylladb/scylladb#17263 * github.com:scylladb/scylladb: gossiper: add_local_application_state: drop internal error transport: controller: do_start_server: do not set_cql_read for maintenance port	2024-02-12 13:26:45 +01:00
Benny Halevy	2ed29e31db	gms: inet_address: make constructors explicit In particular, `inet_address(const sstring& addr)` is dangerous, since a function like `topology::get_datacenter(inet_address ep)` might accidentally convert a `sstring` argument into an `inet_address` (which would most likely throw an obscure std::invalid_argument if the datacenter name does not look like an inet_address). Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#17260	2024-02-11 15:44:13 +02:00
Benny Halevy	f86a5072d6	gossiper: add_local_application_state: drop internal error After `1d07a596bf` that dropped before_change notifications there is no sense in getting the local endpoint_state_ptr twice, before and after the notifications, and calling on_internal_error if the state isn't found after the notifications. Just throw the runtime_error if the endpoint state is not found, otherwise, use it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-02-11 13:33:26 +02:00
Kamil Braun	e9e24f47ec	Merge 'raft topology: implement upgrade and recovery procedure' from Piotr Dulikowski This PR implements a procedure that upgrades existing clusters to use raft-based topology operations. The procedure does not start automatically, it must be triggered manually by the administrator after making sure that no topology operations are currently running. Upgrade is triggered by sending `POST /storage_service/raft_topology/upgrade` request. This causes the topology coordinator to start, which drives the rest of the process: it builds the `system.topology` state based on information observed in gossip and tells all nodes to switch to raft mode. Then, topology coordinator runs normally. Upgrade progress is tracked in a new static column `upgrade_state` in `system.topology`. The procedure also serves as an extension to the current recovery procedure on raft. The current recovery procedure requires restarting nodes in a special mode which disables raft, performing `nodetool removenode` on the dead nodes, cleaning up some state on the nodes and restarting them so that they automatically rebuild the group 0. Raft topology fits into existing procedure by falling back to legacy topology operations after disabling raft. After rebuilding the group 0, upgrade needs to be triggered again. Because upgrade is manual and it might not be convenient for administrators to run it right after upgrading the cluster, we allow the cluster to operate in legacy topology operations mode until upgrade, which includes allowing new nodes to join. In order to allow it, nodes now ask the cluster about the mode they should use to join before proceeding by using a new `JOIN_NODE_QUERY` RPC. The procedure is explained in more detail in `topology-over-raft.md`. 
Fixes: https://github.com/scylladb/scylladb/issues/15008 Closes scylladb/scylladb#17077 * github.com:scylladb/scylladb: test/topology_custom: upgrade/recovery tests for topology on raft cdc/generation_service: in legacy mode, fall back to raft tables system_keyspace: add read_cdc_generation_opt cdc/generation_service: turn off gossip notifications in raft topo mode cql_test_env: move raft_topology_change_enabled var earlier group0_state_machine: pull snapshot after raft topology feature enabled storage_service: disable persistent feature enabler on upgrade storage_service: replicate raft features to system.peers storage_service: gossip tokens and cdc generation in raft topology mode API: add api for triggering and monitoring topology-on-raft upgrade storage_service: infer which topology operations to use on startup storage_service: set the topology kind value based on group 0 state raft_group0: expose link to the upgrade doc in the header feature_service: fall back to checking legacy features on startup storage_service: add fiber for tracking the topology upgrade progress gms: feature_service: add SUPPORTS_CONSISTENT_TOPOLOGY_CHANGES topology_coordinator: implement core upgrade logic topology_coordinator: extract top-level error handling logic storage_service: initialize discovery leader's state earlier topology_coordinator: allow for custom sharding info in prepare_and_broadcast_cdc_generation_data topology_coordinator: allow for custom sharding info in prepare_new_cdc_generation_data topology_coordinator: remove outdated fixme in prepare_new_cdc_generation_data topology_state_machine: introduce upgrade_state storage_service: disallow topology ops when upgrade is in progress raft_group0_client: add in_recovery method storage_service: introduce join_node_query verb raft_group0: make discover_group0 public raft_group0: filter current node's IP in discover_group0 raft_group0: remove my_id arg from discover_group0 storage_service: make _raft_topology_change_enabled 
more advanced docs: document raft topology upgrade and recovery	2024-02-09 11:54:53 +01:00
Piotr Dulikowski	53932420f8	storage_service: disable persistent feature enabler on upgrade When starting in legacy mode, a gossip event listener called persistent feature enabler is registered. This listener marks a feature as enabled when it notices, in gossip, that all nodes declare support for the feature. With raft-based topology, features are managed in group 0 instead and do not rely on the persistent feature enabler at all. Make the listener look at the raft_topology_change_enabled() method and prevent it from enabling more features after that method starts returning true.	2024-02-08 19:12:28 +01:00
Piotr Dulikowski	3513a07d8a	feature_service: fall back to checking legacy features on startup When checking features on startup (i.e. whether support for any feature was revoked in an unsafe way), it might happen that upgrade to raft topology didn't finish yet. In that case, instead of loading an empty set of features - which supposedly represents the set of features that were enabled until last boot - we should fall back to loading the set from the legacy `enabled_features` key in `system.scylla_local`.	2024-02-08 19:12:28 +01:00
Piotr Dulikowski	2ecb8641b1	gms: feature_service: add SUPPORTS_CONSISTENT_TOPOLOGY_CHANGES All nodes being capable of support for raft topology is a prerequisite for starting upgrade to raft topology. The newly introduced feature will track this prerequisite.	2024-02-08 19:12:28 +01:00
Kefu Chai	6eae678eb3	db: add formatter for gms::gossip_digest_ack2 before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `gms::gossip_digest_ack2`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17153	2024-02-08 11:49:37 +02:00
Pavel Emelyanov	66b859a29f	gms: Remove unused operator<< for feature object Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#17109	2024-02-01 19:00:46 +02:00
Kefu Chai	005d231f96	db: add formatter for gms::application_state before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `gms::application_state`, but its operator<< is preserved, as it is still used by the generic homebrew formatter for `std::unordered_map<>`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17096	2024-02-01 10:02:25 +02:00
Botond Dénes	5f44ae8371	Merge 'Add more logging for `gossiper::lock_endpoint` and `storage_service::handle_state_normal`' from Kamil Braun In a longevity test reported in scylladb/scylladb#16668 we observed that NORMAL state is not being properly handled for a node that replaced another node. Either handle_state_normal is not being called, or it is but getting stuck in the middle. Which is the case couldn't be determined from the logs, and attempts at creating a local reproducer failed. Thus the plan is to continue debugging using the longevity test, but we need more logs. To check whether `handle_state_normal` was called and which branches were taken, include some INFO level logs there. Also, detect deadlocks inside `gossiper::lock_endpoint` by reporting an error message if `lock_endpoint` waits for the lock for too long. Ref: scylladb/scylladb#16668 Closes scylladb/scylladb#16733 * github.com:scylladb/scylladb: gossiper: report error when waiting too long for endpoint lock gossiper: store source_location instead of string in endpoint_permit storage_service: more verbose logging in handle_state_normal	2024-01-12 10:51:21 +02:00
Kamil Braun	cf646022cb	gossiper: report error when waiting too long for endpoint lock In a longevity test reported in scylladb/scylladb#16668 we observed that NORMAL state is not being properly handled for a node that replaced another node. Either handle_state_normal is not being called, or it is but getting stuck in the middle. Which is the case couldn't be determined from the logs, and attempts at creating a local reproducer failed. One hypothesis is that `gossiper` is stuck on `lock_endpoint`. We dealt with gossiper deadlocks in the past (e.g. scylladb/scylladb#7127). Modify the code so it reports an error if `lock_endpoint` waits for the lock for more than a minute. When the issue reproduces again in longevity, we will see if `lock_endpoint` got stuck.	2024-01-11 17:29:25 +01:00
Kefu Chai	7abd263ee6	db/config.cc: do not respect sstable_format option "me" sstable format includes an important feature of storing the `host_id` of the local node when writing sstables. This is crucial for validating the sstable's `replay_position` in stats metadata as it is valid only on the originating node and shard (#10080), therefore we would like to make the `me` format mandatory. before making `me` mandatory, we need to stop handling `sstable_format` option if it is "md". in this change - gms/feature_service: do not disable `ME_SSTABLE_FORMAT` even if `sstable_format` is configured with "md". and in that case, instead, a warning is printed in the logging message to note that this setting is not valid anymore. - docs/architecture/sstable: note that "me" is used by default now. after this change, "sstable_format" will only accept "me" if it's explicitly configured. and when a server with this change joins a cluster, it uses "md" if any of the nodes in the cluster still has `sstable_format`. practically, this change makes "me" mandatory in a 6.x cluster, assuming this change will be included in 6.x releases. Fixes #16551 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-01-11 22:43:05 +08:00
Kefu Chai	bece3eff0c	feature_service: abort if sstable_format < md sstable_format comes from scylla.yaml or from the command line arguments, and we gate scylla from unallowed sstable formats lower than `md` when parsing the configuration, and scylla bails out at seeing the unallowed sstable format like: ``` terminate called after throwing an instance of 'std::invalid_argument' what(): Invalid value for sstable_format: got ka which is not inside the set of allowed values md, me Aborted (core dumped) ``` scylla errors out way before `feature_config_from_db_config()` gets called -- it throws in `bpo::notify(configuration)`, way before `func` is evaluated in `app_template::run_deprecated()`. so, in this change, we do not handle these values anymore, and consider it a bug if we run into any of them. Refs #16551 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-01-11 22:43:05 +08:00
Kamil Braun	6e39c2ffde	gossiper: store source_location instead of string in endpoint_permit The original code extracted only the function_name from the source_location for logging. We'll use more information from the source_location in later commits.	2024-01-10 17:02:52 +01:00
Kefu Chai	f61f6c27e3	gms: add formatter for gms::endpoint_state before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for gms::endpoint_state, and update the callers of `operator<<` to use `fmt::print()`. but we cannot drop `operator<<` yet, as we are still using the templated operator<< and templated fmt::formatter to print containers in scylla and in seastar -- they are still using `operator<<` under the hood. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16705	2024-01-10 09:16:23 +02:00
Kefu Chai	b91eb89ffa	gms: heart_beat_state: add formatter for gms::heart_beat_state before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for gms::heart_beat_state, and remove its operator<<(). the only caller site of its operator<< is updated to use `fmt::print()` Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16652	2024-01-09 11:52:40 +02:00
Kefu Chai	cca786e847	gms: endpoint_state: fix a typo in comment Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16653	2024-01-09 11:51:49 +02:00
Kefu Chai	7e84e03f52	gms: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. because the removal of `#include "unimplemented.hh"`, `service/migration_manager.cc` misses the definition of `unimplemented::cause::VALIDATION`, so include the header where it is used. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16654	2024-01-05 13:37:08 +02:00
Kamil Braun	f942bf4a1f	Merge 'Do not update endpoint state via gossiper::add_saved_endpoint once it was updated via gossip' from Benny Halevy Currently, `add_saved_endpoint` is called from two paths: One is when loading states from system.peers in the join path (join_cluster, join_token_ring), when `_raft_topology_change_enabled` is false, and the other is from `storage_service::topology_state_load` when raft topology changes are enabled. In the latter path, from `topology_state_load`, `add_saved_endpoint` is called only if the endpoint_state does not exist yet. However, this is checked without acquiring the endpoint_lock and so it races with the gossiper, and once `add_saved_endpoint` acquires the lock, the endpoint state may already be populated. Since `add_saved_endpoint` applies local information about the endpoint state (e.g. tokens, dc, rack), it uses the local heart_beat_version, with generation=0 to update the endpoint states, and that is incompatible with changes applied via gossip that will carry the endpoint's generation and version, determining the state's update order. This change makes sure that the endpoint state is never updated in `add_saved_endpoint` if it has non-zero generation. An internal error exception is thrown if non-zero generation is found, and in the only call site that might reach that state, in `storage_service::topology_state_load`, the caller acquires the endpoint_lock for checking for the existence of the endpoint_state, calling `add_saved_endpoint` under the lock only if the endpoint_state does not exist. Fixes #16429 Closes scylladb/scylladb#16432 * github.com:scylladb/scylladb: gossiper: add_saved_endpoint: keep heart_beat_state if ep_state is found storage_service: topology_state_load: lock endpoint for add_saved_endpoint raft_group_registry: move on_alive error injection to gossiper	2024-01-04 14:47:10 +01:00
Benny Halevy	9e8998109f	gossiper: get_*_members_synchronized: acquire endpoint update semaphore To ensure that the value they return is synchronized on all shards. This got broken recently by `147f30caff`. Refs https://github.com/scylladb/scylladb/pull/16597#discussion_r1440445432 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#16629	2024-01-03 17:41:46 +01:00
Benny Halevy	147f30caff	gossiper: mutate_live_and_unreachable_endpoints: make exception safe Change the mutate_live_and_unreachable_endpoints procedure so that the called `func` would mutate a cloned `live_and_unreachable_endpoints` object in place. Those are replicated to temporary copies on all shards using `foreign<unique_ptr<>>` so that they would be automatically freed on exception. Only after all copies are made, they are applied on all gossiper shards in a noexcept loop and finally, an `on_success` function is called to apply further side effects if everything else was replicated successfully. The latter is still susceptible to exceptions, but we can live with those as long as `_live_endpoints` and `_unreachable_endpoints` are synchronized on all shards. With that, the read-only methods: `get_live_members_synchronized` and `get_unreachable_members_synchronized` become trivial and they just return the required data from shard 0. Fixes #15089 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#16597	2024-01-03 14:46:10 +02:00
Benny Halevy	cdd5605d81	gms: endpoint_state: change application_state_map to std::unordered_map State changes are processed as a batch and there is no reason to maintain them as an ordered map. Instead, use a std::unordered_map that is more efficient. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	ad8a9104d8	endpoint_state subscriptions: batch on_change notification Rather than calling on_change for each particular application_state, pass an endpoint_state::map_type with all changed states, to be processed as a batch. In particular, this allows storage_service::on_change to update_peer_info once for all changed states. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	1d07a596bf	everywhere: drop before_change subscription None of the subscribers is doing anything before_change. This is done before changing `on_change` in the following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	5abf556399	gms: endpoint_state: define application_state_map Have a central definition for the map held in the endpoint_state (before changing it to std::unordered_map). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	3cba079b26	gossiper: add_saved_endpoint: keep heart_beat_state if ep_state is found Currently, when loading peers' endpoint state from system.peers, add_saved_endpoint is called. The first instance of the endpoint state is created with the default heart_beat_state, with both generation and version set to zero. However, if add_saved_endpoint finds an existing instance of the endpoint state, it reuses it, but it updates its heart_beat_state with the local heart_beat_state() rather than keeping the existing heart_beat_state, as it should. This is a problem since it may confuse updates over gossip later on via do_apply_state_locally that compares the remote generation vs. the local generation, so they must stem from the same root that is the endpoint itself. Fixes #16429 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 16:48:57 +02:00
Benny Halevy	3099c5b8ab	storage_service: topology_state_load: lock endpoint for add_saved_endpoint `topology_state_load` currently calls `add_saved_endpoint` only if it finds no endpoint_state_ptr for the endpoint. However, this is done before locking the endpoint and the endpoint state could be inserted concurrently. To prevent that, a permit_id parameter was added to `add_saved_endpoint` allowing the caller to call it while the endpoint is locked. With that, `topology_state_load` locks the endpoint and checks the existence of the endpoint state under the lock, before calling `add_saved_endpoint`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 16:48:57 +02:00
Benny Halevy	db434e8cb5	raft_group_registry: move on_alive error injection to gossiper Move the `raft_group_registry::on_alive` error injection point to `gossiper::real_mark_alive` so it can delay marking the endpoint as alive, and calling the `on_alive` callback, but without holding the endpoint_lock. Note that the entry for this endpoint in `_pending_mark_alive_endpoints` still blocks marking it as alive until real_mark_alive completes. Fixes #16506 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 15:28:54 +02:00
Pavel Emelyanov	c43501d973	locator,schema: Move initial tablets from r.s. options to params The option is kept in DDL, but is _not_ stored in system_schema.keyspaces. Instead, it's removed from the provided options and kept in scylla_keyspaces table in its own column. All the places that had optional initial_tablets disengaged now set this value up the way they find appropriate. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-25 16:07:10 +03:00
Kamil Braun	3b108f2e31	Merge 'db: config: make consistent_cluster_management mandatory' from Patryk Jędrzejczak We make `consistent_cluster_management` mandatory in 5.5. This option will be always unused and assumed to be true. Additionally, we make `override_decommission` deprecated, as this option has been supported only with `consistent_cluster_management=false`. Making `consistent_cluster_management` mandatory also simplifies the code. Branches that execute only with `consistent_cluster_management` disabled are removed. We also update documentation by removing information irrelevant in 5.5. Fixes scylladb/scylladb#15854 Note about upgrades: this PR does not introduce any more limitations to the upgrade procedure than there are already. As in scylladb/scylladb#16254, we can upgrade from the first version of Scylla that supports the schema commitlog feature, i.e. from 5.1 (or corresponding Enterprise release) or later. Assuming this PR ends up in 5.5, the documented upgrade support is from 5.4. For corresponding Enterprise release, it's from 2023.x (based on 5.2), so all requirements are met. Closes scylladb/scylladb#16334 * github.com:scylladb/scylladb: docs: update after making consistent_cluster_management mandatory system_keyspace, main, cql_test_env: fix indentations db: config: make consistent_cluster_management mandatory test: boost: schema_change_test: replace disable_raft_schema_config db: config: make override_decommission deprecated db: config: make force_schema_commit_log deprecated	2023-12-18 09:44:52 +01:00
Patryk Jędrzejczak	5ebfbf42bc	db: config: make consistent_cluster_management mandatory Code that executed only when consistent_cluster_management=false is removed. In particular, after this patch: - raft_group0 and raft_group_registry are always enabled, - raft_group0::status_for_monitoring::disabled becomes unused, - topology tests can only run with consistent_cluster_management.	2023-12-14 16:54:04 +01:00
Kamil Braun	26cbd28883	Merge 'token_metadata: switch to host_id' from Petr Gusev In this PR we refactor `token_metadata` to use `locator::host_id` instead of `gms::inet_address` for node identification in its internal data structures. Main motivation for these changes is to make raft state machine deterministic. The use of IPs is a problem since they are distributed through gossiper and can't be used reliably. One specific scenario is outlined [in this comment](https://github.com/scylladb/scylladb/pull/13655#issuecomment-1521389804) - `storage_service::topology_state_load` can't resolve host_id to IP when we are applying old raft log entries, containing host_id-s of the long-gone nodes. The refactoring is structured as follows: * Turn `token_metadata` into a template so that it can be used with host_id or inet_address as the node key. The version with inet_address (the current one) provides a `get_new()` method, which can be used to access the new version. * Go over all places which write to the old version and make the corresponding writes to the new version through `get_new()`. When this stage is finished we can use any version of the `token_metadata` for reading. * Go over all the places which read `token_metadata` and switch them to the new version. * Make `host_id`-based `token_metadata` default, drop `inet_address`-based version, change `token_metadata` back to non-template. These series [depends](`1745a1551a`) on RPC sender `host_id` being present in RPC `clent_info` for `bootstrap` and `replace` node_ops commands. This feature was added in [this commit](`95c726a8df`) and released in `5.4`. It is generally recommended not to skip versions when upgrading, so users who upgrade sequentially first to `5.4` (or the corresponding Enterprise version) then to the version with these changes (`5.5` or `6.0`) should be fine. 
If for some reason they upgrade from a version without `host_id` in RPC `clent_info` to the version with these changes and they run bootstrap or replace commands during the upgrade procedure itself, these commands may fail with an error `Coordinator host_id not found` if some nodes are already upgraded and the node which started the node_ops command is not yet upgraded. In this case the user can finish the upgrade first to version 5.4 or later, or start bootstrap/replace with an upgraded node. Note that removenode and decommission do not depend on coordinator host_id so they can be started in the middle of upgrade from any node. Closes scylladb/scylladb#15903 * github.com:scylladb/scylladb: topology: remove_endpoint: remove inet_address overload token_metadata: topology: cleanup add_or_update_endpoint token_metadata: add_replacing_endpoint: forbid replacing node with itself topology: drop key_kind, host_id is now the primary key dc_rack_fn: make it non-template token_metadata: drop the template shared_token_metadata: switch to the new token_metadata gossiper: use new token_metadata database: get_token_metadata -> new token_metadata erm: switch to the new token_metadata storage_service: get_token_metadata -> token_metadata2 storage_service: get_token_to_endpoint_map: use new token_metadata api/token_metadata: switch to new version storage_service::on_change: switch to new token_metadata cdc: switch to token_metadata2 calculate_natural_endpoints: fix indentation calculate_natural_endpoints: switch to token_metadata2 storage_service: get_changed_ranges_for_leaving: use new token_metadata decommission_with_repair, removenode_with_repair -> new token_metadata rebuild_with_repair, replace_with_repair: use new token_metadata bootstrap: use new token_metadata tablets: switch to token_metadata2 calculate_effective_replication_map: use new token_metadata calculate_natural_endpoints: fix formatting abstract_replication_strategy: calculate_natural_endpoints: make it work with 
both versions of token_metadata network_topology_strategy_test: update new token_metadata storage_service: on_alive: update new token_metadata storage_service: handle_state_bootstrap: update new token_metadata storage_service: snitch_reconfigured: update new token_metadata storage_service: leave_ring: update new token_metadata storage_service: node_ops_cmd_handler: update new token_metadata storage_service: node_ops_cmd_handler: add coordinator_host_id storage_service: bootstrap: update new token_metadata storage_service: join_token_ring: update new token_metadata storage_service: excise: update new token_metadata storage_service: join_cluster: update new token_metadata storage_service: on_remove: update new token_metadata storage_service: handle_state_normal: fill new token_metadata storage_service: topology_state_load: fill new token_metadata storage_service: adjust update_topology_change_info to update new token_metadata topology: set self host_id on the new topology locator::topology: allow being_replaced and replacing nodes to have the same IP token_metadata: get_endpoint_for_host_id -> get_endpoint_for_host_id_if_known token_metadata: get_host_id: exception -> on_internal_error token_metadata: add get_all_ips method token_metadata: support host_id-based version token_metadata: make it a template with NodeId=inet_address/host_id NodeId is used in all internal token_metadata data structures, that previously used inet_address. We choose topology::key_kind based on the value of the template parameter. locator: make dc_rack_fn a template locator/topology: add key_kind parameter token_metadata: topology_change_info: change field types to token_metadata_ptr token_metadata: drop unused method get_endpoint_to_token_map_for_reading	2023-12-13 16:35:52 +01:00
Petr Gusev	7b55ccbd8e	token_metadata: drop the template Replace token_metadata2 -> token_metadata, make token_metadata non-template again. No behavior changes, just compilation fixes.	2023-12-12 23:19:54 +04:00
Petr Gusev	799f747c8f	shared_token_metadata: switch to the new token_metadata	2023-12-12 23:19:54 +04:00
Petr Gusev	c7314aa8e2	gossiper: use new token_metadata	2023-12-12 23:19:53 +04:00
Tomasz Grabiec	effb9fb3cb	Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun When performing a schema change through group 0, extend the schema mutations with a version that's persisted and then used by the nodes in the cluster in place of the old schema digest, which becomes horribly slow as we perform more and more schema changes (#7620). If the change is a table create or alter, also extend the mutations with a version for this table to be used for `schema::version()`s instead of having each node calculate a hash which is susceptible to bugs (#13957). When performing a schema change in Raft RECOVERY mode we also extend schema mutations which forces nodes to revert to the old way of calculating schema versions when necessary. We can only introduce these extensions if all of the cluster understands them, so protect this code by a new cluster/schema feature, `GROUP0_SCHEMA_VERSIONING`. Fixes: #7620 Fixes: #13957 --- This is a reincarnation of PR scylladb/scylladb#15331. The previous PR was reverted due to a bug it unmasked; the bug has now been fixed (scylladb/scylladb#16139). Some refactors from the previous PR were already merged separately, so this one is a bit smaller. I have checked with @Lorak-mmk's reproducer (https://github.com/Lorak-mmk/udt_schema_change_reproducer -- many thanks for it!) that the originally exposed bug is no longer reproducing on this PR, and that it can still be reproduced if I revert the aforementioned fix on top of this PR. 
Closes scylladb/scylladb#16242 * github.com:scylladb/scylladb: docs: describe group 0 schema versioning in raft docs test: add test for group 0 schema versioning feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode schema_tables: don't delete `version` cell from `scylla_tables` mutations from group 0 migration_manager: add `committed_by_group0` flag to `system.scylla_tables` mutations schema_tables: use schema version from group 0 if present migration_manager: store `group0_schema_version` in `scylla_local` during schema changes system_keyspace: make `get/set_scylla_local_param` public feature_service: add `GROUP0_SCHEMA_VERSIONING` feature	2023-12-11 12:17:57 +01:00
Kamil Braun	7dad31c78f	feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode As promised in earlier commits: Fixes: #7620 Fixes: #13957 Also modify two test cases in `schema_change_test` which depend on the digest calculation method in their checks. Details are explained in the comments.	2023-12-08 17:46:31 +01:00
Botond Dénes	d2a88cd8de	Merge 'Typos: fix typos in code' from Yaniv Kaul Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255 Closes scylladb/scylladb#16289 * github.com:scylladb/scylladb: Update unified/build_unified.sh Update main.cc Update dist/common/scripts/scylla-housekeeping Typos: fix typos in code	2023-12-06 07:36:41 +02:00
Avi Kivity	12f160045b	Merge 'Get rid of fb_utilities' from Benny Halevy utils::fb_utilities is a global in-memory registry for storing and retrieving broadcast_address and broadcast_rpc_address. As part of the effort to get rid of all global state, this series gets rid of fb_utilities. This will eventually allow e.g. cql_test_env to instantiate multiple scylla server nodes, each serving on its own address. Closes scylladb/scylladb#16250 * github.com:scylladb/scylladb: treewide: get rid of now unused fb_utilities tracing: use locator::topology rather than fb_utilities streaming: use locator::topology rather than fb_utilities raft: use locator::topology/messaging rather than fb_utilities storage_service: use locator::topology rather than fb_utilities storage_proxy: use locator::topology rather than fb_utilities service_level_controller: use locator::topology rather than fb_utilities misc_services: use locator::topology rather than fb_utilities migration_manager: use messaging rather than fb_utilities forward_service: use messaging rather than fb_utilities messaging_service: accept broadcast_addr in config rather than via fb_utilities messaging_service: move listen_address and port getters inline test: manual: modernize message test table: use gossiper rather than fb_utilities repair: use locator::topology rather than fb_utilities dht/range_streamer: use locator::topology rather than fb_utilities db/view: use locator::topology rather than fb_utilities database: use locator::topology rather than fb_utilities db/system_keyspace: use topology via db rather than fb_utilities db/system_keyspace: save_local_info: get broadcast addresses from caller db/hints/manager: use locator::topology rather than fb_utilities db/consistency_level: use locator::topology rather than fb_utilities api: use locator::topology rather than fb_utilities alternator: ttl: use locator::topology rather than fb_utilities gossiper: use locator::topology rather than fb_utilities gossiper: add get_this_endpoint_state_ptr test: lib: cql_test_env: pass broadcast_address in cql_test_config init: get_seeds_from_db_config: accept broadcast_address locator: replication strategies: use locator::topology rather than fb_utilities locator: topology: add helpers to retrieve this host_id and address snitch: pass broadcast_address in snitch_config snitch: add optional get_broadcast_address method locator: ec2_multi_region_snitch: keep local public address as member ec2_multi_region_snitch: reindent load_config ec2_multi_region_snitch: coroutinize load_config ec2_snitch: reindent load_config ec2_snitch: coroutinize load_config thrift: thrift_validation: use std::numeric_limits rather than fb_utilities	2023-12-05 19:40:14 +02:00
Benny Halevy	0bcce35abd	treewide: get rid of now unused fb_utilities Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 16:22:49 +02:00
Yaniv Kaul	ae2ab6000a	Typos: fix typos in code Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255	2023-12-05 15:18:11 +02:00
Kamil Braun	07984215a3	feature_service: add `GROUP0_SCHEMA_VERSIONING` feature This feature, when enabled, will modify how schema versions are calculated and stored. - In group 0 mode, schema versions are persisted by the group 0 command that performs the schema change, then reused by each node instead of being calculated as a digest (hash) by each node independently. - In RECOVERY mode or before Raft upgrade procedure finishes, when we perform a schema change, we revert to the old digest-based way, taking into account the possibility of having performed group0-mode schema changes (that used persistent versions). As we will see in future commits, this will be done by storing additional flags and tombstones in system tables. By "schema versions" we mean both the UUIDs returned from `schema::version()` and the "global" schema version (the one we gossip as `application_state::SCHEMA`). For now, in this commit, the feature is always disabled. Once all necessary code is setup in following commits, we will enable it together with Raft.	2023-12-05 13:03:28 +01:00
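The version-selection rule described in commit 07984215a3 can be sketched roughly as follows. This is only an illustration under stated assumptions: `raft_mode`, `effective_schema_version`, and the use of plain strings for versions are hypothetical names and types, not taken from the ScyllaDB sources, and the sketch ignores the extra flags and tombstones the commit mentions for RECOVERY mode.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical mode enum: either normal group 0 operation, or RECOVERY /
// "Raft upgrade not finished yet".
enum class raft_mode { group0, recovery };

// Hypothetical helper: choose the schema version a node would use.
// - With the feature enabled in group 0 mode, reuse the version persisted
//   by the group 0 command, if one is present.
// - Otherwise fall back to the digest each node computes on its own.
std::string effective_schema_version(bool feature_enabled,
                                     raft_mode mode,
                                     const std::optional<std::string>& group0_version,
                                     const std::string& locally_computed_digest) {
    if (feature_enabled && mode == raft_mode::group0 && group0_version) {
        return *group0_version;
    }
    return locally_computed_digest;
}
```

With the feature disabled, as it is in this commit, the sketch always returns the locally computed digest, which matches the statement that behavior is unchanged until the feature is enabled together with Raft.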
Kefu Chai	e88bd9c5bd	gms/inet_address: pass sstring param by std::move() less overhead this way. the caller of lookup() always passes a rvalue reference. and seastar::dns::get_host_by_name() actually moves away from the parameter, so let's pass by std::move() for slightly better performance, and to match the expectation of the underlying seastar API. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16280	2023-12-05 12:05:21 +03:00
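The idea behind commit e88bd9c5bd, handing an owned string on to an API that moves from its parameter, can be illustrated with a minimal sketch. It assumes `std::string` as a stand-in for `sstring`; `resolve`, `lookup_by_ref`, and `lookup_by_value` are hypothetical names and do not reflect the actual signatures in `gms/inet_address.cc` or Seastar.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for an API that takes ownership of its argument,
// similar in spirit to how the commit describes
// seastar::dns::get_host_by_name() moving away from its parameter.
std::string resolve(std::string name) {
    return "resolved:" + name;
}

// Variant 1: the parameter is a const reference, so passing it on to
// resolve() has to copy the string.
std::string lookup_by_ref(const std::string& name) {
    return resolve(name);
}

// Variant 2: the parameter is taken by value and moved onward. When the
// caller passes an rvalue, no copy of the character data is made.
std::string lookup_by_value(std::string name) {
    return resolve(std::move(name));
}
```

Both variants return the same result; the second one only avoids a copy when the caller already passes a temporary, which the commit says is always the case for `lookup()`.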
Benny Halevy	f3e0358563	gossiper: use locator::topology rather than fb_utilities And add `get_endpoint_state_ptr` for this_node. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Benny Halevy	25754f843b	gossiper: add get_this_endpoint_state_ptr Returns this node's endpoint_state_ptr. With this entry point, the caller doesn't need to get_broadcast_address. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Kefu Chai	3608d9be97	gms/inet_address: remove unused '#include' neither <iomanip> nor "utils/to_string.hh" is used in `gms/inet_address.cc`, so let's remove their "#include"s. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16281	2023-12-05 08:30:03 +03:00
Patryk Jędrzejczak	c8ee7d4499	db: make schema commitlog feature mandatory Using consistent cluster management and not using schema commitlog ends with a bad configuration throw during bootstrap. Soon, we will make consistent cluster management mandatory. This forces us to also make schema commitlog mandatory, which we do in this patch. A booting node decides to use schema commitlog if at least one of the two statements below is true: - the node has `force_schema_commitlog=true` config, - the node knows that the cluster supports the `SCHEMA_COMMITLOG` cluster feature. The `SCHEMA_COMMITLOG` cluster feature has been added in version 5.1. This patch is supposed to be a part of version 6.0. We don't support a direct upgrade from 5.1 to 6.0 because it skips two versions - 5.2 and 5.4. So, in a supported upgrade we can assume that the version which we upgrade from has schema commitlog. This means that we don't need to check the `SCHEMA_COMMITLOG` feature during an upgrade. The reasoning above also applies to Scylla Enterprise. Version 2024.2 will be based on 6.0. Probably, we will only support an upgrade to 2024.2 from 2024.1, which is based on 5.4. But even if we support an upgrade from 2023.x, this patch won't break anything because 2023.1 is based on 5.2, which has schema commitlog. Upgrades from 2022.x definitely won't be supported. When we populate a new cluster, we can use the `force_schema_commitlog=true` config to use schema commitlog unconditionally. Then, the cluster feature check is irrelevant. This check could fail because we initiate schema commitlog before we learn about the features. The `force_schema_commitlog=true` config is especially useful when we want to use consistent cluster management. Failing feature checks would lead to crashes during initial bootstraps. Moreover, there is no point in creating a new cluster with `consistent_cluster_management=true` and `force_schema_commitlog=false`. 
It would just cause some initial bootstraps to fail, and after successful restarts, the result would be the same as if we used `force_schema_commitlog=true` from the start. In conclusion, we can unconditionally use schema commitlog without any checks in 6.0 because we can always safely upgrade a cluster and start a new cluster. Apart from making schema commitlog mandatory, this patch adds two changes that are its consequences: - making the unneeded `force_schema_commitlog` config unused, - deprecating the `SCHEMA_COMMITLOG` feature, which is always assumed to be true. Closes scylladb/scylladb#16254	2023-12-04 21:02:16 +02:00
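The decision rule described in commit c8ee7d4499 can be written down as a small sketch. The struct and function names are hypothetical and chosen only for illustration; the real check in ScyllaDB is not shown in this log.

```cpp
#include <cassert>

// Hypothetical view of the two inputs the commit message names.
struct node_view {
    bool force_schema_commitlog;            // the `force_schema_commitlog=true` config
    bool cluster_supports_schema_commitlog; // the `SCHEMA_COMMITLOG` cluster feature
};

// Rule before the patch, as described: a booting node uses schema commitlog
// if at least one of the two statements is true.
constexpr bool uses_schema_commitlog_before(const node_view& n) {
    return n.force_schema_commitlog || n.cluster_supports_schema_commitlog;
}

// Rule after the patch, as described: schema commitlog is mandatory, so the
// inputs no longer matter.
constexpr bool uses_schema_commitlog_after(const node_view&) {
    return true;
}
```

The only input combination where the two rules differ is a node with neither the config nor the feature, which the commit argues cannot occur in a supported upgrade path.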

1069 Commits