scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 17:40:34 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	62a23fd86a	config: Remove experimental TABLETS feature ... and replace it with boolean enable_tablets option. All the places in the code are patched to check the latter option instead of the former feature. The option is OFF by default, but the default scylla.yaml file sets this to true, so that newly installed clusters turn tablets ON. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `83d491af02`) Closes scylladb/scylladb#19012	2024-06-03 12:16:41 +03:00
Avi Kivity	54a82fed6b	feature, index: grandfather CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX This feature corrected how we store the token in secondary indexes. It was introduced in `7ff72b0ba5` (2020; 4.4) and can now be assumed present everywhere. Note that we still support indexes created with the old format.	2024-05-18 00:24:11 +03:00
Avi Kivity	2fbd78c769	feature: grandfather DIGEST_FOR_NULL_VALUES The DIGEST_FOR_NULL_VALUES feature was added in `21a77612b3` (2020; 4.4) and can now be assumed to be always present. The hasher which it invoked is removed.	2024-05-18 00:24:00 +03:00
Avi Kivity	7c264e8a71	feature: grandfather PER_TABLE_CACHING The PER_TABLE_CACHING feature was added in `0475dab359` (2020; 4.2) and can now be assumed to be always present.	2024-05-18 00:23:30 +03:00
Avi Kivity	d52c424a5f	feature: grandfather LWT LWT was made non-experimental in `9948f548a5` (2020; 4.1) and can now be assumed to be always present.	2024-05-18 00:20:53 +03:00
Avi Kivity	93088d0921	feature: grandfather HINTED_HANDOFF_SEPARATE_CONNECTION The HINTED_HANDOFF_SEPARATE_CONNECTION feature was introduced in `3a46b1bb2b` (2019; 3.3) and can be assumed always present.	2024-05-18 00:18:27 +03:00
Avi Kivity	3bead8cea0	feature: grandfather PER_TABLE_PARTITIONERS The PER_TABLE_PARTITIONERS feature was added in `90df9a44ce` (2020; 4.0) and can now be assumed to be always present. We also remove the associated schema_feature.	2024-05-18 00:15:07 +03:00
Avi Kivity	93113da01b	feature: grandfather NONFROZEN_UDTS The NONFROZEN_UDTS feature was added in `e74b5deb5d` (2019; 3.2) and can now be assumed to be always present.	2024-05-17 20:41:20 +03:00
Avi Kivity	c7d7ca2c23	feature: grandfather CDC The CDC feature was made non-experimental in `e9072542c1` (2020; 4.4) and can now be assumed to be always present. We also remove the corresponding schema_feature.	2024-05-17 20:41:20 +03:00
Avi Kivity	82ad2913ca	feature: grandfather DIGEST_INSENSITIVE_TO_EXPIRY The DIGEST_INSENSITIVE_TO_EXPIRY feature was added in `9de071d214` (2019; 3.2) and can now be assumed to be always present. We enable the corresponding schema_feature unconditionally. We do not remove the corresponding schema feature, because it can be disabled when the related TABLE_DIGEST_INSENSITIVE_TO_EXPIRY is present.	2024-05-17 20:41:19 +03:00
Avi Kivity	b5f6021a6b	feature: grandfather VIEW_VIRTUAL_COLUMNS The VIEW_VIRTUAL_COLUMNS feature was added in `a108df09f9` (2019; 3.1) and can now be assumed to be always present. The corresponding schema_feature is removed. Note schema_features are not sent over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.	2024-05-17 20:41:19 +03:00
Avi Kivity	7952200c8c	feature: grandfather ME_SSTABLE feature "me" format sstables were introduced in `d370558279` (Jan 2022; 5.1) and so can be assumed always present. The listener that checks when the cluster understands ME_SSTABLE was removed and in its place we default to sstable_version_types::me (and call on_enabled() immediately).	2024-05-17 20:41:19 +03:00
Avi Kivity	6d0c0b542c	feature: grandfather MD_SSTABLE_FORMAT "md" sstable support was introduced in `e8d7744040` (2020; 4.4) and so can be assumed to be present on all versions we upgrade from. Nothing appears to depend on it.	2024-05-17 20:41:19 +03:00
Patryk Jędrzejczak	0d428a3857	treewide: fix indentation after the previous patch	2024-04-25 14:33:21 +02:00
Patryk Jędrzejczak	3a34bb18cd	db: config: make consistent-topology-changes unused We make the `consistent-topology-changes` experimental feature unused and assumed to be true in 6.0. We remove code branches that executed if `consistent-topology-changes` was disabled.	2024-04-25 14:33:21 +02:00
Piotr Dulikowski	53932420f8	storage_service: disable persistent feature enabler on upgrade When starting in legacy mode, a gossip event listener called persistent feature enabler is registered. This listener marks a feature as enabled when it notices, in gossip, that all nodes declare support for the feature. With raft-based topology, features are managed in group 0 instead and do not rely on the persistent feature enabler at all. Make the listener look at the raft_topology_change_enabled() method and prevent it from enabling more features after that method starts returning true.	2024-02-08 19:12:28 +01:00
Piotr Dulikowski	3513a07d8a	feature_service: fall back to checking legacy features on startup When checking features on startup (i.e. whether support for any feature was revoked in an unsafe way), it might happen that upgrade to raft topology didn't finish yet. In that case, instead of loading an empty set of features - which supposedly represents the set of features that were enabled until last boot - we should fall back to loading the set from the legacy `enabled_features` key in `system.scylla_local`.	2024-02-08 19:12:28 +01:00
Piotr Dulikowski	2ecb8641b1	gms: feature_service: add SUPPORTS_CONSISTENT_TOPOLOGY_CHANGES All nodes being capable of support for raft topology is a prerequisite for starting upgrade to raft topology. The newly introduced feature will track this prerequisite.	2024-02-08 19:12:28 +01:00
Kefu Chai	7abd263ee6	db/config.cc: do not respect sstable_format option "me" sstable format includes an important feature of storing the `host_id` of the local node when writing sstables. This is crucial for validating the sstable's `replay_position` in stats metadata as it is valid only on the originating node and shard (#10080), therefore we would like to make the `me` format mandatory. before making `me` mandatory, we need to stop handling `sstable_format` option if it is "md". in this change - gms/feature_service: do not disable `ME_SSTABLE_FORMAT` even if `sstable_format` is configured with "md". and in that case, instead, a warning is printed in the logging message to note that this setting is not valid anymore. - docs/architecture/sstable: note that "me" is used by default now. after this change, "sstable_format" will only accept "me" if it's explicitly configured. and when a server with this change joins a cluster, it uses "md" if any of the nodes in the cluster still has `sstable_format`. practically, this change makes "me" mandatory in a 6.x cluster, assuming this change will be included in 6.x releases. Fixes #16551 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-01-11 22:43:05 +08:00
Kefu Chai	bece3eff0c	feature_service: abort if sstable_format < md sstable_format comes from scylla.yaml or from the command line arguments, and we gate scylla from unallowed sstable formats lower than `md` when parsing the configuration, and scylla bails out at seeing the unallowed sstable format like: ``` terminate called after throwing an instance of 'std::invalid_argument' what(): Invalid value for sstable_format: got ka which is not inside the set of allowed values md, me Aborted (core dumped) ``` scylla errors out way before `feature_config_from_db_config()` gets called -- it throws in `bpo::notify(configuration)`, way before `func` is evaluated in `app_template::run_deprecated()`. so, in this change, we do not handle these values anymore, and consider it a bug if we run into any of them. Refs #16551 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-01-11 22:43:05 +08:00
Kefu Chai	7e84e03f52	gms: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. because the removal of `#include "unimplemented.hh"`, `service/migration_manager.cc` misses the definition of `unimplemented::cause::VALIDATION`, so include the header where it is used. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16654	2024-01-05 13:37:08 +02:00
Benny Halevy	ad8a9104d8	endpoint_state subscriptions: batch on_change notification Rather than calling on_change for each particular application_state, pass an endpoint_state::map_type with all changed states, to be processed as a batch. In particular, this allows storage_service::on_change to update_peer_info once for all changed states. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	1d07a596bf	everywhere: drop before_change subscription None of the subscribers is doing anything before_change. This is done before changing `on_change` in the following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Pavel Emelyanov	c43501d973	locator,schema: Move initial tablets from r.s. options to params The option is kept in DDL, but is _not_ stored in system_schema.keyspaces. Instead, it's removed from the provided options and kept in scylla_keyspaces table in its own column. All the places that had optional initial_tablets disengaged now set this value up the way they find appropriate. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-25 16:07:10 +03:00
Patryk Jędrzejczak	5ebfbf42bc	db: config: make consistent_cluster_management mandatory Code that executed only when consistent_cluster_management=false is removed. In particular, after this patch: - raft_group0 and raft_group_registry are always enabled, - raft_group0::status_for_monitoring::disabled becomes unused, - topology tests can only run with consistent_cluster_management.	2023-12-14 16:54:04 +01:00
Kamil Braun	7dad31c78f	feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode As promised in earlier commits: Fixes: #7620 Fixes: #13957 Also modify two test cases in `schema_change_test` which depend on the digest calculation method in their checks. Details are explained in the comments.	2023-12-08 17:46:31 +01:00
Kamil Braun	07984215a3	feature_service: add `GROUP0_SCHEMA_VERSIONING` feature This feature, when enabled, will modify how schema versions are calculated and stored. - In group 0 mode, schema versions are persisted by the group 0 command that performs the schema change, then reused by each node instead of being calculated as a digest (hash) by each node independently. - In RECOVERY mode or before Raft upgrade procedure finishes, when we perform a schema change, we revert to the old digest-based way, taking into account the possibility of having performed group0-mode schema changes (that used persistent versions). As we will see in future commits, this will be done by storing additional flags and tombstones in system tables. By "schema versions" we mean both the UUIDs returned from `schema::version()` and the "global" schema version (the one we gossip as `application_state::SCHEMA`). For now, in this commit, the feature is always disabled. Once all necessary code is setup in following commits, we will enable it together with Raft.	2023-12-05 13:03:28 +01:00
Patryk Jędrzejczak	c8ee7d4499	db: make schema commitlog feature mandatory Using consistent cluster management and not using schema commitlog ends with a bad configuration throw during bootstrap. Soon, we will make consistent cluster management mandatory. This forces us to also make schema commitlog mandatory, which we do in this patch. A booting node decides to use schema commitlog if at least one of the two statements below is true: - the node has `force_schema_commitlog=true` config, - the node knows that the cluster supports the `SCHEMA_COMMITLOG` cluster feature. The `SCHEMA_COMMITLOG` cluster feature has been added in version 5.1. This patch is supposed to be a part of version 6.0. We don't support a direct upgrade from 5.1 to 6.0 because it skips two versions - 5.2 and 5.4. So, in a supported upgrade we can assume that the version which we upgrade from has schema commitlog. This means that we don't need to check the `SCHEMA_COMMITLOG` feature during an upgrade. The reasoning above also applies to Scylla Enterprise. Version 2024.2 will be based on 6.0. Probably, we will only support an upgrade to 2024.2 from 2024.1, which is based on 5.4. But even if we support an upgrade from 2023.x, this patch won't break anything because 2023.1 is based on 5.2, which has schema commitlog. Upgrades from 2022.x definitely won't be supported. When we populate a new cluster, we can use the `force_schema_commitlog=true` config to use schema commitlog unconditionally. Then, the cluster feature check is irrelevant. This check could fail because we initiate schema commitlog before we learn about the features. The `force_schema_commitlog=true` config is especially useful when we want to use consistent cluster management. Failing feature checks would lead to crashes during initial bootstraps. Moreover, there is no point in creating a new cluster with `consistent_cluster_management=true` and `force_schema_commitlog=false`. 
It would just cause some initial bootstraps to fail, and after successful restarts, the result would be the same as if we used `force_schema_commitlog=true` from the start. In conclusion, we can unconditionally use schema commitlog without any checks in 6.0 because we can always safely upgrade a cluster and start a new cluster. Apart from making schema commitlog mandatory, this patch adds two changes that are its consequences: - making the unneeded `force_schema_commitlog` config unused, - deprecating the `SCHEMA_COMMITLOG` feature, which is always assumed to be true. Closes scylladb/scylladb#16254	2023-12-04 21:02:16 +02:00
Kamil Braun	5397524875	feature_service: make COMPUTED_COLUMNS feature unconditionally true The feature is assumed to be true, it was introduced in 2019. It's still advertised in gossip, but it's assumed to always be present. The `schema_feature` enum class still contains `COMPUTED_COLUMNS`, and the `all_tables` function in schema_tables.cc still checks for the schema feature when deciding if `computed_columns()` table should be included. This is necessary because digest calculation tests contain many digests calculated with the feature disabled, if we wanted to make it unconditional in the schema_tables code we'd have to regenerate almost all digests in the tests. It is simpler to leave the possibility for the tests to disable the feature.	2023-10-24 13:30:13 +02:00
Avi Kivity	35849fc901	Revert "Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun" This reverts commit `3d4398d1b2`, reversing changes made to `45dfce6632`. The commit causes some schema changes to be lost due to incorrect timestamps in some mutations. More information is available in [1]. Reopens: scylladb/scylladb#7620 Reopens: scylladb/scylladb#13957 Fixes scylladb/scylladb#15530. [1] https://github.com/scylladb/scylladb/pull/15687	2023-10-11 00:32:05 +03:00
Kamil Braun	c2beee348a	feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode As promised in earlier commits: Fixes: #7620 Fixes: #13957 Also modify two test cases in `schema_change_test` which depend on the digest calculation method in their checks. Details are explained in the comments.	2023-09-15 17:54:36 +02:00
Kamil Braun	72cd457d53	feature_service: add `GROUP0_SCHEMA_VERSIONING` feature This feature, when enabled, will modify how schema versions are calculated and stored. - In group 0 mode, schema versions are persisted by the group 0 command that performs the schema change, then reused by each node instead of being calculated as a digest (hash) by each node independently. - In RECOVERY mode or before Raft upgrade procedure finishes, when we perform a schema change, we revert to the old digest-based way, taking into account the possibility of having performed group0-mode schema changes (that used persistent versions). As we will see in future commits, this will be done by storing additional flags and tombstones in system tables. By "schema versions" we mean both the UUIDs returned from `schema::version()` and the "global" schema version (the one we gossip as `application_state::SCHEMA`). For now, in this commit, the feature is always disabled. Once all necessary code is setup in following commits, we will enable it together with Raft.	2023-09-15 13:04:04 +02:00
Petr Gusev	a683cebb02	system_keyspace: scylla_local: use schema commitlog We remove flush from set_scylla_local_param_as since it's now redundant. We add it to save_local_enabled_features as features need to be available before schema commitlog replay. We skip the flush if save_local_enabled_features is called from topology_state_load when the features are migrated to system.topology and we don't need strict durability.	2023-09-13 23:17:20 +04:00
Petr Gusev	cbfc512667	main.cc: move schema commitlog replay earlier We want to switch system.local table to schema commitlog, but this table is used in host_id initialization (initialize_local_info), so we need to replay schema commitlog before. In this commit we gather all the actions related to early system_keyspace initialization in one place, before initialize_local_info_thread. The calls to save_system_schema and recalculate_schema_version are tied to legacy_schema_migrator::migrate and initialize_virtual_tables calls, so they are done separately after legacy_schema_migrator::migrate.	2023-09-13 23:17:11 +04:00
Tomasz Grabiec	7b65d4d947	Merge 'Gossiper: provide strong exception safety for endpoint state changes' from Benny Halevy This series ensures that endpoint state changes (for each single endpoint) are applied to the gossiper endpoint_state_map as a whole and on all shards. Any failure in the process will keep the existing endpoint state intact. Note that verbs that modify the endpoint states of multiple endpoints may still succeed to modify some of them before hitting an error and those changes are committed to the endpoint_state_map, so we don't ensure atomicity when updating multiple endpoints' states. Fixes scylladb/scylladb#14794 Fixes scylladb/scylladb#14799 Closes #15073 * github.com:scylladb/scylladb: gossiper: move endpoint_state by value to apply it gossiper: replicate: make exception safe gms: pass endpoint_state_ptr to endpoint_state change subscribers gossiper: modify endpoint state only via replicate gossiper: keep and serve shared endpoint_state_ptr in map gossiper: get_max_endpoint_state_version: get state by reference api/failure_detector: get_all_endpoint_states: reduce allocations cdc/generation: get_generation_id_for: get endpoint_state& gossiper: add for_each_endpoint_state helpers gossiper: add num_endpoints gossiper: add my_endpoint_state	2023-09-01 12:23:19 +02:00
Piotr Dulikowski	aa5401383f	feature_service: introduce unsupported_feature_exception The new `unsupported_feature_exception` is introduced so that the exception thrown by `check_features` can be caught in a type-safe way.	2023-08-31 16:46:10 +02:00
Piotr Dulikowski	8286a2c369	feature_service: move startup feature check to a separate function The logic responsible for checking supported features against the currently enabled features (and features that are unsafe to disable) is moved to a separate function, `check_features`. Currently, it is only used from `enable_features_on_startup`, but more checks against features in raft will be added in the commits that follow.	2023-08-31 16:45:40 +02:00
Benny Halevy	c16ec870da	gms: pass endpoint_state_ptr to endpoint_state change subscribers Now that the endpoint_state isn't changed in place we do not need to copy it to each subscriber. We can rather just pass the lw_shared_ptr holding a snapshot of it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-31 09:35:15 +03:00
Pavel Emelyanov	f1515c610e	code: Remove query-context.hh The whole thing is unused now, so the header is no longer needed Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:11:07 +03:00
Piotr Dulikowski	b7d9348229	feature_service: don't load whole topology state to check features Currently, feature service uses `system_keyspace::load_topology_state` to load information about features from the `system.topology` table. This function implicitly assumes that it is called after schema commitlog replay and will correspond to the state of the topology state machine after some command is applied. However, feature check happens before the commitlog replay. If some group 0 command consists of multiple mutations that are not applied atomically, the `load_topology_state` function may fail to construct a `service::topology` object based on the table state. Moreover, this function not only checks `system.topology` but also `system.cdc_generations_v3` - in the case of the issue, the entry that was loaded from this table didn't contain the `num_ranges` parameter. In order to fix this, the feature check code now uses `load_topology_features_state` which only loads enabled and supported features from `system.topology`. Only this information is really necessary for the feature check, and it doesn't have any invariants to check. Fixes: #14944	2023-08-04 12:32:05 +02:00
Piotr Dulikowski	f1704eeee6	topology_state_machine: extract features-related fields to a struct `enabled_features` and `supported_features` are now moved to a new `topology::features` struct. This will allow loading this information independently from the `topology` struct, which will be needed for feature checking during start.	2023-08-04 12:21:51 +02:00
Kamil Braun	39ca07c49b	Merge 'Gossiper endpoint locking' from Benny Halevy This series cleans up and hardens the endpoint locking design and implementation in the gossiper and endpoint-state subscribers. We make sure that all notifications (except for `before_change`, that apparently can be dropped) are called under lock_endpoint, as well as all calls to gossiper::replicate, to serialize endpoint_state changes across all shards. An endpoint lock gets a unique permit_id that is passed to the notifications and passed back by them if the notification functions call the gossiper back for the same endpoint on paths that modify the endpoint_state and may acquire the same endpoint lock - to prevent a deadlock. Fixes scylladb/scylladb#14838 Refs scylladb/scylladb#14471 Closes #14845 * github.com:scylladb/scylladb: gossiper: replicate: ensure non-null permit gossiper: add_saved_endpoint: lock_endpoint gossiper: mark_as_shutdown: lock_endpoint gossiper: real_mark_alive: lock_endpoint gossiper: advertise_token_removed: lock_endpoint gossiper: do_status_check: lock_endpoint gossiper: remove_endpoint: lock_endpoint if needed gossiper: force_remove_endpoint: lock_endpoint if needed storage_service: lock_endpoint when removing node gossiper: use permit_id to serialize state changes while preventing deadlocks gossiper: lock_endpoint: add debug messages utils: UUID: make default tagged_uuid ctor constexpr gossiper: lock_endpoint must be called on shard 0 gossiper: replicate: simplify interface gossiper: mark_as_shutdown: make private gossiper: convict: make private gossiper: mark_as_shutdown: do not call convict	2023-08-02 13:50:08 +02:00
Piotr Dulikowski	3c1ca12e62	feature_service: enable and check raft cluster features on startup The enable_features_on_startup method is adjusted for the raft-based cluster features. In topology coordinator mode: - Information about enabled features is taken from system.topology instead of the usual system.scylla_local (`enabled_features` key). - Features which, according to the local state, are supported by all nodes but not enabled yet are also checked. Support for such features cannot be revoked safely because the topology coordinator might have performed a successful global barrier and might have proceeded with marking the feature as enabled.	2023-08-01 18:54:58 +02:00
Benny Halevy	f74d154fe3	gossiper: use permit_id to serialize state changes while preventing deadlocks Pass permit_id to subscribers when we acquire one via lock_endpoint. The subscribers then pass it back to gossiper for paths that acquire lock_endpoint for the same endpoint, to detect nested locks when the endpoint is locked with the same permit_id. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-01 17:41:57 +03:00
Avi Kivity	cf81eef370	Merge 'schema_mutations, migration_manager: Ignore empty partitions in per-table digest' from Tomasz Grabiec Schema digest is calculated by querying for mutations of all schema tables, then compacting them so that all tombstones in them are dropped. However, even if the mutation becomes empty after compaction, we still feed its partition key. If the same mutations were compacted prior to the query, because the tombstones expire, we won't get any mutation at all and won't feed the partition key. So schema digest will change once an empty partition of some schema table is compacted away. Tombstones expire 7 days after schema change which introduces them. If one of the nodes is restarted after that, it will compute a different table schema digest on boot. This may cause performance problems. When sending a request from coordinator to replica, the replica needs schema_ptr of exact schema version requested by the coordinator. If it doesn't know that version, it will request it from the coordinator and perform a full schema merge. This adds latency to every such request. Schema versions which are not referenced are currently kept in cache for only 1 second, so if request flow has low-enough rate, this situation results in perpetual schema pulls. After `ae8d2a550d` (5.2.0), it is more likely to run into this situation, because table creation generates tombstones for all schema tables relevant to the table, even the ones which will be otherwise empty for the new table (e.g. computed_columns). This change introduces a cluster feature which when enabled will change digest calculation to be insensitive to expiry by ignoring empty partitions in digest calculation. When the feature is enabled, schema_ptrs are reloaded so that the window of discrepancy during transition is short and no rolling restart is required. A similar problem was fixed for per-node digest calculation in c2ba94dc39e4add9db213751295fb17b95e6b962. 
Per-table digest calculation was not fixed at that time because we didn't persist enabled features and they were not enabled early-enough on boot for us to depend on them in digest calculation. Now they are enabled before non-system tables are loaded so digest calculation can rely on cluster features. Fixes #4485. Manually tested using ccm on cluster upgrade scenarios and node restarts. Closes #14441 * github.com:scylladb/scylladb: test: schema_change_test: Verify digests also with TABLE_DIGEST_INSENSITIVE_TO_EXPIRY enabled schema_mutations, migration_manager: Ignore empty partitions in per-table digest migration_manager, schema_tables: Implement migration_manager::reload_schema() schema_tables: Avoid crashing when table selector has only one kind of tables	2023-07-28 00:01:33 +03:00
Piotr Dulikowski	794d3f0b03	feature_service: add error injection for deprecated cluster feature Adds an error injection which allows to enable the TEST_ONLY_FEATURE as a deprecated feature, i.e. it is assumed to be always enabled, but still considered to be supported by the node and advertised in gossip.	2023-07-14 12:41:37 +02:00
Piotr Dulikowski	a775f929df	feature_service: move error injection check to helper function And also extract "features_enable_test_feature" literal to a string constant. This should slightly improve readability and make it more consistent with the next commit.	2023-07-14 12:41:37 +02:00
Piotr Dulikowski	1704d7e4f0	feature_service: handle deprecated features correctly in feature check The feature check in `enable_features_on_startup` loads the list of features that were enabled previously, goes over every one of them and checks whether each feature is considered supported and whether there is a corresponding `gms::feature` object for it (i.e. the feature is "registered"). The second part of the check is unnecessary and wrong. A feature can be marked as supported but its `gms::feature` object not be present anymore: after a feature is supported for long enough (i.e. we only support upgrades from versions that support the feature), we can consider such a feature to be deprecated. When a feature is deprecated, its `gms::feature` object is removed and the feature is always considered enabled which allows to remove some legacy code. We still consider this feature to be supported and advertise it in gossip, for the sake of the old nodes which, even though they always support the feature, they still check whether other nodes support it. The problem with the check as it is now is that it disallows moving features to the disabled list. If one tries to do it, they will find out that upgrading the node to the new version does not work: `enable_features_on_startup` will load the feature, notice that it is not "registered" (there is no `gms::feature` object for it) and fail to boot. This commit fixes the problem by modifying `enable_features_on_startup` not to look at the registered features list at all. In addition to this, some other small cleanups are performed: - "LARGE_COLLECTION_DETECTION" is removed from the deprecated features list. For some reason, it was put there when the feature was being introduced. It does not break anything because there is a `gms::feature` object for it, but it's slightly confusing and therefore is removed. - The comment in `supported_feature_set` that invites developers to add features there as they are introduced is removed. 
It is no longer necessary to do so because registered features are put there automatically. Deprecated features should still be put there, as indicated by another comment. Fortunately, this issue does not break any upgrades as of now - since we added enabled cluster feature persisting, no features were deprecated, and we only add registered features to the persisted feature list.	2023-07-14 12:41:18 +02:00
Tomasz Grabiec	f2ed9fcd7e	schema_mutations, migration_manager: Ignore empty partitions in per-table digest Schema digest is calculated by querying for mutations of all schema tables, then compacting them so that all tombstones in them are dropped. However, even if the mutation becomes empty after compaction, we still feed its partition key. If the same mutations were compacted prior to the query, because the tombstones expire, we won't get any mutation at all and won't feed the partition key. So schema digest will change once an empty partition of some schema table is compacted away. Tombstones expire 7 days after schema change which introduces them. If one of the nodes is restarted after that, it will compute a different table schema digest on boot. This may cause performance problems. When sending a request from coordinator to replica, the replica needs schema_ptr of exact schema version requested by the coordinator. If it doesn't know that version, it will request it from the coordinator and perform a full schema merge. This adds latency to every such request. Schema versions which are not referenced are currently kept in cache for only 1 second, so if request flow has low-enough rate, this situation results in perpetual schema pulls. After `ae8d2a550d`, it is more likely to run into this situation, because table creation generates tombstones for all schema tables relevant to the table, even the ones which will be otherwise empty for the new table (e.g. computed_columns). This change introduces a cluster feature which when enabled will change digest calculation to be insensitive to expiry by ignoring empty partitions in digest calculation. When the feature is enabled, schema_ptrs are reloaded so that the window of discrepancy during transition is short and no rolling restart is required. A similar problem was fixed for per-node digest calculation in 18f484cc753d17d1e3658bcb5c73ed8f319d32e8. 
Per-table digest calculation was not fixed at that time because we didn't persist enabled features and they were not enabled early-enough on boot for us to depend on them in digest calculation. Now they are enabled before non-system tables are loaded so digest calculation can rely on cluster features. Fixes #4485.	2023-07-03 23:06:55 +02:00
Kefu Chai	f014ccf369	Revert "Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai"" This reverts commit `562087beff`. The regressions introduced by the reverted change have been fixed. So let's revert this revert to resurrect the uuid_sstable_identifier_enabled support. Fixes #10459	2023-06-21 13:02:40 +03:00

139 Commits