scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 00:13:31 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	f827cfd5b6	tablet_allocator: unregister metrics when leadership is lost So that graphs are not polluted with stale metrics from past leaders.	2023-08-05 21:48:08 +02:00
Tomasz Grabiec	d653cbae53	tablets: load_balancer: Export metrics	2023-08-05 21:48:08 +02:00
Tomasz Grabiec	67c7aadded	service, raft: Move balance_tablets() to tablet_allocator The implementation will access metrics registered from tablet_allocator.	2023-08-05 21:48:08 +02:00
Tomasz Grabiec	cb0d763a22	tablet_allocator: Start even if tablets feature is not enabled topology coordinator will call it. Rather than spreading ifs there, it's simpler to start it and disable functionality in the tablet allocator.	2023-08-05 21:48:08 +02:00
Tomasz Grabiec	5bfc8b0445	main, storage_service: Pass tablet allocator to storage_service Tablet balancing will be done through tablet_allocator later.	2023-08-05 03:10:26 +02:00
Kamil Braun	421a5ad55c	Merge 'feature_service: don't load whole topology state to check features' from Piotr Dulikowski Currently, feature service uses `system_keyspace::load_topology_state` to load information about features from the `system.topology` table. This function implicitly assumes that it is called after schema commitlog replay and will correspond to the state of the topology state machine after some command is applied. However, feature check happens before the commitlog replay. If some group 0 command consists of multiple mutations that are not applied atomically, the `load_topology_state` function may fail to construct a `service::topology` object based on the table state. Moreover, this function not only checks `system.topology` but also `system.cdc_generations_v3` - in the case of the issue, the entry that was loaded from this table didn't contain the `num_ranges` parameter. In order to fix this, the feature check code now uses `load_topology_features_state` which only loads enabled and supported features from `system.topology`. Only this information is really necessary for the feature check, and it doesn't have any invariants to check. Fixes: #14944 Closes #14955 * github.com:scylladb/scylladb: feature_service: don't load whole topology state to check features system_keyspace: separate loading topology_features from topology topology_state_machine: extract features-related fields to a struct untyped_result_set: add missing_column_exception	2023-08-04 15:09:12 +02:00
Kamil Braun	fed775e13b	Merge 'group0_state_machine: await transfer_snapshot' from Benny Halevy Hold a (newly added) group0_state_machine gate that is closed and waited on in group0_state_machine::abort() To prevent use-after-free when destroying the group0_state_machine while transfer_snapshot runs. Fixes #14907 Also, use an abort_source in group0_state_machine to abort an ongoing transfer_snapshot operation on group0_state_machine::abort() Closes #14952 * github.com:scylladb/scylladb: raft: group0_state_machine: transfer_snapshot: make abortable raft: group0_state_machine: transfer_snapshot: hold gate	2023-08-04 14:21:57 +02:00
Botond Dénes	68d2397d01	Merge 'repair: delete unused fields' from Aleksandra Martyniuk Delete unused shard_repair_task_impl members and incorrectly used method's argument. Closes #14956 * github.com:scylladb/scylladb: repair: delete task_manager_module::get_progress argument repair: delete unused shard_repair_task_impl fields	2023-08-04 15:08:31 +03:00
Piotr Dulikowski	b7d9348229	feature_service: don't load whole topology state to check features Currently, feature service uses `system_keyspace::load_topology_state` to load information about features from the `system.topology` table. This function implicitly assumes that it is called after schema commitlog replay and will correspond to the state of the topology state machine after some command is applied. However, feature check happens before the commitlog replay. If some group 0 command consists of multiple mutations that are not applied atomically, the `load_topology_state` function may fail to construct a `service::topology` object based on the table state. Moreover, this function not only checks `system.topology` but also `system.cdc_generations_v3` - in the case of the issue, the entry that was loaded from this table didn't contain the `num_ranges` parameter. In order to fix this, the feature check code now uses `load_topology_features_state` which only loads enabled and supported features from `system.topology`. Only this information is really necessary for the feature check, and it doesn't have any invariants to check. Fixes: #14944	2023-08-04 12:32:05 +02:00
Piotr Dulikowski	8f491457ae	system_keyspace: separate loading topology_features from topology Now, it is possible to load topology_features separately from the topology struct. It will be used in the code that checks enabled features on startup.	2023-08-04 12:32:04 +02:00
Piotr Dulikowski	f1704eeee6	topology_state_machine: extract features-related fields to a struct `enabled_features` and `supported_features` are now moved to a new `topology::features` struct. This will allow loading this information independently from the `topology` struct, which will be needed for feature checking during start.	2023-08-04 12:21:51 +02:00
Aleksandra Martyniuk	66df686980	repair: delete task_manager_module::get_progress argument Getting reason argument in task_manager_module::get_progress is deceiving as the method works properly only for streaming::stream_reason::repair (repair::shard_repair_task_impl::nr_ranges_finished isn't updated for any other reason).	2023-08-04 11:09:37 +02:00
Aleksandra Martyniuk	93ebbdcf1d	repair: delete unused shard_repair_task_impl fields	2023-08-04 10:52:24 +02:00
Botond Dénes	00a62866ac	Merge 'Make database::add_column_family exception safe.' from Aleksandra Martyniuk If some state update in database::add_column_family throws, info about a column family would be inconsistent. Undo already performed operations in database::add_column_family when one throws. Fixes: #14666. Closes #14672 * github.com:scylladb/scylladb: replica: undo the changes if something fails replica: start table earlier in database::add_column_family	2023-08-04 10:58:17 +03:00
Botond Dénes	4d538e1363	Merge 'Task manager tasks covering compaction group compaction' from Aleksandra Martyniuk All compaction task executors, except for regular compaction one, become task manager compaction tasks. Creating and starting of major_compaction_task_executor is modified to be consistent with other compaction task executors. Closes #14505 * github.com:scylladb/scylladb: test: extend test_compaction_task.py to cover compaction group tasks compaction: turn custom_task_executor into compaction_task_impl compaction: turn sstables_task_executor into sstables_compaction_task_impl compaction: change sstables compaction tasks type compaction: move table_upgrade_sstables_compaction_task_impl compaction: pass task_info through sstables compaction compaction: turn offstrategy_compaction_task_executor into offstrategy_compaction_task_impl compaction: turn cleanup_compaction_task_executor into cleanup_compaction_task_impl comapction: use optional task info in major compaction compaction: use perform_compaction in compaction_manager::perform_major_compaction	2023-08-04 10:11:00 +03:00
Michał Jadwiszczak	b92d47362f	schema::describe: print 'synchronous_updates' only if it was specified While describing materialized view, print `synchronous_updates` option only if the tag is present in schema's extensions map. Previously if the key wasn't present, the default (false) value was printed. Fixes: #14924 Closes #14928	2023-08-04 09:52:37 +03:00
Kefu Chai	d8d91379e7	test: remove unnecessary check in compaction_manager_basic_test we wait for the same condition a couple of lines before, so there is no need to check it again using `BOOST_CHECK_EQUAL()`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14921	2023-08-04 09:26:22 +03:00
Piotr Dulikowski	fad1e82bf7	untyped_result_set: add missing_column_exception Currently, when one tries to access a column that an untyped_result_set does not contain, a `std::bad_variant_access` exception is thrown. This exception's message provides very little context and it can be difficult to even figure out where this message is coming from. In order to improve the situation, a new exception `missing_column` is introduced which includes the missing column's name in its error message. The exception derives from `std::bad_variant_access` for compatibility with existing code that may want to catch it.	2023-08-04 07:37:12 +02:00
Kefu Chai	374bed8c3d	tools: do not create bpo::value unless transfer it to an option_description `boost::program_options::value()` creates a new typed_value<T> object, without holding it with a shared_ptr. boost::program_options expects the developer to construct a `bpo::option_description` right away from it. and `boost::program_options::option_description` takes the ownership of the `typed_value<T>*` raw pointer, and manages its life cycle with a shared_ptr. but before passing it to a `bpo::option_description`, the pointer created by `boost::program_options::value()` is still a raw pointer. before this change, we initialize positional options as global variables using `boost::program_options::value()`. but unfortunately, we don't always initialize a `bpo::option_description` from it -- we only do this on demand when the corresponding subcommand is called. so, if the corresponding subcommand is not called, the created `typed_value<T>` objects are leaked. hence LeakSanitizer warns us. after this change, we create the option vector as a static local variable in a function so it is created on demand as well. as an alternative, we could initialize the options vector as a local variable where it is used. but to be more consistent with how `global_option` is specified, and to colocate them in a single place, let's keep the existing code layout. Fixes #14929 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14939	2023-08-04 08:03:11 +03:00
Aleksandra Martyniuk	1e9b2972ea	replica: undo the changes if something fails If a step of adding a table fails, previous steps are undone.	2023-08-03 17:37:31 +02:00
Benny Halevy	46c9e3032d	storage_service: get_all_ranges: reserve enough space in ranges Commit `bc5f6cf45d` added a reserve call to the `ranges` vector before inserting all the returned token ranges into it. However, that reservation is too small as we need to express size+1 ranges for size tokens with <unbound, token[0]> and <token[size-1], unbound> ranges at the front and back, respectively. Fixes #14849 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #14938	2023-08-03 17:13:03 +03:00
Benny Halevy	357d57c82d	raft: group0_state_machine: transfer_snapshot: make abortable Use an abort_source in group0_state_machine to abort an ongoing transfer_snapshot operation on group0_state_machine::abort() Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-03 16:32:08 +03:00
Benny Halevy	a23b58231e	raft: group0_state_machine: transfer_snapshot: hold gate Hold a (newly added) group0_state_machine gate that is closed and waited on in group0_state_machine::abort() To prevent use-after-free when destroying the group0_state_machine while transfer_snapshot runs. Fixes #14907 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-03 15:45:34 +03:00
Botond Dénes	946c6487ee	Merge 'repair: Add ranges_parallelism option' from Asias He This patch adds the ranges_parallelism option to repair restful API. Users can use this option to optionally specify the number of ranges to repair in parallel per repair job to a smaller number than the Scylla core calculated default max_repair_ranges_in_parallel. Scylla manager can also use this option to provide more ranges (>N) in a single repair job but only repairing N ranges_parallelism in parallel, instead of providing N ranges in a repair job. To make it safer, unlike the PR #4848, this patch does not allow user to exceed the max_repair_ranges_in_parallel. Fixes #4847 Closes #14886 * github.com:scylladb/scylladb: repair: Add ranges_parallelism option repair: Change to use coroutine in do_repair_ranges	2023-08-03 11:34:05 +03:00
Kefu Chai	d4ee84ee1e	s3/test: nuke tempdir but keep $tempdir/log before this change, if the object_store test fails, the tempdir will be preserved. and if our CI test pipeline is used to perform the test, the test job would scan for the artifacts, and if the test in question fails, it would take over 1 hour to scan the tempdir. to alleviate the pain, let's just keep the scylla logging file whether the test fails or succeeds, so that jenkins can scan the artifacts faster if the test fails. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14880	2023-08-03 11:07:59 +03:00
Avi Kivity	cb3b808e3f	Merge 'replica/table.cc: Add per-node-per-table metrics' from Amnon Heiman Per-table metrics are very valuable for the users, but they come with a high load on both the reporting and the collecting metrics systems. This patch adds a small subset of per-table metrics that will be reported on the node level. The list of metrics is: system_column_family_memtable_switch - Number of times flush has resulted in the memtable being switched out system_column_family_memtable_partition_writes - Number of write operations performed on partitions in memtables system_column_family_memtable_partition_hits - Number of times a write operation was issued on an existing partition in memtables system_column_family_memtable_row_writes - Number of row writes performed in memtables system_column_family_memtable_row_hits - Number of rows overwritten by write operations in memtables system_column_family_total_disk_space - Total disk space used system_column_family_live_sstable - Live sstable count system_column_family_read_latency_count - Number of reads system_column_family_write_latency_count - Number of writes The names of the read/write metrics are based on the histogram convention, so when latency histograms are added, the names will not change. The metrics are labeled with a specific label __per_table="node" so it will be possible to easily manipulate them. The metrics will be available when enable_metrics_reporting (the per-table full metrics flag) is off Fixes #2198 Closes #13293 * github.com:scylladb/scylladb: replica/table.cc: Add node-per-table metrics config: add enable_node_table_metrics flag	2023-08-02 22:17:47 +03:00
Aleksandra Martyniuk	9f68566038	replica: start table earlier in database::add_column_family In database::add_column_family table::start() is called before a table is registered in different structures.	2023-08-02 16:35:34 +02:00
Kamil Braun	39ca07c49b	Merge 'Gossiper endpoint locking' from Benny Halevy This series cleans up and hardens the endpoint locking design and implementation in the gossiper and endpoint-state subscribers. We make sure that all notifications (except for `before_change`, that apparently can be dropped) are called under lock_endpoint, as well as all calls to gossiper::replicate, to serialize endpoint_state changes across all shards. An endpoint lock gets a unique permit_id that is passed to the notifications and passed back by them if the notification functions call the gossiper back for the same endpoint on paths that modify the endpoint_state and may acquire the same endpoint lock - to prevent a deadlock. Fixes scylladb/scylladb#14838 Refs scylladb/scylladb#14471 Closes #14845 * github.com:scylladb/scylladb: gossiper: replicate: ensure non-null permit gossiper: add_saved_endpoint: lock_endpoint gossiper: mark_as_shutdown: lock_endpoint gossiper: real_mark_alive: lock_endpoint gossiper: advertise_token_removed: lock_endpoint gossiper: do_status_check: lock_endpoint gossiper: remove_endpoint: lock_endpoint if needed gossiper: force_remove_endpoint: lock_endpoint if needed storage_service: lock_endpoint when removing node gossiper: use permit_id to serialize state changes while preventing deadlocks gossiper: lock_endpoint: add debug messages utils: UUID: make default tagged_uuid ctor constexpr gossiper: lock_endpoint must be called on shard 0 gossiper: replicate: simplify interface gossiper: mark_as_shutdown: make private gossiper: convict: make private gossiper: mark_as_shutdown: do not call convict	2023-08-02 13:50:08 +02:00
Konstantin Osipov	df97135583	test.py: forward the optional property file when creating a server To support multi-DC tests we need to provide a property file when creating a server. Forward it from the test client to test.py. Closes #14683	2023-08-02 13:45:19 +02:00
Kamil Braun	b835acf853	Merge 'Cluster features on raft: topology coordinator + check on boot' from Piotr Dulikowski This PR implements the functionality of the raft-based cluster features needed to safely manage and enable cluster features, according to the cluster features on raft design doc. Enabling features is a two phase process, performed by the topology coordinator when it notices that there are no topology changes in progress and there are some not-yet enabled features that are declared to be supported by all nodes: 1. First, a global barrier is performed to make sure that all nodes saw and persisted the same state of the `system.topology` table as the coordinator and see the same supported features of all nodes. When booting, nodes are now forbidden to revoke support for a feature if all nodes declare support for it, so a successful barrier makes sure that no node will restart and disable the features. 2. After a successful barrier, the features are marked as enabled in the `system.topology` table. The whole procedure is a group 0 operation and fails if the topology table is modified in the meantime (e.g. some node changes its supported features set). For now, the implementation relies on gossip shadow round check to protect from nodes without all features joining the cluster. In a followup, a new joining procedure will be implemented which involves the topology coordinator and lets it verify joining node's cluster features before the new node is added to group 0 and to the cluster. A set of tests for the new implementation is introduced, containing the same tests as for the non-raft-based cluster feature implementation plus one additional test, specific to this implementation. 
Closes #14722 * github.com:scylladb/scylladb: test: topology_experimental_raft: cluster feature tests test: topology: fix a skipped test storage_service: add injection to prevent enabling features storage_service: initialize enabled features from first node topology_state_machine: add size(), is_empty() group0_state_machine: enable features when applying cmds/snapshots persistent_feature_enabler: attach to gossip only if not using raft feature_service: enable and check raft cluster features on startup storage_service: provide raft_topology_change_enabled flag from outside storage_service: enable features in topology coordinator storage_service: add barrier_after_feature_update topology_coordinator: exec_global_command: make it optional to retake the guard topology_state_machine: add calculate_not_yet_enabled_features	2023-08-02 12:32:27 +02:00
Pavel Emelyanov	c3b23fc03d	Merge 'Skip mode validation for snapshots' from Benny Halevy Skip over verification of owner and mode of the snapshots sub-directory as this might race with scylla-manager trying to delete old snapshots concurrently. Fixes #12010 Closes #14892 * github.com:scylladb/scylladb: distributed_loader: process_sstable_dir: do not verify snapshots utils/directories: verify_owner_and_mode: add recursive flag	2023-08-02 13:05:47 +03:00
Kefu Chai	d28c06b65b	test: remove unused #include in sstable_*_test.cc for faster build times and clear inter-module dependencies, we should not #include headers that are not directly used. instead, we should only #include the headers directly used by a certain compilation unit. in this change, the source files under "/compaction" directories are checked using clangd, which identifies the cases where we have an #include which is not directly used. all the #includes identified by clangd are removed, except for "test/lib/scylla_test_case.hh" as it brings some command line options used by scylla tests. see also https://clangd.llvm.org/guides/include-cleaner#unused-include-warning Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14922	2023-08-02 11:58:03 +03:00
Kefu Chai	1bcd9dd80a	compaction: drop unnecessary type cast get_compacted_fragments_writer() returns an instance of `compacted_fragments_writer`, there is no need to cast it again. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14919	2023-08-02 11:36:10 +03:00
Amnon Heiman	c30d7ba5d7	replica/table.cc: Add node-per-table metrics Per-table metrics are very valuable for the users, but they come with a high load on both the reporting and the collecting metrics systems. This patch adds a small subset of per-table metrics that will be reported on the node level. The list of metrics is: system_column_family_memtable_switch - Number of times flush has resulted in the memtable being switched out system_column_family_memtable_partition_writes - Number of write operations performed on partitions in memtables system_column_family_memtable_partition_hits - Number of times a write operation was issued on an existing partition in memtables system_column_family_memtable_row_writes - Number of row writes performed in memtables system_column_family_memtable_row_hits - Number of rows overwritten by write operations in memtables system_column_family_total_disk_space - Total disk space used system_column_family_live_sstable - Live sstable count system_column_family_read_latency_count - Number of reads system_column_family_write_latency_count - Number of writes The names of the read/write metrics are based on the histogram convention, so when latency histograms are added, the names will not change. The metrics are labeled with a specific label __per_table="node" so it will be possible to easily manipulate them. The metrics will be available when enable_metrics_reporting (the per-table full metrics flag) is off and enable_node_table_metrics is true. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2023-08-02 10:20:18 +03:00
Amnon Heiman	d10a3dd19a	config: add enable_node_table_metrics flag By default, per-table-per-shard metrics reporting is turned off, and the aggregated version of the metrics (per-table-per-node) will be turned on. There could be a situation where a user with an excessive number of tables would suffer from performance issues, both from the network and the metrics collection server. This patch adds a config option, enable_node_table_metrics, which allows users to turn off per-table metrics reporting altogether. For example, when running Scylla with the command line argument '--enable-node-aggregated-table_metrics 0' per-table metrics will not be reported. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2023-08-02 10:20:18 +03:00
Kefu Chai	6c66030b7b	compaction: add formatter for compaction_task_executor add fmt formatter for `compaction_task_executor::state` and `compaction_task_executor` and its derived classes. this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `compaction_task_executor`, its derived classes and `compaction_task_executor::state` without the help of `operator<<`. since all of the callers of 'operator<<' of these types now use formatters, the operator<< are removed in this change. the helpers like `to_string()` and `describe()` are removed as well, as it'd be more consistent if we always use fmtlib for formatting instead of inventing APIs with different names. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14906	2023-08-02 09:15:43 +03:00
Benny Halevy	949ea43034	topology: unindex_node: erase dc from datacenters when empty In branch 5.2 we erase `dc` from `_datacenters` if there are no more endpoints listed in `_dc_endpoints[dc]`. This was lost unintentionally in `f3d5df5448` and this commit restores that behavior, and fixes test_remove_endpoint. Fixes scylladb/scylladb#14896 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #14897	2023-08-02 09:08:24 +03:00
Piotr Dulikowski	d40bb0bacb	test: topology_experimental_raft: cluster feature tests Although the implementation of cluster features on raft is not complete yet, it makes sense to add some tests for the existing implementation. The `test_raft_cluster_features.py` file includes the same set of tests as the file with non-raft-based cluster feature tests, plus one additional test which checks that a node will not allow disabling a feature if it sees that other nodes support it (even though the feature is not enabled yet).	2023-08-01 18:54:58 +02:00
Piotr Dulikowski	435005b6a5	test: topology: fix a skipped test The `test_partial_upgrade_can_be_finished_with_removenode` test does not work because the `cql` variable is used before it is declared. It was not noticed because the test is marked as skipped, and does not work for the non-raft cluster feature implementation. The variable declaration is moved higher and the test now works; it will be used to test the raft cluster feature implementation.	2023-08-01 18:54:58 +02:00
Piotr Dulikowski	0e29abae8e	storage_service: add injection to prevent enabling features Adds the `raft_topology_suppress_enabling_features` error injection which, while enabled, prevents the topology coordinator from enabling features.	2023-08-01 18:54:58 +02:00
Piotr Dulikowski	b0c57f34d2	storage_service: initialize enabled features from first node The first node in the cluster defines it and it does not need to consult with anybody whether its features should be enabled or not. We can immediately mark those features as enabled in raft when the first node inserts its join request to the topology table.	2023-08-01 18:54:58 +02:00
Piotr Dulikowski	82fc6d9360	topology_state_machine: add size(), is_empty() The latter method will be used in the next commit.	2023-08-01 18:54:58 +02:00
Piotr Dulikowski	232f2b49d2	group0_state_machine: enable features when applying cmds/snapshots As declared in the previous commit, the group0 state machine now enables features on command application and snapshot transfer.	2023-08-01 18:54:58 +02:00
Piotr Dulikowski	7c309549d6	persistent_feature_enabler: attach to gossip only if not using raft The enable_features_on_join function is now only called if the node does not use topology over raft, and so the node will not react to changes in gossip features. In the future, support for switching to topology coordinator in runtime will be added and the persistent feature enabler should disconnect itself during the upgrade procedure. We don't have such procedure yet, so a bunch of TODOs is added instead.	2023-08-01 18:54:58 +02:00
Piotr Dulikowski	3c1ca12e62	feature_service: enable and check raft cluster features on startup The enable_features_on_startup method is adjusted for the raft-based cluster features. In topology coordinator mode: - Information about enabled features is taken from system.topology instead of the usual system.scylla_local (`enabled_features` key). - Features which, according to the local state, are supported by all nodes but not enabled yet are also checked. Support for such features cannot be revoked safely because the topology coordinator might have performed a successful global barrier and might have proceeded with marking the feature as enabled.	2023-08-01 18:54:58 +02:00
Piotr Dulikowski	61a44e0bc0	storage_service: provide raft_topology_change_enabled flag from outside Information about whether we are using topology changes on raft or not will be soon necessary for the persistent feature enabler, so that it can do some additional checks based on the local raft topology state.	2023-08-01 18:54:57 +02:00
Piotr Dulikowski	5a45301ac8	storage_service: enable features in topology coordinator If the topology coordinator notices that there are no nodes requesting to be joined, no topology operations in progress and there are some features that are declared to be supported by all normal nodes but not enabled yet, the topology coordinator will attempt to enable those features. This is done in the following way, under a group 0 guard: - A global `barrier_after_feature_update` is performed to make sure that: - All nodes have already updated their supported_features column after boot and won't attempt to revoke any during current runtime, - Saw and persisted the latest topology state so that, after restart, the feature check won't allow them to revoke support for features that the topology coordinator is going to enable. - After the barrier succeeds, the coordinator tries to add the features to the `enabled_features` column.	2023-08-01 18:54:57 +02:00
Benny Halevy	e7f9700836	gossiper: replicate: ensure non-null permit Ensure that replicate is called under lock_endpoint to serialize endpoint state changes on all shards. Otherwise, we may end up with inconsistent state across shards. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-01 17:42:02 +03:00
Benny Halevy	cf7858d960	gossiper: add_saved_endpoint: lock_endpoint Modifying and replicating the endpoint state must be done under lock_endpoint. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-01 17:42:02 +03:00
Benny Halevy	6fdec20b59	gossiper: mark_as_shutdown: lock_endpoint The function manipulates the endpoint state and calls replicate and mark_dead, therefore it must ensure this is done under lock_endpoint. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-01 17:42:02 +03:00
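
Commit 46c9e3032d above (storage_service: get_all_ranges) notes that `size` tokens delimit `size + 1` ranges, because an `<unbound, token[0]>` range sits at the front and a `<token[size-1], unbound>` range at the back. The following is a minimal standalone sketch of that counting argument, not ScyllaDB's actual code: `token`, `token_range` and `all_ranges` are hypothetical stand-ins, and returning an empty vector for an empty token list is an assumption made only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not ScyllaDB's types.
using token = std::int64_t;
// std::nullopt marks an unbounded end of a range.
using token_range = std::pair<std::optional<token>, std::optional<token>>;

// Build the ranges delimited by a sorted token list.
// N tokens yield N + 1 ranges, so that is the amount to reserve.
std::vector<token_range> all_ranges(const std::vector<token>& sorted_tokens) {
    std::vector<token_range> ranges;
    if (sorted_tokens.empty()) {
        return ranges; // assumption of this sketch: no tokens, no ranges
    }
    const std::size_t size = sorted_tokens.size();
    // Reserving only `size` would be one slot short and force a reallocation.
    ranges.reserve(size + 1);
    // <unbound, token[0]>
    ranges.emplace_back(std::nullopt, sorted_tokens.front());
    // <token[i-1], token[i]> for every adjacent pair
    for (std::size_t i = 1; i < size; ++i) {
        ranges.emplace_back(sorted_tokens[i - 1], sorted_tokens[i]);
    }
    // <token[size-1], unbound>
    ranges.emplace_back(sorted_tokens.back(), std::nullopt);
    assert(ranges.size() == size + 1);
    return ranges;
}
```

Calling `all_ranges({10, 20, 30})` in this sketch produces four ranges: one unbounded on the left, two between adjacent tokens, and one unbounded on the right.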


38291 Commits