scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-06 15:03:06 +00:00

Author	SHA1	Message	Date
Beni Peled	bddac3279e	Skip the backport-label workflow for draft pull requests It's not necessary (and annoying) when this workflow runs and fails against PRs in draft mode Closes scylladb/scylladb#17864	2024-03-18 14:42:55 +02:00
Wojciech Mitros	efcb718e0a	mv: adjust memory tracking of single view updates within a batch Currently, when dividing memory tracked for a batch of updates we do not take into account the overhead that we have for processing every update. This patch adds the overhead for single updates and joins the memory calculation path for batches and their parts so that both use the same overhead. Fixes #17854 Closes scylladb/scylladb#17855	2024-03-18 14:31:54 +02:00
Raphael S. Carvalho	2c9b13d2d1	compaction: Check for key presence in memtable when calculating max purgeable timestamp It was observed that some use cases might append old data constantly to memtable, blocking GC of expired tombstones. That's because timestamp of memtable is unconditionally used for calculating max purgeable, even when the memtable doesn't contain the key of the tombstone we're trying to GC. The idea is to treat memtable as we treat L0 sstables, i.e. it will only prevent GC if it contains data that is possibly shadowed by the expired tombstone (after checking for key presence and timestamp). Memtable will usually have a small subset of keys in largest tier, so after this change, a large fraction of keys containing expired tombstones can be GCed when memtable contains old data. Fixes #17599. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#17835	2024-03-18 13:37:44 +02:00
Benny Halevy	2c0b1d1fa7	compaction: get_max_purgeable_timestamp: optimize sstable filtering by min_timestamp There is no point in checking `sst->filter_has_key(*hk)` if the sstable contains no data older than the running minimum timestamp, since even if it matches, it won't change the minimum. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#17839	2024-03-18 13:26:49 +02:00
Avi Kivity	ed211cd0bf	sstables: partition_index_cache: reindent Fix up after `e120ba3514`. Closes scylladb/scylladb#17847	2024-03-18 13:23:21 +02:00
Andrei Chekun	b6edf056ea	Add sanity tests for multi dc Fix writing cassandra-rackdc.properties with correct format data instead of yaml Add a parameter to overwrite RF for specific DC Add the possibility to connect cql to the specific node In this PR 4 tests were added to test multi-DC functionality. One comes from the initial commit where the multi-DC possibility was introduced; however, that test was not committed. Three of them are migrations from dtest, which will later be deleted. To be able to execute the migrated tests, additional functionality is added: the ability to connect cql to a specific node in the cluster instead of a pooled connection, and the possibility to overwrite the replication factor for a specific DC. To be able to use multi-DC in test.py, the issue with the incorrect format of the properties file is fixed in this PR. Closes scylladb/scylladb#17503	2024-03-18 13:00:36 +02:00
Nadav Har'El	680e37c4af	Merge 'schema_tables: unfreeze frozen_mutation:s gently' from Avi Kivity With large schemas, unfreezing can stall, especially as it requires a lot of memory. Switch to a gentle version that will not stall. As a preparation step, we add unfreeze_gently() for a span of mutations. Fixes #17841 Closes scylladb/scylladb#17842 * github.com:scylladb/scylladb: schema_tables: unfreeze frozen_mutation:s gently frozen_mutation: add unfreeze_gently(span<frozen_mutation>)	2024-03-18 12:56:44 +02:00
Kefu Chai	fe28aac440	test/perf: add fmt::formatter for perf_result_with_aio_writes before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `perf_result_with_aio_writes`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17849	2024-03-18 12:53:39 +02:00
Botond Dénes	a4e8bea679	tools/scylla-nodetool: status: handle missing host_id Newly joining nodes may not have a host id yet. Handle this and print a "?" for these nodes, instead of the host-id. Extend the existing test for joining node case (also rename it and add comment). Closes scylladb/scylladb#17853	2024-03-18 12:26:59 +02:00
Avi Kivity	731b5c5120	schema_tables: unfreeze frozen_mutation:s gently With large schemas, unfreezing can stall, especially as it requires a lot of memory. Switch to a gentle version that will not stall.	2024-03-17 17:46:02 +02:00
Avi Kivity	a34edb0a93	frozen_mutation: add unfreeze_gently(span<frozen_mutation>) While we have unfreeze(vector<frozen_mutation>), a gentle version is preferred.	2024-03-17 17:45:30 +02:00
Kefu Chai	8811900602	build: cmake: do not link randomized_nemesis_test with replication.cc test/raft/replication.cc defines a symbol named `tlogger`, while test/raft/randomized_nemesis_test.cc also defines a symbol with the same name. when linking the test with mold, it identified the ODR violation. in this change, we extract test-raft-helper out, so that randomized_nemesis_test can selectively only link against this library. this also matches with the behavior of the rules generated by `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17836	2024-03-17 17:01:47 +02:00
Kefu Chai	e1ae36ecfd	test/boost: add formatter for BOOST_REQUIRE_EQUAL in gossiping_property_file_snitch_test, we use `BOOST_REQUIRE_EQUAL(dc_racks[i], dc_racks[0])` to check the equality of two instances of `pair<sstring, sstring>`, like: ```c++ BOOST_REQUIRE_EQUAL(dc_racks[i], dc_racks[0]) ``` since the standard library does not provide the formatter for printing `std::pair<>`, we rely on the homebrew generic formatter to print `std::pair<>`, which in turn uses operator<< to format the elements in the `pair`, but we intend to remove this formatter in future, as the last step of #13245 . so in order to enable Boost.test to print out lhs and rhs when `BOOST_REQUIRE_EQUAL` check fails, we are adding `boost_test_print_type()` for `pair<sstring,sstring>`. the helper function uses {fmt} to print the `pair<>`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17831	2024-03-17 16:58:39 +02:00
Kefu Chai	6244a2ae00	service:qos: add fmt::formatter for service_level_options::workload_type this change prepares for the fmt::formatter based formatter used by tests, which will use {fmt} to print the elements in a container, so we need to define the formatter using fmt::formatter for these element. the operator<< for service_level_options::workload_type is preserved, as the tests are still using it. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17837	2024-03-17 16:52:57 +02:00
Kefu Chai	7df3acd39c	repair: add fmt::formatter for row_level_diff_detect_algorithm before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for row_level_diff_detect_algorithm. please note, we already have `format_as()` overload for this type, but we cannot use it as a fallback of the proper `fmt::formatter<>` specialization before {fmt} v10. so before we update our CI to a distro with {fmt} v10, `fmt::formatter<row_level_diff_detect_algorithm>` is still needed. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17824	2024-03-16 19:12:49 +02:00
Botond Dénes	03c47bc30b	tools/scylla-nodetool: status: handle nodes without load Some nodes may not have a load yet. Handle this. Also add a test covering this case. Closes scylladb/scylladb#17823	2024-03-16 17:38:53 +02:00
Pavel Emelyanov	42a2dce4b6	test/lib: Eliminate variadic futures from template The assert_that_failed(future) pair of helpers are templates with variadic futures, but since they are gone in seastar, so should they in test/lib Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#17830	2024-03-16 17:37:25 +02:00
Kefu Chai	8bab51733f	db: add fmt::formatter for db::functions::function before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `db::functions::function`. please note, because we use `std::ostream` as the parameter of the polymorphism implementation of `function::print()`. without an intrusive change, we have to use `fmt::ostream_formatter` or at least use similar technique to format the `function` instance into an instance of `ostream` first. so instead of implementing a "native" `fmt::formatter`, in this change, we just use `fmt::ostream_formatter`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17832	2024-03-16 17:36:49 +02:00
Kefu Chai	23e9958ebb	data_dictionary: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17826	2024-03-15 21:17:11 +03:00
Botond Dénes	ad9bad4700	tools/scylla-nodetool: {proxy,table}histograms: handle empty histograms Empty histograms are missing some of the members that non-empty histograms have. The code handling these histograms assumed all required members are always present and thus error out when receiving an empty histogram. Add tests for empty histograms and fix the code handling them to check for the potentially missing members, instead of making assumptions. Closes scylladb/scylladb#17816	2024-03-15 15:59:31 +03:00
Artsiom Mishuta	73ed4c0eb5	test.py: fix aiohttp usage issue in python 3.12 Fix aiohttp usage issue in python 3.12: "Timeout context manager should be used inside a task" This occurs because UnixRESTClient is created in one event loop (created inside pytest) but used in another (created in the rewritten event_loop fixture). It is now fixed by updating the UnixRESTClient object for every new loop. Closes scylladb/scylladb#17760	2024-03-15 11:17:29 +01:00
Nadav Har'El	6cdb68f094	test/cql-pytest: remove unused function Remove an unused function from test/cql-pytest/test_using_timeout.py. Some linters can complain that this function used re.compile(), but the "re" package was never imported. Since this function isn't used, the right fix is to remove it - and not add the missing import. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#17801	2024-03-15 09:56:30 +02:00
Kefu Chai	e1a9340cc1	partition_version: add fmt::formatter for partition_entry::printer before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `partition_entry::printer`, and drop its operator<< . Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17812	2024-03-15 09:52:27 +02:00
Kefu Chai	a0625261ef	build: cmake: reword the comment for dev-headers before this change, the comment was difficult to parse. let's update it for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17814	2024-03-15 09:51:47 +02:00
Kefu Chai	640d573106	schema_mutations: add fmt::formatter for schema_mutations before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `schema_mutations`, and drop its operator<< . Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17815	2024-03-15 09:49:56 +02:00
Kefu Chai	3edd530bd1	test/boost: add formatter for BOOST_REQUIRE_EQUAL before this change, we rely on the homebrew generic formatter to print unordered_set<>, which in turn uses operator<< to format the elements in the `unordered_set`, but we intend to remove this formatter in future, as the last step of #13245 . so, to enable Boost.test to print out lhs and rhs when `BOOST_REQUIRE_EQUAL` check fails, we are adding `boost_test_print_type()` for `unordered_set<fruit>`. the helper function uses {fmt} to print the `unordered_set<>`, so we are adding a fmt::formatter for `fruit`. the operator<< for this type is dropped, as it is not used anymore. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17813	2024-03-15 09:40:22 +02:00
Benny Halevy	530d270828	api: /storage_service/tablets/balancing: fix incorrect operation summary It was probably copy-pasted from /storage_service/tablets/move Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#17811	2024-03-14 22:52:57 +01:00
Tomasz Grabiec	8c5d088928	Merge 'Drop tablets of dropped views and indices' from Benny Halevy This series adds notification before dropping views and indices so that the tablet_allocator can generate mutations to respectively drop all tablets associated with them from system.tablets. Additional unit tests were added for these cases. Note that one case is not yet tested: where a table is allowed to be dropped while having views that depend on it, when it is dropped from the alternator path. This is tested indirectly by testing dropping a table with live secondary index as it follows the same notification path as views in this series. Fixes #17627 Closes scylladb/scylladb#17773 * github.com:scylladb/scylladb: migration_manager: notify before_drop_column_family when dropping indices schema_tables: make_update_indices_mutations: use find_schema to lookup the view of dropped indices migration_manager: notify before_drop_column_family before dropping views cql-pytest: test_tablets: add test_tablets_are_dropped_when_dropping_table tablet_allocator: on_before_drop_column_family: remove unused result variable	2024-03-14 22:52:29 +01:00
Raphael S. Carvalho	c46c2d436f	sstables: Reduce cost for loading sstables with tablets Loader was changed to quickly determine ownership after consuming sharding metadata only. If it's not available, it falls back to reading first and last keys from summary. The fallback is only there for backward compatibility and it costs a lot more as we don't skip to the end where keys are located in summary. With tablets, sharding metadata is only first and last keys so we can do it without sharder. So loader will be able to use it instead of looking up keys in summary. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#17805	2024-03-14 21:06:35 +01:00
Pavel Emelyanov	8ffb5f27c7	topology_coordinator: Clear tablet transition session after streaming When jumping from streaming stage into cleanup_target, session must also be cleared as pending replica may still process some incoming mutations blocked in the pipeline. Deleting session prior to executing barrier makes sure those mutations will not be applied. fixes: #17682 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#17800	2024-03-14 20:35:00 +01:00
Pavel Emelyanov	6a77f36519	doc: Add tablets migration state diagram Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#17790	2024-03-14 20:29:21 +01:00
Benny Halevy	5bfca73b30	migration_manager: notify before_drop_column_family when dropping indices Fixes #17627 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-03-14 20:19:12 +02:00
Benny Halevy	9cf6a2e510	schema_tables: make_update_indices_mutations: use find_schema to lookup the view of dropped indices When dropping indices, we don't need to go through `create_view_for_index` in order to drop the index. That actually creates a new schema for this view which is used just for its metadata for generating mutations dropping it. Instead, use `find_schema` to lookup the current schema for the dropped index. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-03-14 20:19:11 +02:00
Benny Halevy	358e92e645	migration_manager: notify before_drop_column_family before dropping views Call the before_drop_column_family notifications before dropping the views to allow the tablet_allocator to delete the view's tablets. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-03-14 20:14:56 +02:00
Avi Kivity	5e28bf9b5c	Merge 'Do not try to balance tablets on nodes which are known to be down' from Pavel Emelyanov Tablet transition would get stuck anyway for such nodes, so it's not worth trying refs: #16372 (not fixes, because there's also repair transitions with same problem) Closes scylladb/scylladb#17796 * github.com:scylladb/scylladb: topology_coordinator: Skip dead nodes when balancing tablets test: Add test for load_balancer skiplist tablet_allocator: Add skiplist to load_balancer	2024-03-14 18:47:51 +02:00
Avi Kivity	0f188f2d9f	Merge 'tools/scylla-nodetool: implement the status command' from Botond Dénes The status command has an extensive amount of requests to the server. To be able to handle this more easily, the rest api mock server is refactored extensively to be more flexible, accepting expected requests out-of-order. While at it, the rest api mock server also moves away from a deprecated `aiohttp` feature: providing custom router argument to the `aiohttp` app. This forces us to pre-register all API endpoints that any test currently uses, although due to some templating support, this is not as bad as it sounds. Still, this is an annoyance, but at this point we have implemented almost all commands, so this won't be much of a problem going forward. Refs: https://github.com/scylladb/scylladb/issues/15588 Closes scylladb/scylladb#17547 * github.com:scylladb/scylladb: tools/scylla-nodetool: implement the status command test/nodetool: rest_api_mock.py: match requests out-of-order test/nodetool: rest_api_mock.py: remove trailing / from request paths test/nodetool: rest_api_mock.py: use static routes test/nodetool: check only non-exhausted requests tools/scylla-nodetool: repair: set the jobThreads request parameter	2024-03-14 18:42:54 +02:00
Kamil Braun	5ef47c42b3	Merge 'remove_rpc_client_with_ignored_topology: recreate rpc client earlier' from Petr Gusev It's too late to call `remove_rpc_client_with_ignored_topology` on messaging service when a node becomes normal. Data plane requests can be routed to the node much earlier, at least when topology switches to `write_both_read_new`. The `remove_rpc_client_with_ignored_topology` function shuts down sockets and causes such requests to time out. In this PR we move the `remove_rpc_client_with_ignored_topology` call to the earliest point possible when a node first appears in `token_metadata.topology`. From the topology coordinator perspective this happens when a joining node moves to `node_state::bootstrapping` and the topology moves to `transition_state::join_group0`. In `sync_raft_topology_nodes` the node should be contained in transition_nodes. The successful `wait_for_ip` before entering `transition_state::join_group0` ensures that update_topology should find a node's IP and put it into the topology. The barrier in `commit_cdc_generation` will ensure that all nodes in the cluster are using the proper connection parameters. Only outgoing connections are tracked by `remove_rpc_client_with_ignored_topology`, those created by the current node. This means we need to call `remove_rpc_client_with_ignored_topology` on each node of the cluster. fixes scylladb/scylladb#17445 Closes scylladb/scylladb#17757 * github.com:scylladb/scylladb: test_remove_rpc_client_with_pending_requests: add a regression test remove_rpc_client_with_ignored_topology: call it earlier storage_service: decouple remove_rpc_client_with_ignored_topology from notify_joined	2024-03-14 17:20:59 +01:00
Yaniv Kaul	a2ac80340f	Typo: pint -> print Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#17804	2024-03-14 15:50:35 +02:00
Wojciech Mitros	59d5bfa742	mv: fail base writes instead of dropping view updates when overloaded Since `4c767c379c` we can reach a situation where we know that we have admitted too many expensive view update operations and the mechanism of dropping the following view updates can be triggered in a wider range of scenarios. Ideally, we would want to fail whole requests on the coordinator level, but for now, we change the behavior to failing just the base writes. This allows us to avoid creating inconsistencies between base replicas and views at the cost of introducing inconsistencies between different base replicas. This, however, can be fixed by repair, in contrast to base-view inconsistencies which we don't have a good method of fixing. Fixes #17795 Closes scylladb/scylladb#17777	2024-03-14 15:11:45 +02:00
Aleksandra Martyniuk	43ef6e6ab9	test: fix regular compaction tasks check Since `6b87778` regular compaction tasks are removed from task manager immediately after they are finished. test_regular_compaction_task lists compaction tasks and then requests their statuses. Only one regular compaction task is guaranteed to still be running at that time, the rest of them may finish before their status is requested and so it will no longer be in task manager, causing the test to fail. Fix statuses check to consider the possibility of a regular compaction task being removed from task manager. Fixes: #17776. Closes scylladb/scylladb#17784	2024-03-14 14:40:18 +02:00
Piotr Smaron	ad2d039e3d	db: move all group 0 tables to schema commitlog This is to have durability for the group0 tables. But also because I need it specifically to make `system.topology` & `system_schema.scylla_keyspaces` mutations under a single raft command in https://github.com/scylladb/scylladb/pull/16723 Fixes: #15596 Closes scylladb/scylladb#17783	2024-03-14 13:33:30 +01:00
Piotr Dulikowski	2d9e78b09a	gossiper: failure detector: don't handle directly removed live endpoints Commit `0665d9c346` changed the gossiper failure detector in the following way: when live endpoints change and per-node failure detectors finish their loops, the main failure detector calls gossiper::convict for those nodes which were alive when the current iteration of the main FD started but now are not. This was changed in order to make sure that nodes are marked as down, because some other code in gossiper could concurrently remove nodes from the live node lists without marking them properly. This was committed around 3 years ago and the situation changed: - After `75d1dd3a76` the `endpoint_state::_is_alive` field was removed and liveness of a node is solely determined by its presence in the `gossiper::_live_endpoints` field. - Currently, all gossiper code which modifies `_live_endpoints` takes care to trigger relevant callback. The only function which modifies the field but does not trigger notifications is `gossiper::evict_from_membership`, but it is either called after `gossiper::remove_endpoint` which triggers callbacks by itself, or when a node is already dead and there is no need to trigger callbacks. So, it looks like the reasons it was introduced for are not relevant anymore. What's more important though is that it is involved in a bug described in scylladb/scylladb#17515. In short, the following sequence of events may happen: 1. Failure detector for some remote node X decides that it was dead long enough and `convict`s it, causing live endpoints to be updated. 2. The gossiper main loop sends a successful echo to X and decides to mark it as alive. 3. At the same time, failure detector for all nodes other than X finish and main failure detector continues; it notices that node X is not alive (because it was convicted in point 1.) and decides to convict it. 4. Actions planned in 2 and 3 run one after another, i.e. node is first marked as alive and then immediately as dead. This causes `on_alive` callbacks to run first and then `on_dead`. The second one is problematic as it closes RPC connections to node X - in particular, if X is in the process of replacing another node with the same IP then it may cause the replace operation to fail. In order to simplify the code and fix the bug - remove the piece of logic in question. Fixes: scylladb/scylladb#17515 Closes scylladb/scylladb#17754	2024-03-14 13:29:17 +01:00
Botond Dénes	d6103dc1b6	tools/scylla-nodetool: snapshot: handle ks.tbl positional args correctly Nodetool currently assumes that positional arguments are only keyspaces. ks.tbl pairs are only provided when --kt-list or friends are used. This is not the case however. So check positional args too, and if they look like ks.tbl, handle them accordingly. While at it, also make sure that alternator keyspace and tables names are handled correctly. Closes scylladb/scylladb#17480	2024-03-14 13:42:23 +02:00
Avi Kivity	dd76e1c834	Merge 'Simplify error_injection::inject_with_handler()' from Pavel Emelyanov The method in question can have a shorter name that matches all other injections in this class, and can be non-template Closes scylladb/scylladb#17734 * github.com:scylladb/scylladb: error_injection: De-template inject() with handler error_injection: Overload inject() instead of inject_with_handler()	2024-03-14 13:37:54 +02:00
Petr Gusev	2783985bb2	test_remove_rpc_client_with_pending_requests: add a regression test This test reproduces the problem from scylladb/scylladb#17445. It fails quite reliably without the fix from the previous commit. The test just bootstraps a new node while bombarding the cluster with read requests.	2024-03-14 15:17:34 +04:00
Petr Gusev	398e14d6d0	remove_rpc_client_with_ignored_topology: call it earlier In this commit we move the remove_rpc_client_with_ignored_topology call to the earliest point possible - when a node first appears in token_metadata.topology. From the topology coordinator perspective this happens when a joining node moves to node_state::bootstrapping and the topology moves to transition_state::join_group0. In sync_raft_topology_nodes the node should be contained in transition_nodes. The successful wait_for_ip before entering transition_state::join_group0 ensures that update_topology should find a node's IP and put it into the topology. The barrier in commit_cdc_generation will ensure that all nodes in the cluster are using the proper connection parameters. Only outgoing connections are tracked by remove_rpc_client_with_ignored_topology, those created by the current node. This means we need to call remove_rpc_client_with_ignored_topology on each node of the cluster. fixes scylladb/scylladb#17445	2024-03-14 15:10:09 +04:00
Petr Gusev	1b9f21314f	storage_service: decouple remove_rpc_client_with_ignored_topology from notify_joined It's too late to call remove_rpc_client_with_ignored_topology on messaging service when a node becomes normal. Data plane requests can be routed to the node much earlier, at least when topology switches to write_both_read_new. The remove_rpc_client_with_ignored_topology function shuts down sockets and causes such requests to time out. We intend to call remove_rpc_client_with_ignored_topology as soon as a node becomes part of token_metadata topology. In this preparatory commit we refactor storage_service::notify_joined. We remove the remove_rpc_client_with_ignored_topology call from it and call it separately from the two call sites of notify_joined.	2024-03-14 15:10:09 +04:00
Kefu Chai	ce17841860	tools/scylla-nodetool: print bpo::options_description with fmt::streamed before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, since boost::program_options::options_description is defined by boost.program_options library, and it only provides the operator<< overload. we're inclined not to specialize `fmt::formatter` for it at this moment, because * this class is not defined by the scylla project. we would have to find a home for this formatter. * we are not likely to reuse the formatter in multiple places so, in this change we just print it using `fmt::streamed`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17791	2024-03-14 10:44:32 +02:00
Pavel Emelyanov	33d258528e	topology_coordinator: Skip dead nodes when balancing tablets The coordinator can find out which nodes are marked as DOWN, thus when calling tablets balancer it can feed it a skiplist Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-14 10:51:11 +03:00
Pavel Emelyanov	ee55e8442a	test: Add test for load_balancer skiplist The test is inspired by the test_load_balancing_with_empty_node one and verifies that when a node is skiplisted, balancer doesn't put load on it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-14 10:50:21 +03:00
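
The two compaction commits listed above (`2c9b13d2d1` and `2c0b1d1fa7`) describe how the max purgeable timestamp is calculated: a memtable or sstable only holds back tombstone GC if it may contain the key, and the key lookup can be skipped when the source has no data older than the running minimum. The following is a minimal Python sketch of that idea, not the actual C++ implementation. The names `DataSource`, `may_contain`, `MAX_TIMESTAMP` and `max_purgeable_timestamp` are hypothetical and chosen only for illustration; the real code uses `sst->filter_has_key(*hk)` and its own timestamp types.

```python
from dataclasses import dataclass, field

# Hypothetical sentinel meaning "nothing blocks purging".
MAX_TIMESTAMP = 2**63 - 1


@dataclass
class DataSource:
    """Hypothetical stand-in for a memtable or an sstable outside the compaction."""
    min_timestamp: int
    keys: set = field(default_factory=set)

    def may_contain(self, key) -> bool:
        # Stands in for a key-presence check such as a bloom filter lookup.
        return key in self.keys


def max_purgeable_timestamp(key, sources):
    """Return the minimum timestamp among sources that may contain `key`."""
    result = MAX_TIMESTAMP
    for src in sources:
        # Optimization from 2c0b1d1fa7: a source with no data older than the
        # running minimum cannot lower it, so skip the key lookup entirely.
        if src.min_timestamp >= result:
            continue
        # Idea from 2c9b13d2d1: a source only blocks GC if it may hold the key.
        if src.may_contain(key):
            result = src.min_timestamp
    return result


memtable = DataSource(min_timestamp=5, keys={"a"})
sstable = DataSource(min_timestamp=20, keys={"b"})
print(max_purgeable_timestamp("b", [memtable, sstable]))  # 20
print(max_purgeable_timestamp("a", [memtable, sstable]))  # 5
```

In this sketch the memtable holds old data (timestamp 5) but does not contain key `"b"`, so it no longer lowers the result for `"b"`, which mirrors the behavior change the first commit describes.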

41849 Commits