scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-06 23:13:15 +00:00

Author	SHA1	Message	Date
Jenkins Promoter	019eee28f8	Update pgo profiles - x86_64	2026-01-01 04:06:12 +02:00
Gleb Natapov	0c5780590a	raft topology: Notify that a node was removed only once Raft topology goes over all nodes in a 'left' state and triggers 'remove node' notification in case id/ip mapping is available (meaning the node left recently), but the problem is that, since the mapping is not removed immediately, when multiple nodes are removed in succession a notification for the same node can be sent several times. Fix that by sending notification only if the node still exists in the peers table. It will be removed by the first notification and following notification will not be sent. Closes scylladb/scylladb#27743 (cherry picked from commit `4a5292e815`) Closes scylladb/scylladb#27911	2025-12-30 11:24:02 +01:00
Gleb Natapov	207e302bbf	topology coordinator: set session id for streaming at the correct time Commit `d3efb3ab6f` added streaming session for rebuild, but it set the session at request submission time. The session should be set when the request starts executing, so this patch moves it to the correct place. Closes scylladb/scylladb#27757 (cherry picked from commit `04976875cc`) Closes scylladb/scylladb#27865	2025-12-28 13:34:01 +02:00
Ferenc Szili	2537b9b818	test: fix flakyness caused by TRUNCATE retries The test test_truncate_during_topology_change tests TRUNCATE TABLE while bootstrapping a new node. With tablets enabled TRUNCATE is a global topology operation which needs to serialize with bootstrap. When TRUNCATE TABLE is issued, it first checks if there is an already queued truncate for the same table. This can happen if a previous TRUNCATE operation has timed out, and the client retried. The newly issued truncate will only join the queued one if it is waiting to be processed, and will fail immediately if the TRUNCATE is already being processed. In this test, TRUNCATE will be retried after a timeout (1 minute) due to the default retry policy, and will be retried up to 3 times, while the bootstrap is delayed by 2 minutes. This means that the test can validate the result of a truncate which was started after bootstrap was completed. Because of the way truncate joins existing truncate operations, we can also have the following scenario: - TRUNCATE times out after one minute because the new node is being bootstrapped - the client retries the TRUNCATE command which also times out after 1m - the third attempt is received during TRUNCATE being processed which fails the test This patch changes the retry policy of the TRUNCATE operation to FallthroughRetryPolicy which guarantees that TRUNCATE will not be retried on timeout. It also increases the timeout of the TRUNCATE from 1 to 4 minutes. This way the test will actually validate the performance of the TRUNCATE operation which was issued during bootstrap, instead of the subsequent, retried TRUNCATEs which could have been issued after the bootstrap was complete. Fixes: #26347 Closes scylladb/scylladb#27245 (cherry picked from commit `d883ff2317`) Closes scylladb/scylladb#27505	2025-12-23 17:08:54 +02:00
Yaron Kaikov	5ff70064e5	auto-backport.py: modify instruction for making PR ready for review Update the comment sent when a PR has conflicts with clear instructions on how to make the PR ready for review Fixes: https://scylladb.atlassian.net/browse/RELENG-152 Closes scylladb/scylladb#27547 (cherry picked from commit `d3e199984e`) Closes scylladb/scylladb#27563	2025-12-22 15:17:23 +02:00
Anna Stuchlik	ac4d5a0bea	doc: remove the links to the Download Center This commit removes the remaining links to the Download Center on the website. We no longer use it for installation, and we don't want users to infer that something like that still exists. Fixes https://github.com/scylladb/scylladb/issues/27753 Closes scylladb/scylladb#27756 (cherry picked from commit `f65db4e8eb`) Closes scylladb/scylladb#27781	2025-12-21 19:23:26 +02:00
Emil Maskovsky	bfcac7547b	test/raft: fix race condition in failure_detector_test The test had a sporadic failure due to a broken promise exception. The issue was in `test_pinger::ping()` which captured the promise by move into the subscription lambda, causing the promise to be destroyed when the lambda was destroyed during coroutine unwinding. Simplify `test_pinger::ping()` by replacing manual abort_source/promise logic with `seastar::sleep_abortable()`. This removes the risk of promise lifetime/race issues and makes the code simpler and more robust. Fixes: scylladb/scylladb#27136 Backport to active branches: This fixes a CI test issue, so it is beneficial to backport the fix. As this is a test-only fix, it is a low risk change. Closes scylladb/scylladb#27737 (cherry picked from commit `2a75b1374e`) Closes scylladb/scylladb#27780	2025-12-21 14:15:09 +02:00
Patryk Jędrzejczak	0a7d71663a	Merge '[Backport 2025.2] topology_coordinator: handle seastar::abort_requested_exception alongside raft::request_aborted' from Scylladb[bot] In several exception handlers, only `raft::request_aborted` was being caught and rethrown, while `seastar::abort_requested_exception` was falling through to the generic catch(...) block. This caused the exception to be incorrectly treated as a failure that triggers rollback, instead of being recognized as an abort signal. For example, during tablet draining, the error log showed: "tablets draining failed with seastar::abort_requested_exception (abort requested). Aborting the topology operation" This change adds `seastar::abort_requested_exception` handling alongside `raft::request_aborted` in all places where it was missing. When rethrown, these exceptions propagate up to the main `run()` loop where `handle_topology_coordinator_error()` recognizes them as normal abort signals and allows the coordinator to exit gracefully without triggering unnecessary rollback operations. Fixes: scylladb/scylladb#27255 No backport: The problem was only seen in tests and not reported in customer tickets, so it's enough to fix it in the main branch. - (cherry picked from commit `37e3dacf33`) Parent PR: #27314 Closes scylladb/scylladb#27661 * https://github.com/scylladb/scylladb: topology_coordinator: handle seastar::abort_requested_exception alongside raft::request_aborted topology_coordinator: consistently rethrow `raft::request_aborted` for direct/global commands	2025-12-20 19:33:31 +01:00
Patryk Jędrzejczak	eb4523fd03	Merge '[Backport 2025.2] Make direct failure detector verb handler more efficient' from Scylladb[bot] We saw that in large clusters direct failure detector may cause large task queues to be accumulated. The series addresses this issue and also moves the code into the correct scheduling group. Fixes https://github.com/scylladb/scylladb/issues/27142 Backport to all versions where `60f1053087` was backported to since it should improve performance in large clusters. - (cherry picked from commit `82f80478b8`) - (cherry picked from commit `6a6bbbf1a6`) - (cherry picked from commit `86dde50c0d`) Parent PR: #27387 Closes scylladb/scylladb#27480 * https://github.com/scylladb/scylladb: direct_failure_detector: run direct failure detector in the gossiper scheduling group raft: drop invoke_on from the pinger verb handler direct_failure_detector: pass timeout to direct_fd_ping verb idl, message: make with_timeout and cancellable verb attributes composable	2025-12-19 16:48:26 +01:00
Emil Maskovsky	b333141924	topology_coordinator: handle seastar::abort_requested_exception alongside raft::request_aborted In several exception handlers, only raft::request_aborted was being caught and rethrown, while seastar::abort_requested_exception was falling through to the generic catch(...) block. This caused the exception to be incorrectly treated as a failure that triggers rollback, instead of being recognized as an abort signal. For example, during tablet draining, the error log showed: "tablets draining failed with seastar::abort_requested_exception (abort requested). Aborting the topology operation" This change adds seastar::abort_requested_exception handling alongside raft::request_aborted in all places where it was missing. When rethrown, these exceptions propagate up to the main run() loop where handle_topology_coordinator_error() recognizes them as normal abort signals and allows the coordinator to exit gracefully without triggering unnecessary rollback operations. Fixes: scylladb/scylladb#27255 (cherry picked from commit `37e3dacf33`)	2025-12-19 16:26:01 +01:00
Michael Litvak	9c0528fc9f	view_builder: reduce log level for expected aborts during view creation When draining the view builder, we abort ongoing operations using the view builder's abort source, which may cause them to fail with abort_requested_exception or raft::request_aborted exceptions. Since these failures are expected during shutdown, reduce the log level in add_new_view from 'error' to 'debug' for these specific exceptions while keeping 'error' level for unexpected failures. Closes scylladb/scylladb#26297 (cherry picked from commit `6bc41926e2`) Closes scylladb/scylladb#27538	2025-12-18 16:24:35 +01:00
Emil Maskovsky	789eb79a6a	topology_coordinator: consistently rethrow `raft::request_aborted` for direct/global commands Ensure all direct and global topology commands rethrow the `raft::request_aborted` exception when aborted, typically due to leadership changes. This makes abortion explicit to callers, enabling proper handling such as retries or workflow termination. This change completes the work started in PR scylladb/scylladb#23962, covering all remaining cases where the exception was not rethrown. Fixes: scylladb/scylladb#23589 (cherry picked from commit `943af1ef1c`)	2025-12-17 16:20:41 +01:00
Jenkins Promoter	cc39e23be4	Update pgo profiles - aarch64	2025-12-15 04:40:18 +02:00
Jenkins Promoter	11b899680a	Update pgo profiles - x86_64	2025-12-15 04:03:01 +02:00
Jenkins Promoter	f11c584264	Update ScyllaDB version to: 2025.2.6	2025-12-14 14:30:35 +02:00
Benny Halevy	f06269dfca	utils: error_injection: wait_for_message: print injection_name and caller source_location on timeout When waiting for the condition variable times out we call on_internal_error, but unfortunately, the backtrace it generates is obfuscated by `coroutine_handle<seastar::internal::coroutine_traits_base<void>::promise_type>::resume`. To make the log more useful, print the error injection name and the caller's source_location in the timeout error message. Fixes #27531 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#27532 (cherry picked from commit `5f13880a91`) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#27581	2025-12-12 14:20:30 +01:00
Anna Stuchlik	3553ae91c9	replace the Driver pages with a link to the new Drivers pages This commit removes the now redundant driver pages from the ScyllaDB documentation. Instead, the link to the pages where we moved the driver information is added. Also, the links are updated across the ScyllaDB manual. Redirections are added for all the removed pages. Fixes https://github.com/scylladb/scylladb/issues/26871 Closes scylladb/scylladb#27277 (cherry picked from commit `c5580399a8`) Closes scylladb/scylladb#27438	2025-12-12 10:30:34 +01:00
Yaron Kaikov	4c20e64f4f	Add JIRA issue validation to backport PR fixes check Extend the Fixes validation pattern to also accept JIRA issue references (format: [A-Z]+-\d+) in addition to GitHub issue references. This allows backport PRs to reference JIRA issues in the format 'Fixes: PROJECT-123'. Fixes: https://github.com/scylladb/scylladb/issues/27571 Closes scylladb/scylladb#27572 (cherry picked from commit `3dfa5ebd7f`) Closes scylladb/scylladb#27597	2025-12-12 09:36:32 +02:00
Gleb Natapov	9ca1439b78	direct_failure_detector: run direct failure detector in the gossiper scheduling group When direct failure detector was introduced the idea was that it will run on the same connection raft group0 verbs are running, but in `60f1053087` raft verbs were moved to run on the gossiper connection while DIRECT_FD_PING was left where it was. This patch moves it to the gossiper connection as well and fixes the pinger code to run in the gossiper scheduling group. (cherry picked from commit `86dde50c0d`)	2025-12-09 16:55:00 +02:00
Gleb Natapov	1ee9d1cca5	raft: drop invoke_on from the pinger verb handler Currently raft direct pinger verb jumps to shard 0 to check if group0 is alive before replying. The verb runs relatively often, so it is not very efficient. The patch distributes group0 liveness information (as it changes) to all shard instead, so that the handler itself does not need to jump to shard 0. (cherry picked from commit `6a6bbbf1a6`)	2025-12-09 16:55:00 +02:00
Gleb Natapov	d6ec1ab5b9	direct_failure_detector: pass timeout to direct_fd_ping verb Currently direct_fd_ping runs without a timeout. The verb is not waited on forever: the wait is canceled after a timeout, but this timeout is simply not passed to the rpc. It may create a situation where the rpc callback runs on a destination but is no longer waited on. Change the code to pass the timeout to the rpc as well and return earlier from the rpc handler if the timeout is reached by the time the callback is called. This is backwards compatible since the timeout is passed as optional. (cherry picked from commit `82f80478b8`)	2025-12-09 14:53:32 +02:00
Benny Halevy	61dd81539b	idl, message: make with_timeout and cancellable verb attributes composable And define `send_message_timeout_cancellable` in rpc_protocol_impl.hh using the newly introduced rpc_handler entry point in seastar that accepts both timeout and cancellable params. Note that the interface to the user still uses abort_source while internally the function allocates a seastar::rpc::cancellable object. It is possible to provide an interface that will accept a rpc::cancellable& from the caller, but the existing messaging api uses abort_source. Changing it may be considered in the future. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `0b97806771`)	2025-12-09 14:53:32 +02:00
Tomasz Grabiec	104cad1424	Merge '[Backport 2025.2] address_map: Use more efficient and reliable replication method' from Scylladb[bot] Primary issue with the old method is that each update is a separate cross-shard call, and all later updates queue behind it. If one of the shards has high latency for such calls, the queue may accumulate and system will appear unresponsive for mapping changes on non-zero shards. This happened in the field when one of the shards was overloaded with sstables and compaction work, which caused frequent stalls which delayed polling for ~100ms. A queue of 3k address updates accumulated, because we update mapping on each change of gossip states. This made bootstrap impossible because nodes couldn't learn about the IP mapping for the bootstrapping node and streaming failed. To protect against that, use a more efficient method of replication which requires a single cross-shard call to replicate all prior updates. It is also more reliable, if replication fails transiently for some reason, we don't give up and fail all later updates. Fixes #26865 - (cherry picked from commit `ed8d127457`) - (cherry picked from commit `4a85ea8eb2`) - (cherry picked from commit `f83c4ffc68`) Parent PR: #26941 Closes scylladb/scylladb#27187 * github.com:scylladb/scylladb: address_map: Use barrier() to wait for replication address_map: Use more efficient and reliable replication method utils: Introduce helper for replicated data structures utils: add "fatal" version of utils::on_internal_error() scylla-2025.2.5-candidate-20251209122041 scylla-2025.2.5	2025-12-05 20:37:14 +01:00
Tomasz Grabiec	6c942a87c7	address_map: Use barrier() to wait for replication More efficient than 100 pings. There was one ping in test which was done "so this shard notices the clock advance". It's not necessary, since observing a completed SMP call implies that the local shard sees the clock advancement done within it. (cherry picked from commit `f83c4ffc68`)	2025-12-05 13:25:14 +01:00
Tomasz Grabiec	db61dfb837	address_map: Use more efficient and reliable replication method Primary issue with the old method is that each update is a separate cross-shard call, and all later updates queue behind it. If one of the shards has high latency for such calls, the queue may accumulate and system will appear unresponsive for mapping changes on non-zero shards. This happened in the field when one of the shards was overloaded with sstables and compaction work, which caused frequent stalls which delayed polling for ~100ms. A queue of 3k address updates accumulated. This made bootstrap impossible, since nodes couldn't learn about the IP mapping for the bootstrapping node and streaming failed. To protect against that, use a more efficient method of replication which requires a single cross-shard call to replicate all prior updates. It is also more reliable, if replication fails transiently for some reason, we don't give up and fail all later updates. Fixes #26865 Fixes #26835 (cherry picked from commit `4a85ea8eb2`)	2025-12-05 13:25:14 +01:00
Tomasz Grabiec	3cc75afbbe	utils: Introduce helper for replicated data structures Key goals: - efficient (batching updates) - reliable (no lost updates) Will be used in data structures maintained on one designated owning shard and replicated to other shards. (cherry picked from commit `ed8d127457`)	2025-12-05 13:25:14 +01:00
Nadav Har'El	9d27db5e98	utils: add "fatal" version of utils::on_internal_error() utils::on_internal_error() is a wrapper for Seastar's on_internal_error() which does not require a logger parameter - because it always uses one logger ("on_internal_error"). Not needing a unique logger is especially important when using on_internal_error() in a header file, where we can't define a logger. Seastar also has another similar function, on_fatal_internal_error(), for which we forgot to implement a "utils" version (without a logger parameter). This patch fixes that oversight. In the next patch, we need to use on_fatal_internal_error() in a header file, so the "utils" version will be useful. We will need the fatal version because we will encounter an unexpected situation during server destruction, and if we let the regular on_internal_error() just throw an exception, we'll be left in an undefined state. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `33476c7b06`)	2025-12-05 13:25:14 +01:00
Avi Kivity	79aa95bf76	database: fix overflow when computing data distribution over shards We store the per-shard chunk count in a uint64_t vector global_offset, and then convert the counts to offsets with a prefix sum: ```c++ // [1, 2, 3, 0] --> [0, 1, 3, 6] std::exclusive_scan(global_offset.begin(), global_offset.end(), global_offset.begin(), 0, std::plus()); ``` However, std::exclusive_scan takes the accumulator type from the initial value, 0, which is an int, instead of from the range being iterated, which is of uint64_t. As a result, the prefix sum is computed as a 32-bit integer value. If it exceeds 0x8000'0000, it becomes negative. It is then extended to 64 bits and stored. The result is a huge 64-bit number. Later on we try to find an sstable with this chunk and fail, crashing on an assertion. An example of the failure can be seen here: https://godbolt.org/z/6M8aEbo57 The fix is simple: the initial value is passed as uint64_t instead of int. Fixes https://github.com/scylladb/scylladb/issues/27417 Closes scylladb/scylladb#27418 (cherry picked from commit `9696ee64d0`)	2025-12-04 20:19:39 +02:00
Jenkins Promoter	284bd9fc01	Update ScyllaDB version to: 2025.2.5	2025-12-04 15:49:32 +02:00
Pavel Emelyanov	60a70389a3	Update seastar submodule (SIGABRT on assertion) * seastar 450e36d5d...7c86e59c7 (1): > util: make SEASTAR_ASSERT() failure generate SIGABRT Fixes #27127 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#27404	2025-12-04 13:01:07 +03:00
Calle Wilund	7338a42cc1	commitlog::read_log_file: Check for eof position on all data reads Fixes #24346 When reading, we check for each entry and each chunk, if advancing there will hit EOF of the segment. However, IFF the last chunk being read has the last entry _exactly_ matching the chunk size, and the chunk ending at _exactly_ segment size (preset size, typically 32Mb), we did not check the position, and instead complained about not being able to read. This has literally _never_ happened in actual commitlog (that was replayed at least), but has apparently happened more and more in hints replay. Fix is simple, just check the file position against size when advancing said position, i.e. when reading (skipping already does). v2: * Added unit test Closes scylladb/scylladb#27236 (cherry picked from commit `59c87025d1`) Closes scylladb/scylladb#27340	2025-12-03 12:23:06 +03:00
Aleksandra Martyniuk	43963c47e6	replica: database: change type of tables_metadata::_ks_cf_to_uuid If there is a lot of tables, a node reports oversized allocation in _ks_cf_to_uuid of type flat_hash_map. Change the type to std::unordered_map to prevent oversized allocations. Fixes: https://github.com/scylladb/scylladb/issues/26787. Closes scylladb/scylladb#27165 (cherry picked from commit `19a7d8e248`) Closes scylladb/scylladb#27195	2025-12-03 12:22:46 +03:00
Ernest Zaslavsky	7671134209	streaming:: add more logging Start logging all missed streaming options like `scope` and `skip_reshape` flags Fixes: https://github.com/scylladb/scylladb/issues/27299 Closes scylladb/scylladb#27311 (cherry picked from commit `1d5f60baac`) Closes scylladb/scylladb#27338	2025-12-02 12:17:48 +01:00
Jenkins Promoter	af570f75f1	Update pgo profiles - aarch64	2025-12-01 04:40:02 +02:00
Jenkins Promoter	3e6ef72872	Update pgo profiles - x86_64	2025-11-30 21:06:59 -05:00
Patryk Jędrzejczak	4a28929d40	Merge '[Backport 2025.2] locator/node: include _excluded in missing places' from Scylladb[bot] We currently ignore the `_excluded` field in `node::clone()` and the verbose formatter of `locator::node`. The first one is a bug that can have unpredictable consequences on the system. The second one can be a minor inconvenience during debugging. We fix both places in this PR. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-72 This PR is a bugfix that should be backported to all supported branches. - (cherry picked from commit `4160ae94c1`) - (cherry picked from commit `287c9eea65`) Parent PR: #27265 Closes scylladb/scylladb#27289 * https://github.com/scylladb/scylladb: locator/node: include _excluded in verbose formatter locator/node: preserve _excluded in clone()	2025-11-27 12:31:06 +01:00
Avi Kivity	af451b7997	Merge '[Backport 2025.2] fix notification about expiring erm held for to long' from Scylladb[bot] Commit `6e4803a750` broke notification about expired erms held for too long since it resets the tracker without calling its destructor (where notification is triggered). Fix the assign operator to call the destructor like it should. Fixes https://github.com/scylladb/scylladb/issues/27141 - (cherry picked from commit `9f97c376f1`) - (cherry picked from commit `5dcdaa6f66`) Parent PR: #27140 Closes scylladb/scylladb#27274 * github.com:scylladb/scylladb: test: test that expired erm that held for too long triggers notification token_metadata: fix notification about expiring erm held for to long	2025-11-27 12:24:56 +02:00
Patryk Jędrzejczak	0b2d303c00	locator/node: include _excluded in verbose formatter It can be helpful during debugging. (cherry picked from commit `287c9eea65`)	2025-11-26 23:04:12 +00:00
Patryk Jędrzejczak	d8b476e39f	locator/node: preserve _excluded in clone() We currently ignore the `_excluded` field in `clone()`. Losing information about exclusion can have unpredictable consequences. One observed effect (that led to finding this issue) is that the `/storage_service/nodes/excluded` API endpoint sometimes misses excluded nodes. (cherry picked from commit `4160ae94c1`)	2025-11-26 23:04:12 +00:00
Gleb Natapov	18dcbc1783	test: test that expired erm that held for too long triggers notification (cherry picked from commit `5dcdaa6f66`)	2025-11-26 15:08:07 +00:00
Gleb Natapov	9ff81cabed	token_metadata: fix notification about expiring erm held for to long Commit `6e4803a750` broke notification about expired erms held for too long since it resets the tracker without calling its destructor (where notification is triggered). Fix assign operator to call destructor. (cherry picked from commit `9f97c376f1`)	2025-11-26 15:08:07 +00:00
Ernest Zaslavsky	c1d83f47b3	streaming: fix loop break condition in tablet_sstable_streamer::stream Correct the loop termination logic that previously caused certain SSTables to be prematurely excluded, resulting in lost mutations. This change ensures all relevant SSTables are properly streamed and their mutations preserved. (cherry picked from commit `dedc8bdf71`) Closes scylladb/scylladb#27150 Fixes: #26979 Parent PR: #26980 Unfortunately the pytest based test cannot be ported back because of changes made to the testing harness and scylla-tools	2025-11-25 11:58:12 +03:00
Avi Kivity	b619fe2882	tools: toolchain: prepare: replace 'reg' with 'skopeo' The prepare scripts uses 'reg' to verify we're not going to overwrite an existing image. The 'reg' command is not available in Fedora 43. Use 'skopeo' instead. Skopeo is part of the podman ecosystem so hopefully will live longer. Fixes #27178. Closes scylladb/scylladb#27179 (cherry picked from commit `d6ef5967ef`) Closes scylladb/scylladb#27196	2025-11-24 19:57:38 +02:00
Raphael S. Carvalho	6ecee4deba	replica: Fail timed-out single-key read on cleaned up tablet replica Consider the following: 1) single-key read starts, blocks on replica e.g. waiting for memory. 2) the same replica is migrated away 3) single-key read expires, coordinator abandons it, releases erm. 4) migration advances to cleanup stage, barrier doesn't wait on timed-out read 5) compaction group of the replica is deallocated on cleanup 6) that single-key resumes, but doesn't find sstable set (post cleanup) 7) with abort-on-internal-error turned on, node crashes It's fine for abandoned (= timed out) reads to fail, since the coordinator is gone. For active reads (non timed out), the barrier will wait for them since their coordinator holds erm. This solution consists of failing reads which underlying tablet replica has been cleaned up, by just converting internal error to plain exception. Fixes #26229. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#27078 (cherry picked from commit `74ecedfb5c`) Closes scylladb/scylladb#27152 scylla-2025.2.4 scylla-2025.2.4-candidate-20251123021921	2025-11-21 17:46:21 +03:00
Patryk Jędrzejczak	5e237c0d0e	test: test_raft_recovery_stuck: ensure mutual visibility before using driver Not waiting for nodes to see each other as alive can cause the driver to fail the request sent in `wait_for_upgrade_state()`. scylladb/scylladb#19771 has already replaced concurrent restarts with `ManagerClient.rolling_restart()`, but it has missed this single place, probably because we do concurrent starts here. Fixes #27055 Closes scylladb/scylladb#27075 (cherry picked from commit `e35ba974ce`) Closes scylladb/scylladb#27108	2025-11-20 10:45:08 +02:00
Botond Dénes	ad3d182e90	Merge '[Backport 2025.2] Automatic cleanup improvements' from Scylladb[bot] This series allows an operator to reset the 'cleanup needed' flag if they have already cleaned up the node, so that automatic cleanup will not do it again. We also change 'nodetool cleanup' back to run cleanup on one node only (and reset the 'cleanup needed' flag in the end), but the new '--global' option allows running cleanup on all nodes that need it simultaneously. Fixes https://github.com/scylladb/scylladb/issues/26866 Backport to all supported versions since automatic cleanup behaviour as it is now may create load the operator does not expect during cluster resizing. - (cherry picked from commit `e872f9cb4e`) - (cherry picked from commit `0f0ab11311`) Parent PR: #26868 Closes scylladb/scylladb#27091 * github.com:scylladb/scylladb: cleanup: introduce "nodetool cluster cleanup" command to run cleanup on all dirty nodes in the cluster cleanup: Add RESTful API to allow reset cleanup needed flag	2025-11-20 10:44:35 +02:00
Botond Dénes	a7856c3d52	Merge '[Backport 2025.2] service/qos: Fall back to default scheduling group when using maintenance socket' from Scylladb[bot] The service level controller relies on `auth::service` to collect information about roles and the relation between them and the service levels (those attached to them). Unfortunately, the service level controller is initialized way earlier than `auth::service` and so we had to prevent potential invalid queries of user service levels (cf. `46193f5e79`). Unfortunately, that came at a price: it made the maintenance socket incompatible with the current implementation of the service level controller. The maintenance socket starts early, before the `auth::service` is fully initialized and registered, and is exposed almost immediately. If the user attempts to connect to Scylla within this time window, via the maintenance socket, one of the things that will happen is choosing the right service level for the connection. Since the `auth::service` is not registered, Scylla will fail an assertion and crash. A similar scenario occurs when using maintenance mode. The maintenance socket is how the user communicates with the database, and we're not prepared for that either. To avoid unnecessary crashes, we add new branches if the passed user is absent or if it corresponds to the anonymous role. Since the role corresponding to a connection via the maintenance socket is the anonymous role, that solves the problem. Some accesses to `auth::service` are not affected and we do not modify those. Fixes scylladb/scylladb#26816 Backport: yes. This is a fix of a regression. - (cherry picked from commit `c0f7622d12`) - (cherry picked from commit `222eab45f8`) - (cherry picked from commit `394207fd69`) - (cherry picked from commit `b357c8278f`) Parent PR: #26856 Closes scylladb/scylladb#27034 * github.com:scylladb/scylladb: test/cluster/test_maintenance_mode.py: Wait for initialization test: Disable maintenance mode correctly in test_maintenance_mode.py test: Fix keyspace in test_maintenance_mode.py service/qos: Do not crash Scylla if auth_integration absent	2025-11-20 10:43:18 +02:00
Gleb Natapov	86d6e759bc	cleanup: introduce "nodetool cluster cleanup" command to run cleanup on all dirty nodes in the cluster `97ab3f6622` changed "nodetool cleanup" (without arguments) to run cleanup on all dirty nodes in the cluster. This was somewhat unexpected, so this patch changes it back to run cleanup on the target node only (and reset "cleanup needed" flag afterwards) and it adds "nodetool cluster cleanup" command that runs the cleanup on all dirty nodes in the cluster. (cherry picked from commit `0f0ab11311`)	2025-11-19 10:35:39 +02:00
Gleb Natapov	80d92d68e0	cleanup: Add RESTful API to allow reset cleanup needed flag Cleaning up a node using per keyspace/table interface does not reset cleanup needed flag in the topology. The assumption was that running cleanup on already clean node does nothing and completes quickly. But due to https://github.com/scylladb/scylladb/issues/12215 (which is closed as WONTFIX) this is not the case. This patch provides the ability to reset the flag in the topology if operator cleaned up the node manually already. (cherry picked from commit `e872f9cb4e`)	2025-11-19 10:14:38 +02:00
Avi Kivity	6206b57008	Merge '[Backport 2025.2] Synchronize tablet split and load-and-stream' from Scylladb[bot] Load-and-stream is broken when running concurrently to the finalization step of tablet split. Consider this: 1) split starts 2) split finalization executes barrier and succeed 3) load-and-stream runs now, starts writing sstable (pre-split) 4) split finalization publishes changes to tablet metadata 5) load-and-stream finishes writing sstable 6) sstable cannot be loaded since it spans two tablets two possible fixes (maybe both): 1) load-and-stream awaits for topology to quiesce 2) perform split compaction on sstable that spans both sibling tablets This patch implements # 1. By awaiting for topology to quiesce, we guarantee that load-and-stream only starts when there's no chance coordinator is handling some topology operation like split finalization. Fixes https://github.com/scylladb/scylladb/issues/26455. - (cherry picked from commit `3abc66da5a`) - (cherry picked from commit `4654cdc6fd`) Parent PR: #26456 Closes scylladb/scylladb#26647 * github.com:scylladb/scylladb: sstables_loader: Don't bypass synchronization with busy topology test: Add reproducer for l-a-s and split synchronization issue sstables_loader: Synchronize tablet split and load-and-stream	2025-11-17 17:15:37 +02:00
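
Note on commit 79aa95bf76 (database: fix overflow when computing data distribution over shards): the accumulator-type pitfall that commit describes can be reproduced in isolation. The following is a minimal sketch, not the ScyllaDB code; the function names `offsets_with_int_init` and `offsets_with_u64_init` are hypothetical and exist only for illustration. It assumes C++17 or later and a platform with a 32-bit `int`.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper mirroring the problematic call: the initial value 0 is
// an int, so std::exclusive_scan keeps the running sum in an int accumulator
// even though the elements are uint64_t.
std::vector<uint64_t> offsets_with_int_init(const std::vector<uint64_t>& counts) {
    std::vector<uint64_t> offsets(counts.size());
    std::exclusive_scan(counts.begin(), counts.end(), offsets.begin(),
                        0, std::plus<>{});
    return offsets;
}

// Hypothetical helper mirroring the fix: the initial value is a uint64_t, so
// the accumulator has the same width as the elements being summed.
std::vector<uint64_t> offsets_with_u64_init(const std::vector<uint64_t>& counts) {
    std::vector<uint64_t> offsets(counts.size());
    std::exclusive_scan(counts.begin(), counts.end(), offsets.begin(),
                        uint64_t{0}, std::plus<>{});
    return offsets;
}
```

For small inputs such as `{1, 2, 3, 0}` both helpers return `{0, 1, 3, 6}`, which is why the problem is easy to miss. Once the running sum reaches 0x8000'0000, the `int` variant is expected to store a sign-extended, very large 64-bit value on common compilers, while the `uint64_t` variant keeps the correct offset.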


48219 Commits