scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-22 15:52:13 +00:00

Author	SHA1	Message	Date
Jenkins Promoter	c19c5d96bf	Update pgo profiles - aarch64	2026-05-15 05:03:33 +03:00
Botond Dénes	57ff84f9e7	schema: fix DESCRIBE showing NullCompactionStrategy when compaction is disabled When a table's compaction is disabled via 'enabled': 'false', the DESCRIBE output incorrectly showed NullCompactionStrategy instead of the actual strategy. This happened because schema_properties() called compaction_strategy(), which returns compaction_strategy_type::null when compaction is disabled. Fix it by using configured_compaction_strategy(), which always returns the real strategy type - consistent with how schema_tables.cc serializes it to disk. Fixes SCYLLADB-1353 Closes scylladb/scylladb#29804 (cherry picked from commit `8d6f031a4a`) Closes scylladb/scylladb#29867 Closes scylladb/scylladb#29886	2026-05-14 17:16:24 +02:00
Wojciech Mitros	d8a9fdddbd	test: run test_mv_admission_control_exception on one shard In the test we perform 2 consecutive writes where the first write is supposed to increase the view update backlog above the mv admission control threshold and the second one is expected to be rejected because of that. On each node/shard we have 2 types of view update backlogs: 1. for deciding whether we should admit writes 2. for propagating the backlog information to other nodes/shards. For the second write to be rejected, it must be performed on a node and shard which updated its backlog of type 1. The view update backlog of type 2. is immediately increased on the base table replica. For this backlog to be registered as a backlog of type 1., it needs to be either carried by gossip (happening once every second) or by attaching it to a replica write response. We don't want to increase the runtime of tests unnecessarily, so we don't wait and we rely on the second mechanism. The response to the first base table write (the one causing increase in the backlog) carries the increased backlog to the coordinator of this write. So for the second write to observe the increased backlog, it needs to be coordinated on the same node+shard as the first write. We make sure that both writes are coordinated on the same node+shard by using prepared statements combined with setting the host in `run_async`. Both writes target the same partition and with prepared statements we route them directly to the correct shard. That was the idea, at least. In practice, for the driver to learn the correct shard, it first needs to learn the token->shard mapping from the server. For vnodes it can expect a shard by calculating the token of the affected partition, but for tablets, it had no opportunity to learn the tablet->shard mapping so the first write may route to any shard. Additionally, we aren't guaranteed that the driver established connections to all shards on all nodes at the point of any write. 
So if a connection finishes establishing between the two writes, this may also cause us to coordinate these 2 writes on different shards, leading to a missed view backlog growth and not-rejected second write. We fix this in this patch by running the test using one shard on each node. This way, as long as we perform both writes on the same node, they'll also be coordinated on the same shard. This also makes the prepared statement and BoundStatement unnecessary — we can use SimpleStatement with FallthroughRetryPolicy directly. Fixes: SCYLLADB-1957 Closes scylladb/scylladb#29862 (cherry picked from commit `f3cf20803b`) Closes scylladb/scylladb#29873 Closes scylladb/scylladb#29879	2026-05-14 17:15:36 +02:00
Ferenc Szili	1a7ca53620	test: fix flaky test_tablets_split_merge_with_many_tables In debug mode, this test can time out during tablets merge. While the test already decreases the number of tables in debug mode (20 tables, instead of 200 for dev mode), this is not enough, and the test can still time out during merge. This change reduces the number of tables from 20 to 5 in debug mode. It also drops the log level for load_balancer to debug. This should make any potential future problems with this test easier to investigate. Fixes: SCYLLADB-1863 Closes scylladb/scylladb#29682 (cherry picked from commit `ec4b483e88`) Closes scylladb/scylladb#29786 Closes scylladb/scylladb#29887	2026-05-14 17:14:05 +02:00
Gleb Natapov	de9cbc89bc	schema: ensure group0_schema_version is set on boot Starting from 2024.2, schema versions are assigned by group0 using raft state IDs instead of being calculated by hashing schema mutations with MD5. The assigned version is persisted in system.scylla_local as 'group0_schema_version'. When this value is present, it is used as the schema version; otherwise, the legacy calculate_schema_digest() hash is used as a fallback. If a cluster was upgraded from a version before 2024.2 and no schema change was performed after the upgrade, group0_schema_version will never have been written, keeping the cluster permanently dependent on the legacy hashing code path. This prevents us from dropping that code. Add migration_manager::ensure_group0_schema_version_is_set() which is called during join_cluster after finish_setup_after_join completes (in both the raft-topology and non-raft-topology paths). It checks whether group0_schema_version is set. If not, and the GROUP0_SCHEMA_VERSIONING feature is enabled, it performs a no-op schema change through group0 (announce with an empty mutation set). The announce() function automatically appends the group0_schema_version mutation, so this is sufficient to persist the version on all nodes via raft. A double-check is performed: once before acquiring the group0 guard (to avoid unnecessary work), and once after (since start_group0_operation includes a raft barrier, ensuring any concurrently committed changes from other nodes are applied locally). This handles the case where multiple nodes restart simultaneously and race to set the version. Closes scylladb/scylladb#29858	2026-05-14 15:59:07 +02:00
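The double-check described in the commit above (once before taking the guard, once after) can be illustrated with a small Python sketch. This is not the C++ implementation: `Node`, `ensure_version_is_set`, and the plain lock standing in for the group0 guard are hypothetical names chosen for illustration.

```python
import threading

class Node:
    """Hypothetical stand-in for state protected by a group0-like guard."""

    def __init__(self):
        self.version = None            # plays the role of group0_schema_version
        self.guard = threading.Lock()  # plays the role of the group0 guard
        self.writes = 0                # how many times the version was written

    def ensure_version_is_set(self, new_version):
        # First check: avoid taking the guard when the value already exists.
        if self.version is not None:
            return
        with self.guard:
            # Second check: a concurrent caller may have set the value
            # while this one was waiting for the guard.
            if self.version is not None:
                return
            self.version = new_version
            self.writes += 1

node = Node()
threads = [
    threading.Thread(target=node.ensure_version_is_set, args=(f"v{i}",))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

However the threads interleave, only one of them performs the write, which mirrors the case of several nodes restarting at once and racing to set the version.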
Botond Dénes	cb26a837ba	Merge '[Backport 2026.1] Rare "Vector Store request was aborted"' from Karol Nowacki To ensure the high-availability logic fits within the default 10s CQL timeout, the default unreachable node detection time has been decreased from 5s to 3s. Consequently, this value has been decoupled from the read_request_timeout and dedicated configuration options have been introduced: vector_store_unreachable_node_detection_time_in_ms. Fixes SCYLLADB-1855 Backports to 2025.4 and 2026.1 are required, as spurious CQL timeouts in high-availability scenarios are affecting these releases in vector search XCloud deployments. - (cherry picked from commit `9269ca9cf7`) - (cherry picked from commit `c643f321af`) Parent PR: https://github.com/scylladb/scylladb/pull/28675 Closes scylladb/scylladb#29773 * github.com:scylladb/scylladb: vector_search: decrease default connection timeout to 3s vector_search: add unreachable node detection time config	2026-05-14 14:51:59 +03:00
Karol Nowacki	96a8af14a7	vector_search: decrease default connection timeout to 3s Decrease the default connection timeout to 3s to better align with the default CQL query timeout of 10s. The previous timeout allowed only one failover request in high availability scenario before hitting the CQL query timeout. By decreasing the timeout to 3s, we can perform up to three failover requests within the CQL query timeout, which significantly improves the chances of successfully completing the query in high availability scenarios. Fixes: SCYLLADB-95 (cherry picked from commit `c643f321af`)	2026-05-14 05:41:55 +00:00
Karol Nowacki	66a6541bc5	vector_search: add unreachable node detection time config Add option `vector_store_unreachable_node_detection_time_in_ms` to control parameters related to detecting unreachable vector store nodes. This parameter is used to set the TCP connect timeout, keepalive parameters, and TCP_USER_TIMEOUT. By configuring these parameters, we can detect unreachable vector store nodes faster and trigger failover mechanisms in a timely manner. (cherry picked from commit `9269ca9cf7`)	2026-05-13 16:20:48 +02:00
Raphael S. Carvalho	399e0f8cb7	mutation_compactor: Fix tombstone GC metrics to account for only expired There are 3 metrics (that go in every compaction_history entry): total_tombstone_purge_attempt, total_tombstone_purge_failure_due_to_overlapping_with_memtable, total_tombstone_purge_failure_due_to_overlapping_with_uncompacting_sstable. When a tombstone is not expired (e.g. doesn't satisfy "gc_before" or grace period), it can currently be accounted as a failure due to overlapping with either memtable or uncompacting sstable. So the last 2 metrics contain noise from unexpired tombstones. What we should do is to only account for expired tombstones in all 3 metrics. We lose the info about the number of tombstones processed by compaction; now we'll only know about the expired ones. But those metrics were primarily added for explaining why expired tombstones cannot be removed. We could have alternatively added a new field purge_failure_due_to_being_unexpired or something, but it requires adding a new field to compaction_history. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-737. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#28669 (cherry picked from commit `f33f324f77`) (cherry picked from commit `acbdf34e5a`) Closes scylladb/scylladb#28744	2026-05-13 16:18:00 +03:00
Jenkins Promoter	044240dfb2	Update ScyllaDB version to: 2026.1.4	2026-05-13 13:47:15 +03:00
Patryk Jędrzejczak	35063d6d74	Merge 'topology_coordinator: join tablet load stats refresh in stop()' from Andrzej Jackowski Commit `2b7aa32` (topology_coordinator: Refresh load stats after table is created or altered) registered topology_coordinator as a schema change listener and added on_create_column_family which fire-and-forgets _tablet_load_stats_refresh.trigger(). The triggered task runs on the gossip scheduling group via with_scheduling_group and accesses the topology_coordinator via 'this'. stop() unregisters the listener but does not wait for any in-flight refresh task. If a notification fires between _tablet_load_stats_refresh.join() in run() and unregister_listener in stop(), the scheduled task can outlive the topology_coordinator and access freed memory after run_topology_coordinator's coroutine frame is destroyed. Wait for the refresh to complete in stop() after unregistering the listener, ensuring no task can fire after destruction. Fixes SCYLLADB-1728 Backport to 2026.1 and 2026.2, because the issue was introduced in `2b7aa32` Closes scylladb/scylladb#29653 * https://github.com/scylladb/scylladb: test: tablet_stats: reproduce shutdown refresh race topology_coordinator: join tablet load stats refresh in stop() (cherry picked from commit `d9dd3bfe53`) Closes scylladb/scylladb#29686 Co-authored-by: Andrzej Jackowski <andrzej.jackowski@scylladb.com> Closes scylladb/scylladb#29811	2026-05-13 09:42:09 +03:00
Botond Dénes	8d67d8185a	Merge '[Backport 2026.1] test: fix flaky test_incremental_repair_race_window_promotes_unrepaired_data' from Raphael Raph Carvalho The test waited for two "Finished tablet repair" log messages on the coordinator, expecting one per tablet. But there are two log sources that emit messages matching this pattern: repair module (repair/repair.cc:2329): "Finished tablet repair for table=..." topology coordinator (topology_coordinator.cc:2083): "Finished tablet repair host=..." When the coordinator is also a repair replica (always the case with RF=3 and 3 nodes), both messages appear in the coordinator log for the same tablet within 1ms of each other. The test consumed both, thinking both tablets were done, while the second tablet repair was still running. From the CI failure logs: 04:08:09.658 Found: repair[...]: Finished tablet repair for table=... global_tablet_id=e42fd650-3542-11f1-9756-85403784a622:0 04:08:09.660 Found: raft_topology - Finished tablet repair host=... tablet=e42fd650-3542-11f1-9756-85403784a622:0 Both messages are for tablet :0. Tablet :1 repair had not finished yet. The test then wrote keys 20-29 while the second tablet repair was still in progress. That repair flushed the memtable (via prepare_sstables_for_incremental_repair), including keys 20-29 in the repair scan, and mark_sstable_as_repaired set repaired_at=2 on the resulting sstable. This caused the assertion failure on servers[0]: "should not have post-repair keys in repaired sstables, got: {20, 21, 22, 23, 24, 25, 26, 27, 28, 29}" Fix by matching "Finished tablet repair host=" which is unique to the topology coordinator message and avoids the ambiguity. Also fix an incorrect comment that said being_repaired=null when at that point in the test being_repaired is still set to the session_id (the delay_end_repair_update injection prevents end_repair from running). 
Fixes: [SCYLLADB-1478](https://scylladb.atlassian.net/browse/SCYLLADB-1478) Closes https://github.com/scylladb/scylladb/pull/29444 (cherry picked from commit `ebdfa10c8f`) Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1903 [SCYLLADB-1478]: https://scylladb.atlassian.net/browse/SCYLLADB-1478?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ Closes scylladb/scylladb#29832 * github.com:scylladb/scylladb: test/cluster/test_incremental_repair: add retry for residual leadership race test/cluster/test_incremental_repair: fix flaky coordinator-change scenario test: fix flaky test_incremental_repair_race_window_promotes_unrepaired_data	2026-05-13 09:38:01 +03:00
Asias He	8bd0e562be	repair: Reject repair requests where start and end tokens are equal When a user calls the repair API with identical startToken and endToken values, the code creates a wrapping interval (T, T]. This causes unwrap() to split it into (-inf, T] and (T, +inf), covering the entire token ring and triggering a full repair. Reject such requests early with an error message matching Cassandra's behavior: "Start and end tokens must be different." Fixes: CUSTOMER-368 Closes scylladb/scylladb#29821 (cherry picked from commit `0204372156`) Closes scylladb/scylladb#29836 Closes scylladb/scylladb#29863	2026-05-13 09:37:12 +03:00
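To see why an interval with equal start and end tokens ends up covering the whole ring, here is a minimal Python sketch of the unwrapping step and the early rejection. The function names `unwrap` and `validate_repair_range` are hypothetical, and bounds are simplified to plain tuples; only the error message is taken from the commit message.

```python
import math

def unwrap(start, end):
    """Split a (start, end] token interval into non-wrapping pieces.

    An interval whose start is not below its end is treated as wrapping
    around the ring, as described in the commit message.
    """
    if start < end:
        return [(start, end)]
    return [(-math.inf, end), (start, math.inf)]

def validate_repair_range(start, end):
    """Reject start == end before it can be unwrapped into a full-ring repair."""
    if start == end:
        raise ValueError("Start and end tokens must be different.")
    return unwrap(start, end)

full_ring = unwrap(5, 5)  # what happened before the fix
try:
    validate_repair_range(5, 5)
    rejected = False
except ValueError:
    rejected = True
```

Without the check, `unwrap(5, 5)` yields two pieces that together span every token, so the request silently turns into a full repair.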
Anna Stuchlik	356ca32994	doc: update the node size limit This commit increases the node size limit from 256 to 4096 CPUs based on `be1f566488` Fixes SCYLLADB-1676 Closes scylladb/scylladb#29602 (cherry picked from commit `a7b7019f90`) Closes scylladb/scylladb#29846 Closes scylladb/scylladb#29864	2026-05-13 09:36:09 +03:00
copilot-swe-agent[bot]	0e41ef82ac	docs: fix typo in materialized views docs - "columns are" instead of "is" The MV Select Statement description was missing the word "columns" and used incorrect verb agreement, making the sentence grammatically broken and ambiguous. docs/cql/mv.rst: "which of the base table is included" → "which of the base table columns are included" Fixes #29662 Closes #29663 Co-authored-by: annastuchlik <37244380+annastuchlik@users.noreply.github.com> (cherry picked from commit `9e7d67612c`) Closes scylladb/scylladb#29835 Closes scylladb/scylladb#29865	2026-05-13 09:35:00 +03:00
Piotr Dulikowski	2c5ae5cee0	database: add missing co_await on lock in create_local_system_table The function database::create_local_system_table calls get_tables_metadata().hold_write_lock(), but does not co_await the returned future. Effectively, this code does not guarantee mutual exclusion because it does not wait for the lock to be acquired and does not guarantee that the lock is held long enough. Fix this by adding the co_await that was missing. Found by manual inspection. This code is not known to have caused any problems so far, but it's clearly wrong - hence the fix. Fixes: SCYLLADB-1916 Closes scylladb/scylladb#29806 (cherry picked from commit `bc482bfdea`) Closes scylladb/scylladb#29815 Closes scylladb/scylladb#29833	2026-05-12 10:16:29 +02:00
Raphael S. Carvalho	c3d8f279dc	test/cluster/test_incremental_repair: add retry for residual leadership race There is a small race window where Raft leadership could transfer back to servers[1] between the ensure_group0_leader_on() check and the actual restart. If this happens, the new coordinator re-initiates repair and masks the compaction-merge bug. Extract the core test logic into _do_race_window_promotes_unrepaired_data() which directly checks get_topology_coordinator() after restart and raises _LeadershipTransferred if servers[1] became coordinator. The test function calls this helper in a retry loop (up to 5 attempts). Also add detection of residual re-repairs (where the coordinator re-initiates tablet repair after seeing tablets stuck in the repair stage following the topology restart) and fresh keyspace creation on each retry attempt to avoid state contamination. Refs: SCYLLADB-1478 Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1903 (cherry picked from commit `2615d0e8d8`) Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2026-05-11 09:38:13 -03:00
Avi Kivity	bceb054bbd	test/cluster/test_incremental_repair: fix flaky coordinator-change scenario The test_incremental_repair_race_window_promotes_unrepaired_data test was flaky because it hardcodes servers[1] as the restart target but did not ensure servers[1] was NOT the topology coordinator. When servers[1] happened to be the Raft group0 leader (topology coordinator), restarting it killed the leader, forced a new election, and the new coordinator re-initiated tablet repair. This re-repair flushes memtables on all replicas via take_storage_snapshot() and marks the resulting sstables as repaired -- causing post-repair keys to appear in repaired sstables on servers[0] and servers[2]. The test then hit the wrong assertion (servers[0]/[2] contaminated). Fix: before starting the repair, check whether servers[1] is the topology coordinator. If so, move leadership to another server via ensure_group0_leader_on() so that restarting servers[1] only kills a follower -- which does not trigger an election or coordinator change. Reproducibility was confirmed by forcing leadership to servers[1] via ensure_group0_leader_on() and observing deterministic failure with all three servers showing post-repair keys in repaired sstables (confirming the re-repair scenario), then verifying the fix passes reliably. Fixes: SCYLLADB-1478 Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1903 (cherry picked from commit `914b70c75b`) Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2026-05-11 09:31:48 -03:00
Avi Kivity	cb7ce9b4cb	test: fix flaky test_incremental_repair_race_window_promotes_unrepaired_data The test waited for two "Finished tablet repair" log messages on the coordinator, expecting one per tablet. But there are two log sources that emit messages matching this pattern: repair module (repair/repair.cc:2329): "Finished tablet repair for table=..." topology coordinator (topology_coordinator.cc:2083): "Finished tablet repair host=..." When the coordinator is also a repair replica (always the case with RF=3 and 3 nodes), both messages appear in the coordinator log for the same tablet within 1ms of each other. The test consumed both, thinking both tablets were done, while the second tablet repair was still running. From the CI failure logs: 04:08:09.658 Found: repair[...]: Finished tablet repair for table=... global_tablet_id=e42fd650-3542-11f1-9756-85403784a622:0 04:08:09.660 Found: raft_topology - Finished tablet repair host=... tablet=e42fd650-3542-11f1-9756-85403784a622:0 Both messages are for tablet :0. Tablet :1 repair had not finished yet. The test then wrote keys 20-29 while the second tablet repair was still in progress. That repair flushed the memtable (via prepare_sstables_for_incremental_repair), including keys 20-29 in the repair scan, and mark_sstable_as_repaired set repaired_at=2 on the resulting sstable. This caused the assertion failure on servers[0]: "should not have post-repair keys in repaired sstables, got: {20, 21, 22, 23, 24, 25, 26, 27, 28, 29}" Fix by matching "Finished tablet repair host=" which is unique to the topology coordinator message and avoids the ambiguity. Also fix an incorrect comment that said being_repaired=null when at that point in the test being_repaired is still set to the session_id (the delay_end_repair_update injection prevents end_repair from running). 
Fixes: SCYLLADB-1478 Closes scylladb/scylladb#29444 (cherry picked from commit `ebdfa10c8f`) Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1903 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2026-05-11 09:31:43 -03:00
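The ambiguity is easy to reproduce with the two log lines quoted in the commit message. The sketch below is plain Python with a hypothetical `count_matches` helper, not the test framework's log-waiting API.

```python
import re

# The two coordinator log lines quoted in the commit message (both for tablet :0).
log_lines = [
    "repair[...]: Finished tablet repair for table=... "
    "global_tablet_id=e42fd650-3542-11f1-9756-85403784a622:0",
    "raft_topology - Finished tablet repair host=... "
    "tablet=e42fd650-3542-11f1-9756-85403784a622:0",
]

def count_matches(pattern, lines):
    """Count the lines that contain the given regular expression."""
    return sum(1 for line in lines if re.search(pattern, line))

ambiguous = count_matches(r"Finished tablet repair", log_lines)
precise = count_matches(r"Finished tablet repair host=", log_lines)
```

The short pattern matches both lines for the same tablet, which is how the test counted two "finished" tablets when only one was done; the longer pattern matches only the topology coordinator line.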
Botond Dénes	d33fb5159a	sstables/trie: add preemption points in trie_writer The BTI partition index trie writer flushes all buffered nodes at the end of each SSTable via complete_until_depth(0), called from bti_partition_index_writer_impl::finish(). This is a tight synchronous loop that writes trie nodes through file_writer::write(), which uses a buffered output_stream: individual writes that fit in the buffer are plain memcpy operations returning a ready future, so .get() never yields. As a result the reactor can stall for several milliseconds on large SSTables. The entire call chain runs inside seastar::async() (via sstable::write_components()), so seastar::thread::maybe_yield() is safe to call here. Add it at the top of both tight loops: - complete_until_depth(), which iterates over trie depth - lay_out_children(), which iterates over child branches per node Fixes SCYLLADB-1885 Closes scylladb/scylladb#29798 (cherry picked from commit `d0813769ec`) Closes scylladb/scylladb#29810 Closes scylladb/scylladb#29816	2026-05-11 12:50:56 +03:00
Piotr Dulikowski	02c7f44da4	Merge 'table_helper: fix use-after-free on prepared-statement invalidation' from Marcin Maliszkiewicz insert() held no local strong ref to the prepared modification_statement across the suspension in execute(). On a single shard: 1. Fiber A suspends inside _insert_stmt->execute(). 2. DROP TABLE / DROP KEYSPACE on the target, or LRU eviction, removes the prepared_statements_cache entry, releasing its strong ref. 3. Fiber B re-enters cache_table_info(), sees _prepared_stmt (checked_weak_ptr) invalidated, and runs _insert_stmt = nullptr, releasing the last strong ref. The modification_statement is freed. 4. Fiber A resumes inside execute() and touches freed this. Pin strong ref to _insert_stmt locally before the suspension. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1667 Backport: all supported branches, it's memory corruption bug, long present Closes scylladb/scylladb#29588 github.com:scylladb/scylladb: test/boost: add dummy case to table_helper_test for non-injection modes test/boost: add regression test for table_helper insert() UAF utils/error_injection: add waiters() API table_helper: fix use-after-free on prepared-statement invalidation (cherry picked from commit `efcc0b6376`) Closes scylladb/scylladb#29747 Closes scylladb/scylladb#29802	2026-05-10 13:58:15 +03:00
Yaniv Michael Kaul	c0bf728edb	raft/group0: fix destroy assertion on startup failure If start_server_for_group0() successfully registers a server in _raft_gr._servers but a subsequent step (e.g. enable_in_memory_state_machine()) throws, the server is never destroyed because abort_and_drain()/destroy() check std::get_if<raft::group_id>(&_group0) which was only set after the entire with_scheduling_group block completed. Move _group0.emplace<raft::group_id>() inside the lambda, immediately after start_server_for_group() succeeds, so that cleanup paths can always find and destroy the registered server. This fixes the assertion: "raft_group_registry - stop(): server for group ... is not destroyed" which manifests during shutdown after an upgrade where topology_state_load() fails due to netw::unknown_address. Backport: Yes, to 2026.1, 2026.2, as it causes a crash on upgrades Refs: SCYLLADB-1217 Refs: CUSTOMER-340 Refs: CUSTOMER-335 Fixes: SCYLLADB-1809 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> AI-assisted: Yes, Opencode/Opus 4.6 Closes scylladb/scylladb#29702 (cherry picked from commit `6179406467`) Closes scylladb/scylladb#29742 Closes scylladb/scylladb#29754 scylla-2026.1.3-candidate-20260510010724 scylla-2026.1.3	2026-05-08 11:24:02 +02:00
Raphael S. Carvalho	0b47672901	repair/replica: Fix race window where post-repair data is wrongly promoted to repaired During incremental repair, each tablet replica holds three SSTable views: UNREPAIRED, REPAIRING, and REPAIRED. The repair lifecycle is: 1. Replicas snapshot unrepaired SSTables and mark them REPAIRING. 2. Row-level repair streams missing rows between replicas. 3. mark_sstable_as_repaired() runs on all replicas, rewriting the SSTables with repaired_at = sstables_repaired_at + 1 (e.g. N+1). 4. The coordinator atomically commits sstables_repaired_at=N+1 and the end_repair stage to Raft, then broadcasts repair_update_compaction_ctrl which calls clear_being_repaired(). The bug lives in the window between steps 3 and 4. After step 3, each replica has on-disk SSTables with repaired_at=N+1, but sstables_repaired_at in Raft is still N. The classifier therefore sees: is_repaired(N, sst{repaired_at=N+1}) == false sst->being_repaired == null (lost on restart, or not yet set) and puts them in the UNREPAIRED view. If a new write arrives and is flushed (repaired_at=0), STCS minor compaction can fire immediately and merge the two SSTables. The output gets repaired_at = max(N+1, 0) = N+1 because compaction preserves the maximum repaired_at of its inputs. Once step 4 commits sstables_repaired_at=N+1, the compacted output is classified REPAIRED on the affected replica even though it contains data that was never part of the repair scan. Other replicas, which did not experience this compaction, classify the same rows as UNREPAIRED. This divergence is never healed by future repairs because the repaired set is considered authoritative. The result is data resurrection: deleted rows can reappear after the next compaction that merges unrepaired data with the wrongly-promoted repaired SSTable. The fix has two layers: Layer 1 (in-memory, fast path): mark_sstable_as_repaired() now also calls mark_as_being_repaired(session) on the new SSTables it writes. 
This keeps them in the REPAIRING view from the moment they are created until repair_update_compaction_ctrl clears the flag after step 4, covering the race window in the normal (no-restart) case. Layer 2 (durable, restart-safe): a new is_being_repaired() helper on tablet_storage_group_manager detects the race window even after a node restart, when being_repaired has been lost from memory. It checks: sst.repaired_at == sstables_repaired_at + 1 AND tablet transition kind == tablet_transition_kind::repair Both conditions survive restarts: repaired_at is on-disk in SSTable metadata, and the tablet transition is persisted in Raft. Once the coordinator commits sstables_repaired_at=N+1 (step 4), is_repaired() returns true and the SSTable naturally moves to the REPAIRED view. The classifier in make_repair_sstable_classifier_func() is updated to call is_being_repaired(sst, sstables_repaired_at) in place of the previous sst->being_repaired.uuid().is_null() check. A new test, test_incremental_repair_race_window_promotes_unrepaired_data, reproduces the bug by: - Running repair round 1 to establish sstables_repaired_at=1. - Injecting delay_end_repair_update to hold the race window open. - Running repair round 2 so all replicas complete mark_sstable_as_repaired (repaired_at=2) but the coordinator has not yet committed step 4. - Writing post-repair keys to all replicas and flushing servers[1] to create an SSTable with repaired_at=0 on disk. - Restarting servers[1] so being_repaired is lost from memory. - Waiting for autocompaction to merge the two SSTables on servers[1]. - Asserting that the merged SSTable contains post-repair keys (the bug) and that servers[0] and servers[2] do not see those keys as repaired. NOTE FOR MAINTAINER: Copilot initially only implemented Layer 1 (the in-memory being_repaired guard), missing the restart scenario entirely. I pointed out that being_repaired is lost on restart and guided Copilot to add the durable Layer 2 check. 
I also polished the implementation: moving is_being_repaired into tablet_storage_group_manager so it can reuse the already-held _tablet_map (avoiding an ERM lookup and try/catch), passing sstables_repaired_at in from the classifier to avoid re-reading it, and using compaction_group_for_sstable inside the function rather than threading a tablet_id parameter through the classifier. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1239. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Closes scylladb/scylladb#29244 (cherry picked from commit `16e387d5f9`) (cherry picked from commit `cc3dcc4ba8`) Closes scylladb/scylladb#29411	2026-05-08 11:25:26 +03:00
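The two-layer check can be sketched in Python as follows. This is an illustration under stated assumptions, not the ScyllaDB code: the `SSTable` dataclass, the string transition kinds, the exact form of `is_repaired`, and the order in which the classifier consults the two predicates are all assumed here; only the Layer 2 condition (`repaired_at == sstables_repaired_at + 1` and a repair transition) comes from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SSTable:
    repaired_at: int                      # persisted in SSTable metadata
    being_repaired: Optional[str] = None  # in-memory session id, lost on restart

def is_repaired(sstables_repaired_at, sst):
    # Assumed form: repaired_at == 0 means the SSTable was never repaired.
    return sst.repaired_at != 0 and sst.repaired_at <= sstables_repaired_at

def is_being_repaired(sst, sstables_repaired_at, transition_kind):
    # Layer 1: in-memory flag set by mark_sstable_as_repaired().
    if sst.being_repaired is not None:
        return True
    # Layer 2: durable check that still works after a restart.
    return (sst.repaired_at == sstables_repaired_at + 1
            and transition_kind == "repair")

def classify(sst, sstables_repaired_at, transition_kind):
    if is_being_repaired(sst, sstables_repaired_at, transition_kind):
        return "REPAIRING"
    if is_repaired(sstables_repaired_at, sst):
        return "REPAIRED"
    return "UNREPAIRED"

# After a restart inside the race window: flag lost, Raft still at N=1.
in_window = classify(SSTable(repaired_at=2), 1, "repair")
# After the coordinator commits sstables_repaired_at=2.
after_commit = classify(SSTable(repaired_at=2), 2, "repair")
```

In the race window the restarted replica keeps the rewritten SSTable out of the UNREPAIRED view, so a freshly flushed SSTable with `repaired_at=0` cannot be compacted together with it.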
Patryk Jędrzejczak	b3b333b597	Merge '[Backport 2026.1] Barrier and drain logging' from Scylladb[bot] Add more logging to barrier and drain rpc to try and pinpoint https://github.com/scylladb/scylladb/issues/26281 Backport since we want to have it if it happens in the field. Fixes: SCYLLADB-1837 Refs: #26281 - (cherry picked from commit `11b838e71e`) - (cherry picked from commit `e88ce09372`) - (cherry picked from commit `385915c101`) - (cherry picked from commit `d2b695aa64`) Parent PR: #29735 Closes scylladb/scylladb#29770 * https://github.com/scylladb/scylladb: session, raft_topology: add periodic warnings for hung drain and stale version waits session: add info-level logging to drain_closing_sessions raft_topology: log sub-step progress in local_topology_barrier raft_topology: log read_barrier progress in topology cmd handler token_metadata: improve stale versions diagnostics	2026-05-08 10:02:18 +02:00
Łukasz Paszkowski	342a7bfce1	db: fix system.size_estimates to aggregate sstable estimates across all shards The estimate() function in the size_estimates virtual reader only considered sstables local to the shard that happened to own the keyspace's partition key token. Since sstables are distributed across shards, this caused partition count estimates to be approximately 1/smp_count of the actual value. This bug has been present since the virtual reader was introduced in `225648780d`. Use db.container().map_reduce0() to aggregate sstable estimates across all shards. Each shard contributes its local count and estimated_histogram, which are then merged to produce the correct total. Also fix the `test_partitions_estimate_full_overlap` test which becomes flaky (xpassing ~1% of runs) because autocompaction could merge the two overlapping sstables before the size estimate was read. Wrap the test body in nodetool.no_autocompaction_context to prevent this race. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1179 Refs https://github.com/scylladb/scylladb/issues/9083 Closes scylladb/scylladb#29286 (cherry picked from commit `6f364fd3b7`) Closes scylladb/scylladb#29381	2026-05-08 06:38:28 +03:00
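The effect of aggregating across shards can be shown with a small map-reduce sketch in Python. It only mimics the shape of the fix: the dictionaries describing SSTables and the helper names `local_estimate`, `merge`, and `estimate` are hypothetical, and `collections.Counter` stands in for the estimated histogram.

```python
from collections import Counter

def local_estimate(shard_sstables):
    """Map step: one shard sums its local partition counts and histograms."""
    count = sum(s["partitions"] for s in shard_sstables)
    hist = Counter()
    for s in shard_sstables:
        hist.update(s["histogram"])
    return count, hist

def merge(a, b):
    """Reduce step: combine two per-shard results."""
    return a[0] + b[0], a[1] + b[1]

def estimate(shards):
    result = (0, Counter())
    for shard in shards:
        result = merge(result, local_estimate(shard))
    return result

# Four shards, each holding one SSTable with 100 partitions.
shards = [[{"partitions": 100, "histogram": {"small": 100}}] for _ in range(4)]

single_shard_count, _ = local_estimate(shards[0])  # behaviour before the fix
total_count, total_hist = estimate(shards)         # behaviour after the fix
```

With four shards, looking at a single shard reports a quarter of the partitions, which matches the roughly 1/smp_count underestimate described in the commit message.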
Piotr Dulikowski	8a52602ec9	Merge '[Backport 2026.1] vector_search: test: fix flaky test_dns_resolving_repeated' from Scylladb[bot] The `vector_store_client_test_dns_resolving_repeated` test was intermittently timing out on CI. The exact root cause is not fully understood, but the hypothesis is that a single trigger signal can be lost somewhere (not exactly known where). This is not an issue for the production code because refresh trigger will be called multiple times whenever all configured nodes will be unreachable. Fixes SCYLLADB-1794 Backport to 2026.1 and 2026.2, as the same CI flakiness can occur on these branches. - (cherry picked from commit `e9240587f4`) - (cherry picked from commit `44249c0a75`) Parent PR: #29752 Closes scylladb/scylladb#29796 * github.com:scylladb/scylladb: vector_search: test: default timeout in test_dns_resolving_repeated vector_search: test: fix flaky test_dns_resolving_repeated	2026-05-08 03:54:43 +02:00
Karol Nowacki	9ed728a5b3	vector_search: test: default timeout in test_dns_resolving_repeated Replace explicit 1-second timeouts in repeat_until() with the default STANDARD_WAIT (10s). The 1-second timeout could be too aggressive for loaded CI environments where lowres_clock granularity (~10ms) combined with OS scheduling delays and resource contention (-c2 -m2G) could cause the loop to expire before the DNS refresh task completes its cycle. This also unifies test timeouts across test cases. (cherry picked from commit `207de967fb`)	2026-05-07 17:04:15 +00:00
Karol Nowacki	b4467fb229	vector_search: test: fix flaky test_dns_resolving_repeated Move trigger_dns_resolver() inside the repeat_until loop instead of calling it once before the loop. The test was intermittently timing out on CI. The exact root cause is not fully understood, but the hypothesis is that a single trigger signal can be lost somewhere (not exactly known where). This is not an issue for the production code because the refresh trigger is called multiple times: in every query for which all configured nodes are unreachable. By triggering inside the loop, we ensure the signal is re-sent on each iteration until the resolver actually performs the refresh and picks up the new (failing) DNS resolution. This makes the test resilient to timing-dependent signal loss without changing production code. Fixes: SCYLLADB-1794 (cherry picked from commit `4722be1289`)	2026-05-07 17:04:15 +00:00
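A minimal sketch of the "trigger inside the loop" idea from the commit above, written in Python rather than the C++ of the real test. `LossyResolver`, `trigger_and_check` and this `repeat_until` are hypothetical stand-ins, and the lost signal is simulated deliberately because the real cause is not known.

```python
import time


def repeat_until(pred, timeout=10.0, period=0.01):
    """Poll pred() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if pred():
            return True
        time.sleep(period)
    return False


class LossyResolver:
    """Hypothetical resolver that silently drops the first `lost` triggers."""

    def __init__(self, lost):
        self.lost = lost
        self.refreshed = False

    def trigger(self):
        if self.lost > 0:
            self.lost -= 1  # the signal is lost; no refresh happens
            return
        self.refreshed = True


def trigger_and_check(resolver):
    """Re-send the trigger on every iteration, then check for the refresh."""
    resolver.trigger()
    return resolver.refreshed
```

With a single trigger issued before the loop, one lost signal leaves the loop polling a refresh that never comes; re-triggering on each iteration makes the loop converge as soon as one signal gets through.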
Marcin Maliszkiewicz	83999f7228	Merge 'utils: loading_cache: add `insert()` that is a no-op when caching is disabled' from Dario Mirovic When `permissions_validity_in_ms` is set to 0, executing a prepared statement under authentication crashes with: ``` Assertion `caching_enabled()' failed. at utils/loading_cache.hh:319 in authorized_prepared_statements_cache::insert ``` `loading_cache::get_ptr()` asserts when caching is disabled (expiry == 0), but `authorized_prepared_statements_cache::insert()` was using it purely for its side effect of populating the cache, which is meaningless when caching is off. Add a new `loading_cache::insert(k, load)` method that is a no-op when caching is disabled and otherwise forwards to `get_ptr()`. Switch `authorized_prepared_statements_cache::insert()` to use it. This completes the disabled-mode safety contract of the cache for the write side, mirroring the fallback that `get()` already provides for the read side. Includes a regression test in `test/boost/loading_cache_test.cc` plus a positive test for the new `insert()` overload. Fixes SCYLLADB-1699 The crash was introduced a long time ago. It is present in all live versions, from 2025.1 onward. No client tickets, but it should be backported. Closes scylladb/scylladb#29638 * github.com:scylladb/scylladb: test: boost: regression test for loading_cache::insert with caching disabled utils: loading_cache: add insert() that is a no-op when caching is disabled (cherry picked from commit `c00fee0316`) Closes scylladb/scylladb#29762 Closes scylladb/scylladb#29782	2026-05-07 10:42:59 +03:00
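The contract described above (an `insert()` that does nothing when caching is disabled and otherwise forwards to `get_ptr()`) can be sketched as follows. This is a Python illustration only; the real `loading_cache` is a C++ template, and the constructor argument and dictionary storage here are assumptions.

```python
class LoadingCache:
    """Toy cache: an expiry of 0 means caching is disabled."""

    def __init__(self, expiry_ms):
        self._expiry_ms = expiry_ms
        self._entries = {}

    def caching_enabled(self):
        return self._expiry_ms != 0

    def get_ptr(self, key, load):
        # Mirrors the assertion that caused the crash when caching was off.
        assert self.caching_enabled(), "caching_enabled()"
        if key not in self._entries:
            self._entries[key] = load(key)
        return self._entries[key]

    def insert(self, key, load):
        """Populate the cache; a no-op when caching is disabled."""
        if not self.caching_enabled():
            return
        self.get_ptr(key, load)

    def size(self):
        return len(self._entries)
```

A caller that only wants the side effect of populating the cache uses `insert()` and never reaches the assertion in `get_ptr()`.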
Marcin Maliszkiewicz	1793814914	Merge '[Backport 2026.1] auth: fix crash on ghost rows in role_permissions' from Andrzej Jackowski This is manual backport of https://github.com/scylladb/scylladb/pull/29757, because the changes are required on 2026.1 ASAP. === The auth cache crashes when it encounters rows in role_permissions that have a live row marker but no permissions column. These “ghost rows” were created by the now-removed auth v2 migration, which used INSERT (creating row markers) instead of UPDATE. When permissions were later revoked, the row marker remained while the permissions column became null. An empty collection appears as null, since its lifetime is based only on its element's cells. As a result, when the cache reloads and expects the permissions column to exist, it hits a missing_column exception. The series removes dead code that was the primary crash site, adds has() guards to the remaining access paths, and includes a test reproducer. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1816 Backport: all supported versions 2026.1, 2025.4, 2025.1 Parent PR: https://github.com/scylladb/scylladb/pull/29757 Closes scylladb/scylladb#29771 * github.com:scylladb/scylladb: test: add reproducer for auth cache crash on missing permissions column auth: tolerate missing permissions column in authorize() auth: add defensive has() guard for role_attributes value column auth: remove unused permissions field from cache role_record	2026-05-06 19:21:57 +02:00
Gleb Natapov	269637ffa3	session, raft_topology: add periodic warnings for hung drain and stale version waits Add periodic warning timers (every 5 minutes) to help diagnose hangs in barrier_and_drain: - drain_closing_sessions(): warn if semaphore acquisition or session gate close is taking too long, reporting the gate count to show how many guards are still alive. - local_topology_barrier(): warn if stale_versions_in_use() is taking too long, reporting the current stale version trackers. - session::gate_count(): new public accessor for diagnostic purposes. These warnings help distinguish between the two possible hang points in barrier_and_drain (stale versions vs session drain) and provide ongoing visibility into what's blocking progress. (cherry picked from commit `d2b695aa64`)	2026-05-06 15:00:44 +03:00
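The periodic-warning pattern in the commit above can be sketched with asyncio. The real code uses Seastar timers in C++; `wait_with_warnings`, its parameters and the message text are hypothetical, and the interval is a parameter here rather than the 5 minutes mentioned in the commit.

```python
import asyncio


async def wait_with_warnings(awaitable, what, interval, warn):
    """Await `awaitable`, calling warn() every `interval` seconds until it completes."""
    task = asyncio.ensure_future(awaitable)
    waited = 0.0
    while True:
        # asyncio.wait() returns on completion or after `interval`, whichever is first.
        done, _pending = await asyncio.wait({task}, timeout=interval)
        if done:
            return task.result()
        waited += interval
        warn(f"{what} is taking too long: {waited:.2f}s so far")


async def slow_drain():
    """Stand-in for a session drain that takes a while."""
    await asyncio.sleep(0.05)
    return "drained"
```

Calling `asyncio.run(wait_with_warnings(slow_drain(), "drain_closing_sessions", 0.01, print))` prints a warning on each elapsed interval and then returns the drain result, so a hang shows up in the log instead of staying silent.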
Gleb Natapov	d232b08be6	session: add info-level logging to drain_closing_sessions drain_closing_sessions() is called as part of the barrier_and_drain topology command and can block on two things: acquiring the drain semaphore (if another drain is in progress) and waiting for individual sessions to close (which blocks until all session guards are released). Previously, all logging in this function was at debug level, making it invisible in production logs. When barrier_and_drain hangs, there is no way to tell whether the function is waiting for the semaphore, waiting for a specific session to close, or was never called. Promote logging to info level and add messages at each blocking point: before/after semaphore acquisition (with count of sessions to drain), before/after each individual session close (with session id), and at function completion. This makes it possible to identify the exact session blocking a topology operation from the node log alone. (cherry picked from commit `385915c101`)	2026-05-06 15:00:44 +03:00
Gleb Natapov	485c2504ad	raft_topology: log sub-step progress in local_topology_barrier When a node processes a barrier_and_drain topology command, it performs two potentially long-running operations inside local_topology_barrier(): waiting for stale token metadata versions to be released (stale_versions_in_use) and draining closing sessions (drain_closing_sessions). Either of these can hang indefinitely -- for example, stale_versions_in_use blocks until all references to previous token metadata versions are released, which depends on in-flight requests completing. Previously, the only logging was a single 'done' message at the end, making it impossible to determine which sub-step was blocking when a barrier_and_drain RPC appeared stuck on a node. In a recent CI failure, a node never responded to barrier_and_drain during a removenode operation, and the logs showed the RPC was received but nothing about what it was waiting on internally. Add info-level logging before each blocking sub-step, including the topology version for correlation. This allows diagnosing hangs by showing whether the node is stuck waiting for stale metadata versions, stuck draining sessions, or never reached these steps at all. (cherry picked from commit `e88ce09372`)	2026-05-06 15:00:44 +03:00
Gleb Natapov	026e870e54	raft_topology: log read_barrier progress in topology cmd handler When a raft topology command (e.g. barrier_and_drain) is received by a node, the handler first performs a raft read_barrier to ensure it sees the latest topology state. This read_barrier can hang indefinitely if raft cannot achieve quorum, but there was no logging around it, making it impossible to tell whether the handler was stuck at this step or somewhere else. Add info-level logging before and after the read_barrier call in raft_topology_cmd_handler, including the command type, index, and term. This allows diagnosing hangs by showing whether the node entered the read_barrier and whether it completed, narrowing down the root cause when a topology command RPC appears stuck on the receiver side. (cherry picked from commit `11b838e71e`)	2026-05-06 15:00:44 +03:00
Petr Gusev	dd75f53c5d	token_metadata: improve stale versions diagnostics Before waiting on stale_versions_in_use(), we log the stale versions the barrier_and_drain handler will wait for, along with the number of token_metadata references representing each version. To achieve this, we store a pointer to token_metadata in version_tracker, traverse the _trackers list, and output all items with a version smaller than the latest. Since token_metadata contains the version_tracker instance, it is guaranteed to remain alive during traversal. To count references, token_metadata now inherits from enable_lw_shared_from_this. This helps diagnose tablet migration stalls and allows more deterministic tests: when a barrier is expected to block, we can verify that the log contains the expected stale versions rather than checking that the barrier_and_drain is blocked on stale_versions_in_use() for a fixed amount of time. (cherry picked from commit `e39f4b399c`)	2026-05-06 14:52:11 +03:00
Marcin Maliszkiewicz	2c7d9b49bc	test: add reproducer for auth cache crash on missing permissions column (cherry picked from commit `5c5306c692`)	2026-05-06 13:17:06 +02:00
Marcin Maliszkiewicz	82c3509752	auth: tolerate missing permissions column in authorize() Ghost rows in role_permissions with a live row marker but no permissions column can occur when permissions created via INSERT (e.g. by the removed auth v2 migration) are later revoked. The row marker survives the revoke, leaving a row visible to queries but with permissions=null. Add a has() guard before accessing the permissions column, matching the pattern already used in list_all(). Return NONE permissions for such ghost rows instead of crashing. (cherry picked from commit `df69a5c79b`)	2026-05-06 13:17:05 +02:00
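A small sketch of the `has()` guard described above, assuming a hypothetical `Row` wrapper in which a null column is represented as `None`. The real code operates on CQL result rows in C++; only the shape of the guard is illustrated.

```python
NONE = frozenset()  # stands in for "no permissions"


class Row:
    """Hypothetical result row; a missing or null column maps to None."""

    def __init__(self, columns):
        self._columns = columns

    def has(self, name):
        return self._columns.get(name) is not None

    def get_set(self, name):
        if not self.has(name):
            # Mirrors the missing_column failure hit on ghost rows.
            raise KeyError(f"missing column: {name}")
        return set(self._columns[name])


def permissions_of(row):
    """Return the row's permissions, treating a ghost row as having none."""
    if not row.has("permissions"):
        return NONE
    return frozenset(row.get_set("permissions"))


ghost = Row({"role": "alice", "resource": "data/ks", "permissions": None})
normal = Row({"role": "bob", "resource": "data/ks", "permissions": ["SELECT"]})
```

Here `permissions_of(ghost)` yields the empty set, while calling `ghost.get_set("permissions")` directly would raise.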
Marcin Maliszkiewicz	2306562c01	auth: add defensive has() guard for role_attributes value column Add a has() check before accessing the value column in role_attributes to tolerate ghost rows with missing regular columns. In practice this is unlikely to be a problem since attributes are not typically revoked, but the guard is added for consistency and defensive programming. (cherry picked from commit `c44625ebdf`)	2026-05-06 13:17:04 +02:00
Marcin Maliszkiewicz	3f35b3bc55	auth: remove unused permissions field from cache role_record The permissions field in role_record was populated by fetch_role() but never read. Authorization uses cached_permissions instead, which is loaded via the permission_loader callback. Remove the dead field and its fetch code. The removed code also did not check for missing columns before accessing the permissions set, which could crash on ghost rows left by the removed auth v2 migration. The migration used INSERT (creating row markers), and when permissions were later revoked, the row marker survived while the permissions column became null. (cherry picked from commit `797bc28aae`)	2026-05-06 13:16:52 +02:00
Jenkins Promoter	70fa7453d0	Update ScyllaDB version to: 2026.1.3	2026-05-03 17:38:16 +03:00
Jenkins Promoter	fe9ec0cd5a	Update pgo profiles - aarch64	2026-05-01 05:03:53 +03:00
Jenkins Promoter	5f32cbf502	Update pgo profiles - x86_64	2026-05-01 04:21:07 +03:00
Botond Dénes	ecb3f254ad	sstables: fix segfault in parse_assert() when message is nullptr parse_assert() accepts an optional `message` parameter that defaults to nullptr. When the assertion fails and message is nullptr, it is implicitly converted to sstring via the sstring(const char*) constructor, which calls strlen(nullptr) -- undefined behavior that manifests as a segfault in __strlen_evex. This turns what should be a graceful malformed_sstable_exception into a fatal crash. In the case of CUSTOMER-279, a corrupt SSTable triggered parse_assert() during streaming (in continuous_data_consumer:: fast_forward_to()), causing a crash loop on the affected node. Fix by guarding the nullptr case with a ternary, passing an empty sstring() when message is null. on_parse_error() already handles the empty-message case by substituting "parse_assert() failed". Fixes: SCYLLADB-1672 Closes scylladb/scylladb#29285 (cherry picked from commit `cfebe17592`) Closes scylladb/scylladb#29597	2026-04-30 12:12:27 +03:00
Avi Kivity	af59e9200a	build: point seastar submodule at scylla-seastar.git This allows us to backport seastar commits as the need arises.	2026-04-30 11:51:32 +03:00
Patryk Jędrzejczak	5540a16f1b	Merge 'raft: Throw stopped_error if server aborted' from Dawid Mędrek This PR solves a series of similar problems related to executing methods on an already aborted `raft::server`. They materialize in various ways: * For `add_entry` and `modify_config`, a `raft::not_a_leader` with a null ID will be returned IF forwarding is disabled. This wasn't a big problem because forwarding has always been enabled for group0, but it's something that's nice to fix. It might be relevant for strong consistency, which will rely heavily on this code. * For `wait_for_leader` and `wait_for_state_change`, the calls may hang and never resolve. A more detailed scenario is provided in a commit message. For the last two methods, we also extend their descriptions to indicate the new possible exception type, `raft::stopped_error`. This change is correct since either we enter the functions and throw the exception immediately (if the server has already been aborted), or it will be thrown upon the call to `raft::server::abort`. We fix both issues. A few reproducer tests have been included to verify that the calls finish and throw the appropriate errors. Fixes SCYLLADB-841 Backport: Although the hanging problems haven't been spotted so far (at least to the best of my knowledge), it's best to avoid running into a problem like that, so let's backport the changes to all supported versions. They're small enough. Closes scylladb/scylladb#28822 * https://github.com/scylladb/scylladb: raft: Make methods throw stopped_error if server aborted raft: Throw stopped_error if server aborted test: raft: Introduce get_default_cluster (cherry picked from commit `bb1a798c2c`) Closes scylladb/scylladb#28903	2026-04-30 08:54:16 +03:00
Tomasz Grabiec	36637e3583	test: pylib: Ignore exceptions in wait_for() ManagerClient::get_ready_cql() calls server_sees_others(), which waits for servers to see each other as alive in gossip. If one of the servers is still early in boot, a RESTful API call to "gossiper/endpoint/live" may fail. It throws an exception, which currently terminates the wait_for() and propagates up, failing the test. Fix this by ignoring errors when polling inside wait_for. In case of timeout, we log the last exception. This should fix the problem not only in this case but for all uses of wait_for(). Example output: ``` pred = <function ManagerClient.server_sees_others.<locals>._sees_min_others at 0x7f022af9a140> deadline = 1775218828.9172852, period = 1.0, before_retry = None backoff_factor = 1.5, max_period = 1.0, label = None async def wait_for( pred: Callable[[], Awaitable[Optional[T]]], deadline: float, period: float = 0.1, before_retry: Optional[Callable[[], Any]] = None, backoff_factor: float = 1.5, max_period: float = 1.0, label: Optional[str] = None) -> T: tag = label or getattr(pred, '__name__', 'unlabeled') start = time.time() retries = 0 last_exception: Exception | None = None while True: elapsed = time.time() - start if time.time() >= deadline: timeout_msg = f"wait_for({tag}) timed out after {elapsed:.2f}s ({retries} retries)" if last_exception is not None: timeout_msg += ( f"; last exception: {type(last_exception).__name__}: {last_exception}" ) raise AssertionError(timeout_msg) from last_exception raise AssertionError(timeout_msg) try: > res = await pred() test/pylib/util.py:80: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ async def _sees_min_others(): > raise Exception("asd") E Exception: asd test/pylib/manager_client.py:802: Exception The above exception was the direct cause of the following exception: manager = <test.pylib.manager_client.ManagerClient object at 0x7f022af7e7b0> @pytest.mark.asyncio async def test_auth_after_reset(manager: 
ManagerClient) -> None: servers = await manager.servers_add(3, config=auth_config, auto_rack_dc="dc1") > cql, _ = await manager.get_ready_cql(servers) test/cluster/auth_cluster/test_auth_after_reset.py:33: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ test/pylib/manager_client.py:137: in get_ready_cql await self.servers_see_each_other(servers) test/pylib/manager_client.py:820: in servers_see_each_other await asyncio.gather(others) test/pylib/manager_client.py:806: in server_sees_others await wait_for(_sees_min_others, time() + interval, period=.5) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ pred = <function ManagerClient.server_sees_others.<locals>._sees_min_others at 0x7f022af9a140> deadline = 1775218828.9172852, period = 1.0, before_retry = None backoff_factor = 1.5, max_period = 1.0, label = None async def wait_for( pred: Callable[[], Awaitable[Optional[T]]], deadline: float, period: float = 0.1, before_retry: Optional[Callable[[], Any]] = None, backoff_factor: float = 1.5, max_period: float = 1.0, label: Optional[str] = None) -> T: tag = label or getattr(pred, '__name__', 'unlabeled') start = time.time() retries = 0 last_exception: Exception | None = None while True: elapsed = time.time() - start if time.time() >= deadline: timeout_msg = f"wait_for({tag}) timed out after {elapsed:.2f}s ({retries} retries)" if last_exception is not None: timeout_msg += ( f"; last exception: {type(last_exception).__name__}: {last_exception}" ) > raise AssertionError(timeout_msg) from last_exception E AssertionError: wait_for(_sees_min_others) timed out after 45.30s (46 retries); last exception: Exception: asd test/pylib/util.py:76: AssertionError ``` Fixes a failure observed in test_auth_after_reset: ``` manager = <test.pylib.manager_client.ManagerClient object at 0x7fb3740e1630> @pytest.mark.asyncio async def test_auth_after_reset(manager: ManagerClient) -> None: servers = await manager.servers_add(3, 
config=auth_config, auto_rack_dc="dc1") cql, _ = await manager.get_ready_cql(servers) await cql.run_async("ALTER ROLE cassandra WITH PASSWORD = 'forgotten_pwd'") logging.info("Stopping cluster") await asyncio.gather([manager.server_stop_gracefully(server.server_id) for server in servers]) logging.info("Deleting sstables") for table in ["roles", "role_members", "role_attributes", "role_permissions"]: await asyncio.gather([manager.server_wipe_sstables(server.server_id, "system", table) for server in servers]) logging.info("Starting cluster") # Don't try connect to the servers yet, with deleted superuser it will be possible only after # quorum is reached. await asyncio.gather([manager.server_start(server.server_id, connect_driver=False) for server in servers]) logging.info("Waiting for CQL connection") await repeat_until_success(lambda: manager.driver_connect(auth_provider=PlainTextAuthProvider(username="cassandra", password="cassandra"))) > await manager.get_ready_cql(servers) test/cluster/auth_cluster/test_auth_after_reset.py:50: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ test/pylib/manager_client.py:137: in get_ready_cql await self.servers_see_each_other(servers) test/pylib/manager_client.py:819: in servers_see_each_other await asyncio.gather(*others) test/pylib/manager_client.py:805: in server_sees_others await wait_for(_sees_min_others, time() + interval, period=.5) test/pylib/util.py:71: in wait_for res = await pred() test/pylib/manager_client.py:802: in _sees_min_others alive_nodes = await self.api.get_alive_endpoints(server_ip) test/pylib/rest_client.py:243: in get_alive_endpoints data = await self.client.get_json(f"/gossiper/endpoint/live", host=node_ip) test/pylib/rest_client.py:99: in get_json ret = await self._fetch("GET", resource_uri, response_type = "json", host = host, _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <test.pylib.rest_client.TCPRESTClient object at 
0x7fb2404a0650> method = 'GET', resource = '/gossiper/endpoint/live', response_type = 'json' host = '127.15.252.8', port = 10000, params = None, json = None, timeout = None allow_failed = False async def _fetch(self, method: str, resource: str, response_type: Optional[str] = None, host: Optional[str] = None, port: Optional[int] = None, params: Optional[Mapping[str, str]] = None, json: Optional[Mapping] = None, timeout: Optional[float] = None, allow_failed: bool = False) -> Any: # Can raise exception. See https://docs.aiohttp.org/en/latest/web_exceptions.html assert method in ["GET", "POST", "PUT", "DELETE"], f"Invalid HTTP request method {method}" assert response_type is None or response_type in ["text", "json"], \ f"Invalid response type requested {response_type} (expected 'text' or 'json')" # Build the URI port = port if port else self.default_port if hasattr(self, "default_port") else None port_str = f":{port}" if port else "" assert host is not None or hasattr(self, "default_host"), "_fetch: missing host for " \ "{method} {resource}" host_str = host if host is not None else self.default_host uri = self.uri_scheme + "://" + host_str + port_str + resource logging.debug(f"RESTClient fetching {method} {uri}") client_timeout = ClientTimeout(total = timeout if timeout is not None else 300) async with request(method, uri, connector = self.connector if hasattr(self, "connector") else None, params = params, json = json, timeout = client_timeout) as resp: if allow_failed: return await resp.json() if resp.status != 200: text = await resp.text() > raise HTTPError(uri, resp.status, params, json, text) E test.pylib.rest_client.HTTPError: HTTP error 404, uri: http://127.15.252.8:10000/gossiper/endpoint/live, params: None, json: None, body: E {"message": "Not found", "code": 404} test/pylib/rest_client.py:77: HTTPError ``` Fixes: SCYLLADB-1367 Closes scylladb/scylladb#29323 (cherry picked from commit `74542be5aa`) Closes scylladb/scylladb#29338	2026-04-30 08:53:05 +03:00
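The retry behaviour described in the commit above can be condensed into a self-contained sketch. It follows the shape of the `wait_for()` quoted in the traceback but is simplified (no `before_retry` or `label` parameters) and is not the code from `test/pylib/util.py`; `flaky` and `always_fails` are hypothetical predicates.

```python
import asyncio
import time


async def wait_for(pred, deadline, period=0.1, backoff_factor=1.5, max_period=1.0):
    """Poll pred() until it returns a non-None value, ignoring exceptions while polling."""
    start = time.time()
    retries = 0
    last_exception = None
    while True:
        if time.time() >= deadline:
            elapsed = time.time() - start
            msg = f"wait_for timed out after {elapsed:.2f}s ({retries} retries)"
            if last_exception is not None:
                msg += f"; last exception: {type(last_exception).__name__}: {last_exception}"
                raise AssertionError(msg) from last_exception
            raise AssertionError(msg)
        try:
            res = await pred()
            last_exception = None
            if res is not None:
                return res
        except Exception as exc:  # remember the error and keep polling
            last_exception = exc
        retries += 1
        await asyncio.sleep(period)
        period = min(period * backoff_factor, max_period)


calls = {"n": 0}


async def flaky():
    """Fails twice (e.g. a server early in boot), then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("HTTP error 404")
    return "alive"


async def always_fails():
    raise RuntimeError("HTTP error 404")
```

A transient failure such as the 404 from a node that is still booting no longer aborts the wait; it surfaces only in the timeout message if the predicate never succeeds.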
Wojciech Mitros	f318968cfe	view: apply existing range tombstones after exhausting the update reader When view_update_builder::on_results() hits the path where the update fragment reader is already exhausted, it still needs to keep tracking existing range tombstones and apply them to encountered rows. Otherwise a row covered by an existing range tombstone can appear alive while generating the view update and create a spurious view row. Update the existing tombstone state even on the exhausted-reader path and apply the effective tombstone to clustering rows before generating the row tombstone update. Add a cqlpy regression test covering the partition-delete-after-range-tombstone case. Fixes: SCYLLADB-1649 Closes scylladb/scylladb#29481 (cherry picked from commit `073710a661`) Closes scylladb/scylladb#29649	2026-04-30 08:51:55 +03:00
Roy Dahan	b7974b9b09	ci: pin GitHub Actions to commit SHAs and migrate to Node.js 24 Pin all external GitHub Actions to full commit SHAs and upgrade to their latest major versions to reduce supply chain attack surface: - actions/checkout: v3/v4/v5 -> v6.0.2 - actions/github-script: v7 -> v8.0.0 - actions/setup-python: v5 -> v6.2.0 - actions/upload-artifact: v4 -> v7.0.0 - astral-sh/setup-uv: v6 -> v8.0.0 - mheap/github-action-required-labels: v5.5.2 (pinned) - redhat-plumbers-in-action/differential-shellcheck: v5.5.6 (pinned) - codespell-project/actions-codespell: v2.2 (pinned, was @master) Set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true in all 21 workflows that use JavaScript-based actions to opt into the Node.js 24 runtime now. This resolves the deprecation warning: "Node.js 20 actions are deprecated. Please check if updated versions of these actions are available that support Node.js 24. Actions will be forced to run with Node.js 24 by default starting June 2nd, 2026. Node.js 20 will be removed from the runner on September 16th, 2026." See: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/ scylladb/github-automation references are intentionally left at @main as they are org-internal reusable workflows. Fixes: SCYLLADB-1410 Backport: Backport is required for live branches that run GH actions: 2026.1, 2025.4, 2025.1 and 2024.1 Closes scylladb/scylladb#29525 (cherry picked from commit `d2d7604188`) Closes scylladb/scylladb#29525	2026-04-30 08:50:25 +03:00
Marcin Maliszkiewicz	9d95e0e6ba	Merge 'storage_service: fix REST API races during shutdown and cross-shard forwarding' from Piotr Smaron REST route removal unregisters handlers but does not wait for requests that already entered storage_service. A request can therefore suspend inside an async operation, restart proceeds to tear the service down, and the coroutine later resumes against destroyed members such as _topology_state_machine, _group0, or _sys_ks — a use-after-destruction bug that surfaces as UBSAN dynamic-type failures (e.g. the crash seen from topology_state_load()). Fix this by holding storage_service::_async_gate from the entry boundary of every externally-triggered async operation so that stop() drains them before teardown begins. The gate is acquired in run_with_api_lock, run_with_no_api_lock, and in individual REST handlers that bypass those wrappers (reload_raft_topology_state, mark_excluded, removenode, schema reload, topology-request waits/abort, cleanup, ring/schema queries, SSTable dictionary training/publish, and sampling). Additionally, fix get_ownership() and abort_topology_request() which forward work to shard 0 but were still referencing the caller-shard's `this` pointer instead of the destination-shard instance, causing silent cross-shard access to shard-local state. Add a cluster regression test that repeatedly exercises the multi-shard ownership REST path to cover the forwarding fix. Fixes: SCYLLADB-1415 Should be backported to all branches, the code has been introduced around 2024.1 release. Closes scylladb/scylladb#29373 * github.com:scylladb/scylladb: storage_service: fix shard-0 forwarding in REST helpers storage_service: gate REST-facing async operations during shutdown storage_service: prepare for async gate in REST handlers (cherry picked from commit `4043d95810`) Closes scylladb/scylladb#29611	2026-04-27 13:57:32 +02:00
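The gate mechanism described above can be illustrated with a toy asyncio version: operations hold the gate for their whole duration, and shutdown closes the gate and waits for the holders to leave before tearing anything down. `Gate`, `GateClosed` and `main` are hypothetical names; the real code uses a Seastar gate (`storage_service::_async_gate`) in C++.

```python
import asyncio
from contextlib import contextmanager


class GateClosed(Exception):
    """Raised when an operation tries to enter a gate that is closing."""


class Gate:
    def __init__(self):
        self._count = 0
        self._closed = False
        self._idle = asyncio.Event()
        self._idle.set()

    @contextmanager
    def hold(self):
        """Enter the gate for the duration of an operation."""
        if self._closed:
            raise GateClosed()
        self._count += 1
        self._idle.clear()
        try:
            yield
        finally:
            self._count -= 1
            if self._count == 0:
                self._idle.set()

    async def close(self):
        """Reject new entrants and wait until every holder has left."""
        self._closed = True
        await self._idle.wait()


async def main():
    gate = Gate()
    log = []

    async def handler():
        with gate.hold():  # acquired at the entry boundary of the request
            await asyncio.sleep(0.01)  # the request suspends mid-operation
            log.append("handler finished")

    task = asyncio.create_task(handler())
    await asyncio.sleep(0)  # let the handler enter the gate
    await gate.close()  # shutdown drains in-flight requests first
    log.append("teardown")
    try:
        with gate.hold():
            pass
    except GateClosed:
        log.append("late request rejected")
    await task
    return log
```

Because `close()` waits for the in-flight handler, the teardown step cannot run while a suspended request still references the service's members.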
Avi Kivity	ea78ab34d7	Merge '[Backport 2026.1] test: fix flaky test_read_repair_with_trace_logging by reading tracing with CL=ALL' from Scylladb[bot] Tracing events are written to system_traces.events with CL=ANY, so they are only guaranteed to be present on the local node of the query coordinator. Reading them back with the driver default (CL=LOCAL_ONE) may route the query to a replica that has not yet received all events, causing the assertion on 'digest mismatch, starting read repair' to fail intermittently. Fix execute_with_tracing() to read tracing via the ResponseFuture API with query_cl=ConsistencyLevel.ALL, so events from all replicas are merged before the caller inspects them. Fixes: SCYLLADB-1707 Backport: fixing flaky test, test failure only seen on master so far so no backport - (cherry picked from commit `b49cf6247f`) Parent PR: #29566 Closes scylladb/scylladb#29631 * github.com:scylladb/scylladb: test: fix flaky test_read_repair_with_trace_logging by reading tracing with CL=ALL replica/database: consolidate the two database_apply error injections	2026-04-25 20:47:46 +03:00

51994 Commits