scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 19:10:42 +00:00

Author	SHA1	Message	Date
Ernest Zaslavsky	116a2f43ee	sstables_loader: prevent use-after-free on table drop during streaming sstables_loader::load_and_stream holds a replica::table& reference via the sstable_streamer for the entire streaming operation. If the table is dropped concurrently (e.g. DROP TABLE or DROP KEYSPACE), the reference becomes dangling and the next access crashes with SEGV. This was observed in a longevity-50gb-12h-master test run where a keyspace was dropped while load_and_stream was still streaming SSTables from a previous batch. Fix by acquiring a stream_in_progress() phaser guard in load_and_stream before creating the streamer. table::stop() calls _pending_streams_phaser.close() which blocks until all outstanding guards are released, keeping the table alive for the duration of the streaming operation. Fixes: SCYLLADB-1639 Closes scylladb/scylladb#29403 (cherry picked from commit `e5e6608f20`) Closes scylladb/scylladb#29558 Closes scylladb/scylladb#29600	2026-04-24 10:33:51 +02:00
Emil Maskovsky	5d1a8b91cd	encryption: cover system.raft table in system_info_encryption Extend system_info_encryption to encrypt system.raft SSTables. system.raft contains the Raft log, which may hold sensitive user data (e.g. batched mutations), so it warrants the same treatment as system.batchlog and system.paxos. During upgrade, existing unencrypted system.raft SSTables remain readable. Existing data is rewritten encrypted via compaction, or immediately via nodetool upgradesstables -a. Update the operator-facing system_info_encryption description to mention system.raft and add a focused test that verifies the schema extension is present on system.raft. Fixes: CUSTOMER-268 Backport: 2026.1 - closes an encryption-at-rest coverage gap: system.raft may persist sensitive user-originated data unencrypted; backport to the current LTS. Closes scylladb/scylladb#29242 (cherry picked from commit `91df3795fc`) Closes scylladb/scylladb#29526 Closes scylladb/scylladb#29582	2026-04-23 10:17:20 +02:00
Botond Dénes	c9ee67c85c	Merge 'transport: improve memory accounting for big responses and slow network' from Marcin Maliszkiewicz After obtaining the CQL response, check if its actual size exceeds the initially acquired memory permit. If so, acquire additional semaphore units and adopt them into the permit, ensuring accurate memory accounting for large responses. Additionally, move the permit into a .then() continuation so that the semaphore units are kept alive until write_message finishes, preventing premature release of memory permit. This is especially important with slow networks and big responses when buffers can accumulate and deplete a node's memory. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1306 Related https://scylladb.atlassian.net/browse/SCYLLADB-740 Backport: all supported versions Closes scylladb/scylladb#29288 * github.com:scylladb/scylladb: transport: add per-service-level pending response memory metric transport: hold memory permit until response write completes transport: account for response size exceeding initial memory estimate (cherry picked from commit `86417d49de`) Closes scylladb/scylladb#29410 Closes scylladb/scylladb#29455	2026-04-21 12:38:40 +02:00
Piotr Dulikowski	04d8663052	Merge 'cql3: pin prepared cache entry in prepare() to avoid invalid weak handle race' from Alex Dathskovsky query_processor::prepare() could race with prepared statement invalidation: after loading from the prepared cache, we converted the cached object to a checked weak pointer and then continued asynchronous work (including error-injection waitpoints). If invalidation happened in that window, the weak handle could no longer be promoted and the prepare path could fail nondeterministically. This change keeps a strong cache entry reference alive across the whole critical section in prepare() by using a pinned cache accessor (get_pinned()), and only deriving the weak handle while the entry is pinned. This removes the lifetime gap without adding retry loops. Test coverage was extended in test/cluster/test_prepare_race.py: - reproduces the invalidation-during-prepare window with injection, - verifies prepare completes successfully, - then invalidates again and executes the same stale client prepared object, - confirms the driver transparently re-requests/re-prepares and execution succeeds. This change introduces: - no behavior change for normal prepare flow besides stronger lifetime guarantees, - no new protocol semantics, - preserves existing cache invalidation logic, - adds explicit cluster-level regression coverage for both the race and driver reprepare path. - pushes the re-prepare operation towards the driver: the server will return an unprepared error the first time and the driver will have to re-prepare during the execution stage Fixes: https://github.com/scylladb/scylladb/issues/27657 Backport to active branches recommended: No node crash, but user-visible PREPARE failures under rare schema-invalidation race; low-risk timeout-bounded retry improves robustness. 
Closes scylladb/scylladb#28952 * github.com:scylladb/scylladb: transport/messages: hold pinned prepared entry in PREPARE result cql3: pin prepared cache entry in prepare() to avoid invalid weak handle race (cherry picked from commit `d9a277453e`) Closes scylladb/scylladb#29001 Closes scylladb/scylladb#29195	2026-04-20 12:59:53 +02:00
Jenkins Promoter	72cd145990	Update ScyllaDB version to: 2025.4.8	2026-04-17 01:18:47 +03:00
Botond Dénes	41e2c2d1c4	Merge 'tasks: do not fail the wait request if rpc fails' from Aleksandra Martyniuk During decommission, we first mark a topology request as done, then shut down a node and in the following steps we remove node from the topology. Thus, finished request does not imply that a node is removed from the topology. Due to that, in node_ops_virtual_task::wait, while gathering children from the whole cluster, we may hit the connection exception - because a node is still in topology, even though it is down. Modify the get_children method to ignore the exception and warn about the failure instead. Keep token_metadata_ptr in get_children to prevent topology from changing. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-867 Needs backports to all versions Closes scylladb/scylladb#29035 * github.com:scylladb/scylladb: tasks: fix indentation tasks: do not fail the wait request if rpc fails tasks: pass token_metadata_ptr to task_manager::virtual_task::impl::get_children (cherry picked from commit `2e47fd9f56`) Closes scylladb/scylladb#29193	2026-04-16 21:57:08 +03:00
Pavel Emelyanov	2d1fdce790	object_storage_endpoint_param: Make it formattable for real Currently the formatter converts it to json and then tries to emit into the output context with the "...{{}}" format string. The intent was to have the "...{<json text>}" output. However, the double curly brace in format string means "print a curly brace", so the output of the above formatting is "...{}", literally. Fix by keeping a single curly brace. The "<json text>" thing will have its own surrounding curly braces. Fixes #27718 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #27687 (reworked version of commit `a6618f2`, the formatter is in db/config.cc) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#27733	2026-04-16 21:54:23 +03:00
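The brace-escaping rule described in the commit above can be reproduced in isolation. The following is a minimal sketch using Python's `str.format`, which applies the same convention (a doubled brace prints a literal brace and consumes no argument). The function names, the `endpoint` prefix, and the sample parameters are hypothetical and are not taken from db/config.cc.

```python
import json


def format_broken(params: dict) -> str:
    # "{{" and "}}" are escapes for literal braces, so the JSON argument
    # is silently ignored and the output is a literal "{}".
    return "endpoint{{}}".format(json.dumps(params))


def format_fixed(params: dict) -> str:
    # A single "{}" is a replacement field; the JSON text brings its own
    # surrounding braces.
    return "endpoint{}".format(json.dumps(params))


if __name__ == "__main__":
    sample = {"port": 443}
    print(format_broken(sample))
    print(format_fixed(sample))
```

With the single-brace form the serialized JSON appears in the output, which matches the intent stated in the commit message.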
Pavel Emelyanov	1e0487bd57	table: Add formatter for group_id argument in tablet merge exception message Fixes: SCYLLADB-1432 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#29143 (cherry picked from commit `78f5bab7cf`) Closes scylladb/scylladb#29412 Closes scylladb/scylladb#29453	2026-04-16 10:57:37 +03:00
Michał Chojnowski	18fc2eff31	test: add a missing reconnect_driver in test_sstable_compression_dictionaries_upgrade.py Need to work around https://github.com/scylladb/python-driver/issues/295, lest a CQL query fail spuriously after the cluster restart. Fixes: SCYLLADB-1114 Closes scylladb/scylladb#29118 (cherry picked from commit `6b18d95dec`) Closes scylladb/scylladb#29146 Closes scylladb/scylladb#29366	2026-04-16 10:56:59 +03:00
Botond Dénes	340369a4d4	Merge 'Alternator: add per-table batch latency metrics and test coverage' from Amnon Heiman This series fixes a metrics visibility gap in Alternator and adds regression coverage. Until now, BatchGetItem and BatchWriteItem updated global latency histograms but did not consistently update per-table latency histograms. As a result, table-level latency dashboards could miss batch traffic. It updates the batch read/write paths to compute request duration once and record it in both global and per-table latency metrics. Add the missing tests, including a metric-agnostic helper and a dedicated per-table latency test that verifies latency counters increase for item and batch operations. This change is metrics-only (no API/behavior change for requests) and improves observability consistency between global and per-table views. Fixes #28721 We assume the alternator per-table metrics exist, but the batch ones are not updated Closes scylladb/scylladb#28732 * github.com:scylladb/scylladb: test(alternator): add per-table latency coverage for item and batch ops alternator: track per-table latency for batch get/write operations (cherry picked from commit `035aa90d4b`) Closes scylladb/scylladb#29067 Closes scylladb/scylladb#29365	2026-04-16 10:56:16 +03:00
Pavel Emelyanov	9041f70f34	s3: Don't rearm credential timers when credentials are not refreshed The update_credentials_and_rearm() may get "empty" credentials from _creds_provider_chain.get_aws_credentials() -- it doesn't throw, but returns default-initialized value. In that case the expires_at will be set to time_point::min, and it's probably not a good idea to arm the refresh timer and, even worse idea, to subtract 1h from it. Fixes #29056 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#29057 (cherry picked from commit `961fc9e041`) Closes scylladb/scylladb#29158 Closes scylladb/scylladb#29364	2026-04-16 10:55:25 +03:00
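The guard described in the commit above can be sketched as follows. This is an illustration in Python rather than the actual C++ change: `datetime.min` stands in for `time_point::min`, and `next_refresh_time` and `REFRESH_MARGIN` are hypothetical names. In Python the subtraction raises `OverflowError`; the C++ failure mode may differ.

```python
from datetime import datetime, timedelta
from typing import Optional

# The commit message mentions subtracting 1h from the expiry time.
REFRESH_MARGIN = timedelta(hours=1)


def next_refresh_time(expires_at: datetime) -> Optional[datetime]:
    """Return when to refresh credentials, or None if the timer must not be armed."""
    # A default-initialized ("empty") credentials value carries the minimum
    # representable expiry; subtracting the margin from it is meaningless.
    if expires_at == datetime.min:
        return None
    return expires_at - REFRESH_MARGIN
```

Returning `None` lets the caller skip rearming the timer instead of scheduling it at a nonsensical point in time.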
Nikos Dragazis	7ed772866e	scylla_swap_setup: Remove Before=swap.target dependency from swap unit When a Scylla node starts, the scylla-image-setup.service invokes the `scylla_swap_setup` script to provision swap. This script allocates a swap file and creates a swap systemd unit to delegate control to systemd. By default, systemd injects a Before=swap.target dependency into every swap unit, allowing other services to use swap.target to wait for swap to be enabled. On Azure, this doesn't work so well because we store the swap file on the ephemeral disk [1] which has network dependencies (`_netdev` mount option, configured by cloud-init [2]). This makes the swap.target indirectly depend on the network, leading to dependency cycles such as: swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target -> network.target -> systemd-resolved.service -> tmp.mount -> swap.target This patch breaks the cycle by removing the swap unit from swap.target using DefaultDependencies=no. The swap unit will still be activated via WantedBy=multi-user.target, just not during early boot. Although this problem is specific to Azure, this patch applies the fix to all clouds to keep the code simple. Fixes #26519. Fixes SCYLLADB-1257 [1] https://github.com/scylladb/scylla-machine-image/pull/426 [2] https://github.com/canonical/cloud-init/pull/1213#issuecomment-1026065501 Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> Closes scylladb/scylladb#28504 (cherry picked from commit `6d50e67bd2`) Closes scylladb/scylladb#29339 Closes scylladb/scylladb#29354	2026-04-16 10:53:52 +03:00
Andrzej Jackowski	7b97fe4a92	reader_concurrency_semaphore: fix leak workaround `e4da0afb8d5491bf995cbd1d7a7efb966c79ac34` introduces a protection against resources that are "made up" of thin air to `reader_concurrency_semaphore`. If there are more `_resources` than the `_initial_resources`, it means there is a negative leak, and `on_internal_error_noexcept` is called. In addition to it, `_resources` is set to `std::max(_resources, _initial_resources)`. However, the commit message of `e4da0afb8d5491bf995cbd1d7a7efb966c79ac34` states the opposite: "The detection also clamps the _resources to _initial_resources, to prevent any damage". Before this commit, the protection mechanism doesn't clamp `_resources` to `_initial_resources` but instead keeps `_resources` high, possibly even indefinitely growing. This commit changes `std::max` to `std::min` to make the code behave as intended. Fixes: SCYLLADB-1014 Refs: SCYLLADB-163 Closes scylladb/scylladb#28982 (cherry picked from commit `9247dff8c2`) Closes scylladb/scylladb#28988 Closes scylladb/scylladb#29196	2026-04-15 11:53:42 +02:00
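The difference between the two clamping expressions in the commit above is easy to see with concrete numbers. The sketch below models the resource counts as plain integers, which is a simplification of the real `reader_concurrency_semaphore` resources type; the function names are hypothetical.

```python
def clamp_before_fix(resources: int, initial_resources: int) -> int:
    # max() keeps the larger value, so a "negative leak" (resources above
    # the initial amount) is preserved instead of being clamped.
    return max(resources, initial_resources)


def clamp_after_fix(resources: int, initial_resources: int) -> int:
    # min() caps resources at the initial amount, as the original commit
    # message intended.
    return min(resources, initial_resources)


if __name__ == "__main__":
    leaked, initial = 120, 100
    print(clamp_before_fix(leaked, initial))
    print(clamp_after_fix(leaked, initial))
```

When no leak is present (resources at or below the initial amount), `min` leaves the value unchanged, so the clamp only takes effect in the error case.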
Jenkins Promoter	ba3b7360e0	Update pgo profiles - aarch64	2026-04-15 04:44:13 +03:00
Jenkins Promoter	06e9ecab9b	Update pgo profiles - x86_64	2026-04-15 03:56:26 +03:00
Avi Kivity	c6d356e7cc	Merge '[Backport 2025.4] vector_search: fix race condition on connection timeout' from Scylladb[bot] vector_search: fix race condition on connection timeout When a `with_connect` operation timed out, the underlying connection attempt continued to run in the reactor. This could lead to a crash if the connection was established/rejected after the client object had already been destroyed. This issue was observed during the teardown phase of an upcoming high-availability test case. This commit fixes the race condition by ensuring the connection attempt is properly canceled on timeout. Additionally, the explicit TLS handshake previously forced during the connection is now deferred to the first I/O operation, which is the default and preferred behavior. Fixes: SCYLLADB-832 Backports to 2026.1 and 2025.4 are required, as this issue also exists on those branches and is causing CI flakiness. - (cherry picked from commit `3107d9083e`) Parent PR: #29031 Closes scylladb/scylladb#29360 * github.com:scylladb/scylladb: vector_search: test: fix flaky test vector_search: fix race condition on connection timeout	2026-04-12 14:24:57 +03:00
Botond Dénes	30c2f03749	Merge 'cql3: fix null handling in data_value formatting' from Dario Mirovic `data_value::to_parsable_string()` crashes with a null pointer dereference when called on a `null` data_value. Return `"null"` instead. Added tests after the fix. Manually checked that tests fail without the fix. Fixes SCYLLADB-1350 This is a fix that prevents format crash. No known occurrence in production, but backport is desirable. Closes scylladb/scylladb#29262 * github.com:scylladb/scylladb: test: boost: test null data value to_parsable_string cql3: fix null handling in data_value formatting (cherry picked from commit `816f2bf163`) Closes scylladb/scylladb#29384 Closes scylladb/scylladb#29434	2026-04-12 14:23:23 +03:00
Pavel Emelyanov	17075bf3f9	Merge 'encryption: fix deadlock in encrypted_data_source::get()' from Ernest Zaslavsky When encrypted_data_source::get() caches a trailing block in _next, the next call takes it directly — bypassing input_stream::read(), which checks _eof. It then calls input_stream::read_exactly() on the already-drained stream. Unlike read(), read_up_to(), and consume(), read_exactly() does not check _eof when the buffer is empty, so it calls _fd.get() on a source that already returned EOS. In production this manifested as stuck encrypted SSTable component downloads during tablet restore: the underlying chunked_download_source hung forever on the post-EOS get(), causing 4 tablets to never complete. The stuck files were always block-aligned sizes (8k, 12k) where _next gets populated and the source is fully consumed in the same call. Fix by checking _input.eof() before calling read_exactly(). When the stream already reached EOF, buf2 is known to be empty, so the call is skipped entirely. A comprehensive test is added that uses a strict_memory_source which fails on post-EOS get(), reproducing the exact code path that caused the production deadlock. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1128 Backport to 2025.3/4 and 2026.1 is needed since it fixes a bug that may bite us in production, to be on the safe side Closes scylladb/scylladb#29110 * github.com:scylladb/scylladb: encryption: fix deadlock in encrypted_data_source::get() Fix formatting after previous patch Fix indentation after previous patch (cherry picked from commit `3b9398dfc8`) Closes scylladb/scylladb#29198 Closes scylladb/scylladb#29359	2026-04-12 14:20:37 +03:00
Patryk Jędrzejczak	a2c23793ab	raft_group0: join_group0: fix join hang when node joins group 0 before post_server_start A joining node hung forever if the topology coordinator added it to the group 0 configuration before the node reached `post_server_start`. In that case, `server->get_configuration().contains(my_id)` returned true and the node broke out of the join loop early, skipping `post_server_start`. `_join_node_group0_started` was therefore never set, so the node's `join_node_response` RPC handler blocked indefinitely. Meanwhile the topology coordinator's `respond_to_joining_node` call (which has no timeout) hung forever waiting for the reply that never came. Fix by only taking the early-break path when not starting as a follower (i.e. when the node is the discovery leader or is restarting). A joining node must always reach `post_server_start`. We also provide a regression test. It takes 6s in dev mode. Fixes SCYLLADB-959 Closes scylladb/scylladb#29266 (cherry picked from commit `b9f82f6f23`) Closes scylladb/scylladb#29291 Closes scylladb/scylladb#29308 scylla-2025.4.7-candidate-20260412022226 scylla-2025.4.7	2026-04-09 15:53:43 +02:00
Jenkins Promoter	177996a385	Update ScyllaDB version to: 2025.4.7	2026-04-09 15:52:03 +03:00
Karol Nowacki	f5111bfc9b	vector_search: test: fix flaky test The test assumes that the sleep duration will be at least the value of the sleep parameter. However, the actual sleep time can be slightly less than requested (e.g., a 100ms sleep request might result in a 99ms sleep). This commit adjusts the test's time comparison to be more lenient, preventing test flakiness.	2026-04-09 13:41:23 +02:00
Karol Nowacki	6b5de6394b	vector_search: fix race condition on connection timeout When a `with_connect` operation timed out, the underlying connection attempt continued to run in the reactor. This could lead to a crash if the connection was established/rejected after the client object had already been destroyed. This issue was observed during the teardown phase of an upcoming high-availability test case. This commit fixes the race condition by ensuring the connection attempt is properly canceled on timeout. Additionally, the explicit TLS handshake previously forced during the connection is now deferred to the first I/O operation, which is the default and preferred behavior. Fixes: SCYLLADB-832	2026-04-09 13:19:34 +02:00
Andrzej Jackowski	2b58d396e7	test: use exclusive driver connection in test_limited_concurrency_of_writes Use get_cql_exclusive(node1) so the driver only connects to node1 and never attempts to contact the stopped node2. The test was flaky because the driver received `Host has been marked down or removed` from node2. Fixes: SCYLLADB-1227 Closes scylladb/scylladb#29268 (cherry picked from commit `ab43420d30`) Closes scylladb/scylladb#29278 Closes scylladb/scylladb#29355	2026-04-07 14:22:48 +03:00
Botond Dénes	0aa03677b5	test/cluster: fix flaky test_cleanup_stop by using asyncio.sleep The test was using time.sleep(1) (a blocking call) to wait after scheduling the stop_compaction task, intending to let it register on the server before releasing the sstable_cleanup_wait injection point. However, time.sleep() blocks the asyncio event loop entirely, so the asyncio.create_task(stop_compaction) task never gets to run during the sleep. After the sleep, the directly-awaited message_injection() runs first, releasing the injection point before stop_compaction is even sent. By the time stop_compaction reaches Scylla, the cleanup has already completed successfully -- no exception is raised and the test fails. Fix by replacing time.sleep(1) with await asyncio.sleep(1), which yields control to the event loop and allows the stop_compaction task to actually send its HTTP request before message_injection is called. Fixes: SCYLLADB-834 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Closes scylladb/scylladb#29202 (cherry picked from commit `068a7894aa`) Closes scylladb/scylladb#29277 Closes scylladb/scylladb#29356	2026-04-07 14:22:27 +03:00
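The ordering problem described in the commit above can be demonstrated with a small self-contained script. This is a sketch, not the actual test: `stop_compaction` and the "injection released" step are stand-ins that only record the order of events, and the 0.05 s delay is an arbitrary choice.

```python
import asyncio
import time


async def run_scenario(blocking_sleep: bool) -> list:
    events = []

    async def stop_compaction():
        # Stand-in for sending the stop request to the server.
        events.append("stop_compaction sent")

    # create_task() only schedules the coroutine; it runs the next time
    # the current coroutine yields to the event loop.
    task = asyncio.create_task(stop_compaction())

    if blocking_sleep:
        time.sleep(0.05)  # blocks the whole event loop, task cannot run
    else:
        await asyncio.sleep(0.05)  # yields, so the task runs during the sleep

    # Stand-in for releasing the injection point.
    events.append("injection released")
    await task
    return events


if __name__ == "__main__":
    print(asyncio.run(run_scenario(blocking_sleep=True)))
    print(asyncio.run(run_scenario(blocking_sleep=False)))
```

With the blocking sleep, the injection point is released before the stop request is even sent; with `await asyncio.sleep`, the request goes out first, which is the order the test relies on.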
Emil Maskovsky	ba10314b16	raft: abort stale snapshot transfers when term changes The Bug Assertion failure: `SCYLLA_ASSERT(res.second)` in `raft/server.cc` when creating a snapshot transfer for a destination that already had a stale in-flight transfer. Root Cause If a node loses leadership and later becomes leader again before the next `io_fiber` iteration, the old transfer from the previous term can remain in `_snapshot_transfers` while `become_leader()` resets progress state. When the new term emits `install_snapshot(dst)`, `send_snapshot(dst)` tries to create a new entry for the same destination and can hit the assertion. The Fix Abort all in-flight snapshot transfers in `process_fsm_output()` when `term_and_vote` is persisted. A term/vote change marks existing transfers as stale, so we clean them up before dispatching messages from that batch and before any new snapshot transfer is started. With cross-term cleanup moved to the term-change path, `send_snapshot()` now asserts the within-term invariant that there is at most one in-flight transfer per destination. Fixes: SCYLLADB-862 Backport: The issue is reproducible in master, but is present in all active branches. Closes scylladb/scylladb#29092 (cherry picked from commit `9dad68e58d`) Closes scylladb/scylladb#29264 Closes scylladb/scylladb#29357	2026-04-07 14:22:06 +03:00
Raphael S. Carvalho	609181d0e3	mutation_compactor: Fix tombstone GC metrics to account for only expired There are 3 metrics (that go into every compaction_history entry): total_tombstone_purge_attempt, total_tombstone_purge_failure_due_to_overlapping_with_memtable, and total_tombstone_purge_failure_due_to_overlapping_with_uncompacting_sstable. When a tombstone is not expired (e.g. doesn't satisfy "gc_before" or grace period), it can currently be accounted as a failure due to overlapping with either memtable or uncompacting sstable. So the last 2 metrics include noise from unexpired tombstones. What we should do is to only account for expired tombstones in all 3 metrics. We lose the info about the number of tombstones processed by compaction; now we'll only know about the expired ones. But those metrics were primarily added for explaining why expired tombstones cannot be removed. We could have alternatively added a new field purge_failure_due_to_being_unexpired or something, but it requires adding a new field to compaction_history. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-737. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#28669 (cherry picked from commit `f33f324f77`) Closes scylladb/scylladb#28743	2026-04-06 17:53:56 +03:00
Piotr Dulikowski	bccc9be2b6	db: view: mutate_MV: don't hold keyspace ref across preemption Currently, the view_update_generator::mutate_MV function acquires a reference to the keyspace relevant to the operation, then it calls max_concurrent_for_each and uses that reference inside the lambda passed to that function. max_concurrent_for_each can preempt and there is no mechanism that makes sure that the keyspace is alive until the view updates are generated, so it is possible that the keyspace is freed by the time the reference is used. Fix the issue by precomputing the necessary information based on the keyspace reference right away, and then passing that information by value to the other parts of the code. It turns out that we only need to know the replication factor of the datacenter and whether the keyspace uses a network topology strategy. Fixes: scylladb/scylladb#28925 Closes scylladb/scylladb#28928 (cherry picked from commit `42d70baad3`) Closes scylladb/scylladb#28968 Closes scylladb/scylladb#29095	2026-04-06 17:53:40 +03:00
Avi Kivity	d4c28ee317	Merge 'service_levels: mark v2 migration complete on empty legacy table' from Alex Dathskovsky During raft-topology upgrade in 2026.1, service_level_controller::migrate_to_v2() returns early when system_distributed.service_levels is empty. This skips the service_level_version = 2 write, so the cluster is never marked as upgraded to service levels v2 even though there is no data to migrate. Subsequent upgrades may then fail the startup check which requires service_level_version == 2. Remove the early return and let the migration commit the version marker even when there are no legacy service levels rows to copy. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1198 backport: should be backported to all versions that can be upgraded to 2026.2 Closes scylladb/scylladb#29333 * github.com:scylladb/scylladb: test/auth_cluster: cover empty legacy table in service level upgrade service_levels: mark v2 migration complete on empty legacy table (cherry picked from commit `95e422db48`) Closes scylladb/scylladb#29352	2026-04-06 17:51:34 +03:00
Łukasz Paszkowski	e5bd2f8679	test/storage: harden out-of-space prevention tests around restart and disk-utilization transitions The tests in test_out_of_space_prevention.py are flaky. Three issues contribute: 1. After creating/removing the blob file that simulates disk pressure, the tests immediately checked derived state (e.g., "compaction_manager - Drained") without first confirming the disk space monitor had detected the utilization change. Fix: explicitly wait for "Reached/Dropped below critical disk utilization level" right after creating/removing the blob file, before checking downstream effects. 2. Several tests called `manager.driver_connect()` or omitted reconnection entirely after `server_restart()` / `server_start()`. The pre-existing driver session can silently reconnect multiple times, causing subsequent CQL queries to fail. Fix: call `reconnect_driver()` after every node restart. Additionally, call `wait_for_cql_and_get_hosts()` where CQL is used afterward, to ensure all connection pools are established. 3. Some log assertions used marks captured before a restart, so they could match pre-restart messages or miss messages emitted in the correct post-restart window. Fix: refresh marks at the right points. Apart from that, the patch fixes a typo: autotoogle -> autotoggle. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-655 Closes scylladb/scylladb#28626 (cherry picked from commit `826fd5d6c3`) Closes scylladb/scylladb#28967 Closes scylladb/scylladb#29197	2026-04-03 22:23:17 +03:00
Jenkins Promoter	5bdd6ca036	Update pgo profiles - aarch64 scylla-2025.4.6 scylla-2025.4.6-candidate-20260401033549	2026-04-01 04:35:14 +03:00
Jenkins Promoter	ff20b565aa	Update pgo profiles - x86_64	2026-04-01 03:50:02 +03:00
Patryk Jędrzejczak	f5058803c6	locator: everywhere_replication_strategy: fix sanity_check_read_replicas when read_new is true ERMs created in `calculate_vnode_effective_replication_map` have RF computed based on the old token metadata during a topology change. The reading replicas, however, are computed based on the new token metadata (`target_token_metadata`) when `read_new` is true. That can create a mismatch for EverywhereStrategy during some topology changes - RF can be equal to the number of reading replicas +-1. During bootstrap, this can cause the `everywhere_replication_strategy::sanity_check_read_replicas` check to fail in debug mode. We fix the check in this commit by allowing one more reading replica when `read_new` is true. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1147 Closes scylladb/scylladb#29150 (cherry picked from commit `503a6e2d7e`) Closes scylladb/scylladb#29248 Closes scylladb/scylladb#29269	2026-03-31 10:24:30 +02:00
Michał Chojnowski	e693ebe552	test/boost/cache_algorithm_test: disable sstable compression to avoid giant index pages The test intentionally creates huge index pages. But since `5e7fb08bf3`, the index reader allocates a block of memory for a whole index page, instead of incrementally allocating small pieces during index parsing. This giant allocation causes the test to fail spuriously in CI sometimes. Fix this by disabling sstable compression on the test table, which puts a hard cap of 2000 keys per index page. Fixes: SCYLLADB-1152 Closes scylladb/scylladb#29152 (cherry picked from commit `f29525f3a6`) Closes scylladb/scylladb#29172 Closes scylladb/scylladb#29259	2026-03-30 14:59:23 +02:00
Patryk Jędrzejczak	be942e9a4f	test: test_remove_garbage_group0_members: wait for token ring and group0 consistency before removenode The removenode initiator could have an outdated token ring (still considering the node removed by the previous removenode a token owner) and unexpectedly reject the operation. Fix that by waiting for token ring and group0 consistency before removenode. Note that the test already checks that consistency, but only for one node, which is different from the removenode initiator. This test has been removed in master together with the code being tested (the gossip-based topology). Hence, the fix is submitted directly to 2026.1. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1103 Backport to all supported branches (other than 2026.1), as the test can fail there. Closes scylladb/scylladb#29108 (cherry picked from commit `1398a55d16`) Closes scylladb/scylladb#29205	2026-03-24 16:09:02 +01:00
Pavel Emelyanov	e212762ab7	database: Rate limit all tokens from a range The limiter scans ranges to decide whether or not to rate-limit the query. However, when considering each range only the front one's token is accounted. This looks like a misprint. The limiter was introduced in `cc9a2ad41f` Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#29050 (cherry picked from commit `8b1ca6dcd6`) Closes scylladb/scylladb#29107 Closes scylladb/scylladb#29194	2026-03-24 16:04:01 +02:00
Botond Dénes	a41d1ec711	Merge 'doc: fix the installation section' from Anna Stuchlik This PR fixes the Installation page: - Replaces `http` with `https` in the download command. - Replaces the Open Source example from the Installation section for CentOS (we overlooked this example before). Fixes https://github.com/scylladb/scylladb/issues/29087 This update affects all supported versions and should be backported as a bug fix. Closes scylladb/scylladb#29088 * github.com:scylladb/scylladb: doc: remove the Open Source Example from Installation doc: replace http with https in the installation instructions (cherry picked from commit `e8b37d1a89`) Closes scylladb/scylladb#29135 Closes scylladb/scylladb#29192	2026-03-23 23:50:15 +02:00
Yaron Kaikov	a4b4c4c0a8	.github/workflows/trigger-scylla-ci: fix heredoc injection in trigger-scylla-ci workflow Move all ${{ }} expression interpolations into env: blocks so they are passed as environment variables instead of being expanded directly into shell scripts. This prevents an attacker from escaping the heredoc in the Validate Comment Trigger step and executing arbitrary commands on the runner. The Verify Org Membership step is hardened in the same way for defense-in-depth. Refs: GHSA-9pmq-v59g-8fxp Fixes: SCYLLADB-954 Closes scylladb/scylladb#28935 (cherry picked from commit `977bdd6260`) Closes scylladb/scylladb#28947	2026-03-20 11:00:38 +02:00
Botond Dénes	155d12f4c9	mutation/collection_mutation: don't copy the serialized collection serialize_collection_mutation() copies the serialized collection into the returned collection_mutation object. Change to move to avoid the copy. Fixes: SCYLLADB-1041 Closes scylladb/scylladb#29010 (cherry picked from commit `15cfa5beeb`) Closes scylladb/scylladb#29024 Closes scylladb/scylladb#29037	2026-03-20 11:00:11 +02:00
Aleksandra Martyniuk	e78426c5d4	nodetool: cluster repair: do not fail if a table was dropped nodetool cluster repair without additional params repairs all tablet keyspaces in a cluster. Currently, if a table is dropped while the command is running, all tables are repaired but the command finishes with a failure. Modify nodetool cluster repair. If a table wasn't specified (i.e. all tables are repaired), the command finishes successfully even if a table was dropped. If a table was specified and it does not exist (e.g. because it was dropped before the repair was requested), then the behavior remains unchanged. Fixes: SCYLLADB-568. Closes scylladb/scylladb#28739 (cherry picked from commit `2e68f48068`) Closes scylladb/scylladb#29006 Closes scylladb/scylladb#29038	2026-03-20 10:59:26 +02:00
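The behavior described in the `nodetool cluster repair` commit above can be sketched roughly as follows. This is an illustrative model only, not the nodetool implementation: `TableNotFound`, `cluster_repair`, and the `repair_table` callback are hypothetical names introduced here, and tables are modeled as plain strings.

```python
class TableNotFound(Exception):
    """Hypothetical error: the table does not exist (e.g. it was dropped)."""


def cluster_repair(existing_tables, repair_table, requested_table=None):
    """Repair one requested table, or every table when none is requested.

    Returns the list of tables that were actually repaired.
    """
    if requested_table is not None:
        # A table was named explicitly: a missing table is still an error.
        if requested_table not in existing_tables:
            raise TableNotFound(requested_table)
        repair_table(requested_table)
        return [requested_table]

    repaired = []
    for table in list(existing_tables):
        try:
            repair_table(table)
        except TableNotFound:
            # No table was named: a table dropped mid-run is skipped
            # instead of failing the whole command.
            continue
        repaired.append(table)
    return repaired
```

The split mirrors the commit message: tolerance applies only when all tables are being repaired, so an explicit request for a nonexistent table keeps failing as before.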
Anna Stuchlik	11248e5cef	doc: update the warning about shared dictionary training This commit updates the inadequate warning on the Advanced Internode (RPC) Compression page. The warning is replaced with a note about how training data is encrypted. Fixes https://github.com/scylladb/scylladb/issues/29109 Closes scylladb/scylladb#29111 (cherry picked from commit `88b98fac3a`) Closes scylladb/scylladb#29119 Closes scylladb/scylladb#29139	2026-03-20 10:58:59 +02:00
Avi Kivity	1f7dca0225	Merge 'Fix bad performance for densely populated partition index pages' from Tomasz Grabiec This applies to small partition workload where index pages have high partition count, and the index doesn't fit in cache. It was observed that the count can be in the order of hundreds. In such a workload pages undergo constant population, LSA compaction, and LSA eviction, which has severe impact on CPU utilization. Refs https://scylladb.atlassian.net/browse/SCYLLADB-620 This PR reduces the impact by several changes: - reducing memory footprint in the partition index. Assuming partition key size is 16 bytes, the cost dropped from 96 bytes to 36 bytes per partition. - flattening the object graph and amortizing storage. Storing entries directly in the vector. Storing all key values in a single managed_bytes. Making index_entry a trivial struct. - index entries and key storage are now trivially moveable, and batched inside vector storage so LSA migration can use memcpy(), which amortizes the cost per key. This reduces the cost of LSA segment compaction. - LSA eviction is now pretty much constant time for the whole page regardless of the number of entries, because elements are trivial and batched inside vectors. Page eviction cost dropped from 50 us to 1 us. 
Performance evaluated with: scylla perf-simple-query -c1 -m200M --partitions=1000000 Before: ``` 7774.96 tps (166.0 allocs/op, 521.7 logallocs/op, 54.0 tasks/op, 802428 insns/op, 430457 cycles/op, 0 errors) 7511.08 tps (166.1 allocs/op, 527.2 logallocs/op, 54.0 tasks/op, 804185 insns/op, 430752 cycles/op, 0 errors) 7740.44 tps (166.3 allocs/op, 526.2 logallocs/op, 54.2 tasks/op, 805347 insns/op, 432117 cycles/op, 0 errors) 7818.72 tps (165.2 allocs/op, 517.6 logallocs/op, 53.7 tasks/op, 794965 insns/op, 427751 cycles/op, 0 errors) 7865.49 tps (165.1 allocs/op, 513.3 logallocs/op, 53.6 tasks/op, 788898 insns/op, 425171 cycles/op, 0 errors) ``` After (+318%): ``` 32492.40 tps (130.7 allocs/op, 12.8 logallocs/op, 36.1 tasks/op, 109236 insns/op, 103203 cycles/op, 0 errors) 32591.99 tps (130.4 allocs/op, 12.8 logallocs/op, 36.0 tasks/op, 108947 insns/op, 102889 cycles/op, 0 errors) 32514.52 tps (130.6 allocs/op, 12.8 logallocs/op, 36.0 tasks/op, 109118 insns/op, 103219 cycles/op, 0 errors) 32491.14 tps (130.6 allocs/op, 12.8 logallocs/op, 36.0 tasks/op, 109349 insns/op, 103272 cycles/op, 0 errors) 32582.90 tps (130.5 allocs/op, 12.8 logallocs/op, 36.0 tasks/op, 109269 insns/op, 102872 cycles/op, 0 errors) 32479.43 tps (130.6 allocs/op, 12.8 logallocs/op, 36.0 tasks/op, 109313 insns/op, 103242 cycles/op, 0 errors) 32418.48 tps (130.7 allocs/op, 12.8 logallocs/op, 36.1 tasks/op, 109201 insns/op, 103301 cycles/op, 0 errors) 31394.14 tps (130.7 allocs/op, 12.8 logallocs/op, 36.1 tasks/op, 109267 insns/op, 103301 cycles/op, 0 errors) 32298.55 tps (130.7 allocs/op, 12.8 logallocs/op, 36.1 tasks/op, 109323 insns/op, 103551 cycles/op, 0 errors) ``` When the workload is miss-only, with both row cache and index cache disabled (no cache maintenance cost): perf-simple-query -c1 -m200M --duration 6000 --partitions=100000 --enable-index-cache=0 --enable-cache=0 Before: ``` 9124.57 tps (146.2 allocs/op, 789.0 logallocs/op, 45.3 tasks/op, 889320 insns/op, 357937 cycles/op, 0 errors) 
9437.23 tps (146.1 allocs/op, 789.3 logallocs/op, 45.3 tasks/op, 889613 insns/op, 357782 cycles/op, 0 errors) 9455.65 tps (146.0 allocs/op, 787.4 logallocs/op, 45.2 tasks/op, 887606 insns/op, 357167 cycles/op, 0 errors) 9451.22 tps (146.0 allocs/op, 787.4 logallocs/op, 45.3 tasks/op, 887627 insns/op, 357357 cycles/op, 0 errors) 9429.50 tps (146.0 allocs/op, 787.4 logallocs/op, 45.3 tasks/op, 887761 insns/op, 358148 cycles/op, 0 errors) 9430.29 tps (146.1 allocs/op, 788.2 logallocs/op, 45.3 tasks/op, 888501 insns/op, 357679 cycles/op, 0 errors) 9454.08 tps (146.0 allocs/op, 787.3 logallocs/op, 45.3 tasks/op, 887545 insns/op, 357132 cycles/op, 0 errors) ``` After (+55%): ``` 14484.84 tps (150.7 allocs/op, 6.5 logallocs/op, 44.7 tasks/op, 396164 insns/op, 229490 cycles/op, 0 errors) 14526.21 tps (150.8 allocs/op, 6.5 logallocs/op, 44.8 tasks/op, 396401 insns/op, 228824 cycles/op, 0 errors) 14567.53 tps (150.7 allocs/op, 6.5 logallocs/op, 44.7 tasks/op, 396319 insns/op, 228701 cycles/op, 0 errors) 14545.63 tps (150.6 allocs/op, 6.5 logallocs/op, 44.7 tasks/op, 395889 insns/op, 228493 cycles/op, 0 errors) 14626.06 tps (150.5 allocs/op, 6.5 logallocs/op, 44.7 tasks/op, 395254 insns/op, 227891 cycles/op, 0 errors) 14593.74 tps (150.5 allocs/op, 6.5 logallocs/op, 44.7 tasks/op, 395480 insns/op, 227993 cycles/op, 0 errors) 14538.10 tps (150.8 allocs/op, 6.5 logallocs/op, 44.8 tasks/op, 397035 insns/op, 228831 cycles/op, 0 errors) 14527.18 tps (150.8 allocs/op, 6.5 logallocs/op, 44.8 tasks/op, 396992 insns/op, 228839 cycles/op, 0 errors) ``` Same as above, but with summary ratio increased from 0.0005 to 0.005 (smaller pages): Before: ``` 33906.70 tps (146.1 allocs/op, 83.6 logallocs/op, 45.1 tasks/op, 170553 insns/op, 98104 cycles/op, 0 errors) 32696.16 tps (146.0 allocs/op, 83.5 logallocs/op, 45.1 tasks/op, 170369 insns/op, 98405 cycles/op, 0 errors) 33889.05 tps (146.1 allocs/op, 83.6 logallocs/op, 45.1 tasks/op, 170551 insns/op, 98135 cycles/op, 0 errors) 33893.24 tps 
(146.1 allocs/op, 83.5 logallocs/op, 45.1 tasks/op, 170488 insns/op, 98168 cycles/op, 0 errors) 33836.73 tps (146.1 allocs/op, 83.6 logallocs/op, 45.1 tasks/op, 170528 insns/op, 98226 cycles/op, 0 errors) 33897.61 tps (146.0 allocs/op, 83.5 logallocs/op, 45.1 tasks/op, 170428 insns/op, 98081 cycles/op, 0 errors) 33834.73 tps (146.1 allocs/op, 83.5 logallocs/op, 45.1 tasks/op, 170438 insns/op, 98178 cycles/op, 0 errors) 33776.31 tps (146.3 allocs/op, 83.9 logallocs/op, 45.2 tasks/op, 170958 insns/op, 98418 cycles/op, 0 errors) 33808.08 tps (146.3 allocs/op, 83.9 logallocs/op, 45.2 tasks/op, 170940 insns/op, 98388 cycles/op, 0 errors) ``` After (+18%): ``` 40081.51 tps (148.2 allocs/op, 4.4 logallocs/op, 45.0 tasks/op, 121047 insns/op, 82231 cycles/op, 0 errors) 40005.85 tps (148.6 allocs/op, 4.4 logallocs/op, 45.2 tasks/op, 121327 insns/op, 82545 cycles/op, 0 errors) 39816.75 tps (148.3 allocs/op, 4.4 logallocs/op, 45.1 tasks/op, 121067 insns/op, 82419 cycles/op, 0 errors) 39953.11 tps (148.1 allocs/op, 4.4 logallocs/op, 45.0 tasks/op, 121027 insns/op, 82258 cycles/op, 0 errors) 40073.96 tps (148.2 allocs/op, 4.4 logallocs/op, 45.0 tasks/op, 121006 insns/op, 82313 cycles/op, 0 errors) 39882.25 tps (148.2 allocs/op, 4.4 logallocs/op, 45.0 tasks/op, 120925 insns/op, 82320 cycles/op, 0 errors) 39916.08 tps (148.3 allocs/op, 4.4 logallocs/op, 45.1 tasks/op, 121054 insns/op, 82393 cycles/op, 0 errors) 39786.30 tps (148.2 allocs/op, 4.4 logallocs/op, 45.0 tasks/op, 121027 insns/op, 82465 cycles/op, 0 errors) 38662.45 tps (148.3 allocs/op, 4.4 logallocs/op, 45.0 tasks/op, 121108 insns/op, 82312 cycles/op, 0 errors) 39849.42 tps (148.3 allocs/op, 4.4 logallocs/op, 45.1 tasks/op, 121098 insns/op, 82447 cycles/op, 0 errors) ``` Closes scylladb/scylladb#28603 * github.com:scylladb/scylladb: sstables: mx: index_reader: Optimize parsing for no promoted index case vint: Use std::countl_zero() test: sstable_partition_index_cache_test: Validate scenario of pages with sparse 
promoted index placement sstables: mx: index_reader: Amortize partition key storage managed_bytes: Hoist write_fragmented() to common header utils: managed_vector: Use std::uninitialized_move() to move objects sstables: mx: index_reader: Keep promoted_index info next to index_entry sstables: mx: index_reader: Extract partition_index_page::clear_gently() sstables: mx: index_reader: Shave-off 16 bytes from index_entry by using raw_token sstables: mx: index_reader: Reduce allocation_section overhead during index page parsing by batching allocation sstables: mx: index_reader: Keep index_entry directly in the vector dht: Introduce raw_token test: perf_simple_query: Add 'sstable-format' command-line option test: perf_simple_query: Add 'sstable-summary-ratio' command-line option test: perf-simple-query: Add option to disable index cache test: cql_test_env: Respect enable-index-cache config (cherry picked from commit `5e7fb08bf3`) Closes scylladb/scylladb#29136 Closes scylladb/scylladb#29140	2026-03-20 10:58:26 +02:00
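As a quick sanity check on the "+318%" figure quoted in the commit message above, the throughput samples from the first Before/After pair can be averaged and compared. The commit does not say how the percentage was derived; comparing arithmetic means is an assumption made here, and `relative_gain_percent` is a helper name invented for this sketch.

```python
from statistics import mean

# Throughput samples (tps) copied from the first "Before"/"After" blocks
# of the commit message (perf-simple-query -c1 -m200M --partitions=1000000).
before = [7774.96, 7511.08, 7740.44, 7818.72, 7865.49]
after = [32492.40, 32591.99, 32514.52, 32491.14, 32582.90,
         32479.43, 32418.48, 31394.14, 32298.55]


def relative_gain_percent(before_tps, after_tps):
    """Percentage change of mean throughput, after relative to before."""
    return (mean(after_tps) / mean(before_tps) - 1.0) * 100.0


gain = relative_gain_percent(before, after)
print(f"{gain:+.0f}%")  # prints "+318%"
```

Under that assumption the mean rises from about 7742 tps to about 32363 tps, a ratio of roughly 4.18, which matches the reported gain.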
Piotr Dulikowski	6b9aa303d8	Merge '[Backport 2026.1] mv: allow skipping view updates when a collection is unmodified' from Scylladb[bot] mv: allow skipping view updates when a collection is unmodified When we generate view updates, we check whether we can skip the entire view update if all columns selected by the view are unmodified. However, for collection columns, we only check if they were unset before and after the update. In this patch we add a check for the actual collection contents. We perform this check for both virtual and non-virtual selections. When the column is only a virtual column in the view, it would be enough to check the liveness of each collection cell, however for that we'd need to deserialize the entire collection anyway, which should be effectively as expensive as comparing all of its bytes. Fixes: SCYLLADB-996 - (cherry picked from commit `01ddc17ab9`) Parent PR: #28839 Closes scylladb/scylladb#28977 * github.com:scylladb/scylladb: Merge 'mv: allow skipping view updates when a collection is unmodified' from Wojciech Mitros mv: remove dead code in view_updates::can_skip_view_updates Closes scylladb/scylladb#29094	2026-03-18 10:41:50 +01:00
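The change to the skip check for collection columns, as described in the commit above, can be modeled in a few lines. This is a simplified sketch, not the code in `view_updates::can_skip_view_updates`: collection values are represented as serialized `bytes` (or `None` when unset), and all function names are hypothetical.

```python
def collection_unmodified_before_patch(old, new):
    """Pre-patch rule as described: only 'unset before and after' counts."""
    return old is None and new is None


def collection_unmodified_after_patch(old, new):
    """Patched rule: also treat byte-identical contents as unmodified."""
    return old == new


def can_skip_view_update(old_row, new_row, selected_columns,
                         unmodified=collection_unmodified_after_patch):
    """True when every column selected by the view is unmodified.

    Rows are dicts mapping column name -> serialized bytes; a missing
    key stands for an unset column.
    """
    return all(unmodified(old_row.get(c), new_row.get(c))
               for c in selected_columns)
```

With the patched rule, an update that rewrites a collection with identical contents no longer forces a view update, whereas the pre-patch rule would have generated one.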
Patryk Jędrzejczak	3863dfbc0a	test: test_raft_no_quorum: decrease group0_raft_op_timeout_in_ms after quorum loss `test_raft_no_quorum.py::test_cannot_add_new_node` is currently flaky in dev mode. The bootstrap of the first node can fail due to `add_entry()` timing out (with the 1s timeout set by the test case). Other test cases in this test file could fail in the same way as well, so we need a general fix. We don't want to increase the timeout in dev mode, as it would slow down the test. The solution is to keep the timeout unchanged, but set it only after quorum is lost. This prevents unexpected timeouts of group0 operations with almost no impact on the test running time. A note about the new `update_group0_raft_op_timeout` function: waiting for the log seems to be necessary only for `test_quorum_lost_during_node_join_response_handler`, but let's do it for all test cases just in case (including `test_can_restart` that shouldn't be flaky currently). Fixes https://scylladb.atlassian.net/browse/SCYLLADB-913 Closes scylladb/scylladb#28998 (cherry picked from commit `526e5986fe`) Closes scylladb/scylladb#29068 Closes scylladb/scylladb#29097	2026-03-18 10:15:34 +01:00
Tomasz Grabiec	0c786045ff	Merge 'service: assert that tables updated via group0 use schema commitlog' from Aleksandra Martyniuk Set enable_schema_commitlog for each group0 table. Assert that group0 tables use schema commitlog in ensure_group0_schema (for each command). Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-914. Needs backport to all live releases, as all are vulnerable. Closes scylladb/scylladb#28876 * github.com:scylladb/scylladb: test: add test_group0_tables_use_schema_commitlog db: service: remove group0 tables from schema commitlog schema initializer service: ensure that tables updated via group0 use schema commitlog db: schema: remove set_is_group0_table param (cherry picked from commit `b90fe19a42`) Closes scylladb/scylladb#28916 Closes scylladb/scylladb#28986	2026-03-17 17:29:36 +01:00
Jenkins Promoter	a6f58c154b	Update pgo profiles - aarch64	2026-03-15 04:44:34 +02:00
Piotr Dulikowski	00269ca839	Merge '[Backport 2025.4] vector_search: test: fix HTTPS client test flakiness' from Scylladb[bot] The default 100ms timeout for client readiness in tests is too aggressive. In some test environments, this is not enough time for client creation, which involves address resolution and TLS certificate reading, leading to flaky tests. This commit increases the default client creation timeout to 10 seconds. This makes the tests more robust, especially in slower execution environments, and prevents similar flakiness in other test cases. Fixes: VECTOR-547 Fixes: SCYLLADB-802 Fixes: SCYLLADB-825 Fixes: SCYLLADB-826 Backport to 2025.4 and 2026.1, as the same problem occurs on these branches and can potentially make the CI flaky there as well. - (cherry picked from commit `bf369326d6`) Parent PR: #28879 Closes scylladb/scylladb#28895 * github.com:scylladb/scylladb: vector_search: test: include ANN error in assertion vector_search: test: fix HTTPS client test flakiness	2026-03-12 10:23:35 +01:00
Karol Nowacki	560553f654	vector_search: test: include ANN error in assertion When the test fails, the assertion message does not include the error from the ANN request. This change enhances the assertion to include the specific ANN error, making it easier to diagnose test failures.	2026-03-11 10:17:57 +01:00
Karol Nowacki	9ba8d85c39	vector_search: test: fix HTTPS client test flakiness The default 100ms timeout for client readiness in tests is too aggressive. In some test environments, this is not enough time for client creation, which involves address resolution and TLS certificate reading, leading to flaky tests. This commit increases the default client creation timeout to 10 seconds. This makes the tests more robust, especially in slower execution environments, and prevents similar flakiness in other test cases. Fixes: VECTOR-547, SCYLLADB-802	2026-03-11 10:17:36 +01:00
Patryk Jędrzejczak	9152a8d111	test: test_full_shutdown_during_replace: retry replace after the replacing node is removed from gossip The test is currently flaky with `reuse_ip = True`. The issue is that the test retries replace before the first replace is rolled back and the first replacing node is removed from gossip. The second replacing node can see the entry of the first replacing node in gossip. This entry has a newer generation than the entry of the node being replaced, and both replacing nodes have the same IP as the node being replaced. Therefore, the second replacing node incorrectly considers this entry as the entry of the node being replaced. This entry is missing rack and DC, so the second replace fails with ``` ERROR 2026-02-24 21:19:03,420 [shard 0:main] init - Startup failed: std::runtime_error (Cannot replace node 8762a9d2-3b30-4e66-83a1-98d16c5dd007/127.61.127.1 with a node on a different data center or rack. Current location=UNKNOWN_DC/UNKNOWN_RACK, new location=dc1/rack2) ``` Fixes SCYLLADB-805 Closes scylladb/scylladb#28829 (cherry picked from commit `ba7f314cdc`) Closes scylladb/scylladb#28953	2026-03-10 16:48:05 +01:00
Anna Stuchlik	b0bb0a3731	doc: fix the unified installer instructions This commit updates the documentation for the unified installer. - The Open Source example is replaced with version 2025.1 (Source Available, currently supported, LTS). - The info about CentOS 7 is removed (no longer supported). - Java 8 is removed. - The example for cassandra-stress is removed (as it was already removed on other installation pages). Fixes https://github.com/scylladb/scylladb/issues/28150 Closes scylladb/scylladb#28152 (cherry picked from commit `855c503c63`) Closes scylladb/scylladb#28910 Closes scylladb/scylladb#28927	2026-03-09 21:40:55 +02:00

50431 Commits