Commit Graph

50437 Commits

Author SHA1 Message Date
Botond Dénes
dfd2507f0d test/cluster/test_incremental_repair: fix flaky do_tablet_incremental_repair_and_ops
The log grep in get_sst_status searched from the beginning of the log
(no from_mark), so the second-repair assertions were checking cumulative
counts across both repairs rather than counts for the second repair alone.

The expected values (sst_add==2, sst_mark==2) relied on this cumulative
behaviour: 1 from the first repair + 1 from the second = 2. This works
when the second repair encounters exactly one unrepaired sstable, but
fails whenever the second repair sees two.

The second repair can see two unrepaired sstables when the 100 keys
inserted before it (via asyncio.gather) trigger a background auto-flush
before take_storage_snapshot runs. take_storage_snapshot always flushes
the memtable itself, so if an auto-flush already split the batch into two
sstables on disk, the second repair's snapshot contains both and logs
"Added sst" twice, making the cumulative count 3 instead of 2.

Fix: take a log mark per-server before each repair call and pass it to
get_sst_status so each check counts only the entries produced by that
repair. The expected values become 1/0/1 and 1/1/1 respectively,
independent of how many sstables happened to exist beforehand.

get_sst_status gains an optional from_mark parameter (default None)
which preserves existing call sites that intentionally grep from the
start of the log.

Fixes: SCYLLADB-1711

Closes scylladb/scylladb#29484

(cherry picked from commit d280517e27)

Closes scylladb/scylladb#29633

Closes scylladb/scylladb#29640
2026-04-25 18:43:47 +03:00
Botond Dénes
04e33d69da Merge 'service: tasks: return successful status if a table was dropped' from Aleksandra Martyniuk
tablet_virtual_task::wait throws if a table on which a tablet operation
was working is dropped.

Treat the tablet operation as successful if a table is dropped.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-494

Needs backport to all live releases

Closes scylladb/scylladb#28933

* github.com:scylladb/scylladb:
  test: add test_tablet_repair_wait_with_table_drop
  service: tasks: return successful status if a table was dropped

(cherry picked from commit 1e41db5948)

Closes scylladb/scylladb#29361
2026-04-24 18:06:09 +03:00
Marcin Maliszkiewicz
5468cd49da Merge 'ldap: fix double-free of LDAPMessage in poll_results()' from Andrzej Jackowski
In the unregistered-ID branch, ldap_msgfree() was called on a result
already owned by an RAII ldap_msg_ptr, causing a double-free on scope
exit. Remove the redundant manual free.

Fixes: SCYLLADB-1344

Backport: 2026.1, 2025.4, 2025.1 - it's a memory corruption, with a one-line fix, so better backport it everywhere.

Closes scylladb/scylladb#29302

* github.com:scylladb/scylladb:
  test: ldap: add regression test for double-free on unregistered message ID
  ldap: fix double-free of LDAPMessage in poll_results()

(cherry picked from commit 895fdb6d29)

Closes scylladb/scylladb#29393

Closes scylladb/scylladb#29454
2026-04-24 18:05:26 +03:00
Botond Dénes
52a3ed4312 Merge 'test: cluster: Deflake test_write_cl_any_to_dead_node_generates_hints' from Dawid Mędrek
Before these changes, we would send mutations to the node and
immediately query the metrics to see how many hints had been written.
However, that could lead to random failures of the test: even if the
mutations have finished executing, hints are stored asynchronously, so
we don't have a guarantee they have already been processed.

To prevent such failures, we rewrite the check: we will perform multiple
checks against the metrics until we have confirmed that the hints have
indeed been written or we hit the timeout.

We're generous with the timeout: we give the test 60 seconds. That
should be enough time to avoid flakiness even on super slow machines,
and if the test does fail, we will know something is really wrong.

As a bonus, we improve the test in general too. We explicitly express
the preconditions we rely on, as well as bump the log level. If the
test fails in the future, it might be very difficult do debug it
without this additional information.

Fixes SCYLLADB-1133

Backport: The test is present on all supported branches. To avoid
          running into more failures, we should backport these changes
          to them.

Closes scylladb/scylladb#29191

* github.com:scylladb/scylladb:
  test: cluster: Increase log level in test_write_cl_any_to_dead_node_generates_hints
  test: cluster: Await all mutations concurrently in test_write_cl_any_to_dead_node_generates_hints
  test: cluster: Specify min_tablet_count in test_write_cl_any_to_dead_node_generates_hints
  test: cluster: Use new_test_table in test_write_cl_any_to_dead_node_generates_hints
  test: cluster: Introduce auxiliary function keyspace_has_tablets
  test: cluster: Deflake test_write_cl_any_to_dead_node_generates_hints

(cherry picked from commit ffd58ca1f0)

Closes scylladb/scylladb#29246

Closes scylladb/scylladb#29596
2026-04-24 18:04:37 +03:00
Tomasz Grabiec
816526842e test/alternator: stop concurrent-requests test when workers hit limit
`test_limit_concurrent_requests` could create far more tables than intended
because worker threads looped indefinitely and only the probe path terminated
the test. In practice, workers often hit `RequestLimitExceeded` first, but the
test kept running and creating tables, increasing memory pressure and causing
flakiness due to bad_alloc errors in logs.

Fix by replacing the old probe-driven termination with worker-driven
termination. Workers now run until any worker sees
`RequestLimitExceeded`.

Fixes SCYLLADB-1181

Closes scylladb/scylladb#29270

(cherry picked from commit b355bb70c2)

Closes scylladb/scylladb#29292

Closes scylladb/scylladb#29610
2026-04-24 18:03:10 +03:00
Benny Halevy
73d7f9c1fe compaction_manager: fix use-after-free in postponed_compactions_reevaluation()
drain() signals the postponed_reevaluation condition variable to terminate
the postponed_compactions_reevaluation() coroutine but does not await its
completion. When enable() is called afterwards, it overwrites
_waiting_reevalution with a new coroutine, orphaning the old one. During
shutdown, really_do_stop() only awaits the latest coroutine via
_waiting_reevalution, leaving the orphaned coroutine still alive. After
sharded::stop() destroys the compaction_manager, the orphaned coroutine
resumes and reads freed memory (is_disabled() accesses _state).

Fix by introducing stop_postponed_compactions(), awaiting the reevaluation coroutine in
both drain() and stop() after signaling it, if postponed_compactions_reevaluation() is running.
It uses an std::optional<future<>> for _waiting_reevalution and std::exchange to leave
_waiting_reevalution disengaged when postponed_compactions_reevaluation() is not running.
This prevents a race between drain() and stop().

While at it, fix typo in _waiting_reevalution -> _waiting_reevaluation.

Fixes: SCYLLADB-1600
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#29443

(cherry picked from commit 05a00fe140)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#29527

Closes scylladb/scylladb#29629
2026-04-24 18:02:31 +03:00
Ernest Zaslavsky
116a2f43ee sstables_loader: prevent use-after-free on table drop during streaming
sstables_loader::load_and_stream holds a replica::table& reference via
the sstable_streamer for the entire streaming operation.  If the table
is dropped concurrently (e.g. DROP TABLE or DROP KEYSPACE), the
reference becomes dangling and the next access crashes with SEGV.

This was observed in a longevity-50gb-12h-master test run where a
keyspace was dropped while load_and_stream was still streaming SSTables
from a previous batch.

Fix by acquiring a stream_in_progress() phaser guard in load_and_stream
before creating the streamer.  table::stop() calls
_pending_streams_phaser.close() which blocks until all outstanding
guards are released, keeping the table alive for the duration of the
streaming operation.

Fixes: SCYLLADB-1639

Closes scylladb/scylladb#29403

(cherry picked from commit e5e6608f20)

Closes scylladb/scylladb#29558

Closes scylladb/scylladb#29600
2026-04-24 10:33:51 +02:00
Emil Maskovsky
5d1a8b91cd encryption: cover system.raft table in system_info_encryption
Extend system_info_encryption to encrypt system.raft SSTables.
system.raft contains the Raft log, which may hold sensitive user data
(e.g. batched mutations), so it warrants the same treatment as
system.batchlog and system.paxos.

During upgrade, existing unencrypted system.raft SSTables remain
readable. Existing data is rewritten encrypted via compaction, or
immediately via nodetool upgradesstables -a.

Update the operator-facing system_info_encryption description to
mention system.raft and add a focused test that verifies the schema
extension is present on system.raft.

Fixes: CUSTOMER-268

Backport: 2026.1 - closes an encryption-at-rest coverage gap: system.raft may persist sensitive user-originated data unencrypted; backport to the current LTS.

Closes scylladb/scylladb#29242

(cherry picked from commit 91df3795fc)

Closes scylladb/scylladb#29526

Closes scylladb/scylladb#29582
2026-04-23 10:17:20 +02:00
Botond Dénes
c9ee67c85c Merge 'transport: improve memory accounting for big responses and slow network' from Marcin Maliszkiewicz
After obtaining the CQL response, check if its actual size exceeds the initially acquired memory permit. If so, acquire additional semaphore units and adopt them into the permit, ensuring accurate memory accounting for large responses.

Additionally, move the permit into a .then() continuation so that the semaphore units are kept alive until write_message finishes, preventing premature release of memory permit. This is especially important with slow networks and big responses when buffers can accumulate and deplete a node's memory.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1306
Related https://scylladb.atlassian.net/browse/SCYLLADB-740

Backport: all supported versions

Closes scylladb/scylladb#29288

* github.com:scylladb/scylladb:
  transport: add per-service-level pending response memory metric
  transport: hold memory permit until response write completes
  transport: account for response size exceeding initial memory estimate

(cherry picked from commit 86417d49de)

Closes scylladb/scylladb#29410

Closes scylladb/scylladb#29455
2026-04-21 12:38:40 +02:00
Piotr Dulikowski
04d8663052 Merge 'cql3: pin prepared cache entry in prepare() to avoid invalid weak handle race' from Alex Dathskovsky
query_processor::prepare() could race with prepared statement invalidation: after loading from the prepared cache, we converted the cached object to a checked weak pointer and then continued asynchronous work (including error-injection waitpoints). If invalidation happened in that window, the weak handle could no longer be promoted and the prepare path could fail nondeterministically.

This change keeps a strong cache entry reference alive across the whole critical section in prepare() by using a pinned cache accessor (get_pinned()), and only deriving the weak handle while the entry is pinned. This removes the lifetime gap without adding retry loops.

  Test coverage was extended in test/cluster/test_prepare_race.py:

  - reproduces the invalidation-during-prepare window with injection,
  - verifies prepare completes successfully,
  - then invalidates again and executes the same stale client prepared object,
  - confirms the driver transparently re-requests/re-prepares and execution succeeds.

  This change introduces:

  - no behavior change for normal prepare flow besides stronger lifetime guarantees,
  - no new protocol semantics,
  - preserves existing cache invalidation logic,
  - adds explicit cluster-level regression coverage for both the race and driver reprepare path.
  - pushes the re prepare operation twards the driver, the server will return unprepared error for the first time and the driver will have to re prepare during execution stage

Fixes: https://github.com/scylladb/scylladb/issues/27657

Backport to active branches recommended: No node crash, but user-visible PREPARE failures under rare schema-invalidation race; low-risk timeout-bounded retry improves robustness.

Closes scylladb/scylladb#28952

* github.com:scylladb/scylladb:
  transport/messages: hold pinned prepared entry in PREPARE result
  cql3: pin prepared cache entry in prepare() to avoid invalid weak handle race

(cherry picked from commit d9a277453e)

Closes scylladb/scylladb#29001

Closes scylladb/scylladb#29195
2026-04-20 12:59:53 +02:00
Jenkins Promoter
72cd145990 Update ScyllaDB version to: 2025.4.8 2026-04-17 01:18:47 +03:00
Botond Dénes
41e2c2d1c4 Merge 'tasks: do not fail the wait request if rpc fails' from Aleksandra Martyniuk
During decommission, we first mark a topology request as done, then shut
down a node and in the following steps we remove node from the topology.
Thus,  finished request does not imply that a node is removed from
the topology.

Due to that, in node_ops_virtual_task::wait, while gathering children
from the whole cluster, we may hit the connection exception - because
a node is still in topology, even though it is down.

Modify the get_children method to ignore the exception and warn
about the failure instead.

Keep token_metadata_ptr in get_children to prevent topology from changing.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-867

Needs backports to all versions

Closes scylladb/scylladb#29035

* github.com:scylladb/scylladb:
  tasks: fix indentation
  tasks: do not fail the wait request if rpc fails
  tasks: pass token_metadata_ptr to task_manager::virtual_task::impl::get_children

(cherry picked from commit 2e47fd9f56)

Closes scylladb/scylladb#29193
2026-04-16 21:57:08 +03:00
Pavel Emelyanov
2d1fdce790 object_storage_endpoint_param: Make it formattable for real
Currently the formatter converts it to json and then tries to emit into
the output context with the "...{{}}" format string. The intent was to
have the "...{<json text>}" output. However, the double curly brace in
format string means "print a curly brace", so the output of the above
formatting is "...{}", literally.

Fix by keeping a single curly brace. The "<json text>" thing will have
its own surrounding curly braces.

Fixes #27718

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #27687

(reworked version of commit a6618f2, the formatter is in db/config.cc)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#27733
2026-04-16 21:54:23 +03:00
Pavel Emelyanov
1e0487bd57 table: Add formatter for group_id argument in tablet merge exception message
Fixes: SCYLLADB-1432

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#29143

(cherry picked from commit 78f5bab7cf)

Closes scylladb/scylladb#29412

Closes scylladb/scylladb#29453
2026-04-16 10:57:37 +03:00
Michał Chojnowski
18fc2eff31 test: add a missing reconnect_driver in test_sstable_compression_dictionaries_upgrade.py
Need to work around https://github.com/scylladb/python-driver/issues/295,
lest a CQL query fail spuriously after the cluster restart.

Fixes: SCYLLADB-1114

Closes scylladb/scylladb#29118

(cherry picked from commit 6b18d95dec)

Closes scylladb/scylladb#29146

Closes scylladb/scylladb#29366
2026-04-16 10:56:59 +03:00
Botond Dénes
340369a4d4 Merge 'Alternator: add per-table batch latency metrics and test coverage' from Amnon Heiman
This series fixes a metrics visibility gap in Alternator and adds regression coverage.

Until now, BatchGetItem and BatchWriteItem updated global latency histograms but did not consistently update per-table latency histograms. As a result, table-level latency dashboards could miss batch traffic.

It updates the batch read/write paths to compute request duration once and record it in both global and per-table latency metrics.

Add the missing tests, including a metric-agnostic helper and a dedicated per-table latency test that verifies latency counters increase for item and batch operations.

This change is metrics-only (no API/behavior change for requests) and improves observability consistency between global and per-table views.

Fixes #28721

**We assume the alternator per-table metrics exist, but the batch ones are not updated**

Closes scylladb/scylladb#28732

* github.com:scylladb/scylladb:
  test(alternator): add per-table latency coverage for item and batch ops
  alternator: track per-table latency for batch get/write operations

(cherry picked from commit 035aa90d4b)

Closes scylladb/scylladb#29067

Closes scylladb/scylladb#29365
2026-04-16 10:56:16 +03:00
Pavel Emelyanov
9041f70f34 s3: Don't rearm credential timers when credentials are not refreshed
The update_credentials_and_rearm() may get "empty" credentials from
_creds_provider_chain.get_aws_credentials() -- it doesn't throw, but
returns default-initialized value. In that case the expires_at will be
set to time_point::min, and it's probably not a good idea to arm the
refresh timer and, even worse idea, to subtract 1h from it.

Fixes #29056

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#29057

(cherry picked from commit 961fc9e041)

Closes scylladb/scylladb#29158

Closes scylladb/scylladb#29364
2026-04-16 10:55:25 +03:00
Nikos Dragazis
7ed772866e scylla_swap_setup: Remove Before=swap.target dependency from swap unit
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes #26519.
Fixes SCYLLADB-1257

[1] https://github.com/scylladb/scylla-machine-image/pull/426
[2] https://github.com/canonical/cloud-init/pull/1213#issuecomment-1026065501

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb/scylladb#28504

(cherry picked from commit 6d50e67bd2)

Closes scylladb/scylladb#29339

Closes scylladb/scylladb#29354
2026-04-16 10:53:52 +03:00
Andrzej Jackowski
7b97fe4a92 reader_concurrency_semaphore: fix leak workaround
`e4da0afb8d5491bf995cbd1d7a7efb966c79ac34` introduces a protection
against resources that are "made up" of thin air to
`reader_concurrency_semaphore`. If there are more `_resources` than
the `_initial_resources`, it means there is a negative leak, and
`on_internal_error_noexcept` is called. In addition to it,
`_resources` is set to `std::max(_resources, _initial_resources)`.

However, the commit message of `e4da0afb8d5491bf995cbd1d7a7efb966c79ac34`
states the opposite: "The detection also clamps the
_resources to _initial_resources, to prevent any damage".

Before this commit, the protection mechanism doesn't clamp
`_resources` to `_initial_resources` but instead keeps `_resources` high,
possibly even indefinitely growing. This commit changes `std::max` to
`std::min` to make the code behave as intended.

Fixes: SCYLLADB-1014
Refs: SCYLLADB-163

Closes scylladb/scylladb#28982

(cherry picked from commit 9247dff8c2)

Closes scylladb/scylladb#28988

Closes scylladb/scylladb#29196
2026-04-15 11:53:42 +02:00
Jenkins Promoter
ba3b7360e0 Update pgo profiles - aarch64 2026-04-15 04:44:13 +03:00
Jenkins Promoter
06e9ecab9b Update pgo profiles - x86_64 2026-04-15 03:56:26 +03:00
Avi Kivity
c6d356e7cc Merge '[Backport 2025.4] vector_search: fix race condition on connection timeout' from Scylladb[bot]
vector_search: fix race condition on connection timeout

When a `with_connect` operation timed out, the underlying connection
attempt continued to run in the reactor. This could lead to a crash
if the connection was established/rejected after the client object had
already been destroyed. This issue was observed during the teardown
phase of a upcoming high-availability test case.

This commit fixes the race condition by ensuring the connection attempt
is properly canceled on timeout.

 Additionally, the explicit TLS handshake previously forced during the
 connection is now deferred to the first I/O operation, which is the
 default and preferred behavior.

 Fixes: SCYLLADB-832

Backports to 2026.1 and 2025.4 are required, as this issue also exists on those branches and is causing CI flakiness.

- (cherry picked from commit 3107d9083e)

Parent PR: #29031

Closes scylladb/scylladb#29360

* github.com:scylladb/scylladb:
  vector_search: test: fix flaky test
  vector_search: fix race condition on connection timeout
2026-04-12 14:24:57 +03:00
Botond Dénes
30c2f03749 Merge 'cql3: fix null handling in data_value formatting' from Dario Mirovic
`data_value::to_parsable_string()` crashes with a null pointer dereference when called on a `null` data_value. Return `"null"` instead.

Added tests after the fix. Manually checked that tests fail without the fix.

Fixes SCYLLADB-1350

This is a fix that prevents format crash. No known occurrence in production, but backport is desirable.

Closes scylladb/scylladb#29262

* github.com:scylladb/scylladb:
  test: boost: test null data value to_parsable_string
  cql3: fix null handling in data_value formatting

(cherry picked from commit 816f2bf163)

Closes scylladb/scylladb#29384

Closes scylladb/scylladb#29434
2026-04-12 14:23:23 +03:00
Pavel Emelyanov
17075bf3f9 Merge 'encryption: fix deadlock in encrypted_data_source::get()' from Ernest Zaslavsky
When encrypted_data_source::get() caches a trailing block in _next, the next call takes it directly — bypassing input_stream::read(), which checks _eof. It then calls input_stream::read_exactly() on the already-drained stream. Unlike read(), read_up_to(), and consume(), read_exactly() does not check _eof when the buffer is empty, so it calls _fd.get() on a source that already returned EOS.

In production this manifested as stuck encrypted SSTable component downloads during tablet restore: the underlying chunked_download_source hung forever on the post-EOS get(), causing 4 tablets to never complete. The stuck files were always block-aligned sizes (8k, 12k) where _next gets populated and the source is fully consumed in the same call.

Fix by checking _input.eof() before calling read_exactly(). When the stream already reached EOF, buf2 is known to be empty, so the call is skipped entirely.

A comprehensive test is added that uses a strict_memory_source which fails on post-EOS get(), reproducing the exact code path that caused the production deadlock.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1128

Backport to 2025.3/4 and 2026.1 is needed since it fixes a bug that may bite us in production, to be on the safe side

Closes scylladb/scylladb#29110

* github.com:scylladb/scylladb:
  encryption: fix deadlock in encrypted_data_source::get()
  Fix formatting after previous patch
  Fix indentation after previous patch

(cherry picked from commit 3b9398dfc8)

Closes scylladb/scylladb#29198

Closes scylladb/scylladb#29359
2026-04-12 14:20:37 +03:00
Patryk Jędrzejczak
a2c23793ab raft_group0: join_group0: fix join hang when node joins group 0 before post_server_start
A joining node hung forever if the topology coordinator added it to the
group 0 configuration before the node reached `post_server_start`. In
that case, `server->get_configuration().contains(my_id)` returned true
and the node broke out of the join loop early, skipping
`post_server_start`. `_join_node_group0_started` was therefore never set,
so the node's `join_node_response` RPC handler blocked indefinitely.
Meanwhile the topology coordinator's `respond_to_joining_node` call
(which has no timeout) hung forever waiting for the reply that never came.

Fix by only taking the early-break path when not starting as a follower
(i.e. when the node is the discovery leader or is restarting). A joining
node must always reach `post_server_start`.

We also provide a regression test. It takes 6s in dev mode.

Fixes SCYLLADB-959

Closes scylladb/scylladb#29266

(cherry picked from commit b9f82f6f23)

Closes scylladb/scylladb#29291

Closes scylladb/scylladb#29308
scylla-2025.4.7-candidate-20260412022226 scylla-2025.4.7
2026-04-09 15:53:43 +02:00
Jenkins Promoter
177996a385 Update ScyllaDB version to: 2025.4.7 2026-04-09 15:52:03 +03:00
Karol Nowacki
f5111bfc9b vector_search: test: fix flaky test
The test assumes that the sleep duration will be at least the value of
the sleep parameter. However, the actual sleep time can be slightly less
than requested (e.g., a 100ms sleep request might result in a 99ms
sleep).

This commit adjusts the test's time comparison to be more lenient,
preventing test flakiness.
2026-04-09 13:41:23 +02:00
Karol Nowacki
6b5de6394b vector_search: fix race condition on connection timeout
When a `with_connect` operation timed out, the underlying connection
attempt continued to run in the reactor. This could lead to a crash
if the connection was established/rejected after the client object had
already been destroyed. This issue was observed during the teardown
phase of a upcoming high-availability test case.

This commit fixes the race condition by ensuring the connection attempt
is properly canceled on timeout.

Additionally, the explicit TLS handshake previously forced during the
connection is now deferred to the first I/O operation, which is the
default and preferred behavior.

Fixes: SCYLLADB-832
2026-04-09 13:19:34 +02:00
Andrzej Jackowski
2b58d396e7 test: use exclusive driver connection in test_limited_concurrency_of_writes
Use get_cql_exclusive(node1) so the driver only connects to node1 and
never attempts to contact the stopped node2. The test was flaky because
the driver received `Host has been marked down or removed` from node2.

Fixes: SCYLLADB-1227

Closes scylladb/scylladb#29268

(cherry picked from commit ab43420d30)

Closes scylladb/scylladb#29278

Closes scylladb/scylladb#29355
2026-04-07 14:22:48 +03:00
Botond Dénes
0aa03677b5 test/cluster: fix flaky test_cleanup_stop by using asyncio.sleep
The test was using time.sleep(1) (a blocking call) to wait after
scheduling the stop_compaction task, intending to let it register on
the server before releasing the sstable_cleanup_wait injection point.

However, time.sleep() blocks the asyncio event loop entirely, so the
asyncio.create_task(stop_compaction) task never gets to run during the
sleep. After the sleep, the directly-awaited message_injection() runs
first, releasing the injection point before stop_compaction is even
sent. By the time stop_compaction reaches Scylla, the cleanup has
already completed successfully -- no exception is raised and the test
fails.

Fix by replacing time.sleep(1) with await asyncio.sleep(1), which
yields control to the event loop and allows the stop_compaction task
to actually send its HTTP request before message_injection is called.

Fixes: SCYLLADB-834

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Closes scylladb/scylladb#29202

(cherry picked from commit 068a7894aa)

Closes scylladb/scylladb#29277

Closes scylladb/scylladb#29356
2026-04-07 14:22:27 +03:00
Emil Maskovsky
ba10314b16 raft: abort stale snapshot transfers when term changes
**The Bug**

Assertion failure: `SCYLLA_ASSERT(res.second)` in `raft/server.cc`
when creating a snapshot transfer for a destination that already had a
stale in-flight transfer.

**Root Cause**

If a node loses leadership and later becomes leader again before the next
`io_fiber` iteration, the old transfer from the previous term can remain
in `_snapshot_transfers` while `become_leader()` resets progress state.
When the new term emits `install_snapshot(dst)`, `send_snapshot(dst)`
tries to create a new entry for the same destination and can hit the
assertion.

**The Fix**

Abort all in-flight snapshot transfers in `process_fsm_output()` when
`term_and_vote` is persisted. A term/vote change marks existing transfers
as stale, so we clean them up before dispatching messages from that batch
and before any new snapshot transfer is started.

With cross-term cleanup moved to the term-change path, `send_snapshot()`
now asserts the within-term invariant that there is at most one in-flight
transfer per destination.

Fixes: SCYLLADB-862

Backport: The issue is reproducible in master, but is present in all
active branches.

Closes scylladb/scylladb#29092

(cherry picked from commit 9dad68e58d)

Closes scylladb/scylladb#29264

Closes scylladb/scylladb#29357
2026-04-07 14:22:06 +03:00
Raphael S. Carvalho
609181d0e3 mutation_compactor: Fix tombstone GC metrics to account for only expired
There are 3 metrics (that goes in every compaction_history entry):
total_tombstone_purge_attempt
total_tombstone_purge_failure_due_to_overlapping_with_memtable
total_tombstone_purge_failure_due_to_overlapping_with_uncompacting_sstable

When a tombstone is not expired (e.g. doesn't satisfy "gc_before" or
grace period), it can be currently accounted as failure due to
overlapping with either memtable or uncompacting sstable.
So those 2 last metrics have noise of *unexpired* tombstones.

What we should do is to only account for expired tombstones in all
those 3 metrics. We lose the info of knowing the amount of tombstones
processed by compaction, now we'll only know about the expired ones.
But those metrics were primarily added for explaining why expired
tombstones cannot be removed.
We could have alternatively added a new field
purge_failure_due_to_being_unexpired or something, but
it requires adding a new field to compaction_history.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-737.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#28669

(cherry picked from commit f33f324f77)

Closes scylladb/scylladb#28743
2026-04-06 17:53:56 +03:00
Piotr Dulikowski
bccc9be2b6 db: view: mutate_MV: don't hold keyspace ref across preemption
Currently, the view_update_generator::mutate_MV function acquires a
reference to the keyspace relevant to the operation, then it calls
max_concurrent_for_each and uses that reference inside the lambda passed
to that function. max_concurrent_for_each can preempt and there is no
mechanism that makes sure that the keyspace is alive until the view
updates are generated, so it is possible that the keyspace is freed by
the time the reference is used.

Fix the issue by precomputing the necessary information based on the
keyspace reference right away, and then passing that information by
value to the other parts of the code. It turns out that we only need to
know the replication factor of the datacenter and whether the keyspace
uses a network topology strategy.

Fixes: scylladb/scylladb#28925

Closes scylladb/scylladb#28928

(cherry picked from commit 42d70baad3)

Closes scylladb/scylladb#28968

Closes scylladb/scylladb#29095
2026-04-06 17:53:40 +03:00
Avi Kivity
d4c28ee317 Merge 'service_levels: mark v2 migration complete on empty legacy table' from Alex Dathskovsky
During raft-topology upgrade in 2026.1, service_level_controller::migrate_to_v2() returns early when system_distributed.service_levels is empty. This skips the service_level_version = 2 write, so the cluster is never marked as upgraded to service levels v2 even though there is no data to migrate. Subsequent upgrades may then fail the startup check which requires service_level_version == 2.
Remove the early return and let the migration commit the version marker even when there are no legacy service levels rows to copy.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1198

backport: should be backported to all versions that can be upgraded to 2026.2

Closes scylladb/scylladb#29333

* github.com:scylladb/scylladb:
  test/auth_cluster: cover empty legacy table in service level upgrade
  service_levels: mark v2 migration complete on empty legacy table

(cherry picked from commit 95e422db48)

Closes scylladb/scylladb#29352
2026-04-06 17:51:34 +03:00
Łukasz Paszkowski
e5bd2f8679 test/storage: harden out-of-space prevention tests around restart and disk-utilization transitions
The tests in test_out_of_space_prevention.py are flaky. Three issues contribute:

1. After creating/removing the blob file that simulates disk pressure,
   the tests immediately checked derived state (e.g., "compaction_manager
   - Drained") without first confirming the disk space monitor had detected
   the utilization change. Fix: explicitly wait for "Reached/Dropped below
   critical disk utilization level" right after creating/removing the blob
   file, before checking downstream effects.

2. Several tests called `manager.driver_connect()` or omitted reconnection
   entirely after `server_restart()` / `server_start()`. The pre-existing
   driver session can silently reconnect multiple times, causing subsequent
   CQL queries to fail. Fix: call `reconnect_driver()` after every node restart.
   Additionally, call `wait_for_cql_and_get_hosts()` where CQL is used afterward,
   to ensure all connection pools are established.

3. Some log assertions used marks captured before a restart, so they could
   match pre-restart messages or miss messages emitted in the correct post-restart
   window. Fix: refresh marks at the right points.

Apart from that, the patch fixes a typo: autotoogle -> autotoggle.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-655

Closes scylladb/scylladb#28626

(cherry picked from commit 826fd5d6c3)

Closes scylladb/scylladb#28967

Closes scylladb/scylladb#29197
2026-04-03 22:23:17 +03:00
Jenkins Promoter
5bdd6ca036 Update pgo profiles - aarch64 scylla-2025.4.6 scylla-2025.4.6-candidate-20260401033549 2026-04-01 04:35:14 +03:00
Jenkins Promoter
ff20b565aa Update pgo profiles - x86_64 2026-04-01 03:50:02 +03:00
Patryk Jędrzejczak
f5058803c6 locator: everywhere_replication_strategy: fix sanity_check_read_replicas when read_new is true
ERMs created in `calculate_vnode_effective_replication_map` have RF computed based
on the old token metadata during a topology change. The reading replicas, however,
are computed based on the new token metadata (`target_token_metadata`) when
`read_new` is true. That can create a mismatch for EverywhereStrategy during some
topology changes - RF can be equal to the number of reading replicas +-1. During
bootstrap, this can cause the
`everywhere_replication_strategy::sanity_check_read_replicas` check to fail in
debug mode.

We fix the check in this commit by allowing one more reading replica when
`read_new` is true.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1147

Closes scylladb/scylladb#29150

(cherry picked from commit 503a6e2d7e)

Closes scylladb/scylladb#29248

Closes scylladb/scylladb#29269
2026-03-31 10:24:30 +02:00
Michał Chojnowski
e693ebe552 test/boost/cache_algorithm_test: disable sstable compression to avoid giant index pages
The test intentionally creates huge index pages.
But since 5e7fb08bf3,
the index reader allocates a block of memory for a whole index page,
instead of incrementally allocating small pieces during index parsing.
This giant allocation causes the test to fail spuriously in CI sometimes.

Fix this by disabling sstable compression on the test table,
which puts a hard cap of 2000 keys per index page.

Fixes: SCYLLADB-1152

Closes scylladb/scylladb#29152

(cherry picked from commit f29525f3a6)

Closes scylladb/scylladb#29172

Closes scylladb/scylladb#29259
2026-03-30 14:59:23 +02:00
Patryk Jędrzejczak
be942e9a4f test: test_remove_garbage_group0_members: wait for token ring and group0 consistency before removenode
The removenove initiator could have an outdated token ring (still considering
the node removed by the previous removenode a token owner) and unexpectedly
reject the operation.

Fix that by waiting for token ring and group0 consistency before removenode.
Note that the test already checks that consistency, but only for one node,
which is different from the removenode initiator.

This test has been removed in master together with the code being tested
(the gossip-based topology). Hence, the fix is submitted directly to 2026.1.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1103

Backport to all supported branches (other than 2026.1), as the test can fail
there.

Closes scylladb/scylladb#29108

(cherry picked from commit 1398a55d16)

Closes scylladb/scylladb#29205
2026-03-24 16:09:02 +01:00
Pavel Emelyanov
e212762ab7 database: Rate limit all tokens from a range
The limiter scans ranges to decide whether or not to rate-limit the
query. However, when considering each range only the front one's token
is accounted. This looks like a misprint.

The limiter was introduced in cc9a2ad41f

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#29050

(cherry picked from commit 8b1ca6dcd6)

Closes scylladb/scylladb#29107

Closes scylladb/scylladb#29194
2026-03-24 16:04:01 +02:00
Botond Dénes
a41d1ec711 Merge 'doc: fix the installation section' from Anna Stuchlik
This PR fixes the Installation page:

- Replaces `http `with `https `in the download command.
- Replaces the Open Source example from the Installation section for CentOS (we overlooked this example before).

Fixes https://github.com/scylladb/scylladb/issues/29087
Fixes https://github.com/scylladb/scylladb/issues/29087

This update affects all supported versions and should be backported as a bug fix.

Closes scylladb/scylladb#29088

* github.com:scylladb/scylladb:
  doc: remove the Open Source Example from Installation
  doc: replace http with https in the installation instructions

(cherry picked from commit e8b37d1a89)

Closes scylladb/scylladb#29135

Closes scylladb/scylladb#29192
2026-03-23 23:50:15 +02:00
Yaron Kaikov
a4b4c4c0a8 .github/workflows/trigger-scylla-ci: fix heredoc injection in trigger-scylla-ci workflow
Move all ${{ }} expression interpolations into env: blocks so they are
passed as environment variables instead of being expanded directly into
shell scripts. This prevents an attacker from escaping the heredoc in
the Validate Comment Trigger step and executing arbitrary commands on
the runner.

The Verify Org Membership step is hardened in the same way for
defense-in-depth.

Refs: GHSA-9pmq-v59g-8fxp
Fixes: SCYLLADB-954

Closes scylladb/scylladb#28935

(cherry picked from commit 977bdd6260)

Closes scylladb/scylladb#28947
2026-03-20 11:00:38 +02:00
Botond Dénes
155d12f4c9 mutation/collection_mutation: don't copy the serialized collection
serialize_collection_mutation() copies the serialized collection into
the returned collection_mutation object. Change to move to avoid the
copy.

Fixes: SCYLLADB-1041

Closes scylladb/scylladb#29010

(cherry picked from commit 15cfa5beeb)

Closes scylladb/scylladb#29024

Closes scylladb/scylladb#29037
2026-03-20 11:00:11 +02:00
Aleksandra Martyniuk
e78426c5d4 nodetool: cluster repair: do not fail if a table was dropped
nodetool cluster repair without additional params repairs all tablet
keyspaces in a cluster. Currently, if a table is dropped while
the command is running, all tables are repaired but the command finishes
with a failure.

Modify nodetool cluster repair. If a table wasn't specified
(i.e. all tables are repaired), the command finishes successfully
even if a table was dropped.

If a table was specified and it does not exist (e.g. because it was
dropped before the repair was requested), then the behavior remains
unchanged.

Fixes: SCYLLADB-568.

Closes scylladb/scylladb#28739

(cherry picked from commit 2e68f48068)

Closes scylladb/scylladb#29006

Closes scylladb/scylladb#29038
2026-03-20 10:59:26 +02:00
Anna Stuchlik
11248e5cef doc: update the warning about shared dictionary training
This commit updates the inadequate warning on the Advanced Internode (RPC) Compression page.

The warning is replaced with a note about how training data is encrypted.

Fixes https://github.com/scylladb/scylladb/issues/29109

Closes scylladb/scylladb#29111

(cherry picked from commit 88b98fac3a)

Closes scylladb/scylladb#29119

Closes scylladb/scylladb#29139
2026-03-20 10:58:59 +02:00
Avi Kivity
1f7dca0225 Merge 'Fix bad performance for densely populated partition index pages' from Tomasz Grabiec
This applies to small partition workload where index pages have high partition count, and the index doesn't fit in cache. It was observed that the count can be in the order of hundreds. In such a workload pages undergo constant population, LSA compaction, and LSA eviction, which has severe impact on CPU utilization.

Refs https://scylladb.atlassian.net/browse/SCYLLADB-620

This PR reduces the impact by several changes:

  - reducing memory footprint in the partition index. Assuming partition key size is 16 bytes, the cost dropped from 96 bytes to 36 bytes per partition.

  - flattening the object graph and amortizing storage. Storing entries directly in the vector. Storing all key values in a single managed_bytes. Making index_entry a trivial struct.

  - index entries and key storage are now trivially moveable, and batched inside vector storage
    so LSA migration can use memcpy(), which amortizes the cost per key. This reduces the cost of LSA segment compaction.

 - LSA eviction is now pretty much constant time for the whole page
   regardless of the number of entries, because elements are trivial and batched inside vectors.
   Page eviction cost dropped from 50 us to 1 us.

Performance evaluated with:

   scylla perf-simple-query -c1 -m200M --partitions=1000000

Before:

```
7774.96 tps (166.0 allocs/op, 521.7 logallocs/op,  54.0 tasks/op,  802428 insns/op,  430457 cycles/op,        0 errors)
7511.08 tps (166.1 allocs/op, 527.2 logallocs/op,  54.0 tasks/op,  804185 insns/op,  430752 cycles/op,        0 errors)
7740.44 tps (166.3 allocs/op, 526.2 logallocs/op,  54.2 tasks/op,  805347 insns/op,  432117 cycles/op,        0 errors)
7818.72 tps (165.2 allocs/op, 517.6 logallocs/op,  53.7 tasks/op,  794965 insns/op,  427751 cycles/op,        0 errors)
7865.49 tps (165.1 allocs/op, 513.3 logallocs/op,  53.6 tasks/op,  788898 insns/op,  425171 cycles/op,        0 errors)
```

After (+318%):

```
32492.40 tps (130.7 allocs/op,  12.8 logallocs/op,  36.1 tasks/op,  109236 insns/op,  103203 cycles/op,        0 errors)
32591.99 tps (130.4 allocs/op,  12.8 logallocs/op,  36.0 tasks/op,  108947 insns/op,  102889 cycles/op,        0 errors)
32514.52 tps (130.6 allocs/op,  12.8 logallocs/op,  36.0 tasks/op,  109118 insns/op,  103219 cycles/op,        0 errors)
32491.14 tps (130.6 allocs/op,  12.8 logallocs/op,  36.0 tasks/op,  109349 insns/op,  103272 cycles/op,        0 errors)
32582.90 tps (130.5 allocs/op,  12.8 logallocs/op,  36.0 tasks/op,  109269 insns/op,  102872 cycles/op,        0 errors)
32479.43 tps (130.6 allocs/op,  12.8 logallocs/op,  36.0 tasks/op,  109313 insns/op,  103242 cycles/op,        0 errors)
32418.48 tps (130.7 allocs/op,  12.8 logallocs/op,  36.1 tasks/op,  109201 insns/op,  103301 cycles/op,        0 errors)
31394.14 tps (130.7 allocs/op,  12.8 logallocs/op,  36.1 tasks/op,  109267 insns/op,  103301 cycles/op,        0 errors)
32298.55 tps (130.7 allocs/op,  12.8 logallocs/op,  36.1 tasks/op,  109323 insns/op,  103551 cycles/op,        0 errors)
```

When the workload is miss-only, with both row cache and index cache disabled (no cache maintenance cost):

  perf-simple-query -c1 -m200M --duration 6000 --partitions=100000 --enable-index-cache=0 --enable-cache=0

Before:

```
9124.57 tps (146.2 allocs/op, 789.0 logallocs/op,  45.3 tasks/op,  889320 insns/op,  357937 cycles/op,        0 errors)
9437.23 tps (146.1 allocs/op, 789.3 logallocs/op,  45.3 tasks/op,  889613 insns/op,  357782 cycles/op,        0 errors)
9455.65 tps (146.0 allocs/op, 787.4 logallocs/op,  45.2 tasks/op,  887606 insns/op,  357167 cycles/op,        0 errors)
9451.22 tps (146.0 allocs/op, 787.4 logallocs/op,  45.3 tasks/op,  887627 insns/op,  357357 cycles/op,        0 errors)
9429.50 tps (146.0 allocs/op, 787.4 logallocs/op,  45.3 tasks/op,  887761 insns/op,  358148 cycles/op,        0 errors)
9430.29 tps (146.1 allocs/op, 788.2 logallocs/op,  45.3 tasks/op,  888501 insns/op,  357679 cycles/op,        0 errors)
9454.08 tps (146.0 allocs/op, 787.3 logallocs/op,  45.3 tasks/op,  887545 insns/op,  357132 cycles/op,        0 errors)
```

After (+55%):

```
14484.84 tps (150.7 allocs/op,   6.5 logallocs/op,  44.7 tasks/op,  396164 insns/op,  229490 cycles/op,        0 errors)
14526.21 tps (150.8 allocs/op,   6.5 logallocs/op,  44.8 tasks/op,  396401 insns/op,  228824 cycles/op,        0 errors)
14567.53 tps (150.7 allocs/op,   6.5 logallocs/op,  44.7 tasks/op,  396319 insns/op,  228701 cycles/op,        0 errors)
14545.63 tps (150.6 allocs/op,   6.5 logallocs/op,  44.7 tasks/op,  395889 insns/op,  228493 cycles/op,        0 errors)
14626.06 tps (150.5 allocs/op,   6.5 logallocs/op,  44.7 tasks/op,  395254 insns/op,  227891 cycles/op,        0 errors)
14593.74 tps (150.5 allocs/op,   6.5 logallocs/op,  44.7 tasks/op,  395480 insns/op,  227993 cycles/op,        0 errors)
14538.10 tps (150.8 allocs/op,   6.5 logallocs/op,  44.8 tasks/op,  397035 insns/op,  228831 cycles/op,        0 errors)
14527.18 tps (150.8 allocs/op,   6.5 logallocs/op,  44.8 tasks/op,  396992 insns/op,  228839 cycles/op,        0 errors)
```

Same as above, but with summary ratio increased from 0.0005 to 0.005 (smaller pages):

Before:

```
33906.70 tps (146.1 allocs/op,  83.6 logallocs/op,  45.1 tasks/op,  170553 insns/op,   98104 cycles/op,        0 errors)
32696.16 tps (146.0 allocs/op,  83.5 logallocs/op,  45.1 tasks/op,  170369 insns/op,   98405 cycles/op,        0 errors)
33889.05 tps (146.1 allocs/op,  83.6 logallocs/op,  45.1 tasks/op,  170551 insns/op,   98135 cycles/op,        0 errors)
33893.24 tps (146.1 allocs/op,  83.5 logallocs/op,  45.1 tasks/op,  170488 insns/op,   98168 cycles/op,        0 errors)
33836.73 tps (146.1 allocs/op,  83.6 logallocs/op,  45.1 tasks/op,  170528 insns/op,   98226 cycles/op,        0 errors)
33897.61 tps (146.0 allocs/op,  83.5 logallocs/op,  45.1 tasks/op,  170428 insns/op,   98081 cycles/op,        0 errors)
33834.73 tps (146.1 allocs/op,  83.5 logallocs/op,  45.1 tasks/op,  170438 insns/op,   98178 cycles/op,        0 errors)
33776.31 tps (146.3 allocs/op,  83.9 logallocs/op,  45.2 tasks/op,  170958 insns/op,   98418 cycles/op,        0 errors)
33808.08 tps (146.3 allocs/op,  83.9 logallocs/op,  45.2 tasks/op,  170940 insns/op,   98388 cycles/op,        0 errors)
```

After (+18%):

```
40081.51 tps (148.2 allocs/op,   4.4 logallocs/op,  45.0 tasks/op,  121047 insns/op,   82231 cycles/op,        0 errors)
40005.85 tps (148.6 allocs/op,   4.4 logallocs/op,  45.2 tasks/op,  121327 insns/op,   82545 cycles/op,        0 errors)
39816.75 tps (148.3 allocs/op,   4.4 logallocs/op,  45.1 tasks/op,  121067 insns/op,   82419 cycles/op,        0 errors)
39953.11 tps (148.1 allocs/op,   4.4 logallocs/op,  45.0 tasks/op,  121027 insns/op,   82258 cycles/op,        0 errors)
40073.96 tps (148.2 allocs/op,   4.4 logallocs/op,  45.0 tasks/op,  121006 insns/op,   82313 cycles/op,        0 errors)
39882.25 tps (148.2 allocs/op,   4.4 logallocs/op,  45.0 tasks/op,  120925 insns/op,   82320 cycles/op,        0 errors)
39916.08 tps (148.3 allocs/op,   4.4 logallocs/op,  45.1 tasks/op,  121054 insns/op,   82393 cycles/op,        0 errors)
39786.30 tps (148.2 allocs/op,   4.4 logallocs/op,  45.0 tasks/op,  121027 insns/op,   82465 cycles/op,        0 errors)
38662.45 tps (148.3 allocs/op,   4.4 logallocs/op,  45.0 tasks/op,  121108 insns/op,   82312 cycles/op,        0 errors)
39849.42 tps (148.3 allocs/op,   4.4 logallocs/op,  45.1 tasks/op,  121098 insns/op,   82447 cycles/op,        0 errors)
```

Closes scylladb/scylladb#28603

* github.com:scylladb/scylladb:
  sstables: mx: index_reader: Optimize parsing for no promoted index case
  vint: Use std::countl_zero()
  test: sstable_partition_index_cache_test: Validate scenario of pages with sparse promoted index placement
  sstables: mx: index_reader: Amoritze partition key storage
  managed_bytes: Hoist write_fragmented() to common header
  utils: managed_vector: Use std::uninitialized_move() to move objects
  sstables: mx: index_reader: Keep promoted_index info next to index_entry
  sstables: mx: index_reader: Extract partition_index_page::clear_gently()
  sstables: mx: index_reader: Shave-off 16 bytes from index_entry by using raw_token
  sstables: mx: index_reader: Reduce allocation_section overhead during index page parsing by batching allocation
  sstables: mx: index_reader: Keep index_entry directly in the vector
  dht: Introduce raw_token
  test: perf_simple_query: Add 'sstable-format' command-line option
  test: perf_simple_query: Add 'sstable-summary-ratio' command-line option
  test: perf-simple-query: Add option to disable index cache
  test: cql_test_env: Respect enable-index-cache config

(cherry picked from commit 5e7fb08bf3)

Closes scylladb/scylladb#29136

Closes scylladb/scylladb#29140
2026-03-20 10:58:26 +02:00
Piotr Dulikowski
6b9aa303d8 Merge '[Backport 2026.1] mv: allow skipping view updates when a collection is unmodified' from Scylladb[bot]
mv: allow skipping view updates when a collection is unmodified
When we generate view updates, we check whether we can skip the
entire view update if all columns selected by the view are unmodified.
However, for collection columns, we only check if they were unset
before and after the update.
In this patch we add a check for the actual collection contents.
We perform this check for both virtual and non-virtual selections.
When the column is only a virtual column in the view, it would be
enough to check the liveness of each collection cell, however for
that we'd need to deserialize the entire collection anyway, which
should be effectively as expensive as comparing all of its bytes.

Fixes: SCYLLADB-996

- (cherry picked from commit 01ddc17ab9)

Parent PR: #28839

Closes scylladb/scylladb#28977

* github.com:scylladb/scylladb:
  Merge 'mv: allow skipping view updates when a collection is unmodified' from Wojciech Mitros
  mv: remove dead code in view_updates::can_skip_view_updates

Closes scylladb/scylladb#29094
2026-03-18 10:41:50 +01:00
Patryk Jędrzejczak
3863dfbc0a test: test_raft_no_quorum: decrease group0_raft_op_timeout_in_ms after quorum loss
`test_raft_no_quorum.py::test_cannot_add_new_node` is currently flaky in dev
mode. The bootstrap of the first node can fail due to `add_entry()` timing
out (with the 1s timeout set by the test case).

Other test cases in this test file could fail in the same way as well, so we
need a general fix. We don't want to increase the timeout in dev mode, as it
would slow down the test. The solution is to keep the timeout unchanged, but
set it only after quorum is lost. This prevents unexpected timeouts of group0
operations with almost no impact on the test running time.

A note about the new `update_group0_raft_op_timeout` function: waiting for
the log seems to be necessary only for
`test_quorum_lost_during_node_join_response_handler`, but let's do it
for all test cases just in case (including `test_can_restart` that shouldn't
be flaky currently).

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-913

Closes scylladb/scylladb#28998

(cherry picked from commit 526e5986fe)

Closes scylladb/scylladb#29068

Closes scylladb/scylladb#29097
2026-03-18 10:15:34 +01:00
Tomasz Grabiec
0c786045ff Merge 'service: assert that tables updated via group0 use schema commitlog' from Aleksandra Martyniuk
Set enable_schema_commitlog for each group0 tables.

Assert that group0 tables use schema commitlog in ensure_group0_schema
(per each command).

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-914.

Needs backport to all live releases as all are vulnerable

Closes scylladb/scylladb#28876

* github.com:scylladb/scylladb:
  test: add test_group0_tables_use_schema_commitlog
  db: service: remove group0 tables from schema commitlog schema initializer
  service: ensure that tables updated via group0 use schema commitlog
  db: schema: remove set_is_group0_table param

(cherry picked from commit b90fe19a42)

Closes scylladb/scylladb#28916

Closes scylladb/scylladb#28986
2026-03-17 17:29:36 +01:00