Commit Graph

17 Commits

Author SHA1 Message Date
Łukasz Paszkowski
3ef594f9eb test/storage: speed up out-of-space prevention tests by using smaller volumes
Tests in test_out_of_space_prevention.py spend a large fraction of
time creating a random “blob” file to cross the 0.8 critical disk
utilization threshold. With 100MB volumes this requires writing
~70–80MB of data, which is slow inside Docker/Podman-backed volumes.

Most tests only use ~11MB of data, so large volumes are unnecessary.
Reduce the test volume size to 20MB so the critical threshold is
reached at ~16MB and the blob file is much smaller.

This cuts ~5–6s per test.
2026-01-27 15:28:59 +01:00
Łukasz Paszkowski
0f86fc680c test/storage: reduce tablet load stats refresh interval to speed up OOS prevention tests
Set `--tablet-load-stats-refresh-interval-in-seconds=1` for this module’s
clusters applicable to all tests. This significantly reduces runtime
for the slowest cases:
- test_reject_split_compaction: 75.62s -> 23.04s
- test_split_compaction_not_triggered: 69.36s -> 22.98s
2026-01-27 15:28:59 +01:00
Andrei Chekun
cc5ac75d73 test.py: remove deprecated skip_mode decorator
Finishing the deprecation of the skip_mode function in favor of
pytest.mark.skip_mode. This PR is only cleaning and migrating leftover tests
that are still used and old way of skip_mode.

Closes scylladb/scylladb#28299
2026-01-25 18:17:27 +02:00
Tomasz Grabiec
baea12c9cb topology_coordinator, tablets: Fail draining operations when tablet migration fails due to critical disk utilization
Reaching critical disk utilization on destination means the draining
either caused it, or at least works against reliveing it. So it's
better to cancel those requests. In case of decommission, if critical
disk utilization was caused by it due to not enough capacity, aborting
decomission will bring capacity back to the system and rebalancing
will relieve critical disk utlization.
2026-01-18 15:36:07 +01:00
Tomasz Grabiec
5e6935f276 test: Use ManagerClient.{disable,enable}_tablet_balancing() 2026-01-13 00:38:00 +01:00
Łukasz Paszkowski
7bf26ece4d test_user_writes_rejection: Fix test flakiness caused by typo and non-local CL=ONE reads
The current code:
```
try:
   cql.execute(f"INSERT INTO {cf} (pk, t) VALUES (-1, 'x')", host=host[0], execution_profile=cl_one_profile).result()
except Exception:
   pass
```

contains a typo: `host=host[0]` which throws an exception becase Host
object is not subscriptable. The test does not fail because the except
block is too broad and suppresses all exceptions.

Fixing the typo alone is insufficient. The write still succeeds because
the remaining nodes are UP and the query uses CL=ONE, so no failure
should be expected.

Another source of flakiness is data verification:
```
SELECT * FROM {cf} WHERE pk = 0;
```

Even when a coordinator is explicitly provided, using CL=ONE does not
guarantee a local read. The coordinator may forward the read request to
another replica, causing the verification to fail nondeterministically.

This patch rewrites the tests to address these issues:
- Fix the typo: `host[0]` to `hosts[0]`
- Verify data using `MUTATION_FRAGMENTS({cf})` which guarantees a local
  read on the coordinator node
- Reconnect the driver after node restart

Fixes https://github.com/scylladb/scylladb/issues/27933

Closes scylladb/scylladb#27934
2026-01-09 13:42:05 +02:00
Andrei Chekun
c950c2e582 test.py: convert skip_mode function to pytest.mark
Function skip_mode works only on function and only in cluster test. This if OK
when we need to skip one test, but it's not possible to use it with pytestmark
to automatically mark all tests in the file. The goal of this PR is to migrate

skip_mode to be dynamic pytest.mark that can be used as ordinary mark.

Closes scylladb/scylladb#27853

[avi: apply to test/cluster/test_tablets.py::test_table_creation_wakes_up_balancer]
2026-01-08 21:55:16 +02:00
Łukasz Paszkowski
76b84b71d1 storage/test_out_of_space_prevention.py: Fix async/await bugs
- Add missing await keywords for async operations on s2_log.wait_for()
  and coord_log.wait_for()
- Fix incorrect regex: "compaction .* Split {cf}" → "compaction.*Split {cf}"
- The commit https://github.com/scylladb/scylladb/commit/f7324a4 demoted
  compaction start/end log messages to debug level. Hence add
  compaction=debug log messages to the following tests:
    test_split_compaction_not_triggered
    test_node_restart_while_tablet_split
    test_repair_failure_on_split_rejection

Fixes https://github.com/scylladb/scylladb/issues/27931

Closes scylladb/scylladb#27932
2026-01-01 14:24:30 +02:00
Botond Dénes
bfdd4f7776 Merge 'Synchronize incremental repair and tablet split' from Raphael Raph Carvalho
Split prepare can run concurrently with repair.

Consider this:

1) split prepare starts
2) incremental repair starts
3) split prepare finishes
4) incremental repair produces unsplit sstable
5) split is not happening on sstable produced by repair
        5.1) that sstable is not marked as repaired yet
        5.2) might belong to repairing set (has compaction disabled)
6) split executes
7) repairing or repaired set has unsplit sstable

If split was acked to coordinator (meaning prepare phase finished),
repair must make sure that all sstables produced by it are split.
It's not happening today with incremental repair because it disables
split on sstables belonging to repairing group. And there's a window
where sstables produced by repair belong to that group.

To solve the problem, we want the invariant where all sealed sstables
will be split.
To achieve this, streaming consumers are patched to produce unsealed
sstable, and the new variant add_new_sstable_and_update_cache() will
take care of splitting the sstable while it's unsealed.
If no split is needed, the new sstable will be sealed and attached.

This solution was also needed to interact nicely with out of space
prevention too. If disk usage is critical, split must not happen on
restart, and the invariant aforementioned allows for it, since any
unsplit sstable left unsealed will be discarded on restart.
The streaming consumer will fail if disk usage is critical too.

The reason interposer consumer doesn't fully solve the problem is
because incremental repair can start before split, and the sstable
being produced when split decision was emitted must be split before
attached. So we need a solution which covers both scenarios.

Fixes #26041.
Fixes #27414.

Should be backported to 2025.4 that contains incremental repair

Closes scylladb/scylladb#26528

* github.com:scylladb/scylladb:
  test: Add reproducer for split vs intra-node migration race
  test: Verify split failure on behalf of repair during critical disk utilization
  test: boost: Add failure_when_adding_new_sstable_test
  test: Add reproducer for split vs incremental repair race condition
  compaction: Fail split of new sstable if manager is disabled
  replica: Don't split in do_add_sstable_and_update_cache()
  streaming: Leave sstables unsealed until attached to the table
  replica: Wire add_new_sstables_and_update_cache() into intra-node streaming
  replica: Wire add_new_sstable_and_update_cache() into file streaming consumer
  replica: Wire add_new_sstable_and_update_cache() into streaming consumer
  replica: Document old add_sstable_and_update_cache() variants
  replica: Introduce add_new_sstables_and_update_cache()
  replica: Introduce add_new_sstable_and_update_cache()
  replica: Account for sstables being added before ACKing split
  replica: Remove repair read lock from maybe_split_new_sstable()
  compaction: Preserve state of input sstable in maybe_split_new_sstable()
  Rename maybe_split_sstable() to maybe_split_new_sstable()
  sstables: Allow storage::snapshot() to leave destination sstable unsealed
  sstables: Add option to leave sstable unsealed in the stream sink
  test: Verify unsealed sstable can be compacted
  sstables: Allow unsealed sstable to be loaded
  sstables: Restore sstable_writer_config::leave_unsealed
2025-12-23 07:28:56 +02:00
Łukasz Paszkowski
2cb9bb8f3a test_user_writes_rejection: Disable speculative retries
This test starts a 3-node cluster and creates a large blob file so that one
node reaches critical disk utilization, triggering write rejections on that
node. The test then writes data with CL=QUORUM and validates that the data:
- did not reach the critically utilized node
- did reach the remaining two nodes

By default, tables use speculative retries to determine when coordinators may
query additional replicas.

Since the validation uses CL=ONE, it is possible that an additional request
is sent to satisfy the consistency level. As a result:
- the first check may fail if the additional request is sent to a node that
  already contains data, making it appear as if data reached the critically
  utilized node
- the second check may fail if the additional request is sent to the critically
  utilized node, making it appear as if data did not reach the healthy node

The patch fixes the flakiness by disabling the speculative retries.

Fixes https://github.com/scylladb/scylladb/issues/27212

Closes scylladb/scylladb#27488
2025-12-19 09:39:09 +02:00
Raphael S. Carvalho
e3b9abdb30 test: Verify split failure on behalf of repair during critical disk utilization
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-12-12 17:01:18 -03:00
Lakshmi Narayanan Sreethar
4d442f48db compaction/compaction_descriptor: introduce compaction_type::Major
Introduce a new compaction_type enum : `Major`.
This type will be used by the next patches to differentiate between
major compaction and regular compaction (compaction_type::Compaction).

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-10-29 19:21:53 +05:30
Łukasz Paszkowski
7ec369b900 database: Log message after critical_disk_utilization mode is set
This is a follow-up of the previous fix: https://github.com/scylladb/scylladb/pull/26030

The test test_user_writes_rejection starts a 3-node cluster and
creates a large file on one of the nodes, to trigger the out-of-space
prevention mechanism, which should reject writes on that node.

It waits for the log message 'Setting critical disk utilization mode: true'
and then executes a write expecting the node to reject it.

Currently, the message is logged before the `_critical_disk_utilization`
variable is actually updated. This causes the test to fail sporadically
if it runs quickly enough.

The fix splits the logging into two steps:
1. "Asked to set critical disk utilization mode" - logged before any action
2) "Set critical disk utilization mode" - logged after `_critical_disk_utilization` has been updated

The tests are updated to wait for the second message.

Fixes https://github.com/scylladb/scylladb/issues/26004

Closes scylladb/scylladb#26392
2025-10-20 13:24:10 +03:00
Łukasz Paszkowski
62e27e0f77 test_out_of_space_prevention.py: Fix flaky test_node_restart_while_tablet_split test
The test starts a 3-node cluster and immediately creates a big file
on the first nodes in order to trigger the out of space prevention to
disable compaction, including the SPLIT compaction.

In order to trigger a SPLIT compaction, a keyspace with 1 initial tablet
is created followed by alter statement with `tablets = {'min_tablet_count': 2}`.
This triggers a resize decision that should not finalize due to
disabled compaction on the first node.

The test is flaky because, the keyspace is created with RF=1 and there
is no guarantee that the tablet replica will be located on the first node
with critical disk utilization. If that is not the case, the split
is finalized and the test fails, because it expect the split to be
blocked.

Change to RF=3. This ensures there is exactly one tablet replica on
each node, including the one with critical disk utilization. So SPLIT
is blocked until the disk utilization on the first node, drops below
the critical level.

Fixes: https://github.com/scylladb/scylladb/issues/25861

Closes scylladb/scylladb#26225
2025-09-25 11:54:48 +03:00
Łukasz Paszkowski
5f6df4eb97 test/storage: Properly mount/clear volumes
Due to a missing functionality in PythonTest, `unshare` is never used
to mount volumes. As a consequence:
+ volumes are created with sudo which is undesired
+ they are not cleared automatically

Even having the missing support in place, the approach with mounting
volumes with `unshare` would not work as http server, a pool of clusters,
and scylla cluster manager are started outside of the new namespace.
Thus cluster would have no access to volumes created with `unshare`.

The new approach that works with and without dbuild and does not require
sudo, uses the following three commands to mount a volume:

truncate -s 100M /tmp/mydevice.img
mkfs.ext4 /tmp/mydevice.img
fuse2fs /tmp/mydevice.img test/

Additionally, a proper cleanup is performed, i.e. servers are stopped
gracefully and and volumes are unmounted after the tests using them are
completed.

Fixes: https://github.com/scylladb/scylladb/issues/25906

Closes scylladb/scylladb#26065
2025-09-25 11:05:50 +03:00
Łukasz Paszkowski
29de947851 test_out_of_space_prevention.py: Fix flaky test_user_writes_rejection test
The test starts a 3-node cluster and immediately creates a big file
on one of the nodes, to trigger the out of space prevention to start
rejecting writes on this node. Then a write is executed and checked it
did not reach the node with critical disk utilization but reached
the remaining nodes (it should, RF=3 is set)

However, when not specified, a default LOCAL_ONE consistency level
is used. This means that only one node is required to acknowledge the
write.

After the write, the test checks if the write
+ did NOT reach the node with critical disk utilization (works)
+ did reach the remaning nodes

This can cause the test to fail sporadically as the write might not
yet be on the last node.

Use CL=QUORUM instead.

Fixes: https://github.com/scylladb/scylladb/issues/26004

Closes scylladb/scylladb#26030
2025-09-25 08:05:45 +03:00
Łukasz Paszkowski
e34deea50e tests/cluster: Add new storage tests
The storage submodule contains tests that require mounted volumes
to be executed. The volumes are created automatically with the
`volumes_factory` fixture.

The tests in this suite are executed with the custom launcher
`unshare -mr pytest`

Test scenarios (when one node reaches critical disk utilization):
1. Reject user table writes
2. Disable/Enabled compaction
3. Reject split compactions
4. New split compactions not triggered
5. Abort tablet repair
6. Disable/Enabled incoming tablet migrations
7. Restart a node while a tablet split is triggered
2025-08-29 14:56:13 +02:00