Introduce a new compaction_type enum value: `Major`.
This type will be used by the next patches to differentiate between
major compaction and regular compaction (compaction_type::Compaction).
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
This is a follow-up to the previous fix: https://github.com/scylladb/scylladb/pull/26030
The test test_user_writes_rejection starts a 3-node cluster and
creates a large file on one of the nodes, to trigger the out-of-space
prevention mechanism, which should reject writes on that node.
It waits for the log message 'Setting critical disk utilization mode: true'
and then executes a write expecting the node to reject it.
Currently, the message is logged before the `_critical_disk_utilization`
variable is actually updated. This causes the test to fail sporadically
if it runs quickly enough.
The fix splits the logging into two steps:
1. "Asked to set critical disk utilization mode" - logged before any action
2. "Set critical disk utilization mode" - logged after `_critical_disk_utilization` has been updated
The tests are updated to wait for the second message.
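The ordering guarantee can be sketched as follows (a minimal, self-contained Python illustration; the actual ScyllaDB code is C++). Because the flag is updated between the two log messages, any test that waits for the second message cannot observe a stale flag:

```python
import logging

logger = logging.getLogger("storage_service")

class DiskSpaceMonitor:
    """Illustrative sketch only, not the real implementation."""
    def __init__(self):
        self._critical_disk_utilization = False

    def set_critical_mode(self, value: bool) -> None:
        # Step 1: logged before any action is taken.
        logger.info("Asked to set critical disk utilization mode: %s", value)
        self._critical_disk_utilization = value
        # Step 2: logged only after the flag is updated, so a waiter
        # that sees this message is guaranteed to see the new value.
        logger.info("Set critical disk utilization mode: %s", value)

monitor = DiskSpaceMonitor()
monitor.set_critical_mode(True)
assert monitor._critical_disk_utilization
```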
Fixes https://github.com/scylladb/scylladb/issues/26004
Closes scylladb/scylladb#26392
The test starts a 3-node cluster and immediately creates a big file
on the first node in order to trigger the out-of-space prevention
mechanism, which disables compaction, including SPLIT compaction.
In order to trigger a SPLIT compaction, a keyspace with 1 initial tablet
is created, followed by an ALTER statement with `tablets = {'min_tablet_count': 2}`.
This triggers a resize decision that should not be finalized, due to
compaction being disabled on the first node.
The test is flaky because the keyspace is created with RF=1, so there
is no guarantee that the tablet replica will be located on the first
node, the one with critical disk utilization. If it is not, the split
is finalized and the test fails, because it expects the split to be
blocked.
Change to RF=3. This ensures there is exactly one tablet replica on
each node, including the one with critical disk utilization, so the
SPLIT is blocked until the disk utilization on the first node drops
below the critical level.
Fixes: https://github.com/scylladb/scylladb/issues/25861
Closes scylladb/scylladb#26225
Due to missing functionality in PythonTest, `unshare` is never used
to mount volumes. As a consequence:
+ volumes are created with sudo, which is undesired
+ they are not cleaned up automatically
Even with that support in place, mounting volumes with `unshare`
would not work, as the HTTP server, the pool of clusters, and the
Scylla cluster manager are started outside of the new namespace.
Thus the cluster would have no access to volumes created with `unshare`.
The new approach, which works both with and without dbuild and does not
require sudo, uses the following three commands to mount a volume:
truncate -s 100M /tmp/mydevice.img
mkfs.ext4 /tmp/mydevice.img
fuse2fs /tmp/mydevice.img test/
Additionally, proper cleanup is performed: servers are stopped
gracefully and volumes are unmounted after the tests using them have
completed.
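The mount-plus-guaranteed-unmount lifecycle can be sketched as a context manager (a hypothetical illustration, not the actual fixture code; `ext4_volume` and its signature are invented for this sketch). It runs the three commands above and unmounts with `fusermount -u` even if the body raises:

```python
import subprocess
from contextlib import contextmanager
from pathlib import Path

@contextmanager
def ext4_volume(img: Path, mnt: Path, size: str = "100M"):
    """Mount a FUSE-backed ext4 volume without sudo; always clean up."""
    subprocess.run(["truncate", "-s", size, str(img)], check=True)
    subprocess.run(["mkfs.ext4", "-q", str(img)], check=True)
    mnt.mkdir(parents=True, exist_ok=True)
    subprocess.run(["fuse2fs", str(img), str(mnt)], check=True)
    try:
        yield mnt
    finally:
        # Unmount and remove the backing file even if a test failed.
        subprocess.run(["fusermount", "-u", str(mnt)], check=True)
        img.unlink()
```

A test fixture can then wrap this context manager so each test gets a fresh volume and teardown happens automatically.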
Fixes: https://github.com/scylladb/scylladb/issues/25906
Closes scylladb/scylladb#26065
The test starts a 3-node cluster and immediately creates a big file
on one of the nodes to trigger the out-of-space prevention mechanism,
which starts rejecting writes on that node. Then a write is executed,
and the test checks that it did not reach the node with critical disk
utilization but did reach the remaining nodes (it should, as RF=3 is set).
However, when no consistency level is specified, the default LOCAL_ONE
is used. This means that only one node is required to acknowledge the
write.
After the write, the test checks if the write
+ did NOT reach the node with critical disk utilization (works)
+ did reach the remaining nodes
This can cause the test to fail sporadically as the write might not
yet be on the last node.
Use CL=QUORUM instead.
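The arithmetic behind the fix (illustrative only): with RF=3, LOCAL_ONE is satisfied by a single acknowledgement, while QUORUM requires a strict majority, so the coordinator must wait for both healthy replicas before the write completes:

```python
def write_quorum(rf: int) -> int:
    # A quorum is a strict majority of the replicas.
    return rf // 2 + 1

rf = 3
print(write_quorum(rf))  # 2 acks needed under QUORUM
# Under LOCAL_ONE only 1 ack is needed, so with RF=3 the write may
# return before reaching the second healthy replica, letting the
# test's check race ahead of replication. QUORUM closes that race.
```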
Fixes: https://github.com/scylladb/scylladb/issues/26004
Closes scylladb/scylladb#26030
The storage submodule contains tests that require mounted volumes
to be executed. The volumes are created automatically with the
`volumes_factory` fixture.
The tests in this suite are executed with the custom launcher
`unshare -mr pytest`
Test scenarios (when one node reaches critical disk utilization):
1. Reject user table writes
2. Disable/Enable compaction
3. Reject split compactions
4. New split compactions not triggered
5. Abort tablet repair
6. Disable/Enable incoming tablet migrations
7. Restart a node while a tablet split is triggered