From b6ebbbf036ad555de5bbcf666ffd147e8197d8ac Mon Sep 17 00:00:00 2001
From: "Raphael S. Carvalho"
Date: Wed, 1 Apr 2026 20:23:52 -0300
Subject: [PATCH] test/cluster/test_tablets2: Fix test_split_stopped_on_shutdown race with stale log messages

The test was failing because the call to:

    await log.wait_for('Stopping.*ongoing compactions')

was missing the 'from_mark=log_mark' argument. The log mark was updated
(line: log_mark = await log.mark()) immediately after detecting
'splitting_mutation_writer_switch_wait: waiting', and just before
launching the shutdown task. However, the wait_for call on the following
line was scanning from the beginning of the log, not from that mark. As a
result, the search immediately matched old 'Stopping N tasks for N
ongoing compactions for table system.X due to table removal' messages
emitted during initial server bootstrap (for system.large_partitions,
system.large_rows, system.large_cells), rather than waiting for the
shutdown to actually stop the user-table split compaction.

This caused the test to prematurely send the message to the
'splitting_mutation_writer_switch_wait' injection. The split compaction
was unblocked before the shutdown had aborted it, so it completed
successfully. Since the split succeeded, 'Failed to complete splitting of
table' was never logged.

Meanwhile, 'storage_service_drain_wait' was blocking do_drain() waiting
for a message. With the split already done, the test was stuck waiting
for the expected failure log that would never come (600s timeout). At the
same time, after 60s the 'storage_service_drain_wait' injection timed out
internally, triggering on_internal_error() which -- with
--abort-on-internal-error=1 -- crashed the server (exit code -6).

Fix: pass from_mark=log_mark to the wait_for('Stopping.*ongoing
compactions') call so it only matches messages that appear after the
shutdown has started, ensuring the test correctly synchronizes with the
shutdown aborting the user-table split compaction before releasing the
injection.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1319.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Closes scylladb/scylladb#29311
---
 test/cluster/test_tablets2.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test/cluster/test_tablets2.py b/test/cluster/test_tablets2.py
index ce7f56273d..cc1c8c534f 100644
--- a/test/cluster/test_tablets2.py
+++ b/test/cluster/test_tablets2.py
@@ -2279,7 +2279,7 @@ async def test_split_stopped_on_shutdown(manager: ManagerClient):
 
     shutdown_task = asyncio.create_task(manager.server_stop_gracefully(server.server_id))
 
-    await log.wait_for('Stopping.*ongoing compactions')
+    await log.wait_for('Stopping.*ongoing compactions', from_mark=log_mark)
     await manager.api.message_injection(server.ip_addr, "splitting_mutation_writer_switch_wait")
 
     await log.wait_for('storage_service_drain_wait: waiting', from_mark=log_mark)
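
Illustration (not part of the change): the stale-match behaviour described
in the commit message can be reproduced in isolation with a minimal sketch.
FakeLog below is a hypothetical in-memory stand-in, not the log helper used
by the test suite. Its find() is synchronous and returns immediately instead
of waiting, and the shutdown-time message text is made up for the example.
Only the search pattern and the bootstrap-time message shape are taken from
the commit message.

```python
import re


class FakeLog:
    """Hypothetical in-memory stand-in for a server log file."""

    def __init__(self):
        self.lines = []

    def append(self, line):
        self.lines.append(line)

    def mark(self):
        # A mark is simply the current end of the log.
        return len(self.lines)

    def find(self, pattern, from_mark=0):
        # Index of the first matching line at or after from_mark, else None.
        regex = re.compile(pattern)
        for index in range(from_mark, len(self.lines)):
            if regex.search(self.lines[index]):
                return index
        return None


log = FakeLog()

# Emitted during bootstrap, long before any shutdown is requested.
log.append("Stopping 1 tasks for 1 ongoing compactions for table "
           "system.large_rows due to table removal")
log.append("splitting_mutation_writer_switch_wait: waiting")

# The test takes its mark here, just before starting the shutdown.
log_mark = log.mark()

pattern = "Stopping.*ongoing compactions"

# Without a mark the search starts at line 0 and hits the bootstrap message.
stale_hit = log.find(pattern)

# With the mark nothing matches yet, because shutdown has not logged anything.
fresh_hit = log.find(pattern, from_mark=log_mark)

# Illustrative shutdown-time line (wording assumed for this sketch only).
log.append("Stopping 1 tasks for 1 ongoing compactions for table ks.t")
after_shutdown = log.find(pattern, from_mark=log_mark)
```

In this sketch the unmarked search returns the bootstrap line, while the
marked search finds nothing until the shutdown-time line is appended, which
is the synchronization point the test needs.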