test: explicitly set compression algorithm in test_autoretrain_dict

When `test_autoretrain_dict` was originally written, the default
`sstable_compression_user_table_options` was `LZ4Compressor`. The
test assumed (correctly, at the time) that compression initially
doesn't use a trained dictionary. Later in the scenario, the test
switched to an algorithm that uses a dictionary.

However, the default `sstable_compression_user_table_options` is now
`LZ4WithDictsCompressor`, so the old assumption is no longer correct.
As a result, the assertion that data is initially not compressed well
can fail intermittently, depending on dictionary training timing.

To fix this, this commit explicitly sets `ZstdCompressor`
as the initial `sstable_compression_user_table_options`, ensuring that
the assumption that initial compression is without a dictionary
is always met.

Note: `ZstdCompressor` differs from the former default `LZ4Compressor`.
However, it's a better choice: the test aims to show the benefit of
using a dictionary, not the benefit of Zstd over LZ4 (and the test uses
`ZstdWithDictsCompressor` as the algorithm with the dictionary).
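
A minimal sketch of the effect the test relies on, assuming nothing about
ScyllaDB internals: it uses Python's stdlib `zlib` preset dictionary
(`zdict`) as a stand-in for Zstd dictionaries, and the blob and the
`compressed_size` helper are hypothetical, not taken from the test.

```python
import os
import zlib


def compressed_size(blob: bytes, zdict: bytes = b"") -> int:
    """Return the zlib-compressed size of blob, optionally primed with zdict."""
    if zdict:
        comp = zlib.compressobj(level=6, zdict=zdict)
    else:
        comp = zlib.compressobj(level=6)
    return len(comp.compress(blob) + comp.flush())


# Hypothetical data: a random blob is incompressible by itself, but compresses
# very well when the dictionary already contains the same bytes.
blob = os.urandom(4096)
without_dict = compressed_size(blob)
with_dict = compressed_size(blob, zdict=blob)

print(f"without dictionary: {without_dict} bytes")
print(f"with dictionary:    {with_dict} bytes")
```

The same shape applies to the test: the "not compressed well" assertion only
holds while no dictionary is in use, which is why the initial compressor has
to be pinned explicitly.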

Fixes: scylladb/scylladb#28204
(cherry picked from commit 9ffa62a986)
Andrzej Jackowski
2026-02-12 14:47:52 +01:00
committed by GitHub Action
parent 91bf817955
commit c46ae2c2ab

@@ -53,6 +53,9 @@ async def test_autoretrain_dict(manager: ManagerClient):
n_blobs = 1024
uncompressed_size = blob_size * n_blobs * rf
+# Start with compressor without a dictionary
+cfg = { "sstable_compression_user_table_options": "ZstdCompressor" }
logger.info("Bootstrapping cluster")
servers = await manager.servers_add(2, cmdline=[
'--logger-log-level=storage_service=debug',
@@ -61,7 +64,7 @@ async def test_autoretrain_dict(manager: ManagerClient):
'--sstable-compression-dictionaries-retrain-period-in-seconds=1',
'--sstable-compression-dictionaries-autotrainer-tick-period-in-seconds=1',
f'--sstable-compression-dictionaries-min-training-dataset-bytes={int(uncompressed_size/2)}',
-], auto_rack_dc="dc1")
+], auto_rack_dc="dc1", config=cfg)
logger.info("Creating table")
cql = manager.get_cql()