scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 00:02:37 +00:00

Author	SHA1	Message	Date
Aleksandr Bykov	8afdae24d2	test: fix flaky test_kill_coordinator_during_op The test hardcoded the expected number of coordinator elections (2, 3, 4, 5) for each phase. If a prior phase triggered an extra election, subsequent phases would wait for a count that was already reached or would never match. Fix by reading the current election count before each operation and expecting exactly one more, making each phase independent of prior history. Also add wait_for_no_pending_topology_transition() calls after each coordinator election to ensure the topology state machine has fully settled before proceeding with restarts and further operations. Decrease the failure detector timeout (failure_detector_timeout_in_ms) to 2000 ms on all test nodes so that coordinator crashes are detected faster, reducing test wallclock time and timeout-related flakiness. Enable raft_topology=trace logging on all test nodes to aid post-failure diagnosis. Add diagnostic logging in wait_new_coordinator_elected(). Fixes: SCYLLADB-1089 Closes scylladb/scylladb#29284	2026-04-30 21:27:56 +03:00
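The baseline-relative wait described in this commit can be sketched roughly as follows. This is a minimal illustration, not the test's actual code: `wait_for_one_more_election` and the `get_count` callable are hypothetical names, and the real helper (`wait_new_coordinator_elected()`) may have a different signature.

```python
import time
from typing import Callable


def wait_for_one_more_election(get_count: Callable[[], int], baseline: int,
                               timeout: float = 5.0, interval: float = 0.01) -> int:
    """Poll until the election counter has moved past `baseline`.

    `baseline` is read by the caller right before the operation that kills
    the coordinator, so earlier phases cannot influence the expectation.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = get_count()
        if current > baseline:
            return current
        time.sleep(interval)
    raise TimeoutError(f"election count did not move past {baseline}")
```

Each phase would read the counter, trigger the operation, and then call the helper with the value it just read, rather than comparing against a hardcoded total.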
Avi Kivity	795478fa7a	test: fix race window test flakiness from residual re-repair The test_incremental_repair_race_window_promotes_unrepaired_data test was still flaky because: 1. Only coordinator changes TO servers[1] were detected, but ANY coordinator change can trigger a residual re-repair that flushes memtables on all replicas and marks post-repair data as repaired. 2. Even without a coordinator change, the topology coordinator can initiate a residual re-repair when it sees tablets stuck in the repair stage after the servers[1] restart. This re-repair contaminates the repaired set with post-repair data, masking the compaction-merge bug the test detects. Fix by: - Broadening the coordinator check from == servers[1] to != coord - Adding re-repair detection (grep for 'Initiating tablet repair host=') at three points: post-restart, during the compaction poll, and after injection release - On retry, creating a completely fresh keyspace+table via _setup_table_for_race_window() so the new attempt starts with clean tablet metadata uncontaminated by prior re-repairs Fixes: SCYLLADB-1478	2026-04-30 18:40:18 +03:00
Avi Kivity	12d5e758ed	test: extract _setup_table_for_race_window helper for race window test Move the keyspace+table setup logic for test_incremental_repair_race_window_promotes_unrepaired_data into a dedicated helper function _setup_table_for_race_window(). The helper creates a fresh keyspace (unique name via create_new_test_keyspace), the table, configures STCS min_threshold=2, inserts baseline keys, runs repair 1, inserts keys for repair 2, and flushes. This is a pure refactor with no behavioral change: the test function now calls the helper once instead of inlining the setup. The extraction enables a subsequent commit to call the helper again on retry when a leadership transfer is detected.	2026-04-30 18:37:42 +03:00
Dario Mirovic	3875d79ac6	test: boost: regression test for loading_cache::insert with caching disabled Add two test cases for the new loading_cache::insert() method: * test_loading_cache_insert verifies that insert() populates the cache and invokes the loader exactly once per key when caching is enabled. * test_loading_cache_insert_caching_disabled is a regression test for SCYLLADB-1699: when the cache is constructed with expiry == 0 (caching disabled), insert() must be a no-op rather than asserting in loading_cache::get_ptr() via caching_enabled(). The loader must not be invoked and the cache must remain empty. Refs SCYLLADB-1699	2026-04-30 16:52:51 +02:00
Marcin Maliszkiewicz	b08e0c67e4	test/boost: add dummy case to table_helper_test for non-injection modes The only test requires SCYLLA_ENABLE_ERROR_INJECTION. In modes without it (e.g. release) the suite was empty, so pytest exited with code 5 ("no tests collected") and CI failed. Add a no-op case in that branch so collection always yields at least one test.	2026-04-30 11:45:12 +02:00
Marcin Maliszkiewicz	515b5722fd	test/boost: add regression test for table_helper insert() UAF Deterministic reproducer using an error injection point placed in table_helper::insert() between cache_table_info() and execute(). The test parks fiber A at the injection, drops the target table (evicting the prepared_statements_cache entry), runs fiber B which nulls _insert_stmt, then releases fiber A. Without the fix this crashes in execute(); with the fix fiber A holds a local strong ref and proceeds. Uses the new waiters() API to synchronize with fiber A's entry into the injection.	2026-04-30 11:45:12 +02:00
Ernest Zaslavsky	1febfbd9b5	test: rename sstable_tablet_streaming.cc to match the naming convention Apparently, a boost test MUST end with "_test" to be executed by test.py. Closes scylladb/scylladb#29693	2026-04-30 11:16:39 +03:00
Pavel Emelyanov	1ca97f0c0a	Merge 'test: fix disabled test handling and deduplicate CLI test arguments' from Evgeniy Naydanov - Revert the previous "test.py: fix test collection bug" commit (`92c09d10`) which worked around broken deduplication by filtering items without `BUILD_MODE` in `pytest_collection_modifyitems`. This approach masked the root cause and is superseded by the proper fixes below. - Backport pytest 9.0.3's argument normalization algorithm into `test.py` to work around broken deduplication in pytest 8.3.5 ([pytest-dev/pytest#12083](https://github.com/pytest-dev/pytest/issues/12083)). Duplicate or subsumed test paths (e.g. `test/cql` and `test/cql/lua_test.cql`) are collapsed before invoking pytest. Revert when upgrading to pytest 9.x. - Return a `DisabledFile` collector instead of an empty list in `pytest_collect_file` when all modes are disabled for a file, fixing a bug where subsequent files would not get their stash items set (`REPEATING_FILES`). Restructure `pytest_collect_file` to use a walrus operator (`if repeats := ...`) with a single `remove(file_path)` and `return collectors` at the end, eliminating the early return. - Add `--keep-duplicates` CLI argument to bypass deduplication and forward to pytest. - Move `RUN_ID` assignment from `pytest_collect_file` to `modify_pytest_item`. A shared `run_ids` cache (`defaultdict[tuple[str, str], count]`) is created in `pytest_collection_modifyitems` and passed to `modify_pytest_item`, keyed by `(build_mode, nodeid)` so each mode gets independent counters. This ensures unique run IDs even when `--keep-duplicates` causes the same file to be collected multiple times. - Fix `--repeat` option default from string `"1"` to int `1` — argparse only applies `type=` to CLI-parsed values, not defaults. pytest normally deduplicates overlapping test arguments — e.g. `test/cql test/cql/lua_test.cql` collects `lua_test.cql` only once. 
The original `test.py` never performed this deduplication, and the pytest version in the toolchain image (8.3.5) has a bug that breaks it ([pytest-dev/pytest#12083](https://github.com/pytest-dev/pytest/issues/12083).) Since we are moving to bare pytest, `test.py` should match pytest's default behavior: deduplicate. Because we cannot easily upgrade pytest, commit 2 backports the deduplication logic from pytest 9.0.3. To match pytest's interface, `--keep-duplicates` is added as an opt-out. This lets a user intentionally run overlapping paths — e.g. `./test.py test/blah test/blah/test_foo.py --keep-duplicates` runs `test_foo.py` twice. The flag is forwarded to pytest and also skips the backported deduplication in `test.py`. - Revert `92c09d10` which filtered items without `BUILD_MODE` in `pytest_collection_modifyitems` and added an early return in `CppFile.collect()`. This workaround is superseded by the proper deduplication and `DisabledFile` fixes. - Add `_CollectionArgument` dataclass (`order=True`, `__contains__` for subsumption) and `_deduplicate_test_args()` function, adapted from pytest 9.0.3. Marked with a TODO to remove once we update to pytest 9.x. - Call `_deduplicate_test_args()` on `options.name` before passing to pytest. - Add `DisabledFile(pytest.File)` that skips collection with an informative message instead of returning an empty list. - Restructure `pytest_collect_file` to use walrus operator: `if repeats := ...:` / `else:` — single `remove(file_path)` at end, no early return. - Add `--keep-duplicates` argument that skips deduplication and is forwarded to pytest. - Create a shared `run_ids` cache in `pytest_collection_modifyitems` and pass it to `modify_pytest_item`, which assigns unique sequential RUN_IDs via `itertools.count`. The cache is keyed by `(build_mode, nodeid)` so each mode gets independent counters. - Remove `RUN_ID` from `_STASH_KEYS_TO_COPY` — it is no longer set on collectors. - Remove `CppFile.run_id` cached_property. 
`CppTestCase` now reads `RUN_ID` from its own item stash. - Fix `--repeat` option default from `"1"` to `1` and drop redundant `int()` cast. Closes SCYLLADB-1730 Closes scylladb/scylladb#29665 * github.com:scylladb/scylladb: test: add --keep-duplicates and assign RUN_ID via shared cache test/pylib/runner: fix disabled file collection test.py: deduplicate CLI test arguments before passing to pytest Revert "test.py: fix test collection bug"	2026-04-30 07:58:25 +03:00
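The idea of collapsing duplicate or subsumed test paths can be illustrated with a simplified sketch. It is not the algorithm backported from pytest 9.0.3 and not the `_deduplicate_test_args()` function from this series: the name `deduplicate_paths` is hypothetical, and the sketch handles plain paths only, with no `::` node IDs.

```python
from pathlib import PurePosixPath


def deduplicate_paths(args: list[str]) -> list[str]:
    """Drop arguments that repeat, or live underneath, another argument."""
    kept: list[PurePosixPath] = []
    for raw in args:
        path = PurePosixPath(raw)
        # Skip this argument if a kept path is identical to it or contains it.
        if any(path == k or k in path.parents for k in kept):
            continue
        # A directory given later subsumes files from it that were kept earlier.
        kept = [k for k in kept if path not in k.parents]
        kept.append(path)
    return [str(p) for p in kept]
```

With this sketch, `test/cql test/cql/lua_test.cql` collapses to `test/cql` regardless of argument order. An opt-out such as `--keep-duplicates` would simply bypass the call.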
Wojciech Mitros	ebaf536449	replica/database: fix cross-shard deadlock in lock_tables_metadata() lock_tables_metadata() acquires a write lock on tables_metadata._cf_lock on every shard. It used invoke_on_all(), which dispatches lock acquisitions to all shards in parallel via parallel_for_each + smp::submit_to. When two fibers call lock_tables_metadata() concurrently, this can deadlock. parallel_for_each starts all iterations unconditionally: even when the local shard's lock attempt blocks (because the other fiber already holds it), SMP messages are still sent to remote shards. Both fibers' lock-acquisition messages land in the per-shard SMP queues. The SMP queue itself is FIFO, but process_incoming() drains it and schedules each item as a reactor task via add_task(), which, in debug and sanitize builds with SEASTAR_SHUFFLE_TASK_QUEUE, shuffles each newly added task against all pending tasks in the same scheduling group's reactor task queue. This means fiber A's lock acquisition can be reordered past fiber B's (and past unrelated tasks) on a given shard. If fiber A wins the lock on shard X while fiber B wins on shard Y, this creates a classic cross-shard lock-ordering deadlock (circular wait). In production builds without SEASTAR_SHUFFLE_TASK_QUEUE, the reactor task queue is FIFO. Still, even in release builds, the SMP queues can reorder messages, so the deadlock remains possible, even if much less likely. In debug and sanitize builds, the task-queue shuffle makes the deadlock very likely whenever both fibers' lock-acquisition tasks are pending simultaneously in the reactor task queue on any shard. This deadlock was exposed by `ce00d61917` ("db: implement large_data virtual tables with feature flag gating", merged as `88a8324e68`), which introduced legacy_drop_table_on_all_shards as a second caller of lock_tables_metadata(). When LARGE_DATA_VIRTUAL_TABLES is enabled during topology_state_load (via feature_service::enable), two fibers can race: 1. 
activate_large_data_virtual_tables() — calls legacy_drop_table_on_all_shards() which calls lock_tables_metadata() synchronously via .get() 2. reload_schema_in_bg() — fires as a background fiber from TABLE_DIGEST_INSENSITIVE_TO_EXPIRY, eventually reaches schema_applier::commit() which also calls lock_tables_metadata() If both reach lock_tables_metadata() while the lock is free on all shards, the parallel acquisition creates the deadlock opportunity. The deadlock blocks topology_state_load() from completing, which prevents the bootstrapping node from finishing its topology state transitions. The coordinator's topology coordinator then waits for the node to reach the expected state, but the node is stuck, so eventually the read_barrier times out after 300 seconds. Fix by acquiring the shard 0 lock first before attempting to acquire any other lock. Whichever fiber wins shard 0 is guaranteed to acquire all remaining shards before the other fiber can proceed past shard 0, eliminating the circular-wait condition. Tested manually with 2 approaches: 1. causing different shard locks to be acquired by different lock_tables_metadata() calls by adding different sleeps depending on the lock_tables_metadata() call and target shard - this reproduced the issue consistently 2. matching the time point at which both fibers reach lock_tables_metadata() adding a single sleep to one of the fibers - this heavily depends on the machine so we can't create a universal reproducer this way, but it did result in the observed failure on my machine after finding the right sleep time Also added a unit test for concurrent lock_tables_metadata() calls. Fixes: SCYLLADB-1694 Fixes: SCYLLADB-1644 Fixes: SCYLLADB-1684 Closes scylladb/scylladb#29678	2026-04-29 21:13:53 +02:00
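The fix's ordering rule (take the shard 0 lock first, then the rest) can be mimicked with a small asyncio analogy. This is only a sketch: it uses Python `asyncio.Lock` objects in place of Seastar per-shard locks, and `lock_all_shards`, `unlock_all_shards`, `worker`, and `demo` are hypothetical names.

```python
import asyncio


async def lock_all_shards(locks: list[asyncio.Lock]) -> None:
    # Shard 0 acts as the gate: competing callers are serialized here.
    await locks[0].acquire()
    # Only the holder of shard 0 gets this far, so the remaining locks
    # can safely be requested in parallel without a circular wait.
    await asyncio.gather(*(lock.acquire() for lock in locks[1:]))


def unlock_all_shards(locks: list[asyncio.Lock]) -> None:
    # Release shard 0 last so a waiter only proceeds once all others are free.
    for lock in reversed(locks):
        lock.release()


async def worker(name: str, locks: list[asyncio.Lock], log: list[str]) -> None:
    await lock_all_shards(locks)
    try:
        log.append(f"{name} start")
        await asyncio.sleep(0)  # yield while holding every lock
        log.append(f"{name} end")
    finally:
        unlock_all_shards(locks)


async def demo(num_shards: int = 4) -> list[str]:
    locks = [asyncio.Lock() for _ in range(num_shards)]
    log: list[str] = []
    # wait_for turns a would-be deadlock into a timeout error.
    await asyncio.wait_for(
        asyncio.gather(worker("A", locks, log), worker("B", locks, log)),
        timeout=5,
    )
    return log
```

Running `asyncio.run(demo())` should finish with the two critical sections recorded back to back instead of interleaved, because whichever caller wins shard 0 completes its acquisitions before the other can start.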
Marcin Maliszkiewicz	45b4834ac4	Merge 'audit: fix maintenance socket startup/shutdown ordering' from Andrzej Jackowski This series addresses three problems in the audit startup/shutdown sequence: 1. [BUG] Shutdown SIGABRT. During graceful shutdown, deferred stops run in reverse order of construction. With the audit service constructed after the maintenance socket, audit was destroyed first, and in-flight queries on the maintenance socket could hit the destroyed audit service (assertion failure in sharded::local()). 2. [BUG] Startup audit bypass. The maintenance socket opened before audit storage was initialized, allowing queries (e.g. creating a superuser) to bypass auditing in that window. 3. [PROBLEM] Blocks SCYLLADB-1430. The existing order prevents audit configuration from being driven by group0 state, because audit started before group0. The series is organized as: a test-helper refactor, a test for the audited maintenance-socket flow, a startup-phase split, the construction-order fix and its shutdown-race test, and finally the storage-before-socket fix and its startup-window test. Fixes SCYLLADB-1615 No backport, bugs don't seem severe enough to justify backporting. Closes scylladb/scylladb#29539 * github.com:scylladb/scylladb: audit: assert storage ordering invariants at runtime audit: start maintenance socket after audit storage audit: move audit construction before maintenance socket audit: split startup into construction and storage phases test: audit: verify maintenance socket operations are audited test: audit: parameterize source address in audit assertions	2026-04-29 10:37:38 +02:00
Łukasz Paszkowski	7e14ea5ac8	sstables: only wipe TemporaryHashes for sstable formats that have it Commit `8d34127684` ("sstables: clean up TemporaryHashes file in wipe()") unconditionally calls filename(..., component_type::TemporaryHashes) inside filesystem_storage::wipe(). However, the TemporaryHashes component is only registered in the component map of the 'ms' sstable format. For older formats (ka, la, mc, md, me) the lookup goes through sstable_version_constants::get_component_map(version).at(...) and throws std::out_of_range. The exception is then swallowed by the outer catch(...) in wipe(), which just logs and ignores. As a side effect, the subsequent remove_file(new_toc_name) is never reached and the TemporaryTOC ('*-TOC.txt.tmp') file is left as an orphan on disk after every unlink() of a non-'ms' sstable. Guard the lookup with get_component_map(version).contains() so the cleanup is only attempted for formats that actually define the component. Add a regression test in test/boost/sstable_directory_test.cc that creates an 'me'-format sstable, unlinks it and asserts that the sstable directory is left empty. Without the fix the test fails with a leftover 'me-...-TOC.txt.tmp' file. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1697 Closes scylladb/scylladb#29620	2026-04-29 08:06:36 +03:00
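How the swallowed exception skipped the later cleanup step, and how a membership guard avoids it, can be modeled in a few lines. This is a Python analogy of the C++ logic, with made-up component file names (only the `TOC.txt.tmp` suffix comes from the commit message) and a hypothetical `wipe` function that returns the files it would remove.

```python
# Hypothetical per-format component maps; only 'ms' defines TemporaryHashes.
COMPONENT_MAPS = {
    "me": {"TemporaryTOC": "TOC.txt.tmp"},
    "ms": {"TemporaryTOC": "TOC.txt.tmp", "TemporaryHashes": "TemporaryHashes.tmp"},
}


def wipe(version: str, guarded: bool) -> list[str]:
    """Return the temporary files that the cleanup manages to remove."""
    components = COMPONENT_MAPS[version]
    removed: list[str] = []
    try:
        if not guarded or "TemporaryHashes" in components:
            # Unguarded lookup raises KeyError for formats without the component.
            removed.append(components["TemporaryHashes"])
        removed.append(components["TemporaryTOC"])
    except Exception:
        # Mirrors the outer catch-all that logs and ignores the error.
        pass
    return removed
```

In this model, `wipe("me", guarded=False)` removes nothing, which corresponds to the orphaned TemporaryTOC file, while the guarded variant still removes it.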
Botond Dénes	809f12f988	Merge 'test/cluster/dtest: fix ScyllaNode state not persisting across nodelist() calls' from Benny Halevy `ScyllaCluster.nodelist()` creates new `ScyllaNode` objects on every call, so per-node state set via `set_smp()`, `set_log_level()`, and `_adjust_smp_and_memory()` was lost. This meant `set_smp()` had no effect when `cluster.start()` was called after it, since `start_nodes()` calls `nodelist()` internally which creates fresh nodes with default values. - Add debug logging for smp/memory in ScyllaNode - Store per-node settings (smp, memory, log levels) in a `ScyllaCluster._node_resources` dict keyed by server_id, so they survive `nodelist()` reconstruction. `ScyllaNode` restores its state from this dict on construction and saves it back whenever `set_smp()`, `set_log_level()`, or `_adjust_smp_and_memory()` modifies it. - Add a reproducer test verifying `set_smp()` takes effect on restart Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1629 -- No backport needed: this only fixes dtest infrastructure, no production code is affected. Closes scylladb/scylladb#29549 * github.com:scylladb/scylladb: test/cluster/dtest: add test for node.set_smp() persistence test/cluster/dtest: cache ScyllaNode instances in ScyllaCluster test/cluster/dtest/ccmlib/scylla_node: add debug logging	2026-04-29 06:25:36 +03:00
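The approach described in this merge, keeping per-node settings in a cluster-level dict so they survive `nodelist()` reconstruction, might look like the following simplified sketch. The `Node` and `Cluster` classes and the default of 2 are illustrative stand-ins, not the real `ScyllaNode` and `ScyllaCluster`.

```python
class Node:
    DEFAULT_SMP = 2  # illustrative default, not the real dtest value

    def __init__(self, cluster: "Cluster", server_id: int) -> None:
        self.cluster = cluster
        self.server_id = server_id
        # Restore any state saved by an earlier Node object for this server.
        saved = cluster.node_resources.get(server_id, {})
        self.smp = saved.get("smp", self.DEFAULT_SMP)

    def set_smp(self, smp: int) -> None:
        self.smp = smp
        # Save back to the cluster so the next nodelist() call sees it.
        self.cluster.node_resources.setdefault(self.server_id, {})["smp"] = smp


class Cluster:
    def __init__(self, server_ids: list[int]) -> None:
        self.server_ids = list(server_ids)
        self.node_resources: dict[int, dict[str, int]] = {}

    def nodelist(self) -> list[Node]:
        # Fresh Node objects on every call, as in the original behaviour.
        return [Node(self, sid) for sid in self.server_ids]
```

Calling `set_smp()` on a node from one `nodelist()` call is then visible on the node object returned by the next call.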
Evgeniy Naydanov	96d3f13245	test: add --keep-duplicates and assign RUN_ID via shared cache Add --keep-duplicates CLI argument to bypass deduplication and forward to pytest, allowing duplicate test file arguments to be collected multiple times. Move RUN_ID assignment from pytest_collect_file to modify_pytest_item. All File collectors for the same source file share a single run_ids dict (via RUN_ID_CACHE stash key), so items from duplicate collection arguments (e.g. with --keep-duplicates) automatically get unique IDs. Remove CppFile.run_id cached_property — CppTestCase now reads RUN_ID from its own item stash, which is set during modify_pytest_item. Fix --repeat option default from string "1" to int 1 — argparse only applies type= to CLI-parsed values, not defaults. Co-Authored-By: Claude Opus 4.6 (200K context) <noreply@anthropic.com>	2026-04-29 02:36:05 +00:00
Evgeniy Naydanov	497bd6b6c9	test/pylib/runner: fix disabled file collection Return a DisabledFile collector instead of an empty list when all modes are disabled for a file. Returning an empty list caused subsequent files to not get their stash items set because file_path was never removed from REPEATING_FILES. Co-Authored-By: Claude Opus 4.6 (200K context) <noreply@anthropic.com>	2026-04-29 02:36:05 +00:00
Evgeniy Naydanov	05f2c53931	Revert "test.py: fix test collection bug" This reverts commit `92c09d106d`.	2026-04-29 02:35:00 +00:00
Andrzej Jackowski	bc67dd0b82	audit: split startup into construction and storage phases The table-based audit backend needs Raft to create its keyspace, but the audit service must exist earlier so that CQL paths don't silently skip auditing. Split startup into two phases: construction and storage initialization. Queries arriving between the two phases are logged as errors. This is a refactoring commit and the split sections will be moved later in this patch series. Refs SCYLLADB-1615	2026-04-28 18:58:42 +02:00
Andrzej Jackowski	1616c71bf0	test: audit: verify maintenance socket operations are audited User creation via the maintenance socket should produce audit entries, as this is the recommended flow for creating the initial superuser when default credentials are disabled. The test is parametrized by audit backend (table and syslog). The maintenance socket source address is "::" because Seastar returns a zero-initialised in6_addr for AF_UNIX sockets. Test time in dev: 0.6s Refs SCYLLADB-1615	2026-04-28 18:42:39 +02:00
Avi Kivity	c4de2b3c9d	Merge 'test: fix flaky tablets test by using read barrier' from Aleksandra Martyniuk Some tests in test_tablets.py read system_schema.keyspaces from an arbitrary node that may not have applied the latest schema change yet. Pin the read to a specific node and issue a read barrier before querying, ensuring the node has up-to-date data. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1700 Test fix; no backport Closes scylladb/scylladb#29655 * github.com:scylladb/scylladb: test: fix flaky rack list conversion tests by using read barrier test: fix flaky test_enforce_rack_list_option by using read barrier	2026-04-28 17:15:59 +03:00
Ernest Zaslavsky	a97502920b	test: optimize compaction_strategy_cleanup_method for remote storage Parallelize SSTable creation using parallel_for_each. The file count is made a parameter with a default of 64, allowing future S3/GCS variants to use a smaller count if needed.	2026-04-28 16:59:38 +03:00
Ernest Zaslavsky	0b9a2844bd	test: optimize stcs_reshape_overlapping for remote storage Parallelize SSTable creation using parallel_for_each and reduce the SSTable count from 256 to 64 for S3/GCS variants. The local test variant retains the original 256 count.	2026-04-28 16:59:38 +03:00
Ernest Zaslavsky	ac89cffc9f	test: optimize twcs_reshape_with_disjoint_set for remote storage Parallelize SSTable creation across all sub-tests using parallel_for_each and reduce the SSTable count from 256 to 64 for S3/GCS variants. Re-enable the S3 test variant that was previously disabled due to taking 4+ minutes. With parallel creation and reduced count, the test now completes in a reasonable time.	2026-04-28 16:59:37 +03:00
Ernest Zaslavsky	01b4292f87	test: parallelize SSTable creation in cleanup_during_offstrategy_incremental Pre-extract mutation pairs and use parallel_for_each with make_sstable_containing_async to create SSTables concurrently instead of sequentially. The post-creation loop still runs serially to collect token ranges and generations.	2026-04-28 16:59:37 +03:00
Ernest Zaslavsky	923ff9abc9	test: parallelize SSTable creation in run_incremental_compaction_test Pre-extract mutation pairs and use parallel_for_each with make_sstable_containing_async to create SSTables concurrently instead of sequentially. The post-creation loop still runs serially to collect token ranges and generations that depend on SSTable order.	2026-04-28 16:59:37 +03:00
Ernest Zaslavsky	6a25f52473	test: parallelize SSTable creation in offstrategy_sstable_compaction Use parallel_for_each with make_sstable_containing_async to create SSTables concurrently instead of sequentially, reducing wall-clock time on remote storage backends (S3/GCS).	2026-04-28 16:59:37 +03:00
Ernest Zaslavsky	baca685629	test: parallelize SSTable creation in twcs_partition_estimate Use parallel_for_each with make_sstable_containing_async to create SSTables concurrently instead of sequentially, reducing wall-clock time on remote storage backends (S3/GCS).	2026-04-28 16:59:37 +03:00
Ernest Zaslavsky	716202b839	test: add trace-level logging for S3 and HTTP in compaction tests Raise log levels for s3 and gcp_storage from debug to trace, and add trace-level logging for http and default_http_retry_strategy modules. This provides better visibility into storage backend interactions when debugging slow or failing compaction tests on remote storage.	2026-04-28 16:59:37 +03:00
Ernest Zaslavsky	a4ebe16517	test: make sstable test utilities natively async The original make_memtable used seastar::thread::yield() for preemption, which required all callers to run inside a seastar::thread context. This prevented the utilities from being used directly in coroutines or parallel_for_each lambdas. Make the primary functions — make_memtable, make_sstable_containing, and verify_mutation — return future<> directly. Callers now .get() explicitly when in seastar::thread context, or co_await when in a coroutine. make_memtable now uses coroutine::maybe_yield() instead of seastar::thread::yield(). verify_mutation is converted to coroutines as well. Requested in: https://github.com/scylladb/scylladb/pull/29416#pullrequestreview-4112296282	2026-04-28 16:59:37 +03:00
Ernest Zaslavsky	4b637226a7	test: move make_memtable out of external_updater in row_cache_test test_exception_safety_of_update_from_memtable called make_memtable inside the row_cache::external_updater callback. external_updater runs as a synchronous execute() call that must not yield, but make_memtable calls seastar::thread::yield() every 10th mutation. The bug was latent because the test only inserted 5 mutations, so the yield was never reached. Move the call before the callback. Prerequisite for the next patch, which changes make_memtable to call make_memtable_async().get() -- that would yield on every mutation via coroutine::maybe_yield(), making this bug visible.	2026-04-28 16:59:37 +03:00
Ernest Zaslavsky	7c09f35ddf	test: increase S3 max connections for compaction tests Increase max_connections from the default to 32 for the S3 endpoint used in tests. This allows more concurrent HTTP connections to the S3 backend, which is needed to benefit from parallel SSTable creation that will be introduced in subsequent commits.	2026-04-28 16:59:37 +03:00
Patryk Jędrzejczak	d9dd3bfe53	Merge 'topology_coordinator: join tablet load stats refresh in stop()' from Andrzej Jackowski Commit `2b7aa32` (topology_coordinator: Refresh load stats after table is created or altered) registered topology_coordinator as a schema change listener and added on_create_column_family which fire-and-forgets _tablet_load_stats_refresh.trigger(). The triggered task runs on the gossip scheduling group via with_scheduling_group and accesses the topology_coordinator via 'this'. stop() unregisters the listener but does not wait for any in-flight refresh task. If a notification fires between _tablet_load_stats_refresh.join() in run() and unregister_listener in stop(), the scheduled task can outlive the topology_coordinator and access freed memory after run_topology_coordinator's coroutine frame is destroyed. Wait for the refresh to complete in stop() after unregistering the listener, ensuring no task can fire after destruction. Fixes SCYLLADB-1728 Backport to 2026.1 and 2026.2, because the issue was introduced in `2b7aa32` Closes scylladb/scylladb#29653 * https://github.com/scylladb/scylladb: test: tablet_stats: reproduce shutdown refresh race topology_coordinator: join tablet load stats refresh in stop()	2026-04-28 12:54:28 +02:00
Benny Halevy	5eaa979f35	test/cluster/dtest: add test for node.set_smp() persistence Add a test that reproduces SCYLLADB-1629: set_smp() had no effect because nodelist() created new ScyllaNode objects on every call, losing the _smp_set_during_test value. The test fails without the fix in the previous patch.	2026-04-28 12:34:08 +03:00
Benny Halevy	7430c1efd7	test/cluster/dtest: cache ScyllaNode instances in ScyllaCluster ScyllaCluster.nodelist() was creating new ScyllaNode objects on every call, so per-node state set via set_smp(), set_log_level(), and _adjust_smp_and_memory() was lost between calls. Fix by caching ScyllaNode instances in a list populated by _add_nodes() using the list returned by servers_add() in populate(). Nodes are assigned monotonically increasing names (node1, node2, ...). nodelist() simply returns the cached list.	2026-04-28 12:34:06 +03:00
Marcin Maliszkiewicz	b0f988afc4	Merge 'auth: fix shutdown and startup races in LDAP cache pruner' from Andrzej Jackowski The LDAP role manager's `_cache_pruner` background fiber periodically calls cache::reload_all_permissions(). Two races cause it to hit SCYLLA_ASSERT(_permission_loader): - Cross-shard race: The pruner used `_cache.container().invoke_on_all()` to reload permissions on every shard. Since both `service::start()` and `sharded<service>::stop()` execute per-shard in parallel, the pruner on one shard could call reload_all_permissions() on another shard before that shard set its loader (startup) or after it cleared its loader (shutdown). Each shard runs its own pruner instance, so reloading locally is sufficient; this also removes redundant N² reload calls. - Intra-shard race: `service::stop()` cleared the permission loader and stopped the role manager concurrently (via when_all_succeed). A mid-reload pruner could yield and then call the now-null loader. Fixed by stopping the role manager first so the pruner is fully drained before the loader is cleared. Fixes SCYLLADB-1679 Backport to 2026.2, introduced in `7eedf50c12` Closes scylladb/scylladb#29605 * github.com:scylladb/scylladb: auth: make shutdown the exact reverse of startup test: ldap: add test for pruner crash during shutdown auth: start authorizer and set permission loader before role manager auth: stop role manager before clearing permission loader auth: reload LDAP permission cache on local shard only	2026-04-28 11:16:07 +02:00
Botond Dénes	a7e9c0e6d2	Merge 'test.py: fix test collection bug' from Andrei Chekun In certain circumstances the current way of collecting tests is error-prone. Collection can stop when the first file is skipped in the current mode, leaving the rest of the files given on the CLI uncollected. Another issue is that if a file is specified twice, once via its directory and once explicitly, an incorrect CppFile ends up in the stash, causing a KeyError. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1714 No backport, test framework bug fix only. Closes scylladb/scylladb#29634 * github.com:scylladb/scylladb: test.py: fix framework test test.py: fix test collection bug	2026-04-28 11:52:35 +03:00
Andrzej Jackowski	459e3970cd	test: tablet_stats: reproduce shutdown refresh race The coordinator can receive a schema-change notification after run() finishes but before stop() unregisters listeners. The test pins that window with error injections and verifies stop() waits for the refresh instead of letting it outlive the coordinator. Test time in dev: 9.51s Refs SCYLLADB-1728	2026-04-28 08:00:54 +02:00
Avi Kivity	2615d0e8d8	test/cluster/test_incremental_repair: add retry for residual leadership race There is a small race window where Raft leadership could transfer back to servers[1] between the ensure_group0_leader_on() check and the actual restart. If this happens, the new coordinator re-initiates repair and masks the compaction-merge bug. Extract the core test logic into _do_race_window_promotes_unrepaired_data() which directly checks get_topology_coordinator() after restart and raises _LeadershipTransferred if servers[1] became coordinator. The test function calls this helper in a retry loop (up to 5 attempts). Refs: SCYLLADB-1478	2026-04-27 21:11:06 +03:00
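The retry structure described here, a helper that raises when the leadership race is detected and a caller that retries a bounded number of times, can be sketched as below. `LeadershipTransferred` and `run_with_retries` are hypothetical stand-ins for `_LeadershipTransferred` and the test's retry loop, whose real code is asynchronous and does far more per attempt.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class LeadershipTransferred(Exception):
    """Raised when the restarted server unexpectedly became coordinator."""


def run_with_retries(attempt_fn: Callable[[int], T], max_attempts: int = 5) -> T:
    """Call attempt_fn(attempt_number) until it succeeds or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return attempt_fn(attempt)
        except LeadershipTransferred:
            if attempt == max_attempts:
                raise  # give up: the race hit on every attempt
    raise ValueError("max_attempts must be at least 1")
```

Only the dedicated exception triggers a retry; any other failure, such as a genuine assertion error, propagates immediately.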
Avi Kivity	914b70c75b	test/cluster/test_incremental_repair: fix flaky coordinator-change scenario The test_incremental_repair_race_window_promotes_unrepaired_data test was flaky because it hardcoded servers[1] as the restart target but did not ensure servers[1] was NOT the topology coordinator. When servers[1] happened to be the Raft group0 leader (topology coordinator), restarting it killed the leader, forced a new election, and the new coordinator re-initiated tablet repair. This re-repair flushes memtables on all replicas via take_storage_snapshot() and marks the resulting sstables as repaired, causing post-repair keys to appear in repaired sstables on servers[0] and servers[2]. The test then hit the wrong assertion (servers[0]/[2] contaminated). Fix: before starting the repair, check whether servers[1] is the topology coordinator. If so, move leadership to another server via ensure_group0_leader_on() so that restarting servers[1] only kills a follower, which does not trigger an election or coordinator change. Reproducibility was confirmed by forcing leadership to servers[1] via ensure_group0_leader_on() and observing deterministic failure with all three servers showing post-repair keys in repaired sstables (confirming the re-repair scenario), then verifying the fix passes reliably. Fixes: SCYLLADB-1478	2026-04-27 21:08:12 +03:00
Aleksandra Martyniuk	6b7ce5e244	test: fix flaky rack list conversion tests by using read barrier test_numeric_rf_to_rack_list_conversion and test_numeric_rf_to_rack_list_conversion_abort were reading system_schema.keyspaces from an arbitrary node that may not have applied the latest schema change yet. Pin the read to a specific node and issue a read barrier before querying, ensuring the node has up-to-date data.	2026-04-27 15:19:09 +02:00
Aleksandra Martyniuk	9d3d424d58	test: fix flaky test_enforce_rack_list_option by using read barrier The test was reading system_schema.keyspaces from an arbitrary node that may not have applied the latest schema change yet. Pin the read to a specific node and issue a read barrier before querying, ensuring the node has up-to-date data.	2026-04-27 14:44:38 +02:00
Ferenc Szili	6b3e18c4a9	test: verify load balancer handles dropped tables gracefully Add test_load_balancing_with_dropped_table that simulates the race between DROP TABLE and the load balancer by capturing a token metadata snapshot before dropping the table, then passing the stale snapshot to balance_tablets(). Verifies it completes without aborting and produces no migrations for the dropped table.	2026-04-27 10:33:56 +02:00
Andrei Chekun	f2f4915e09	test.py: fix framework test The framework test was not skipping the unit directory, where the C++ tests are located. After a bug fix, the test started to fail. Ignore this directory as well.	2026-04-25 18:04:55 +02:00
Piotr Szymaniak	d5efd1f676	test/cluster: wait for Alternator readiness in server startup server_add() only waits for CQL readiness before returning. The Alternator HTTP port may not be listening yet, causing ConnectionRefused in Alternator tests. Extend the ServerUpState enum and startup loop to also check Alternator port readiness when configured. Whenever one or more Alternator ports are configured, each is checked to be connectable and queryable, similar to how CQL ports are probed. Fixes SCYLLADB-1701 Closes scylladb/scylladb#29625	2026-04-25 16:35:44 +03:00
Piotr Smaron	d14d07a079	test: fix flaky test_sstable_write_large_{row,cell} by using a fixed partition key Commit `ce00d61917` ("db: implement large_data virtual tables with feature flag gating") changed these two tests to construct their mutation with a randomly generated partition key (simple_schema::make_pkey()) instead of the previously fixed pk "pv", with the comment that this avoids a "Failed to generate sharding metadata" error. simple_schema::make_pkey() delegates to tests::generate_partition_key(), which defaults to key_size{1, 128}, i.e. the partition key length is uniformly random in [1, 128] bytes. That interacts badly with the fact that both tests pick thresholds at exact byte boundaries of the MC sstable row encoding: - The large-data handler records a row's size as _data_writer->offset() - current_pos (sstables/mx/writer.cc: collect_row_stats()), i.e. the number of bytes the row took on disk. - For the first clustering row, the body includes a vint-encoded prev_row_size = pos - _prev_row_start. - _prev_row_start is captured at the start of the partition (consume_new_partition()) before the partition key is written to the data stream, so prev_row_size rolls in the partition key's serialized length (2-byte prefix + pk bytes) + deletion_time + static row size. A random-size partition key therefore perturbs the first clustering row's encoded size by 1-2 bytes across runs (the vint of prev_row_size crosses the 128 boundary), flipping the test's byte-exact threshold comparison. On seed 2104744000 this produced: critical check row_size_count == expected.size() has failed [3 != 2] Fix the two byte-exact-sensitive tests by reverting their partition key to the fixed s.new_mutation("pv") used before `ce00d61917`. Under smp=1 (which these tests run with, per -c1 in the test invocation) a fixed key is always shard-local, so no sharding-metadata issue arises here. 
The other tests modified by `ce00d61917` (test_sstable_log_too_many_rows, test_sstable_log_too_many_dead_rows, test_sstable_too_many_collection_elements, test_large_data_records_round_trip, etc.) assert on row/element counts or use thresholds with enough slack that the partition key size does not matter, and are left unchanged. Add an explanatory comment to each fixed site so the pitfall is not re-introduced by a future refactor. Verified stable with: ./test.py --mode=dev test/boost/sstable_3_x_test.cc::test_sstable_write_large_row --repeat 100 --max-failures 1 ./test.py --mode=dev test/boost/sstable_3_x_test.cc::test_sstable_write_large_cell --repeat 100 --max-failures 1 ./test.py --mode=release test/boost/sstable_3_x_test.cc::test_sstable_write_large_row --repeat 100 --max-failures 1 ./test.py --mode=release test/boost/sstable_3_x_test.cc::test_sstable_write_large_cell --repeat 100 --max-failures 1 All four invocations: 100/100 passed. Fixes: SCYLLADB-1685 Closes scylladb/scylladb#29621	2026-04-25 16:32:02 +03:00
Andrei Chekun	92c09d106d	test.py: fix test collection bug In certain circumstances the current way of collecting tests is error-prone. Collection can stop when the first file is skipped in the mode, leaving the rest of the files given on the CLI uncollected. Another issue is that if a file is specified twice, once via its directory and once explicitly, an incorrect CppFile ends up in the stash, causing a KeyError. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1714	2026-04-24 17:57:11 +02:00
Dimitrios Symonidis	c40842f60a	db, sstables: add node_owner to sstables registry primary key Add a node_owner column (locator::host_id) to system.sstables and make it part of the partition key, so the primary key becomes PRIMARY KEY ((table_id, node_owner), generation). This is the first step toward moving the sstables registry into system_distributed: once distributed, each node's startup scan must read only the rows it owns, which requires the owning node to be part of the partition key. Partitioning by (table_id, node_owner) turns that scan into a single-partition read of exactly the local node's rows. The new column is populated via sstables_manager::get_local_host_id(). No backward compatibility is preserved; the feature is experimental and gated by keyspace-storage-options.	2026-04-24 16:41:09 +02:00
Dimitrios Symonidis	ce78c5113e	db, sstables: rename sstables registry column owner to table_id The partition-key column in system.sstables named 'owner' actually holds a table_id. Rename the CQL column and the matching C++ parameter and member names so the identifier describes what it stores. No behavior change. This prepares the schema for an upcoming node_owner partition-key column (the local host id), which needs a free name.	2026-04-24 16:24:07 +02:00
Pavel Emelyanov	aa99c1fd6e	storage_proxy: Use shared updateable_timeout_config by reference Drop storage_proxy's own updateable_timeout_config member built from db::config and take a reference to the shared sharded instance introduced by the previous patch. Both main and cql_test_env pass std::ref(timeout_cfg) into storage_proxy::start so each shard's storage_proxy references its shard-local updateable_timeout_config. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-24 15:07:21 +03:00
Pavel Emelyanov	7b7295fde0	main: Introduce sharded<updateable_timeout_config> Build a single sharded updateable_timeout_config from db::config in both main and cql_test_env, sitting next to sharded<cql_config>. Subsequent patches migrate storage_proxy, the CQL transport controller and alternator server from their per-owner updateable_timeout_config copies to references into this shared instance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-24 15:03:35 +03:00
Andrzej Jackowski	adf1e26bab	test: ldap: add test for pruner crash during shutdown Verify that service::stop() drains the LDAP pruner before clearing the permission loader. The test installs a slow permission loader and confirms the pruner is actively reloading when teardown begins. Refs SCYLLADB-1679	2026-04-24 13:34:09 +02:00
Pavel Emelyanov	ec2339e635	view: Add node_update_backlog reference to view_update_generator Pass node_update_backlog explicitly to view_update_generator via its constructor and start() call. This is plumbing only; no behavior change. A subsequent patch will use this reference to compute view update throttling delays without going through database::get_config(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-24 13:45:46 +03:00


11801 Commits