Previously, when a snapshot load subsumed a committed entry before apply() was called locally, add_entry would throw commit_status_unknown -- even though the entry was known to be committed and included in the snapshot. This was overly pessimistic. Normal state machine implementations shouldn't care whether an entry was applied via apply() or via a snapshot load.

Unnecessary commit_status_unknown caused flakiness of test_frequent_snapshotting and unnecessary retries in group0. Raft groups from strongly consistent tables couldn't hit unnecessary commit_status_unknown's because they use wait_type::committed and `enable_forwarding == false`.

Three sites are changed:

1. wait_for_entry (truncation case): the snapshot-term match optimization that proved the entry was committed now applies to both wait_type::committed and wait_type::applied, not just committed.

2. wait_for_entry (snapshot covers entry): instead of throwing commit_status_unknown when the snapshot index >= entry index, return successfully. The entry's effects are included in the state machine's state via the snapshot.

3. drop_waiters: when called from load_snapshot, pass the snapshot term. Waiters whose term matches the snapshot term are resolved successfully (set_value) instead of failing with commit_status_unknown, since the Log Matching Property guarantees they were committed and included.

This deflakes test_frequent_snapshotting: the test uses aggressive snapshot settings (snapshot_threshold=1) causing wait_for_entry to occasionally find the snapshot covering its entry. Previously this threw commit_status_unknown, failing the test. With this fix, wait_for_entry returns success. Note that apply() is never actually skipped in this test -- the leader always applies entries locally before taking a snapshot.
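The drop_waiters rule from site 3 can be illustrated with a minimal, self-contained sketch. This is not the actual raft::server code: the types `waiter`, `wait_result` and `snapshot_descriptor`, and the signature of `drop_waiters` below, are simplified stand-ins invented for illustration. The real implementation resolves promises (set_value / set_exception) and removes the waiters, which this model replaces with a result field.

```cpp
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-ins for the real Raft types.
using index_t = std::uint64_t;
using term_t = std::uint64_t;

enum class wait_result { pending, success, commit_status_unknown };

struct waiter {
    term_t term;                                // term of the entry being waited on
    wait_result result = wait_result::pending;  // models the promise outcome
};

struct snapshot_descriptor {
    index_t idx;   // last log index covered by the snapshot
    term_t term;   // term of the entry at that index
};

// Resolve every pending waiter whose entry index is covered by the snapshot.
// A waiter whose term equals the snapshot term is known to be committed and
// included in the snapshot (Log Matching Property), so it succeeds.
// Any other covered waiter still ends up with commit_status_unknown.
void drop_waiters(std::map<index_t, waiter>& waiters, const snapshot_descriptor& snp) {
    for (auto& [idx, w] : waiters) {
        if (idx > snp.idx) {
            break;  // std::map is ordered by index: the rest is not covered
        }
        if (w.result != wait_result::pending) {
            continue;  // already resolved earlier
        }
        w.result = (w.term == snp.term) ? wait_result::success
                                        : wait_result::commit_status_unknown;
    }
}
```

In this model, a waiter at an index beyond the snapshot stays pending and is resolved later through the normal commit/apply path.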
The nemesis test is updated to handle the new behavior: call() detects when add_entry succeeded but the output channel was not written (apply() skipped locally) and returns apply_skipped instead of hanging. The linearizability checker in basic_generator_test counts skipped applies separately from failures. basic_generator_test exercises this path: skipped_applies > 0 occurs in some runs.

Fixes: SCYLLADB-1264

No backport: the changes are quite risky and the test being fixed fails very rarely.

Closes scylladb/scylladb#29685

* github.com:scylladb/scylladb:
  test/raft: fix duplicate check in connected::operator()
  test/raft: add tests for add_entry snapshot interactions
  raft: do not throw commit_status_unknown from add_entry when possible
  raft: change drop_waiters parameter from index to snapshot descriptor
  raft: server: fix a typo

Scylla in-source tests.
For details on how to run the tests, see docs/dev/testing.md
Shared C++ utilities and libraries are in lib/; shared Python code is in pylib/.
alternator - Python tests which connect to a single server and use the DynamoDB API
unit, boost, raft - unit tests in C++
cqlpy - Python tests which connect to a single server and use CQL
topology* - tests that set up clusters and add/remove nodes
cql - approval tests that use CQL and pre-recorded output
rest_api - tests for Scylla REST API Port 9000
scylla-gdb - tests for scylla-gdb.py helper script
nodetool - tests for C++ implementation of nodetool
If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment is significantly different from some existing suite, e.g. you plan to start scylladb with different configuration, and you intend to add many tests and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).
To add a new folder, create a new directory, then copy a suite.ini
from an existing folder into it and edit the copy.