User creation via the maintenance socket should produce audit
entries, as this is the recommended flow for creating the
initial superuser when default credentials are disabled.
The test is parametrized by audit backend (table and syslog).
The maintenance socket source address is "::" because Seastar
returns a zero-initialised in6_addr for AF_UNIX sockets.
Test time in dev: 0.6s
Refs SCYLLADB-1615
Maintenance socket connections report a different source address
than regular CQL connections. Make the source field configurable
in the audit test helpers so that upcoming maintenance socket
tests can verify the correct address.
Also fix the syslog backend address parser to handle IPv6
addresses formatted as [ip]:port.
Refs SCYLLADB-1615
Commit 16b56c2451 ("Audit: avoid dynamic_cast on a hot path") moved
audit info into batch_statement via set_audit_info(), but only wired it
for the CQL-text BATCH path (raw::batch_statement::prepare()).
Native-protocol BATCH messages (opcode 0x0D), handled by
process_batch_internal in transport/server.cc, construct a
batch_statement without setting audit_info. This causes audit to
silently skip the entire batch.
Set audit_info on the batch_statement so these batches are audited.
Fixes SCYLLADB-1652
No backport - bug introduced recently.
Closesscylladb/scylladb#29570
* github.com:scylladb/scylladb:
test/audit: add reproducer for native-protocol batch not being audited
audit: set audit_info for native-protocol BATCH messages
test/audit: rename internal test methods to avoid CI misdetection
should be merged after #29235
Complete the typed skip markers migration started in the plugin PR.
Every bare `@pytest.mark.skip` decorator and `pytest.skip()` runtime call
across the test suite is replaced with a typed equivalent, making skip
reasons machine-readable in JUnit XML and Allure reports.
**62 files changed** across 8 commits, covering ~127 skip sites in total.
Bare `pytest.skip` provides only a free-text reason string. CI dashboards
(JUnit, Allure) cannot distinguish between a test skipped due to a known
bug, a missing feature, a slow test, or an environment limitation. This
makes it hard to track skip debt, prioritize fixes, or filter dashboards
by skip category.
The typed markers (`skip_bug`, `skip_not_implemented`, `skip_slow`,
`skip_env`) introduced by the `skip_reason_plugin` solve this by embedding
a `skip_type` field into every skip report entry.
| Type | Count | Files | Description |
|------|-------|-------|-------------|
| `skip_bug` | 24 | 16 | Skip reason references a known bug/issue |
| `skip_not_implemented` | 10 | 5 | Feature not yet implemented in Scylla |
| `skip_slow` | 4 | 3 | Test too slow for regular CI runs |
| `skip_not_implemented` (bare) | 2 | 1 | Bare `@pytest.mark.skip` with no reason (COMPACT STORAGE, #3882) |
| Type | Count | Files | Description |
|------|-------|-------|-------------|
| `skip_env` | ~85 | 34 | Feature/config/topology not available at runtime |
| `skip_bug` | 2 | 2 | Known bugs: Streams on tablets (#23838), coroutine task not found (#22501) |
- **Comments**: 7 comments/docstrings across 5 files updated from `pytest.skip()` to `skip()`
- **Plugin hardened**: `warnings.warn()` → `pytest.UsageError` for bare `@pytest.mark.skip` at collection time — bare skips are now a hard error, not a warning
- **Guard tests**: New `test/pylib_test/test_no_bare_skips.py` with 3 tests that prevent regression:
- AST scan for bare `@pytest.mark.skip` decorators
- AST scan for bare `pytest.skip()` runtime calls
- Real `pytest --collect-only` against all Python test directories
Runtime skip sites use the convenience wrappers from `test.pylib.skip_types`:
```python
from test.pylib.skip_types import skip_env
```
Usage:
```python
skip_env("Tablets not enabled")
```
1. **test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs** — 24 decorator sites, 16 files
2. **test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented** — 10 decorator sites, 5 files
3. **test: migrate @pytest.mark.skip to @pytest.mark.skip_slow** — 4 decorator sites, 3 files
4. **test: migrate bare @pytest.mark.skip to skip_not_implemented** — 2 bare decorators, 1 file
5. **test: migrate runtime pytest.skip() to typed skip_env()** — ~85 sites, 34 files
6. **test: migrate runtime pytest.skip() to typed skip_bug()** — 2 sites, 2 files
7. **test: update comments referencing pytest.skip() to skip()** — 7 comments, 5 files
8. **test/pylib: reject bare pytest.mark.skip and add codebase guards** — plugin hardening + 3 guard tests
- All 60 plugin + guard tests pass (`test/pylib_test/`)
- No bare `@pytest.mark.skip` or `pytest.skip()` calls remain in the codebase
- `pytest --collect-only` succeeds across all test directories with the hardened plugin
SCYLLADB-1349
Closesscylladb/scylladb#29305
* github.com:scylladb/scylladb:
test/alternator: replace bare pytest.skip() with typed skip helpers
test: migrate new bare skips introduced by upstream after rebase
test/pylib: reject bare pytest.mark.skip and add codebase guards
test: update comments referencing pytest.skip() to skip_env()
test: migrate runtime pytest.skip() to typed skip_bug()
test: migrate runtime pytest.skip() to typed skip_env()
test: migrate bare @pytest.mark.skip to skip_not_implemented
test: migrate @pytest.mark.skip to @pytest.mark.skip_slow
test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented
test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs
The existing test_batch sends a textual BEGIN BATCH ... APPLY BATCH as a
QUERY message, which goes through the CQL parser and raw::batch_statement::
prepare() — a path that correctly sets audit_info. This missed the bug
where native-protocol BATCH messages (opcode 0x0D), handled by
process_batch_internal in transport/server.cc, construct a batch_statement
without setting audit_info, causing audit to silently skip the batch.
Add _test_batch_native_protocol which uses the driver's BatchStatement
(both unprepared and prepared variants) to exercise this code path.
Refs SCYLLADB-1652
The CI heuristic picks up any function named test_* in changed files
and tries to run it as a standalone pytest test. The AuditTester class
methods (test_batch, test_dml, etc.) are not top-level pytest tests —
they are internal helpers called from the actual test functions.
Prefix them with underscore so CI does not mistake them for
standalone tests.
assert_entries_were_added asserted that new audit rows always appear at
the tail of each per-node, event_time-sorted sequence. That invariant
is not a property of the audit feature: audit writes are asynchronous
with respect to query completion, and on a multi-node cluster QUORUM
reads of audit.audit_log can reveal a row with an older event_time
after a row with a newer one has already been observed.
Replace the positional tail slice with a per-node set difference
between the rows observed before and after the audited operation.
The wait_for retry loop, noise filtering, and final by-value
comparison against expected_entries are unchanged, so the test still
verifies the real contract, that the expected audit entries appear,
without relying on a visibility-ordering invariant that the audit log
does not guarantee.
Fixes SCYLLADB-1589
Closesscylladb/scylladb#29567
In `ks_prop_defs::as_ks_metadata(...)` a default initial tablets count
is set to 0, when tablets are enabled and the replication strategy
is NetworkReplicationStrategy.
This effectively sets _uses_tablets = false in abstract_replication_strategy
for the remaining strategies when no `tablets = {...}` options are specified.
As a consequence, it is possible to create vnode-based keyspaces even
when tablets are enforced with `tablets_mode_for_new_keyspaces`.
The patch sets a default initial tablets count to zero regardless of
the chosen replication strategy. Then each of the replication strategy
validates the options and raises a configuration exception when tablets
are not supported.
All tests are altered in the following way:
+ whenever it was correct, SimpleStrategy was replaced with NetworkTopologyStrategy
+ otherwise, tablets were explicitly disabled with ` AND tablets = {'enabled': false}`
Fixes https://github.com/scylladb/scylladb/issues/25340Closesscylladb/scylladb#25342
Migrate runtime pytest.skip() calls across 34 files to use the typed
skip_env() wrapper from test.pylib.skip_types.
These sites skip at runtime because a required feature, config option,
library version, build mode, or runtime topology is not available.
Also fixes 'raise pytest.skip(...)' in test_audit.py — skip_env()
already raises internally, so the explicit raise was incorrect.
Each file gains one new import:
from test.pylib.skip_types import skip_env
Verify that upgrading from 2025.1 to master does not silently drop DDL
auditing for table-scoped audit configurations (SCYLLADB-1155).
Test time in dev: 4s
Refs: SCYLLADB-1155
Fixes: SCYLLADB-1305
The old execute_and_validate_audit_entry required every caller to
pass audit_settings so it could decide internally whether to expect
an entry. A test added later in this series needs to simply assert
an entry was produced, without specifying audit_settings at all.
Split into two methods:
- execute_and_validate_new_audit_entry: unconditionally expects an
audit entry.
- execute_and_validate_if_category_enabled: checks audit_settings
to decide whether to expect an entry or assert absence.
Local wrapper functions and **kwargs forwarding are removed in favor
of explicit arguments at each call site, and expected-error cases are
handled inline with assert_invalid + assert_entries_were_added.
AuditTester uses self.manager throughout but never declares it.
The attribute is only assigned in the CQLAuditTester subclass
__init__, so the type checker reports 'Attribute "manager" is
unknown' on every self.manager reference in the base class.
Add an __init__ to AuditTester that accepts and stores the manager
instance, and update CQLAuditTester to forward it via super().__init__
instead of assigning self.manager directly.
get_audit_partitions_for_operation() returns None when no audit log
rows are found. In _test_insert_failure_doesnt_report_success_assign_nodes,
this None is passed to set(), causing TypeError: 'NoneType' object is
not iterable.
The audit log entry may not yet be visible immediately after executing
the INSERT, so use wait_for() from test.pylib.util with exponential
backoff to poll until the entry appears. Import it as wait_for_async
to avoid shadowing the existing wait_for from test.cluster.dtest.dtest_class,
which has a different signature (timeout vs deadline).
Fixes SCYLLADB-1330
Closesscylladb/scylladb#29289
_apply_config_to_running_servers used wait_for_config (REST API poll)
to confirm live config updates. The REST API reads from shard 0 only,
so it can return before broadcast_to_all_shards() completes — other
shards may still have stale audit config, generating unexpected entries.
Additionally, server_remove_config_option for absent keys sent separate
SIGHUPs before server_update_config, and the single wait_for_config at
the end could match a completion from an earlier SIGHUP.
Wait for "completed re-reading configuration file" in the server log
after each SIGHUP-producing operation. This message is logged only
after both read_config() and broadcast_to_all_shards() finish,
guaranteeing all shards have the new config. Each operation gets its
own mark+wait so no stale completion is matched.
Fixes SCYLLADB-1277
UnixSockerListener used ThreadingUnixDatagramServer, which spawns a
new thread per datagram. The notification barrier in get_lines() relies
on all prior datagrams being handled before the notification. With
threading, the notification handler can win the lock before an audit
entry handler, so get_lines() returns before the entry is appended.
clear_audit_logs() then clears an incomplete buffer, and the late
entry leaks into the next test's before/after diff.
Switch to sequential UnixDatagramServer. The server thread now handles
datagrams in kernel FIFO order, so the notification is always processed
after all preceding audit entries.
Refs SCYLLADB-1277
pytest tries to collect tests for execution in several ways.
One is to pick all classes that start with 'Test'. Those classes
must not have custom '__init__' constructor. TestCQLAudit does.
TestCQLAudit after migration from test/cluster/dtest is not a test
class anymore, but rather a helper class. There are two ways to fix
this:
1. Add __init__ = False to the TestCQLAudit class
2. Rename it to not start with 'Test'
Option 2 feels better because the new name itself does not convey
the wrong message about its role.
Fixes SCYLLADB-1237
Remove unused pytest.mark.single_node in TestCQLAudit class.
This is a leftover from audit tests migration from
test/cluster/dtest to test/cluster.
Refs SCYLLADB-1237
assert_entries_were_added:
- takes a "before" snapshot of the audit log
- yields to execute a statement
- takes an "after" snapshot of the audit log
- computes new rows by diffing "after" minus "before"
If an audit entry generated by prepare() arrives between the snapshot
and the diff, it inflates the new row count and the test fails with
assert 2 <= 1.
Fix by:
- Adding clear_audit_logs() at the end of prepare(), after all setup
- Waiting for the "completed re-reading configuration file" log message
after server_update_config
- Draining pending syslog lines before clearing the buffer
Refs SCYLLADB-573
assert_entries_were_added computes new audit rows by slicing the "after"
list at the length of the "before" list: rows_after[len(rows_before):].
This assumes new rows always appear at the tail of the combined sorted
list. In a multinode setup, each node generates its own event_time
timestamps. A new row from node A can sort before an old row from node
B, breaking the tail assumption. The assertion "new rows are not the
last rows in the audit table" then fires.
Fix this by splitting the before/after lists per node and computing the
new rows tail independently for each node. This guarantees that per node
ordering, which is monotonic, is respected, and the combined new rows
are sorted afterwards.
Refs SCYLLADB-573
This patch reorganizes the execution flow of the test functions.
They are grouped to enable cluster reuse between specific test
functions. One of the main contributors to the test execution time
is the cluster preparation. This patch significantly reduces the
total test execution time by having way less new cluster preparation
calls and more cluster reuse.
Performance increase on the developer machine is around 38%:
- before: 4m 29s
- after: 2m 47s
Fixes SCYLLADB-573
Make audit tests from test/cluster/dtest to test/cluster.
test/cluster environment has less overhead, and audit tests
are heavy, their execution taking lots of time. This patch
is part of an effort to improve audit test suite performance.
This patch refactors the tests so that they execute correctly,
as well as enables them. A follow up patch will remove the
audit tests in test/cluster/dtest.
All the tests are confirmed to be running after the change.
No dead code present.
Test test_audit_categories_invalid is not parametrized anymore.
It never used the parametrized helper class, so it just ran
the same logic three times. This is why there are now 74,
and not 76, test executions.
Refs SCYLLADB-573
This patch just copies the audit test suite from dtest and
disables it in the test config file. Later patches will
update the code and enable the test suite.
Refs SCYLLADB-573