Files
scylladb/test
Piotr Dulikowski 762d9ef68f Merge 'cdc: Set tombstone_gc when creating log table' from Dawid Mędrek
Normally, when we create a table, MV, etc., we apply `cf_prop_defs` to the
schema builder via the function `cf_prop_defs::apply_to_builder`. Unfortunately,
that didn't happen when creating CDC log tables, and so we might have missed
some of the properties that would normally be set to some value, even if the
default one.

One particular example of that phenomenon was `tombstone_gc`. For better or
worse, it's not a "standalone property" of a table, but rather part of
`extensions`. [Somewhat related issue: scylladb/scylladb#9722]

That may have and did cause trouble. Consider this scenario:

1. A CDC log table is created.
2. The table does NOT have any value of `tombstone_gc` set.
3. The user edits the table via `ALTER TABLE`. That statement treats the log
   table just like any other one (at least as far as the relevant portion of the
   logic is concerned). Among other things, it uses
   `cf_prop_defs::apply_to_builder`, and as a result, the `tombstone_gc`
   property is set to some value:
   * the default one if the user doesn't specify it in the statement,
   * a custom one if they do.

Why is that a problem?

First of all, it's confusing. When we perform a schema backup and a table uses
CDC, we include an ALTER statement for its corresponding CDC log table (for more
context, see issue scylladb/scylladb#18467 or commit
scylladb/scylladb@f12edbdd95).

There are two consequences for the user here:
1. If the log table had NOT been altered ever since it was created, the
   statement will miss the `tombstone_gc` property as if it couldn't be set for
   it at all. That's confusing!
2. If the log table HAD in fact been altered after its creation, the statement
   will include the `tombstone_gc` property. That's even more confusing (why was
   it not present the first time, but it is now?).

The `tombstone_gc` property should always be set to avoid confusion and
problematic edge cases in tests and to simply be consistent with how other
schema entities work.

The solution we employ is that we always set the property to the default
value. That includes the case when we reattach the log table to the base;
consider the following scenario:

1. Create a table with CDC enabled.
2. Detach the log table by performing `ALTER TABLE ... WITH cdc = {'enabled': false}`.
3. Change the `tombstone_gc` property of the log table.
4. Reattach the log table to the base in the same way as in step 2.

The expected result would be that the new value of `tombstone_gc` would be
preserved after reattaching the log table. However, that's not what will
happen. We decide to stay consistent with how other properties of a log
table behave, and we reset them after every reattachment. We might change that
in the future: see issue scylladb/scylladb#25523.

Two reproducer tests of scylladb/scylladb#25187 are included in the changes.

Backport: The problem is not critical, so it may not be necessary to backport the changes.
That's to be discussed.

Closes scylladb/scylladb#25521

* github.com:scylladb/scylladb:
  cdc: Set tombstone_gc when creating log table
  tombstone_gc: Add overload of get_default_tombstone_gc_mode
  tombstone_gc: Rename get_default_tombstonesonte_gc_mode
2025-09-02 10:20:11 +02:00
..

Scylla in-source tests.

For details on how to run the tests, see docs/dev/testing.md

Shared C++ utils, libraries are in lib/, for Python - pylib/

alternator - Python tests which connect to a single server and use the DynamoDB API unit, boost, raft - unit tests in C++ cqlpy - Python tests which connect to a single server and use CQL topology* - tests that set up clusters and add/remove nodes cql - approval tests that use CQL and pre-recorded output rest_api - tests for Scylla REST API Port 9000 scylla-gdb - tests for scylla-gdb.py helper script nodetool - tests for C++ implementation of nodetool

If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment is significantly different from some existing suite, e.g. you plan to start scylladb with different configuration, and you intend to add many tests and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).

To add a new folder, create a new directory, and then copy & edit its suite.ini.