scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-04 22:13:19 +00:00

Nadav Har'El 31e0315710 Merge 'alternator: fix unnecesary cdc log entries' from Radosław Cybulski

Fix cdc writing unnecessary entries to its log, for example when Alternator deletes an item that does not actually exist.

Originally @wps0 tackled this issue. This patch is an extension of his work. His work involved adding a `should_skip` function to cdc, which would process a `mutation` object and decide whether the changes in the object should be added to the cdc log or not.

The issue with his approach is that a `mutation` object might contain changes for more than one row. If, for example, the `mutation` object contains two changes, a delete of a non-existing row and a create of a non-existing row, the `should_skip` function will detect the change in the second item and allow the whole `mutation` (BOTH items) to be added. For example (using Python's boto3), running this on an empty table:
```
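# `table` is assumed to be a boto3 Table resource for an empty table
# with partition key `p` and sort key `c`.
with table.batch_writer() as batch:
    batch.put_item(Item={'p': 'p', 'c': 'c0'})
    batch.delete_item(Key={'p': 'p', 'c': 'c1'})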
```
will emit two events (a "put" event and a "delete" event), even though the item with `c` set to `c1` does not exist (and thus can't be deleted). Note that both entries in the batch write must use the same partition key, otherwise the upper layer will split them into separate `mutation` objects and the issue will not happen.
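The difference between a mutation-level decision and a per-change decision can be shown with a toy model. This is only a sketch: the tuple representation and every function name below are hypothetical and do not mirror the real cdc code.

```python
# Toy model: a "mutation" is a list of (clustering_key, op) changes that all
# share one partition key. `existing_rows` stands in for the table's state.
# All names here are hypothetical, chosen only for illustration.

def is_noop(change, existing_rows):
    clustering_key, op = change
    # Deleting a row that does not exist changes nothing.
    return op == "delete" and clustering_key not in existing_rows

def events_mutation_level(mutation, existing_rows):
    # All-or-nothing: skip only when every change in the mutation is a no-op.
    if all(is_noop(change, existing_rows) for change in mutation):
        return []
    return list(mutation)

def events_per_change(mutation, existing_rows):
    # Each change is judged on its own.
    return [change for change in mutation if not is_noop(change, existing_rows)]

mutation = [("c0", "put"), ("c1", "delete")]
empty_table = set()

print(events_mutation_level(mutation, empty_table))  # [('c0', 'put'), ('c1', 'delete')]
print(events_per_change(mutation, empty_table))      # [('c0', 'put')]
```

The mutation-level check lets the no-op delete through because the put in the same mutation is a real change.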

The solution is to do similar processing, but to consider each change separately from the others. This is tricky to implement because of the way cdc works. When cdc processes a `mutation` object (containing X changes), it emits cdc entries in phases. Phase 1: emit the `preimage` (old state) for each change (if requested). Phase 2: emit the actual "diff" (update / delete and so on) for each change. Phase 3: emit the `postimage` (new state).
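A simplified sketch of this phase ordering, assuming a toy dictionary-based log entry rather than real cdc log rows (the function name, the entry fields, and the use of `None` for a missing old or new state are all assumptions):

```python
# Toy model of the three phases. `changes` is a list of
# (clustering_key, op, value) tuples; `old_state` maps clustering key -> value.
# Names and entry layout are hypothetical, not the real cdc structures.

def emit_entries(changes, old_state, want_preimage=True, want_postimage=True):
    new_state = dict(old_state)
    log = []
    # Phase 1: old state for every change, before any diff is looked at.
    if want_preimage:
        for ck, _op, _value in changes:
            log.append({"kind": "preimage", "ck": ck, "value": old_state.get(ck)})
    # Phase 2: the actual diff for every change.
    for ck, op, value in changes:
        log.append({"kind": op, "ck": ck, "value": value})
        if op == "delete":
            new_state.pop(ck, None)
        else:
            new_state[ck] = value
    # Phase 3: new state for every change.
    if want_postimage:
        for ck, _op, _value in changes:
            log.append({"kind": "postimage", "ck": ck, "value": new_state.get(ck)})
    # Each entry carries a consecutive index.
    for index, entry in enumerate(log):
        entry["index"] = index
    return log

log = emit_entries([("c0", "put", "v0"), ("c1", "delete", None)], old_state={})
print([entry["kind"] for entry in log])
```

The point of the sketch is the ordering: the phase 1 entry for `c1` is already in the log by the time phase 2 reaches the delete of `c1`.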

We only learn whether a change needs to be skipped during phase 2. By that time phase 1 is complete and the preimage for the change has already been emitted. At that point we flag the change (identified by its clustering key value) as one to skip: we add the clustering key to an `ignore-rows` set (the `_alternator_clustering_keys_to_ignore` variable) and continue normally. Once all phases finish, we run an additional `postprocess` phase (the `clean_up_noop_rows` function). It goes through the generated cdc mutations and drops every modification whose clustering key is in the `ignore-rows` set. After dropping them we need a "cleanup" step: each generated cdc mutation contains an index (incremented by one), and if we dropped some parts the indexes are no longer consecutive, so we reindex the remaining changes.
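The postprocess step can be pictured roughly as follows. This is a minimal sketch over a toy list of dictionaries, not the actual `clean_up_noop_rows` implementation; the function name and the `ck` / `index` fields are assumptions made for illustration.

```python
# Toy postprocess: drop entries whose clustering key was flagged, then
# renumber what is left. Entry layout and names are hypothetical.

def clean_up_noop_rows_sketch(entries, keys_to_ignore):
    # Keep only entries whose clustering key was not flagged as a no-op.
    kept = [dict(entry) for entry in entries if entry["ck"] not in keys_to_ignore]
    # Dropping entries leaves gaps in the index, so make it consecutive again.
    for new_index, entry in enumerate(kept):
        entry["index"] = new_index
    return kept

generated = [
    {"index": 0, "kind": "preimage", "ck": b"c0"},
    {"index": 1, "kind": "preimage", "ck": b"c1"},
    {"index": 2, "kind": "put", "ck": b"c0"},
    {"index": 3, "kind": "delete", "ck": b"c1"},
]
keys_to_ignore = {b"c1"}  # flagged during phase 2

cleaned = clean_up_noop_rows_sketch(generated, keys_to_ignore)
print([(entry["index"], entry["kind"]) for entry in cleaned])  # [(0, 'preimage'), (1, 'put')]
```

Filtering after the fact is what allows the already-emitted preimage for `c1` to be removed together with its diff.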

There's a special case worth mentioning: Alternator tables without clustering keys. In that case the `mutation` object passed to cdc can contain exactly one change, since different partition keys are split by upper layers and Alternator will never emit a `mutation` object containing two (or more) changes with the same primary key. Here, when we decide the change is to be skipped, we add an empty `bytes` object to the `ignore-rows` set. When checking the `ignore-rows` set, we only check whether it is empty (we don't check for the presence of the empty `bytes` object).
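A small sketch of how the two lookups differ, assuming the set is modelled as a plain Python `set` of `bytes` (the helper names and the `has_clustering_key` parameter are hypothetical):

```python
# Toy model of the `ignore-rows` set for tables with and without a
# clustering key. Helper names are hypothetical.

def mark_noop(keys_to_ignore, clustering_key=None):
    # Without a clustering key, an empty bytes object serves as a marker.
    keys_to_ignore.add(clustering_key if clustering_key is not None else b"")

def is_ignored(keys_to_ignore, clustering_key, has_clustering_key):
    if not has_clustering_key:
        # The mutation holds exactly one change, so "anything was flagged"
        # is enough; the marker itself is never looked up.
        return bool(keys_to_ignore)
    return clustering_key in keys_to_ignore

with_ck = set()
mark_noop(with_ck, b"c1")
print(is_ignored(with_ck, b"c1", has_clustering_key=True))   # True
print(is_ignored(with_ck, b"c0", has_clustering_key=True))   # False

without_ck = set()
print(is_ignored(without_ck, None, has_clustering_key=False))  # False
mark_noop(without_ck)
print(is_ignored(without_ck, None, has_clustering_key=False))  # True
```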

Note: there might be some confusion between this patch and the #28452 patch. Both started from the same error observation and use similar tests for validation, as both are easily triggered by BatchWrite commands (both need the `mutation` object passed to cdc to contain more than one change). This issue, however, is about wrong data being written to the cdc log and is fixed in cdc, whereas #28452 is about correct cdc data being parsed incorrectly and is fixed on the Alternator side. Note that we need #28452 to truly verify this fix (otherwise we will emit correct cdc entries, but Alternator will parse them incorrectly).

Note: to benefit from / notice this patch you need the `alternator_streams_increased_compatibility` flag turned on.

Note: the rework is quite "broad" and covers a lot of ground: every operation that might result in no change to the database state should be tested. Additional tests were added: trying to remove a column from a non-existing item, and trying to remove a non-existing column from an existing item.
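The two removal cases could be issued through boto3's `Table.update_item` roughly as below. This is a sketch, not the tests added by the patch: `remove_attribute`, the `RecordingTable` stand-in (used so the snippet runs without a server), and the key and attribute values are all assumptions. A real test would use a boto3 Table and then read the stream to check that no event was produced.

```python
# Sketch of the two "remove" requests. RecordingTable is a hypothetical
# stand-in that records calls instead of contacting a server.

class RecordingTable:
    def __init__(self):
        self.calls = []

    def update_item(self, **kwargs):
        self.calls.append(kwargs)

def remove_attribute(table, key, attribute):
    # REMOVE with an attribute-name placeholder.
    table.update_item(
        Key=key,
        UpdateExpression="REMOVE #a",
        ExpressionAttributeNames={"#a": attribute},
    )

table = RecordingTable()
# Case 1: remove a column from an item that does not exist.
remove_attribute(table, {"p": "p", "c": "missing"}, "v")
# Case 2: remove a column that does not exist from an item that does.
remove_attribute(table, {"p": "p", "c": "c0"}, "no_such_column")

for call in table.calls:
    print(call["Key"], call["ExpressionAttributeNames"])
```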

Fixes: #28368
Fixes: SCYLLADB-1528
Fixes: SCYLLADB-538

Closes scylladb/scylladb#28544

* github.com:scylladb/scylladb:
  alternator: remove unnecesary code
  alternator: fix Alternator writing unnecesary cdc entries
  alternator: add failing tests for Streams

2026-04-18 00:07:51 +03:00

alternator

Merge 'alternator: fix unnecesary cdc log entries' from Radosław Cybulski

2026-04-18 00:07:51 +03:00

boost

compaction: release GC'ed sstables incrementally during compaction

2026-04-17 18:20:47 +03:00

broadcast_tables

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

cluster

test: retry get_coordinator_host() after topology coordinator stop

2026-04-17 12:08:26 +02:00

cql

test.py: remove testpy_test_fixture_scope

2026-04-16 22:08:33 +02:00

cqlpy

Merge 'test.py: refactor test.py' from Andrei Chekun

2026-04-17 12:51:14 +03:00

ldap

Merge 'auth: sanitize {USER} substitution in LDAP URL template' from Piotr Smaron

2026-04-15 14:40:15 +03:00

lib

compaction: release GC'ed sstables incrementally during compaction

2026-04-17 18:20:47 +03:00

manual

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

nodetool

test.py: remove testpy_test_fixture_scope

2026-04-16 22:08:33 +02:00

perf

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

pylib

test: use uuid4 for DockerizedServer container names to avoid collisions

2026-04-17 11:56:51 +02:00

pylib_test

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

raft

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

resource

test/ldap: add LDAP filter-injection reproducers

2026-04-08 13:53:49 +02:00

rest_api

test.py: remove testpy_test_fixture_scope

2026-04-16 22:08:33 +02:00

scylla_gdb

test.py: remove testpy_test_fixture_scope

2026-04-16 22:08:33 +02:00

unit

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

vector_search

vector_search: decrease default connection timeout to 3s

2026-04-17 12:26:39 +03:00

__init__.py

test.py: delete dead code in test.py

2026-04-16 22:08:31 +02:00

CMakeLists.txt

test/cmake: add missing tests to boost test suite

2026-03-29 16:17:45 +03:00

conftest.py

test.py: remove testpy_test_fixture_scope

2026-04-16 22:08:33 +02:00

pytest.ini

Merge 'test: Lower default log level from DEBUG to INFO' from Artsiom Mishuta

2026-04-16 12:46:11 +03:00

README.md

…

README.md

Scylla in-source tests.

For details on how to run the tests, see docs/dev/testing.md

Shared C++ utils, libraries are in lib/, for Python - pylib/

alternator - Python tests which connect to a single server and use the DynamoDB API
unit, boost, raft - unit tests in C++
cqlpy - Python tests which connect to a single server and use CQL
topology* - tests that set up clusters and add/remove nodes
cql - approval tests that use CQL and pre-recorded output
rest_api - tests for Scylla REST API Port 9000
scylla-gdb - tests for scylla-gdb.py helper script
nodetool - tests for C++ implementation of nodetool

If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment is significantly different from some existing suite, e.g. you plan to start scylladb with different configuration, and you intend to add many tests and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).

To add a new folder, create a new directory, and then copy & edit its suite.ini.