scylladb/test
Nadav Har'El df65d09e08 Merge 'cdc: log: fill cdc$deleted_ columns in pre-images' from Piotr Grabowski
Before this change, `cdc$deleted_` columns were all `NULL` in pre-images. Without this information, pre-image rows were hard to interpret correctly. For example:

```
INSERT INTO tbl(pk, ck, v, v2) VALUES (1, 1, null, 1);
INSERT INTO tbl(pk, ck, v2) VALUES (1, 1, 1);
```

For this example, the pre-image generated for the second operation looked like this (in both `true` and `full` pre-image modes):

```
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1
```

`v=NULL` has two possible meanings:
1. If the pre-image was in `true` mode, `v=NULL` means that v was not affected by the operation (affected columns: pk, ck, v2).
2. If the pre-image was in `full` mode, `v=NULL` means that v was equal to `NULL` in the pre-image.

Therefore, to decode pre-images properly, you would need to know which pre-image mode was configured on the CDC-enabled table at the moment the CDC log row was inserted. There is no way to recover that information: you can only check the table's current pre-image mode.

The solution to this problem is to fill in the `cdc$deleted_` columns for pre-images. After this PR, CDC generates the following pre-image row for the second `INSERT` described above:

If in pre-image `true` mode:
```
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1
```

If in pre-image `full` mode:
```
pk=1, ck=1, v=NULL, cdc$deleted_v=true, v2=1
```

A client library can now decode a pre-image row properly. If it sees a `NULL` value, it can check the corresponding `cdc$deleted_` column to determine whether this `NULL` was part of the pre-image or whether the column was omitted because it was not affected by the delta operation.
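
The decoding rule described above can be sketched as follows. This is only an illustrative sketch, not code from any existing client library: it assumes a pre-image row has already been fetched into a plain Python dict keyed by column name, and the names `decode_preimage` and `ColumnState` are hypothetical.

```python
from enum import Enum
from typing import Any, Dict, Tuple

# Prefix of the per-column "deleted" marker columns in the CDC log.
DELETED_PREFIX = "cdc$deleted_"


class ColumnState(Enum):
    VALUE = "value"                # column had a non-NULL value in the pre-image
    NULL = "null"                  # column was NULL in the pre-image
    NOT_INCLUDED = "not_included"  # column was omitted from the pre-image


def decode_preimage(row: Dict[str, Any]) -> Dict[str, Tuple[ColumnState, Any]]:
    """Classify every base-table column of a pre-image row (hypothetical helper)."""
    decoded = {}
    for name, value in row.items():
        if name.startswith("cdc$"):
            continue  # skip CDC metadata columns, including cdc$deleted_*
        if value is not None:
            decoded[name] = (ColumnState.VALUE, value)
        elif row.get(DELETED_PREFIX + name) is True:
            # NULL value with cdc$deleted_<col> = true: NULL was part of the pre-image.
            decoded[name] = (ColumnState.NULL, None)
        else:
            # NULL value with cdc$deleted_<col> = NULL: column was not included.
            decoded[name] = (ColumnState.NOT_INCLUDED, None)
    return decoded


# The two pre-image rows from the example above.
true_mode_row = {"pk": 1, "ck": 1, "v": None, "cdc$deleted_v": None, "v2": 1}
full_mode_row = {"pk": 1, "ck": 1, "v": None, "cdc$deleted_v": True, "v2": 1}

print(decode_preimage(true_mode_row)["v"][0].name)  # NOT_INCLUDED
print(decode_preimage(full_mode_row)["v"][0].name)  # NULL
```

With this rule the decoder does not need to know which pre-image mode was configured when the log row was written; the row itself carries enough information.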

No such change is necessary for the post-image rows, as those images are always generated in the `full` mode.

Additional example of the trouble decoding pre-images before this change, where tbl2 uses `true` pre-image mode and tbl3 uses `full` pre-image mode:

```
INSERT INTO tbl2(pk, ck, v, v2) VALUES (1, 1, 5, 1);
INSERT INTO tbl3(pk, ck, v, v2) VALUES (1, 1, null, 1);
```

```
INSERT INTO tbl2(pk, ck, v2) VALUES (1, 1, 1);
```
generated pre-image:
```
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1
```

```
INSERT INTO tbl3(pk, ck, v2) VALUES (1, 1, 1);
```

generated pre-image:
```
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1
```

Both pre-images look the same, but:
1. `v=NULL` in tbl2 means that v was omitted from the pre-image.
2. `v=NULL` in tbl3 means that v was `NULL` in the pre-image.

Closes #8568

* github.com:scylladb/scylla:
  cdc: log: assert post_image is always in full mode
  cdc: tests: check cdc$deleted_ columns in images
  cdc: log: fill cdc$deleted_ columns in pre-images
2021-05-04 14:45:27 +03:00