This commit moves the Features page from the section for developers to the top level in the page tree. This involves: - Moving the source files to the *features* folder from the *using-scylla* folder. - Moving images into *features/images* folder. - Updating references to the moved resources. - Adding redirections to the moved pages. Closes scylladb/scylladb#20401
184 lines
6.7 KiB
ReStructuredText
184 lines
6.7 KiB
ReStructuredText
.. _updates:
|
|
|
|
Updates
|
|
-------
|
|
|
|
Updates are the most basic statements that can be performed. The following is an example of an update with CDC enabled.
|
|
|
|
#. Start with a basic table and perform some UPDATEs:
|
|
|
|
.. code-block:: cql
|
|
|
|
CREATE TABLE ks.t (pk int, ck int, v1 int, v2 int, PRIMARY KEY (pk, ck)) WITH cdc = {'enabled':'true'};
|
|
UPDATE ks.t SET v1 = 0 WHERE pk = 0 AND ck = 0;
|
|
UPDATE ks.t SET v2 = null WHERE pk = 0 AND ck = 0;
|
|
|
|
#. Confirm that the update was performed by displaying the contents of the table:
|
|
|
|
.. code-block:: cql
|
|
|
|
SELECT * FROM ks.t;
|
|
|
|
returns:
|
|
|
|
.. code-block:: none
|
|
|
|
pk | ck | v1 | v2
|
|
----+----+----+------
|
|
0 | 0 | 0 | null
|
|
|
|
(1 rows)
|
|
|
|
#. Display the contents of the CDC log table:
|
|
|
|
.. code-block:: cql
|
|
|
|
SELECT "cdc$batch_seq_no", pk, ck, v1, "cdc$deleted_v1", v2, "cdc$deleted_v2", "cdc$operation" FROM ks.t_scylla_cdc_log;
|
|
|
|
returns:
|
|
|
|
.. code-block:: none
|
|
|
|
cdc$batch_seq_no | pk | ck | v1 | cdc$deleted_v1 | v2 | cdc$deleted_v2 | cdc$operation
|
|
------------------+----+----+------+----------------+------+----------------+---------------
|
|
0 | 0 | 0 | 0 | null | null | null | 1
|
|
0 | 0 | 0 | null | null | null | True | 1
|
|
|
|
(2 rows)
|
|
|
|
Delta rows corresponding to updates are indicated by ``cdc$operation = 1``. All columns corresponding to the primary key in the log are set to the same values as in the base table, in this case ``pk = 0`` and ``ck = 0``.
|
|
|
|
Each non-key column, such as ``v1`` above, has a corresponding column in the log with the same name, but there's an additional column with the ``cdc$deleted_`` prefix (e. g. ``cdc$deleted_v1``) used to indicate whether the column was set to ``null`` in the update.
|
|
|
|
If the update sets a column ``X`` to a non-null value, the ``X`` column in the log will have that value; however, if the update sets ``X`` to a null value, the ``X`` column in the log will be ``null``, and we'll use the ``cdc$deleted_X`` column to indicate the update by setting it to ``True``. In the example above, this is what happened with the ``v2`` column.
|
|
|
|
Column deletions are a special case of an update statement. That is, the following two statements are equivalent:
|
|
|
|
.. code-block:: cql
|
|
|
|
UPDATE <table_name> SET X = null WHERE <condition>;
|
|
DELETE X FROM <table_name> WHERE <condition>;
|
|
|
|
Thus, in the example above, instead of
|
|
|
|
.. code-block:: cql
|
|
|
|
UPDATE ks.t SET v2 = null WHERE pk = 0 AND ck = 0;
|
|
|
|
we could have used
|
|
|
|
.. code-block:: cql
|
|
|
|
DELETE v2 FROM ks.t WHERE pk = 0 AND ck = 0;
|
|
|
|
and we would've obtained the same result.
|
|
|
|
Note that column deletions, (which are equivalent to updates that set a column to ``null``) *are different than row deletions*, i.e. ``DELETE`` statements that specify a clustering row but don't specify any particular column, like the following:
|
|
|
|
.. code-block:: cql
|
|
|
|
DELETE FROM ks.t WHERE pk = 0 AND ck = 0;
|
|
|
|
You can read about row deletions in the :ref:`corresponding section <row-deletions>`.
|
|
|
|
Digression: static rows in ScyllaDB
|
|
+++++++++++++++++++++++++++++++++++
|
|
|
|
If a table in ScyllaDB has static columns, then every partition in this table contains a *static row*, which is global for the partition. This static row is different from the clustered rows: it contains values for partition key columns and static columns, while clustered rows contain values for partition key, clustering key, and regular columns. The following example illustrates how the static row can be used:
|
|
|
|
.. code-block:: cql
|
|
|
|
CREATE TABLE ks.t (pk int, ck int, s int static, c int, PRIMARY KEY (pk, ck));
|
|
UPDATE ks.t SET s = 0 WHERE pk = 0;
|
|
SELECT * from ks.t WHERE pk = 0;
|
|
|
|
returns:
|
|
|
|
.. code-block:: none
|
|
|
|
pk | ck | s | c
|
|
----+------+---+------
|
|
0 | null | 0 | null
|
|
|
|
(1 rows)
|
|
|
|
Even though no regular columns were set, the above query returned a row; the static row.
|
|
|
|
We can still update clustered rows, of course:
|
|
|
|
.. code-block:: cql
|
|
|
|
UPDATE ks.t SET c = 0 WHERE pk = 1 AND ck = 0;
|
|
SELECT * from ks.t WHERE pk = 1;
|
|
|
|
returns:
|
|
|
|
.. code-block:: none
|
|
|
|
pk | ck | s | c
|
|
----+----+------+---
|
|
1 | 0 | null | 0
|
|
|
|
(1 rows)
|
|
|
|
Somewhat confusingly, CQL mixes the static row with the clustered rows if both appear within the same partition. Example:
|
|
|
|
.. code-block:: cql
|
|
|
|
UPDATE ks.t SET c = 0 WHERE pk = 2 AND ck = 0;
|
|
UPDATE ks.t SET c = 1 WHERE pk = 2 AND ck = 1;
|
|
UPDATE ks.t SET s = 2 WHERE pk = 2;
|
|
SELECT * from ks.t WHERE pk = 2;
|
|
|
|
returns:
|
|
|
|
.. code-block:: none
|
|
|
|
pk | ck | s | c
|
|
----+----+---+---
|
|
2 | 0 | 2 | 0
|
|
2 | 1 | 2 | 1
|
|
|
|
(2 rows)
|
|
|
|
From the above query result it seems as if the static column was a part of every clustered row, and setting the static column to a value within one partition is setting it "for every clustered row". **That is not the case**, it is simply CQL's way of showing the static row when both the static row and clustered rows are present: it "mixes" the static row into each clustered row. But to understand the static row, think of it as a separate row in the partition, with the property that there can be at most one static row in a partition; the static row exists if and only if at least one of the static columns are non-null.
|
|
|
|
Static rows in CDC
|
|
++++++++++++++++++
|
|
|
|
CDC separates static row updates from clustered row updates, showing them as different entries, even if you update both within one statement. Example:
|
|
|
|
.. code-block:: cql
|
|
|
|
CREATE TABLE ks.t (pk int, ck int, s int static, c int, PRIMARY KEY (pk, ck)) WITH cdc = {'enabled': true};
|
|
UPDATE ks.t SET s = 0, c = 0 WHERE pk = 0 AND ck = 0;
|
|
SELECT "cdc$batch_seq_no", pk, ck, s, c, "cdc$operation" FROM ks.t_scylla_cdc_log;
|
|
|
|
returns:
|
|
|
|
.. code-block:: none
|
|
|
|
cdc$batch_seq_no | pk | ck | s | c | cdc$operation
|
|
------------------+----+------+------+------+---------------
|
|
0 | 0 | null | 0 | null | 1
|
|
1 | 0 | 0 | null | 0 | 1
|
|
|
|
(2 rows)
|
|
|
|
CDC recognizes that logically two updates happened: one to the static row and one to the clustered row. In other words, CDC interprets the following:
|
|
|
|
.. code-block:: cql
|
|
|
|
UPDATE ks.t SET s = 0, c = 0 WHERE pk = 0 AND ck = 0;
|
|
|
|
as follows:
|
|
|
|
.. code-block:: cql
|
|
|
|
BEGIN UNLOGGED BATCH
|
|
UPDATE ks.t SET s = 0 WHERE pk = 0;
|
|
UPDATE ks.t SET c = 0 WHERE pk = 0 AND ck = 0;
|
|
APPLY BATCH;
|
|
|
|
However, since they happened in a single statement and had a single timestamp, it grouped them using ``cdc$batch_seq_no``. The static row update can be distinguished from the clustered row update by looking at clustering key columns, in case ``ck``: they are ``null`` for static row updates and non-null for clustered row updates.
|