Commit Graph

23 Commits

Author SHA1 Message Date
Nadav Har'El
1ed21d70dc merge: CDC: do mutation augmentation from storage proxy
Merged pull request https://github.com/scylladb/scylla/pull/5567
from Calle Wilund:

Fixes #5314

Instead of tying CDC handling into cql statement objects, this patch set
moves it to storage proxy, i.e. shared code for mutating stuff. This means
we automatically handle cdc for code paths outside cql (i.e. alternator).

It also adds api handling (though initially inefficient) for batch statements.

CDC is tied into storage proxy by giving the former a ref to the latter (per
shard). Initially this is not a constructor parameter, because right now we
have chicken and egg issues here. Hopefully, Pavels refactoring of migration
manager and notifications will untie these and this relationship can become
nicer.

The actual augmentation can (as stated above) be made much more efficient.
Hopefully, the stream management refactoring will deal with expensive stream
lookup, and eventually, we can maybe coalesce pre-image selects for batches.
However, that is left as an exercise for when deemed needed.

The augmentation API has an optional return value for a "post-image handler"
to be used iff returned after mutation call is finished (and successful).
It is not yet actually invoked from storage_proxy, but it is at least in the
call chain.
2020-01-16 17:12:56 +02:00
Pavel Emelyanov
9d31bc166b cdc: Use migration_notifier to (un)register for events
If no one provided -- get it from storage_service.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-01-15 14:29:19 +03:00
Calle Wilund
313ed91ab0 cdc: Listen for migration callbacks on all shards
Fixes #5582

... but only populate log on shard 0.

Migration manager callbacks are slightly assymetric. Notifications
for pre-create/update mutations are sent only on initiating shard
(neccesary, because we consider the mutations mutable).
But "created" callbacks are sent on all shards (immutable).

We must subscribe on all shards, but still do population of cdc table
only once, otherwise we can either miss table creat or populate
more than once.

v2:
- Add test case
Message-Id: <20200113140524.14890-1-calle@scylladb.com>
2020-01-14 16:35:41 +01:00
Calle Wilund
75f2b2876b cdc: Remove free function for mutation augmentation 2020-01-13 13:18:55 +00:00
Calle Wilund
9f6b22d882 cdc: Assign self to storage proxy object 2020-01-07 12:01:58 +00:00
Calle Wilund
b6c788fccf cdc: Add augmentation call to cdc service
To eventually replace the free function.
Main difference is this is build to both handle batches correctly
and to eventually allow hanging cdc object on storage proxy,
and caches on the cdc object.
2020-01-07 12:01:58 +00:00
Avi Kivity
e223154268 cdc: options: return an empty options map when cdc is disabled
This is compatible with 3.1 and below, which didn't have that schema
field at all.
2019-12-29 16:34:37 +02:00
Calle Wilund
cb0117eb44 cdc: Handle schema changes via migration manager callbacks
This allows us to create/alter/drop log and desc tables "atomically"
with the base, by including these mutations in the original mutation
set, i.e. batch create/alter tables.

Note that population does not happen until types are actually
already put into database (duh), thus there _is_ still a gap
between creating cdc and it being truly usable. This may or may
not need handling later.
2019-12-09 14:35:04 +00:00
Calle Wilund
a21e140169 cdc: Add sharded service that does nothing.
But can be used to hang functionality into eventually.
2019-12-09 12:12:09 +00:00
Calle Wilund
8c6d6254cf cdc: Remove some code from header 2019-12-02 13:00:19 +00:00
Piotr Jastrzebski
222b94c707 cdc: Return preimage only when it's requested
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-11-25 12:43:39 +01:00
Piotr Jastrzebski
595c9f9d32 cdc::append_log_mutations: fix undefined behavior
The code was iterating over a collection that was modified
at the same time. Iterators were used for that and collection
modification can invalidate all iterators.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-11-25 12:43:39 +01:00
Piotr Jastrzebski
f0f44f9c51 cdc::append_log_mutations: use do_with instead of shared_ptr
This will not only safe some allocations but also improve
code readability.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-11-25 12:43:39 +01:00
Piotr Jastrzebski
b8d9158c21 cdc: Don't take storage_proxy as transformer::pre_image_select param
transformer has access to storage_proxy through its _ctx field.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-11-25 12:43:39 +01:00
Calle Wilund
7d98f735ee cdc: Add static columns to data/preimage mutations
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-10-28 06:16:12 +01:00
Calle Wilund
19bba5608a cdc: Create and perform a pre-image select for mutations
As well as generate per-image rows in resulting log mutation

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-10-28 06:16:12 +01:00
Calle Wilund
d4ee1938c7 cdc: Add modification record for regular atomic values in mutations
Fills in the data columns for regular columns iff they are
atomic (not unfrozed collections)
2019-10-28 06:16:12 +01:00
Calle Wilund
3fdcbd9dff cdc: Set row op in log
Adds actual operation (part delete, range delete, update) to
cdc log
2019-10-28 06:16:12 +01:00
Calle Wilund
8a6b72f47e cdc: Add pre-image select generator method
Based on a mutation, creates a pre-image select operation.

Note, this uses raw proxy query to shortcut parsing etc,
instead of trying to cache by generated query. Hypothesis is that
this is essentially faster.

The routine assumes all rows in a mutation touch same static/regular
columns. If this is not always true it will need additional
calculations.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-10-28 06:16:12 +01:00
Calle Wilund
451bb7447d cdc: Add log / log data column operation types and make data cols tuples of these
Makes static/regular data columns tuple<op, value, ttl> as per spec.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-10-28 06:16:12 +01:00
Piotr Jastrzebski
997be35ef3 modification_statement: log in cdc clustering key of a change
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-10-17 11:28:23 +02:00
Piotr Jastrzebski
96c800ed0b modification_statement: log in cdc partition key of a change
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-10-17 11:28:23 +02:00
Piotr Jastrzebski
81a34168a3 create_table_statement: handle 'with cdc ='
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-10-17 11:28:14 +02:00