Commit Graph

242 Commits

Author SHA1 Message Date
Piotr Grabowski
61c8e196be cdc: improve exception message of invalid "ttl"
Improve the exception message of providing invalid "ttl" value to the
table.

Previously, if you executed a CREATE TABLE query with invalid "ttl"
value, you would get a non-descriptive error message:

CREATE TABLE ks.t(pk int, PRIMARY KEY(pk)) WITH cdc = {'enabled': true, 'ttl': 'invalid'};
ServerError: stoi

This commit adds more descriptive exception messages:

CREATE TABLE ks.t(pk int, PRIMARY KEY(pk)) WITH cdc = {'enabled': true, 'ttl': 'kgjhfkjd'};
ConfigurationException: Invalid value for CDC option "ttl": kgjhfkjd

CREATE TABLE ks.t(pk int, PRIMARY KEY(pk)) WITH cdc = {'enabled': true, 'ttl': '75747885787487'};
ConfigurationException: Invalid CDC option: ttl too large
2021-04-14 17:40:23 +02:00
Piotr Grabowski
10390afc10 cdc: add validation of "enable" and "postimage"
Add validation of "enable" and "postimage" CDC options. Both options
are boolean options, but previously they were not validated, meaning
you could issue a query:

CREATE TABLE ks.t(pk int, PRIMARY KEY(pk)) WITH cdc = {'enabled': 'dsfdsd'};

and it would be executed without any errors, silently interpreting
"dsfdsd" as false.

This commit narrows possible values of those boolean CDC options to
false, true, 0, 1. After applying this change, issuing the query above
would result in this error message:

ConfigurationException: Invalid value for CDC option "enabled": dsfdsd
2021-04-14 17:36:38 +02:00
Piotr Sarna
d77eb39076 Merge 'cdc: log: avoid linearizations' from Michał Chojnowski
CDC log uses `bytes` to deal with cells and their values, and linearizes all
values indiscriminately.  This series makes a switch from `bytes` to
`managed_bytes` to avoid that linearization.

Fixes #7506.

Closes #8429

* github.com:scylladb/scylla:
  cdc: log: change yet another occurence of `bytes` to `managed_bytes`
  cdc: log: switch the remaining usages of `bytes` to `managed_bytes` in collection_visitor
  cdc: log: change `deleted_elements` in log_mutation_builder from bytes to managed_bytes
  cdc: log: rewrite collection merge to use managed_bytes instead of bytes
  cdc: log: don't linearize collections in get_preimage_col_value
  cdc: log: change return type of get_preimage_col_value to managed_bytes
  cdc: log: remove an unnecessary copy in process_row_visitor::live_atomic_cell
  cdc: log: switch cell_map from bytes to managed_bytes
  cdc: log: change the argument of log_mutation_builder::set_value to managed_bytes_view
  cdc: log: don't linearize the primary key in log_mutation_builder
  atomic_cell: add yet another variant of make_live for managed_bytes_view
  compound: add explode_fragmented
2021-04-12 10:56:12 +02:00
Michał Chojnowski
6b31f73987 cdc: log: change yet another occurence of bytes to managed_bytes 2021-04-08 10:16:21 +02:00
Michał Chojnowski
061f72166c cdc: log: switch the remaining usages of bytes to managed_bytes in collection_visitor 2021-04-08 10:16:21 +02:00
Michał Chojnowski
2760382a68 cdc: log: change deleted_elements in log_mutation_builder from bytes to managed_bytes 2021-04-08 10:16:21 +02:00
Michał Chojnowski
ba53c85829 cdc: log: rewrite collection merge to use managed_bytes instead of bytes 2021-04-08 10:16:21 +02:00
Michał Chojnowski
42acdc4d09 cdc: log: don't linearize collections in get_preimage_col_value 2021-04-08 10:16:21 +02:00
Michał Chojnowski
70a2bed70b cdc: log: change return type of get_preimage_col_value to managed_bytes 2021-04-08 10:16:21 +02:00
Michał Chojnowski
4214e74678 cdc: log: remove an unnecessary copy in process_row_visitor::live_atomic_cell 2021-04-08 10:16:11 +02:00
Michał Chojnowski
c2b43c8daf cdc: log: switch cell_map from bytes to managed_bytes 2021-04-08 10:05:30 +02:00
Michał Chojnowski
4e8eb07de4 cdc: log: change the argument of log_mutation_builder::set_value to managed_bytes_view 2021-04-08 10:05:00 +02:00
Michał Chojnowski
f18b74eee5 cdc: log: don't linearize the primary key in log_mutation_builder 2021-04-08 10:04:31 +02:00
Nadav Har'El
0dd6f2db8f Merge 'CDC generations: refactors and improvements' from Kamil Braun
The "most important" major changes are:

1. storage_service: simplify CDC generation management during node replace

Previously, when node A replaced node B, it would obtain B's
generation timestamp from its application state (gossiped by other
nodes) and start gossiping it immediately on bootstrap.

But that's not necessary:
  - if this is the timestamp of the last (current) generation, we would
     obtain it from other nodes anyway (every node gossips the last known
     timestamp),
  - if this is the timestamp of an earlier generation, we would forget
     it immediately and start gossiping the last timestamp (obtained from
     other nodes).

This commit simplifies the bootstrap code (in node-replace case) a bit:
the replacing node no longer attempts to retrieve the CDC generation
timestamp from the node being replaced.

2. tree-wide: introduce cdc::generation_id type

Each CDC generation has a timestamp which denotes a logical point in time
when this generation starts operating. That same timestamp is
used to identify the CDC generation. We use this identification scheme
to exchange CDC generations around the cluster.

However, the fact that a generation's timestamp is used as an ID for
this generation is an implementation detail of the currently used method
of managing CDC generations.

Places in the code that deal with the timestamp, e.g. functions which
take it as an argument (such as handle_cdc_generation) are often
interested in the ID aspect, not the "when does the generation start
operating" aspect. They don't care that the ID is a `db_clock::time_point`.
They may sometimes want to retrieve the time point given the ID (such as
do_handle_cdc_generation when it calls `cdc::metadata::insert`),
but they don't care about the fact that the time point actually IS the ID.

In the future we may actually change the specific type of the ID if we
modify the generation management algorithms.

This commit is an intermediate step that will ease the transition in the
future. It introduces a new type, `cdc::generation_id`. Inside it contains
the timestamp, so:
- if a piece of code doesn't care about the timestamp, it just passes
   the ID around
- if it does care, it can access it using the `get_ts` function.
   The fact that `get_ts` simply accesses the ID's only field is an
   implementation detail.

3. cdc: handle missing generation case in check_and_repair_cdc_streams

check_and_repair_cdc_streams assumed that there is always at least
one generation being gossiped by at least one of the nodes. Otherwise it
would enter undefined behavior.

I'm not aware of any "real" scenario where this assumption wouldn't be
satisfied at the moment where check_and_repair_cdc_streams makes it
except perhaps some theoretical races. But it's best to stay on the safe
side.

---

Additionally the PR does some simplifications, stylistic improvements,
removes some dead code, coroutinizes some functions, uncoroutinizes others
(due to miscompiles), adds additional logging, updates some stale comments.
Read commit messages for more details.

Closes #8283

* github.com:scylladb/scylla:
  cdc: log a message when creating a new CDC generation
  cdc: handle missing generation case in check_and_repair_cdc_streams
  tree-wide: introduce cdc::generation_id type
  tree-wide: rename "cdc streams timestamp" to "cdc generation id"
  cdc: remove some functions from generation.hh
  storage_service: make set_gossip_tokens a static free-function
  db: system_keyspace: group cdc functions in single place
  cdc: get rid of "get_local_streams_timestamp"
  sys_dist_ks: update comment at quorum_if_many
  storage_service: simplify CDC generation management during node replace
2021-04-07 14:49:02 +03:00
Kamil Braun
6525111d21 cdc: log a message when creating a new CDC generation 2021-04-07 13:47:16 +02:00
Kamil Braun
0978155bec cdc: handle missing generation case in check_and_repair_cdc_streams
check_and_repair_cdc_streams assumed that there is always at least
one generation being gossiped by at least one of the nodes. Otherwise it
would enter undefined behavior.

I'm not aware of any "real" scenario where this assumption wouldn't be
satisfied at the moment where check_and_repair_cdc_streams makes it
except perhaps some theoretical races. But it's best to stay on the safe
side.
2021-04-07 13:47:16 +02:00
Kamil Braun
99fd2244a3 tree-wide: introduce cdc::generation_id type
This is a follow-up to the previous commit.

Each CDC generation has a timestamp which denotes a logical point in time
when this generation starts operating. That same timestamp is
used to identify the CDC generation. We use this identification scheme
to exchange CDC generations around the cluster.

However, the fact that a generation's timestamp is used as an ID for
this generation is an implementation detail of the currently used method
of managing CDC generations.

Places in the code that deal with the timestamp, e.g. functions which
take it as an argument (such as handle_cdc_generation) are often
interested in the ID aspect, not the "when does the generation start
operating" aspect. They don't care that the ID is a `db_clock::time_point`.
They may sometimes want to retrieve the time point given the ID (such as
do_handle_cdc_generation when it calls `cdc::metadata::insert`),
but they don't care about the fact that the time point actually IS the ID.

In the future we may actually change the specific type of the ID if we
modify the generation management algorithms.

This commit is an intermediate step that will ease the transition in the
future. It introduces a new type, `cdc::generation_id`. Inside it contains
the timestamp, so:
1. if a piece of code doesn't care about the timestamp, it just passes
   the ID around
2. if it does care, it can simply access it using the `get_ts` function.
   The fact that `get_ts` simply accesses the ID's only field is an
   implementation detail.

Using the occasion, we change the `do_handle_cdc_generation_intercept...`
function to be a standard function, not a coroutine. It turns out that -
depending on the shape of the passed-in argument - the function would
sometimes miscompile (the compiled code would not copy the argument to the
coroutine frame).
2021-04-07 13:47:13 +02:00
Konstantin Osipov
c83cf1f965 uuid: switch the API to use std::chrono
A follow up for the patch for #7611. This change was requested
during review and moved out of #7611 to reduce its scope.

The patch switches UUID_gen API from using plain integers to
hold time units to units from std::chrono.

For one, we plan to switch the entire code base to std::chrono units,
to ensure type safety. Secondly, using std::chrono units allows to
increase code reuse with template metaprogramming and remove a few
of UUID_gen functions that beceme redundant as a result.

* switch  get_time_UUID(), unix_timestamp(), get_time_UUID_raw(), switch
  min_time_UUID(), max_time_UUID(), create_time_safe() to
  std::chrono
* remove unused variant of from_unix_timestamp()
* remove unused get_time_UUID_bytes(), create_time_unsafe(),
  redundant get_adjusted_timestamp()
* inline get_raw_UUID_bytes()
* collapse to similar implementations of get_time_UUID()
* switch internal constants to std::chrono
* remove unnecessary unique_ptr from UUID_gen::_instance
Message-Id: <20210406130152.3237914-2-kostja@scylladb.com>
2021-04-06 17:12:54 +03:00
Kamil Braun
e486e0f759 tree-wide: rename "cdc streams timestamp" to "cdc generation id"
Each CDC generation always has a timestamp, but the fact that the
timestamp identifies the generation is an implementation detail.
We abstract away from this detail by using a more generic naming scheme:
a generation "identifier" (whatever that is - a timestamp or something
else).

It's possible that a CDC generation will be identified by more than a
timestamp in the (near) future.

The actual string gossiped by nodes in their application state is left
as "CDC_STREAMS_TIMESTAMP" for backward compatibility.

Some stale comments have been updated.
2021-04-06 13:15:31 +02:00
Kamil Braun
0cb2f58514 cdc: remove some functions from generation.hh
They are not used outside of the generation module.
2021-04-06 13:15:31 +02:00
Kamil Braun
2e2d51cf2b cdc: get rid of "get_local_streams_timestamp"
This function retrieves the persisted timestamp of the last known CDC
generation (which this node is currently gossiping to other nodes).
It checks that the timestamp is present; if not, it throws an error.

The check is unnecessary. It's used only in a quite esoteric place
(start_gossiping, which implements an almost-never-used API call),
and it's fine if the timestamp is gone - in start_gossiping,
we can start gossiping the tokens without the CDC generation timestamp
(well, if the timestamp is not present in system tables, something
weird must have happened, but that doesn't mean we can't resume
gossiping - fixing CDC generation management in such a case is
a separate problem).
2021-04-06 13:15:31 +02:00
Wojciech Mitros
daa31be37f types: replace buffers in tuple_deserializing_iterator with fragmented ones
In preparation for removing linearization from abstract_type::compare,
add options to avoid linearization in tuple_deserializing_iterator.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2021-03-31 06:35:09 +02:00
Calle Wilund
e4d6c8904f untyped_result_set: Do not copy data from input store (retain fragmented views)
Refs #7961
Fixes #8014

Instead of doing a deep copy of input, we keep assume ownership and build
rows of the views therein, potentially retaining fragmented data as-is
avoiding premature linearization.

Note that this is not all sugar and flowers though. Any data access will
by nature be more expensive, and the view collections we create are
potentially just as expensive as copying for small cells.

Otoh, it allows writing code using this that avoids data copying,
depending on destination.

v2:
* Fixed wrong collection reserved in visitor
* Changed row index from shared ptr to ref
* Moved typedef
* Removed non-existing constructors
* Added const ref to index build
* Fixed raft usage after rebase

v3:
* Changed shared_ptr to unique
2021-03-03 10:19:46 +00:00
Kamil Braun
e2f03e4aba cdc: move (most of) CDC generation management code to the new service
Currently all management of CDC generations happens in storage_service,
which is a big ball of mud that does many unrelated things.

Previous commits have introduced a new service for managing CDC
generations. This code moves most of the relevant code to this new
service.

However, some part still remains in storage_service: the bootstrap
procedure, which happens inside storage_service, must also do some
initialization regarding CDC generations, for example: on restart it
must retrieve the latest known generation timestamp from disk; on
bootstrap it must create a new generation and announce it to other
nodes. The order of these operations w.r.t the rest of the startup
procedure is important, hence the startup procedure is the only right
place for them.

Still, what remains in storage_service is a small part of the entire
CDC generation management logic; most of it has been moved to the
new service. This includes listening for generation changes and
updating the data structures for performing CDC log writes (cdc::metadata).
Furthermore these functions now return futures (and are internally
coroutines), where previously they required a seastar::async context.
2021-02-26 12:06:12 +01:00
Kamil Braun
022d7773f4 cdc: coroutinize make_new_cdc_generation 2021-02-22 12:47:44 +01:00
Kamil Braun
26ca9d6c33 cdc: coroutinize update_streams_description 2021-02-22 12:46:53 +01:00
Kamil Braun
d4937daaea cdc: introduce cdc::generation_service
This commit introduces a new service crafted to handle CDC generation
management: listening and reacting to generation changes in the cluster.

The implementation is a stub for now, the service reacts to generation
changes by simply logging the event.

The commit plugs the service in, initializing it in main and test code,
passing a reference to storage_service and having storage_service start
the service (using the `after_join` method): the service only starts
doing its job after the node joins the token ring (either on bootstrap
or restart).
2021-02-22 12:45:43 +01:00
Avi Kivity
90a7f76fb6 Merge 'cdc: log: fix a use-after-free in process_bytes_visitor' from Michał Chojnowski
Due to small value optimization used in `bytes`, views to `bytes` stored
in `vector` can be invalidated when the vector resizes, resulting in
use-after-free and data corruption. Fix that.

Closes #8105

* github.com:scylladb/scylla:
  cdc: log: avoid an unnecessary copy
  cdc: log: fix use-after-free in process_bytes_visitor
2021-02-18 20:23:41 +02:00
Michał Chojnowski
96c22cf3f8 cdc: log: avoid an unnecessary copy
There is no need to copy `bytes_view` into `bytes` here.
2021-02-18 14:08:18 +01:00
Michał Chojnowski
8cc4f39472 cdc: log: fix use-after-free in process_bytes_visitor
Due to small value optimization used in `bytes`, views to `bytes` stored
in `vector` can be invalidated when the vector resizes, resulting in
use-after-free and data corruption. Fix that.

Fixes #8117
2021-02-18 14:08:17 +01:00
Kamil Braun
841f07e9b7 cdc: add config option to disable streams rewriting
Rewriting stream descriptions is a long, expensive, and prone-to-failure
operation. Due to #8061 it may consume a lot of memory. In general, it
may keep failing (and being retried) endlessly, straining the cluster.
As a backdoor we add this flag for potential future needs of admins or
field engineers.

I don't expect it will ever be used, but it won't hurt and may save us
some work in the worst case scenario.
2021-02-18 11:44:59 +01:00
Kamil Braun
9bdd000e97 cdc: rewrite streams to the new description table
Nodes automatically ensure that the latest CDC generation's list of
streams is present in the streams description table. When a new
generation appears, we only need to update the table for this
generation; old generations are already inserted.

However, we've changed the description table (from
`cdc_streams_descriptions` to `cdc_streams_descriptions_v2`). The
existing mechanism only ensures that the latest generation appears in
the new description table. This commit adds an additional procedure that
rewrites the older generations as well, if we find that it is necessary
to do so (i.e. when some CDC log tables may contain data in these
generations).
2021-02-18 11:44:59 +01:00
Kamil Braun
7c91894ddf cdc: introduce no_generation_data_exception exception type 2021-02-18 11:44:59 +01:00
Kamil Braun
44aab61aea cdc: coroutinize do_update_streams_description 2021-02-18 11:44:59 +01:00
Kamil Braun
67d4e5576d sys_dist_ks: split CDC streams table partitions into clustered rows
Until now, the lists of streams in the `cdc_streams_descriptions` table
for a given generation were stored in a single collection. This solution
has multiple problems when dealing with large clusters (which produce
large lists of streams):
1. large allocations
2. reactor stalls
3. mutations too large to even fit in commitlog segments

This commit changes the schema of the table as described in issue #7993.
The streams are grouped according to token ranges, each token range
being represented by a separate clustering row. Rows are inserted in
reasonably large batches for efficiency.

The table is renamed to enable easy upgrade. On upgrade, the latest CDC
generation's list of streams will be (re-)inserted into the new table.

Yet another table is added: one that contains only the generation
timestamps clustered in a single partition. This makes it easy for CDC
clients to learn about new generations. It also enables an elegant
two-phase insertion procedure of the generation description: first we
insert the streams; only after ensuring that a quorum of replicas
contains them, we insert the timestamp. Thus, if any client observes a
timestamp in the timestamps table (even using a ONE query),
it means that a quorum of replicas must contain the list of streams.
2021-02-18 11:44:59 +01:00
Kamil Braun
ba920361b3 cdc: use chunked_vector for streams in streams_version
The vector may get quite long (say... 1,6M stream IDs). We prevent a
large allocation by using utils::chunked_vector.
2021-02-18 11:44:59 +01:00
Kamil Braun
9ae4467970 cdc: remove streams_version::expired field
This field was not used anywhere.
2021-02-18 11:44:59 +01:00
Avi Kivity
c63e26e26f Merge 'cdc: Limit size of topology description' from Piotr Jastrzębski
Currently, whole topology description for CDC is stored in a single row.
This means that for a large cluster of strong machines (say 100 nodes 64
cpus each), the size of the topology description can reach 32MB.

This causes multiple problems. First of all, there's a hard limit on
mutation size that can be written to Scylla. It's related to commit log
block size which is 16MB by default. Mutations bigger than that can't be
saved. Moreover, such big partitions/rows cause reactor stalls and
negatively influence latency of other requests.

This patch limits the size of topology description to about 4MB. This is
done by reducing the number of CDC streams per vnode and can lead to CDC
data not being fully colocated with Base Table data on shards. It can
impact performance and consistency of data.

This is just a quick fix to make it easily backportable. A full solution
to the problem is under development.

For more details see #7961, #7993 and #7985.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>

Closes #8048

* github.com:scylladb/scylla:
  cdc: Limit size of topology description
  cdc: Extract create_stream_ids from topology_description_generator
2021-02-17 15:43:53 +02:00
Piotr Jastrzebski
649f254863 cdc: Limit size of topology description
Currently, whole topology description for CDC is stored in a single row.
This means that for a large cluster of strong machines (say 100 nodes 64
cpus each), the size of the topology description can reach 32MB.

This causes multiple problems. First of all, there's a hard limit on
mutation size that can be written to Scylla. It's related to commit log
block size which is 16MB by default. Mutations bigger than that can't be
saved. Moreover, such big partitions/rows cause reactor stalls and
negatively influence latency of other requests.

This patch limits the size of topology description to about 4MB. This is
done by reducing the number of CDC streams per vnode and can lead to CDC
data not being fully colocated with Base Table data on shards. It can
impact performance and consistency of data.

This is just a quick fix to make it easily backportable. A full solution
to the problem is under development.

For more details see #7961, #7993 and #7985.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2021-02-17 13:24:40 +01:00
Botond Dénes
ba7a9d2ac3 imr: switch back to open-coded description of structures
Commit aab6b0ee27 introduced the
controversial new IMR format, which relied on a very template-heavy
infrastructure to generate serialization and deserialization code via
template meta-programming. The promise was that this new format, beyond
solving the problems the previous open-coded representation had (working
on linearized buffers), will speed up migrating other components to this
IMR format, as the IMR infrastructure reduces code bloat, makes the code
more readable via declarative type descriptions as well as safer.
However, the results were almost the opposite. The template
meta-programming used by the IMR infrastructure proved very hard to
understand. Developers don't want to read or modify it. Maintainers
don't want to see it being used anywhere else. In short, nobody wants to
touch it.

This commit does a conceptual revert of
aab6b0ee27. A verbatim revert is not
possible because related code evolved a lot since the merge. Also, going
back to the previous code would mean we regress as we'd revert the move
to fragmented buffers. So this revert is only conceptual, it changes the
underlying infrastructure back to the previous open-coded one, but keeps
the fragmented buffers, as well as the interface of the related
components (to the extent possible).

Fixes: #5578
2021-02-16 23:43:07 +01:00
Piotr Jastrzebski
390cef6a96 cdc: Extract create_stream_ids from topology_description_generator
This new function will be used in the following patches in additional
places.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2021-02-10 10:24:06 +01:00
Konstantin Osipov
b4f875f08e uuid: reduce code dependency on UUID_gen.hh
Do not include UUID_gen.hh in trace_state.hh and lists.hh
to reduce header level dependency on it.

Message-Id: <20210127173114.725761-2-kostja@scylladb.com>
2021-01-27 20:08:29 +02:00
Benny Halevy
c60da2e90d cdc: remove _token_metadata from db_context
1. It's unused since cbe510d1b8
2. It's unsafe to keep a reference to token_metadata&
potentially across yield points.

The higher-level motivation is to make
storage_service::get_token_metadata() private so we
can control better how it's used.

For cdc, if the token_metadata is going to be needed
to the future, it'd be better get it from
db_context::_proxy.get_token_metadata_ptr().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20201213162351.52224-2-bhalevy@scylladb.com>
2020-12-13 18:32:17 +02:00
Pavel Emelyanov
89fd524c5a schema-tables: Add database argument to make_update_table_mutations
There are 3 callers of this helper (cdc, migration manager and tests)
and all of them already have the database object at hands.

The argument will be used by next patch to remove call for global
storage proxy instance from make_update_indices_mutations.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-12-11 21:21:22 +03:00
Kamil Braun
2da723b9c8 cdc: produce postimage when inserting with no regular columns
When a row was inserted into a table with no regular columns, and no
such row existed in the first place, postimage would not be produced.
Fix this.

Fixes #7716.

Closes #7723
2020-12-01 18:01:23 +02:00
Piotr Sarna
5a9dc6a3cc Merge 'Cleanup CDC tests after CDC became GA' from Piotr Jastrzębski
Now that CDC is GA, it should be enabled in all the tests by default.
To achieve that the PR adds a special db::config::add_cdc_extension()
helper which is used in cql_test_envm to make sure CDC is usable in
all the tests that use cql_test_env.m As a result, cdc_tests can be
simplified.
Finally, some trailing whitespaces are removed from cdc_tests.

Tests: unit(dev)

Closes #7657

* github.com:scylladb/scylla:
  cdc: Remove trailing whitespaces from cdc_tests
  cdc: Remove mk_cdc_test_config from tests
  config: Add add_cdc_extension function for testing
  cdc: Add missing includes to cdc_extension.hh
2020-11-20 13:56:29 +01:00
Piotr Jastrzebski
89f4298670 cdc: Add missing includes to cdc_extension.hh
Without those additional includes, a .cc file
that includes cdc_extension.hh won't compile.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-11-19 16:11:33 +01:00
Piotr Jastrzebski
3024795507 cdc: Change for_testing to add_delay in make_new_cdc_generation
The meaning of the parameter changes from defining whether the function
is called in testing environment to deciding whether a delay should be
added to a timestamp of a newly created CDC generation.

This is a preparation for improvement in the following patch that does
not always add delay to every node but only to non-first node.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-11-19 12:19:42 +01:00
Piotr Jastrzebski
6b1167ea0d cdc: Remove std::iterator from collection_iterator
std::iterator is deprecated since C++17 so define all the required
iterator_traits directly.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-11-17 16:53:20 +01:00
Piotr Jastrzebski
2091408478 cdc: Make it possible for CDC generation creation to fail
Following patch enables CDC by default and this means CDC has to work
will all the clusters now.

There is a problematic case when existing cluster with no CDC support
is stopped, all the binaries are updated to newer version with
CDC enabled by default. In such case, nodes know that they are already
members of the cluster but they can't find any CDC generation so they
will try to create one. This creation may fail due to lack of QUORUM
for the write.

Before this patch such situation would lead to node failing to start.
After the change, the node will start but CDC generation will be
missing. This will mean CDC won't be able to work on such cluster before
nodetool checkAndRepairCdcStreams is run to fix the CDC generation.

We still fail to bootstrap if the creation of CDC generation fails.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-11-12 12:29:31 +01:00