Commit Graph

151 Commits

Author SHA1 Message Date
Gleb Natapov
4402b030ae cdc: drop usage of cdc_local table and v1 generation definition 2026-03-10 10:39:59 +02:00
Gleb Natapov
6a7e850161 cdc: remove legacy code
The patch removes test/boost/cdc_generation_test.cc since it unit tests
cdc::limit_number_of_streams_if_needed function which is remove here.
2026-03-10 10:38:57 +02:00
Gleb Natapov
08268eee3f topology: disable force-gossip-topology-changes option
The patch marks force-gossip-topology-changes as deprecated and removes
tests that use it. There is one test (test_different_group0_ids) which
is marked as xfail instead since it looks like gossiper mode was used
there as a way to easily achieve a certain state, so more investigation
is needed if the tests can be fixed to use raft mode instead.

Closes scylladb/scylladb#28383
2026-02-02 09:56:32 +01:00
Petr Gusev
889d7782ed treewide: use coroutine::maybe_yield in coroutines
It's more efficient since coroutine::maybe_yield returns
a lightweight struct (awaitable), not the future.

Closes scylladb/scylladb#28101
2026-01-12 10:38:47 +01:00
Michael Litvak
e7dbccd59e cdc: use chunked_vector instead of vector for stream ids
use utils::chunked_vector instead of std::vector to store cdc stream
sets for tablets.

a cdc stream set usually represents all streams for a specific table and
timestamp, and has a stream id per each tablet of the table. each stream
id is represented by 16 bytes. thus the vector could require quite large
contiguous allocations for a table that has many tablets. change it to
chunked_vector to avoid large contiguous allocations.

Fixes scylladb/scylladb#26791

Closes scylladb/scylladb#26792
2025-10-31 13:02:34 +01:00
Michael Litvak
8743422241 cdc: improve cdc metadata loading
when loading CDC streams metadata for tablets from the tables, read only
new entries from the history table instead of reading all entries. This
improves the CDC metadata reloading, making it more efficient and
predictable.

the CDC metadata is loaded as part of group0 reload whenever the
internal CDC tables are modified. on tablet split / merge, we create a
new CDC timestamp and streams by writing them to the cdc_streams_history
table by group0 operation, and when it's applied we reload the in-memory
CDC streams map by reading from the tables and constructing the updated map.

Previously, on every update, we would read the entire
cdc_streams_history entries for the changed table, constructing all its
streams and creating a new map from scratch.

We improve this now by reading only new entries from cdc_streams_history
and append them to the existing map. we can do this because we only
append new entries to cdc_streams_history with higher timestamp than all
previous entries.

This makes this reloading more efficient and predictable, because
previously we would read a number of entries that depends on the number
of tablets splits and merges, which increases over time and is
unbounded, whereas now we read only a single stream set on each update.

Fixes scylladb/scylladb#26732
2025-10-28 08:54:09 +01:00
Michael Litvak
440caeabcb cdc: helpers for garbage collecting old streams for tablets
introduce helper functions that can be used for garbage collecting old
cdc streams for tablets-based keyspaces.

- get_new_base_for_gc: finds a new base timestamp given a TTL, such that
  all older timestamps and streams can be removed.
- get_cdc_stream_gc_mutations: given new base timestamp and streams,
  builds mutations that update the internal cdc tables and remove the
  older streams.
- garbage_collect_cdc_streams_for_table: combines the two functions
  above to find a new base and build mutations to update it for a
  specific table
- garbage_collect_cdc_streams: builds gc mutations for all cdc tables
2025-10-26 11:01:20 +01:00
Michael Litvak
98de3f0e86 topology coordinator: change streams on tablet split/merge
on tablet split/merge finalization, generate a new CDC timestamp and
stream set for the table with a new stream for each tablet in the new
tablet map, in order to maintain synchronization of the CDC streams with
the tablets.

We pick a new timestamp for the streams with a small delay into the
future so that all nodes can learn about the new streams in time, in the
same way it's done for vnodes.

the new timestamp and streams are published by adding a mutation to the
cdc_streams_history table that contains the timestamp and the sets of
closed and opened streams from the current timestamp.
2025-09-17 14:47:12 +02:00
Michael Litvak
b8442c6087 cdc: virtual tables for cdc with tablets
Define two new virtual tables in system keyspace: cdc_timestamps and
cdc_streams. They expose the internal cdc metadata for tablets-enabled
keyspace to be consumed by users consuming the CDC log.

cdc_timestamps lists all timestamps for a table where a stream change
occured.

cdc_streams list additionally the current streams sets for each
table and timestamp, as well as difference - closed and opened streams -
from the previous stream set.
2025-09-17 14:47:12 +02:00
Michael Litvak
9ec4b6ccb1 cdc: load tablet streams metadata from tables
Read the CDC stream metadata from the internal system tables, and store
it in the cdc metadata data structures.

The metadata is stored in the tables as diffs which is more storage
efficient, but when in-memory we store it as full stream sets for each
timestamp. This is more useful because we need to be able to find a
stream given timestamp and token.
2025-09-17 14:47:12 +02:00
Michael Litvak
aa61f074b5 cdc: helper functions for reading metadata from tables
Define functions in system_keyspace that read from the internal cdc
tables and construct the data into internal cdc data structures.
2025-09-17 14:47:12 +02:00
Michael Litvak
9ef0862155 cdc: remove streams when dropping CDC table
When dropping a CDC log table in a tablets-enabled keyspace, remove all
metadata about the table's CDC streams from the internal CDC tables,
since the streams can't be read anymore.

Similarly, when dropping a tablets-enabled keyspace, remove metadata of
all streams belonging to tables in the keyspace.
2025-09-17 14:47:12 +02:00
Michael Litvak
ed25e420f8 cdc: create streams when allocating tablets
When allocating tablets for a CDC table, create the initial CDC stream
set. We create one stream per each tablet, each stream covering the
corresponding token range.
2025-09-17 14:47:12 +02:00
Dawid Mędrek
508e00319b cdc/generation: Clone topology_description asynchronously
An instance of `cdc::topology_description` can be quite big. The vector
it consists of stores as many `token_range_description`s as there are
vnodes, and the size of each `token_range_description` is O(#shards).

Because of that, copying an instance of the type can lead to reactor
stalls. To prevent that, we introduce an asynchronous function copying
the contents on the object.

Reactor stalls were detected in the call to `map_reduce` in
`generation_service::legacy_do_handle_cdc_generation`, so let's start
using the new function there.

A similar scenario occurs in `generation_service::handle_cdc_generation`,
so we modify it too.

Unfortunately, it doesn't seem viable to provide a reproducer of said
problem.

Fixes scylladb/scylladb#24522
2025-08-29 13:54:00 +02:00
Ernest Zaslavsky
d2c5765a6b treewide: Move keys related files to a new keys directory
As requested in #22102, #22103 and #22105 moved the files and fixed other includes and build system.

Moved files:
- clustering_bounds_comparator.hh
- keys.cc
- keys.hh
- clustering_interval_set.hh
- clustering_key_filter.hh
- clustering_ranges_walker.hh
- compound_compat.hh
- compound.hh
- full_position.hh

Fixes: #22102
Fixes: #22103
Fixes: #22105

Closes scylladb/scylladb#25082
2025-07-25 10:45:32 +03:00
Piotr Dulikowski
66acaa1bf8 cdc: add sanity check for generating an empty generation
It doesn't make sense to create an empty CDC generation because it does
not make sense to have a cluster with no tokens. Add a sanity check to
cdc::make_new_generation_description which fails if somebody attempts to
do that (i.e. when the set of current tokens + optionally bootstrapping
node's tokens is empty).

The function does not work correctly if it is misused, as we saw in
scylladb/scylladb#23897. While the function should not be misused in the
first place, it's better to throw an exception rather than crash -
especially that this crash could happen on the topology coordinator.
2025-04-25 11:25:07 +02:00
Gleb Natapov
28fb84117d treewide: drop id parameter from gossiper::for_each_endpoint_state
We have it in endpoint_state anyway, so no need to pass both.
2025-03-31 16:50:50 +03:00
Gleb Natapov
4609bbbbb2 treewide: move gossiper to index nodes by host id
This patch changes gossiper to index nodes by host ids instead of ips.
The main data structure that changes is _endpoint_state_map, but this
results in a lot of changes since everything that uses the map directly
or indirectly has to be changed. The big victim of this outside of the
gossiper itself is topology over gossiper code. It works on IPs and
assumes the gossiper does the same and both need to be changed together.
Changes to other subsystems are much smaller since they already mostly
work on host ids anyway.
2025-03-31 16:50:50 +03:00
Gleb Natapov
499eb4d17f treewide: pass host id to endpoint state change subscribers 2025-03-11 12:09:22 +02:00
Piotr Dulikowski
56ae119b19 cdc: generation: don't capture token metadata when retrying update
In legacy topology mode, on startup, a node will attempt to insert data
of the newest CDC generation into the legacy distributed tables. In case
of any errors, the operation will be retried until success in 60s
intervals. While the node waits for the operation to be retried, it
keeps a token_metadata_ptr instance. This is a problem for two reasons:

- The tmptr instance is used in a lambda which determines the cluster
  size. This lambda is used to determine the consistency level when
  inserting the generation to the distributed tables - if there is only
  one node, CL=ONE should be used instead of CL=QUORUM. The tmptr is
  immutable so it can technically happen the the cluster is shrinked
  while the code waits for the generation to be inserted.
- Token metadata instance keeps a version tracker that which prevents
  topology operations from proceeding while the tracker exists. This is
  a very niche problem, but it might happen that a leftover instance of
  token metadata held by update_streams_description might delay a
  topology operation which happens after upgrade to raft topology
  happens. This actually slows down the test which simulates upgrade to
  raft topology getting stuck (to be introduced in later commits).

Instead of capturing a token_metadata_ptr instance, capture a reference
to shared_token_metadata and use a freshly issued token_metadata_ptr
when computing the cluster size in order to choose the consistency
level.
2025-02-17 12:28:53 +01:00
Michael Litvak
4f5550d7f2 cdc: fix handling of new generation during raft upgrade
During raft upgrade, a node may gossip about a new CDC generation that
was propagated through raft. The node that receives the generation by
gossip may have not applied the raft update yet, and it will not find
the generation in the system tables. We should consider this error
non-fatal and retry to read until it succeeds or becomes obsolete.

Another issue is when we fail with a "fatal" exception and not retrying
to read, the cdc metadata is left in an inconsistent state that causes
further attempts to insert this CDC generation to fail.

What happens is we complete preparing the new generation by calling `prepare`,
we insert an empty entry for the generation's timestamp, and then we fail. The
next time we try to insert the generation, we skip inserting it because we see
that it already has an entry in the metadata and we determine that
there's nothing to do. But this is wrong, because the entry is empty,
and we should continue to insert the generation.

To fix it, we change `prepare` to return `true` when the entry already
exists but it's empty, indicating we should continue to insert the
generation.

Fixes scylladb/scylladb#21227

Closes scylladb/scylladb#22093
2025-01-28 18:05:32 +01:00
Gleb Natapov
593308a051 node_ops, cdc: drop remaining token_metadata::get_endpoint_for_host_id() usage
Use address map to translate id to ip instead. We want to drop ips from token_metadata.
2025-01-16 16:37:07 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Kefu Chai
f436edfa22 mutation: remove unused "#include"s
these unused includes are identified by clang-include-cleaner. after
auditing the source files, all of the reports have been confirmed.

please note, because `mutation/mutation.hh` does not include
`seastar/coroutine/maybe_yield.hh` anymore, and quite a few source
files were relying on this header to bring in the declaration of
`maybe_yield()`, we have to include this header in the places where
this symbol is used. the same applies to `seastar/core/when_all.hh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-29 14:01:44 +08:00
Avi Kivity
ee92784098 serialization: replace boost::type with std::type_identity
Recently, seastar rpc started accepting std::type_identity in addition
to boost::type as a type marker (while labeling the latter with an
ominous deprecation warning). Reduce our depedendency on boost
by switching to std::type_identity.
2024-11-05 00:43:27 +01:00
Kefu Chai
3e84d43f93 treewide: use seastar::format() or fmt::format() explicitly
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.

that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:

```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
  265 |     return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
      |            ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
 4290 |     format(format_string<_Args...> __fmt, _Args&&... __args)
      |     ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
  143 | format(fmt::format_string<A...> fmt, A&&... a) {
      | ^
```

in this change, we

change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
  `seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
  because, `sstring::operator std::basic_string` would incur a deep
  copy.

we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-11 23:21:40 +03:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Avi Kivity
d50ba03965 gossiper: remove initializer-list overload of add_local_application_state()
The initializer_list overload uses a too-clever technique to avoid copies.
While copies here are unlikely to pose any real problem (we're allocating
map nodes anyway), it's simple enough to provide a copy-less replacement
that doesn't require questionable tricks.

We replace the initializer_list<..., in<>> overload with a variadic
template that constructs a temporary map.
2024-07-10 14:11:27 +03:00
Tomasz Grabiec
9da3bd84c7 dht: Extract dht::static_sharder
Before the patch, dht::sharder could be instantiated and it would
behave like a static sharder. This is not safe with regards to
extensions of the API because if a derived implementation forgets to
override some method, it would incorrectly default to the
implementation from static sharder. Better to fail the compilation in
this case, so extract static sharder logic to dht::static_sharder
class and make all methods in dht::sharder pure virtual.

This also allows us to have algorithms indicate that they only work
with static sharder by accepting the type, and have compile-time
safety for this requirement.

schema::get_sharder() is changed to return the static_sharder&.
2024-05-16 00:28:47 +02:00
Patryk Jędrzejczak
628d7e709e cdc: generation: fix retrieve_generation_data_v2
`system_keyspace::read_cdc_generation_opt` queries
`system.cdc_generations_v3`, which stores ids of CDC generations
as timeuuids. This function shouldn't be called with a normal uuid
(used by `system.cdc_generations_v2` to store generation ids).
Such a call would end with a marshaling error.

Before this patch,`retrieve_generation_data_v2` could call
`system_keyspace::read_cdc_generation_opt` with a normal uuid if
the generation wasn't present in `system.cdc_generations_v2`.
This logic caused a marshaling error while handling the
`check_and_repair_cdc_streams` request in the
`cdc_test.TestCdc.test_check_and_repair_cdc_streams_liveness` dtest.

This patch fixes the code being added in 6.0, no need to backport it.

Fixes scylladb/scylladb#18473

Closes scylladb/scylladb#18483
2024-05-06 09:12:47 +02:00
Patryk Jędrzejczak
3a34bb18cd db: config: make consistent-topology-changes unused
We make the `consistent-topology-changes` experimental feature
unused and assumed to be true in 6.0. We remove code branches that
executed if `consistent-topology-changes` was disabled.
2024-04-25 14:33:21 +02:00
Kefu Chai
372a4d1b79 treewide: do not define FMT_DEPRECATED_OSTREAM
since we do not rely on FMT_DEPRECATED_OSTREAM to define the
fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`.

in this change,

* utils: drop the range formatters in to_string.hh and to_string.c, as
  we don't use them anymore. and the tests for them in
  test/boost/string_format_test.cc are removed accordingly.
* utils: use fmt to print chunk_vector and small_vector. as
  we are not able to print the elements using operator<< anymore
  after switching to {fmt} formatters.
* test/boost: specialize fmt::details::is_std_string_like<bytes>
  due to a bug in {fmt} v9, {fmt} fails to format a range whose
  element type is `basic_sstring<uint8_t>`, as it considers it
  as a string-like type, but `basic_sstring<uint8_t>`'s char type
  is signed char, not char. this issue does not exist in {fmt} v10,
  so, in this change, we add a workaround to explicitly specialize
  the type trait to assure that {fmt} format this type using its
  `fmt::formatter` specialization instead of trying to format it
  as a string. also, {fmt}'s generic ranges formatter calls the
  pair formatter's `set_brackets()` and `set_separator()` methods
  when printing the range, but operator<< based formatter does not
  provide these method, we have to include this change in the change
  switching to {fmt}, otherwise the change specializing
  `fmt::details::is_std_string_like<bytes>` won't compile.
* test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends
  for comparing values. but without the operator<< based formatters,
  Boost.Test would not be able to print them. after removing
  the homebrew formatters, we need to use the generic
  `boost_test_print_type()` helper to do this job. so we are
  including `test_utils.hh` in tests so that we can print
  the formattable types.
* treewide: add "#include "utils/to_string.hh" where
  `fmt::formatter<optional<>>` is used.
* configure.py: do not define FMT_DEPRECATED_OSTREAM
* cmake: do not define FMT_DEPRECATED_OSTREAM

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:57:36 +08:00
Benny Halevy
fceb1183d3 cdc: should_propose_first_generation: get my_host_id from caller
There is no need to map this node's inet_address to host_id.
The storage_service can easily just pass the local host_id.
While at it, get the other node's host_id directly
from their endpoint_state instead of looking it up
yet again in the gossiper, using the nodes' address.

Refs #12283

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-03-20 12:53:49 +02:00
Patryk Jędrzejczak
0470b721c2 cdc: generation: allow increasing generation_leeway through error injection
The increased `generation_leeway` is used in the next patch to
write a test. Since it's no longer a constant, we create a new
getter for it.
2024-02-12 10:14:00 +01:00
Piotr Dulikowski
d04b3338ce cdc/generation_service: in legacy mode, fall back to raft tables
When a node enters recovery after being in raft topology mode, topology
operations switch back to legacy mode. We want CDC to keep working when
that happens, so we need for the legacy code to be able to access
generations created back in raft mode - so that the node can still
properly serve writes to CDC log tables.

In order to make this possible, modify the legacy logic to also look for
a cdc generation in raft tables, if it is not found in legacy tables.
2024-02-08 19:12:28 +01:00
Piotr Dulikowski
77a8f5e3d6 cdc/generation_service: turn off gossip notifications in raft topo mode
In raft topology mode CDC information is propagated through group 0.
Prevent the generation service from reacting to gossiper notifications
after we made the switch to raft mode.
2024-02-08 19:12:28 +01:00
Kefu Chai
6c06751640 cdc: not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16725
2024-01-11 09:13:37 +02:00
Benny Halevy
ad8a9104d8 endpoint_state subscriptions: batch on_change notification
Rather than calling on_change for each particular
application_state, pass an endpoint_state::map_type
with all changed states, to be processed as a batch.

In particular, thise allows storage_service::on_change
to update_peer_info once for all changed states.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:34 +02:00
Petr Gusev
7b55ccbd8e token_metadata: drop the template
Replace token_metadata2 ->token_metadata,
make token_metadata back non-template.

No behavior changes, just compilation fixes.
2023-12-12 23:19:54 +04:00
Petr Gusev
799f747c8f shared_token_metadata: switch to the new token_metadata 2023-12-12 23:19:54 +04:00
Petr Gusev
7eb7863635 cdc: switch to token_metadata2
Change the token_metadata type to token_metadata2 in
the signatures of CDC-related methods in
storage_service and cdc/generation. Use
get_new_strong to get a pointer to the new host_id-based
token_metadata from the inet_address-based one,
living in the shared_token_metadata.

The starting point of the patch is in
storage_service::handle_global_request. We change the
tmptr type to token_metadata2 and propagate the change
down the call chains. This includes token-related methods
of the boot_strapper class.
2023-12-12 23:19:53 +04:00
Benny Halevy
25754f843b gossiper: add get_this_endpoint_state_ptr
Returns this node's endpoint_state_ptr.
With this entry point, the caller doesn't need to
get_broadcast_address.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-05 08:42:49 +02:00
Michael Huang
62a8a31be7 cdc: use chunked_vector for topology_description entries
Lists can grow very big. Let's use a chunked vector to prevent large contiguous
allocations.
Fixes: #15302.

Closes scylladb/scylladb#15428
2023-09-18 23:17:01 +03:00
Patryk Jędrzejczak
1c58c6336a system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3
We change the type of IDs in CDC_GENERATIONS_V3 to timeuuid to
give them a time-based order. We also change how we initialize
them so that the new CDC generation always has the highest ID.
This is the last step to enabling the efficient clearing of
obsolete CDC generation data.

Additionally, we change the types of current_cdc_generation_uuid,
new_cdc_generation_data_uuid and the second values of the elements
in unpublished_cdc_generations to timeuuid, so that they match id
in CDC_GENERATIONS_V3.
2023-09-12 11:43:34 +02:00
Patryk Jędrzejczak
fab066cffe cdc: generation: remove topology_description_generator
After moving the creation of uuid out of
make_new_generation_description, this function only calls the
topology_description_generator's constructor and its generate
method. We could remove this function, but we instead simplify
the code by removing the topology_description_generator class.
We can do this refactor because make_new_generation_description
is the only place using it. We inline its generate method into
make_new_generation_description and turn its private methods into
static functions.
2023-09-12 11:18:54 +02:00
Patryk Jędrzejczak
3bf4cac72e cdc: do not create uuid in make_new_generation_data
In the future commit, we change how we initialize uuid of the
new CDC generation in the Raft-based topology. It forces us to
move this initialization out of the make_new_generation_data
function shared between Raft-based and gossiper-based topologies.

We also rename make_new_generation_data to
make_new_generation_description since it only returns
cdc::topology_description now.
2023-09-12 11:18:38 +02:00
Patryk Jędrzejczak
2cd430ac80 system_kayspace: make CDC_GENERATIONS_V3 single-partition
We make CDC_GENERATIONS_V3 single-partition by adding the key
column and changing the clustering key from range_end to
(id, range_end). This is the first step to enabling the efficient
clearing of obsolete CDC generation data, which we need to prevent
Raft-topology snapshots from endlessly growing as we introduce new
generations over time. The next step is to change the type of the id
column to timeuuid. We do it in the following commits.

After making CDC_GENERATIONS_V3 single-partition, there is no easy
way of preserving the num_ranges column. As it is used only for
sanity checking, we remove it to simplify the implementation.
2023-09-12 09:51:45 +02:00
Patryk Jędrzejczak
29f54836d0 cdc: generation: introduce get_common_cdc_generation_mutations
In the following commit, we implement the
get_cdc_generation_mutations_v3 function very similar to
get_cdc_generation_mutations_v2. The only differences in creating
mutations between CDC_GENERATIONS_V2 and CDC_GENERATIONS_V3 are:
- a need to set the num_ranges cell for CDC_GENERATIONS_V2,
- different partition keys,
- different clustering keys.

To avoid code duplication, we introduce
get_common_cdc_generation_mutations, which does most of the work
shared by both functions.
2023-09-12 09:37:21 +02:00
Patryk Jędrzejczak
ed1c1369d9 cdc: generation: rename get_cdc_generation_mutations
In the following commits, we modify the CDC_GENERATIONS_V3 schema
to enable efficient clearing of obsolete CDC generation data.
These modifications make the current get_cdc_generation_mutations
work only for the CDC_GENERATIONS_V2 schema, and we need a new
function for CDC_GENERATIONS_V3, so we add the "_v2" suffix.
2023-09-11 12:30:21 +02:00
Benny Halevy
c16ec870da gms: pass endpoint_state_ptr to endpoint_state change subscribers
Now that the endpoint_state isn't change in place
we do not need to copy it to each subscriber.
We can rather just pass the lw_shared_ptr holding
a snapshot of it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-31 09:35:15 +03:00