mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 03:30:49 +00:00

Files

Tomasz Grabiec bd0b299322 Merge 'Manage CDC generations when bootstrapping nodes using Raft Group 0 topology coordinator' from Kamil Braun

Introduce a new table `CDC_GENERATIONS_V3` (`system.cdc_generations_v3`).
The table schema is a copy-paste of the `CDC_GENERATIONS_V2` schema. The
difference is that V2 lives in `system_distributed_keyspace` and writes to it
are distributed using regular `storage_proxy` replication mechanisms based on
the token ring.  The V3 table lives in `system_keyspace` and any mutations
written to it will go through group 0.

Extend the `TOPOLOGY` schema with new columns:
- `new_cdc_generation_data_uuid` is stored as part of a bootstrapping
  node's `ring_slice`. It holds the UUID of a newly introduced CDC
  generation, which is used as the partition key for the `CDC_GENERATIONS_V3`
  table to access this new generation's data. It's a regular column,
  meaning that every row (corresponding to a node) has its own value.
- `current_cdc_generation_uuid` and `current_cdc_generation_timestamp`
  together form the ID of the newest CDC generation in the cluster
  (the UUID is the data key for `CDC_GENERATIONS_V3`, the timestamp is
  when the CDC generation starts operating). Those are static columns
  since there's a single newest CDC generation.
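
The split between regular and static columns can be pictured with a small Python model. This is only an illustrative sketch, not the actual schema code: the class names `NodeRow` and `Topology`, the node names, and the millisecond unit of the timestamp are assumptions made for the example; only the three column names come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from uuid import UUID, uuid4


@dataclass
class NodeRow:
    """One TOPOLOGY row (hypothetical model). Regular columns are per node."""
    new_cdc_generation_data_uuid: Optional[UUID] = None


@dataclass
class Topology:
    """The TOPOLOGY partition (hypothetical model).

    Static columns exist once per partition, so they are modelled as
    attributes of the partition rather than of individual rows.
    """
    current_cdc_generation_uuid: Optional[UUID] = None
    # Unit is an assumption for this sketch (milliseconds since epoch).
    current_cdc_generation_timestamp: Optional[int] = None
    nodes: Dict[str, NodeRow] = field(default_factory=dict)


topology = Topology()
# An already joined node has no pending generation data.
topology.nodes["node-a"] = NodeRow()
# A bootstrapping node carries the key of its newly created generation data.
topology.nodes["node-b"] = NodeRow(new_cdc_generation_data_uuid=uuid4())
```

Each node row can carry a different `new_cdc_generation_data_uuid`, while the current generation ID is shared by all of them.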

When the topology coordinator handles a request for a node to join, it
calculates a new CDC generation using the bootstrapping node's tokens,
translates it to mutation format, and inserts this mutation into the
`CDC_GENERATIONS_V3` table through group 0 at the same time as it assigns
tokens to the node in Raft topology. The partition key for this data is
stored in the bootstrapping node's `ring_slice`.

After inserting new CDC generation data, we need to pick a timestamp for this
generation and commit it, telling all nodes in the cluster to start using the
generation for CDC log writes once their clocks cross that timestamp.

We introduce a separate step to the bootstrap saga, before
`write_both_read_old`, called `commit_cdc_generation`. In this step, the
coordinator takes the `new_cdc_generation_data_uuid` stored in a bootstrapping
node's `ring_slice` - which serves as the key to the table where the CDC
generation data is stored - and combines it with a timestamp which it generates
a bit into the future (as in the old gossiper-based code, we use 2 * ring_delay,
which is 1 minute by default). This gives us a CDC generation ID which we commit into the
topology state as the `current_cdc_generation_id` while switching the saga to
the next step, `write_both_read_old`.
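
The arithmetic of the `commit_cdc_generation` step can be sketched in a few lines of Python. This is a simplified illustration, not the coordinator's code: the `CdcGenerationId` type, the function signature, and the 30-second `RING_DELAY` constant are assumptions for the example (the constant is chosen so that 2 * ring_delay matches the 1 minute mentioned above).

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple
from uuid import UUID, uuid4

# Assumed value, chosen so that 2 * ring_delay equals the 1 minute quoted above.
RING_DELAY = timedelta(seconds=30)


class CdcGenerationId(NamedTuple):
    """Hypothetical representation of a CDC generation ID."""
    timestamp: datetime  # when the generation starts operating
    data_uuid: UUID      # partition key of the generation data


def commit_cdc_generation(data_uuid: UUID, now: datetime,
                          ring_delay: timedelta = RING_DELAY) -> CdcGenerationId:
    """Combine the data UUID with a timestamp 2 * ring_delay in the future."""
    return CdcGenerationId(timestamp=now + 2 * ring_delay, data_uuid=data_uuid)


now = datetime(2023, 4, 21, 16, 0, 0, tzinfo=timezone.utc)
gen_id = commit_cdc_generation(uuid4(), now)
```

Placing the timestamp in the future gives every node time to learn about the generation before its clock crosses the start point.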

Once a new CDC generation is committed to the cluster by the topology
coordinator, we also need to publish it to the user-facing description tables so
CDC applications know which streams to read from.

This uses regular distributed table writes underneath (tables living in the
`system_distributed` keyspace) so it requires `token_metadata` to be nonempty.
We need a hack for the case of bootstrapping the first node in the cluster -
turning the tokens into normal tokens earlier in the procedure in
`token_metadata`, but this is fine for the single-node case since no streaming
is happening.

When a node notices that a new CDC generation was introduced in
`storage_service::topology_state_load`, it updates its internal data structures
that are used when coordinating writes to CDC log tables.
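
One way to picture "start using the generation once the clock crosses its timestamp" is a lookup keyed by generation start time. The sketch below is a hypothetical model and not the data structure the node actually maintains: the `GenerationTimeline` class, its methods, and the integer timestamps are all assumptions made for illustration.

```python
from bisect import bisect_right, insort
from typing import Dict, List, Optional


class GenerationTimeline:
    """Hypothetical index of CDC generations ordered by start timestamp."""

    def __init__(self) -> None:
        self._starts: List[int] = []     # sorted start timestamps
        self._ids: Dict[int, str] = {}   # start timestamp -> generation key

    def add(self, start_ts: int, generation_key: str) -> None:
        """Register a newly committed generation."""
        if start_ts not in self._ids:
            insort(self._starts, start_ts)
        self._ids[start_ts] = generation_key

    def generation_for(self, write_ts: int) -> Optional[str]:
        """Return the generation whose start is the latest one <= write_ts."""
        idx = bisect_right(self._starts, write_ts)
        if idx == 0:
            return None  # no generation is operating yet at this time
        return self._ids[self._starts[idx - 1]]


timeline = GenerationTimeline()
timeline.add(100, "gen-1")
timeline.add(200, "gen-2")
```

With this model, a write timestamped before 200 still maps to `gen-1`, and a write at or after 200 maps to `gen-2`.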

We include the current CDC generation data in topology snapshot transfers.

Some fixes and refactors included.

Closes #13385

* github.com:scylladb/scylladb:
  docs: cdc: describe generation changes using group 0 topology coordinator
  cdc: generation_service: add a FIXME
  cdc: generation_service: add legacy_ prefix for gossiper-based functions
  storage_service: include current CDC generation data in topology snapshots
  db: system_keyspace: introduce `query_mutations` with range/slice
  storage_service: hold group 0 apply mutex when reading topology snapshot
  service: raft_group0_client: introduce `hold_read_apply_mutex`
  storage_service: use CDC generations introduced by Raft topology
  raft topology: publish new CDC generation to the user description tables
  raft topology: commit a new CDC generation on node bootstrap
  raft topology: create new CDC generation data during node bootstrap
  service: topology_state_machine: make topology::find const
  db: system_keyspace: small refactor of `load_topology_state`
  cdc: generation: extract pure parts of `make_new_generation` outside
  db: system_keyspace: add storage for CDC generations managed by group 0
  service: topology_state_machine: better error checking for state name (de)serialization
  service: raft: plumbing `cdc::generation_service&`
  cdc: generation: `get_cdc_generation_mutations`: take timestamp as parameter
  cdc: generation: make `topology_description_generator::get_sharding_info` a parameter
  sys_dist_ks: make `get_cdc_generation_mutations` public
  sys_dist_ks: move find_schema outside `get_cdc_generation_mutations`
  sys_dist_ks: move mutation size threshold calculation outside `get_cdc_generation_mutations`
  service/raft: group0_state_machine: signal topology state machine in `load_snapshot`

2023-04-21 18:11:27 +02:00

_static

docs: Update custom styles

2023-03-14 12:06:20 +00:00

_utils

doc: remove Enterprise upgrade guides from OSS doc

2023-03-24 10:57:03 +02:00

alternator

docs/alternator: recommend to disable auto_snapshot

2023-03-08 10:50:59 +02:00

architecture

doc: update Raft doc for versions 5.2 and 2023.1

2023-04-04 15:15:56 +02:00

cql

doc: document tombstone_gc as not experimental

2023-04-21 14:43:25 +02:00

dev

docs: cdc: describe generation changes using group 0 topology coordinator

2023-04-20 16:36:41 +02:00

getting-started

doc: update supported os for 2022.1

2023-04-05 06:43:58 +03:00

kb

doc: add a Knowledge Base article about consitency, v2 of https://github.com/scylladb/scylladb/pull/12929

2023-03-13 17:48:25 +02:00

operating-scylla

Remove visible :orphan:

2023-04-20 08:24:48 +03:00

reference

doc: add the Enterprise vs. OSS Matrix

2023-03-20 10:18:10 +02:00

rst_include

doc: remove the enterprise-only-note.rst file, which was replaced by the ScyllaDB Enterprise label and is not used anymore

2022-11-14 15:20:51 +01:00

troubleshooting

Update manager-monitoring-integration.rst

2023-02-20 12:46:14 +01:00

upgrade

doc: update the metrics between 5.2 and 2023.1

2023-04-14 08:23:53 +03:00

using-scylla

doc: remove in-memory tables from OSS docs

2023-04-17 16:00:09 +03:00

conf.py

docs: Separate conf.py

2023-03-27 13:42:58 +03:00

contribute.rst

doc: replace Scylla with ScyllaDB on the menu tree and major links; related: https://github.com/scylladb/scylla-docs/issues/3962

2023-01-09 08:39:50 +02:00

faq.rst

doc: remove in-memory tables from OSS docs

2023-04-17 16:00:09 +03:00

glossary.rst

doc: Add Mumur term to the glossery

2023-03-21 22:45:47 +02:00

index.rst

Merge 'docs: Add card logos' from David Garcia

2023-03-23 08:53:58 +02:00

Makefile

docs: update theme 1.4

2023-03-29 06:56:27 +03:00

pyproject.toml

docs: update theme 1.4

2023-03-29 06:56:27 +03:00

README.md

docs/README.md: expand prerequisites list

2022-08-31 17:00:59 +03:00

robots.txt

doc: add the CNAME and robots files

2022-07-11 12:16:53 +02:00

README.md

ScyllaDB Documentation

This repository contains the source files for ScyllaDB Open Source documentation.

The dev folder contains developer-oriented documentation related to the ScyllaDB code base. It is not published and is only available via GitHub.
All other folders and files contain user-oriented documentation related to ScyllaDB Open Source and are sources for docs.scylladb.com.

To report a documentation bug or suggest an improvement, open an issue in GitHub issues for this project.

To contribute to the documentation, open a GitHub pull request.

Key Guidelines for Contributors

Follow the ScyllaDB Style Guide.
The user documentation is written in reStructuredText (RST) - a plaintext markup language similar to Markdown. If you're not familiar with RST, see ScyllaDB RST Examples.
The developer documentation is written in Markdown. See Basic Markdown Syntax for reference.

Creating Knowledge Base Articles

The kb/ directory holds source files for knowledge base articles in the Knowledge Base section of the ScyllaDB documentation.

The kb/kb_common subdirectory contains a template for knowledge base articles to help you create new articles.

To create a new knowledge base article (KB):

Copy the kb-article-template.rst file from /kb/kb_common to /kb and rename it with a unique name.
Open the new file and fill in the required information.
Remove what is not needed.
Run make preview to build the docs and preview them locally.
Send a PR with "KB" in its title.
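
The copy-and-rename step above could also be scripted. The following is a minimal sketch, assuming it is given the path of the docs directory; the helper name `create_kb_article` and its refusal to overwrite an existing file are choices made for this example, not part of the documented workflow.

```python
import shutil
from pathlib import Path


def create_kb_article(docs_root: Path, name: str) -> Path:
    """Copy the KB template into kb/ under a new name (hypothetical helper)."""
    template = docs_root / "kb" / "kb_common" / "kb-article-template.rst"
    target = docs_root / "kb" / f"{name}.rst"
    if target.exists():
        # The article name must be unique, so never overwrite an existing file.
        raise FileExistsError(f"{target} already exists")
    shutil.copyfile(template, target)
    return target
```

For example, `create_kb_article(Path("."), "my-new-article")` run from the docs directory would create `kb/my-new-article.rst`, which you then edit before running make preview.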

Building User Documentation

Prerequisites

Python 3. Check your version with $ python --version.
poetry 1.12 or later
make

Mac OS X

You must have a working Homebrew in order to install the needed tools.

You also need the standard utility make.

Check if you have these two items with the following commands:

brew help
make -h

Linux Distributions

Building the user docs should work out of the box on most Linux distributions.

Windows

Use "Bash on Ubuntu on Windows" for the same tools and capabilities as on Linux distributions.

Building the Docs

Run make preview to build the documentation.
Preview the built documentation locally at http://127.0.0.1:5500/.

Cleanup

You can clean up all the build products and automatically installed Python dependencies with:

make pristine

Information for Contributors

If you are interested in contributing to Scylla docs, please read the Scylla open source page at http://www.scylladb.com/opensource/ and complete a Scylla contributor agreement if needed. We can only accept documentation pull requests if we have a contributor agreement on file for you.

Third-party Documentation

Do any copying as a separate commit. Always commit an unmodified version first and then do any editing in a separate commit.
We already have a copy of the Apache license in our tree, so you do not need to commit a copy of the license.
Include the copyright header from the source file in the edited version. If you are copying an Apache Cassandra document with no copyright header, use:

This document includes material from Apache Cassandra.
Apache Cassandra is Copyright 2009-2014 The Apache Software Foundation.