Introduce a new table `CDC_GENERATIONS_V3` (`system.cdc_generations_v3`). The table schema is a copy of the `CDC_GENERATIONS_V2` schema. The difference is that V2 lives in `system_distributed_keyspace` and writes to it are distributed using regular `storage_proxy` replication mechanisms based on the token ring, while the V3 table lives in `system_keyspace` and any mutations written to it go through group 0.

Extend the `TOPOLOGY` schema with new columns:
- `new_cdc_generation_data_uuid` is stored as part of a bootstrapping node's `ring_slice`. It holds the UUID of a newly introduced CDC generation, which is used as the partition key into the `CDC_GENERATIONS_V3` table to access that generation's data. It is a regular column, meaning that every row (corresponding to a node) has its own.
- `current_cdc_generation_uuid` and `current_cdc_generation_timestamp` together form the ID of the newest CDC generation in the cluster (the UUID is the data key for `CDC_GENERATIONS_V3`; the timestamp is when the CDC generation starts operating). These are static columns, since there is a single newest CDC generation.

When the topology coordinator handles a request for a node to join, it calculates a new CDC generation using the bootstrapping node's tokens, translates it to mutation form, and inserts this mutation into the `CDC_GENERATIONS_V3` table through group 0 at the same time as it assigns tokens to the node in Raft topology. The partition key for this data is stored in the bootstrapping node's `ring_slice`.

After inserting the new CDC generation data, we need to pick a timestamp for this generation and commit it, telling all nodes in the cluster to start using the generation for CDC log writes once their clocks cross that timestamp. We introduce a separate step in the bootstrap saga, before `write_both_read_old`, called `commit_cdc_generation`.
In this step, the coordinator takes the `new_cdc_generation_data_uuid` stored in the bootstrapping node's `ring_slice` (which serves as the key into the table where the CDC generation data is stored) and combines it with a timestamp that it generates a bit into the future (as in the old gossiper-based code, we use 2 * ring_delay, by default 1 minute). This gives us a CDC generation ID, which we commit into the topology state as the `current_cdc_generation_id` while switching the saga to the next step, `write_both_read_old`.

Once a new CDC generation is committed to the cluster by the topology coordinator, we also need to publish it to the user-facing description tables so that CDC applications know which streams to read from. This uses regular distributed table writes underneath (the tables live in the `system_distributed` keyspace), so it requires `token_metadata` to be nonempty. We need a hack for the case of bootstrapping the first node in the cluster: we turn the tokens into normal tokens earlier in the procedure in `token_metadata`. This is fine in the single-node case, since no streaming is happening.

When a node notices in `storage_service::topology_state_load` that a new CDC generation was introduced, it updates the internal data structures that are used when coordinating writes to CDC log tables.

We include the current CDC generation data in topology snapshot transfers.

Some fixes and refactors included.
Closes #13385

* github.com:scylladb/scylladb:
  docs: cdc: describe generation changes using group 0 topology coordinator
  cdc: generation_service: add a FIXME
  cdc: generation_service: add legacy_ prefix for gossiper-based functions
  storage_service: include current CDC generation data in topology snapshots
  db: system_keyspace: introduce `query_mutations` with range/slice
  storage_service: hold group 0 apply mutex when reading topology snapshot
  service: raft_group0_client: introduce `hold_read_apply_mutex`
  storage_service: use CDC generations introduced by Raft topology
  raft topology: publish new CDC generation to the user description tables
  raft topology: commit a new CDC generation on node bootstrap
  raft topology: create new CDC generation data during node bootstrap
  service: topology_state_machine: make topology::find const
  db: system_keyspace: small refactor of `load_topology_state`
  cdc: generation: extract pure parts of `make_new_generation` outside
  db: system_keyspace: add storage for CDC generations managed by group 0
  service: topology_state_machine: better error checking for state name (de)serialization
  service: raft: plumbing `cdc::generation_service&`
  cdc: generation: `get_cdc_generation_mutations`: take timestamp as parameter
  cdc: generation: make `topology_description_generator::get_sharding_info` a parameter
  sys_dist_ks: make `get_cdc_generation_mutations` public
  sys_dist_ks: move find_schema outside `get_cdc_generation_mutations`
  sys_dist_ks: move mutation size threshold calculation outside `get_cdc_generation_mutations`
  service/raft: group0_state_machine: signal topology state machine in `load_snapshot`