scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 21:17:01 +00:00

Author	SHA1	Message	Date
Kamil Braun	3ab244e6d9	system_keyspace: make `get/set_scylla_local_param` public We'll use it outside `system_keyspace` code in later commit.	2023-09-15 13:04:04 +02:00
Kamil Braun	72cd457d53	feature_service: add `GROUP0_SCHEMA_VERSIONING` feature This feature, when enabled, will modify how schema versions are calculated and stored. - In group 0 mode, schema versions are persisted by the group 0 command that performs the schema change, then reused by each node instead of being calculated as a digest (hash) by each node independently. - In RECOVERY mode or before Raft upgrade procedure finishes, when we perform a schema change, we revert to the old digest-based way, taking into account the possibility of having performed group0-mode schema changes (that used persistent versions). As we will see in future commits, this will be done by storing additional flags and tombstones in system tables. By "schema versions" we mean both the UUIDs returned from `schema::version()` and the "global" schema version (the one we gossip as `application_state::SCHEMA`). For now, in this commit, the feature is always disabled. Once all necessary code is set up in the following commits, we will enable it together with Raft.	2023-09-15 13:04:04 +02:00
Kamil Braun	dc4e20d835	schema_tables: refactor `scylla_tables(schema_features)` The `scylla_tables` function gives a different schema definition for the `system_schema.scylla_tables` table, depending on whether certain schema features are enabled or not. The way it was implemented, we had to write `θ(2^n)` amount of code and comments to handle `n` features. Refactor it so that the amount of code we have to write to handle `n` features is `θ(n)`.	2023-09-15 13:04:04 +02:00
Kamil Braun	4376854473	schema_tables: remove default value for `reload` in `merge_schema` To avoid bugs like the one fixed in the previous commit.	2023-09-15 13:04:04 +02:00
Kamil Braun	48164e1d09	schema_tables: pass `reload` flag when calling `merge_schema` cross-shard In `0c86abab4d` `merge_schema` obtained a new flag, `reload`. Unfortunately, the flag was assigned a default value, which I think is almost always a bad idea, and indeed it was in this case. When `merge_schema` is called on a shard other than 0, it recursively calls itself on shard 0. That recursive call forgot to pass the `reload` flag. Fix this.	2023-09-15 13:04:04 +02:00
Kamil Braun	9017b998ca	system_keyspace: fix outdated comment	2023-09-15 13:04:04 +02:00
Patryk Jędrzejczak	e375e769b9	raft topology: set CDC generation clean-up candidate We want to use the clean-up candidates to remove the obsolete CDC generation data, but first, we need to set suitable generations as a candidate when there is no candidate. Since CDC generations must be published before we remove them, a generation that is being published is a good candidate.	2023-09-15 09:23:59 +02:00
Dawid Medrek	fbbb9f879a	db/hints: Remove unused aliases from manager.hh	2023-09-15 04:17:08 +02:00
Dawid Medrek	d46437a87b	db/hints: Rename end_point_hints_manager This commit renames `end_point_hints_manager` to `hint_endpoint_manager` to be consistent with other names used in the module (they all start with `hint_`).	2023-09-15 03:46:15 +02:00
Dawid Medrek	6d1eee448b	db/hints: Rename sender to hint_sender We rename the structure to highlight what exactly its purpose is.	2023-09-15 03:46:15 +02:00
Dawid Medrek	4ad0f8907c	db/hints: Move the rebalancing logic to hint_storage This commit continues modularizing manager.hh.	2023-09-15 03:46:15 +02:00
Dawid Medrek	999484466d	db/hints: Move the implementation of sender This commit continues modularizing manager.hh. After moving the declaration of sender to a dedicated header file, these changes move its implementation to a separate source file.	2023-09-15 03:46:15 +02:00
Dawid Medrek	17aabf6b9a	db/hints: Move the declaration of sender to hint_sender.hh This commit is yet another step in modularizing manager.hh. We move the declaration of sender to a dedicated file. Its implementation will follow in a future commit.	2023-09-15 03:46:15 +02:00
Dawid Medrek	1a7262ed6e	db/hints: Move sender::replay_allowed() to the source file The premise of these changes is the fact that we cannot have a cycle of #includes. Because the declaration of `sender` is going to be moved to a separate header file in a future commit, and because that header file is going to be included in the file where `end_point_hints_manager` is declared, we will need to rely on `end_point_hints_manager` being an incomplete type there. A consequence of that is that we cannot access any of `end_point_hints_manager`'s methods. This commit prepares the ground for it by moving the definition of the function to the source file where `end_point_hints_manager` will be a complete type.	2023-09-15 03:46:15 +02:00
Dawid Medrek	ad2a36bd45	db/hints: Put end_point_hints_manager in internal namespace	2023-09-15 03:46:15 +02:00
Dawid Medrek	507054012d	db/hints: Move the implementation of end_point_hints_manager This commit continues moving end_point_hints_manager to its dedicated files. After moving the declaration of the class, these changes move the implementation.	2023-09-15 03:46:15 +02:00
Dawid Medrek	f72c423984	db/hints: Move the declaration of end_point_hints_manager This commit is yet another step in modularizing manager.hh. We move the declaration of the class to a dedicated header file. The implementation will follow in a future commit.	2023-09-15 03:46:15 +02:00
Dawid Medrek	854cc0c939	db/hints: Move definitions of functions using shard hint manager We move definitions of inline methods of end_point_hints_manager and sender accessing shard hint manager to the source file, effectively un-inlining them. We need to do that to prepare for moving said structures out of manager.hh. This commit is yet another step in modularizing manager.hh.	2023-09-15 03:45:57 +02:00
Dawid Medrek	db08a85f5d	db/hints: Introduce hint_storage.hh This commit moves types used by shard hint manager and related to storing hints on disk to another file. It is yet another step in modularizing manager.hh.	2023-09-15 02:28:10 +02:00
Dawid Medrek	4814b3b19a	db/hints: Extract the logger from manager.cc This commit extracts the logger used in manager.cc to prepare the ground for modularization of manager.hh into separate smaller files. We want to preserve the logging behavior (at least for the time being), which means new files should use the same logger. These changes serve that purpose.	2023-09-15 02:24:20 +02:00
Dawid Medrek	efd6d1f57a	db/hints: Extract common types from manager.hh Currently, data structures used in manager.hh use their own aliases for gms::inet_address. It is clear they all should use the same type and having different names for it only reduces readability of the code. This commit introduces a common alias -- endpoint_id -- and gets rid of the other ones. This commit is also the first step in modularizing manager.hh by extracting common types to another file.	2023-09-15 02:23:30 +02:00
Patryk Jędrzejczak	c0fd42ead4	system_keyspace: introduce decode_cdc_generation_id The decode_cdc_generations_ids function allows us to decode a vector of CDC generation IDs. After adding cleanup_candidate to CDC_GENERATIONS_V3, we need a similar function that decodes a single ID.	2023-09-14 12:09:14 +02:00
Patryk Jędrzejczak	6db325fb69	system_keyspace: add cleanup_candidate to CDC_GENERATIONS_V3 In the following commits, we implement a garbage collection for CDC_GENERATIONS_V3. The first step is introducing the clean-up candidate. It will be continually updated by the CDC generation publisher and used to remove obsolete data.	2023-09-14 12:09:10 +02:00
Petr Gusev	082cd3bc8e	system_keyspace: switch CDC_LOCAL to schema commitlog	2023-09-13 23:17:20 +04:00
Petr Gusev	a683cebb02	system_keyspace: scylla_local: use schema commitlog We remove flush from set_scylla_local_param_as since it's now redundant. We add it to save_local_enabled_features as features need to be available before schema commitlog replay. We skip the flush if save_local_enabled_features is called from topology_state_load when the features are migrated to system.topology and we don't need strict durability.	2023-09-13 23:17:20 +04:00
Petr Gusev	beb29f094b	system_keyspace: drop load phases We want to switch system.scylla_local table to the schema commitlog, but load phases get in the way: schema commitlog is initialized after phase1, so a table which is using it should be moved to phase2, but system.scylla_local contains features, and we need them before schema commitlog initialization for SCHEMA_COMMITLOG feature. In this commit we are taking a different approach to loading system tables. First, we load them all in one pass in 'readonly' mode. In this mode, the table cannot be written to and has not yet been assigned a commit log. To achieve this we've added _readonly bool field to the table class, it's initialized to true in table's constructor. In addition, we changed the table constructor to always assign nullptr to commitlog, and we trigger an internal error if table.commitlog() property is accessed while the table is in readonly mode. Then, after triggering on_system_tables_loaded notifications on feature_service and sstable_format_selector, we call system_keyspace::mark_writable and eventually table::mark_ready_for_writes which selects the proper commitlog and marks the table as writable. In sstable_compaction_test we drop several mark_ready_for_writes calls since they are redundant, the table has already been made writable in env.make_table_for_tests call. The table::commitlog function either returns the current commitlog or causes an error if the table is readonly. This didn't work for virtual tables, since they never called mark_ready_for_writes. In this commit we add this call to initialize_virtual_tables.	2023-09-13 23:17:20 +04:00
Petr Gusev	47ffc66c7f	database.hh: add_column_family: add readonly parameter Previously, creating a table or view in schema_tables.cc/merge_tables_and_views was a two-step process: first adding a column family (add_column_family function) and then marking it as ready for writes (mark_table_as_writable). There is a yield between these stages, which means someone could see a table or view for which the mark_table_as_writable method had not yet been called, and start writing to it. This problem was demonstrated by materialised view dtests. A view is created on all nodes. On some nodes it will be created earlier than on others and the view rebuild process will start writing data to that view on other nodes, where mark_table_as_writable has not yet been called. In this patch we solve this problem by adding a readonly parameter to the add_column_family method. When loading tables from disk, this flag is set to true and the mark_table_as_writable is called only after all sstables have been loaded. When creating a new table, this flag is set to false, mark_table_as_writable is called from inside add_column_family and the new table becomes visible already as writable.	2023-09-13 23:17:20 +04:00
Petr Gusev	7e52014633	schema_tables: merge_tables_and_views: delay events until tables/views are created on all shards db.get_notifier().create_view triggers view rebuild. This process writes to the table on all shards and thus can access a partially created table, e.g. one where mark_table_ready_for_writes was not yet called.	2023-09-13 23:17:20 +04:00
Petr Gusev	0e5f9ae9a4	system_keyspace: switch system.peers to schema commitlog Also, we remove flushes on writes as durability is now guaranteed by the commitlog.	2023-09-13 23:17:20 +04:00
Petr Gusev	7881ce1e09	system_keyspace: switch system.local to schema commitlog Schema commitlog lives only on the zero shard, so we need to turn on use_null_sharder option. Also, we remove flushes on writes as durability is now guaranteed by the commitlog.	2023-09-13 23:17:20 +04:00
Petr Gusev	a0653590b5	sstables_format_selector: extract listener In the following commits we want to move schema commitlog replay earlier, but the current sstable format should be selected before the replay. The current sstable format is stored in system.scylla_local, so we can't read it until system tables are loaded. This problem is similar to the enabled_features. To solve this we split sstables_format_selector in two parts. The lower level part, sstables_format_selector, knows only about database and system_keyspace. It will be moved before system_keyspace initialization, and the on_system_tables_loaded method will be called on it when the system_keyspace has loaded its tables. The higher level part, sstables_format_listener, is responsible for subscribing to feature_services and gossiper and is started later, at the same place as sstables_format_selector before this commit.	2023-09-13 23:04:50 +04:00
Petr Gusev	7104fc8a7e	sstables_format_selector: wrap when_enabled with seastar::async The listener may fire immediately, we must be in a thread context for this to work. In the next commits we are going to move enable_features_on_startup above sstables_format_selector::start in scylla_main, so we need to fix this beforehand.	2023-09-13 23:00:16 +04:00
Petr Gusev	2a0b228d17	main.cc: inline and split system_keyspace.setup Our goal is to switch system.local table to schema commitlog and stop doing flushes when we write to it. This means it would be incorrect to read from this table until schema commitlog is replayed. On the other hand, we need truncation records to be loaded before we start replaying schema commitlog, since commitlog_replayer relies on them. In this commit we inline the system_keyspace::setup function and split its content into two parts. In the first part, before schema commitlog replay, we load truncation records. It's safe to load them before schema commitlog replay since we intend to keep the flushes on writes to the system.truncated table. In the second part, after schema commitlog replay, we do the rest of the job - build_bootstrap_info and db::schema_tables::save_system_schema. We decided to inline this function since there is very low cohesion between the actions it's performing. It's just simpler to reason about them individually.	2023-09-13 23:00:15 +04:00
Petr Gusev	f0bc9f2d93	system_keyspace: refactor save_system_schema function This is a refactoring commit without observable changes in behaviour. Previously, there were two related functions in db::schema_tables: save_system_keyspace_schema(qp) and save_system_schema(qp, ks). The first called the second passing "system_schema" as the second argument. Outside of schema_tables module we don't need two functions, we just need a way to say 'persist system schema objects in the appropriate tables/keyspaces'. In this commit we change the function save_system_schema to have this meaning. Internally it calls save_system_schema_to_keyspace twice with "system_schema" and "system", since that's what we need in the single call site of this function in system_keyspace::setup. In subsequent commits we are going to move this call out of the system_keyspace::setup.	2023-09-13 23:00:15 +04:00
Petr Gusev	e395086557	system_keyspace: move initialize_virtual_tables into virtual_tables.hh This is a readability refactoring commit without observable changes in behaviour. initialize_virtual_tables logically belongs to virtual_tables module, and it allows to make other functions in virtual_tables.cc (register_virtual_tables, install_virtual_readers) local to the module, which simplifies the matters a bit. all_virtual_tables() is not needed anymore, all the references to registered virtual tables are now local to virtual_tables module and can just use virtual_tables variable directly.	2023-09-13 23:00:15 +04:00
Petr Gusev	c4787a160b	system_keyspace: remove unused parameter	2023-09-13 23:00:15 +04:00
Petr Gusev	b90011294d	config.cc: drop db::config::host_id In this refactoring commit we remove the db::config::host_id field, as it's hacky and duplicates token_metadata::get_my_id. Some tests want specific host_id, we add it to cql_test_config and use in cql_test_env. We can't pass host_id to sstables_manager by value since it's initialized in database constructor and host_id is not loaded yet. We also prefer not to make a dependency on shared_token_metadata since in this case we would have to create artificial shared_token_metadata in many tools and tests where sstables_manager is used. So we pass a function that returns host_id to sstables_manager constructor.	2023-09-13 23:00:15 +04:00
Petr Gusev	a03fbc3781	system_keyspace: set null sharder when configuring schema commitlog The schema commitlog lives only on the null shard, it makes no sense to set use_schema_commitlog without use_null_sharder. We also extract the function enable_schema_commitlog which sets all the needed properties.	2023-09-13 23:00:15 +04:00
Petr Gusev	d32191a353	system_keyspace: rename static variables 'raft_tables' in set_use_schema_commitlog initialization was misleading. Other variables have also been renamed for consistency.	2023-09-13 23:00:15 +04:00
Petr Gusev	cda49b06dc	system_keyspace: remove redundant wait_for_sync_to_commitlog Tables with schema commitlog already sync every write, wait_for_sync_to_commitlog makes sense only for the regular commitlog. Technically there is nothing wrong with allowing both options, but it's confusing. Being strict and accurate about the meaning of the options reduces the chance of errors due to misunderstanding. This is preparation for the next commits, where we will start generating an error if the combination of options doesn't make sense.	2023-09-13 23:00:15 +04:00
Avi Kivity	0a5d9532f9	Merge 'Sanitize batchlog manager start/stop' from Pavel Emelyanov This code is now spread over main and differs in cql_test_env. The PR unifies both places and makes the manager start-stop look standard refs: #2795 Closes #15375 * github.com:scylladb/scylladb: batchlog_manager: Remove start() method batchlog_manager: Start replay loop in constructor main, cql_test_env: Start-stop batchlog manager in one "block" batchlog_manager: Move shard-0 check into batchlog_replay_loop() batchlog_manager: Fix drain() reentrability	2023-09-13 18:20:56 +03:00
Kamil Braun	a184b07cbb	Merge 'raft topology: make CDC_GENERATIONS_V3 single-partition, timeuuid-sorted' from Patryk Jędrzejczak We make the `CDC_GENERATIONS_V3` table single-partition and change the clustering key from `range_end` to `(id, range_end)`. We also change the type of `id` to `timeuuid` and ensure that a new generation always has the highest `id`. These changes allow efficient clearing of obsolete CDC generation data, which we need to prevent Raft-topology snapshots from endlessly growing as we introduce new generations over time. All this code is protected by an experimental feature flag. It includes the definition of `CDC_GENERATIONS_V3`. The table is not created unless the feature flag is enabled. Fixes #15163 Closes #15319 * github.com:scylladb/scylladb: system_keyspace: rename cdc_generation_id_v2 system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3 cdc: generation: remove topology_description_generator cdc: do not create uuid in make_new_generation_data system_kayspace: make CDC_GENERATIONS_V3 single-partition cdc: generation: introduce get_common_cdc_generation_mutations cdc: generation: rename get_cdc_generation_mutations	2023-09-13 12:54:49 +02:00
Pavel Emelyanov	d48aff5789	batchlog_manager: Remove start() method It's now a no-op, can be dropped. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-12 16:37:52 +03:00
Pavel Emelyanov	3966a50ed4	batchlog_manager: Start replay loop in constructor ... and sanitize the future used on stop. The loop in question is now started in .start(), but all callers now construct the manager late enough, so the loop spawning can be moved. This also calls for renaming the future member of the class and allows to make it regular, not shared, future. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-12 16:35:53 +03:00
Pavel Emelyanov	9f45778467	batchlog_manager: Move shard-0 check into batchlog_replay_loop() Currently the only caller of it is the batchlog manager itself. It checks for the shard-id to be zero, calls the method, then the method asserts that it's run on shard-0. Moving the check into the method removes the need for assertion and makes further patching simpler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-12 16:32:12 +03:00
Pavel Emelyanov	38d0ea0916	batchlog_manager: Fix drain() reentrability Currently drain() is called twice -- first time from storage_service::drain() (on shutdown), second via batchlog_manager::stop(). The routine is unintentionally re-entrant, because: - explicit check for not aborting the abort source twice - breaking semaphore can be done multiple times - co-await-ing of the _started future works because the future is shared That's not very elegant; it is better to make drain() bail out early if it was already called. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-12 16:30:07 +03:00
Patryk Jędrzejczak	92209996b5	system_keyspace: rename cdc_generation_id_v2 Changing the second value of cdc_generation_id_v2 from uuid_type to timeuuid_type made the name of cdc_generation_id_v2 unsuitable because it does not match cdc::generation_id_v2 anymore.	2023-09-12 11:43:34 +02:00
Patryk Jędrzejczak	1c58c6336a	system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3 We change the type of IDs in CDC_GENERATIONS_V3 to timeuuid to give them a time-based order. We also change how we initialize them so that the new CDC generation always has the highest ID. This is the last step to enabling the efficient clearing of obsolete CDC generation data. Additionally, we change the types of current_cdc_generation_uuid, new_cdc_generation_data_uuid and the second values of the elements in unpublished_cdc_generations to timeuuid, so that they match id in CDC_GENERATIONS_V3.	2023-09-12 11:43:34 +02:00
Patryk Jędrzejczak	2cd430ac80	system_kayspace: make CDC_GENERATIONS_V3 single-partition We make CDC_GENERATIONS_V3 single-partition by adding the key column and changing the clustering key from range_end to (id, range_end). This is the first step to enabling the efficient clearing of obsolete CDC generation data, which we need to prevent Raft-topology snapshots from endlessly growing as we introduce new generations over time. The next step is to change the type of the id column to timeuuid. We do it in the following commits. After making CDC_GENERATIONS_V3 single-partition, there is no easy way of preserving the num_ranges column. As it is used only for sanity checking, we remove it to simplify the implementation.	2023-09-12 09:51:45 +02:00
Patryk Jędrzejczak	ed1c1369d9	cdc: generation: rename get_cdc_generation_mutations In the following commits, we modify the CDC_GENERATIONS_V3 schema to enable efficient clearing of obsolete CDC generation data. These modifications make the current get_cdc_generation_mutations work only for the CDC_GENERATIONS_V2 schema, and we need a new function for CDC_GENERATIONS_V3, so we add the "_v2" suffix.	2023-09-11 12:30:21 +02:00
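
Commits 48164e1d09 and 4376854473 above describe a bug class that is easy to reproduce in isolation: a flag with a default value gets silently dropped when a function forwards a call to itself. The sketch below is only an illustration of that pattern, not ScyllaDB code. The names `merge_schema_buggy` and `merge_schema_fixed`, the plain `unsigned shard` parameter, and the `bool` return value (the `reload` value that shard 0 ends up seeing) are all hypothetical stand-ins; the real `merge_schema` has a different signature and forwards across shards asynchronously.

```cpp
// Hypothetical sketch of the default-argument forwarding bug.
// Returns the `reload` value that the shard-0 invocation actually observes.

// Buggy shape: `reload` has a default, so the forwarding call
// compiles even though it forgets to pass the flag along.
bool merge_schema_buggy(unsigned shard, bool reload = false) {
    if (shard != 0) {
        // Forwards to shard 0 but drops `reload`; the default (false) is used.
        return merge_schema_buggy(0);
    }
    return reload;
}

// Fixed shape: no default value, so omitting `reload` in the
// forwarding call would be a compile error instead of a silent bug.
bool merge_schema_fixed(unsigned shard, bool reload) {
    if (shard != 0) {
        return merge_schema_fixed(0, reload);
    }
    return reload;
}
```

Calling `merge_schema_buggy(3, true)` yields `false` because the flag is lost on the hop to shard 0, while `merge_schema_fixed(3, true)` yields `true`. Removing the default, as 4376854473 does for the real function, turns this kind of omission into a compile-time error.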


4972 Commits