scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 18:40:38 +00:00

Author	SHA1	Message	Date
Kamil Braun	de3607810d	system_keyspace: fix outdated comment	2023-11-23 14:06:27 +01:00
Pavel Emelyanov	d827068d01	sstables,s3: Support state change (without generation change) Now when the system.sstables has the state field, it can be changed (UPDATEd). However, when changing the state AND generation, this still won't work, because generation is the clustering key of the table in question and cannot be just changed. This, nonetheless, is OK, as generation changes with state only when moving an sstable from upload dir into normal/staging and this is a separate issue for S3 (#13018). For now changing state only is OK. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-24 19:12:37 +03:00
Pavel Emelyanov	ca5d3d217f	system_keyspace: Add state field to system.sstables The state is one of <empty>(normal)/staging/quarantine. Currently when an sstable is moved to a non-normal state, the s3 backend state_change() call throws, thus such sstables do not appear. Next patches are going to change that and the new field in the system.sstables is needed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-24 19:12:37 +03:00
Kefu Chai	b36cef6f1a	sstable: remove _remote_prefix from s3_storage since we use the sstable.generation() for the remote prefix of the key of the object for storing the sstable component, there is no need to set remote_prefix beforehand. since `s3_storage::ensure_remote_prefix()` and `system_keyspace::sstables_registry_lookup_entry()` are not used anymore, they are removed. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-10-23 10:08:22 +08:00
Kefu Chai	af8bc8ba63	sstable: switch to uuid identifier for naming S3 sstable objects before this change, we create a new UUID for a new sstable managed by the s3_storage, and we use the string representation of UUID defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for naming the objects stored on s3_storage. but this representation is not what we are using for storing sstables on local filesystem when the option of "uuid_sstable_identifiers_enabled" is enabled. instead, we are using a base36-based representation which is shorter. to be consistent with the naming of the sstables created for local filesystem, and more importantly, to simplify the interaction between the local copy of sstables and those stored on object storage, we should use the same string representation of the sstable identifier. so, in this change: 1. instead of creating a new UUID, just reuse the generation of the sstable for the object's key. 2. do not store the uuid in the sstable_registry system table, as we already have the generation of the sstable for the same purpose. 3. switch the sstable identifier representation from the one defined by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the base36-based one (implemented by fmt::formatter<sstables::generation_type>) 4. enable the `uuid_sstable_identifiers` cluster feature if it is enabled in the `test_env_config`, so that the sstable manager can use uuid-based identifiers when creating a new sstable. 5. throw if the generation of sstable is not UUID-based when accessing / manipulating an sstable with S3 storage backend, as the S3 storage backend now relies on this option. otherwise we'd have sstables with key like s3://bucket/number/basename, which is just unable to serve as a unique id for sstable if the bucket is shared across multiple tables. Fixes #14175 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-10-23 10:08:22 +08:00
Calle Wilund	6fbd210679	system.truncated: Remove replay_position data from truncated on start Once we've started clean, and all replaying is done, the truncation log's data regarding commit log replay positions is invalid. We should exorcise it as soon as possible. Note that we cannot remove truncation data completely though, since the time stamps stored are used by things like batch log to determine if it should use or discard old batch data.	2023-10-17 18:16:48 +04:00
Tomasz Grabiec	0aef0f900b	Merge 'truncation records refactorings' from Petr Gusev This PR contains several refactorings related to truncation records handling in `system_keyspace`, `commitlog_replayer` and `table` classes: * drop map_reduce from `commitlog_replayer`, it's sufficient to load truncation records from the null shard; * add a check that `table::_truncated_at` is properly initialized before it's accessed; * move its initialization after `init_non_system_keyspaces` Closes scylladb/scylladb#15583 * github.com:scylladb/scylladb: system_keyspace: drop truncation_record system_keyspace: remove get_truncated_at method table: get_truncation_time: check _truncated_at is initialized database: add_column_family: initialize truncation_time for new tables database: add_column_family: rename readonly parameter to is_new system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace commitlog_replayer: refactor commitlog_replayer::impl::init system_keyspace: drop redundant typedef system_keyspace: drop redundant save_truncation_record overload table: rename cache_truncation_record -> set_truncation_time system_keyspace: get_truncated_position -> get_truncated_positions	2023-10-17 10:55:30 +02:00
Avi Kivity	35849fc901	Revert "Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun" This reverts commit `3d4398d1b2`, reversing changes made to `45dfce6632`. The commit causes some schema changes to be lost due to incorrect timestamps in some mutations. More information is available in [1]. Reopens: scylladb/scylladb#7620 Reopens: scylladb/scylladb#13957 Fixes scylladb/scylladb#15530. [1] https://github.com/scylladb/scylladb/pull/15687	2023-10-11 00:32:05 +03:00
Petr Gusev	a6087a10bd	system_keyspace: drop truncation_record This is a refactoring commit without observable changes in behaviour. The only usage was in get_truncation_records method which can be inlined.	2023-10-05 15:19:59 +04:00
Petr Gusev	9d350e7532	system_keyspace: remove get_truncated_at method The only usage is in batchlog_manager, and it can be replaced with cf.get_truncation_time(). std::optional<std::reference_wrapper<canonical_mutation>> is replaced with canonical_mutation* since it is semantically the same but with less type boilerplate.	2023-10-05 15:19:59 +04:00
Petr Gusev	b70bca71bc	system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace load_truncation_times() now works only for schema tables since the rest is not loaded until distributed_loader::init_non_system_keyspaces. An attempt to call cf.set_truncation_time for non-system table just throws an exception, which is caught and logged with debug level. This means that the call cf.get_truncation_time in paxos_state.cc has never worked as expected. To fix that we move load_truncation_times() closer to the point where the tables are loaded. The function distributed_loader::populate_keyspace is called for both system and non-system tables. Once the tables are loaded, we use the 'truncated' table to initialize _truncated_at field for them. The truncation_time check for schema tables is also moved into populate_keyspace since it seems like a more natural place for it.	2023-10-05 15:19:52 +04:00
Petr Gusev	c94946d566	system_keyspace: drop redundant typedef	2023-10-03 17:11:40 +04:00
Petr Gusev	f7d2300cf9	system_keyspace: drop redundant save_truncation_record overload	2023-10-03 17:11:40 +04:00
Petr Gusev	da1e6751e9	table: rename cache_truncation_record -> set_truncation_time This is a refactoring commit without observable changes in behaviour. There is a truncation_record struct, but in this method we only care about time, so rename it (and other related methods) appropriately to avoid confusion.	2023-10-03 17:11:35 +04:00
Petr Gusev	1b2e0d0cc9	system_keyspace: get_truncated_position -> get_truncated_positions This method can return many replay_positions, so the plural form is more appropriate.	2023-09-28 12:25:40 +04:00
Tomasz Grabiec	3d4398d1b2	Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun When performing a schema change through group 0, extend the schema mutations with a version that's persisted and then used by the nodes in the cluster in place of the old schema digest, which becomes horribly slow as we perform more and more schema changes (#7620). If the change is a table create or alter, also extend the mutations with a version for this table to be used for `schema::version()`s instead of having each node calculate a hash which is susceptible to bugs (#13957). When performing a schema change in Raft RECOVERY mode we also extend schema mutations which forces nodes to revert to the old way of calculating schema versions when necessary. We can only introduce these extensions if all of the cluster understands them, so protect this code by a new cluster/schema feature, `GROUP0_SCHEMA_VERSIONING`. Fixes: #7620 Fixes: #13957 Closes scylladb/scylladb#15331 * github.com:scylladb/scylladb: test: add test for group 0 schema versioning test/pylib: log_browsing: fix type hint feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode schema_tables: don't delete `version` cell from `scylla_tables` mutations from group 0 migration_manager: add `committed_by_group0` flag to `system.scylla_tables` mutations schema_tables: use schema version from group 0 if present migration_manager: store `group0_schema_version` in `scylla_local` during schema changes migration_manager: migration_request handler: assume `canonical_mutation` support system_keyspace: make `get/set_scylla_local_param` public feature_service: add `GROUP0_SCHEMA_VERSIONING` feature schema_tables: refactor `scylla_tables(schema_features)` migration_manager: add `std::move` to avoid a copy schema_tables: remove default value for `reload` in `merge_schema` schema_tables: pass `reload` flag when calling `merge_schema` cross-shard system_keyspace: fix outdated comment	2023-09-20 10:43:40 +02:00
Kamil Braun	bc6f7d1b20	Merge 'raft topology: add garbage collection for internal CDC generations table' from Patryk Jędrzejczak We add garbage collection for the `CDC_GENERATIONS_V3` table to prevent it from endlessly growing. This mechanism is especially needed because we send the entire contents of `CDC_GENERATIONS_V3` as a part of the group 0 snapshot. The solution is to keep a clean-up candidate, which is one of the already published CDC generations. The CDC generation publisher introduced in #15281 continually uses this candidate to remove all generations with timestamps not exceeding the candidate's and sets a new candidate when needed. We also add `test_cdc_generation_clearing.py` that verifies this new mechanism. Fixes #15323 Closes scylladb/scylladb#15413 * github.com:scylladb/scylladb: test: add test_cdc_generation_clearing raft topology: remove obsolete CDC generations raft topology: set CDC generation clean-up candidate topology_coordinator: refactor publish_oldest_cdc_generation system_keyspace: introduce decode_cdc_generation_id system_keyspace: add cleanup_candidate to CDC_GENERATIONS_V3	2023-09-18 11:30:10 +02:00
Kamil Braun	7ab7588d59	migration_manager: store `group0_schema_version` in `scylla_local` during schema changes We extend schema mutations with an additional mutation to the `system.scylla_local` table which: - in Raft mode, stores a UUID under the `group0_schema_version` key. - outside Raft mode, stores a tombstone under that key. As we will see in later commits, nodes will use this after applying schema mutations. If the key is absent or has a tombstone, they'll calculate the global schema digest on their own -- using the old way. If the key is present, they'll take the schema version from there. The Raft-mode schema version is equal to the group 0 state ID of this schema command. The tombstone is necessary for the case of performing a schema change in RECOVERY mode. It will force a revert to the old digest-based way. Note that extending schema mutations with a `system.scylla_local` mutation is possible thanks to earlier commits which moved `system.scylla_local` to schema commitlog, so all mutations in the schema mutations vector still go to the same commitlog domain.	2023-09-15 14:32:45 +02:00
Kamil Braun	3ab244e6d9	system_keyspace: make `get/set_scylla_local_param` public We'll use it outside `system_keyspace` code in later commit.	2023-09-15 13:04:04 +02:00
Kamil Braun	9017b998ca	system_keyspace: fix outdated comment	2023-09-15 13:04:04 +02:00
Patryk Jędrzejczak	e375e769b9	raft topology: set CDC generation clean-up candidate We want to use the clean-up candidates to remove the obsolete CDC generation data, but first, we need to set suitable generations as a candidate when there is no candidate. Since CDC generations must be published before we remove them, a generation that is being published is a good candidate.	2023-09-15 09:23:59 +02:00
Petr Gusev	a683cebb02	system_keyspace: scylla_local: use schema commitlog We remove flush from set_scylla_local_param_as since it's now redundant. We add it to save_local_enabled_features as features need to be available before schema commitlog replay. We skip the flush if save_local_enabled_features is called from topology_state_load when the features are migrated to system.topology and we don't need strict durability.	2023-09-13 23:17:20 +04:00
Petr Gusev	beb29f094b	system_keyspace: drop load phases We want to switch system.scylla_local table to the schema commitlog, but load phases hamper here - schema commitlog is initialized after phase1, so a table which is using it should be moved to phase2, but system.scylla_local contains features, and we need them before schema commitlog initialization for SCHEMA_COMMITLOG feature. In this commit we are taking a different approach to loading system tables. First, we load them all in one pass in 'readonly' mode. In this mode, the table cannot be written to and has not yet been assigned a commit log. To achieve this we've added _readonly bool field to the table class, it's initialized to true in table's constructor. In addition, we changed the table constructor to always assign nullptr to commitlog, and we trigger an internal error if table.commitlog() property is accessed while the table is in readonly mode. Then, after triggering on_system_tables_loaded notifications on feature_service and sstable_format_selector, we call system_keyspace::mark_writable and eventually table::mark_ready_for_writes which selects the proper commitlog and marks the table as writable. In sstable_compaction_test we drop several mark_ready_for_writes calls since they are redundant, the table has already been made writable in env.make_table_for_tests call. The table::commitlog function either returns the current commitlog or causes an error if the table is readonly. This didn't work for virtual tables, since they never called mark_ready_for_writes. In this commit we add this call to initialize_virtual_tables.	2023-09-13 23:17:20 +04:00
Petr Gusev	2a0b228d17	main.cc: inline and split system_keyspace.setup Our goal is to switch system.local table to schema commitlog and stop doing flushes when we write to it. This means it would be incorrect to read from this table until schema commitlog is replayed. On the other hand, we need truncation records to be loaded before we start replaying schema commitlog, since commitlog_replayer relies on them. In this commit we inline the system_keyspace::setup function and split its content into two parts. In the first part, before schema commitlog replay, we load truncation records. It's safe to load them before schema commitlog replay since we intend to keep the flushes on writes to the system.truncated table. In the second part, after schema commitlog replay, we do the rest of the job - build_bootstrap_info and db::schema_tables::save_system_schema. We decided to inline this function since there is very low cohesion between the actions it's performing. It's just simpler to reason about them individually.	2023-09-13 23:00:15 +04:00
Petr Gusev	e395086557	system_keyspace: move initialize_virtual_tables into virtual_tables.hh This is a readability refactoring commit without observable changes in behaviour. initialize_virtual_tables logically belongs to the virtual_tables module, and moving it allows making other functions in virtual_tables.cc (register_virtual_tables, install_virtual_readers) local to the module, which simplifies matters a bit. all_virtual_tables() is not needed anymore, all the references to registered virtual tables are now local to the virtual_tables module and can just use the virtual_tables variable directly.	2023-09-13 23:00:15 +04:00
Petr Gusev	c4787a160b	system_keyspace: remove unused parameter	2023-09-13 23:00:15 +04:00
Pavel Emelyanov	5d52a35e05	system_keyspace: Don't require snitch argument on start Now system keyspace is finally independent from snitch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-05 12:57:09 +03:00
Pavel Emelyanov	1daa8fa3bb	system_keyspace: Don't cache local dc:rack pair Now no code needs it from system keyspace Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-05 12:56:45 +03:00
Pavel Emelyanov	9926917bf5	system_keyspace: Save local info with explicit location On boot system keyspace is kicked to insert local info into system.local table. Among other things there's dc:rack pair which sys.ks. gets from its cache which, in turn, should have been previously initialized from snitch on sys.ks. start. This patch makes the local info updating method get the dc:rack from caller via argument. Callers, in turn, call snitch directly, because these are main and cql_test_env startup routines. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-05 12:54:46 +03:00
Pavel Emelyanov	6bc30f1944	system_keyspace: De-bloat .setup() from messing with system.local On boot several manipulations with system.local are performed. 1. The host_id value is selected from it with key = local If not found, system_keyspace generates a new host_id, inserts the new value into the table and returns back 2. The cluster_name is selected from it with key = local Then it's system_keyspace that either checks that the name matches the one from db::config, or inserts the db::config value into the table 3. The row with key = local is updated with various info like versions, listen, rpc and bcast addresses, dc, rack, etc. Unconditionally All three steps are scattered over main, p.1 is called directly, p.2 and p.3 are executed via system_keyspace::setup() that happens rather late. Also there's some touch of this table from the cql_test_env startup code. The proposal is to collect this setup into one place and execute it early -- as soon as the system.local table is populated. This frees the system_keyspace code from the logic of selecting host id and cluster name leaving it to main and keeps it with only select/insert work. refs: #2795 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #15082	2023-08-20 21:24:31 +03:00
Pavel Emelyanov	7a342ed5c0	system_keyspace: Make force_blocking_flush() non-static Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:20 +03:00
Piotr Dulikowski	8f491457ae	system_keyspace: separate loading topology_features from topology Now, it is possible to load topology_features separately from the topology struct. It will be used in the code that checks enabled features on startup.	2023-08-04 12:32:04 +02:00
Avi Kivity	3de7cacdf3	Merge 'De-static system_keyspace's [gs]et_scylla_local_param(_as)?' from Pavel Emelyanov Those without `_as` suffix are just marked non-static The `..._as` ones are made class methods (now they are local to system_keyspace.cc) After that the `..._as` ones are patched to use `this->` instead of `qctx` Closes #14890 * github.com:scylladb/scylladb: system_keyspace: Stop using qctx in [gs]et_scylla_local_param_as() system_keyspace: Reuse container() and _db member for flushing system_keyspace: Make [gs]et_scylla_local_param_as() class methods system_keyspace: De-static [gs]et_scylla_local_param()	2023-07-31 21:51:04 +03:00
Pavel Emelyanov	1ac4b7d2fe	system_keyspace: Make [gs]et_scylla_local_param_as() class methods These are now two .cc-local templatized helpers, but they are only called by system_keyspace:: non-static methods, so can be such as well Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-31 16:02:18 +03:00
Pavel Emelyanov	04b12d24fd	system_keyspace: De-static [gs]et_scylla_local_param() All same-class callers are now non-static methods of system_keyspace, all external callers do it via an object at hand. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-31 16:02:18 +03:00
Botond Dénes	72043a6335	Merge 'Avoid using qctx in schema_tables' column-mapping queries' from Pavel Emelyanov There are three methods in system_keyspace namespace that run queries over `system.scylla_table_schema_history` table. For that they use qctx, which is not nice. Fortunately, all the callers already have the system_keyspace& local variable or argument they can pass to those methods. Since the accessed table belongs to system keyspace, the latter declares the querying methods as "friends" to let them get private `query_processor& _qp` member Closes #14876 * github.com:scylladb/scylladb: schema_tables: Extract query_processor from system_keyspace for querying schema_tables: Add system_keyspace& argument to ..._column_mapping() calls migration_manager: Add system_keyspace argument to get_schema_mapping()	2023-07-31 15:00:59 +03:00
Pavel Emelyanov	ab6dbe654f	schema_tables: Extract query_processor from system_keyspace for querying The schema_tables() column-mapping code runs queries over system. table, but it needs LOCAL_ONE CL and cherry-pick on caching, so regular system_keyspace::execute_cql() won't work here. However, since schema_tables is somewhat part of system_keyspace, it's natural to let the former fetch private query_processor& from the latter Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-28 16:02:14 +03:00
Pavel Emelyanov	d311784721	system_keyspace: De-static set_raft_group0_id() The caller is group0 code with sys_ks local variable Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-28 13:13:59 +03:00
Pavel Emelyanov	7837bc7d5a	system_keyspace: De-static get_raft_group0_id() The callers are in group0 code that have sys_ks local variable/argument Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-28 13:13:11 +03:00
Pavel Emelyanov	26dd7985a8	system_keyspace: De-static get_last_group0_state_id() The caller is raft_group0_client with sys.ks. dependency reference and group0_state_machine with raft_group0_client exporting its sys.ks. This makes it possible to instantly drop one more qctx reference Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-28 13:12:04 +03:00
Pavel Emelyanov	3de0efd32c	system_keyspace: De-static group0_history_contains() The caller is raft_group0_client with sys.ks. dependency reference. This allows dropping one qctx reference right away Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-28 13:11:08 +03:00
Pavel Emelyanov	c017117340	system_keyspace: Remove qctx usage from load_topology_state() Fortunately, this is pretty simple -- the only caller is storage_service that has sharded<system_keyspace> dependency reference Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #14824	2023-07-27 08:56:40 +03:00
Avi Kivity	615544a09a	Merge 'Init messaging service preferred IP cache via config' from Pavel Emelyanov This is to make m.s. initialization more solid and simplify sys.ks.::setup() Closes #14832 * github.com:scylladb/scylladb: system_keyspace: Remove unused snitch arg from setup() messaging_service: Setup preferred IPs from config	2023-07-26 22:12:28 +03:00
Pavel Emelyanov	6b82071064	system_keyspace: Remove unused snitch arg from setup() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-26 16:05:26 +03:00
Pavel Emelyanov	0fba57a3e8	messaging_service: Setup preferred IPs from config Population of messaging service preferred IPs cache happens inside system keyspace setup() call and it needs m.s. per se and additionally snitch. Moving preferred ip cache to initial configuration keeps m.s. start more self-contained and keeps system_keyspace::setup() simpler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-26 16:03:23 +03:00
Pavel Emelyanov	db1c6e2255	system_keyspace: Make save_truncation_record() non-static ... and stop using qctx Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-21 13:12:50 +03:00
Pavel Emelyanov	8a87c87824	db/system_keyspace: Move and use qctx::execute_cql_with_timeout() This template call is only used by system keyspace paxos methods. All those methods are no longer static and can use system_keyspace::_qp reference to real query processor instead of global qctx. The execute_cql_with_timeout() wrapper is moved to system_keyspace to make it work Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-19 19:32:10 +03:00
Pavel Emelyanov	b9ef16c06f	db/system_keyspace: Make paxos methods non-static The service::paxos_state methods that call those already have system keyspace reference at hand and can call method on an object Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-19 19:32:10 +03:00
Kamil Braun	028183c793	main, cql_test_env: simplify `system_keyspace` initialization Initialization of `system_keyspace` is now all done at once instead of being spread out through the entire procedure. This is doable because `query_processor` is now available early. A couple of FIXMEs have been resolved.	2023-06-18 13:39:27 +02:00
Kamil Braun	33c19baabc	db: system_keyspace: take simpler service references in `make` Take references to services which are initialized earlier. The references to `gossiper`, `storage_service` and `raft_group0_registry` are no longer needed. This will allow us to move the `make` step right after starting `system_keyspace`.	2023-06-18 13:39:27 +02:00

291 Commits