scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Author	SHA1	Message	Date
Patryk Jędrzejczak	0357636f16	db/system_distributed_keyspace: fix indentation Broken in the previous commit.	2023-11-02 14:21:15 +01:00
Patryk Jędrzejczak	813c7a582c	db/system_distributed_keyspace: retry start on concurrent operation A concurrent group 0 operation in system_distributed_keyspace::start can happen during concurrent bootstrap in the Raft-based topology.	2023-11-02 14:21:15 +01:00
Kamil Braun	0846d324d7	Merge 'rollback topology operation on streaming failure' from Gleb This patch series adds error handling for streaming failure during topology operations instead of an infinite retry. If streaming fails the operation is rolled back: bootstrap/replace nodes move to left and decommissioned/remove nodes move back to normal state. * 'gleb/streaming-failure-rollback-v4' of github.com:scylladb/scylla-dev: raft: make sure that all operation forwarded to a leader are completed before destroying raft server storage_service: raft topology: remove code duplication from global_tablet_token_metadata_barrier tests: add tests for streaming failure in bootstrap/replace/remove/decomission test/pylib: do not stop node if decommission failed with an expected error storage_service: raft topology: fix typo in "decommission" everywhere storage_service: raft topology: add streaming error injection storage_service: raft topology: do not increase topology version during CDC repair storage_service: raft topology: rollback topology operation on streaming failure. storage_service: raft topology: load request parameters in left_token_ring state as well storage_service: raft topology: do not report term_changed_error during global_token_metadata_barrier as an error storage_service: raft topology: change global_token_metadata_barrier error handling to try/catch storage_service: raft topology: make global_token_metadata_barrier node independent storage_service: raft topology: split get_excluded_nodes from exec_global_command storage_service: raft topology: drop unused include_local and do_retake parameters from exec_global_command which are always true storage_service: raft topology: simplify streaming RPC failure handling	2023-11-02 10:15:45 +01:00
Kamil Braun	ae58e39743	Merge 'reduce announcements of the automatic schema changes' from Patryk Jędrzejczak There are some schema modifications performed automatically (during bootstrap, upgrade etc.) by Scylla that are announced by multiple calls to `migration_manager::announce` even though they are logically one change. Precisely, they appear in: - `system_distributed_keyspace::start`, - `redis:create_keyspace_if_not_exists_impl`, - `table_helper::setup_keyspace` (for the `system_traces` keyspace). All these places contain a FIXME telling us to `announce` only once. There are a few reasons for this: - calling `migration_manager::announce` with Raft is quite expensive -- taking a `read_barrier` is necessary, and that requires contacting a leader, which then must contact a quorum, - we must implement a retrying mechanism for every automatic `announce` if `group0_concurrent_modification` occurs to enable support for concurrent bootstrap in Raft-based topology. Doing it before the FIXMEs mentioned above would be harder, and fixing the FIXMEs later would also be harder. This PR fixes the first two FIXMEs and improves the situation with the last one by reducing the number of the `announce` calls to two. Unfortunately, reducing this number to one requires a big refactor. We can do it as a follow-up to a new, more specific issue. Also, we leave a new FIXME. Fixing the first two FIXMEs required enabling the announcement of a keyspace together with its tables. Until now, the code responsible for preparing mutations for a new table could assume the existence of the keyspace. This assumption wasn't necessary, but removing it required some refactoring. 
Fixes scylladb/scylladb#15437 Closes scylladb/scylladb#15897 * github.com:scylladb/scylladb: table_helper: announce twice in setup_keyspace table_helper: refactor setup_table redis: create_keyspace_if_not_exists_impl: fix indentation redis: announce once in create_keyspace_if_not_exists_impl db: system_distributed_keyspace: fix indentation db: system_distributed_keyspace: announce once in start tablet_allocator: update on_before_create_column_family migration_listener: add parameter to on_before_create_column_family alternator: executor: use new prepare_new_column_family_announcement alternator: executor: introduce create_keyspace_metadata migration_manager: add new prepare_new_column_family_announcement	2023-11-02 09:32:35 +01:00
Piotr Smaroń	8c464b2ddb	guardrails: restrict replication strategy (RS) Replacing `restrict_replication_simplestrategy` config option with 2 config options: `replication_strategy_{warn,fail}_list`, which allow us to impose soft limits (issue a warning) and hard limits (not execute CQL) on replication strategy when creating/altering a keyspace. The reason to rather replace than extend `restrict_replication_simplestrategy` config option is that it was not used and we wanted to generalize it. Only soft guardrail is enabled by default and it is set to SimpleStrategy, which means that we'll generate a CQL warning whenever replication strategy is set to SimpleStrategy. For new cloud deployments we'll move SimpleStrategy from warn to the fail list. Guardrails violations will be tracked by metrics. Resolves #5224 Refs #8892 (the replication strategy part, not the RF part) Closes scylladb/scylladb#15399	2023-10-31 18:34:41 +03:00
Avi Kivity	ef7db6df99	Merge 'schema_tables: turn view schema fixing code into a sanity check' from Kamil Braun The purpose of `maybe_fix_legacy_secondary_index_mv_schema` was to deal with legacy materialized view schemas used for secondary indexes, schemas which were created before the notion of "computed columns" was introduced. Back then, secondary index schemas would use a regular "token" column. Later it became a computed column and old schemas would be migrated during rolling upgrade. The migration code was introduced in 2019 (`db8d4a0cc6`) and then fixed in 2020 (`d473bc9b06`). The fix was present in Enterprise 2022.1 and in OSS 4.5. So, assuming that users don't try crazy things like upgrading from 2021.X to 2023.X (which we do not support), all clusters will have already executed the migration code once they upgrade to 2023.X, meaning we can get rid of it. The main motivation of this PR is to get rid of the `db::schema_tables::merge_schema` call in `parse_schema_tables`. In Raft mode this was the only call to `merge_schema` outside "group 0 code" and in fact it is unsafe -- it uses locally generated mutations with locally generated timestamp (`api::new_timestamp()`), so if we actually did it, we would permanently diverge the group 0 state machine across nodes (the schema pulling code is disabled in Raft mode). Fortunately, this should be dead code by now, as explained in the previous paragraph. The migration code is now turned into a sanity check, if the users try something crazy, they will get an error instead of silent data corruption. Closes scylladb/scylladb#15695 * github.com:scylladb/scylladb: view: remove unused `_backing_secondary_index` schema_tables: turn view schema fixing code into a sanity check schema_tables: make comment more precise feature_service: make COMPUTED_COLUMNS feature unconditionally true	2023-10-31 13:23:19 +02:00
Patryk Jędrzejczak	df199eec11	db: system_distributed_keyspace: fix indentation Broken in the previous commit.	2023-10-31 12:08:03 +01:00
Patryk Jędrzejczak	91ff8007b3	db: system_distributed_keyspace: announce once in start We refactor system_distributed_keyspace::start so that it takes at most one group 0 guard and calls migration_manager::announce at most once. We remove a catch expression together with the FIXME from get_updated_service_levels (add_new_columns_if_missing before the patch) because we cannot treat the service_levels update differently anymore.	2023-10-31 12:08:03 +01:00
Avi Kivity	d450a145ce	Revert "Merge 'reduce announcements of the automatic schema changes ' from Patryk Jędrzejczak" This reverts commit `4b80130b0b`, reversing changes made to `a5519c7c1f`. It's suspected of causing dtest failures due to a bug in coroutine::parallel_for_each.	2023-10-29 18:32:06 +02:00
Kamil Braun	1c0ae2e7ef	Merge 'raft topology: assign tokens after join node response rpc' from Piotr Dulikowski Currently, when the topology coordinator accepts a node, it moves it to bootstrap state and assigns tokens to it (either new ones during bootstrap, or the replaced node's tokens). Only then it contacts the joining node to tell it about the decision and let it perform a read barrier. However, this means that the tokens are inserted too early. After inserting the tokens the cluster is free to route write requests to it, but it might not have learned about all of the schema yet. Fix the issue by inserting the tokens later, after completing the join node response RPC which forces the receiving node to perform a read barrier. Refs: scylladb/scylladb#15686 Fixes: scylladb/scylladb#15738 Closes scylladb/scylladb#15724 * github.com:scylladb/scylladb: test: test_topology_ops: continuously write during the test raft topology: assign tokens after join node response rpc storage_service: fix indentation after previous commit raft topology: loosen assumptions about transition nodes having tokens	2023-10-29 18:30:32 +02:00
Marcin Maliszkiewicz	020a9c931b	db: view: run local materialized view mutations on a separate smp service group When base write triggers mv write and it needs to be send to another shard it used the same service group and we could end up with a deadlock. This fix affects also alternator's secondary indexes. Testing was done using (yet) not committed framework for easy alternator performance testing: https://github.com/scylladb/scylladb/pull/13121. I've changed hardcoded max_nonlocal_requests config in scylla from 5000 to 500 and then ran: ./build/release/scylla perf-alternator-workloads --workdir /tmp/scylla-workdir/ --smp 2 \ --developer-mode 1 --alternator-port 8000 --alternator-write-isolation forbid --workload write_gsi \ --duration 60 --ring-delay-ms 0 --skip-wait-for-gossip-to-settle 0 --continue-after-error true --concurrency 2000 Without the patch when scylla is overloaded (i.e. number of scheduled futures being close to max_nonlocal_requests) after couple seconds scylla hangs, cpu usage drops to zero, no progress is made. We can confirm we're hitting this issue by seeing under gdb: p seastar::get_smp_service_groups_semaphore(2,0)._count $1 = 0 With the patch I wasn't able to observe the problem, even with 2x concurrency. I was able to make the process hang with 10x concurrency but I think it's hitting different limit as there wasn't any depleted smp service group semaphore and it was happening also on non mv loads. Fixes https://github.com/scylladb/scylladb/issues/15844 Closes scylladb/scylladb#15845	2023-10-29 18:30:32 +02:00
Gleb Natapov	0a8c3e5c78	storage_service: raft topology: load request parameters in left_token_ring state as well Next patch will want to access request parameters in left_token_ring for failure recovery purposes.	2023-10-25 12:56:27 +03:00
Piotr Dulikowski	63aa9332aa	raft topology: assign tokens after join node response rpc Currently, when the topology coordinator accepts a node, it moves it to bootstrap state and assigns tokens to it (either new ones during bootstrap, or the replaced node's tokens). Only then it contacts the joining node to tell it about the decision and let it perform a read barrier. However, this means that the tokens are inserted too early. After inserting the tokens the cluster is free to route write requests to it, but it might not have learned about all of the schema yet. Fix the issue by inserting the tokens later, after completing the join node response RPC which forces the receiving node to perform a read barrier.	2023-10-25 11:50:17 +02:00
Piotr Dulikowski	2d161676c7	raft topology: loosen assumptions about transition nodes having tokens In later commits, tokens for a joining/replacing node will not be inserted when the node enters `bootstrapping`/`replacing` state but at some later step of the procedure. Loosen some of the assumptions in `storage_service::topology_state_load` and `system_keyspace::load_topology_state` appropriately.	2023-10-25 11:50:17 +02:00
Nadav Har'El	4b80130b0b	Merge 'reduce announcements of the automatic schema changes ' from Patryk Jędrzejczak There are some schema modifications performed automatically (during bootstrap, upgrade etc.) by Scylla that are announced by multiple calls to `migration_manager::announce` even though they are logically one change. Precisely, they appear in: - `system_distributed_keyspace::start`, - `redis:create_keyspace_if_not_exists_impl`, - `table_helper::setup_keyspace` (for the `system_traces` keyspace). All these places contain a FIXME telling us to `announce` only once. There are a few reasons for this: - calling `migration_manager::announce` with Raft is quite expensive -- taking a `read_barrier` is necessary, and that requires contacting a leader, which then must contact a quorum, - we must implement a retrying mechanism for every automatic `announce` if `group0_concurrent_modification` occurs to enable support for concurrent bootstrap in Raft-based topology. Doing it before the FIXMEs mentioned above would be harder, and fixing the FIXMEs later would also be harder. This PR fixes the first two FIXMEs and improves the situation with the last one by reducing the number of the `announce` calls to two. Unfortunately, reducing this number to one requires a big refactor. We can do it as a follow-up to a new, more specific issue. Also, we leave a new FIXME. Fixing the first two FIXMEs required enabling the announcement of a keyspace together with its tables. Until now, the code responsible for preparing mutations for a new table could assume the existence of the keyspace. This assumption wasn't necessary, but removing it required some refactoring. 
Fixes #15437 Closes scylladb/scylladb#15594 * github.com:scylladb/scylladb: table_helper: announce twice in setup_keyspace table_helper: refactor setup_table redis: create_keyspace_if_not_exists_impl: fix indentation redis: announce once in create_keyspace_if_not_exists_impl db: system_distributed_keyspace: fix indentation db: system_distributed_keyspace: announce once in start tablet_allocator: update on_before_create_column_family migration_listener: add parameter to on_before_create_column_family alternator: executor: use new prepare_new_column_family_announcement alternator: executor: introduce create_keyspace_metadata migration_manager: add new prepare_new_column_family_announcement	2023-10-24 15:42:48 +03:00
Kamil Braun	db49ccccb0	view: remove unused `_backing_secondary_index` This boolean was only used for a sanity check which was replaced with a stronger sanity check in the previous commit that doesn't require the boolean.	2023-10-24 13:33:36 +02:00
Kamil Braun	3976808b12	schema_tables: turn view schema fixing code into a sanity check The purpose of `maybe_fix_legacy_secondary_index_mv_schema` was to deal with legacy materialized view schemas used for secondary indexes, schemas which were created before the notion of "computed columns" was introduced. Back then, secondary index schemas would use a regular "token" column. Later it became a computed column and old schemas would be migrated during rolling upgrade. The migration code was introduced in 2019 (`db8d4a0cc6`) and then fixed in 2020 (`d473bc9b06`). The fix was present in Enterprise 2022.1 and in OSS 4.5. So, assuming that users don't try crazy things like upgrading from 2021.X to 2023.X (which we do not support), all clusters will have already executed the migration code once they upgrade to 2023.X, meaning we can get rid of it. The main motivation of this patch is to get rid of the `db::schema_tables::merge_schema` call in `parse_schema_tables`. In Raft mode this was the only call to `merge_schema` outside "group 0 code" and in fact it is unsafe -- it uses locally generated mutations with locally generated timestamp (`api::new_timestamp()`), so if we actually did it, we would permanently diverge the group 0 state machine across nodes (the schema pulling code is disabled in Raft mode). Fortunately, this should be dead code by now, as explained in the previous paragraph. The migration code is now turned into a sanity check, if the users try something crazy, they will get an error instead of silent data corruption.	2023-10-24 13:33:35 +02:00
Kamil Braun	f02ac9a9e7	schema_tables: make comment more precise `maybe_fix_legacy_secondary_index_mv_schema` function has this piece of code: ``` // If the first clustering key part of a view is a column with name not found in base schema, // it implies it might be backing an index created before computed columns were introduced, // and as such it must be recreated properly. if (!base_schema->columns_by_name().contains(first_view_ck.name())) { schema_builder builder{schema_ptr(v)}; builder.mark_column_computed(first_view_ck.name(), std::make_unique<legacy_token_column_computation>()); if (preserve_version) { builder.with_version(v->version()); } return view_ptr(builder.build()); } ``` The comment uses the phrase "it might be". However, the code inside the `if` assumes that it "must be": once we determined that the first column in this materialized view does not have a corresponding name in the base table, we set it to be computed using `legacy_token_column_computation`, so we assumed that the column was indeed storing the token. Doing that for a column which is not the token column would be a small disaster. Assuming that the code is correct, we can make the comment more precise. I checked the documentation and I don't see any other way how we could have such a column other than the token column which is internally created by Scylla when creating a secondary index (for example, it is forbidden to use an alias in select statement when creating materialized views, which I checked experimentally).	2023-10-24 13:30:13 +02:00
Kamil Braun	5397524875	feature_service: make COMPUTED_COLUMNS feature unconditionally true The feature is assumed to be true, it was introduced in 2019. It's still advertised in gossip, but it's assumed to always be present. The `schema_feature` enum class still contains `COMPUTED_COLUMNS`, and the `all_tables` function in schema_tables.cc still checks for the schema feature when deciding if `computed_columns()` table should be included. This is necessary because digest calculation tests contain many digests calculated with the feature disabled, if we wanted to make it unconditional in the schema_tables code we'd have to regenerate almost all digests in the tests. It is simpler to leave the possibility for the tests to disable the feature.	2023-10-24 13:30:13 +02:00
Kefu Chai	b36cef6f1a	sstable: remove _remote_prefix from s3_storage since we use the sstable.generation() for the remote prefix of the key of the object for storing the sstable component, there is no need to set remote_prefix beforehand. since `s3_storage::ensure_remote_prefix()` and `system_keyspace::sstables_registry_lookup_entry()` are not used anymore, they are removed. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-10-23 10:08:22 +08:00
Kefu Chai	af8bc8ba63	sstable: switch to uuid identifier for naming S3 sstable objects before this change, we create a new UUID for a new sstable managed by the s3_storage, and we use the string representation of UUID defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for naming the objects stored on s3_storage. but this representation is not what we are using for storing sstables on local filesystem when the option of "uuid_sstable_identifiers_enabled" is enabled. instead, we are using a base36-based representation which is shorter. to be consistent with the naming of the sstables created for local filesystem, and more importantly, to simplify the interaction between the local copy of sstables and those stored on object storage, we should use the same string representation of the sstable identifier. so, in this change: 1. instead of creating a new UUID, just reuse the generation of the sstable for the object's key. 2. do not store the uuid in the sstable_registry system table. As we already have the generation of the sstable for the same purpose. 3. switch the sstable identifier representation from the one defined by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the base36-based one (implemented by fmt::formatter<sstables::generation_type>) 4. enable the `uuid_sstable_identifers` cluster feature if it is enabled in the `test_env_config`, so that it the sstable manager can enable the uuid-based uuid when creating a new uuid for sstable. 5. throw if the generation of sstable is not UUID-based when accessing / manipulating an sstable with S3 storage backend. as the S3 storage backend now relies on this option. as, otherwise we'd have sstables with key like s3://bucket/number/basename, which is just unable to serve as a unique id for sstable if the bucket is shared across multiple tables. Fixes #14175 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-10-23 10:08:22 +08:00
Kamil Braun	c1486fee40	Merge 'commitlog: drop truncation_records after replay' from Petr Gusev This is a follow-up for #15279 and it fixes two problems. First, we restore flushes on writes for the tables that were switched to the schema commitlog if `SCHEMA_COMMITLOG` feature is not yet enabled. Otherwise durability is not guaranteed. Second, we address the problem with truncation records, which could refer to the old commitlog if any of the switched tables were truncated in the past. If the node crashes later, and we replay schema commitlog, we may skip some mutations since their `replay_position`s will be smaller than the `replay_position`s stored for the old commitlog in the `truncated` table. It turned out that this problem exists even if we don't switch commitlogs for tables. If the node was rebooted the segment ids will start from some small number - they use `steady_clock` which is usually bound to boot time. This means that if the node crashed we may skip the mutations because their RPs will be smaller than the last truncation record RP. To address this problem we delete truncation records as soon as commitlog is replayed. We also include a test which demonstrates the problem. Fixes #15354 Closes scylladb/scylladb#15532 * github.com:scylladb/scylladb: add test_commitlog system.truncated: Remove replay_position data from truncated on start main.cc: flush only local memtables when replaying schema commitlog main.cc: drop redundant supervisor::notify system_keyspace: flush if schema commitlog is not available	2023-10-18 11:14:31 +02:00
Botond Dénes	7f81957437	Merge 'Initialize datadir for system and non-system keyspaces the same way' from Pavel Emelyanov When populating system keyspace the sstable_directory forgets to create upload/ subdir in the tables' datadir because of the way it's invoked from distributed loader. For non-system keyspaces directories are created in table::init_storage() which is self-contained and just creates the whole layout regardless of what. This PR makes system keyspace's tables use table::init_storage() as well so that the datadir layout is the same for all on-disk tables. Test included. fixes: #15708 closes: scylladb/scylla-manager#3603 Closes scylladb/scylladb#15723 * github.com:scylladb/scylladb: test: Add test for datadir/ layout sstable_directory: Indentation fix after previous patch db,sstables: Move storage init for system keyspace to table creation	2023-10-18 12:12:19 +03:00
Avi Kivity	f42eb4d1ce	Merge 'Store and propagage GC timestamp markers from commitlog' from Calle Wilund Fixes #14870 (Originally suggested by @avikivity). Use commit log stored GC clock min positions to narrow compaction GC bounds. (Still requires augmented manual flush:es with extensive CL clearing to pass various dtest, but this does not affect "real" execution). Adds a lowest timestamp of GC clock whenever a CF is added to a CL segment the first time. Because GC clock is wall clock time and only connected to TTL (not cell/row timestamps), this gives a fairly accurate view of GC low bounds per segment. This is then (in a rather ugly way) propagated to tombstone_gc_state to narrow the allowed GC bounds for a CF, based on what is currently left in CL. Note: this is a rather unoptimized version - no caching or anything. But even so, should not be excessively expensive, esp. since various other code paths already cache the results. Closes scylladb/scylladb#15060 * github.com:scylladb/scylladb: main/cql_test_env: Augment compaction mgr tombstone_gc_state with CL GC info tombstone_gc_state: Add optional callback to augment GC bounds commitlog: Add keeping track of approximate lowest GC clock for CF entries database: Force new commitlog segment on user initiated flush commitlog: Add helper to force new active segment	2023-10-17 18:27:43 +03:00
Calle Wilund	6fbd210679	system.truncated: Remove replay_position data from truncated on start Once we've started clean, and all replaying is done, truncation logs commit log regarding replay positions are invalid. We should exorcise them as soon as possible. Note that we cannot remove truncation data completely though, since the time stamps stored are used by things like batch log to determine if it should use or discard old batch data.	2023-10-17 18:16:48 +04:00
Petr Gusev	c89ead55ff	system_keyspace: flush if schema commitlog is not available In PR #15279 we removed flushes when writing to a number of tables from the system keyspace. This was made possible by switching these tables to the schema commitlog. Schema commitlog is enabled only when the SCHEMA_COMMITLOG feature is supported by all nodes in the cluster. Before that these tables will use the regular commitlog, which is not durable because it uses db::commitlog::sync_mode::PERIODIC. This means that we may lose data if a node crashes during upgrade to the version with schema commitlog. In this commit we fix this problem by restoring flushes after writes to the tables if the schema commitlog is not enabled yet. The patch also contains a test that demonstrates the problem. We need flush_schema_tables_after_modification option since otherwise schema changes are not durable and node fails after restart.	2023-10-17 18:14:27 +04:00
Calle Wilund	560d3c17f0	commitlog: Add keeping track of approximate lowest GC clock for CF entries Adds a lowest timestamp of GC clock whenever a CF is added to a CL segment first. Because GC clock is wall clock time and only connected to TTL (not cell/row timestamps), this gives a fairly accurate view of GC low bounds per segment. Includes of course a function to get the all-segment lowest per CF.	2023-10-17 10:26:41 +00:00
Calle Wilund	810d06946f	commitlog: Add helper to force new active segment When called, if active segment holds data, close and replace with pristine one.	2023-10-17 10:26:40 +00:00
Tomasz Grabiec	0aef0f900b	Merge 'truncation records refactorings' from Petr Gusev This PR contains several refactorings related to truncation records handling in the `system_keyspace`, `commitlog_replayer` and `table` classes: * drop map_reduce from `commitlog_replayer`, it's sufficient to load truncation records from the null shard; * add a check that `table::_truncated_at` is properly initialized before it's accessed; * move its initialization after `init_non_system_keyspaces` Closes scylladb/scylladb#15583 * github.com:scylladb/scylladb: system_keyspace: drop truncation_record system_keyspace: remove get_truncated_at method table: get_truncation_time: check _truncated_at is initialized database: add_column_family: initialize truncation_time for new tables database: add_column_family: rename readonly parameter to is_new system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace commitlog_replayer: refactor commitlog_replayer::impl::init system_keyspace: drop redundant typedef system_keyspace: drop redundant save_truncation_record overload table: rename cache_truncation_record -> set_truncation_time system_keyspace: get_truncated_position -> get_truncated_positions	2023-10-17 10:55:30 +02:00
Pavel Emelyanov	059d7c795e	db,sstables: Move storage init for system keyspace to table creation User and system keyspaces are created and populated slightly differently. System keyspace is created via system_keyspace::make() which eventually calls add_column_family(). Then it's populated via init_system_keyspace() which calls sstable_directory::prepare() which, in turn, optionally creates directories in datadir/ or checks the directory permissions if it exists. User keyspaces are created with the help of add_column_family_and_make_directory() call which calls the add_column_family() mentioned above _and_ calls table::init_storage() to create directories. When it's populated with init_non_system_keyspaces() it also calls sstable_directory::prepare() which notices that the directory exists and then checks the permissions. As a result, sstable_directory::prepare() initializes storage for system keyspace only and there's a BUG (#15708) that the upload/ subdir is not created. This patch makes the directories creation for _all_ keyspaces with the table::init_storage(). The change only touches system keyspace by moving the creation of directories from sstable_directory::prepare() into system_keyspace::make(). Indentation is deliberately left broken. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-16 16:19:25 +03:00
Patryk Jędrzejczak	98d067e77d	db: system_distributed_keyspace: fix indentation Broken in the previous commit.	2023-10-16 14:59:53 +02:00
Patryk Jędrzejczak	5ebc0e8617	db: system_distributed_keyspace: announce once in start We refactor system_distributed_keyspace::start so that it takes at most one group 0 guard and calls migration_manager::announce at most once. We remove a catch expression together with the FIXME from get_updated_service_levels (add_new_columns_if_missing before the patch) because we cannot treat the service_levels update differently anymore.	2023-10-16 14:59:53 +02:00
Jan Ciolek	940e44f887	db/view: change log level of failed view updates to WARN When a remote view update doesn't succeed there's a log message saying "Error applying view update...". This message had log level ERROR, but it's not really a hard error. View updates can fail for a multitude of reasons, even during normal operation. A failing view update isn't fatal, it will be saved as a view hint and retried later. Let's change the log level to WARN. It's something that shouldn't happen too much, but it's not a disaster either. ERROR log level causes trouble in tests which assume that an ERROR level message means that the test has failed. Refs: https://github.com/scylladb/scylladb/issues/15046#issuecomment-1712748784 For local view updates the log level stays at "ERROR", local view updates shouldn't fail. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes scylladb/scylladb#15640	2023-10-11 18:19:23 +03:00
Avi Kivity	35849fc901	Revert "Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun" This reverts commit `3d4398d1b2`, reversing changes made to `45dfce6632`. The commit causes some schema changes to be lost due to incorrect timestamps in some mutations. More information is available in [1]. Reopens: scylladb/scylladb#7620 Reopens: scylladb/scylladb#13957 Fixes scylladb/scylladb#15530. [1] https://github.com/scylladb/scylladb/pull/15687	2023-10-11 00:32:05 +03:00
Dawid Medrek	6fdca0d3a8	db/hints/manager: Reword comments about state The current comments should be clearer to someone not familiar with the module. This commit also makes them abide by the limit of 120 characters per line.	2023-10-06 13:25:30 +02:00
Dawid Medrek	aa38ea3642	db/hints/manager: Unfriend space_watchdog space_watchdog is a friend of shard hint manager just to be able to execute one of its functions. This commit changes that by unfriending the class and exposing the function.	2023-10-06 13:25:30 +02:00
Dawid Medrek	6cd0153954	db/hints: Remove a redundant alias	2023-10-06 13:25:30 +02:00
Dawid Medrek	ddc385bce0	db/hints: Remove an unused namespace	2023-10-06 13:25:30 +02:00
Dawid Medrek	76d414012b	db/hints: Coroutinize change_host_filter()	2023-10-06 13:25:30 +02:00
Dawid Medrek	09eb30e6f1	db/hints: Coroutinize drain_for() This commit turns the function into a coroutine and makes the code less compact and more readable.	2023-10-06 13:25:30 +02:00
Dawid Medrek	907a572e24	db/hints: Clean up can_hint_for() This commit gets rid of unnecessary additional calls to functions and makes all lines abide by the limit of 120 characters.	2023-10-06 13:25:30 +02:00
Dawid Medrek	596e1f9859	db/hints: Clean up store_hint() This commit makes the function abide by the limit of 120 characters per line.	2023-10-06 13:25:30 +02:00
Dawid Medrek	8a43f94ca6	db/hints: Clean up too_many_in_flight_hints_for() This commit makes the return statement more readable. It also makes the comment abide by the limit of 120 characters per line.	2023-10-06 13:25:30 +02:00
Dawid Medrek	96a5906621	db/hints: Refactor get_ep_manager()	2023-10-06 13:25:30 +02:00
Dawid Medrek	8b591be3c3	db/hints: Coroutinize wait_for_sync_point() This commit coroutinizes the function and adds a comment explaining a non-trivial case.	2023-10-06 13:25:27 +02:00
Dawid Medrek	fee3aafd80	db/hints: Use std::span in calculate_current_sync_point std::span is a lot more flexible than std::vector as it allows for arbitrary contiguous ranges.	2023-10-06 12:36:05 +02:00
Dawid Medrek	64fd4d6323	db/hints: Clean up manager::forbid_hints_for_eps_with_pending_hints()	2023-10-06 12:26:55 +02:00
Dawid Medrek	58cd5c4167	db/hints: Clean up manager::forbid_hints()	2023-10-06 12:26:55 +02:00
Dawid Medrek	f8ed93f5bc	db/hints: Clean up manager::allow_hints()	2023-10-06 12:26:52 +02:00
Dawid Medrek	bfe32bcf89	db/hints: Coroutinize compute_hints_dir_device_id()	2023-10-06 12:18:30 +02:00


3435 Commits