scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 03:45:11 +00:00

Author	SHA1	Message	Date
Benny Halevy	66ba983fe0	compaction_manager: flush_all_tables before major compaction Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See `64ec1c6ec6`. However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (See `f42eb4d1ce`). Flushing all tables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction. However, flushing all tables too frequently might result in tiny sstables. Since when flushing all keyspaces using `nodetool flush` the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours). In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it. Fixes scylladb/scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Kefu Chai	6749d963ed	config: define formatter for db::seed_provider_type before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for db::seed_provider_type. please note, we are still formatting vector<db::seed_provider_type> with the helper provided by seastar/core/sstring.hh, which uses operator<<() to print the elements in the vector being printed. so we have to keep the operator<< formatter before disabling the generic formatter for vector<T>. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16138	2023-11-23 11:04:35 +02:00
Piotr Smaroń	8c464b2ddb	guardrails: restrict replication strategy (RS) Replacing `restrict_replication_simplestrategy` config option with 2 config options: `replication_strategy_{warn,fail}_list`, which allow us to impose soft limits (issue a warning) and hard limits (not execute CQL) on replication strategy when creating/altering a keyspace. The reason to rather replace than extend `restrict_replication_simplestrategy` config option is that it was not used and we wanted to generalize it. Only soft guardrail is enabled by default and it is set to SimpleStrategy, which means that we'll generate a CQL warning whenever replication strategy is set to SimpleStrategy. For new cloud deployments we'll move SimpleStrategy from warn to the fail list. Guardrails violations will be tracked by metrics. Resolves #5224 Refs #8892 (the replication strategy part, not the RF part) Closes scylladb/scylladb#15399	2023-10-31 18:34:41 +03:00
David Garcia	1121a4df04	docs: add groups to reference docs fix: comment Closes scylladb/scylladb#15592	2023-10-04 11:42:36 +03:00
Aleksandra Martyniuk	8a65477202	tasks: db: change default task_ttl value If a test isn't going to use task manager or isn't interested in statuses of finished tasks, then keeping them in the memory for some time (currently 10s by default) after they are finished is a memory waste. Set default task_ttl value to zero. It can be changed by setting --task-ttl-in-seconds or through rest api (/task_manager/ttl). In conf/scylla.yaml set task-ttl-in-seconds to 10. Closes #15239	2023-09-07 12:42:29 +03:00
Kefu Chai	f6cca741ea	config: remove "experimental" option "experimental" option was marked "Unused" in `64bc8d2f7d`. but we chose to keep it in hope that the upgrade test does not fail. despite that the upgrade tests per-se survived the "upgrade", after the upgrade, the tests exercising the experimental features are still failing hard. they have not been updated to set the "experimental-features" option, and are still relying on "experimental" to enable all the experimental features under test. so, in this change, let's just drop the option so that scylla can fail early at seeing this "experimental" option. this should help us to identify the tests relying on it quicker. as the "experimental" features should only be used in development environment, this change should have no impact to production. Refs #15214 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15233	2023-09-05 10:09:04 +03:00
Piotr Smaroń	eb46f1bd17	guardrails: restrict replication factor (RF) Replacing `minimum_keyspace_rf` config option with 4 config options: `{minimum,maximum}_replication_factor_{warn,fail}_threshold`, which allow us to impose soft limits (issue a warning) and hard limits (not execute CQL) on RF when creating/altering a keyspace. The reason to rather replace than extend `minimum_keyspace_rf` config option is to be aligned with Cassandra, which did the same, and has the same parameters' names. Only min soft limit is enabled by default and it is set to 3, which means that we'll generate a CQL warning whenever RF is set to either 1 or 2. RF's value of 0 is always allowed and means that there will not be any replicas on a given DC. This was agreed with PM. Because we don't allow to change guardrails' values when scylla is running (per PM), there're no tests provided with this PR, and dtests will be provided separately. Exceeding guardrails' thresholds will be tracked by metrics. Resolves #8619 Refs #8892 (the RF part, not the replication-strategy part) Closes #14262	2023-09-04 19:22:17 +03:00
Avi Kivity	78fc3b5f56	config: rename stream_plan_ranges_percentage to *_fraction The value is specified as a fraction between 0 and 1, so don't mislead users into specifying a value between 0 and 100. Closes #15261	2023-09-03 23:24:29 +03:00
Michał Chojnowski	023accf246	db: config: enable index caching by default Index caching was disabled by default because it caused performance regressions for some small-partition workloads. See #11202. However, it also means that there are workloads which could benefit from the index cache, but (by default) don't. As a compromise, we can set a default limit on the memory usage of index cache, which should be small enough to avoid catastrophic regressions in small-partition workloads, but big enough to accommodate workloads where index cache is obviously beneficial. This patch sets such a limit to 0.2 of total cache memory, and re-enables index caching by default.	2023-09-01 22:34:23 +02:00
Michał Chojnowski	50b429f255	config: add index_cache_fraction Adds a configurable upper limit to memory usage by index caches. See the source code comments added in this patch for more details. This patch shouldn't change visible behaviour, because the limit is set to 1.0 by default, so it is never triggered. We will change the default in a future patch.	2023-09-01 22:34:23 +02:00
Patryk Jędrzejczak	0beabdc6ba	utils: introduce split_comma_separated_list Three places handle comma-separated lists similarly: - ss::remove_node.set(...) in api::set_storage_service, - storage_service::parse_node_list, - storage_service::is_repair_based_node_ops_enabled. In the next commit, the fourth place that needs the same logic appears -- storage_service::raft_replace. It needs to load and parse the --ignore-dead-nodes-for-replace param from config. Moreover, the code in is_repair_based_node_ops_enabled is different and doesn't seem right. We swap '\"' and '\'' with ' ' but don't do anything with it afterward. To avoid code duplication and fix is_repair_based_node_ops_enabled, we introduce the new function utils::split_comma_separated_list. This change has a small side effect on logging. For example, ignore_nodes_strs in storage_service::parse_node_list might be printed in a slightly different form.	2023-08-22 10:30:36 +02:00
Kefu Chai	12d6ec5a18	config: respect --log-with-color 1 scylladb overrides some of seastar logging related options with its own options by applying them with `logging::apply_settings()`. but we fail to inherit `with_color` from Seastar as we are using the designated initializer, so the unspecified members are zero initialized. that's why we always have logging message in black and white even if scylla is running in a tty and `--log-with-color 1` is specified. so, make the debugging life more colorful, let's inherit the option from Seastar, and apply it when setting logging related options. see also `29e09a3292` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15076	2023-08-20 13:47:43 +03:00
Tomasz Grabiec	bd8bb5d4b1	Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho Compaction group is the data plane for tablets, so this integration allows each tablet to have its own storage (memtable + sstables). A crucial step for dynamic tablets, where each tablet can be worked on independently. There are still some inefficiencies to be worked on, but as it is, it already unlocks further development. ``` INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf ``` Closes #14863 * github.com:scylladb/scylladb: Kill scylla option to configure number of compaction groups replica: Wire tablet into compaction group token_metadata: Add this_host_id to topology config replica: Switch to chunked_vector for storing compaction groups replica: Generate group_id for compaction_group on demand	2023-08-18 15:17:17 +02:00
Avi Kivity	1901475598	Merge 'config: mark "experimental" option unused and cleanups' from Kefu Chai in this series, the "experimental" option is marked `Unused` as it has been marked deprecated for almost 2 years since scylla 4.6. and use `experimental_features` to specify the used experimental features explicitly. Closes #14948 * github.com:scylladb/scylladb: config: remove unused namespace alias config: use std::ranges when appropriate config: drop "experimental" option test: disable 'enable_user_defined_functions' if experimental_features does not include udf test: pylib: specify experimental_features explicitly	2023-08-17 20:42:02 +03:00
Raphael S. Carvalho	b578d6643f	Kill scylla option to configure number of compaction groups The option was introduced to bootstrap the project. It's still useful for testing, but that translates into maintaining an additional option and code that will not be really used outside of testing. A possible option is to later map the option in boost tests to initial_tablets, which may yield the same effect for testing. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-16 18:23:53 -03:00
Piotr Smaroń	34c3688017	db: config: add live_updatable_config_params_changeable_via_cql option If `live_updatable_config_params_changeable_via_cql` is set to true, configuration parameters defined with `liveness::LiveUpdate` option can be updated at runtime with CQL, i.e. by updating `system.config` virtual table. If we don't want any configuration parameter to be changed at runtime by updating `system.config` virtual table, this option should be set to false. This option should be set to false for e.g. cloud users, who can only perform CQL queries, and should not be able to change scylla's configuration on the fly. Current implementation is generic, but has a small drawback: messages returned to the user may not be fully accurate. Consider: ``` cqlsh> UPDATE system.config SET value='2' WHERE name='task_ttl_in_seconds'; WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="option is not live-updateable" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'} ``` where `task_ttl_in_seconds` has been defined with `liveness::LiveUpdate`, but because `live_updatable_config_params_changeable_via_cql` is set to `false` in `scylla.yaml`, `task_ttl_in_seconds` cannot be modified at runtime by updating `system.config` virtual table. Fixes #14355 Closes #14382	2023-08-16 17:56:27 +03:00
Kefu Chai	153a808f52	config: remove unused namespace alias bpo is not used after it is defined, so drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 10:17:34 +08:00
Kefu Chai	6355270120	config: use std::ranges when appropriate use std::ranges functions for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 10:17:34 +08:00
Kefu Chai	64bc8d2f7d	config: drop "experimental" option "experimental" was marked deprecated in `8b917f7c`. this change was included since Scylla 4.6. now that 5.3 has been branched, this change will be included in 5.4. this should be long enough for the user's turn around if this option is ever used. the dtests using this option have been audited and updated accordingly. and the unit test for this option is removed as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 10:17:34 +08:00
Amnon Heiman	d10a3dd19a	config: add enable_node_table_metrics flag By default, per-table-per-shard metrics reporting is turned off, and the aggregated version of the metrics (per-table-per-node) will be turned on. There could be a situation where a user with an excessive number of tables would suffer from performance issues, both from the network and the metrics collection server. This patch adds a config option, enable_node_table_metrics, which allows users to turn off per-table metrics reporting altogether. For example, when running Scylla with the command line argument '--enable-node-aggregated-table_metrics 0' per-table metrics will not be reported. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2023-08-02 10:20:18 +03:00
Tomasz Grabiec	4e9d95d78c	Merge 'Compact data before streaming' from Botond Dénes Currently, streaming and repair process and send data as-is. This is wasteful: streaming might be sending data which is expired or covered by tombstones, taking up valuable bandwidth and processing time. Repair additionally could be exposed to artificial differences, due to different nodes being in different states of compactness. This PR adds opt-in compaction to `make_streaming_reader()`, then opts in all users. The main difference being in how these choose the current compaction time to use: * Load'n'stream and streaming use the current time on the local node. * Repair uses a centrally chosen compaction time, generated on the repair master and propagated to all repair followers. This is to ensure all repair participants work with the exact same state of compactness. Importantly, this compaction does not purge tombstones (tombstone GC is disabled completely). Fixes: https://github.com/scylladb/scylladb/issues/3561 Closes #14756 * github.com:scylladb/scylladb: replica: make_[multishard_]streaming_reader(): make compaction_time mandatory repair/row_level: opt in to compacting the stream streaming: opt-in to compacting the stream sstables_loader: opt-in for compacting the stream replica/table: add optional compacting to make_multishard_streaming_reader() replica/table: add optional compacting to make_streaming_reader() db/config: add config item for enabling compaction for streaming and repair repair: log the error which caused the repair to fail readers: compacting_reader: use compact_mutation_state::abandon_current_partition() mutation/mutation_compactor: allow user to abandon current partition	2023-07-28 16:42:13 +02:00
Avi Kivity	cf81eef370	Merge 'schema_mutations, migration_manager: Ignore empty partitions in per-table digest' from Tomasz Grabiec Schema digest is calculated by querying for mutations of all schema tables, then compacting them so that all tombstones in them are dropped. However, even if the mutation becomes empty after compaction, we still feed its partition key. If the same mutations were compacted prior to the query, because the tombstones expire, we won't get any mutation at all and won't feed the partition key. So schema digest will change once an empty partition of some schema table is compacted away. Tombstones expire 7 days after schema change which introduces them. If one of the nodes is restarted after that, it will compute a different table schema digest on boot. This may cause performance problems. When sending a request from coordinator to replica, the replica needs schema_ptr of exact schema version requested by the coordinator. If it doesn't know that version, it will request it from the coordinator and perform a full schema merge. This adds latency to every such request. Schema versions which are not referenced are currently kept in cache for only 1 second, so if request flow has low-enough rate, this situation results in perpetual schema pulls. After `ae8d2a550d` (5.2.0), it is more likely to run into this situation, because table creation generates tombstones for all schema tables relevant to the table, even the ones which will be otherwise empty for the new table (e.g. computed_columns). This change introduces a cluster feature which when enabled will change digest calculation to be insensitive to expiry by ignoring empty partitions in digest calculation. When the feature is enabled, schema_ptrs are reloaded so that the window of discrepancy during transition is short and no rolling restart is required. A similar problem was fixed for per-node digest calculation in c2ba94dc39e4add9db213751295fb17b95e6b962. 
Per-table digest calculation was not fixed at that time because we didn't persist enabled features and they were not enabled early-enough on boot for us to depend on them in digest calculation. Now they are enabled before non-system tables are loaded so digest calculation can rely on cluster features. Fixes #4485. Manually tested using ccm on cluster upgrade scenarios and node restarts. Closes #14441 * github.com:scylladb/scylladb: test: schema_change_test: Verify digests also with TABLE_DIGEST_INSENSITIVE_TO_EXPIRY enabled schema_mutations, migration_manager: Ignore empty partitions in per-table digest migration_manager, schema_tables: Implement migration_manager::reload_schema() schema_tables: Avoid crashing when table selector has only one kind of tables	2023-07-28 00:01:33 +03:00
Botond Dénes	9e3987fc96	db/config: add config item for enabling compaction for streaming and repair Compacting can greatly reduce the amount of data to be processed by streaming and repair, but with certain data shapes, its effectiveness can be reduced and its CPU overhead might outweigh the benefits. This should very rarely be the case, but leave an off switch in case this becomes a problem in a deployment. Not wired yet.	2023-07-27 03:22:11 -04:00
Kamil Braun	e6099c4685	Merge 'config: set schema_commitlog_segment_size_in_mb to 128 ' from Patryk Jędrzejczak Fixes #14668 In #14668, we have decided to introduce a new `scylla.yaml` variable for the schema commitlog segment size and set it to 128MB. The reason is that segment size puts a limit on the mutation size that can be written at once, and some schema mutation writes are much larger than average, as shown in #13864. This `schema_commitlog_segment_size_in_mb` variable is now added to `scylla.yaml` and `db/config`. Additionally, we do not derive the commitlog sync period for schema commitlog anymore because schema commitlog runs in batch mode, so it doesn't need this parameter. It has also been discussed in #14668. Closes #14704 * github.com:scylladb/scylladb: replica: do not derive the commitlog sync period for schema commitlog config: set schema_commitlog_segment_size_in_mb to 128 config: add schema_commitlog_segment_size_in_mb variable	2023-07-24 10:23:34 +02:00
Patryk Jędrzejczak	b3be9617dc	config: set schema_commitlog_segment_size_in_mb to 128 We increase the default schema commitlog segment size so that the large mutations do not fail. We have agreed that 128 MB is sufficient.	2023-07-19 14:16:49 +02:00
Patryk Jędrzejczak	5b167a4ad7	config: add schema_commitlog_segment_size_in_mb variable In #14668, we have decided to introduce a new scylla.yaml variable for the schema commitlog segment size. The segment size puts a limit on the mutation size that can be written at once, and some schema mutation writes are much larger than average, as shown in #13864. Therefore, increasing the schema commitlog segment size is sometimes necessary.	2023-07-19 14:16:41 +02:00
Raphael S. Carvalho	d6029a195e	Remove DateTieredCompactionStrategy This is the last step of deprecation dance of DTCS. In Scylla 5.1, users were warned that DTCS was deprecated. In 5.2, altering or creation of tables with DTCS was forbidden. 5.3 branch was already created, so this is targeting 5.4. Users that refused to move away from DTCS will have Scylla falling back to the default strategy, either STCS or ICS. See: WARN 2023-07-14 09:49:11,857 [shard 0] schema_tables - Falling back to size-tiered compaction strategy after the problem: Unable to find compaction strategy class 'DateTieredCompactionStrategy Then user can later switch to a supported strategy with alter table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #14559	2023-07-14 16:20:48 +03:00
Asias He	dad5caf141	streaming: Add stream_plan_ranges_percentage This option allows user to change the number of ranges to stream in batch per stream plan. Currently, each stream plan streams 10% of the total ranges. With more ranges per stream plan, it reduces the waiting time between two stream plans. For example, stream_plan1: shard0 (t0), shard1 (t1) stream_plan2: shard0 (t2), shard1 (t3) We start stream_plan2 after all shards finish streaming in stream_plan1. If shard0 and shard1 in stream_plan1 finish at different times, one of the shards will be idle. If we stream more ranges in a single stream plan, the waiting time will be reduced. Previously, we retried the stream plan if one of the stream plans failed. That's one of the reasons we wanted more stream plans. With RBNO and `1f8b529e08` (range_streamer: Disable restream logic), the restream factor is not important anymore. Also, more ranges in a single stream plan will create bigger but fewer sstables on the receiver side. The default value is the same as before: 10% of total ranges. Fixes #14191 Closes #14402	2023-07-14 09:03:01 +03:00
Gleb Natapov	4f23eec44f	Rename experimental raft feature to consistent-topology-changes Make the name more descriptive Fixes #14145 Message-Id: <ZKQ2wR3qiVqJpZOW@scylladb.com>	2023-07-07 11:08:10 +02:00
Nadav Har'El	d6aba8232b	alternator: configurable override for DescribeEndpoints The AWS C++ SDK has a bug (https://github.com/aws/aws-sdk-cpp/issues/2554) where even if a user specifies a specific endpoint URL, the SDK uses DescribeEndpoints to try to "refresh" the endpoint. The problem is that DescribeEndpoints can't return a scheme (http or https) and the SDK arbitrarily picks https, making it unable to communicate with Alternator over http. As an example, the new "dynamodb shell" (written in C++) cannot communicate with Alternator running over http. This patch adds a configuration option, "alternator_describe_endpoints", which can be used to override what DescribeEndpoints does: 1. Empty string (the default) leaves the current behavior - DescribeEndpoints echoes the request's "Host" header. 2. The string "disabled" disables the DescribeEndpoints (it will return an UnknownOperationException). This is how DynamoDB Local behaves, and the AWS C++ SDK and the Dynamodb Shell work well in this mode. 3. Any other string is a fixed string to be returned by DescribeEndpoints. It can be useful in setups that should return a known address. Note that this patch does not, by default, change the current behavior of DescribeEndpoints. But it allows us in the future to override its behavior if a user experiences problems in the field, without code changes. Fixes #14410. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14432	2023-07-07 11:08:10 +02:00
Tomasz Grabiec	f2ed9fcd7e	schema_mutations, migration_manager: Ignore empty partitions in per-table digest Schema digest is calculated by querying for mutations of all schema tables, then compacting them so that all tombstones in them are dropped. However, even if the mutation becomes empty after compaction, we still feed its partition key. If the same mutations were compacted prior to the query, because the tombstones expire, we won't get any mutation at all and won't feed the partition key. So schema digest will change once an empty partition of some schema table is compacted away. Tombstones expire 7 days after schema change which introduces them. If one of the nodes is restarted after that, it will compute a different table schema digest on boot. This may cause performance problems. When sending a request from coordinator to replica, the replica needs schema_ptr of exact schema version requested by the coordinator. If it doesn't know that version, it will request it from the coordinator and perform a full schema merge. This adds latency to every such request. Schema versions which are not referenced are currently kept in cache for only 1 second, so if request flow has low-enough rate, this situation results in perpetual schema pulls. After `ae8d2a550d`, it is more likely to run into this situation, because table creation generates tombstones for all schema tables relevant to the table, even the ones which will be otherwise empty for the new table (e.g. computed_columns). This change introduces a cluster feature which when enabled will change digest calculation to be insensitive to expiry by ignoring empty partitions in digest calculation. When the feature is enabled, schema_ptrs are reloaded so that the window of discrepancy during transition is short and no rolling restart is required. A similar problem was fixed for per-node digest calculation in 18f484cc753d17d1e3658bcb5c73ed8f319d32e8. 
Per-table digest calculation was not fixed at that time because we didn't persist enabled features and they were not enabled early-enough on boot for us to depend on them in digest calculation. Now they are enabled before non-system tables are loaded so digest calculation can rely on cluster features. Fixes #4485.	2023-07-03 23:06:55 +02:00
Calle Wilund	a3db540142	auth: Add TLS certificate authenticator Fixes #10099 Adds the com.scylladb.auth.CertificateAuthenticator type. If set as authenticator, will extract roles from TLS authentication certificate (not wire cert - those are server side) subject, based on configurable regex. Example: scylla.yaml: authenticator: com.scylladb.auth.CertificateAuthenticator auth_superuser_name: <name> auth_certificate_role_queries: - source: SUBJECT query: CN=([^,\s]+) client_encryption_options: enabled: True certificate: <server cert> keyfile: <server key> truststore: <shared trust> require_client_auth: True In a client, then use a certificate signed with the <shared trust> store as auth cert, with the common name <name>. I.e. for cqlsh set "usercert" and "userkey" to these certificate files. No user/password needs to be sent, but role will be picked up from auth certificate. If none is present, the transport will reject the connection. If the certificate subject does not contain a recognized role name (from config or set in tables) the authenticator mechanism will reject it. Otherwise, connection becomes the role described.	2023-06-26 15:00:21 +00:00
Calle Wilund	69217662bd	auth: Allow specifying initial superuser name + passwd (salted) in config Instead of locking this to "cassandra:cassandra", allow setting in scylla.yaml or commandline. Note that config values become redundant as soon as auth tables are initialized.	2023-06-26 15:00:20 +00:00
Kefu Chai	f014ccf369	Revert "Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai"" This reverts commit `562087beff`. The regressions introduced by the reverted change have been fixed. So let's revert this revert to resurrect the uuid_sstable_identifier_enabled support. Fixes #10459	2023-06-21 13:02:40 +03:00
Botond Dénes	562087beff	Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai" This reverts commit `d1dc579062`, reversing changes made to `3a73048bc9`. Said commit caused regressions in dtests. We need to investigate and fix those, but in the meanwhile let's revert this to reduce the disruption to our workflows. Refs: #14283	2023-06-19 08:49:27 +03:00
Kefu Chai	4c2df04449	db: config: add uuid_sstable_identifiers_enabled option unlike Cassandra 4.1, this option is true by default, will be used for enabling cluster feature of "UUID_SSTABLE_IDENTIFIERS". not wired yet. please note, because we are still using sstableloader and sstabledump based on 3.x branch, while the Cassandra upstream introduced the uuid sstable identifier in its 4.x branch, these tools fail to work with the sstables with uuid identifier, so this option is disabled when performing these tests. we will enable it once these tools are updated to support the uuid-based sstable identifiers. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-06-15 17:54:59 +08:00
Nadav Har'El	5984db047d	Merge 'mv: forbid IS NOT NULL on columns outside the primary key' from Jan Ciołek statement_restrictions: forbid IS NOT NULL on columns outside the primary key IS NOT NULL is currently allowed only when creating materialized views. It's used to convey that the view will not include any rows that would make the view's primary key columns NULL. Generally materialized views allow to place restrictions on the primary key columns, but restrictions on the regular columns are forbidden. The exception was IS NOT NULL - it was allowed to write regular_col IS NOT NULL. The problem is that this restriction isn't respected, it's just silently ignored (see #10365). Supporting IS NOT NULL on regular columns seems to be as hard as supporting any other restrictions on regular columns. It would be a big effort, and there are some reasons why we don't support them. For now let's forbid such restrictions, it's better to fail than be wrong silently. Throwing a hard error would be a breaking change. To avoid breaking existing code the reaction to an invalid IS NOT NULL restriction is controlled by the `strict_is_not_null_in_views` flag. This flag can have the following values: * `true` - strict checking. Having an `IS NOT NULL` restriction on a column that doesn't belong to the view's primary key causes an error to be thrown. * `warn` - allow invalid `IS NOT NULL` restrictions, but throw a warning. The invalid restrictions are silently ignored. * `false` - allow invalid `IS NOT NULL` restrictions, without any warnings or errors. The invalid restrictions are silently ignored. The default values for this flag are `warn` in `db::config` and `true` in scylla.yaml. This way the existing clusters will have `warn` by default, so they'll get a warning if they try to create such an invalid view. New clusters with fresh scylla.yaml will have the flag set to `true`, as scylla.yaml overwrites the default value in `db::config`. 
New clusters will throw a hard error for invalid views, but in older existing clusters it will just be a warning. This way we can maintain backwards compatibility, but still move forward by rejecting invalid queries on new clusters. Fixes: #10365 Closes #13013 * github.com:scylladb/scylladb: boost/restriction_test: test the strict_is_not_null_in_views flag docs/cql/mv: columns outside of view's primary key can't be restricted cql-pytest: enable test_is_not_null_forbidden_in_filter statement_restrictions: forbid IS NOT NULL on columns outside the primary key schema_altering_statement: return warnings from prepare_schema_mutations() db/config: add strict_is_not_null_in_views config option statement_restrictions: add get_not_null_columns() test: remove invalid IS NOT NULL restrictions from tests	2023-06-07 12:12:19 +03:00
Jan Ciolek	c67d65987e	db/config: add strict_is_not_null_in_views config option IS NOT NULL shouldn't be allowed on columns which are outside of the materialized view's primary key. It's currently allowed to create views with such restrictions, but they're silently ignored, it's a bug. In the following commits restricting regular columns with IS NOT NULL will be forbidden. This is a breaking change. Some users might have existing code that creates views with such restrictions, we don't want to break it. To deal with this a new feature flag is introduced: strict_is_not_null_in_views. By default it's set to `warn`. If a user tries to create a view with such invalid restrictions they will get a warning saying that this is invalid, but the query will still go through, it's just a warning. The default value in scylla.yaml will be `true`. This way new clusters will have strict enforcement enabled and they'll throw errors when the user tries to create such an invalid view, Old clusters without the flag present in scylla.yaml will have the flag set to warn, so they won't break on an update. There's also the option to set the flag to `false`. It's dangerous, as it silences information about a bug, but someone might want it to silence the warnings for a moment. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-06-07 01:48:39 +02:00
Konstantin Osipov	b39ca97919	consistent_cluster_management: make the default As per our roll out plan, make consistent_cluster_management (aka Raft for schema changes) the default going forward. It means all clusters which upgrade from the previous version and don't have `consistent_cluster_management` explicitly set in scylla.yaml will begin upgrading to Raft once all nodes in the cluster have moved to the new version. Fixes #13980 Closes #13984	2023-06-02 09:05:09 +02:00
Gleb Natapov	acc035b504	storage_service: do not allow override_decommission flag if consistent cluster management is enabled If consistent cluster management is enabled, it is not possible to restart a decommissioned node, since it will not be part of group0.	2023-05-31 10:40:42 +03:00
Kefu Chai	b0c40a2a03	db: config: s/ingore/ignore/ this string is used in as the option description in the command line help message. so it is a part of user facing interface. in this change, the typo is fixed. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14013	2023-05-24 13:35:24 +03:00
Botond Dénes	0cff0ffa08	Merge 'alternator,config: make alternator_timeout_in_ms live-updateable' from Kefu Chai before this change, alternator_timeout_in_ms is not live-updatable, as after setting executor's default timeout right before creating sharded executor instances, they never get updated with this option anymore. but many users would like to set the driver timers based on server timers. we need to enable them to configure timeout even when the server is still running. in this change, * `alternator_timeout_in_ms` is marked as live-updateable * `executor::_s_default_timeout` is changed to a thread_local variable, so it can be updated by a per-shard updateable_value. and it is now a updateable_value, so its variable name is updated accordingly. this value is set in the ctor of executor, and it is disconnected from the corresponding named_value<> option in the dtor of executor. * alternator_timeout_in_ms is passed to the constructor of executor via sharded_parameter, so `executor::_timeout_in_ms` can be initialized on per-shard basis * `executor::set_default_timeout()` is dropped, as we already pass the option to executor in its ctor. Fixes #12232 Closes #13300 * github.com:scylladb/scylladb: alternator: split the param list of executor ctor into multi lines alternator,config: make alternator_timeout_in_ms live-updateable	2023-05-15 10:16:29 +03:00
Piotr Dulikowski	760651b4ad	error injection: allow enabling injections via config Currently, error injections can be enabled either through HTTP or CQL. While these mechanisms are effective for injecting errors after a node has already started, they can't be reliably used to trigger failures shortly after node start. In order to support this use case, this commit adds the possibility to enable some error injections via config. A configuration option `error_injections_at_startup` is added. This option uses our existing configuration framework, so it is possible to supply it either via CLI or in the YAML configuration file. - When passed on the command line, the option is parsed as a semicolon-separated list of error injection names that should be enabled. Those error injections are enabled in non-oneshot mode. The CLI option is marked as not used in release mode and does not appear in the option list. Example: --error-injections-at-startup failure_point1;failure_point2 - When provided in YAML config, the option is parsed as a list of items. Each item is either a string or a map of parameters. This method is more flexible as it allows providing parameters for each injection point. At this time, the only benefit is that it allows enabling points in oneshot mode, but more parameters can be added in the future if needed. Explanatory example: error_injections_at_startup: - failure_point1 # enabled in non-oneshot mode - name: failure_point2 # enabled in oneshot mode one_shot: true # due to one_shot optional parameter The primary goal of this feature is to facilitate testing of raft-based cluster features. An error injection will be used to enable an additional feature to simulate node upgrade. Tests: manual Closes #13861	2023-05-15 09:14:07 +03:00
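The two input forms of `error_injections_at_startup` described above can be modelled with a short Python sketch. It is only an illustration of the parsing rules in the commit message, not the actual implementation: `Injection`, `parse_cli` and `parse_yaml_items` are hypothetical names, and the YAML form is assumed to have been loaded already into Python strings and dicts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Injection:
    name: str
    one_shot: bool = False  # the CLI form always yields non-oneshot injections


def parse_cli(value: str) -> list:
    # Command-line form: a semicolon-separated list of injection names.
    return [Injection(name.strip()) for name in value.split(";") if name.strip()]


def parse_yaml_items(items: list) -> list:
    # YAML form: each item is either a bare name or a map of parameters.
    result = []
    for item in items:
        if isinstance(item, str):
            result.append(Injection(item))
        elif isinstance(item, dict):
            result.append(Injection(item["name"], bool(item.get("one_shot", False))))
        else:
            raise ValueError(f"unsupported item: {item!r}")
    return result
```

With these helpers, `parse_cli("failure_point1;failure_point2")` yields two non-oneshot injections, while the YAML example from the commit message yields `failure_point2` with `one_shot=True`.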
Kefu Chai	5fa459bd1a	treewide: do not include unused header since #13452, we switched most of the caller sites from std::regex to boost::regex. in this change, all occurrences of `#include <regex>` are dropped unless std::regex is used in the same source file. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13765	2023-05-07 19:01:29 +03:00
Pavel Emelyanov	2f6aa5b52e	code: Introduce conf/object_storage.yaml configuration file In order to access real S3 bucket, the client should use signed requests over https. Partially this is due to security considerations, partially this is unavoidable, because multipart-uploading is banned for unsigned requests on the S3. Also, signed requests over plain http require signing the payload as well, which is a bit troublesome, so it's better to stick to secure https and keep payload unsigned. To prepare signed requests the code needs to know three things: - aws key - aws secret - aws region name The latter could be derived from the endpoint URL, but it's simpler to configure it explicitly, all the more so there's an option to use S3 URLs without region name in them we could want to use some time. To keep the described configuration the proposed place is the object_storage.yaml file with the format endpoints: - name: a.b.c port: 443 aws_key: 12345 aws_secret: abcdefghijklmnop ... When loaded, the map gets into db::config and later will be propagated down to sstables code (see next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:15 +03:00
Tomasz Grabiec	9781d3ffc5	db: config: Introduce experimental "TABLETS" feature	2023-04-24 10:49:36 +02:00
Amnon Heiman	990545f616	Add relabel from file support. This patch adds a configuration with an optional file name for relabeling metrics. It also adds a function that accepts a file name and loads the relabel config from a file. An example for such a file: ``` $cat conf.yml relabel_configs: - source_labels: [shard] action: drop target_label: shard regex: (2) - source_labels: [shard] action: replace target_label: level replacement: $1 regex: (.*3) ``` update_relabel_config_from_file throws an exception on failure, it's up to the caller to decide what to do in such cases.	2023-04-09 09:10:02 +03:00
Petr Gusev	0152c000bb	commitlog: use separate directory for schema commitlog The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in commitlog::descriptor::descriptor, which is logged with the WARN level. A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new schema_commitlog_directory parameter to move the schema commitlog to another disk drive. By default, the schema commitlog directory is nested in the commitlog_directory. This can help avoid problems during an upgrade if the commitlog_directory in the custom scylla.yaml is located on a separate disk partition. This is expected to be released in 5.3. As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here. Fixes: #11867	2023-03-30 21:55:50 +04:00
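The defaulting rule for `schema_commitlog_directory` described above (an explicit setting wins, otherwise the directory is nested inside `commitlog_directory`) can be sketched as below. This is a minimal Python illustration; the function name is hypothetical, and the subdirectory name `schema` is an assumption, since the commit message does not state it.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_schema_commitlog_dir(commitlog_directory: str,
                                 schema_commitlog_directory: Optional[str] = None,
                                 subdir: str = "schema") -> PurePosixPath:
    # An explicitly configured directory is used as-is, which lets the user
    # place the schema commitlog on a different disk.
    if schema_commitlog_directory:
        return PurePosixPath(schema_commitlog_directory)
    # Otherwise nest it under the main commitlog directory, so it follows
    # a custom commitlog_directory (for example one on a separate partition).
    # Assumption: "schema" is only a placeholder subdirectory name.
    return PurePosixPath(commitlog_directory) / subdir
```

Nesting by default means the main commitlog's directory listing no longer contains segment files with an unknown prefix, which is what triggered the warnings on each boot.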
Kefu Chai	f789d8d3cd	config: mark query timeouts live update-able in this change, following query timeouts config options are marked live update-able: - range_request_timeout_in_ms - read_request_timeout_in_ms - counter_write_request_timeout_in_ms - cas_contention_timeout_in_ms - truncate_request_timeout_in_ms - write_request_timeout_in_ms - request_timeout_in_ms as per https://github.com/scylladb/scylladb/issues/10172, > Many users would like to set the driver timers based on server timers. > For example: expire a read timeout before or after the server read time > out. with this change, these options are marked live-updateable, but since they are cached by their consumers locally, so we will have another commit to update the local copies when these options get updated. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	69c21f490a	alternator,config: make alternator_timeout_in_ms live-updateable before this change, alternator_timeout_in_ms is not live-updatable, as after setting executor's default timeout right before creating sharded executor instances, they never get updated with this option anymore. in this change, * alternator_timeout_in_ms is marked as live-updateable * executor::_s_default_timeout is changed to a thread_local variable, so it can be updated by a per-shard updateable_value. and it is now a updateable_value, so its variable name is updated accordingly. this value is set in the ctor of executor, and it is disconnected from the corresponding named_value<> option in the dtor of executor. * alternator_timeout_in_ms is passed to the constructor of executor via sharded_parameter, so executor::_timeout_in_ms can be initialized on per-shard basis * executor::set_default_timeout() is dropped, as we already pass the option to executor in its ctor. please note, in the ctor of executor, we always update the cached value of `s_default_timeout` with the value of `_timeout_in_ms`, and we set the default timeout to 10s in `alternator_test_env`. this is a design decision to avoid bending the production code for testing, as in production, we always set the timeout with the value specified either by the default value of yaml conf file. Fixes #12232 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-23 20:57:08 +08:00

270 Commits