scylladb

Author	SHA1	Message	Date
Botond Dénes	5a7af93c7c	db/config: introduce reader_concurrency_semaphore_cpu_concurrency To allow increasing the semaphore's CPU concurrency, which is currently hard-limited to 1. Not wired yet. (cherry picked from commit `c7317be09a`)	2024-07-08 08:06:28 +03:00
Pavel Emelyanov	5811df4d4b	config: Mark tablets feature as unused This feature used to be there for a while, but then it was removed by `83d491af02`. This patch partially brings it back, but maps it to UNUSED, so that if it is found in the config, a warning is issued while the other features are still parsed. refs: #18968 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `b2520b8185`)	2024-06-12 18:35:32 +00:00
Lakshmi Narayanan Sreethar	85805f6472	db/config.cc: increment components_memory_reclaim_threshold config default Incremented the components_memory_reclaim_threshold config's default value to 0.2 as the previous value was too strict and caused unnecessary eviction in otherwise healthy clusters. Fixes #18607 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit `3d7d1fa72a`) Closes scylladb/scylladb#19014	2024-06-03 12:19:16 +03:00
Pavel Emelyanov	62a23fd86a	config: Remove experimental TABLETS feature ... and replace it with boolean enable_tablets option. All the places in the code are patched to check the latter option instead of the former feature. The option is OFF by default, but the default scylla.yaml file sets this to true, so that newly installed clusters turn tablets ON. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `83d491af02`) Closes scylladb/scylladb#19012	2024-06-03 12:16:41 +03:00
Kefu Chai	617e532859	db: config: drop operator<<() for error_injection_at_startup it is not used anymore, so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18701	2024-05-16 15:10:57 +03:00
Asias He	952dfc6157	repair: Introduce repair_partition_count_estimation_ratio config option In commit `642f9a1966` (repair: Improve estimated_partitions to reduce memory usage), a 10% hard coded estimation ratio is used. This patch introduces a new config option to specify the estimation ratio of partitions written by repair out of the total partitions. It is set to 0.1 by default. Fixes #18615 Closes scylladb/scylladb#18634	2024-05-13 15:16:55 +03:00
Kamil Braun	03818c4aa9	direct_failure_detector: increase ping timeout and make it tunable The direct failure detector design is simplistic. It sends pings sequentially and times out listeners that reached the threshold (i.e. didn't hear from a given endpoint for too long) in-between pings. Given the sequential nature, the previous ping must finish so the next ping can start. We time out pings that take too long. The timeout was hardcoded and set to 300ms. This is too low for wide-area setups -- latencies across the Earth can indeed go up to 300ms. 3 subsequent timed out pings to a given node were sufficient for the Raft listener to "mark server as down" (the listener used a threshold of 1s). Increase the ping timeout to 600ms which should be enough even for pinging the opposite side of Earth, and make it tunable. Increase the Raft listener threshold from 1s to 2s. Without the increased threshold, one timed out ping would be enough to mark the server as down. Increasing it to 2s requires 3 timed out pings which makes it more robust in presence of transient network hiccups. In the future we'll most likely want to decrease the Raft listener threshold again, if we use Raft for data path -- so leader elections start quickly after leader failures. (Faster than 2s). To do that we'll have to improve the design of the direct failure detector. Ref: scylladb/scylladb#16410 Fixes: scylladb/scylladb#16607 --- I tested the change manually using `tc qdisc ... netem delay`, setting network delay on local setup to ~300ms with jitter. Without the change, the result is as observed in scylladb/scylladb#16410: interleaving ``` raft_group_registry - marking Raft server ... as dead for Raft groups raft_group_registry - marking Raft server ... as alive for Raft groups ``` happening once every few seconds. The "marking as dead" happens whenever we get 3 subsequent failed pings, which happens with a certain (high) probability depending on the latency jitter.
Then as soon as we get a successful ping, we mark server back as alive. With the change, the phenomenon no longer appears. Closes scylladb/scylladb#18443	2024-05-07 23:40:23 +02:00
Kamil Braun	d8313dda43	Merge 'db: config: move consistent-topology-changes out of experimental and make it the default for new clusters' from Patryk Jędrzejczak We move consistent cluster management out of experimental and make it the default for new clusters in 6.0. In code, we make the `consistent-topology-changes` flag unused and assumed to be true. In 6.0, the topology upgrade procedure will be manual and voluntary, so some clusters will still be using the gossip-based topology even though they support the raft-based topology. Therefore, we need to continue testing the gossip-based topology. This is possible by using the `force-gossip-topology-changes` flag introduced in scylladb/scylladb#18284. Ref scylladb/scylladb#17802 Closes scylladb/scylladb#18285 * github.com:scylladb/scylladb: docs: raft.rst: update after removing consistent-topology-changes treewide: fix indentation after the previous patch db: config: make consistent-topology-changes unused test: lib: single_node_cql_env: restart a node in noninitial run_in_thread calls test: test_read_required_hosts: run with force-gossip-topology-changes storage_service: join_cluster: replace force_gossip_based_join with force-gossip-topology-changes storage_service: join_token_ring: fix finish_setup_after_join calls	2024-04-26 14:45:29 +02:00
Avi Kivity	c2b8ca7d71	Merge 'cql3: statements: change default tombstone_gc mode for tablets' from Aleksandra Martyniuk Repair may miss some tablets that migrated across nodes. So if tombstones expire after some timeout, then we can have data resurrection. Set default tombstone_gc mode to "repair" for tables which use tablets (if repair is required). Fixes: #16627. Closes scylladb/scylladb#18013 * github.com:scylladb/scylladb: test: check default value of tombstone_gc test: topology: move some functions to util.py cql3: statements: change default tombstone_gc mode for tablets	2024-04-25 19:18:37 +03:00
Amnon Heiman	dfea50a7e9	db/config.cc add metric family config from file Metric family config lets a user configure the metric family aggregate labels. This patch modifies the existing relabel-config from file to accept metric family config. Similar to the existing relabel_config, it adds a metric_family_configs section. For example, the following configuration demonstrates changing aggregate labels by name and regular expression. ``` metric_family_configs: - name: storage_service aggregate_labels: [shard] - regex: (storage_proxy.*) aggregate_labels: [shard, scheduling_group_name] ``` Signed-off-by: Amnon Heiman <amnon@scylladb.com> Closes scylladb/scylladb#18339	2024-04-25 16:03:39 +03:00
Patryk Jędrzejczak	3a34bb18cd	db: config: make consistent-topology-changes unused We make the `consistent-topology-changes` experimental feature unused and assumed to be true in 6.0. We remove code branches that executed if `consistent-topology-changes` was disabled.	2024-04-25 14:33:21 +02:00
Aleksandra Martyniuk	58f72f9019	cql3: statements: change default tombstone_gc mode for tablets Currently, if tombstone_gc mode isn't specified for a table, then "timeout" is used by default. With tablets, running "nodetool repair -pr" may miss a tablet if it migrated across the nodes. Then, if we expire tombstones for ranges that weren't repaired, we may get data resurrection. Set default tombstone_gc mode value for DDLs that don't specify it. It's set to "repair" for tables which use tablets unless they use local replication strategy or rf = 1. Otherwise it's set to "timeout".	2024-04-24 10:42:10 +02:00
Patryk Jędrzejczak	14911051ee	db: config: introduce force-gossip-topology-changes We are going to make the `consistent-topology-changes` experimental feature unused in 6.0. However, the topology upgrade procedure will be manual and voluntary, so some 6.0 clusters will be using the gossip-based topology. Therefore, we need to continue testing the gossip-based topology. The solution is introducing a new flag, `force-gossip-topology-changes`, that will enforce the gossip-based topology in a fresh cluster. In this patch, we only introduce the parameter without any effect. Here is the explanation. Making `consistent-topology-changes` unused and introducing `force-gossip-topology-changes` requires adjustments in scylla-dtest. We want to merge changes to scylladb and scylla-dtest in a way that ensures all tests are run correctly during the whole process. If we merged all changes to scylladb first, before merging the scylla-dtest changes, all tests would run with the raft-based topology and the ones excluded in the raft-based topology would fail. We also can't merge all changes to scylla-dtest first. However, we can follow this plan: 1. scylladb: merge this patch 2. scylla-dtest: start using `force-gossip-topology-changes` in jobs that run without the raft-based topology 3. scylladb: merge the rest of the changes 4. scylla-dtest: merge the rest of the changes Ref scylladb/scylladb#17802 Closes scylladb/scylladb#18284	2024-04-23 09:42:46 +02:00
Kefu Chai	a439ebcfce	treewide: include fmt/ranges.h and/or fmt/std.h before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map, optional and variant using {fmt} instead of the homebrew formatter based on operator<<. with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:56:16 +08:00
Lakshmi Narayanan Sreethar	e8026197d2	db/config: add a new variable to limit memory used by table components A new configuration variable, components_memory_reclaim_threshold, has been added to configure the maximum allowed percentage of available memory for all SSTable components in a shard. If the total memory usage exceeds this threshold, it will be reclaimed from the components to bring it back under the limit. Currently, only the memory used by the bloom filters will be restricted. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-04-02 01:37:47 +05:30
Petr Gusev	49a4220fea	error_injection: pass injection parameters at startup Injection parameters can be used in the lambda passed to inject_with_handler method to take some values from the test. However, there was no way to set values to these parameters on node startup, only through the error injection REST api. Therefore, we couldn't rely on this when inject_with_handler is used during node startup, since it could trigger before we call the api from the test. In this commit we solve this problem by allowing these parameters to be assigned through scylla.yaml config. The defer.hh header was added to error_injection.hh to fix compilation after adding error_injection.hh to config.hh, since the defer function is used in error_injection.hh.	2024-03-19 20:17:02 +04:00
Botond Dénes	5e37c1465f	db/config: introduce query_page_size_in_bytes Regulates the page size in bytes via config, instead of the currently used hard-coded constant. Allows tests to configure lower limits so they can work with smaller data-sets when testing paging related functionality. Not wired yet.	2024-02-27 02:14:45 -05:00
Tomasz Grabiec	ef9e5e64a3	locator: token_metadata: Introduce topology barrier stall detector When topology barrier is blocked for longer than configured threshold (2s), stale versions are marked as stalled and when they get released they report backtrace to the logs. This should help to identify what was holding the token metadata pointer for too long. Example log: token_metadata - topology version 30 held for 299.159 [s] past expiry, released at: 0x2397ae1 0x23a36b6 ... Closes scylladb/scylladb#17427	2024-02-21 15:05:34 +02:00
Avi Kivity	93af3dd69b	Merge 'Maintenance socket: set filesystem permissions to 660' from Mikołaj Grzebieluch Set filesystem permissions for the maintenance socket to 660 (previously it was 755) to allow a scyllaadm's group to connect. Split the logic of creating sockets into two separate functions, one for each case: when it is a regular cql controller or used by maintenance_socket. Fixes https://github.com/scylladb/scylladb/issues/16487. Closes scylladb/scylladb#17113 * github.com:scylladb/scylladb: maintenance_socket: add option to set owning group transport/controller: get rid of magic number for socket path's maximal length transport/controller: set unix_domain_socket_permissions for maintenance_socket transport/controller: pass unix_domain_socket_permissions to generic_server::listen transport/controller: split configuring sockets into separate functions	2024-02-20 15:09:54 +02:00
Mikołaj Grzebieluch	182cfebe40	maintenance_socket: add option to set owning group Option `maintenance-socket-group` sets the owning group of the maintenance socket. If not set, the group will be the same as the user running the scylla node.	2024-02-19 10:21:00 +01:00
Kefu Chai	3dfb0f86f1	db: add formatter for error_injection_at_startup before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `error_injection_at_startup`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17211	2024-02-08 19:40:48 +02:00
Pavel Emelyanov	7c5c89ba8d	Revert "Merge 'Use utils::directories instead of db::config to get dirs' from Patryk Wróbel" This reverts commit `370fbd346c`, reversing changes made to `0912d2a2c6`. This makes scylla-manager mis-interpret the data_file_directories somehow, issue #17078	2024-01-31 15:08:14 +03:00
Avi Kivity	c8397f0287	Merge 'Implement tablet splitting' from Raphael "Raph" Carvalho The motivation for tablet resizing is that we want to keep the average tablet size reasonable, such that load rebalancing can remain efficient. A tablet that is too large makes migration inefficient, which slows down the balancer. If the avg size grows beyond the upper bound (split threshold), then the balancer decides to split. Split spans all tablets of a table, due to the power-of-two constraint. Likewise, if the avg size decreases below the lower bound (merge threshold), then merge takes place in order to grow the avg size. Merge is not implemented yet, although this series lays the foundation for it to be implemented later on. A resize decision can be revoked if the avg size changes and the decision is no longer needed. For example, let's say a table is being split and the avg size drops below the target size (which is 50% of split threshold and 100% of merge one). That means after split, the avg size would drop below the merge threshold, causing a merge after split, which is wasteful, so it's better to just cancel the split. Tablet metadata gains 2 new fields for managing this: resize_type: resize decision type, can be either of "merge", "split", or "none". resize_seq_number: a sequence number that works as the global identifier of the decision (monotonically increasing, increased by 1 on every new decision emitted by the coordinator). A new RPC was implemented to pull stats from each table replica, such that load balancer can calculate the avg tablet size and know the "split status", for a given table. Avg size is aggregated carefully while taking RF of each DC into account (which might differ). When a table is done splitting its storage, it loads (mirrors) the resize_seq_number from tablet metadata into its local state (in other words, its split status is ready). If a table is split ready, coordinator will see that table's seq number is the same as the one in tablet metadata.
This helps to distinguish stale decisions from the latest one (in case decisions are revoked and re-emitted later on). Also, it's aggregated carefully, by taking the minimum among all replicas, so coordinator will only update topology when all replicas are ready. When load balancer emits split decision, replicas listen for the need to split with a "split monitor" that is awakened once a table has replication metadata updated and detects the need for split (i.e. resize_type field is "split"). The split monitor will start splitting of compaction groups (using mechanism introduced here: `081f30d149`) for the table. And once splitting work is completed, the table updates its local state as having completed split. When coordinator pulls the split status of all replicas for a table via RPC, the balancer can see whether that table is ready for "finalizing" the decision, which is about updating tablet metadata to split each tablet into two. Once table replicas have their replication metadata updated with the new tablet count, they can update appropriately their set of compaction groups (that were previously split in the preparation step). Fixes #16536.
Closes scylladb/scylladb#16580 * github.com:scylladb/scylladb: test/topology_experimental_raft: Add tablet split test replica: Bypass reshape on boot with tablets temporarily replica: Fix table::compaction_group_for_sstable() for tablet streaming test/topology_experimental_raft: Disable load balancer in test fencing replica: Remap compaction groups when tablet split is finalized service: Split tablet map when split request is finalized replica: Update table split status if completed split compaction work storage_service: Implement split monitor topology_cordinator: Generate updates for resize decisions made by balancer load_balancer: Introduce metrics for resize decisions db: Make target tablet size a live-updateable config option load_balancer: Implement resize decisions service: Wire table_resize_plan into migration_plan service: Introduce table_resize_plan tablet_mutation_builder: Add set_resize_decision() topology_coordinator: Wire load stats into load balancer storage_service: Allow tablet split and migration to happen concurrently topology_coordinator: Periodically retrieve table_load_stats locator: Introduce topology::get_datacenter_nodes() storage_service: Implement table_load_stats RPC replica: Expose table_load_stats in table replica: Introduce storage_group::live_disk_space_used() locator: Introduce table_load_stats tablets: Add resize decision metadata to tablet metadata locator: Introduce resize_decision	2024-01-31 13:59:56 +02:00
Piotr Smaroń	35ba037724	config: fix a typo in --role-manager's description Closes scylladb/scylladb#17063	2024-01-30 16:13:33 +02:00
Patryk Wrobel	dc8d5ffaf6	db::config: keep dir paths unchanged This change is intended to ensure, that db::config fields related to directories are not changed. To achieve that a member function called setup_directories() is removed. The responsibility for directories paths has been moved to utils::directories, which may generate default paths if the configuration does not provide a specific value. Fixes: scylladb#5626 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-01-29 13:20:41 +01:00
Raphael S. Carvalho	638e6e30cb	db: Make target tablet size a live-updateable config option Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-01-25 18:36:08 -03:00
Mikołaj Grzebieluch	e6a83b9819	db/config: add maintenance mode flag	2024-01-25 15:27:53 +01:00
Nadav Har'El	df6c9828ef	Merge 'Add protobuf and Native histogram support' from Amnon Heiman Native histograms (also known as sparse histograms) are an experimental Prometheus feature. They use protobuf as the reporting layer. Native histograms hold the benefits of high resolution at a lower resource cost. This series allows sending histograms in a native histogram format over protobuf. By default, protobuf support is disabled. To use protobuf with native histograms, the command line flag prometheus_allow_protobuf should be set to true, and the Prometheus server should send the accept header with protobuf. Fixes #12931 Closes scylladb/scylladb#16737 * github.com:scylladb/scylladb: main.cc: Add prometheus_allow_protobuf command line histogram_metrics_helper: support native histogram config: Add prometheus_allow_protobuf flag	2024-01-24 21:24:50 +02:00
Kefu Chai	c978d1b3f8	config: s/re-use/reuse/ this misspelling is identified by codespell. per m-w, reuse is a word per-se, and we don't need the hyphen for addressing the ambiguity in the use cases, like, recover and re-cover. see also https://www.merriam-webster.com/dictionary/reuse Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16962	2024-01-24 15:19:03 +02:00
Botond Dénes	26d814d8be	Merge 'Configure initial tablets count scaling' from Pavel Emelyanov There are currently two ways to "request" the number of initial tablets for a table 1. specify it explicitly when creating a keyspace 2. let scylla calculate it on its own Both are not very nice. The former doesn't take cluster layout into consideration. The latter does, but starts with one tablet per shard, which can be too low if the amount of data grows rapidly. Here's a (maybe temporary) proposal to facilitate at least perf tests -- the --tablets-initial-scale-factor option that enhances option number two above by multiplying the calculated number of tablets by the configured number. This is what we currently do to run perf tests by patching scylla; with the option it is going to be more convenient. Closes scylladb/scylladb#16919 * github.com:scylladb/scylladb: config: Add --tablets-initial-scale-factor tablet_allocator: Add initial tablets scale to config tablet_allocator: Add config	2024-01-23 13:25:12 +02:00
Amnon Heiman	fc9bd2de03	config: Add prometheus_allow_protobuf flag Native histograms (also known as sparse histograms) are an experimental Prometheus feature. They use protobuf as the reporting layer. The prometheus_allow_protobuf flag allows the user to enable protobuf protocol. When this flag is set to true, and the Prometheus server sends in the request that it accepts protobuf, the result will be in protobuf protocol. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2024-01-23 13:12:07 +02:00
Pavel Emelyanov	d1d4620af8	config: Add --tablets-initial-scale-factor Previous patch taught tablets allocator to multiply the initial tablets count by some value. This patch makes this factor configurable Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-22 19:18:18 +03:00
David Garcia	f3eeba8cc6	docs: parse config.cc properties as rst text This enhancement formats descriptions in config.cc using the standard markup language reStructuredText (RST). By doing so, it improves the rendering of these descriptions in the documentation, allowing you to use various directives like admonitions, code blocks, ordered lists, and more. Closes scylladb/scylladb#16311	2024-01-22 16:40:18 +02:00
Michał Jadwiszczak	f6a464ad81	configure service levels interval So far the service levels interval, responsible for updating SL configuration, was hardcoded in main. Now it's extracted to `service_levels_interval_ms` option.	2024-01-12 10:28:24 +01:00
Nadav Har'El	7c5092cb8f	test: add missing "tags" schema extension to cql_test_env One of the unfortunate anti-features of cql_test_env (the framework used in our CQL tests that are written in C++) is that it needs to repeat various bizarre initialization steps done in main.cc, otherwise various requests work incorrectly. One of these steps that main.cc performs is to initialize various "schema extensions" which some of the Scylla features need to work correctly. We remembered to initialize some schema extensions in cql_test_env, but forgot others. The one I will need in the following patch is the "tags" extension, which we need to mark materialized views used by local secondary indexes as "synchronous_updates" - without this patch the LSI tests in secondary_index_test.cc will crash. In addition to adding the missing extension, this patch also replaces the segmentation-fault crash when it's missing (caused by a dynamic cast failure) by a clearer on_internal_error() - so if we ever have this bug again, it will be easier to debug. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-12-21 11:44:50 +02:00
Kamil Braun	6fcaec75db	Merge 'Add maintenance socket' from Mikołaj Grzebieluch It enables interaction with the node through CQL protocol without authentication. It gives full-permission access. The maintenance socket is available by Unix domain socket with file permissions `755`, thus it is not accessible from outside of the node and from other POSIX groups on the node. It is created before the node joins the cluster. To set up the maintenance socket, use the `maintenance-socket` option when starting the node. * If set to `ignore` maintenance socket will not be created. * If set to `workdir` maintenance socket will be created in `<node's workdir>/cql.m`. * Otherwise maintenance socket will be created in the specified path. The default value is `ignore`. * With python driver ```python from cassandra.cluster import Cluster from cassandra.connection import UnixSocketEndPoint from cassandra.policies import HostFilterPolicy, RoundRobinPolicy socket = "<node's workdir>/cql.m" cluster = Cluster([UnixSocketEndPoint(socket)], # Driver tries to connect to other nodes in the cluster, so we need to filter them out. load_balancing_policy=HostFilterPolicy(RoundRobinPolicy(), lambda h: h.address == socket)) session = cluster.connect() ``` Merge note: apparently cqlsh does not support unix domain sockets; it will have to be fixed in a follow-up. Closes scylladb/scylladb#16172 * github.com:scylladb/scylladb: test.py: add maintenance socket test test.py: enable maintenance socket in tests by default docs: add maintenance socket documentation main: add maintenance socket main: refactor initialization of cql controller and auth service auth/service: don't create system_auth keyspace when used by maintenance socket cql_controller: maintenance socket: fix indentation cql_controller: add option to start maintenance socket db/config: add maintenance_socket_enabled bool class auth: add maintenance_socket_role_manager db/config: add maintenance_socket variable	2023-12-20 19:04:40 +02:00
Mikołaj Grzebieluch	e682e362a3	db/config: add maintenance_socket variable If set to "ignore", maintenance socket will be disabled. If set to "workdir", maintenance socket will be opened on <scylla's workdir>/cql.m. Otherwise it will be opened on path provided by maintenance_socket variable. It is set by default to 'ignore'.	2023-12-18 11:42:05 +01:00
Patryk Jędrzejczak	5ebfbf42bc	db: config: make consistent_cluster_management mandatory Code that executed only when consistent_cluster_management=false is removed. In particular, after this patch: - raft_group0 and raft_group_registry are always enabled, - raft_group0::status_for_monitoring::disabled becomes unused, - topology tests can only run with consistent_cluster_management.	2023-12-14 16:54:04 +01:00
Patryk Jędrzejczak	a54f9052fc	db: config: make override_decommission deprecated The override_decommission option is supported only when consistent_cluster_management is disabled. In the following commit, we make consistent_cluster_management mandatory, which makes override_decommission unusable.	2023-12-14 16:54:04 +01:00
Patryk Jędrzejczak	571db3c983	db: config: make force_schema_commit_log deprecated In scylladb/scylladb#16254, we made force_schema_commit_log unused. After this change, if someone passes this option as the command line argument, the boot fails. This behavior is undesired. We only want this option to be ignored. We can achieve this effect by making it deprecated.	2023-12-14 16:53:46 +01:00
Botond Dénes	d2a88cd8de	Merge 'Typos: fix typos in code' from Yaniv Kaul Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255 Closes scylladb/scylladb#16289 * github.com:scylladb/scylladb: Update unified/build_unified.sh Update main.cc Update dist/common/scripts/scylla-housekeeping Typos: fix typos in code	2023-12-06 07:36:41 +02:00
Yaniv Kaul	ae2ab6000a	Typos: fix typos in code Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255	2023-12-05 15:18:11 +02:00
Patryk Jędrzejczak	c8ee7d4499	db: make schema commitlog feature mandatory Using consistent cluster management and not using schema commitlog ends with a bad configuration throw during bootstrap. Soon, we will make consistent cluster management mandatory. This forces us to also make schema commitlog mandatory, which we do in this patch. A booting node decides to use schema commitlog if at least one of the two statements below is true: - the node has `force_schema_commitlog=true` config, - the node knows that the cluster supports the `SCHEMA_COMMITLOG` cluster feature. The `SCHEMA_COMMITLOG` cluster feature has been added in version 5.1. This patch is supposed to be a part of version 6.0. We don't support a direct upgrade from 5.1 to 6.0 because it skips two versions - 5.2 and 5.4. So, in a supported upgrade we can assume that the version which we upgrade from has schema commitlog. This means that we don't need to check the `SCHEMA_COMMITLOG` feature during an upgrade. The reasoning above also applies to Scylla Enterprise. Version 2024.2 will be based on 6.0. Probably, we will only support an upgrade to 2024.2 from 2024.1, which is based on 5.4. But even if we support an upgrade from 2023.x, this patch won't break anything because 2023.1 is based on 5.2, which has schema commitlog. Upgrades from 2022.x definitely won't be supported. When we populate a new cluster, we can use the `force_schema_commitlog=true` config to use schema commitlog unconditionally. Then, the cluster feature check is irrelevant. This check could fail because we initiate schema commitlog before we learn about the features. The `force_schema_commitlog=true` config is especially useful when we want to use consistent cluster management. Failing feature checks would lead to crashes during initial bootstraps. Moreover, there is no point in creating a new cluster with `consistent_cluster_management=true` and `force_schema_commitlog=false`. 
It would just cause some initial bootstraps to fail, and after successful restarts, the result would be the same as if we used `force_schema_commitlog=true` from the start. In conclusion, we can unconditionally use schema commitlog without any checks in 6.0 because we can always safely upgrade a cluster and start a new cluster. Apart from making schema commitlog mandatory, this patch adds two changes that are its consequences: - making the unneeded `force_schema_commitlog` config unused, - deprecating the `SCHEMA_COMMITLOG` feature, which is always assumed to be true. Closes scylladb/scylladb#16254	2023-12-04 21:02:16 +02:00
Piotr Smaroń	5fd30578d7	config: introduce value_status::Deprecated Current mechanism to deprecate config options is implemented in a hacky way in `main.cpp` and doesn't account for existing `db::config/boost::po` API controlling lifetime of config options, hence it's being replaced in this PR by adding yet another `value_status` enumerator: `Deprecated`, so that deprecation of config options is controlled in one place in `config.cc`, i.e. when specifying config options. Motivation: https://docs.google.com/document/d/18urPG7qeb7z7WPpMYI2V_lCOkM5YGKsEU78SDJmt8bM/edit?usp=sharing With this change, if a `Deprecated` config option is specified as 1. a command line parameter, scylla will run and log: ``` WARN 2023-11-25 23:37:22,623 [shard 0:main] init - background-writer-scheduling-quota option ignored (deprecated) ``` (Previously it was only a message printed to standard output, not a scylla log of warn level). 2. an option in `scylla.yaml`, scylla will run and log: ``` WARN 2023-11-27 23:55:13,534 [shard 0:main] init - Option is deprecated : background_writer_scheduling_quota ``` Fixes #15887 Incorporates dropped https://github.com/scylladb/scylladb/pull/15928 Closes scylladb/scylladb#16184	2023-11-30 08:52:57 +03:00
Avi Kivity	8e9d3af431	Merge 'Commitlog: complete prerequisites and enforce hard limit by default' from Eliran Sinvani This miniset completes the prerequisites for enabling the commitlog hard limit by default. Namely, start flushing and evacuating segments halfway to the limit in order to never hit it under normal circumstances. It is worth mentioning that hitting the limit is an exceptional condition whose root cause needs to be resolved; however, once we do hit the limit, the performance impact that is inflicted as a result of this enforcement is irrelevant. Tests: unit tests. LWT write test (#9331) Whitebox testing has been performed by @wmitros. The test aimed at putting as much pressure as possible on the commitlog segments by using a write pattern that rewrites the partitions in the memtable, keeping it at ~85% occupancy so the dirty memory manager will not kick in. The test compared 3 configurations: 1. The default configuration 2. Hard limit on (without changing the flush threshold) 3. The changes in this PR applied. The last exhibited the "best" behavior in terms of metrics: its graphs were the flattest and least jagged of the three. Closes scylladb/scylladb#10974 * github.com:scylladb/scylladb: commitlog: enforce commitlog size hard limit by default commitlog: set flush threshold to half of the limit size commitlog: unfold flush threshold assignment	2023-11-29 20:55:53 +02:00
Benny Halevy	66ba983fe0	compaction_manager: flush_all_tables before major compaction Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See `64ec1c6ec6`. However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (See `f42eb4d1ce`). Flushing all sstables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction. However, flushing all tables too frequently might result in tiny sstables. Since when flushing all keyspaces using `nodetool flush` the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours). In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it. Fixes scylladb/scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Kefu Chai	6749d963ed	config: define formatter for db::seed_provider_type before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for db::seed_provider_type. please note, we are still formatting vector<db::seed_provider_type> with the helper provided by seastar/core/sstring.hh, which uses operator<<() to print the elements in the vector being printed. so we have to keep the operator<< formatter before disabling the generic formatter for vector<T>. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16138	2023-11-23 11:04:35 +02:00
Eliran Sinvani	bfa839ce92	commitlog: enforce commitlog size hard limit by default Since the commitlog size hard limit is a failsafe mechanism, we don't expect to ever hit it. If we do hit the limit, it means that we have an exceptional condition in the system. Hence, the impact of enforcing the commitlog hard limit is irrelevant. Here we enforce the limit by default. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2023-11-22 08:48:28 +02:00
Piotr Smaroń	8c464b2ddb	guardrails: restrict replication strategy (RS) Replacing `restrict_replication_simplestrategy` config option with 2 config options: `replication_strategy_{warn,fail}_list`, which allow us to impose soft limits (issue a warning) and hard limits (not execute CQL) on replication strategy when creating/altering a keyspace. The reason to rather replace than extend `restrict_replication_simplestrategy` config option is that it was not used and we wanted to generalize it. Only soft guardrail is enabled by default and it is set to SimpleStrategy, which means that we'll generate a CQL warning whenever replication strategy is set to SimpleStrategy. For new cloud deployments we'll move SimpleStrategy from warn to the fail list. Guardrails violations will be tracked by metrics. Resolves #5224 Refs #8892 (the replication strategy part, not the RF part) Closes scylladb/scylladb#15399	2023-10-31 18:34:41 +03:00
David Garcia	1121a4df04	docs: add groups to reference docs fix: comment Closes scylladb/scylladb#15592	2023-10-04 11:42:36 +03:00

316 Commits