scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 15:33:15 +00:00

Author	SHA1	Message	Date
Piotr Dulikowski	cb4a4f2caf	topology_coordinator: compute cluster size correctly during upgrade During upgrade to raft topology, information about service levels is copied from the legacy tables in system_distributed to the raft-managed tables of group 0. system_distributed has RF=3, so if the cluster has only one or two nodes we should use a lower consistency level than ALL - and the current procedure does exactly that, it selects QUORUM in case of two nodes and ONE in case of only one node. The cluster size is determined based on the call to _gossiper.num_endpoints(). Despite its name, gossiper::num_endpoints() does not necessarily return the number of nodes in the cluster but rather the number of endpoint states in gossiper (this behavior is documented in a comment near the declaration of this function). In some cases, e.g. after gossiper-based nodetool remove, the state might be kept for some time after removal (3 days in this case). The consequence of this is that gossiper::num_endpoints() might return more than the current number of nodes during upgrade, and that in turn might cause migration of data from one table to another to fail - causing the upgrade procedure to get stuck if there are only one or two nodes in the cluster. In order to fix this, use token_metadata::get_all_endpoints() as a measure of the cluster size. Fixes: scylladb/scylladb#18198	2024-04-29 13:26:29 +02:00
Kefu Chai	4433d2e10e	build: cmake: let iotune depends on config specific file before this change, in order to build `${iotune_path}`, we use the rule to build `app_iotune` but this target is built using the default build type, see https://cmake.org/cmake/help/latest/variable/CMAKE_DEFAULT_BUILD_TYPE.html#variable:CMAKE_DEFAULT_BUILD_TYPE so, if we want to build `${iotune_path}` for the configuration which is not listed as the first item in `CMAKE_CONFIGURATION_TYPES`, we would end up with copying a nonexistent file. to address this issue, we override this behavior using the `$<OUTPUT_CONFIG:...>` generator-expression. so that we can depend on non-unique path. and the file-level dependency between ${iotune_path} and $<CONFIG>/iotune can be established. see also https://cmake.org/cmake/help/latest/generator/Ninja%20Multi-Config.html#custom-commands Refs #2717 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18395	2024-04-29 09:06:39 +03:00
Kefu Chai	f03f69ad4f	partition_version: move the base class in move ctor before this change, `partition_version` uses a hand-crafted move constructor. but it suffers from the warning from clang-tidy, which believes there is a use-after-move issue, as the inner instance of its parent class is constructed using `anchorless_list_base_hook(std::move(pv))`, and its other member variables are initialized like `_partition(std::move(pv._partition))` `std::move(pv)` does not do anything, but indicates `pv` may be moved from. and what is moved away is only the part belonging to its parent class. so this issue is benign. but, it's still annoying. as we need to tell the genuine issues reported by clang-tidy from the false alarms. so we have a few options: - stop using clang-tidy - ignore this warning - silence this warning using a NOLINT directive in a comment - use another way to implement the move constructor in this change, we just cast the moved instance to its base class and move it instead, this should appease clang-tidy. Fixes #18354 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18359	2024-04-28 18:34:45 +02:00
Kamil Braun	d8313dda43	Merge 'db: config: move consistent-topology-changes out of experimental and make it the default for new clusters' from Patryk Jędrzejczak We move consistent cluster management out of experimental and make it the default for new clusters in 6.0. In code, we make the `consistent-topology-changes` flag unused and assumed to be true. In 6.0, the topology upgrade procedure will be manual and voluntary, so some clusters will still be using the gossip-based topology even though they support the raft-based topology. Therefore, we need to continue testing the gossip-based topology. This is possible by using the `force-gossip-topology-changes` flag introduced in scylladb/scylladb#18284. Ref scylladb/scylladb#17802 Closes scylladb/scylladb#18285 * github.com:scylladb/scylladb: docs: raft.rst: update after removing consistent-topology-changes treewide: fix indentation after the previous patch db: config: make consistent-topology-changes unused test: lib: single_node_cql_env: restart a node in noninitial run_in_thread calls test: test_read_required_hosts: run with force-gossip-topology-changes storage_service: join_cluster: replace force_gossip_based_join with force-gossip-topology-changes storage_service: join_token_ring: fix finish_setup_after_join calls	2024-04-26 14:45:29 +02:00
Botond Dénes	b96f28356a	Merge 'api/storage_service: convert runtime_error from repair to http error ' from Kefu Chai in `set_repair()`, despite that the repair is performed asynchronously, we check the options specified by client immediately, and throw `std::runtime_error`, if any of them is not supported. before this change, these unhandled exceptions are translated to HTTP 500 error by the underlying HTTP router. but this is misleading, as these errors are caused by client, not server. in this change, we handle the `runtime_error`, and translate them into `httpd::bad_param_exception`, so that the client can have HTTP 400 (Bad Request) instead of HTTP 500 (Internal Server Error), and with informative error message. for instance, if we apply repair with "small_table_optimization" enabled on a keyspace with tablets enabled. we should have an HTTP error 400 with "The small_table_optimization option is not supported for tablet repair" as the body of the error. this would be much more helpful. Closes scylladb/scylladb#18389 * github.com:scylladb/scylladb: api/storage_service: convert runtime_error from repair to http error repair: change runtime_error to invalid_argument in do_repair_start() api/storage_service: coroutinize set_repair()	2024-04-26 13:27:51 +03:00
Patryk Jędrzejczak	3a100cd16c	test: test_raft_recovery_stuck: ensure raft upgrade procedure failed We have log browsing in test.py now, so we can fix this TODO easily. Closes scylladb/scylladb#18425	2024-04-26 10:16:49 +02:00
Asias He	62a9ecff51	repair: Cleanup repair history status entry for tablet The entry in the repair history map that is used to track repair status internally for each repair job should be removed after the repair job is done. We do the same for vnode repairs. This patch adds the missing automatic history cleanup code which is missed in the initial tablet repair support in commit `54239514af`, which does not support repair history update back then. Refs #17046 Closes scylladb/scylladb#18434	2024-04-26 10:56:45 +03:00
Botond Dénes	044fd7a3ec	Merge 'Move some view updating methods from table to view_update_generator' from Pavel Emelyanov The populate_views() and generate_and_propagate_view_updates() both naturally belong to view_update_generator -- they don't need anything special from table itself, but rather depend on some internals of the v.u.generator itself. Moving them there lets removing the view concurrency semaphore from keyspace and table, thus reducing the cross-components dependencies. Closes scylladb/scylladb#18421 * github.com:scylladb/scylladb: replica: Do not carry view concurrency semaphore pointer around view: Get concurrency semaphore via database, not table view_update_generator: Mark mutate_MV() private view: Move view_update_generator methods' code view: Move table::generate_and_propagate_view_updates into view code view: Move table::populate_views() into view_update_generator class	2024-04-26 10:55:38 +03:00
Botond Dénes	d566eec89a	Merge 'treewide: remove {dclocal_,}read_repair_chance options' from Kefu Chai dclocal_read_repair_chance and read_repair_chance have been removed in Cassandra 3.11 and 4.x, see https://issues.apache.org/jira/browse/CASSANDRA-13910. if we expose these properties via DDL, Cassandra would fail to consume the CQL statement creating the table when performing migration from Scylla to Cassandra 4.x, as the latter does not understand these properties anymore. currently the default values of `dc_local_read_repair_chance` and `read_repair_chance` are both "0". so they are practically disabled, unless user deliberately set them to a value greater than 0. also, as a side effect, Cassandra 4.x has better support of Python3. the cqlsh shipped along with Cassandra 3.11.16 only supports python2.7, see https://github.com/apache/cassandra/blob/cassandra-3.11.16/bin/cqlsh.py it errors out if the system only provides python3 with the error of ``` No appropriate python interpreter found. ``` but modern linux systems do not provide python2 anymore. so, in this change, we deprecate these two options. Fixes #3502 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18087 * github.com:scylladb/scylladb: docs: drop documents related to {,dclocal_}read_repair_chance treewide: remove {dclocal_,}read_repair_chance options	2024-04-26 10:48:47 +03:00
Michał Chojnowski	c1146314a1	docs: clarify that `DELETE` can be used with `USING TIMEOUT` The current text seems to suggest that `USING TIMEOUT` doesn't work with `DELETE` and `BATCH`. But that's wrong. Closes scylladb/scylladb#18424	2024-04-26 10:48:17 +03:00
Pavel Emelyanov	4ac30e5337	view-builder: Print correct exception in built ste exception handler Inside .handle_exception() continuation std::current_exception() doesn't work, there's std::exception ex argument to handler's lambda instead fixes #18423 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18349	2024-04-26 09:58:45 +03:00
Kefu Chai	0bbaded4ce	api/storage_service: convert runtime_error from repair to http error in `set_repair()`, despite that the repair is performed asynchronously, we check the options specified by client immediately, and throw `std::runtime_error`, if any of them is not supported. before this change, these unhandled exceptions are translated to HTTP 500 error by the underlying HTTP router. but this is misleading, as these errors are caused by client, not server. and the error message is missing in the HTTP error message when performing the translation. in this change, we handle the `runtime_error`, and translate them into `httpd::bad_param_exception`, so that the client can have HTTP 400 (Bad Request) instead of HTTP 500 (Internal Server Error), and with informative error message. for instance, if we apply repair with "small_table_optimization" enabled on a keyspace with tablets enabled. we should have an HTTP error 400 with "The small_table_optimization option is not supported for tablet repair" as the body of the error. this would be much more helpful. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-26 14:25:15 +08:00
Kefu Chai	9de9f401a1	repair: change runtime_error to invalid_argument in do_repair_start() if an error is caused by the option provided by user, it would be better to throw an `std::invalid_argument` instead of `std::runtime_error`, so that the caller can make a better decision when handling the thrown exceptions. so, in this change, we change the exceptions raised directly in `repair_service::do_repair_start()` from `std::runtime_error` to `std::invalid_argument`. please note, in the lambda named `host2ip`, the hostname is not provided by user, so we are not changing the exception type in that lambda. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-26 14:24:45 +08:00
Kefu Chai	d737ba1ab2	api/storage_service: coroutinize set_repair() before this change, `set_repair()` uses a lambda for handling the client-side requests. and this works great. but the underlying `repair_start()` throws if any of the given options is not sane. and we don't handle any of these thrown exceptions in `set_repair()`, from client's point of view, it would get an HTTP 500 error code, which implies an "Internal Server Error". but actually, we should blame the client for the error, not the server. so, to prepare the error handling, let's take the opportunity to coroutinize the lambda handling the request, so that we can handle the exception in a more elegant way. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-26 14:24:03 +08:00
Botond Dénes	7cbe5c78b4	install.sh: use the native nodetool directly * tools/java b810e8b00e...4ee15fd9ea (1): > install.sh: don't install nodetool into /usr/bin Add a bin/nodetool and install it to bin/ in install.sh. This script simply forwards to scylla nodetool and it is the replacement for the Java nodetool, which is dropped from the java-tools's install.sh, in the submodule update also included in this patch. With this change, we now hardwire the usage of the native nodetool, as the nodetool, with the intermediary nodetool wrapper script removed from the picture. Bash completion was copied from the java tools repository and it is now installed by the scylla package, together with nodetool. The Java nodetool is still available as a fall-back, in case the native nodetool has problems, at the path of /opt/scylladb/share/cassandra/bin/nodetool. Testing I tested upgrades on a DEB and RPM distro: Ubuntu and Fedora. First I installed scylla-5.4, then I installed the packages for this PR. On Ubuntu, I had to use dpkg -i --auto-deconfigure, otherwise, dpkg would refuse to install the new packages because they break the old ones. No extra flags were required on Fedora. In both cases, /usr/bin/nodetool was changed from a thunk calling the Java nodetool (from 5.4) to the native launcher script from this PR. /opt/scylladb/share/cassandra/bin/nodetool remained in place and still works after the upgrade. I also verified that --nonroot installs also work. Nodetool works both when called with an absolute path, or when ~/scylladb/bin is added to $PATH. Fixes: #18226 Fixes: #17412 Closes scylladb/scylladb#18255 [avi: reset submodule to actual hash we ended up with]	2024-04-25 22:52:00 +03:00
Avi Kivity	c2b8ca7d71	Merge 'cql3: statements: change default tombstone_gc mode for tablets' from Aleksandra Martyniuk Repair may miss some tablets that migrated across nodes. So if tombstones expire after some timeout, then we can have data resurrection. Set default tombstone_gc mode to "repair" for tables which use tablets (if repair is required). Fixes: #16627. Closes scylladb/scylladb#18013 * github.com:scylladb/scylladb: test: check default value of tombstone_gc test: topology: move some functions to util.py cql3: statements: change default tombstone_gc mode for tablets	2024-04-25 19:18:37 +03:00
Lakshmi Narayanan Sreethar	6af2659b57	sstables: reclaim_memory_from_components: do not update _recognised_components When reclaiming memory from bloom filters, do not remove them from _recognised_components, as that leads to the on-disk filter component being left back on disk when the SSTable is deleted. Fixes #18398 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb/scylladb#18400	2024-04-25 19:15:59 +03:00
Raphael S. Carvalho	4a5fdc5814	table: Remove outdated FIXME about sstable spanning multiple tablets The FIXME was added back then because we thought the interface of compaction_group_for_sstable might have to be adjusted if a sstable were allowed to temporarily span multiple tablets until it's split, but we have gone a different path. If a sstable's key range incorrectly spans more than one tablet, that will be considered a bug and an exception is thrown. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#18410	2024-04-25 17:21:11 +03:00
Marcin Maliszkiewicz	7085339f72	cql3: test: include get_mutations_internal log in test.py We have a concurrent modification conflict in tests and suspect duplicated requests but since we don't log successful requests we have no way to verify if that's the case. get_mutations_internal log will help to tell which nodes are trying to push auth or service levels mutations into raft. Refs scylladb/scylladb#18319 Closes scylladb/scylladb#18413	2024-04-25 17:17:53 +03:00
Botond Dénes	0234b4542a	Merge '[github] add PR template and action to verify PR tasks was completed' from Yaron Kaikov Today with the backport automation, the developer added the relevant backport label, but without any explanation of why Adding the PR template with a placeholder for the developer to add his decision about backport yes or no The placeholder is marked as a task, so once the explanation is added, the task must be checked as completed Also adding another check to the PR summary will make it clear to the maintainer/reviewer if the developer explained about backport Closes scylladb/scylladb#18275 * github.com:scylladb/scylladb: [github] add action to verify PR tasks was completed [github] add PR template	2024-04-25 17:14:50 +03:00
Pavel Emelyanov	18cc2cfa31	replica: Generalize snapshot details for single table/snapshot dir There are two places that get total:live stats for a table snapshot -- database::get_snapshot_details() and table::get_snapshot_details(). Both do a pretty similar thing -- walk the table/snapshots/ directory, then each of the found sub-directories, and accumulate the found files' sizes into the snapshot details structure. Both try to tell total from live sizes by checking whether an sstable component found in snapshots is present in the table datadir. The database code does it in a more correct way -- it not only checks the file presence, but also compares if it's a hardlink on the snapshot file, while the table code just checks if a file of the same name exists. This patch does both -- makes both database and table call the same helper method for a single snapshot's details, and makes the generalized version use the more elaborate collision check, thus fixing the per-table details getting behavior. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18347	2024-04-25 17:12:42 +03:00
Asias He	1ca779d287	streaming: Fix use after move in fire_stream_event The event is used in a loop. Found by clang-tidy: ``` streaming/stream_result_future.cc:80:49: warning: 'event' used after it was moved [bugprone-use-after-move] listener->handle_stream_event(std::move(event)); ^ streaming/stream_result_future.cc:80:39: note: move occurred here listener->handle_stream_event(std::move(event)); ^ streaming/stream_result_future.cc:80:49: note: the use happens in a later loop iteration than the move listener->handle_stream_event(std::move(event)); ^ ``` Fixes #18332 Closes scylladb/scylladb#18333	2024-04-25 16:48:54 +03:00
Botond Dénes	2c8bd99cd4	Merge 'Coroutinize view_builder::stop()' from Pavel Emelyanov It's pretty straightforward, but prior to that, exception handling needs some care Closes scylladb/scylladb#18378 * github.com:scylladb/scylladb: view-builder: Coroutinize stop() view_builder: Do not try to handle step join exceptions on stop	2024-04-25 16:48:25 +03:00
Kefu Chai	014a069ed2	build: cmake: require {fmt} >= 9.0.0 we are using `fmt::ostream_formatter` which was introduced in {fmt} v9.0.0, see https://github.com/fmtlib/fmt/releases/tag/9.0.0 . before this change, we depend on Seastar to find {fmt}. but the minimal version of {fmt} required by Seastar is 5.0.0, which cannot fulfill the needs to build scylladb. in this change, we find {fmt} package in scylla, and specify the minimal required version of 9.0.0, so the build can fail at the configuration time. {fmt} v8 could be still being used by users. for instance, ubuntu:jammy comes with libfmt-dev 8.1.1. and ubuntu:jammy is EOL in Apr 2027, see https://ubuntu.com/about/release-cycle . Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18386	2024-04-25 16:35:08 +03:00
Amnon Heiman	dfea50a7e9	db/config.cc add metric family config from file Metric family config lets a user configure the metric family aggregate labels. This patch modifies the existing relable-config from file to accept metric family config. Similar to the existing relable_config, it adds a metric_family_configs section. For example, the following configuration demonstrates changing aggregate labels by name and regular expression. ``` metric_family_configs: - name: storage_service aggregate_labels: [shard] - regex: (storage_proxy.*) aggregate_labels: [shard, scheduling_group_name] ``` Signed-off-by: Amnon Heiman <amnon@scylladb.com> Closes scylladb/scylladb#18339	2024-04-25 16:03:39 +03:00
Kefu Chai	e9b31cb4c1	test: locator_topology: s/get0()/get()/ this change addresses the leftover of `9e8805bb49` Refs `9e8805bb49` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18390	2024-04-25 16:03:01 +03:00
Patryk Jędrzejczak	55b011902e	docs: raft.rst: update after removing consistent-topology-changes	2024-04-25 14:33:21 +02:00
Patryk Jędrzejczak	0d428a3857	treewide: fix indentation after the previous patch	2024-04-25 14:33:21 +02:00
Patryk Jędrzejczak	3a34bb18cd	db: config: make consistent-topology-changes unused We make the `consistent-topology-changes` experimental feature unused and assumed to be true in 6.0. We remove code branches that executed if `consistent-topology-changes` was disabled.	2024-04-25 14:33:21 +02:00
Patryk Jędrzejczak	77342ffb34	test: lib: single_node_cql_env: restart a node in noninitial run_in_thread calls In the following commit, we make the `consistent-topology-changes` experimental feature unused. Then, all unit tests in the boost suite will start using the raft-based topology by default. Unfortunately, tests with multiple `single_node_cql_env::run_in_thread` calls (usually coming from the `do_with_cql_env_thread` calls) would fail. In a noninitial `run_in_thread` call, a node is started as if it booted for the first time. On the other hand, it has its persistent state from previous boots. Hence, the node can behave strangely and unexpectedly. In particular, `SYSTEM.TOPOLOGY` is not empty and the assertion that expects it to be empty when we boot for the first time fails. We fix this issue by making noninitial `run_in_thread` calls behave as normal restarts. After this change, `test_schema_digest_does_not_change_with_disabled_features` starts failing. This test copies the data directory before booting for the first time, so the new `_sys_ks.local().build_bootstrap_info().get();` makes the node incorrectly think it restarts. Then, after noticing it is not a part of group 0, the node would start the raft upgrade procedure if we didn't run it in the raft RECOVERY mode. This procedure would get stuck because it depends on messaging being enabled even if the node communicates only with itself and messaging is disabled in boost tests.	2024-04-25 14:33:21 +02:00
Patryk Jędrzejczak	88038d958a	test: test_read_required_hosts: run with force-gossip-topology-changes In one of the following commits, we make the `consistent-topology-changes` experimental feature unused. Then, all unit tests in the boost suite will start using the raft-based topology by default. Unfortunately, some tests would start failing and `test_read_required_hosts` is one of them. `tablet_cql_test_config` in `tablets_test.cc` doesn't use `consistent-topology-changes`, so all test cases in this file run incorrectly with the gossip-based topology changes. With `consistent-topology-changes`, only `test_read_required_hosts` fails. The failure happens on `auto table2 = add_table(e).get();`: ``` ERROR 2024-04-17 11:14:16,083 [shard 0:main] load_balancer - Replica 9b94d710-fbfb-11ee-9c4f-448617b47e11:0 of tablet 9b94d713-fbfb-11ee-9c4f-448617b47e11:0 not found in topology ``` This test case needs to be investigated and rewritten so that it passes with the raft-based topology. However, we don't want this issue to block the process of making the `consistent-topology-changes` experimental feature unused. We leave a FIXME and we will open a new issue to track it.	2024-04-25 14:33:21 +02:00
Patryk Jędrzejczak	213f2f6882	storage_service: join_cluster: replace force_gossip_based_join with force-gossip-topology-changes The `force_gossip_based_join` error injection does exactly what we expect from `force-gossip-topology-changes` so we can do a simple replacement. We prefer a flag over an error injection because we will use it a lot in CI jobs' configurations, some tests, manual testing etc. It's much more convenient. Moreover, the flag can be used in the release mode, so we re-enable all tests that were disabled in release mode only because of using the `force_gossip_based_join` error injection. The name of the `force-gossip-topology-changes` flag suggests that using it should always successfully force the gossip-based topology or, if forcing is not possible, the booting should fail. We don't want a node with `force-gossip-topology-changes=true` that silently boots in the raft-topology mode. We achieve it by throwing a runtime error from `join_cluster` in two cases: - the node is restarting in the cluster that is using raft topology - the node is joining the cluster that is using raft topology	2024-04-25 14:33:21 +02:00
Patryk Jędrzejczak	d6ee540efc	storage_service: join_token_ring: fix finish_setup_after_join calls The `topology_change_enabled` parameter of `finish_setup_after_join` is used underneath to enable pulling raft topology snapshots in two cases: - when the node joins the cluster that uses the raft-based topology, - when the SUPPORTS_CONSISTENT_TOPOLOGY_CHANGES feature is enabled. The first case happens in the first changed call. `_raft_experimental_topology` always equals true there. The second call was incorrect as it could enable pulling snapshots before SUPPORTS_CONSISTENT_TOPOLOGY_CHANGES was enabled. It could cause problems during rolling upgrade to 6.0. For more information see `07aba3abc4`.	2024-04-25 14:33:21 +02:00
Yaron Kaikov	5e63f74984	[github] add action to verify PR tasks was completed Adding another check to the PR summary will make it clear to the maintainer/reviewer if the developer explained about backport	2024-04-25 15:24:22 +03:00
Botond Dénes	aaa76d4c0e	Merge 'Getting per-table snapshot size is racy wrt creating new snapshots' from Pavel Emelyanov The API endpoint in question calls table::get_snapshot_detail() which just walks table/snapshots/ directory. This can clash with creating a new snapshot. Database-wide walk is guarded with snapshot-ctl's locking, so should the per-table API do Closes scylladb/scylladb#18414 * github.com:scylladb/scylladb: snapshot: Get per-table snapshot size under snapshot lock snapshot: Move per-table snap API to other snapshot endpoints	2024-04-25 14:57:52 +03:00
Kefu Chai	e5b30ae2ad	partition_version: do not rereference moved variable in `partition_entry::apply_to_incomplete()`, we pass `dst_snp` and `std::move(dst_snp)` to build the capture variable list of a lambda, but the order of evaluation of these variables are unspecified. fortunately, we haven't run into any issues at this moment. but this is not future-proof. so, let's avoid this by storing a reference of the dereferenced smart pointer, and use it later on. this issue is identified by clang-tidy: ``` /home/kefu/dev/scylladb/mutation/partition_version.cc:500:53: warning: 'dst_snp' used after it was moved [bugprone-use-after-move] 500 \| cur = partition_snapshot_row_cursor(s, dst_snp), \| ^ /home/kefu/dev/scylladb/mutation/partition_version.cc:502:23: note: move occurred here 502 \| dst_snp = std::move(dst_snp), \| ^ /home/kefu/dev/scylladb/mutation/partition_version.cc:500:53: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated 500 \| cur = partition_snapshot_row_cursor(s, dst_snp), \| ^ /home/kefu/dev/scylladb/mutation/partition_version.cc:501:57: warning: 'src_snp' used after it was moved [bugprone-use-after-move] 501 \| src_cur = partition_snapshot_row_cursor(s, src_snp, can_move), \| ^ /home/kefu/dev/scylladb/mutation/partition_version.cc:504:23: note: move occurred here 504 \| src_snp = std::move(src_snp), \| ^ /home/kefu/dev/scylladb/mutation/partition_version.cc:501:57: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated 501 \| src_cur = partition_snapshot_row_cursor(s, *src_snp, can_move), \| ^ ``` Fixes #18360 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18361	2024-04-25 14:57:52 +03:00
Pavel Emelyanov	8aaa09ee97	replica: Do not carry view concurrency semaphore pointer around Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 14:27:43 +03:00
Pavel Emelyanov	2ee7c41139	view: Get concurrency semaphore via database, not table The _view_update_concurrency_sem field on database propagates itself via keyspace config down to table config and view_update_generator then grabs one via table:: helper. That's overkill, view_update_generator has a direct reference to the database and can get this semaphore from there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 14:25:57 +03:00
Pavel Emelyanov	3d8b572d96	view_update_generator: Mark mutate_MV() private Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 14:25:40 +03:00
Pavel Emelyanov	bc4552740f	view: Move view_update_generator methods' code Now when the two methods belong to another class, move the code itself to db/view , where the class itself resides. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 14:24:20 +03:00
Pavel Emelyanov	c2bf6b43b2	view: Move table::generate_and_propagate_view_updates into view code Similarly to populate_views() method, this one also naturally belongs to view_update_generator class. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 14:20:06 +03:00
Pavel Emelyanov	670c7c925c	view: Move table::populate_views() into view_update_generator class The method in question has little to do with table, effectively it only needs stats and concurrency semaphore. And the semaphore in question is obtained from table indirectly, it really resides on database. On the other hand, the method carries lots of bits from db::view, e.g. the view_update_builder class, memory_usage_of() helper and a bit more. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 14:17:20 +03:00
Kefu Chai	e5bcea6718	docs: drop documents related to {,dclocal_}read_repair_chance since "read_repair_chance" and "dclocal_read_repair_chance" are removed, and not supported anymore. let's stop documenting them. Refs #3502 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-25 17:15:27 +08:00
Kefu Chai	c323c93fa4	treewide: remove {dclocal_,}read_repair_chance options dclocal_read_repair_chance and read_repair_chance have been removed in Cassandra 3.11 and 4.x, see https://issues.apache.org/jira/browse/CASSANDRA-13910. if we expose the properties via DDL, Cassandra would fail to consume the CQL statement creating the table when performing migration from Scylla to Cassandra 4.x, as the latter does not understand these properties anymore. currently the default values of `dc_local_read_repair_chance` and `read_repair_chance` are both "0". so they are practically disabled, unless user deliberately set them to a value greater than 0. also, as a side effect, Cassandra 4.x has better support of Python3. the cqlsh shipped along with Cassandra 3.11.16 only supports python2.7, see https://github.com/apache/cassandra/blob/cassandra-3.11.16/bin/cqlsh.py it errors out if the system only provides python3 with the error of ``` No appropriate python interpreter found. ``` but modern linux systems do not provide python2 anymore. so, in this change, we deprecate these two options. Fixes #3502 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-25 17:15:27 +08:00
Botond Dénes	ca26899c36	Merge 'sstable: large data handler needs to count range tombstones as rows' from Ferenc Szili When issuing warnings about partitions with the number of rows above a configured threshold, the large partitions handler does not take into consideration the number of range tombstone markers in the total rows count. This fix adds the number of range tombstone markers to the total number of rows and saves this total in system.large_partitions.rows (if it is above the threshold). It also adds a new column range_tombstones to the system.large_partitions table which only contains the number of range tombstone markers for the given partition. This PR fixes the first part of issue #13968 It does not cover distinguishing between live and dead rows. A subsequent PR will handle that. Closes scylladb/scylladb#18346 * github.com:scylladb/scylladb: sstables: add docs changes for system.large_partitions sstable: large data handler needs to count range tombstones as rows	2024-04-25 11:38:30 +03:00
Pavel Emelyanov	e97abfc473	tablets: Fix indentation after flat-hash-map patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18364	2024-04-25 11:36:37 +03:00
Kefu Chai	0b5a861961	build: cmake: reference build_mode with ${scylla_build_mode_${CMAKE_BUILD_TYPE}} before this change, if we generate the building system with plain `Ninja`, instead of `Ninja Multi-Config` using cmake, the build fails, because `${scylla_build_mode_${CMAKE_BUILD_TYPE}}` is not defined. so the profile used for building the rust library would be "rust-", which does not match any of the profiles defined by `Cargo.toml`. in this change, we use `$CMAKE_BUILD_TYPE` instead of "$config". as the former is defined for non-multi generator. while the latter is. see https://cmake.org/cmake/help/latest/generator/Ninja%20Multi-Config.html with this change, we are able to generate the building system properly with the "Ninja" generator. if we just want to run some static analyzer against the source tree or just want to build scylladb with a single configuration, the "Ninja" generator is a good fit. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18353	2024-04-25 10:51:54 +03:00
Pavel Emelyanov	ae4c1c44ec	snapshot: Get per-table snapshot size under snapshot lock Walking per-table snapshot directory without lock is racy. There's snapshot-ctl locking that's used to get db-wide snapshot details, it should be used to get per-table snapshot details too Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 10:05:51 +03:00
Pavel Emelyanov	186b36165e	snapshot: Move per-table snap API to other snapshot endpoints So that they are collected in one place and to facilitate next patch that's going to use snapshot-ctl for per-table API too Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 10:05:01 +03:00
Anna Stuchlik	b5d256a991	doc: add Scylla Doctor to the docs This commit adds the description and usage instructions of Scylla Doctor to the "How to Report a ScyllaDB Problem" page. Scylla Doctor replaces Health Check Report, so the description of and references to the latter are removed with this commit. Fixes https://github.com/scylladb/scylladb/issues/16276 Closes scylladb/scylladb#17617	2024-04-25 09:50:38 +03:00
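
Note on cb4a4f2caf (first entry above): the message describes choosing ONE, QUORUM or ALL from the cluster size, and why a count taken from gossiper endpoint states can overshoot. The following is a minimal sketch of that reasoning only. `consistency_level`, `cluster_view`, `pick_cl` and `cl_for_upgrade` are hypothetical stand-ins invented for illustration; they are not ScyllaDB's real types or functions.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a consistency level enum; not ScyllaDB's type.
enum class consistency_level { one, quorum, all };

// Mapping described in the commit message for a keyspace with RF=3:
// ONE for a single node, QUORUM for two nodes, ALL otherwise.
consistency_level pick_cl(std::size_t cluster_size) {
    if (cluster_size <= 1) {
        return consistency_level::one;
    }
    if (cluster_size == 2) {
        return consistency_level::quorum;
    }
    return consistency_level::all;
}

// Hypothetical model of the two sources of "cluster size".
struct cluster_view {
    // Endpoint states kept by gossip; may still list removed nodes.
    std::unordered_set<std::string> gossip_endpoint_states;
    // Endpoints that currently own tokens, i.e. the actual members.
    std::unordered_set<std::string> token_owners;
};

// Size the cluster from token ownership rather than from gossip state,
// so that stale entries cannot push the level up to ALL.
consistency_level cl_for_upgrade(const cluster_view& v) {
    return pick_cl(v.token_owners.size());
}
```

With one live node and two stale gossip entries, `pick_cl(v.gossip_endpoint_states.size())` would select ALL, which a single node cannot satisfy for RF=3 data, while `cl_for_upgrade(v)` selects ONE.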

42363 Commits
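
Note on 1ca779d287 (streaming: Fix use after move in fire_stream_event): clang-tidy flagged an object that is moved inside a loop body and then used again on the next iteration. A minimal sketch of the pattern and one possible remedy follows. `stream_event`, `listener` and this `fire_stream_event` are simplified, hypothetical stand-ins, and passing a copy per listener is an assumption made for illustration, not necessarily the change made in the commit.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified event type.
struct stream_event {
    std::string description;
};

// Hypothetical listener that takes the event by value and keeps its text.
struct listener {
    std::vector<std::string> seen;
    void handle_stream_event(stream_event ev) {
        seen.push_back(std::move(ev.description));
    }
};

// Passing std::move(event) inside the loop would leave a moved-from object
// for every listener after the first. Here each listener gets its own copy.
void fire_stream_event(std::vector<listener>& listeners, const stream_event& event) {
    for (auto& l : listeners) {
        l.handle_stream_event(event); // copy per iteration, no move
    }
}
```

Taking the event by const reference makes an accidental move in the loop body harder to write, at the cost of one copy per listener.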