scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 12:17:02 +00:00

Author	SHA1	Message	Date
Kefu Chai	6fdb124914	dist: drop %pretrans section before this change, if user does not have `/bin/sh` around, when installing scylla packages, the script in `%pretrans` is executed, and fails due to missing `/bin/sh`. per https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/#pretrans > Note that the %pretrans scriptlet will, in the particular case of > system installation, run before anything at all has been installed. > This implies that it cannot have any dependencies at all. For this > reason, %pretrans is best avoided, but if used it MUST (by necessity) > be written in Lua. See > https://rpm-software-management.github.io/rpm/manual/lua.html for more > information. but we were trying to warn users upgrading from scylla < 1.7.3, which was released 7 years ago at the time of writing. in this change, we drop the `%pretrans` section. hopefully they will find their way out if they still exist. Fixes scylladb/scylladb#20321 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> (cherry picked from commit `6970c502c9`) Closes scylladb/scylladb#20386	2024-09-10 11:46:30 +03:00
Avi Kivity	093bff385b	docs: cql: document ZstdCompressor for CREATE TABLE Adjust the wording slightly to be less awkward. (cherry picked from commit `60acfd8c08`) Closes scylladb/scylladb#20381	2024-09-10 11:45:55 +03:00
Raphael S. Carvalho	a4f6811d5f	storage_service: avoid processing same table unnecessarily in split monitor If there's a token metadata for a given table, and it is in split mode, it will be registered such that split monitor can look at it, for example, to start split work, or do nothing if table completed it. during topology change, e.g. drain, split is stalled since it cannot take over the state machine. It was noticed that the log is being spammed with a message saying the table completed split work, since every tablet metadata update, means waking up the monitor on behalf of a table. So it makes sense to demote the logging level to debug. That persists until drain completes and split can finally complete. Another thing that was noticed is that during drain, a table can be submitted for processing faster than the monitor can handle, so the candidate queue may end up with multiple duplicated entries for same table, which means unnecessary work. That is fixed by using a sequenced set, which keeps the current FIFO behavior. Fixes #20339. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `26facd807e`) Closes scylladb/scylladb#20344	2024-09-10 11:45:06 +03:00
Botond Dénes	8d5f2d8943	Merge '[Backport 6.0] repair: throw if batchlog manager isn't initialized' from Aleksandra Martyniuk repair_service::repair_flush_hints_batchlog_handler may access batchlog manager while it is uninitialized. Throw if batchlog manager isn't initialized. Fixes: https://github.com/scylladb/scylladb/issues/20236. Needs backport to 6.0 and 6.1 as they suffer from the uninitialized bm access. (cherry picked from commit `d8e4393418`) (cherry picked from commit `f38bb6483a`) Refs https://github.com/scylladb/scylladb/pull/20251 Closes scylladb/scylladb#20392 * github.com:scylladb/scylladb: test: add test to ensure repair won't fail with uninitialized bm repair: throw if batchlog manager isn't initialized	2024-09-09 15:13:38 +03:00
Jenkins Promoter	0e5108ed7f	Update ScyllaDB version to: 6.0.4	2024-09-04 15:34:01 +03:00
Kamil Braun	96064e9647	Merge '[Backport 6.0] Fix node replace with inter-dc encryption enabled.' from Gleb Natapov Currently if a coordinator and a node being replaced are in the same DC while inter-dc encryption is enabled (connections between nodes in the same DC should not be encrypted) the replace operation will fail. It fails because a coordinator uses non encrypted connection to push raft data to the new node, but the new node will not accept such connection until it knows which DC the coordinator belongs to and for that the raft data needs to be transferred. The series adds the test for this scenario and the fix for the chicken&egg problem above. The series (or at least the fix itself) needs to be backported because this is a serious regression. Fixes: https://github.com/scylladb/scylladb/issues/19025 (cherry picked from commit `84757a4ed3`) (cherry picked from commit `b98282a976`) (cherry picked from commit `2f1b1fd45e`) (cherry picked from commit `17f4a151ce`) (cherry picked from commit `32a59ba98f`) Refs https://github.com/scylladb/scylladb/pull/20290 Closes scylladb/scylladb#20399 * github.com:scylladb/scylladb: topology coordinator: fix indentation after the last patch topology coordinator: do not add replacing node without a ring to topology test: add test for replace in clusters with encryption enabled test.py: add server encryption support to cluster manager .gitignore: fix pattern for resources to match only one specific directory	2024-09-03 12:24:35 +02:00
Gleb Natapov	8400e6947b	topology coordinator: fix indentation after the last patch (cherry picked from commit `32a59ba98f`)	2024-09-02 17:04:42 +03:00
Gleb Natapov	8510568eda	topology coordinator: do not add replacing node without a ring to topology When only inter dc encryption is enabled a non encrypted connection between two nodes is allowed only if both nodes are in the same dc. If a node that initiates the connection knows that dst is in the same dc and hence uses a non encrypted connection, but the dst does not yet know the topology of the src, such connection will not be allowed since dst cannot guarantee that src is in the same dc. Currently, when topology coordinator is used, a replacing node will appear in the coordinator's topology immediately after it is added to the group0. The coordinator will try to send raft message to the new node and (assuming only inter dc encryption is enabled and replacing node and the coordinator are in the same dc) it will try to open regular, non encrypted, connection to it. But the replacing node will not have the coordinator in its topology yet (it needs to sync the raft state for that), so it will reject such connection. To solve the problem the patch does not add a replacing node that was just added to group0 to the topology. It will be added later, when tokens will be assigned to it. At this point a replacing node will already make sure that its topology state is up-to-date (since it will execute a raft barrier in join_node_response_params handler) and it knows coordinator's topology. This aligns replace behaviour with bootstrap since bootstrap also does not add a node without a ring to the topology. The patch effectively reverts `b8ee8911ca` Fixes: scylladb/scylladb#19025 (cherry picked from commit `17f4a151ce`)	2024-09-02 17:04:42 +03:00
Gleb Natapov	cd324b8513	test: add test for replace in clusters with encryption enabled (cherry picked from commit `2f1b1fd45e`)	2024-09-02 17:04:42 +03:00
Gleb Natapov	d441d93e63	test.py: add server encryption support to cluster manager (cherry picked from commit `b98282a976`)	2024-09-02 17:04:42 +03:00
Gleb Natapov	84c47df5e3	.gitignore: fix pattern for resources to match only one specific directory (cherry picked from commit `84757a4ed3`)	2024-09-02 15:21:11 +03:00
Aleksandra Martyniuk	ca3cbae70b	test: add test to ensure repair won't fail with uninitialized bm (cherry picked from commit `f38bb6483a`)	2024-09-02 10:37:18 +02:00
Aleksandra Martyniuk	3e25eadf12	repair: throw if batchlog manager isn't initialized repair_service::repair_flush_hints_batchlog_handler may access batchlog manager while it is uninitialized. Batchlog manager cannot be initialized before repair as we have the dependencies chain: repair_service -> storage_service::join_cluster -> batchlog_manager. Throw if batchlog manager isn't initialized. That won't cause repair to fail. (cherry picked from commit `d8e4393418`)	2024-08-30 13:55:48 +00:00
Botond Dénes	e33fcfe27b	Merge '[Backport 6.0] Make Summary support histogram with infinite bucket values' from ScyllaDB This series fixes an issue where histogram Summaries return an infinite value. It updates the quantile calculation logic to address cases where values fall into the infinite bucket of a histogram. Now, instead of returning infinite (max int), the calculation will return the last bucket limit, ensuring finite outputs in all cases. The series adds a test for summaries with a specific test case for this scenario. Fixes #20255 Need backport to 6.0, 6.1 and 2023.1 and above (cherry picked from commit `011aa91a8c`) (cherry picked from commit `644e6f0121`) Refs #20257 Closes scylladb/scylladb#20304 * github.com:scylladb/scylladb: test/estimated_histogram_test Add summary tests utils/histogram.hh: Make summary support infinite bucket.	2024-08-29 07:52:36 +03:00
Botond Dénes	0020d37a20	Merge '[Backport 6.0] repair: do_rebuild_replace_with_repair: use source_dc only when safe' from ScyllaDB It is unsafe to restrict the sync nodes for repair to the source data center if it has too low replication factor in network_topology_replication_strategy, or if other nodes in that DC are ignored. Also, this change restricts the usage of source_dc to `network_topology` and `everywhere_topology` strategies, as with simple replication strategy there is no guarantee that there would be any more replicas in that data center. Fixes #16826 Reproducer submitted as https://github.com/scylladb/scylla-dtest/pull/3865 It fails without this fix and passes with it. * Requires backport to live versions. Issue hit in the field with 2022.2.14 (cherry picked from commit `8b1877f3ca`) (cherry picked from commit `0419b1d522`) (cherry picked from commit `b5d0ab092c`) (cherry picked from commit `9729dd21c3`) (cherry picked from commit `8665eef98c`) (cherry picked from commit `5f655e41e3`) Refs #16827 Closes scylladb/scylladb#20229 * github.com:scylladb/scylladb: raft_rebuild: propagate source_dc force option to rebuild_option repair: do_rebuild_replace_with_repair: use source_dc only when safe repair: replace_with_repair: pass the replace_node downstream repair: replace_with_repair: pass ignore_nodes as a set of host_id:s repair: replace_rebuild_with_repair: pass ks_erms from caller nodetool: rebuild: add force option Add and use utils::optional_param to pass source_dc	2024-08-29 07:36:39 +03:00
Botond Dénes	97fe5213f7	Merge '[Backport 6.0] schema_tables: calculate_schema_digest: prevent stalls due to large m…' from ScyllaDB …utations vector With a large number of tables the schema mutations vector might get big enough to cause reactor stalls when freed. For example, the following stall was hit on 2023.1.0~rc1-20230208.fe3cc281ec73 with 5000 tables: ``` (inlined by) ~vector at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_vector.h:730 (inlined by) db::schema_tables::calculate_schema_digest(seastar::sharded<service::storage_proxy>&, enum_set<super_enum<db::schema_feature, (db::schema_feature)0, (db::schema_feature)1, (db::schema_feature)2, (db::schema_feature)3, (db::schema_feature)4, (db::schema_feature)5, (db::schema_feature)6, (db::schema_feature)7> >, seastar::noncopyable_function<bool (std::basic_string_view<char, std::char_traits<char> >)>) at ./db/schema_tables.cc:799 ``` This change returns a mutations generator from the `map` lambda coroutine so we can process them one at a time, destroy the mutations one at a time, and by that, reducing memory footprint and preventing reactor stalls. Fixes #18173 (cherry picked from commit `95a5fba0ea`) (cherry picked from commit `52234214e5`) Refs #18174 Closes scylladb/scylladb#20247 * github.com:scylladb/scylladb: schema_tables: calculate_schema_digest: filter the key earlier schema_tables: calculate_schema_digest: prevent stalls due to large mutations vector	2024-08-29 07:33:28 +03:00
Benny Halevy	81f4036143	raft_rebuild: propagate source_dc force option to rebuild_option Currently, the `force` property of the `source_dc` rebuild option is lost and `raft_topology_cmd_handler` has no way to know if it was given or not. This in turn can cause rebuild to fail, even when `--force` is set by the user, where it would succeed with gossip topology changes, based on the source_dc --force semantics. Fixes scylladb/scylladb#20242 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#20249 (cherry picked from commit `18c45f7502`) Closes scylladb/scylladb#20312	2024-08-28 12:12:15 +03:00
Benny Halevy	18644625a5	repair: do_rebuild_replace_with_repair: use source_dc only when safe It is unsafe to restrict the sync nodes for repair to the source data center if we cannot guarantee a quorum in the data center with network-topology replication strategy. This change restricts the usage of source_dc in the following cases: 1. For SimpleStrategy - source_dc is ignored since there is no guarantee that it contains remaining replicas for all tokens. 2. For EverywhereStrategy - use source_dc if there are remaining live nodes in the datacenter. 3. For NetworkTopologyStrategy: a. It is considered unsafe to use source_dc if number of nodes lost in that DC (replaced/rebuilt node + additional ignored nodes) is greater than 1, or it has 1 lost node and rf <= 1 in the DC. b. If the source_dc arg is forced, as with the new `nodetool rebuild --force <source_dc>` option, we use it anyway, even if it's considered to be unsafe. A warning is printed in this case. c. If the source_dc arg is user-provided, (using nodetool rebuild), an error exception is thrown, advising to use an alternative dc, if available, omit source_dc to sync with all nodes, or use the --force option to use the given source_dc anyhow. d. Otherwise, we look for an alternative source datacenter, that has not lost any node. If such datacenter is found we use it as source_dc for the keyspace, and log a warning. e. If no alternative dc is found (and source_dc is implicit), then: log a warning and fall back to using replicas from all nodes in the cluster. Fixes #16826 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `5f655e41e3`) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-08-28 12:12:15 +03:00
Benny Halevy	453f4b0277	repair: replace_with_repair: pass the replace_node downstream To be used by the next patch to count how many nodes are lost in each datacenter. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `8665eef98c`)	2024-08-28 12:12:15 +03:00
Benny Halevy	ee9202ac1b	repair: replace_with_repair: pass ignore_nodes as a set of host_id:s The callers already pass ignore_nodes as host_id:s and we translate them into inet_address only for repair so delay the translation as much as possible. Refs scylladb/scylladb#6403 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `9729dd21c3`)	2024-08-28 12:12:15 +03:00
Benny Halevy	53abb7fdcc	repair: replace_rebuild_with_repair: pass ks_erms from caller The keyspaces replication maps must be in sync with the token_metadata_ptr passed already to the functions, so instead of getting it in the callee, let the caller get the ks_erms along with retrieving the tmptr. Note that it's already done on the rebuild path for streaming based rebuild. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `b5d0ab092c`)	2024-08-28 12:12:15 +03:00
Benny Halevy	7d63c9c62b	nodetool: rebuild: add force option To be used to force usage of source_dc, even when it is unsafe for rebuild. Update docs and add test/nodetool/test_rebuild.py Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `0419b1d522`)	2024-08-28 12:12:11 +03:00
Benny Halevy	4443a21c2a	Add and use utils::optional_param to pass source_dc Clearly indicate if a source_dc is provided, and if so, was it explicitly given by the user, or was implicitly selected by scylla. This will become useful in the next patches that will use that to either reject the operation if it's unsafe to use the source_dc and the dc was explicitly given by the user, or whether to fallback to using all nodes otherwise. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `8b1877f3ca`)	2024-08-28 12:02:20 +03:00
Botond Dénes	c945adcb67	Merge '[Backport 6.0] select from mutation_fragments() + tablets: handle reads for non-owned partitions' from ScyllaDB Attempting to read a partition via `SELECT * FROM MUTATION_FRAGMENTS()`, which the node doesn't own, from a table using tablets causes a crash. This is because when using tablets, the replica side simply doesn't handle requests for un-owned tokens and this triggers a crash. We should probably improve how this is handled (an exception is better than a crash), but this is outside the scope of this PR. This PR fixes this and also adds a reproducer test. Fixes: https://github.com/scylladb/scylladb/issues/18786 Fixes a regression introduced in 6.0, so needs backport to 6.0 and 6.1 (cherry picked from commit `de5329157c`) (cherry picked from commit `46563d719f`) (cherry picked from commit `4e2d7aa2a2`) Refs #20109 Closes scylladb/scylladb#20314 * github.com:scylladb/scylladb: test/tablets: Test that reading tablets' mutations from MUTATION_FRAGMENTS works replica/mutation_dump: enforce pinning of effective replication map replica/mutation_dump: handle un-owned tokens (with tablets)	2024-08-28 06:40:04 +03:00
Lakshmi Narayanan Sreethar	d3b41635de	test/pylib: fix keyspace_compaction method The `keyspace_compaction` method incorrectly appends the column family parameter to the URL using a regular string, `"?cf={table}"`, instead of an f-string, `f"?cf={table}"`. As a result, the column family name is sent as `{table}` to the server, causing the compaction request to fail. Fix this issue by passing the parameter to the POST request using a dictionary instead of appending it to the URL. Fixes #20264 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit `4823a1e203`) Closes scylladb/scylladb#20275	2024-08-28 06:38:44 +03:00
Michał Chojnowski	0cee97f5c7	cql_test_env: ensure shutdown() before stop() for system_keyspace If system_keyspace::stop() is called before system_keyspace::shutdown(), it will never finish, because the uncleared shared pointers will keep it alive indefinitely. Currently this can happen if an exception is thrown before the construction of the shutdown() defer. This patch moves the shutdown() call to immediately before stop(). I see no reason why it should be elsewhere. Fixes scylladb/scylla-enterprise#4380 (cherry picked from commit `4d77faa61e`) Closes scylladb/scylladb#20147	2024-08-28 06:34:46 +03:00
Lakshmi Narayanan Sreethar	3f23780650	boost/sstable_datafile_test: wait for total memory reclaimed update The testcase `test_bloom_filter_reclaim_during_reload` checks the SSTable manager's `_total_memory_reclaimed` against an expected value to verify that a Bloom filter was reloaded. However, it does not wait for the manager to update the variable, causing the check to fail if the update has not occurred yet. Fix it by making the testcase wait until the variable is updated to the expected value. Fixes #19879 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb/scylladb#19883 (cherry picked from commit `27b305b9d1`) Closes scylladb/scylladb#19963	2024-08-28 06:33:29 +03:00
Benny Halevy	e06be56c28	sstable_directory: delete_atomically: allow sstables from multiple prefixes Currently, delete_atomically can be called with a list of sstables from mixed prefixes in two cases: 1. truncate: where we delete all the sstables in the table directory 2. tablet cleanup: similar to truncate but restricted to sstables in a single tablet replica In both cases, it is possible that sstables in staging (or quarantine) are mixed with sstables in the base directory. Until a more comprehensive fix is in place, (see https://github.com/scylladb/scylladb/pull/19555) this change just lifts the ban on atomic deletion of sstables from different prefixes, and acknowledging that the implementation is not atomic across prefixes. This is better than crashing for now, and can be backported more easily to branches that support tablets so tablet migration can be done safely in the presence of repair of tables with views. Refs scylladb/scylladb#18862 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `26abad23d9`) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#19920	2024-08-28 06:32:12 +03:00
Aleksandra Martyniuk	5276701881	test: tasks: adjust tests to new wait_task behavior After `c1b2b8cb2c` /task_manager/wait_task/ does not unregister tasks anymore. Delete the check if the task was unregistered from test_task_manager_wait. Check task status in drain_module_tasks to ensure that the task is removed from task manager. Fixes: #19351. (cherry picked from commit `dfe3af40ed`) Closes scylladb/scylladb#19840	2024-08-28 06:29:38 +03:00
Łukasz Paszkowski	9698ebe975	api/system: add highest_supported_sstable_format path The current upgrade dtest relies on a ccm node function, get_highest_supported_sstable_version(), that looks for r'Feature (.*)_SSTABLE_FORMAT is enabled' in the log files. Starting from scylla-6.0 ME_SSTABLE_FORMAT is enabled by default and there is no cluster feature for it. Thus get_highest_supported_sstable_version() returns an empty list, resulting in upgrade test failures. This change introduces a separate API path that returns the highest supported sstable format (one of la, mc, md, me) by a scylla node. Fixes scylladb/scylladb#19772 Backports to 6.0 and 6.1 required. The current upgrade test in dtest checks scylla upgrades up to version 5.4 only. This patch is a prerequisite to backport the upgrade tests fix in dtest. (cherry picked from commit `781eb7517c`) Closes scylladb/scylladb#19815	2024-08-28 06:28:55 +03:00
Tomas Nozicka	37a0e149ae	Allow configuring default loglevel with args for container images (cherry picked from commit `26466a3043`) Closes scylladb/scylladb#19702	2024-08-28 06:28:10 +03:00
Avi Kivity	9d3ee6e920	config, enum_option: allow round-trip string conversion The default configuration for replication_strategy_warn_list is ["SimpleStrategy"], but one cannot set this via CQL: cqlsh> select * from system.config where name = 'replication_strategy_warn_list'; name | source | type | value --------------------------------+---------+---------------------------+-------------------- replication_strategy_warn_list | default | replication strategy list | ["SimpleStrategy"] (1 rows) cqlsh> update system.config set value = '[NetworkTopologyStrategy]' where name = 'replication_strategy_warn_list'; cqlsh> select * from system.config where name = 'replication_strategy_warn_list'; name | source | type | value --------------------------------+--------+---------------------------+----------------------------- replication_strategy_warn_list | cql | replication strategy list | ["NetworkTopologyStrategy"] (1 rows) cqlsh> update system.config set value = '["NetworkTopologyStrategy"]' where name = 'replication_strategy_warn_list'; WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed for system.config - received 0 responses and 1 failures from 1 CL=ONE." info={'consistency': 'ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 1} Fix by allowing quotes in enum_set parsing. Bug present since `8c464b2ddb` ("guardrails: restrict replication strategy (RS)", 6.0). Fixes #19604. (cherry picked from commit `45e27c0da2`) Closes scylladb/scylladb#19691	2024-08-28 06:27:08 +03:00
Yaron Kaikov	c20f5c8408	.github/script/label_promoted_commit.py: add label only if ref is PR we got a failure during check-commit action: ``` Run python .github/scripts/label_promoted_commits.py --commit_before_merge `30e82a81e8` --commit_after_merge `f31d5e3204` --repository scylladb/scylladb --ref refs/heads/master Commit sha is: `d5a149fc01` Commit sha is: `415457be2b` Commit sha is: `d3b1ccd03a` Commit sha is: `1fca341514` Commit sha is: `f784be6a7e` Commit sha is: `80986c17c3` Commit sha is: `492d0a5c86` Commit sha is: `7b3f55a65f` Commit sha is: `78d6471ce4` Commit sha is: `7a69d9070f` Commit sha is: `a9e985fcc9` master branch, pr number is: 19213 Traceback (most recent call last): File "/home/runner/work/scylladb/scylladb/.github/scripts/label_promoted_commits.py", line 87, in <module> main() File "/home/runner/work/scylladb/scylladb/.github/scripts/label_promoted_commits.py", line 81, in main pr = repo.get_pull(pr_number) File "/usr/lib/python3/dist-packages/github/Repository.py", line 2746, in get_pull headers, data = self._requester.requestJsonAndCheck( File "/usr/lib/python3/dist-packages/github/Requester.py", line 353, in requestJsonAndCheck return self.__check( File "/usr/lib/python3/dist-packages/github/Requester.py", line 378, in __check raise self.__createException(status, responseHeaders, output) github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/pulls/pulls#get-a-pull-request", "status": "404"} Error: Process completed with exit code 1. ``` The reason for this failure is that one of the promoted commits (`a9e985fcc9`) had a `Closes` reference to an issue. Fixes: https://github.com/scylladb/scylladb/issues/19677 (cherry picked from commit `e33126fc3e`) Closes scylladb/scylladb#19690	2024-08-28 06:26:24 +03:00
Botond Dénes	c33556c0f1	Merge '[Backport 6.0] conf: scylla.yaml: update documentation for tablets' from ScyllaDB Tablets are no longer in experimental_features since `83d491a`, so remove them from the experimental_features section documentation. Also, expand the documentation for the `enable_tablets` option. Fixes #19456 Needs backport to 6.0 (cherry picked from commit `92f8d219b3`) (cherry picked from commit `7f05f95ec4`) Refs #19516 Closes scylladb/scylladb#19688 * github.com:scylladb/scylladb: conf: scylla.yaml: enable_tablets: expand documentation conf: scylla.yaml: remove tablets from experimental_features doc comment	2024-08-28 06:25:26 +03:00
Pavel Emelyanov	e50a8ad209	test/tablets: Test that reading tablets' mutations from MUTATION_FRAGMENTS works Currently it doesn't, one of the nodes crashes with std::out_of_range exception and meaningless calltrace. [Botond]: this test checks the case of reading a partition via MUTATION_FRAGMENTS from a node which doesn't own said partition. refs: #18786 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `4e2d7aa2a2`)	2024-08-27 23:43:14 +00:00
Botond Dénes	12bd371152	replica/mutation_dump: enforce pinning of effective replication map By making it a required argument, making sure the topology version is pinned for the duration of the query. This is needed because mutation dump queries bypass the storage proxy, where this pinning usually takes place. So it has to be enforced here. (cherry picked from commit `46563d719f`)	2024-08-27 23:43:14 +00:00
Botond Dénes	431d4740e3	replica/mutation_dump: handle un-owned tokens (with tablets) When using tablets, the replica-side doesn't handle un-owned tokens. table::shard_for_reads() will just return 0 for un-owned tokens, and a later attempt at calling table::storage_group_for_token() with said un-owned token will cause a crash (std::terminate due to std::out_of_range thrown in noexcept context). The replicas rely on the coordinator to not send stray requests, but for select from mutation_fragments(table) queries, there is no coordinator side who could do the correct dispatching. So do this in mutation_dump(), just creating empty readers for un-owned tokens. (cherry picked from commit `de5329157c`)	2024-08-27 23:43:14 +00:00
Aleksandra Martyniuk	9b5d33ac4f	replica: add/remove table atomically Currently, database::tables_metadata::add_table needs to hold a write lock before adding a table. So, if we update other classes keeping track of tables before calling add_table, and the method yields, table's metadata will be inconsistent. Set all table-related info in tables_metadata::add_table_helper (called by add_table) so that the operation is atomic. Analogously for remove_table. Fixes: #19833. (cherry picked from commit `483d89ed6d`) Closes scylladb/scylladb#20245	2024-08-27 20:47:55 +03:00
Amnon Heiman	043855c574	test/estimated_histogram_test Add summary tests This patch adds tests for summary calculation. It adds two tests, the first is a basic calculation for P50, P95, P99 by adding 100 elements into 20 buckets. The second test checks that if elements are found in the infinite bucket, the result is the lower limit (33s) and not infinite. Relates to #20255 Signed-off-by: Amnon Heiman <amnon@scylladb.com> (cherry picked from commit `644e6f0121`)	2024-08-27 12:12:39 +00:00
Amnon Heiman	615101bd90	utils/histogram.hh: Make summary support infinite bucket. This patch handles an edge case related to the infinite bucket limit. Summaries are the P50, P95, and P99 quantiles. The quantiles are calculated from a histogram; we find the bucket and return its upper limit. In classic histograms, there is a notion of the infinite bucket; anything that does not fall into the last bucket is considered to be infinite; with quantiles, that does not make sense. So instead of reporting infinite we'll report the bucket lower limit. Fixes #20255 Signed-off-by: Amnon Heiman <amnon@scylladb.com> (cherry picked from commit `011aa91a8c`)	2024-08-27 12:12:39 +00:00
Botond Dénes	0509548dbc	Update tools/java submodule * tools/java 6dfc187a...c0735a9d (1): > cassandra-stress: Make default repl. strategy NetworkTopologyStrategy Fixes: scylladb/scylla-tools-java#400 Closes scylladb/scylladb#20200	2024-08-27 07:39:39 +03:00
Benny Halevy	b7de15dc60	schema_tables: calculate_schema_digest: filter the key earlier Currently, each frozen mutation we get from system_keyspace::query_mutations is unfrozen in whole to a mutation and only then we check its key with the provided `accept_keyspace` function. This is wasteful, since they key can be processed directly form the frozen mutation, before taking the toll of unfreezing it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `52234214e5`)	2024-08-22 09:06:28 +00:00
Benny Halevy	7e7a8e44b0	schema_tables: calculate_schema_digest: prevent stalls due to large mutations vector With a large number of table the schema mutations vector might get big enoug to cause reactor stalls when freed. For example, the following stall was hit on 2023.1.0~rc1-20230208.fe3cc281ec73 with 5000 tables: ``` (inlined by) ~vector at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_vector.h:730 (inlined by) db::schema_tables::calculate_schema_digest(seastar::sharded<service::storage_proxy>&, enum_set<super_enum<db::schema_feature, (db::schema_feature)0, (db::schema_feature)1, (db::schema_feature)2, (db::schema_feature)3, (db::schema_feature)4, (db::schema_feature)5, (db::schema_feature)6, (db::schema_feature)7> >, seastar::noncopyable_function<bool (std::basic_string_view<char, std::char_traits<char> >)>) at ./db/schema_tables.cc:799 ``` This change returns a mutations generator from the `map` lambda coroutine so we can process them one at a time, destroy the mutations one at a time, and by that, reducing memory footprint and preventing reactor stalls. Fixes #18173 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `95a5fba0ea`)	2024-08-22 09:06:28 +00:00
Anna Stuchlik	c8bbfbecad	doc: extract the info about tablets defaut to a separate file This commit extracts the information about the default for tables in keyspace creation to a separate file in the _common folder. The file is then included using the scylladb_include_flag directive. The purpose of this commit is to make it possible to include a different file in the scylla-enterprise repo - with a different default. Refs https://github.com/scylladb/scylla-enterprise/issues/4585 (cherry picked from commit `107708434c`) Closes scylladb/scylladb#20222	2024-08-21 11:07:57 +03:00
Avi Kivity	3a67423ac9	Merge '[Backport 6.0] replica: fix copy constructor of tablet_sstable_set' from ScyllaDB Commit `9f93dd9fa3` changed `tablet_sstable_set::_sstable_sets` to be an `absl::flat_hash_map` and in addition, `std::set<size_t> _sstable_set_ids` was added. `_sstable_set_ids` is set up in the `tablet_sstable_set(schema_ptr s, const storage_group_manager& sgm, const locator::tablet_map& tmap)` constructor, but it is not copied in `tablet_sstable_set(const tablet_sstable_set& o)`. This affects the `tablet_sstable_set::tablet_sstable_set` method as it depends on the copy constructor. Since an sstable set can be cloned when a new sstable set is added, the issue causes the ids not to be copied into the new sstable set. It's healed only after compaction, since the sstable set is rebuilt from scratch there. This PR fixes this issue by removing the existing copy constructor of `tablet_sstable_set` to enable the implicit default copy constructor. Fixes #19519 (cherry picked from commit `44583eed9e`) (cherry picked from commit `ec47b50859`) Refs #20115 Closes scylladb/scylladb#20204 * github.com:scylladb/scylladb: boost/sstable_set_test: add testcase to test tablet_sstable_set copy constructor replica: fix copy constructor of tablet_sstable_set	2024-08-19 21:31:33 +03:00
Michał Jadwiszczak	4e236f3392	cql3/statements/create_service_level: forbid creating SL starting with `$` Tenant names starting with `$` are reserved for internal ones. Forbid creating new service level which name starts with `$` and log a warning for existing service levels with `$` prefix. (cherry picked from commit `d729d1b272`) Closes scylladb/scylladb#20198	2024-08-19 16:20:48 +03:00
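A small sketch of the reserved-prefix rule described in this commit. The function `is_reserved_service_level_name` and the `service_level_registry` class are hypothetical names used only for illustration; the real check lives in the CQL statement code, which is not shown here, and the warning for pre-existing `$`-prefixed levels is omitted.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <string_view>

// Names starting with '$' are treated as reserved for internal tenants.
bool is_reserved_service_level_name(std::string_view name) {
    return !name.empty() && name.front() == '$';
}

// Hypothetical in-memory registry, used only to illustrate the check.
class service_level_registry {
    std::set<std::string> _names;

public:
    // Rejects reserved names before anything is stored.
    void create(std::string_view name) {
        if (is_reserved_service_level_name(name)) {
            throw std::invalid_argument(
                "service level names starting with '$' are reserved");
        }
        _names.emplace(name);
        assert(_names.count(std::string(name)) == 1);
    }

    bool contains(std::string_view name) const {
        return _names.count(std::string(name)) != 0;
    }
};
```

Creating `analytics` succeeds, while creating `$internal` throws and leaves the registry unchanged.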
Anna Stuchlik	b127b492cf	doc: fix a link on the RBAC page This commit fixes an external link on the Role Based Access Control page. Fixes https://github.com/scylladb/scylladb/issues/20166 (cherry picked from commit `c56c3ce469`) Closes scylladb/scylladb#20203	2024-08-19 15:30:34 +03:00
Lakshmi Narayanan Sreethar	ab6b8be69a	boost/sstable_set_test: add testcase to test tablet_sstable_set copy constructor Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit `ec47b50859`)	2024-08-19 12:13:11 +00:00
Lakshmi Narayanan Sreethar	5d8543221b	replica: fix copy constructor of tablet_sstable_set Remove the existing copy constructor to enable the use of the implicit copy constructor. This fixes the issue of `_sstable_set_ids` not being copied in the current copy constructor. Fixes #19519 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit `44583eed9e`)	2024-08-19 12:13:10 +00:00
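The class of bug fixed here can be reproduced in a few lines. Both structs below are simplified, hypothetical stand-ins for `tablet_sstable_set` (using `std::map` rather than `absl::flat_hash_map` to stay self-contained): one keeps a hand-written copy constructor that was not updated when the `ids` member was added, the other relies on the implicit copy constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hand-written copy constructor that was not updated when `ids` was added.
struct handwritten_copy_set {
    std::map<std::size_t, std::string> sets;
    std::set<std::size_t> ids;

    handwritten_copy_set() = default;
    handwritten_copy_set(const handwritten_copy_set& o)
        : sets(o.sets) {}  // `ids` is default-initialized, so the copy has none
};

// No user-declared copy constructor: the implicit one copies every member.
struct implicit_copy_set {
    std::map<std::size_t, std::string> sets;
    std::set<std::size_t> ids;
};

// Fill either type in place with one entry and its matching id.
template <typename Set>
void populate(Set& s) {
    s.sets.emplace(7, "sstables of tablet 7");
    s.ids.insert(7);
    assert(s.sets.size() == s.ids.size());
}
```

Copying a populated `handwritten_copy_set` yields an object whose `ids` is empty, while copying an `implicit_copy_set` preserves both members. Removing the stale constructor means future members are copied without further maintenance.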
Avi Kivity	cdae15ced9	Merge '[Backport 6.0] db/view: drop view updates to replaced node marked as left' from ScyllaDB When a node that is permanently down is replaced, it is marked as "left" but it still can be a replica of some tablets. We also don't keep IPs of nodes that have left and the `node` structure for such node returns an empty IP (all zeros) as the address. This interacts badly with the view update logic. The base replica paired with the left node might decide to generate a view update. Because storage proxy still uses IPs and not host IDs, it needs to obtain the view replica's IP and tell the storage proxy to write a view update to that node - so, it chooses 0.0.0.0. Apparently, storage proxy decides to write a hint towards this address - hinted handoff on the other hand operates on host IDs and not IPs, so it attempts to translate the IP back, which triggers an assertion as there is no replica with IP 0.0.0.0. As a quick workaround for this issue just drop view updates towards nodes which seem to have IPs that are all zeros. It would be more proper to keep the view updates as hints and replay them later to the new paired replica, but achieving this right now would require much more significant changes. For now, fixing a crash is more important than keeping views consistent with base replicas. In addition to the fix, this PR also includes a regression test heavily based on the test that @kbr-scylla prepared during his investigation of the issue. Fixes: scylladb/scylladb#19439 This issue can cause multiple nodes to crash at once and the fix is quite small, so I think this justifies backporting it to all affected versions. 6.0 and 6.1 are affected. No need to backport to 5.4 as this issue only happens with tablets, and tablets are experimental there. 
(cherry picked from commit `6af7882c59`) (cherry picked from commit `5ec8c06561`) Refs #19765 Closes scylladb/scylladb#19896 * github.com:scylladb/scylladb: test: regression test for MV crash with tablets during decommission db/view: drop view updates to replaced node marked as left	2024-08-14 22:32:07 +03:00
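The workaround described in this merge, dropping view updates whose target address is all zeros, might be sketched as follows. `ipv4_address`, `view_update_target`, and `drop_unaddressable` are assumptions made for this example and do not correspond to ScyllaDB's actual types or function names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using ipv4_address = std::array<std::uint8_t, 4>;

// True for 0.0.0.0, the address reported for a node that has left.
bool is_all_zeros(const ipv4_address& addr) {
    return std::all_of(addr.begin(), addr.end(),
                       [](std::uint8_t b) { return b == 0; });
}

// Hypothetical pairing of a view update with the replica it is sent to.
struct view_update_target {
    ipv4_address endpoint;
    int update_id;
};

// Drop updates aimed at 0.0.0.0 instead of passing them on to be written or
// hinted, which is where the assertion described above was triggered.
std::vector<view_update_target> drop_unaddressable(std::vector<view_update_target> targets) {
    targets.erase(
        std::remove_if(targets.begin(), targets.end(),
                       [](const view_update_target& t) { return is_all_zeros(t.endpoint); }),
        targets.end());
    assert(std::none_of(targets.begin(), targets.end(),
                        [](const view_update_target& t) { return is_all_zeros(t.endpoint); }));
    return targets;
}
```

As the commit message notes, this trades view consistency for crash safety: the dropped updates are not replayed later to the new paired replica.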


43123 Commits