scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 18:10:39 +00:00

Author	SHA1	Message	Date
Lakshmi Narayanan Sreethar	58da8fdbbc	[Backport 6.1]: database::get_all_tables_flushed_at: fix return value The `database::get_all_tables_flushed_at` method returns a variable without setting the computed all_tables_flushed_at value. This causes its caller, `maybe_flush_all_tables`, to flush all the tables every time regardless of when they were last flushed. Fix this by returning the computed value from `database::get_all_tables_flushed_at`. Fixes #20301 Closes scylladb/scylladb#20471 * github.com:scylladb/scylladb: cql-pytest: add test to verify compaction_flush_all_tables_before_major_seconds config database::get_all_tables_flushed_at: fix return value (cherry picked from commit `0e5b444777`) Backported from #20471 to 6.1. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb/scylladb#20581	2024-09-26 10:40:48 +03:00
Kamil Braun	92156e7930	test: fix `topology_custom/test_raft_recovery_stuck` flakiness The test performs consecutive schema changes in RECOVERY mode. The second change relies on the first. However the driver might route the changes to different servers and we don't have group 0 to guarantee linearizability. We must rely on the first change coordinator to push the schema mutations to other servers before returning, but that only happens when it sees other servers as alive when doing the schema change. It wasn't guaranteed in the test. Fix this. Fixes scylladb/scylladb#20791 Should be backported to all branches containing this test to reduce flakiness. (cherry picked from commit `f390d4020a`) Closes scylladb/scylladb#20809	2024-09-25 15:11:50 +02:00
Abhinav	33b50a9d3a	raft topology: add error for removal of non-normal nodes In the current scenario, we check if a node being removed is normal on the node initiating the removenode request. However, we don't have a similar check on the topology coordinator. The node being removed could be normal when we initiate the request, but it doesn't have to be normal when the topology coordinator starts handling the request. For example, the topology coordinator could have removed this node while handling another removenode request that was added to the request queue earlier. This commit intends to fix this issue by adding more checks in the enqueuing phase and returning errors for duplicate requests for node removal. This PR fixes a bug. Hence we need to backport it. Fixes: scylladb/scylladb#20271 (cherry picked from commit `b25b8dccbd`) Closes scylladb/scylladb#20800	2024-09-25 11:35:27 +02:00
Gleb Natapov	43f9b3b997	test: skip test_lwt_semaphore::test_cas_semaphore in aarch64 debug mode The test configures write timeout to much smaller value to make the test run faster since for some writes sleep is inserted to hit the timeout, but it makes aarch64 debug flaky since timeout happens when it should not because of a natural slowness. (cherry picked from commit `71a5b1c6dd`) Closes scylladb/scylladb#20777	2024-09-24 15:20:09 +02:00
Botond Dénes	7ed2f87414	Merge '[Backport 6.1] cql3: add option to not unify bind variables with the same' from Avi Kivity Bind variables in CQL have two formats: positional (?) where a variable is referred to by its relative position in the statement, and named (:var), where the user is expected to supply a name->value mapping. In `19a6e69001` we identified the case where a named bind variable appears twice in a query, and collapsed it to a single entry in the statement metadata. Without this, a driver using the named variable syntax cannot disambiguate which variable is referred to. However, it turns out that users can use the positional call form even with the named variable syntax, by using the positional API of the driver. To support this use case, we add a configuration variable to disable the same-variable detection. Because the detection has to happen when the entire statement is visible, we have to supply the configuration to the parser. We call it the dialect and pass it from all callers. The alternative would be to add a pre-prepare call similar to fill_prepare_context that rewrites all expressions in a statement to deduplicate variables. A unit test is added. Fixes https://github.com/scylladb/scylladb/issues/15559 This may be useful to users transitioning from Cassandra, so merits a backport. (cherry picked from commit `f9322799af`) (cherry picked from commit `d69bf4f010`) (cherry picked from commit `ea8441dfa3`) Refs https://github.com/scylladb/scylladb/pull/19493 Closes scylladb/scylladb#20590 * github.com:scylladb/scylladb: cql3: add option to not unify bind variables with the same name cql3: introduce dialect infrastructure cql3: prepared_statement_cache: drop cache key default constructor Merge 'config: round-trip boolean configuration variables' from Avi Kivity	2024-09-24 15:15:05 +03:00
Piotr Dulikowski	bf6dd16071	Merge '[Backport 6.1] message/messaging_service: guard adding maintenance tenant under cluster feature' from Michał Jadwiszczak In https://github.com/scylladb/scylladb/pull/18729, we introduced a new statement tenant $maintenance, but the change wasn't protected by any cluster feature. This wasn't a problem for OSS, since unknown isolation cookie just uses default scheduling group. However, in enterprise that leads to creating a service level on not-upgraded nodes, which may end up in an error if user create maximum number of service levels. This patch adds a cluster feature to guard adding the new tenant. It's done in the way to handle two upgrade scenarios: version without $maintenance tenant -> version with $maintenance tenant guarded by a feature version with $maintenance tenant but not guarded by a feature -> version with $maintenance tenant guarded by a feature The PR adds enabled flag to statement tenants. This way, when the tenant is disabled, it cannot be used to create a connection, but it can be used to accept an incoming connection. The $maintenance tenant is added to the config as disabled and it gets enabled once the corresponding feature is enabled. Fixes https://github.com/scylladb/scylladb/issues/20070 Refs https://github.com/scylladb/scylla-enterprise/issues/4403 (cherry picked from commit `d44844241d`) (cherry picked from commit `71a03ef6b0`) (cherry picked from commit `b4b91ca364`) Refs https://github.com/scylladb/scylladb/pull/19802 Closes scylladb/scylladb#20674 * github.com:scylladb/scylladb: message/messaging_service: guard adding maintenance tenant under cluster feature message/messaging_service: add feature_service dependency message/messaging_service: add `enabled` flag to statement tenants	2024-09-23 13:18:45 +02:00
Botond Dénes	f987afb2e1	Merge '[Manual Backport 6.1] generic_server: convert connection tracking to seastar::gate' from Laszlo Ersek This is a manual backport of #20212 to 6.1, superseding #20345 (which had run into conflicts). Please see the individual commit messages for backport notes. Fixes #10305 Closes scylladb/scylladb#20355 * github.com:scylladb/scylladb: generic_server: make server::stop() idempotent generic_server: coroutinize server::shutdown() generic_server: make server::shutdown() idempotent test/generic_server: add test case configure, cmake: sort the lists of boost unit tests generic_server: convert connection tracking to seastar::gate	2024-09-18 15:52:32 +03:00
Michał Jadwiszczak	d11df0fcbc	message/messaging_service: add feature_service dependency (cherry-picked from `71a03ef6b0`)	2024-09-18 11:26:56 +02:00
Tomasz Grabiec	edea822bd7	Merge '[Backport 6.1] tablets: Fix race between repair and split' from Raphael "Raph" Carvalho Consider the following: ``` T 0 split prepare starts 1 repair starts 2 split prepare finishes 3 repair adds unsplit sstables 4 repair ends 5 split executes ``` If repair produces sstable after split prepare phase, the replica will not split that sstable later, as prepare phase is considered completed already. That causes split execution to fail as replicas weren't really prepared. This also can be triggered with load-and-stream which shares the same write (consumer) path. The approach to fix this is the same employed to prevent a race between split and migration. If migration happens during prepare phase, it can happen source misses the split request, but the tablet will still be split on the destination (if needed). Similarly, the repair writer becomes responsible for splitting the data if underlying table is in split mode. That's implemented in replica::table for correctness, so if node crashes, the new sstable missing split is still split before added to the set. Fixes https://github.com/scylladb/scylladb/issues/19378. Fixes https://github.com/scylladb/scylladb/issues/19416. Please replace this line with justification for the backport/* labels added to this PR (cherry picked from commit `239344ab55`) (cherry picked from commit `74612ad358`) Refs https://github.com/scylladb/scylladb/pull/19427 Closes scylladb/scylladb#20595 * github.com:scylladb/scylladb: tablets: Fix race between repair and split compaction: Allow "offline" sstable to be split	2024-09-17 13:25:03 +02:00
Aleksandra Martyniuk	032c9146d5	test: check if cleanup of deallocated sg is ignored (cherry picked from commit `2c4b1d6b45`)	2024-09-16 16:22:29 +02:00
Raphael S. Carvalho	fe56fa39c0	tablets: Fix race between repair and split Consider the following: T 0 split prepare starts 1 repair starts 2 split prepare finishes 3 repair adds unsplit sstables 4 repair ends 5 split executes If repair produces sstable after split prepare phase, the replica will not split that sstable later, as prepare phase is considered completed already. That causes split execution to fail as replicas weren't really prepared. This also can be triggered with load-and-stream which shares the same write (consumer) path. The approach to fix this is the same employed to prevent a race between split and migration. If migration happens during prepare phase, it can happen source misses the split request, but the tablet will still be split on the destination (if needed). Similarly, the repair writer becomes responsible for splitting the data if underlying table is in split mode. That's implemented in replica::table for correctness, so if node crashes, the new sstable missing split is still split before added to the set. Fixes #19378. Fixes #19416. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `74612ad358`)	2024-09-13 21:32:01 -03:00
Avi Kivity	8ddfd0d70d	cql3: add option to not unify bind variables with the same name Bind variables in CQL have two formats: positional (`?`) where a variable is referred to by its relative position in the statement, and named (`:var`), where the user is expected to supply a name->value mapping. In `19a6e69001` we identified the case where a named bind variable appears twice in a query, and collapsed it to a single entry in the statement metadata. Without this, a driver using the named variable syntax cannot disambiguate which variable is referred to. However, it turns out that users can use the positional call form even with the named variable syntax, by using the positional API of the driver. To support this use case, we add a configuration variable to disable the same-variable detection. Because the detection has to happen when the entire statement is visible, we have to supply the configuration to the parser. We call it the `dialect` and pass it from all callers. The alternative would be to add a pre-prepare call similar to fill_prepare_context that rewrites all expressions in a statement to deduplicate variables. A unit test is added. Fixes #15559 (cherry picked from commit `ea8441dfa3`) (cherry picked from commit `edb3068ecf`)	2024-09-13 18:17:15 +03:00
Avi Kivity	92dd47c6d6	cql3: introduce dialect infrastructure A dialect is a different way to interpret the same CQL statement. Examples: - how duplicate bind variable names are handled (later in this series) - whether `column = NULL` in LWT can return true (as is now) or whether it always returns NULL (as in SQL) Currently, dialect is an empty structure and will be filled in later. It is passed to query_processor methods that also accept a CQL string, and from there to the parser. It is part of the prepared statement cache key, so that if the dialect is changed online, previous parses of the statement are ignored and the statement is prepared again. The patch is careful to pick up the dialect at the entry point (e.g. CQL protocol server) so that the dialect doesn't change while a statement is parsed, prepared, and cached. (cherry picked from commit `d69bf4f010`)	2024-09-13 18:11:11 +03:00
Nadav Har'El	d9ba5423bb	Merge 'config: round-trip boolean configuration variables' from Avi Kivity When you SELECT a boolean from system.config, it reads as true/false, but this isn't accepted on UPDATE (instead, we accept 1/0). This is surprising and annoying, so accept true/false in both directions. Not a regression, so a backport isn't strictly necessary. Closes scylladb/scylladb#19792 * github.com:scylladb/scylladb: config: specialize from-string conversion for bool config: wrap boost::lexical_cast<> when converting from strings (cherry picked from commit `9eb47b3ef0`)	2024-09-13 17:54:37 +03:00
Piotr Smaron	b60f9ef4c2	cql: fix exception when validating KS in CREATE TABLE `c70f321c6f` added an extra check if KS exists. This check can throw `data_dictionary::no_such_keyspace` exception, which is supposed to be caught and a more user-friendly exception should be thrown instead. This commit fixes the above problem and adds a testcase to validate it doesn't appear ever again. Also, I moved the check for the keyspace outside of the `for` loop, as it doesn't need to be checked repeatedly. Additionally, I added an extra comment to both `no_such_keyspace` and `no_such_column_family` exceptions explaining they should not be returned directly to the caller, as they lack error code, which may not trigger correct exceptions handling mechanisms on the driver side. Fixes: #20097 (cherry picked from commit `f1e8976fbe`) Closes scylladb/scylladb#20553	2024-09-13 11:36:51 +03:00
Kamil Braun	e4a18b0858	test: test_raft_no_quorum: increase raft timeout in debug mode The test cases in this file use an error injection to reduce raft group 0 timeouts (from the default 1 minute), in order to speed up the tests; the scenarios expect these timeouts to happen, so we want them to happen as quick as possible, but we don't want to reduce timeouts so much that it will make other operations fail when we don't expect them to (e.g. when the test wants to add a node to the cluster). Unfortunately the selected 5 seconds in debug mode was not enough and made the tests flaky: scylladb/scylladb#20111. Increase it to 10 seconds. This unfortunately will slow down these tests as they have to sometimes wait for 10 seconds for the timeout to happen. But better to have this than a flaky test. Fixes: scylladb/scylladb#20111 (cherry picked from commit `52fdf5b4c9`) Closes scylladb/scylladb#20477	2024-09-10 08:48:06 +03:00
Botond Dénes	c64ae3f839	Merge '[Backport 6.1] repair: throw if batchlog manager isn't initialized' from ScyllaDB repair_service::repair_flush_hints_batchlog_handler may access batchlog manager while it is uninitialized. Throw if batchlog manager isn't initialized. Fixes: #20236. Needs backport to 6.0 and 6.1 as they suffer from the uninitialized bm access. (cherry picked from commit `d8e4393418`) (cherry picked from commit `f38bb6483a`) Refs #20251 Closes scylladb/scylladb#20351 * github.com:scylladb/scylladb: test: add test to ensure repair won't fail with uninitialized bm repair: throw if batchlog manager isn't initialized	2024-09-04 07:02:18 +03:00
Gleb Natapov	9db819763b	topology coordinator: do not add replacing node without a ring to topology When only inter dc encryption is enabled, a non-encrypted connection between two nodes is allowed only if both nodes are in the same dc. If a node that initiates the connection knows that dst is in the same dc and hence uses a non-encrypted connection, but the dst does not yet know the topology of the src, such a connection will not be allowed since dst cannot guarantee that src is in the same dc. Currently, when topology coordinator is used, a replacing node will appear in the coordinator's topology immediately after it is added to the group0. The coordinator will try to send a raft message to the new node and (assuming only inter dc encryption is enabled and the replacing node and the coordinator are in the same dc) it will try to open a regular, non-encrypted connection to it. But the replacing node will not have the coordinator in its topology yet (it needs to sync the raft state for that), so it will reject such a connection. To solve the problem the patch does not add a replacing node that was just added to group0 to the topology. It will be added later, when tokens are assigned to it. At this point a replacing node will already have made sure that its topology state is up-to-date (since it will execute a raft barrier in join_node_response_params handler) and it knows the coordinator's topology. This aligns replace behaviour with bootstrap since bootstrap also does not add a node without a ring to the topology. The patch effectively reverts `b8ee8911ca` Fixes: scylladb/scylladb#19025 (cherry picked from commit `17f4a151ce`)	2024-09-01 11:57:25 +03:00
Gleb Natapov	4769e694d1	test: add test for replace in clusters with encryption enabled (cherry picked from commit `2f1b1fd45e`)	2024-09-01 11:56:37 +03:00
Gleb Natapov	74012c562a	test.py: add server encryption support to cluster manager (cherry picked from commit `b98282a976`)	2024-09-01 11:56:25 +03:00
Laszlo Ersek	16321fc243	test/generic_server: add test case Check whether we can stop a generic server without first asking it to listen. The test fails currently; the failure mode is a hang, which triggers the 5 minute timeout set in the test: > unknown location(0): fatal error: in "stop_without_listening": > seastar::timed_out_error: timedout > seastar/src/testing/seastar_test.cc(43): last checkpoint > test/boost/generic_server_test.cc(34): Leaving test case > "stop_without_listening"; testing time: 300097447us Backport notes for 6.1: - Replace #include "utils/assert.hh" SCYLLA_ASSERT(false); with #include <cassert> assert(false); due to 6.1 lacking commit `aa1270a00c` ("treewide: change assert() to SCYLLA_ASSERT()", 2024-08-05). The header file "utils/assert.hh" wouldn't be difficult to backport, but separating it from the treewide changes in commit `aa1270a00c` might not be the best idea. Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com> (cherry picked from commit `dbc0ca6354`)	2024-08-30 16:17:44 +02:00
Laszlo Ersek	8f0f362a30	configure, cmake: sort the lists of boost unit tests Both lists were obviously meant to be sorted originally, but by today we've introduced many instances of disorder -- thus, inserting a new test in the proper place leaves the developer scratching their head. Sort both lists. Backport notes for 6.1: - Conflicts in "configure.py", unsurprisingly. For the backport, I sorted the boost unit test list manually, from scratch. Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com> (cherry picked from commit `931f2f8d73`)	2024-08-30 16:16:53 +02:00
Aleksandra Martyniuk	93fbe3af12	test: add test to ensure repair won't fail with uninitialized bm (cherry picked from commit `f38bb6483a`)	2024-08-30 13:55:48 +00:00
Botond Dénes	e6d2d29dd1	Merge '[Backport 6.1] repair: do_rebuild_replace_with_repair: use source_dc only when safe' from ScyllaDB It is unsafe to restrict the sync nodes for repair to the source data center if it has too low replication factor in network_topology_replication_strategy, or if other nodes in that DC are ignored. Also, this change restricts the usage of source_dc to `network_topology` and `everywhere_topology` strategies, as with simple replication strategy there is no guarantee that there would be any more replicas in that data center. Fixes #16826 Reproducer submitted as https://github.com/scylladb/scylla-dtest/pull/3865 It fails without this fix and passes with it. * Requires backport to live versions. Issue hit in the field with 2022.2.14 (cherry picked from commit `8b1877f3ca`) (cherry picked from commit `0419b1d522`) (cherry picked from commit `b5d0ab092c`) (cherry picked from commit `9729dd21c3`) (cherry picked from commit `8665eef98c`) (cherry picked from commit `5f655e41e3`) Refs #16827 Closes scylladb/scylladb#20228 * github.com:scylladb/scylladb: raft_rebuild: propagate source_dc force option to rebuild_option repair: do_rebuild_replace_with_repair: use source_dc only when safe repair: replace_with_repair: pass the replace_node downstream repair: replace_with_repair: pass ignore_nodes as a set of host_id:s repair: replace_rebuild_with_repair: pass ks_erms from caller nodetool: rebuild: add force option Add and use utils::optional_param to pass source_dc	2024-08-29 07:35:05 +03:00
Lakshmi Narayanan Sreethar	01661e1eaa	test/pylib: fix keyspace_compaction method The `keyspace_compaction` method incorrectly appends the column family parameter to the URL using a regular string, `"?cf={table}"`, instead of an f-string, `f"?cf={table}"`. As a result, the column family name is sent as `{table}` to the server, causing the compaction request to fail. Fix this issue by passing the parameter to the POST request using a dictionary instead of appending it to the URL. Fixes #20264 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit `dc5c45e803`) Closes scylladb/scylladb#20273	2024-08-28 20:08:58 +03:00
Botond Dénes	6232982772	Merge '[Backport 6.1] select from mutation_fragments() + tablets: handle reads for non-owned partitions' from ScyllaDB Attempting to read a partition via `SELECT * FROM MUTATION_FRAGMENTS()`, which the node doesn't own, from a table using tablets causes a crash. This is because when using tablets, the replica side simply doesn't handle requests for un-owned tokens and this triggers a crash. We should probably improve how this is handled (an exception is better than a crash), but this is outside the scope of this PR. This PR fixes this and also adds a reproducer test. Fixes: https://github.com/scylladb/scylladb/issues/18786 Fixes a regression introduced in 6.0, so needs backport to 6.0 and 6.1 (cherry picked from commit `de5329157c`) (cherry picked from commit `46563d719f`) (cherry picked from commit `4e2d7aa2a2`) Refs #20109 Closes scylladb/scylladb#20313 * github.com:scylladb/scylladb: test/tablets: Test that reading tablets' mutations from MUTATION_FRAGMENTS works replica/mutation_dump: enforce pinning of effective replication map replica/mutation_dump: handle un-owned tokens (with tablets)	2024-08-28 06:23:45 +03:00
Botond Dénes	6418787ee0	Merge '[Backport 6.1] Make Summary support histogram with infinite bucket values' from ScyllaDB This series fixes an issue where histogram Summaries return an infinite value. It updates the quantile calculation logic to address cases where values fall into the infinite bucket of a histogram. Now, instead of returning infinite (max int), the calculation will return the last bucket limit, ensuring finite outputs in all cases. The series adds a test for summaries with a specific test case for this scenario. Fixes #20255 Need backport to 6.0, 6.1 and 2023.1 and above (cherry picked from commit `011aa91a8c`) (cherry picked from commit `644e6f0121`) Refs #20257 Closes scylladb/scylladb#20303 * github.com:scylladb/scylladb: test/estimated_histogram_test Add summary tests utils/histogram.hh: Make summary support infinite bucket.	2024-08-28 06:23:03 +03:00
Botond Dénes	06d6cf5608	Merge '[Backport 6.1] abstract_replication_strategy: make get_ranges async' from ScyllaDB To prevent stalls due to large number of tokens. For example, large cluster with say 70 nodes can have more than 16K tokens. Fixes #19757 (cherry picked from commit `d385219a12`) (cherry picked from commit `333c0d7c88`) (cherry picked from commit `b2abbae24b`) (cherry picked from commit `824bdf99d2`) (cherry picked from commit `ea5a0cca10`) (cherry picked from commit `2bbbe2a8bc`) (cherry picked from commit `686a8f2939`) Refs #19758 Closes scylladb/scylladb#20297 * github.com:scylladb/scylladb: abstract_replication_strategy: make get_ranges async database: get_keyspace_local_ranges: get vnode_effective_replication_map_ptr param compaction: task_manager_module: open code maybe_get_keyspace_local_ranges alternator: ttl: token_ranges_owned_by_this_shard: let caller make the ranges_holder alternator: ttl: can pass const gms::gossiper& to ranges_holder alternator: ttl: ranges_holder_primary: unconstify _token_ranges member alternator: ttl: refactor token_ranges_owned_by_this_shard	2024-08-28 06:22:33 +03:00
Botond Dénes	1f8d8fd3db	Merge '[Backport 6.1] replica: fix copy constructor of tablet_sstable_set' from ScyllaDB Commit `9f93dd9fa3` changed `tablet_sstable_set::_sstable_sets` to be a `absl::flat_hash_map` and in addition, `std::set<size_t> _sstable_set_ids` was added. `_sstable_set_ids` is set up in the `tablet_sstable_set(schema_ptr s, const storage_group_manager& sgm, const locator::tablet_map& tmap)` constructor, but it is not copied in `tablet_sstable_set(const tablet_sstable_set& o)`. This affects the `tablet_sstable_set::tablet_sstable_set` method as it depends on the copy constructor. Since sstable set can be cloned when a new sstable set is added, the issue will cause ids not being copied into the new sstable set. It's healed only after compaction, since the sstable set is rebuilt from scratch there. This PR fixes this issue by removing the existing copy constructor of `tablet_sstable_set` to enable the implicit default copy constructor. Fixes #19519 (cherry picked from commit `44583eed9e`) (cherry picked from commit `ec47b50859`) Refs #20115 Closes scylladb/scylladb#20201 * github.com:scylladb/scylladb: boost/sstable_set_test: add testcase to test tablet_sstable_set copy constructor replica: fix copy constructor of tablet_sstable_set	2024-08-28 06:20:12 +03:00
Pavel Emelyanov	bc03d13c76	test/tablets: Test that reading tablets' mutations from MUTATION_FRAGMENTS works Currently it doesn't: one of the nodes crashes with a std::out_of_range exception and a meaningless call trace. [Botond]: this test checks the case of reading a partition via MUTATION_FRAGMENTS from a node which doesn't own said partition. refs: #18786 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `4e2d7aa2a2`)	2024-08-27 23:43:14 +00:00
Amnon Heiman	64befbca61	test/estimated_histogram_test Add summary tests This patch adds tests for summary calculation. It adds two tests. The first is a basic calculation for P50, P95, P99 by adding 100 elements into 20 buckets. The second test checks that if elements are found in the infinite bucket, the result is the lower limit (33s) and not infinite. Relates to #20255 Signed-off-by: Amnon Heiman <amnon@scylladb.com> (cherry picked from commit `644e6f0121`)	2024-08-27 12:12:39 +00:00
Botond Dénes	e84d8b1205	Merge '[Backport 6.1] cql: process LIMIT for GROUP BY select queries' from ScyllaDB This change fixes #17237, fixes #5361 and fixes #5362 by passing the limit value down the call chain in cql3. A test is also added. fixes: #17237 fixes: #5361 fixes: #5362 The regression happened in 5.4 as we changed the way GROUP BY is processed in `432cb02` - to force aggregation when it is used. The LIMIT value was not passed to aggregations and thus we failed to adhere to it. We want to backport this fix to 5.4 and 6.0 to have continuous correct results for the test case from #17237 This patch consists of 4 commits: - fa4225ea0fac2057b7a9976f57dc06bcbd900cd4 - cql3: respect the user-defined page size in aggregate queries - a precondition for this patch to be implementable - 8fbe69e74dca16ed8832d9a90489ca47ba271d0b - cql3/select_statement: simplify the get_limit function - the `do_get_limit()` function did a lot of legwork that should not be associated with it. This change makes it trivial and makes its callers do additional checks (for unset guards, or for an aggregate query) - 162828194a2b88c22fbee335894ff045dcc943c9 - cql3: process LIMIT for GROUP BY queries - pass the limit value down the chain and make use of it. This is the actual fix to #17237 - b3dc6de6d6cda8f5c09b01463bb52f827a6a00b4 - test/cql-pytest: Add test for GROUP BY queries with LIMIT - tests (cherry picked from commit `08f3219cb8`) (cherry picked from commit `3838ad64b3`) (cherry picked from commit `e7ae7f3662`) (cherry picked from commit `9db272c949`) Refs: #18842 Closes scylladb/scylladb#20154 * github.com:scylladb/scylladb: test/cql-pytest: Add test for GROUP BY queries with LIMIT cql3: process LIMIT for GROUP BY queries cql3/select_statement: simplify the get_limit function cql3: respect the user-defined page size in aggregate queries	2024-08-27 14:52:18 +03:00
Benny Halevy	6692c1702d	abstract_replication_strategy: make get_ranges async To prevent stalls due to large number of tokens. For example, large cluster with say 70 nodes can have more than 16K tokens. Fixes #19757 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `686a8f2939`)	2024-08-26 21:50:39 +00:00
Benny Halevy	415bdf3160	database: get_keyspace_local_ranges: get vnode_effective_replication_map_ptr param Prepare for making the function async. Then, it will need to hold on to the erm while getting the token_ranges asynchronously. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `2bbbe2a8bc`)	2024-08-26 21:50:39 +00:00
Benny Halevy	e13d5ee834	nodetool: rebuild: add force option To be used to force usage of source_dc, even when it is unsafe for rebuild. Update docs and add test/nodetool/test_rebuild.py Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `0419b1d522`)	2024-08-21 09:37:14 +03:00
Raphael S. Carvalho	d65961d8cf	compaction: Allow "offline" sstable to be split In order to fix the race between split and repair, we must introduce the ability to split an "offline" sstable, one that wasn't added to any of the table's sstable set yet. It's not safe to split a sstable after adding it to the set, because a failure to split can result in unsplit data left in the set, causing split to fail down the road, since the coordinator thinks this replica has only split data in the set. Refs #19378. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `239344ab55`)	2024-08-20 10:38:36 +00:00
Lakshmi Narayanan Sreethar	13aa97a00f	boost/sstable_set_test: add testcase to test tablet_sstable_set copy constructor Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit `ec47b50859`)	2024-08-19 12:11:50 +00:00
Avi Kivity	c382e19e5e	Merge '[Backport 6.1] Prevent ALTERing non-existing KS with tablets' from ScyllaDB ALTER tablets KS executes in 2 steps: 1. ALTER KS's cql handler forms a global topo req, and saves data required to execute this req, 2. global topo req is executed by topo coordinator, which reads data attached to the req. The KS name is among the data attached to the req. There's a time window between these steps where a to-be-altered KS could have been DROPped, which results in topo coordinator forever trying to ALTER a non-existing KS. In order to avoid it, the code has been changed to first check if a to-be-altered KS exists, and if it's not the case, it doesn't perform any schema/tablets mutations, but just removes the global topo req from the coordinator's queue. BTW. just adding this extra check resulted in broader than expected changes, which is due to the fact that the code is written badly and needs to be refactored - an effort that's already planned under #19126 (I suggest to disable displaying whitespace differences when reviewing this PR). Fixes: #19576 Requires 6.0 backport (cherry picked from commit `5b089d8e10`) (cherry picked from commit `0ea2128140`) (cherry picked from commit `ddb5204929`) Refs #19666 Closes scylladb/scylladb#20143 * github.com:scylladb/scylladb: tests: ensure ALTER tablets KS doesn't crash if KS doesn't exist cql: refactor rf_change indentation Prevent ALTERing non-existing KS with tablets	2024-08-14 20:16:55 +03:00
Michał Chojnowski	b786e6a39a	cql_test_env: ensure shutdown() before stop() for system_keyspace If system_keyspace::stop() is called before system_keyspace::shutdown(), it will never finish, because the uncleared shared pointers will keep it alive indefinitely. Currently this can happen if an exception is thrown before the construction of the shutdown() defer. This patch moves the shutdown() call to immediately before stop(). I see no reason why it should be elsewhere. Fixes scylladb/scylla-enterprise#4380 (cherry picked from commit `eeaf4c3443`) Closes scylladb/scylladb#20145	2024-08-14 20:16:29 +03:00
Paweł Zakrzewski	3286c14d76	test/cql-pytest: Add test for GROUP BY queries with LIMIT Remove xfail from all tests for #5361, as the issue is fixed. Remove xfail from test_group_by_clustering_prefix_with_limit It references #5362, but is fixed by #17237. Refs #17237 (cherry picked from commit `9db272c949`)	2024-08-14 16:56:20 +00:00
Piotr Smaron	706761d8ec	tests: ensure ALTER tablets KS doesn't crash if KS doesn't exist Using the error injection framework, we inject a sleep into the processing path of ALTER tablets KS, so that the topology coordinator of the leader node sleeps after the rf_change event has been scheduled, but before it starts executing. During that time the second node executes a DROP KS statement, which is propagated to the leader node. Once the leader node wakes up and resumes processing of ALTER tablets KS, the KS no longer exists, and the node must not crash as it did before the fix. (cherry picked from commit `ddb5204929`)	2024-08-14 10:37:25 +00:00
Tomasz Grabiec	0144549cd6	tablets: Do not allocate tablets on nodes being decommissioned If a tablet-based table is created concurrently with a node being decommissioned after tablets are already drained, the new table may be permanently left with replicas on the node which is no longer in the topology. That creates an immediate availability risk because we are running with one replica down. This also violates invariants about replica placement and this state cannot be fixed by topology operations. One effect is that this will lead to load balancer failure which will inhibit progress of any topology operations: load_balancer - Replica 154b0380-1dd2-11b2-9fdd-7156aa720e1a:0 of tablet 7e03dd40-537b-11ef-9fdd-7156aa720e1a:1 not found in topology, at: ... Fixes #20032 (cherry picked from commit `f5c74a5df2`) Closes scylladb/scylladb#20066	2024-08-08 11:56:13 +03:00
Kamil Braun	0f246bfbc9	raft topology: improve logging Add more logging for raft-based topology operations in INFO and DEBUG levels. Improve the existing logging, adding more details. Fix a FIXME in test_coordinator_queue_management (by readding a log message that was removed in the past -- probably by accident -- and properly awaiting for it to appear in test). Enable group0_state_machine logging at TRACE level in tests. These logs are relatively rare (group 0 commands are used for metadata operations) and relatively small, mostly consist of printing `system.group0_history` mutation in the applied command, for example: ``` TRACE 2024-08-02 18:47:12,238 [shard 0: gms] group0_raft_sm - apply() is called with 1 commands TRACE 2024-08-02 18:47:12,238 [shard 0: gms] group0_raft_sm - cmd: prev_state_id: optional(dd9d47c6-50ee-11ef-d77f-500b8e1edde3), new_state_id: dd9ea5c6-50ee-11ef-ae64-dfbcd08d72c3, creator_addr: 127.219.233.1, creator_id: 02679305-b9d1-41ef-866d-d69be156c981 TRACE 2024-08-02 18:47:12,238 [shard 0: gms] group0_raft_sm - cmd.history_append: {canonical_mutation: table_id 027e42f5-683a-3ed7-b404-a0100762063c schema_version c9c345e1-428f-36e0-b7d5-9af5f985021e partition_key pk{0007686973746f7279} partition_tombstone {tombstone: none}, row tombstone {range_tombstone: start={position: clustered, ckp{0010b4ba65c64b6e11ef8080808080808080}, 1}, end={position: clustered, ckp{}, 1}, {tombstone: timestamp=1722617232237511, deletion_time=1722617232}}{row {position: clustered, ckp{0010dd9ea5c650ee11efae64dfbcd08d72c3}, 0} tombstone {row_tombstone: none} marker {row_marker: 1722617232237511 0 0}, column description atomic_cell{ create system_distributed keyspace; create system_distributed_everywhere keyspace; create and update system_distributed(_everywhere) tables,ts=1722617232237511,expiry=-1,ttl=0}}} ``` note that the mutation contains a human-readable description of the command -- like "create system_distributed keyspace" above. 
These logs might help debugging various issues (e.g. when `apply` hangs waiting for read_apply mutex, or takes too long to apply a command). Ref: scylladb/scylladb#19105 Ref: scylladb/scylladb#19945 (cherry picked from commit `e8d5974961`) Closes scylladb/scylladb#20048	2024-08-07 13:39:30 +02:00
Botond Dénes	f78b88b59b	Merge '[Backport 6.1] db/view: drop view updates to replaced node marked as left' from ScyllaDB When a node that is permanently down is replaced, it is marked as "left" but it still can be a replica of some tablets. We also don't keep IPs of nodes that have left and the `node` structure for such node returns an empty IP (all zeros) as the address. This interacts badly with the view update logic. The base replica paired with the left node might decide to generate a view update. Because storage proxy still uses IPs and not host IDs, it needs to obtain the view replica's IP and tell the storage proxy to write a view update to that node - so, it chooses 0.0.0.0. Apparently, storage proxy decides to write a hint towards this address - hinted handoff on the other hand operates on host IDs and not IPs, so it attempts to translate the IP back, which triggers an assertion as there is no replica with IP 0.0.0.0. As a quick workaround for this issue just drop view updates towards nodes which seem to have IPs that are all zeros. It would be more proper to keep the view updates as hints and replay them later to the new paired replica, but achieving this right now would require much more significant changes. For now, fixing a crash is more important than keeping views consistent with base replicas. In addition to the fix, this PR also includes a regression test heavily based on the test that @kbr-scylla prepared during his investigation of the issue. Fixes: scylladb/scylladb#19439 This issue can cause multiple nodes to crash at once and the fix is quite small, so I think this justifies backporting it to all affected versions. 6.0 and 6.1 are affected. No need to backport to 5.4 as this issue only happens with tablets, and tablets are experimental there. 
(cherry picked from commit `6af7882c59`) (cherry picked from commit `5ec8c06561`) Refs #19765 Closes scylladb/scylladb#19895 * github.com:scylladb/scylladb: test: regression test for MV crash with tablets during decommission db/view: drop view updates to replaced node marked as left	2024-08-07 09:18:26 +03:00
Nadav Har'El	78d7c953b0	test: increase timeouts for /localnodes test In commit `bac7c33313` we introduced a new test for the Alternator "/localnodes" request, checking that a node that is still joining does not get returned. The tests used what I thought were "very high" timeouts - we had a timeout of 10 seconds for starting a single node, and injected a 20 second sleep to leave us 10 seconds after the first sleep. But the test failed in one extremely slow run (a debug build on aarch64), where starting just a single node took more than 15 seconds! So in this patch I increase the timeouts significantly: We increase the wait for the node to 60 seconds, and the sleeping injection to 120 seconds. These should definitely be enough for anyone (famous last words...). The test doesn't actually wait for these timeouts, so the ridiculously high timeouts shouldn't affect the normal runtime of this test. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `ca8b91f641`) Closes scylladb/scylladb#19940	2024-08-07 08:55:23 +03:00
Nadav Har'El	753fc87efa	alternator: exclude CDC log table from ListTables The Alternator command ListTables is supposed to list actual tables created with CreateTable, and should not list things like materialized views (created for GSI or LSI) or CDC log tables. We already properly excluded materialized views from the list - and had the tests to prove it - but forgot both the exclusion and the testing for CDC log tables - so creating a table xyz with streams enabled would cause ListTables to also list "xyz_scylla_cdc_log". This patch fixes both oversights: It adds the code to exclude CDC logs from the output of ListTables, and adds a test which reproduces the bug before this fix, and verifies the fix works. Fixes #19911. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `d293a5787f`) Closes scylladb/scylladb#19938	2024-08-07 08:54:08 +03:00
Benny Halevy	c75dbc1f9c	sstable_directory: delete_atomically: allow sstables from multiple prefixes Currently, delete_atomically can be called with a list of sstables from mixed prefixes in two cases: 1. truncate: where we delete all the sstables in the table directory 2. tablet cleanup: similar to truncate but restricted to sstables in a single tablet replica In both cases, it is possible that sstables in staging (or quarantine) are mixed with sstables in the base directory. Until a more comprehensive fix is in place, (see https://github.com/scylladb/scylladb/pull/19555) this change just lifts the ban on atomic deletion of sstables from different prefixes, and acknowledging that the implementation is not atomic across prefixes. This is better than crashing for now, and can be backported more easily to branches that support tablets so tablet migration can be done safely in the presence of repair of tables with views. Refs scylladb/scylladb#18862 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `26abad23d9`) Closes scylladb/scylladb#19919	2024-08-06 16:27:57 +03:00
Lakshmi Narayanan Sreethar	96e5ebe28c	boost/bloom_filter_test: wait for total memory reclaimed update The testcase `test_bloom_filter_reclaim_during_reload` checks the SSTable manager's `_total_memory_reclaimed` against an expected value to verify that a Bloom filter was reloaded. However, it does not wait for the manager to update the variable, causing the check to fail if the update has not occurred yet. Fix it by making the testcase wait until the variable is updated to the expected value. Fixes #19879 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit `27b305b9d1`) Closes scylladb/scylladb#19897	2024-08-06 16:26:36 +03:00
Aleksandra Martyniuk	d69f0e529a	test: tasks: adjust tests to new wait_task behavior After `c1b2b8cb2c` /task_manager/wait_task/ does not unregister tasks anymore. Delete the check if the task was unregistered from test_task_manager_wait. Check task status in drain_module_tasks to ensure that the task is removed from task manager. Fixes: #19351. (cherry picked from commit `dfe3af40ed`) Closes scylladb/scylladb#19839	2024-08-06 16:23:02 +03:00
Łukasz Paszkowski	86ff3c2aa3	api/system: add highest_supported_sstable_format path The current upgrade dtest relies on a ccm node function, get_highest_supported_sstable_version(), that looks for r'Feature (.*)_SSTABLE_FORMAT is enabled' in the log files. Starting from scylla-6.0 ME_SSTABLE_FORMAT is enabled by default and there is no cluster feature for it. Thus get_highest_supported_sstable_version() returns an empty list, causing the upgrade tests to fail. This change introduces a separate API path that returns the highest supported sstable format (one of la, mc, md, me) by a scylla node. Fixes scylladb/scylladb#19772 Backports to 6.0 and 6.1 required. The current upgrade test in dtest checks scylla upgrades up to version 5.4 only. This patch is a prerequisite to backport the upgrade tests fix in dtest. (cherry picked from commit `781eb7517c`) Closes scylladb/scylladb#19814	2024-08-06 16:21:48 +03:00
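
The failure mode described in the api/system commit above can be illustrated with a minimal Python sketch. Only the regular expression is taken from the commit message; the helper name `formats_from_log` and both sample logs are hypothetical and are not the actual ccm or dtest code. The sketch shows how log scraping yields an empty list once no feature line is printed.

```python
import re

# Pattern quoted verbatim in the commit message.
FEATURE_RE = re.compile(r'Feature (.*)_SSTABLE_FORMAT is enabled')


def formats_from_log(lines):
    """Hypothetical helper: collect format names from feature-enabled log lines."""
    found = []
    for line in lines:
        match = FEATURE_RE.search(line)
        if match:
            found.append(match.group(1).lower())
    return found


# Made-up sample logs, for illustration only.
log_with_feature_lines = [
    "INFO features - Feature MD_SSTABLE_FORMAT is enabled",
    "INFO features - Feature ME_SSTABLE_FORMAT is enabled",
]
log_without_feature_lines = [
    "INFO init - starting up",
]

print(formats_from_log(log_with_feature_lines))
print(formats_from_log(log_without_feature_lines))
```

When no line matches, the helper returns an empty list, which is the situation the commit describes for scylla-6.0 and later; querying a dedicated API path avoids depending on log output.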

7165 Commits