scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 10:30:38 +00:00

Author	SHA1	Message	Date
Michał Chojnowski	1446b4e0ef	test.py: add --run-internet-dependent-tests Later, we will add upgrade tests, which need to download the previous release of Scylla from the internet. Internet access is a major dependency, so we want to make those tests opt-in for now. (cherry picked from commit `d3cb873532`)	2025-07-23 19:28:35 +02:00
Michał Chojnowski	06d6718f3b	pylib/manager_client: add server_switch_executable Add a util for switching the Scylla executable during the test. Will be used for upgrade tests. (cherry picked from commit `5da19ff6a6`)	2025-06-18 13:50:38 +00:00
Michał Chojnowski	b5591422c6	test/pylib: in add_server, give a way to specify the executable and version-specific config This will be used for upgrade tests. The cluster will be started with an older executable and without configs specific to newer versions. (cherry picked from commit `1ff7e09edc`)	2025-06-18 13:50:38 +00:00
Michał Chojnowski	043eaf099a	pylib: pass scylla_env environment variables to the topology suite I want to add an upgrade test under the topology suite. To work, it will have to know the path to the tested Scylla executable, so that it can switch the nodes to it. The path could be passed by various means and I'm not sure which method is appropriate. In some other places (e.g. the cql suite) we pass the path via the `SCYLLA` environment variable and this patch follows that example. `PythonTestSuite` (parent class of `TopologySuite`) already has that variable set in `self.scylla_env`, and passes it around. However, `TopologySuite` uses its own `run()`, and so it implicitly overrides the decision to pass `self.scylla_env` down. This patch changes that, and after the patch we apply the `self.scylla_env` to the environment for topology tests. This might have some unforeseen side effects for coverage measurement, because AFAICS the (only) other variable in `self.scylla_env` is `LLVM_PROFILE_FILE`. But topology tests don't run Scylla executables themselves (they only send commands to the cluster manager started externally), so I figure there should be no change. (cherry picked from commit `2ef0db0a6b`)	2025-06-18 13:50:38 +00:00
Michał Chojnowski	5e2b3be754	test/pylib: add get_scylla_2025_1_executable() Adds a function which downloads and installs (in `~/.cache`) the Scylla 2025.1, for upgrade tests. Note: this introduces an internet dependency into pylib, AFAIK the first one. We already have some other code for downloading existing Scylla releases, written for different purposes, in `cqlpy/fetch_scylla.py`. I made zero effort to reuse that in any way. Note: hardcoding the package version might be uncool, but if we want "better" version selection (e.g. the newest patch version in the given branch), we should have a separate library (or web service) for that, and share it with CCM/SCT. If we add a separate automatic version selection mechanism here, we are going to end up with yet another half-broken Scylla version selector, with yet different syntax and semantics than the other ones. We never clear the downloaded and unpacked files. This could become a problem in the future. (At which point we can add some mechanism that deletes cached archives downloaded more than a week ago.) (cherry picked from commit `34098fbd1f`)	2025-06-18 13:50:38 +00:00
Michał Chojnowski	d141b730fc	pylib/scylla_cluster: give a way to pass executable-specific options to nodes I'm trying to adapt pylib to multi-version tests. (Where the Scylla cluster is upgraded to a newer Scylla version during the test). Before this patch, the initial config (where "config" == yaml file + CLI args) of the nodes is hardcoded in scylla_cluster.py. The problem is that this config might not apply to past versions, so we need some way to give them a different config. (For example, with the config as it is before the patch, a Scylla 2025.1 executable would not boot up because it does not know the `group0_voter_handler` logger). In this patch, we create a way to attach version-specific config to the executable passed to ScyllaServer. (cherry picked from commit `cc7432888e`)	2025-06-18 13:50:37 +00:00
Michał Chojnowski	76d989cbfe	dbuild: mount "$XDG_CACHE_HOME/scylladb" We will use it to keep a cache of artifact downloads for upgrade tests, across dbuild invocations. (cherry picked from commit `63218bb094`)	2025-06-18 13:50:37 +00:00
Piotr Dulikowski	9536949911	Merge '[Backport 2025.2] tablets: deallocate storage state on end_migration' from Scylladb[bot] When a tablet is migrated and cleaned up, deallocate the tablet storage group state on `end_migration` stage, instead of `cleanup` stage: * When the stage is updated from `cleanup` to `end_migration`, the storage group is removed on the leaving replica. * When the table is initialized, if the tablet stage is `end_migration` then we don't allocate a storage group for it. This happens for example if the leaving replica is restarted during tablet migration. If it's initialized in `cleanup` stage then we allocate a storage group, and it will be deallocated when transitioning to `end_migration`. This guarantees that the storage group is always deallocated on the leaving replica by `end_migration`, and that it is always allocated if the tablet wasn't cleaned up fully yet. It is a similar case also for the pending replica when the migration is aborted. We deallocate the state on `revert_migration` which is the stage following `cleanup_target`. Previously the storage group would be allocated when the tablet is initialized on any of the tablet replicas - also on the leaving replica, and when the tablet stage is `cleanup` or `end_migration`, and deallocated during `cleanup`. This fixes the following issue: 1. A migrating tablet enters cleanup stage 2. the tablet is cleaned up successfully 3. The leaving replica is restarted, and allocates storage group 4. tablet cleanup is not called because it's already cleaned up 5. the storage group remains allocated on the leaving replica after the migration is completed - it's not cleaned up properly. 
Fixes https://github.com/scylladb/scylladb/issues/23481 backport to all relevant releases since it's a bug that results in a crash - (cherry picked from commit `34f15ca871`) - (cherry picked from commit `fb18fc0505`) - (cherry picked from commit `bd88ca92c8`) Parent PR: #24393 Closes scylladb/scylladb#24488 * github.com:scylladb/scylladb: test/cluster/test_tablets: test restart during tablet cleanup test: tablets: add get_tablet_info helper tablets: deallocate storage state on end_migration scylla-2025.2.0-rc5-candidate-20250618080131 scylla-2025.2.0-rc5	2025-06-18 10:25:32 +02:00
Anna Stuchlik	01d3b504d1	doc: add support for z3 GCP This commit adds support for z3-highmem-highlssd instance types to Cloud Instance Recommendations for GCP. Fixes https://github.com/scylladb/scylladb/issues/24511 Closes scylladb/scylladb#24533 (cherry picked from commit `648d8caf27`) Closes scylladb/scylladb#24545	2025-06-17 23:40:47 +03:00
Michael Litvak	305f827888	test/cluster/test_tablets: test restart during tablet cleanup Add a test that reproduces issue scylladb/scylladb#23481. The test migrates a tablet from one node to another, and while the tablet is in some stage of cleanup - either before or right after, depending on the parameter - the leaving replica, on which the tablet is cleaned, is restarted. This is interesting because when the leaving replica starts and loads its state, the tablet could be in different stages of cleanup - the SSTables may still exist or they may have been cleaned up already, and we want to make sure the state is loaded correctly. (cherry picked from commit `bd88ca92c8`)	2025-06-17 13:59:10 +00:00
Michael Litvak	d094bc6fc9	test: tablets: add get_tablet_info helper Add a helper for tests to get the tablet info from system.tablets for a tablet owning a given token. (cherry picked from commit `fb18fc0505`)	2025-06-17 13:59:10 +00:00
Michael Litvak	c11a2e2aaf	tablets: deallocate storage state on end_migration When a tablet is migrated and cleaned up, deallocate the tablet storage group state on `end_migration` stage, instead of `cleanup` stage: * When the stage is updated from `cleanup` to `end_migration`, the storage group is removed on the leaving replica. * When the table is initialized, if the tablet stage is `end_migration` then we don't allocate a storage group for it. This happens for example if the leaving replica is restarted during tablet migration. If it's initialized in `cleanup` stage then we allocate a storage group, and it will be deallocated when transitioning to `end_migration`. This guarantees that the storage group is always deallocated on the leaving replica by `end_migration`, and that it is always allocated if the tablet wasn't cleaned up fully yet. It is a similar case also for the pending replica when the migration is aborted. We deallocate the state on `revert_migration` which is the stage following `cleanup_target`. Previously the storage group would be allocated when the tablet is initialized on any of the tablet replicas - also on the leaving replica, and when the tablet stage is `cleanup` or `end_migration`, and deallocated during `cleanup`. This fixes the following issue: 1. A migrating tablet enters cleanup stage 2. the tablet is cleaned up successfully 3. The leaving replica is restarted, and allocates storage group 4. tablet cleanup is not called because it was already cleaned up 5. the storage group remains allocated on the leaving replica after the migration is completed - it's not cleaned up properly. Fixes scylladb/scylladb#23481 (cherry picked from commit `34f15ca871`)	2025-06-17 13:59:10 +00:00
Botond Dénes	a63b22eec6	Merge '[Backport 2025.2] tablets: fix missing data after tablet merge ' from Scylladb[bot] Consider the following scenario: 1) let's assume tablet 0 has range [1, 5] (pre merge) 2) tablet merge happens, tablet 0 has now range [1, 10] 3) tablet_sstable_set isn't refreshed, so holds a stale state, thinks tablet 0 still has range [1, 5] 4) during a full scan, forward service will intersect the full range with tablet ranges and consume one tablet at a time 5) replica service is asked to consume range [1, 10] of tablet 0 (post merge) We have two possible outcomes: With cache bypass: 1) cache reader is bypassed 2) sstable reader is created on range [1, 10] 3) unrefreshed tablet_sstable_set holds stale state, but correctly selects all sstables intersecting with range [1, 10] With cache: 1) cache reader is created 2) finds partition with token 5 is cached 3) sstable reader is created on range [1, 4] (later would fast forward to range [6, 10]; also belongs to tablet 0) 4) incremental selector consumes the pre-merge sstable spanning range [1, 5] 4.1) since the partitioned_sstable_set pre-merge contains only that sstable, EOS is reached 4.2) since EOS is reached, the fast forward to range [6, 10] is not allowed. So with the set refreshed, sstable set is aligned with tablet ranges, and no premature EOS is signalled, which would otherwise prevent the fast forward from happening and all data from being properly captured in the read. This change fixes the bug and triggers a mutation source refresh whenever the number of tablets for the table has changed, not only when we have incoming tablets. Additionally, includes a fix for range reads that span more than one tablet, which can happen during split execution. Fixes: https://github.com/scylladb/scylladb/issues/23313 This change needs to be backported to all supported versions which implement tablet merge. 
- (cherry picked from commit `d0329ca370`) - (cherry picked from commit `1f9f724441`) - (cherry picked from commit `53df911145`) Parent PR: #24287 Closes scylladb/scylladb#24339 * github.com:scylladb/scylladb: replica: Fix range reads spanning sibling tablets test: add reproducer and test for mutation source refresh after merge tablets: trigger mutation source refresh on tablet count change	2025-06-17 08:35:14 +03:00
Jenkins Promoter	0adf905112	Update ScyllaDB version to: 2025.2.0-rc5	2025-06-16 16:21:22 +03:00
Pavel Emelyanov	c2a9f2d9c6	Update seastar submodule * seastar d7ff58f2...9f0034a0 (1): > http_client: Add ECONNRESET to retryable errors And switch to 2025.2 branch from scylla-seastar for backports Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#24446	2025-06-15 17:33:16 +03:00
Raphael S. Carvalho	79958472bc	replica: Fix range reads spanning sibling tablets We don't guarantee that coordinators will only emit range reads that span only one tablet. Consider this scenario: 1) split is about to be finalized, barrier is executed, completes. 2) coordinator starts a read, uses pre-split erm (split not committed to group0 yet) 3) split is committed to group0, all replicas switch storage. 4) replica-side read is executed, uses a range which spans tablets. We could fix it with two-phase split execution. Rather than pushing the complexity to higher levels, let's fix incremental selector which should be able to serve all the tokens owned by a given shard. During split execution, neither of the sibling tablets is going anywhere since it runs with state machine locked, so a single read spanning both sibling tablets works as long as the selector works across tablet boundaries. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `53df911145`)	2025-06-15 09:14:38 -03:00
Ferenc Szili	ba192c1a29	test: add reproducer and test for mutation source refresh after merge This change adds a reproducer and test for the fix where the local mutation source is not always refreshed after a tablet merge. (cherry picked from commit `1f9f724441`)	2025-06-15 09:14:37 -03:00
Jenkins Promoter	89f5374435	Update pgo profiles - aarch64	2025-06-15 04:46:00 +03:00
Jenkins Promoter	184e0716b3	Update pgo profiles - x86_64	2025-06-15 04:08:36 +03:00
Anna Stuchlik	baa2592299	doc: remove the limitation for disabling CDC This commit removes the instruction to stop all writes before disabling CDC with ALTER. Fixes https://github.com/scylladb/scylla-docs/issues/4020 Closes scylladb/scylladb#24406 (cherry picked from commit `b0ced64c88`) Closes scylladb/scylladb#24476 scylla-2025.2.0-rc4-candidate-20250613105409 scylla-2025.2.0-rc4	2025-06-13 14:07:38 +03:00
Robert Bindar	a926cba476	Add support for nodetool refresh --skip-reshape This patch adds the new option in nodetool, patches the load_new_ss_tables REST request with a new parameter and skips the reshape step in refresh if this flag is passed. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#24409 Fixes: #24365 (cherry picked from commit `ca1a9c8d01`) Closes scylladb/scylladb#24472	2025-06-13 14:06:19 +03:00
Michał Chojnowski	9c28b812ca	db/config: add an option that disables dict-aware sstable compressors in DDL statements For reasons, we want to be able to disallow dictionary-aware compressors in chosen deployments. This patch adds a knob for that. When the knob is disabled, dictionary-aware compressors will be rejected in the validation stage of CREATE and ALTER statements. Closes scylladb/scylladb#24355 (cherry picked from commit `7d26d3c7cb`) Closes scylladb/scylladb#24454	2025-06-13 14:03:32 +03:00
Michael Litvak	d792916e8e	test_cdc_generation_clearing: wait for generations to propagate In test_cdc_generation_clearing we trigger events that update CDC generations, verify the generations are updated as expected, and verify the system topology and CDC generations are consistent on all nodes. Before checking that all nodes are consistent and have the same CDC generations, we need to consider that the changes are propagated through raft and take some time to propagate to all nodes. Currently, we wait for the change to be applied only on the first server which runs the CDC generation publisher fiber and read the CDC generations from this single node. The consistency check that follows could fail if the change was not propagated to some other node yet. To fix that, before checking consistency with all nodes, we execute a read barrier on all nodes so they all see the same state as the leader. Fixes scylladb/scylladb#24407 Closes scylladb/scylladb#24433 (cherry picked from commit `8aeb404893`) Closes scylladb/scylladb#24450	2025-06-10 15:50:40 +03:00
Michał Chojnowski	a539ff6419	utils/lsa/chunked_managed_vector: fix the calculation of max_chunk_capacity() `chunked_managed_vector` is a vector-like container which splits its contents into multiple contiguous allocations if necessary, in order to fit within LSA's max preferred contiguous allocation limits. Each limited-size chunk is stored in a `managed_vector`. `managed_vector` is unaware of LSA's size limits. It's up to the user of `managed_vector` to pick a size which is small enough. This happens in `chunked_managed_vector::max_chunk_capacity()`. But the calculation is wrong, because it doesn't account for the fact that `managed_vector` has to place some metadata (the backreference pointer) inside the allocation. In effect, the chunks allocated by `chunked_managed_vector` are just a tiny bit larger than the limit, and the limit is violated. Fix this by accounting for the metadata. Also, before the patch `chunked_managed_vector::max_contiguous_allocation`, repeats the definition of logalloc::max_managed_object_size. This is begging for a bug if `logalloc::max_managed_object_size` changes one day. Adjust it so that `chunked_managed_vector` looks directly at `logalloc::max_managed_object_size`, as it means to. Fixes scylladb/scylladb#23854 (cherry picked from commit `7f9152babc`) Closes scylladb/scylladb#24371	2025-06-10 11:25:52 +03:00
Jenkins Promoter	b295ce38ae	Update ScyllaDB version to: 2025.2.0-rc4	2025-06-06 17:03:11 +03:00
Nikos Dragazis	2e50d1a357	sstables: Fix race when loading checksum component `read_checksum()` loads the checksum component from disk and stores a non-owning reference in the shareable components. To avoid loading the same component twice, the function has an early return statement. However, this does not guarantee atomicity - two fibers or threads may load the component and update the shareable components concurrently. This can lead to use-after-free situations when accessing the component through the shareable components, since the reference stored there is non-owning. This can happen when multiple compaction tasks run on the same SSTable (e.g., regular compaction and scrub-validate). Fix this by not updating the reference in shareable components, if a reference is already in place. Instead, create an owning reference to the existing component for the current fiber. This is less efficient than using a mutex, since the component may be loaded multiple times from disk before noticing the race, but no locks are used for any other SSTable component either. Also, this affects uncompressed SSTables, which are not that common. Fixes #23728. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> Closes scylladb/scylladb#23872 (cherry picked from commit `eaa2ce1bb5`) Closes scylladb/scylladb#24358	2025-06-06 08:49:56 +03:00
Szymon Malewski	d65b390780	mapreduce_service: Prevent race condition In parallelized aggregation functions the super-coordinator (the node performing the final merging step) receives and merges each partial result in parallel coroutines (`parallel_for_each`). Usually responses are spread over time and actual merging is atomic. However, sometimes partial results are received at a similar time and if an aggregate function (e.g. lua script) yields, two coroutines can try to overwrite the same accumulator one after another, which leads to losing some of the results. To prevent this, in this patch each coroutine stores merging results in its own context and overwrites the accumulator atomically, only after it was fully merged. Compared to the previous implementation, the order of operands in the merging function is swapped, but the order of aggregation is not guaranteed anyway. Fixes #20662 Closes scylladb/scylladb#24106 (cherry picked from commit `5969809607`) Closes scylladb/scylladb#24389	2025-06-06 08:49:15 +03:00
Anna Stuchlik	4ebae7ae62	doc: add the upgrade guide from 2025.1 to 2025.2 This commit adds the upgrade guide from version 2025.1 to 2025.2. Also, it removes the upgrade guides existing for the previous version that are irrelevant in 2025.2 (upgrade from OSS 6.2 and Enterprise 2024.x). Note that the new guide does not include the "Enable Consistent Topology Updates" page, as users upgrading to 2025.2 have consistent topology updates already enabled. Fixes https://github.com/scylladb/scylladb/issues/24133 Fixes https://github.com/scylladb/scylladb/issues/24265 Closes scylladb/scylladb#24266 (cherry picked from commit `8b989d7fb1`) Closes scylladb/scylladb#24391	2025-06-06 08:48:31 +03:00
Ernest Zaslavsky	4fed3a5a5a	encryption_test: Catch exact exception Apparently `test_kms_network_error` will succeed under any circumstances since most of our exceptions derive from `std::exception`, so whatever happens to the test, for whatever reason it throws, the test will be marked as passed. Start catching the exact exception that we expect to be thrown. Maybe somewhat related to https://github.com/scylladb/scylladb/issues/22628 Fixes: https://github.com/scylladb/scylladb/issues/24145 reapplies reverted: https://github.com/scylladb/scylladb/pull/24065 Should be backported to 2025.2. Closes scylladb/scylladb#24242 (cherry picked from commit `a39b773d36`) Closes scylladb/scylladb#24402	2025-06-06 08:48:02 +03:00
Pavel Emelyanov	5b86b6393a	Merge '[Backport 2025.2] Add ability to skip SSTables cleanup when loading them' from Scylladb[bot] The non-streaming loading of sstables performs cleanup since recently [1]. For vnodes, unfortunately, cleanup is almost unavoidable, because of the nature of vnodes sharding, even if sstable is already clean. This leads to waste of IO and CPU for nothing. Skipping the cleanup in a smart way is possible, but requires too many changes in the code and in the on-disk data. However, the effort will not help existing SSTables and it's going to be obsoleted by tablets some time soon. That said, the easiest way to skip cleanup is the explicit --skip-cleanup option for nodetool and respective skip_cleanup parameter for API handler. New feature, no backport fixes #24136 refs #12422 [1] - (cherry picked from commit `4ab049ac8d`) - (cherry picked from commit `ed3ce0f6af`) - (cherry picked from commit `1b1f653699`) - (cherry picked from commit `c0796244bb`) Parent PR: #24139 Closes scylladb/scylladb#24398 * github.com:scylladb/scylladb: nodetool: Add refresh --skip-cleanup option api: Introduce skip_cleanup query parameter distributed_loader: Don't create owned ranges if skip-cleanup is true code: Push bool skip_cleanup flag around	2025-06-06 08:47:22 +03:00
Pavel Emelyanov	024af57bd5	nodetool: Add refresh --skip-cleanup option The option "conflicts" with load-and-stream. Tests and doc included. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `c0796244bb`)	2025-06-05 17:52:13 +03:00
Pavel Emelyanov	c59327950b	api: Introduce skip_cleanup query parameter Just copy the load_and_stream and primary_replica_only logic, this new option is the same in this sense. Throw if it's specified with the load_and_stream one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `1b1f653699`)	2025-06-05 17:48:35 +03:00
Pavel Emelyanov	a2b2e46482	distributed_loader: Don't create owned ranges if skip-cleanup is true In order to make reshard compaction task run cleanup, the owner-ranges pointer is passed to it. If it's nullptr, the cleanup is not performed. So to do the skip-cleanup, the easiest (but not the most apparent) way is not to initialize the pointer and keep it nullptr. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `ed3ce0f6af`)	2025-06-05 17:44:45 +03:00
Pavel Emelyanov	4a7ddbfe07	code: Push bool skip_cleanup flag around Just put the boolean into the callstack between API and distributed loader to reduce the churn in the next patches. No functional changes, flag is false and unused. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `4ab049ac8d`)	2025-06-05 17:44:40 +03:00
Michał Chojnowski	484fc374c1	compress: fix a use-after-free in `dictionary_holder::get_recommended_dict()` The function calls copy() on a foreign_ptr (stored in a map) which can be destroyed (erased from the map) before the copy() completes. This is illegal. One way to fix this would be to apply an rwlock to the map. Another way is to wrap the `foreign_ptr` in a `lw_shared_ptr` and extend its lifetime over the `copy()` call. This patch does the latter. Fixes scylladb/scylladb#24165 Fixes scylladb/scylladb#24174 Closes scylladb/scylladb#24175 (cherry picked from commit `ea4d251ad2`) Closes scylladb/scylladb#24374	2025-06-05 12:11:22 +03:00
Botond Dénes	a5251b4d44	Merge '[Backport 2025.2] Add --scope arg to `nodetool refresh`' from Scylladb[bot] This PR adds the `--scope` option to `nodetool refresh`. Like in the case of `nodetool restore`, you can pass either of: * `node` - On the local node. * `rack` - On the local rack. * `dc` - In the datacenter (DC) where the local node lives. * `all` (default) - Everywhere across the cluster. as scope. The feature is based on the existing load_and_stream paths, so it requires passing `--load-and-stream` to the `refresh` command, although this might change in the near future. Fixes https://github.com/scylladb/scylladb/issues/23564 - (cherry picked from commit `c570941692`) Parent PR: #23861 Closes scylladb/scylladb#24379 * github.com:scylladb/scylladb: Add nodetool refresh --scope option Refactor out code from test_restore_with_streaming_scopes Refactor out code from test_restore_with_streaming_scopes Refactor out code from test_restore_with_streaming_scopes Refactor out code from test_restore_with_streaming_scopes Refactor out code from test_restore_with_streaming_scopes	2025-06-05 11:54:17 +03:00
Avi Kivity	2afe0695cf	Revert "config: decrease default large allocation warning threshold to 128k" This reverts commit `04fb2c026d`. 2025.2 got the reduced threshold, but won't get most of the fixes the warning will generate, leaving it very noisy. Better to avoid the noise for this release. Fixes #24384.	2025-06-04 14:18:35 +03:00
Robert Bindar	b62264e1d9	Add nodetool refresh --scope option This change adds the --scope option to nodetool refresh. Like in the case of nodetool restore, you can pass either of: * node - On the local node. * rack - On the local rack. * dc - In the datacenter (DC) where the local node lives. * all (default) - Everywhere across the cluster. as scope. The feature is based on the existing load_and_stream paths, so it requires passing --load-and-stream to the refresh command. Also, it is not compatible with the --primary-replica-only option. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#23861 (cherry picked from commit `c570941692`)	2025-06-04 11:59:17 +03:00
Robert Bindar	36cc0f8e7e	Refactor out code from test_restore_with_streaming_scopes part 5: check_data_is_back Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> (cherry picked from commit `548a1ec20a`)	2025-06-04 11:54:07 +03:00
Robert Bindar	a885c87547	Refactor out code from test_restore_with_streaming_scopes part 4: compute_scope Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> (cherry picked from commit `29309ae533`)	2025-06-04 11:54:01 +03:00
Robert Bindar	371fc05943	Refactor out code from test_restore_with_streaming_scopes part 3: create_dataset Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> (cherry picked from commit `a0f0580a9c`)	2025-06-04 11:53:51 +03:00
Robert Bindar	4366cd5a81	Refactor out code from test_restore_with_streaming_scopes part 2: take_snapshot Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> (cherry picked from commit `5171ca385a`)	2025-06-04 11:53:43 +03:00
Robert Bindar	38ee119112	Refactor out code from test_restore_with_streaming_scopes part 1: create_cluster Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> (cherry picked from commit `f09bb20ac4`)	2025-06-04 11:53:32 +03:00
Piotr Dulikowski	6edf92a9e3	Merge '[Backport 2025.2] test/boost: Adjust tests to RF-rack-valid keyspaces' from Scylladb[bot] This PR adjusts existing Boost tests so they respect the invariant introduced by enabling `rf_rack_valid_keyspaces` configuration option. We disable it explicitly in more problematic tests. After that, we enable the option by default in the whole test suite. Fixes scylladb/scylladb#23958 Backport: backporting to 2025.1 to be able to test the implementation there too. - (cherry picked from commit `6e2fb79152`) - (cherry picked from commit `e4e3b9c3a1`) - (cherry picked from commit `1199c68bac`) - (cherry picked from commit `cd615c3ef7`) - (cherry picked from commit `fa62f68a57`) - (cherry picked from commit `22d6c7e702`) - (cherry picked from commit `237638f4d3`) - (cherry picked from commit `c60035cbf6`) Parent PR: scylladb/scylladb#23802 Closes scylladb/scylladb#24368 * github.com:scylladb/scylladb: test/lib/cql_test_env.cc: Enable rf_rack_valid_keyspaces by default test/boost/tablets_test.cc: Explicitly disable rf_rack_valid_keyspaces in problematic tests test/boost/tablets_test.cc: Fix indentation in test_load_balancing_with_random_load test/boost/tablets_test.cc: Adjust test_load_balancing_with_random_load to RF-rack-validity test/boost/tablets_test.cc: Adjust test_load_balancing_works_with_in_progress_transitions to RF-rack-validity test/boost/tablets_test.cc: Adjust test_load_balancing_resize_requests to RF-rack-validity test/boost/tablets_test.cc: Adjust test_load_balancing_with_two_empty_nodes to RF-rack-validity test/boost/tablets_test.cc: Adjust test_load_balancer_shuffle_mode to RF-rack-validity	2025-06-04 10:24:35 +02:00
Nadav Har'El	609ad01bbc	alternator: hide internal tags from users The "tags" mechanism in Alternator is a convenient way to attach metadata to Alternator tables. Recently we have started using it more and more for internal metadata storage: * UpdateTimeToLive stores the attribute in a tag system:ttl_attribute * CreateTable stores provisioned throughput in tags system:provisioned_rcu and system:provisioned_wcu * CreateTable stores the table's creation time in a tag called system:table_creation_time. We do not want any of these internal tags to be visible to a ListTagsOfResource request, because if they are visible (as before this patch), systems such as Terraform can get confused when they suddenly see a tag which they didn't set - and may even attempt to delete it (as reported in issue #24098). Moreover, we don't want any of these internal tags to be writable with TagResource or UntagResource: If a user wants to change the TTL setting they should do it via UpdateTimeToLive - not by writing directly to tags. So in this patch we forbid read or write to any tag that begins with the "system:" prefix, except one: "system:write_isolation". That tag is deliberately intended to be writable by the user, as a configuration mechanism, and is never created internally by Scylla. We should have perhaps chosen a different prefix for configurable vs. internal tags, or chosen more unique prefixes - but let's not change these historic names now. This patch also adds regression tests for the internal tags features, failing before this patch and passing after: 1. internal tags, specifically system:ttl_attribute, are not visible in ListTagsOfResource, and cannot be modified by TagResource or UntagResource. 2. system:write_isolation is not internal, and can be written by either TagResource or UntagResource, and read with ListTagsOfResource. This patch also fixes a bug in the test where we added more checks for system:write_isolation - test_tag_resource_write_isolation_values. 
This test forgot to remove the system:write_isolation tags from test_table when it ended, which would lead to other tests that run later to run with a non-default write isolation - something which we never intended. Fixes #24098. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#24299 (cherry picked from commit `6cbcabd100`) Closes scylladb/scylladb#24377	2025-06-04 09:56:33 +03:00
Avi Kivity	10b7f2d924	pgo: drop Java configuration Since `5e1cf90a51` ("build: replace tools/java submodule with packaged cassandra-stress") we run pre-packaged cassandra-stress. As such, we don't need to look for a Java runtime (which is missing on the frozen toolchain) and can rely on the cassandra-stress package finding its own Java runtime. Fix by just dropping all the Java-finding stuff. Note: Java 11 is in fact present on the frozen toolchain, just not in a way that pgo.py can find it. Fixes #24176. Closes scylladb/scylladb#24178 (cherry picked from commit `29932a5af1`) Closes scylladb/scylladb#24254 scylla-2025.2.0-rc3-candidate-20250604030830 scylla-2025.2.0-rc3	2025-06-03 17:54:28 +03:00
Dawid Mędrek	5130ec84de	test/lib/cql_test_env.cc: Enable rf_rack_valid_keyspaces by default We've adjusted all of the Boost tests so they respect the invariant enforced by the `rf_rack_valid_keyspaces` configuration option, or explicitly disabled the option in those that turned out to be more problematic and will require more attention. Thanks to that, we can now enable it by default in the test suite. (cherry picked from commit `c60035cbf6`)	2025-06-03 11:10:16 +00:00
Dawid Mędrek	9938183ace	test/boost/tablets_test.cc: Explicitly disable rf_rack_valid_keyspaces in problematic tests Some of the tests in the file verify more subtle parts of the behavior of tablets and rely on topology layouts or using keyspaces that violate the invariant the `rf_rack_valid_keyspaces` configuration option is trying to enforce. Because of that, we explicitly disable the option to be able to enable it by default in the rest of the test suite in the following commit. (cherry picked from commit `237638f4d3`)	2025-06-03 11:10:16 +00:00
Dawid Mędrek	1271b42848	test/boost/tablets_test.cc: Fix indentation in test_load_balancing_with_random_load (cherry picked from commit `22d6c7e702`)	2025-06-03 11:10:16 +00:00
Dawid Mędrek	012e248792	test/boost/tablets_test.cc: Adjust test_load_balancing_with_random_load to RF-rack-validity We make sure that the keyspaces created in the test are always RF-rack-valid. To achieve that, we change how the test is performed. Before this commit, we first created a cluster and then ran the actual test logic multiple times. Each of those test cases created a keyspace with a random replication factor. That cannot work with `rf_rack_valid_keyspaces` set to true. We cannot modify the property file of a node (see commit: `eb5b52f598`), so once we set up the cluster, we cannot adjust its layout to work with another replication factor. To solve that issue, we also recreate the cluster in each test case. Now we choose the replication factor at random, create a cluster distributing nodes across as many racks as RF, and perform the rest of the logic. We perform it multiple times in a loop so that the test behaves as before these changes. (cherry picked from commit `fa62f68a57`)	2025-06-03 11:10:16 +00:00


47808 Commits
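
The `max_chunk_capacity()` fix described in commit a539ff6419 comes down to one subtraction: the per-allocation metadata must be taken out of the size budget before dividing by the element size. The following Python sketch illustrates that arithmetic only. It is not the Scylla implementation; the function names and the numeric values (a 128 KiB limit, 8-byte elements, 8 bytes of metadata) are assumptions chosen for illustration.

```python
# Illustrative arithmetic only; the constants below are hypothetical,
# not the values used by logalloc or managed_vector.
MAX_ALLOC_SIZE = 128 * 1024   # assumed contiguous allocation limit, in bytes
ELEM_SIZE = 8                 # assumed size of one stored element, in bytes
METADATA_SIZE = 8             # assumed size of the per-allocation backreference


def allocation_size(capacity: int) -> int:
    """Bytes actually requested for a chunk holding `capacity` elements."""
    return METADATA_SIZE + capacity * ELEM_SIZE


def naive_max_chunk_capacity() -> int:
    """Pre-fix style calculation: ignores the metadata stored in the chunk."""
    return MAX_ALLOC_SIZE // ELEM_SIZE


def fixed_max_chunk_capacity() -> int:
    """Post-fix style calculation: reserves room for the metadata first."""
    return (MAX_ALLOC_SIZE - METADATA_SIZE) // ELEM_SIZE


if __name__ == "__main__":
    for name, cap in (("naive", naive_max_chunk_capacity()),
                      ("fixed", fixed_max_chunk_capacity())):
        size = allocation_size(cap)
        verdict = "within limit" if size <= MAX_ALLOC_SIZE else "exceeds limit"
        print(f"{name}: capacity={cap}, allocation={size} bytes ({verdict})")
```

With these assumed numbers the naive capacity overshoots the limit by exactly the metadata size, which matches the commit's description of chunks being "just a tiny bit larger than the limit".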