scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 04:26:48 +00:00

Author	SHA1	Message	Date
Benny Halevy	4dcb8c19bd	scylla-sstable: correctly dump sharding_metadata This patch fixes two issues in one go. First, sstables::load currently clears the sharding metadata (via open_data()), so scylla-sstable always prints an empty array for it. Second, printing token values would generate invalid JSON, as they are currently printed as binary bytes; they should be printed simply as numbers, as we do elsewhere, for example for the first and last keys. Fixes #26982 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#26991 (cherry picked from commit `f9ce98384a`) Closes scylladb/scylladb#27030	2025-11-16 16:05:14 +02:00
Jenkins Promoter	3818e15d91	Update pgo profiles - aarch64	2025-11-15 05:02:53 +02:00
Jenkins Promoter	a945742c2a	Update pgo profiles - x86_64	2025-11-15 04:02:16 +02:00
Ernest Zaslavsky	e185740c54	minio: update CLI usage, remove deprecated `mc` options Replace phased-out `mc` command options with supported alternatives. Ensures compatibility with the latest MinIO version. Closes scylladb/scylladb#24363 (cherry picked from commit `1446f57635`) Closes scylladb/scylladb#27004	2025-11-14 10:48:00 +02:00
Andrei Chekun	88556e6c77	test.py: rewrite the wait_for_first_completed Rewrite wait_for_first_completed to return only the first completed task, with a guarantee that all cancelled and finished tasks are awaited (so they disappear). Use wait_for_first_completed to avoid falsely passing tests in the future and issues like #26148. Use gather_safely to await tasks and remove the warning that a coroutine was not awaited. Closes scylladb/scylladb#26435 (cherry picked from commit `24d17c3ce5`) Closes scylladb/scylladb#26661	2025-11-12 11:50:51 +01:00
Botond Dénes	bec413a671	service/storage_proxy: send batches with CL=EACH_QUORUM Batches that fail on the initial send are retried later, until they succeed. These retries happen with CL=ALL, regardless of what the original CL of the batch was. This is unnecessarily strict. We tried to follow Cassandra here, but Cassandra has a big caveat in their use of CL=ALL for batches. They accept saving just a hint for any/all of the endpoints, so a batch which was just logged in hints is good enough for them. We do not plan on replicating this usage of hints at this time, so as a middle ground, the CL is changed to EACH_QUORUM. Fixes: scylladb/scylladb#25432 Closes scylladb/scylladb#26304 (cherry picked from commit `d9c3772e20`) Closes scylladb/scylladb#26927	2025-11-11 10:23:59 +03:00
Jenkins Promoter	01e929805a	Update pgo profiles - aarch64	2025-11-01 05:04:05 +02:00
Jenkins Promoter	9f961d67d7	Update pgo profiles - x86_64	2025-11-01 04:31:27 +02:00
Anna Stuchlik	ee35b5aa90	doc: add --list-active-releases to Web Installer Fixes https://github.com/scylladb/scylladb/issues/26688 V2 of https://github.com/scylladb/scylladb/pull/26687 Closes scylladb/scylladb#26689 (cherry picked from commit `bd5b966208`) Closes scylladb/scylladb#26751	2025-10-29 11:42:03 +02:00
Patryk Jędrzejczak	0cd118f5d6	test: test_raft_recovery_stuck: reconnect driver after rolling restarts It turns out that #21477 wasn't sufficient to fix the issue. The driver may still decide to reconnect the connection after `rolling_restart` returns. One possible explanation is that the driver sometimes handles the DOWN notification after all nodes consider each other UP. Reconnecting the driver after restarting nodes seems to be a reliable workaround that many tests use. We also use it here. Fixes #19959 Closes scylladb/scylladb#26638 (cherry picked from commit `5321720853`) Closes scylladb/scylladb#26749	2025-10-29 11:41:23 +02:00
Anna Stuchlik	2019ef899f	doc: add support for Debian 12 Fixes https://github.com/scylladb/scylladb/issues/26640 Closes scylladb/scylladb#26668 (cherry picked from commit `9c0ff7c46b`) Closes scylladb/scylladb#26676	2025-10-29 11:40:50 +02:00
Botond Dénes	2e375ade3a	Merge '[Backport 2025.1] service/qos: set long timeout for auth queries on SL cache update' from Scylladb[bot] pass an appropriate query state for auth queries called from service level cache reload. we use the function qos_query_state to select a query_state based on caller context - for internal queries, we set a very long timeout. the service level cache reload is called from group0 reload. we want it to have a long timeout instead of the default 5 seconds for auth queries, because we don't have strict latency requirement on the one hand, and on the other hand a timeout exception is undesired in the group0 reload logic and can break group0 on the node. Fixes https://github.com/scylladb/scylladb/issues/25290 backport possible to improve stability - (cherry picked from commit `a1161c156f`) - (cherry picked from commit `3c3dd4cf9d`) - (cherry picked from commit `ad1a5b7e42`) Parent PR: #26180 Closes scylladb/scylladb#26475 * github.com:scylladb/scylladb: service/qos: set long timeout for auth queries on SL cache update auth: add query_state parameter to query functions auth: refactor query_all_directly_granted	2025-10-29 11:40:17 +02:00
Asias He	7e82e3b56c	repair: Always reset node ops progress to 100% upon completion Always set the node ops progress to 100% when the operation finishes, regardless of success or failure. This ensures the progress never remains below 100%, which would otherwise indicate a pending node operation in case of an error. Fixes #26193 Closes scylladb/scylladb#26194 (cherry picked from commit `b31e651657`) Closes scylladb/scylladb#26262	2025-10-29 11:39:39 +02:00
Andrei Chekun	c85a0615e0	test.py: fix flaky LDAP tests The issue with the current approach is that the LDAP server starts on localhost, where ports can be busy. This PR migrates to using HostRegistry() instead of localhost, so there are no busy ports. This fix follows the same idea as #23235 on master. A simple backport is not possible due to huge differences between the branches. Additionally, Minio's host is fixed as well, to avoid flakiness. Fixes: #26295 Closes scylladb/scylladb#26518	2025-10-28 13:57:54 +03:00
Piotr Dulikowski	5cb0dc3f2b	Merge '[Backport 2025.1] transport: call update_scheduling_group for non-auth connections' from Andrzej Jackowski This is a backport of the fix for #26040 and the related test (#26589) to 2025.1. Before this change, unauthorized connections stayed in the `main` scheduling group. This is not ideal; in such a case, `sl:default` should be used instead, to behave consistently with the scenario where a user is authenticated but has no service level assigned. This commit adds a call to `update_scheduling_group` at the end of connection creation for an unauthenticated user, to make sure the service level is switched to `sl:default`. Fixes: scylladb/scylladb#26040 Fixes: scylladb/scylladb#26581 (cherry picked from commit `278019c328`) (cherry picked from commit `8642629e8e`) No backport, as it's already a backport (but similar PRs will be created for 2025.2, 2025.3, and 2025.4) Closes scylladb/scylladb#26718 * github.com:scylladb/scylladb: test: add test_anonymous_user to test_raft_service_levels transport: call update_scheduling_group for non-auth connections	2025-10-27 22:21:24 +01:00
Andrzej Jackowski	014ecbb2e0	test: add test_anonymous_user to test_raft_service_levels The primary goal of this test is to reproduce scylladb/scylladb#26040 so the fix (`278019c328`) can be backported to older branches. Scenario: connect via CQL as an anonymous user and verify that the `sl:default` scheduling group is used. Before the fix for #26040 `main` scheduling group was incorrectly used instead of `sl:default`. Control connections may legitimately use `sl:driver`, so the test accepts those occurrences while still asserting that regular anonymous queries use `sl:default`. This adds explicit coverage on master. After scylladb#24411 was implemented, some other tests started to fail when scylladb#26040 was unfixed. However, none of the tests asserted this exact behavior. Refs: scylladb/scylladb#26040 Refs: scylladb/scylladb#26581 Closes scylladb/scylladb#26589 (cherry picked from commit `8642629e8e`)	2025-10-27 10:10:51 +01:00
Andrzej Jackowski	9ddf7feabd	transport: call update_scheduling_group for non-auth connections Before this change, unauthorized connections stayed in the `main` scheduling group. This is not ideal; in such a case, `sl:default` should be used instead, to behave consistently with the scenario where a user is authenticated but has no service level assigned. This commit adds a call to `update_scheduling_group` at the end of connection creation for an unauthenticated user, to make sure the service level is switched to `sl:default`. Fixes: scylladb/scylladb#26040 Fixes: scylladb/scylladb#26581 (cherry picked from commit `278019c328`)	2025-10-27 10:10:12 +01:00
Asias He	691f2740b2	repair: Fix uuid and nodes_down order in the log Fixes #26536 Closes scylladb/scylladb#26547 (cherry picked from commit `33bc1669c4`) Closes scylladb/scylladb#26627	2025-10-22 11:28:10 +03:00
Jenkins Promoter	c4a775be6f	Update ScyllaDB version to: 2025.1.10	2025-10-15 11:29:08 +03:00
Jenkins Promoter	17cf9270b3	Update pgo profiles - aarch64	2025-10-15 05:03:59 +03:00
Jenkins Promoter	779a5b0919	Update pgo profiles - x86_64	2025-10-15 04:32:12 +03:00
Michael Litvak	0a881e8c24	service/qos: set long timeout for auth queries on SL cache update pass an appropriate query state for auth queries called from service level cache reload. we use the function qos_query_state to select a query_state based on caller context - for internal queries, we set a very long timeout. the service level cache reload is called from group0 reload. we want it to have a long timeout instead of the default 5 seconds for auth queries, because we don't have strict latency requirement on the one hand, and on the other hand a timeout exception is undesired in the group0 reload logic and can break group0 on the node. Fixes scylladb/scylladb#25290 (cherry picked from commit `ad1a5b7e42`)	2025-10-09 12:46:53 +00:00
Michael Litvak	28d45bf612	auth: add query_state parameter to query functions add a query_state parameter to several auth functions that execute internal queries. currently the queries use the internal_distributed_query_state() query state, and we maintain this as default, but we want also to be able to pass a query state from the caller. in particular, the auth queries currently use a timeout of 5 seconds, and we will want to set a different timeout when executed in some different context. (cherry picked from commit `3c3dd4cf9d`)	2025-10-09 12:46:53 +00:00
Michael Litvak	874f336b3e	auth: refactor query_all_directly_granted rewrite query_all_directly_granted to use execute_internal instead of query_internal in a style that is more consistent with the rest of the module. This will also be useful for a later change because execute_internal accepts an additional parameter of query_state. (cherry picked from commit `a1161c156f`)	2025-10-09 12:46:53 +00:00
Jenkins Promoter	6c539463bb	Update ScyllaDB version to: 2025.1.9 scylla-2025.1.9-candidate-20251010101320 scylla-2025.1.9	2025-10-09 12:26:37 +03:00
Avi Kivity	67c4d980dd	Revert "Merge 'auth: move passwords::check call to alien thread' from Andrzej Jackowski" This reverts commit `1fd82d32e0`. It causes connection storms to snowball into a node crash via this mechanism: 1. large node suffers mild connection storm 2. password hash requests queue up on alien hash thread 3. incoming hash requests queue faster than the alien thread can retire them. 4. auth latency grows without bounds 5. this encourages the clients to create new connections 6. problem grows Reverting the patch restores the hash stall, but at least prevents node crashes. Fixes #26461 (2025.1) Closes scylladb/scylladb#26462	2025-10-09 11:04:34 +03:00
Botond Dénes	9dd32a02af	Merge '[Backport 2025.1] tools: fix documentation links after change to source-available' from Scylladb[bot] Some tools commands have links to online documentation in their help output. These links were left behind in the source-available change, they still point to the old opensource docs. Furthermore, the links in the scylla-sstable help output always point to the latest stable release's documentation, instead of the appropriate one for the branch the tool was built from. Fix both of these. Fixes: scylladb/scylladb#26320 Broken documentation link fix for the tool help output, needs backport to all live source-available versions. - (cherry picked from commit `5a69838d06`) - (cherry picked from commit `15a4a9936b`) - (cherry picked from commit `fe73c90df9`) Parent PR: #26322 Closes scylladb/scylladb#26386 * github.com:scylladb/scylladb: tools/scylla-sstable: fix doc links release: adjust doc_link() for the post source-available world tools/scylla-nodetool: remove trailing " from doc urls	2025-10-08 06:38:50 +03:00
Botond Dénes	694fb53aad	tools/scylla-sstable: fix doc links The doc links in scylla-sstable help output are static, so they always point to the documentation of the latest stable release, not to the documentation of the release the tool binary is from. On top of that, the links point to old open-source documentation, which is now EOL. Fix both problems: point link at the new source-available documentation pages and make them version aware. (cherry picked from commit `fe73c90df9`)	2025-10-07 10:22:29 +03:00
Botond Dénes	b6a458f9a9	Merge '[Backport 2025.1] scylla-gdb: Fix fair-queue entry printing' from Scylladb[bot] Catching a live entry in the IO queue is a very rare event, so we haven't seen it so far, but the `_ticket` member was removed ~2 years ago and replaced with `_capacity`, which is a plain 64-bit integer. Fixes #26184 The issue is present in 2025.x as well and looks cheap to backport - (cherry picked from commit `8438c59ad3`) Parent PR: #26185 Also includes a backport of #24835, which also applies to 2025.1 and is now crucial. The scylla_io_queues.ticket() method is renamed by this backport, but without #24835 it would be problematic to fix all of its callers. Closes scylladb/scylladb#26260 * github.com:scylladb/scylladb: scylla-gdb: Fix fair-queue entry printing scylla-gdb: Don't show io_queue executing and queued resources	2025-10-06 17:05:35 +03:00
Benny Halevy	10b473ffcb	test_tablets_merge: test_tablet_split_merge_with_many_tables: reduce number of tables in debug mode As the test hits timeouts in debug mode on aarch64. Fixes #26252 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#26303 (cherry picked from commit `b81c6a339b`) Closes scylladb/scylladb#26323	2025-10-06 17:05:00 +03:00
Botond Dénes	b37ddaee90	Merge '[Backport 2025.1] compaction: ensure that all compaction executors are stopped' from Scylladb[bot] Currently, while stopping the compaction_manager, we stop task_manager compaction module and concurrently run compaction_manager::really_do_stop. really_do_stop stops and waits for all task_executors that are kept in compaction_manager::_tasks, but nothing ensures that no more tasks will be added there. Due to leftover tasks, we trigger on_fatal_internal_error. Modify the order of compaction_manager::stop. After the change, we stop compaction tasks in the following order: - abort module abort source; - close module gate in the background; - stop_ongoing_compactions (kept in compaction_manager::_tasks); - wait until module gate is closed. Check module abort source before creating compaction executor and adding it to _tasks. Thanks to the above, we can be sure that: - after module::stop there will be no tasks in _tasks; - compaction_manager::stop aborts all tasks; we don't wait for any whole compaction to finish. Fixes: https://github.com/scylladb/scylladb/issues/25806. Fixes shutdown bug; Needs backports to all version - (cherry picked from commit `17707d0e6b`) - (cherry picked from commit `97c77d7cd5`) Parent PR: #25885 Closes scylladb/scylladb#26222 * github.com:scylladb/scylladb: compaction: move _tasks check compaction: stop compaction module in really_do_stop	2025-10-06 07:04:20 +03:00
Botond Dénes	ee792f7257	release: adjust doc_link() for the post source-available world There is no more separate enterprise product and the doc urls are slightly different. (cherry picked from commit `15a4a9936b`)	2025-10-03 14:27:27 +00:00
Botond Dénes	4f01659eda	tools/scylla-nodetool: remove trailing " from doc urls They are accidental leftover from a previous way of storing command descriptions. (cherry picked from commit `5a69838d06`)	2025-10-03 14:27:27 +00:00
Piotr Smaron	54bfbb6303	[2025.1] test/ldap: assign non-busy ports to ldap It may happen that the ports we randomly choose for LDAP are busy, and that'd fail the test suite, so once we randomly select ports, now we'll see if they're busy or not, and if they're busy, we'll select next ones, until we finally have some free ports for LDAP. Tested with: `./test.py ldap/ldap_connection_test --repeat 1000 -j 10`: before the fix, this command fails after ~112 runs, and of course it passes with the fix. This is a backport of https://github.com/scylladb/scylladb/pull/23275, but not 1:1, because the patch no longer applies, since the originally modified files no longer exist on this branch. Fixes: scylladb/scylla-enterprise#5120 Fixes: #23149 Fixes: #23242 Fixes: scylladb/scylladb#26295 (cherry picked from commit `d365d9b2ad`) Closes scylladb/scylladb#26310 scylla-2025.1.8-candidate-20251005035347	2025-10-02 06:10:26 +03:00
Jenkins Promoter	ac33223701	Update pgo profiles - aarch64	2025-10-01 04:58:10 +03:00
Jenkins Promoter	eef3cc2baf	Update pgo profiles - x86_64	2025-10-01 04:35:21 +03:00
Tomasz Grabiec	e71c8a40a9	Merge '[Backport 2025.1] replica: Fix split compaction when tablet boundaries change' from Scylladb[bot] Consider the following: 1) balancer emits split decision 2) split compaction starts 3) split decision is revoked 4) emits merge decision 5) completes merge, before compaction in step 2 finishes After last step, split compaction initiated in step 2 can fail because it works with the global tablet map, rather than the map when the compaction started. With the global state changing under its feet, on merge, the mutation splitting writer will think it's going backwards since sibling tablets are merged. This problem was also seen when running load-and-stream, where split initiated by the sstable writer failed, split completed, and the unsplit sstable is left in the table dir, causing problems in the restart. To fix this, let's make split compaction always work with the state when it started, not a global state. Fixes #24153. All 2025.* versions are vulnerable, so fix must be backported to them. - (cherry picked from commit `0c1587473c`) - (cherry picked from commit `68f23d54d8`) Parent PR: #25690 Closes scylladb/scylladb#25933 * github.com:scylladb/scylladb: replica: Fix split compaction when tablet boundaries change replica: Futurize split_compaction_options() test: fix flakiness of test_missing_data	2025-09-30 19:45:45 +02:00
Pavel Emelyanov	75918dda8d	scylla-gdb: Fix fair-queue entry printing Catching a live entry in the IO queue is a very rare event, so we haven't seen it so far, but the `_ticket` member was removed ~2 years ago and replaced with `_capacity`, which is a plain 64-bit integer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#26185 (cherry picked from commit `8438c59ad3`)	2025-09-30 11:23:38 +03:00
Pavel Emelyanov	67d0f0b754	scylla-gdb: Don't show io_queue executing and queued resources These counters are no longer accounted by io-queue code and are always zero. Even more -- accounting removal happened years ago and we don't have Scylla versions built with seastar older than that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#24835	2025-09-30 11:23:38 +03:00
Gleb Natapov	1787e9aaa3	storage_service: change node_ops_info::ignore_nodes to host id This drops the useless translation from id to ip during removenode through the topology coordinator. Closes scylladb/scylladb#25958 (cherry picked from commit `d3badf7406`) Closes scylladb/scylladb#26228	2025-09-26 10:59:15 +02:00
Aleksandra Martyniuk	f322369f07	compaction: move _tasks check In compaction_manager::really_do_stop we check whether _tasks list is empty after the compactions are stopped. However, a new task may still sneak in, causing the assertion failure. Such a task won't be there for long - module::make_task will fail as the module is already stopped. Move the assertion, that checks if _tasks is empty, after the compaction_states' gates are closed. Fixes: #25806. (cherry picked from commit `97c77d7cd5`)	2025-09-25 16:18:46 +02:00
Aleksandra Martyniuk	9c5ed9586d	compaction: stop compaction module in really_do_stop Currently, compaction::task_manager_module is stopped in compaction_manager::stop, concurrently to really_do_stop. We can't predict the order of the two. Do not set _task_manager_module to nullptr at stop, because compaction_manager::really_do_stop() may be called before the actual shutdown, while other components still try to use it. compaction::task_manager_module does not keep a pointer to compaction_manager, so we won't end up with memory leak. Stop compaction module in really_do_stop, after ongoing compactions are stopped. It's a preparation for further patches. (cherry picked from commit `17707d0e6b`)	2025-09-25 16:18:45 +02:00
Ferenc Szili	38aa7be6a1	load_balancer: fix std::out_of_bounds when decommissioning with empty nodes Consider the following: The tablet load balancer is working on: - node1: an empty node (no tablets) with a large disk capacity - node2: an empty node (no tablets) with a lower disk capacity than node1 - node3: is being decommissioned and contains tablet replicas In load_balancer::make_internode_plan() the initial destination node/shard is selected like this: // Pick best target shard. auto dst = global_shard_id {target, _load_sketch->get_least_loaded_shard(target)}; load_sketch::get_least_loaded_shard(host_id) calls ensure_node() which adds the host to load_sketch's internal hash maps in case the node was not yet seen by load_sketch. Let's assume dst is a shard on node1. Later in load_balancer::make_internode_plan() we will call pick_candidate() to try to find a better destination node than the initial one: // May choose a different source shard than src.shard or different destination host/shard than dst. auto candidate = co_await pick_candidate(nodes, src_node_info, target_info, src, dst, nodes_by_load_dst, drain_skipped); auto source_tablets = candidate.tablets; src = candidate.src; dst = candidate.dst; If pick_candidate() selects some other empty destination (due to larger capacity: node1) node, and that node has not yet been seen by load_sketch (because it was empty), a subsequent call to load_sketch::pick() will search for the node using std::unordered_map::at(), and because the node is not found it will throw a std::out_of_range exception, crashing the load balancer. This problem is fixed by changing load_sketch::populate() to initialize its internal maps with all the nodes which populate()'s arguments filter for. Fixes: #26203 Closes scylladb/scylladb#26207 (cherry picked from commit `c6c9c316a7`) Closes scylladb/scylladb#26238	2025-09-25 09:51:23 +03:00
Ferenc Szili	142156a808	docs: add capacity based balancing explanation Capacity based balancing was introduced in 2025.1. It computes balance based on a node's capacity: the number of tablets located on a node should be directly proportional to that node's storage capacity. This change adds this explanation to the docs. Fixes: #25686 Closes scylladb/scylladb#25687 (cherry picked from commit `de5dab8429`) Closes scylladb/scylladb#26105	2025-09-25 09:50:41 +03:00
Raphael S. Carvalho	ca1974da91	replica: Fix split compaction when tablet boundaries change Consider the following: 1) balancer emits split decision 2) split compaction starts 3) split decision is revoked 4) emits merge decision 5) completes merge, before compaction in step 2 finishes After last step, split compaction initiated in step 2 can fail because it works with the global tablet map, rather than the map when the compaction started. With the global state changing under its feet, on merge, the mutation splitting writer will think it's going backwards since sibling tablets are merged. This problem was also seen when running load-and-stream, where split initiated by the sstable writer failed, split completed, and the unsplit sstable is left in the table dir, causing problems in the restart. To fix this, let's make split compaction always work with the state when it started, not a global state. Fixes #24153. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `68f23d54d8`)	2025-09-24 20:27:11 -03:00
Raphael S. Carvalho	b4e6795bd5	replica: Futurize split_compaction_options() Preparation for the fix of #24153. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `0c1587473c`)	2025-09-24 19:28:07 -03:00
Dawid Mędrek	ee800b9682	db/batchlog: Drop batch if table has been dropped If there are pending mutations in the batchlog for a table that has been dropped, we'll keep attempting to replay them but with no success -- `db::no_such_column_family` exceptions will be thrown, and we'll keep trying again and again. To prevent that, we drop the batch in that case just like we do in the case of a non-existing keyspace. A reproducer test has been included in the commit. It fails without the changes in `db/batchlog_manager.cc`, and it succeeds with them. Fixes scylladb/scylladb#24806 Closes scylladb/scylladb#26057 (cherry picked from commit `35f7d2aec6`) Closes scylladb/scylladb#26198	2025-09-24 09:51:29 +03:00
Dawid Mędrek	d6c59c5224	test/perf/tablet_load_balancing.cc: Create nodes within one DC In `789a4a1ce7`, we adjusted the test file to work with the configuration option `rf_rack_valid_keyspaces`. Part of the commit was making the two tables used in the test replicate in separate data centers. Unfortunately, that destroyed the point of the test because the tables no longer competed for resources. We fix that by enforcing the same replication factor for both tables. We still accept different values of replication factor when provided manually by the user (by `--rf1` and `--rf2` commandline options). Scylla won't allow for creating RF-rack-invalid keyspaces, but there's no reason to take away the flexibility the user of the test already has. Fixes scylladb/scylladb#26026 Closes scylladb/scylladb#26115 (cherry picked from commit `0d2560c07f`) Closes scylladb/scylladb#26170	2025-09-24 09:50:39 +03:00
Pavel Emelyanov	4431dd158f	Merge '[Backport 2025.1] compaction/scrub: register sstables for compaction before validation' from Scylladb[bot] compaction/scrub: register sstables for compaction before validation When `scrub --validate` runs, it collects all candidate sstables at the start and validates them one by one in separate compaction tasks. However, scrub in validate mode does not register these sstables for compaction, which allows regular compaction to pick them up and potentially compact them away before validation begins. This leads to scrub failures because the sstables can no longer be found. This patch fixes the issue by first disabling compaction, collecting the sstables, and then registering them for compaction before starting validation. This ensures that the enqueued sstables remain available for the entire duration of the scrub validation task. Fixes #23363 This reported scrub failure occurs on all versions that have the checksum/digest validation feature for uncompressed sstables. So, backport it to older versions. - (cherry picked from commit `84f2e99c05`) - (cherry picked from commit `7cdda510ee`) Parent PR: #26034 Closes scylladb/scylladb#26097 * github.com:scylladb/scylladb: compaction/scrub: register sstables for compaction before validation compaction/scrub: handle exceptions when moving invalid sstables to quarantine	2025-09-24 09:50:05 +03:00
Nadav Har'El	b7f1497efd	alternator: fix bug in combination of AttributeUpdates + ReturnValues In test/alternator/test_returnvalues.py we had tests for the ReturnValues feature on UpdateItem requests - but we only tested UpdateItem requests with the "modern" UpdateExpression, and forgot to test the combination of ReturnValues with the old AttributeUpdates API. It turns out this combination is buggy: when both ReturnValues=ALL_OLD and AttributeUpdates need the previous value of the item, we may wrongly std::move() the value out, and the operation will fail with a strange error: An error occurred (ValidationException) when calling the UpdateItem operation: JSON assert failed on condition 'IsObject()' The fix in this patch is trivial - just move the std::move() to the correct place, after both UpdateExpression and AttributeUpdates handling is done. This patch also includes a reproducing test, which fails before this patch and passes with it - and of course passes on DynamoDB. This test reproduces two cases where the bug happened, as well as one case where it didn't (to make sure we don't regress in what already worked). Fixes #25894 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#25900 (cherry picked from commit `3c0032deb4`) Closes scylladb/scylladb#26094	2025-09-24 09:48:49 +03:00
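
Commit 88556e6c77 above describes wait_for_first_completed as returning only the first completed task while making sure every cancelled or finished task is awaited. The following is a minimal asyncio sketch of that idea, not the code from test.py: the signature, the error handling, and the choice of which task wins when several finish together are all assumptions made for illustration.

```python
import asyncio
from typing import Any, Awaitable, Iterable


async def wait_for_first_completed(aws: Iterable[Awaitable[Any]]) -> Any:
    """Hypothetical sketch: return the result of the first awaitable to finish,
    then cancel and await all the others so that no task is left pending."""
    tasks = [asyncio.ensure_future(aw) for aw in aws]
    if not tasks:
        raise ValueError("at least one awaitable is required")

    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)

    # Cancel the losers and await them, so no "was never awaited" or
    # "task was destroyed but it is pending" warnings are emitted.
    for task in pending:
        task.cancel()
    if pending:
        await asyncio.gather(*pending, return_exceptions=True)

    # If several tasks finished in the same loop iteration, pick one arbitrarily
    # and mark the exceptions of the rest as retrieved.
    first = next(iter(done))
    for task in done:
        if task is not first and not task.cancelled():
            task.exception()

    # Re-raises the exception of the first task if it failed.
    return first.result()
```

Awaiting the cancelled tasks before returning is the point of the sketch: a caller that only inspects the winner would otherwise leave the remaining coroutines running in the background.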


47096 Commits
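
Commit 54bfbb6303 in the log above describes picking random ports for LDAP, checking whether they are busy, and trying others until free ones are found. A minimal Python sketch of that retry loop follows; the function names, the port range, and the attempt limit are hypothetical and are not taken from the test suite.

```python
import random
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def pick_free_port(host: str = "127.0.0.1", low: int = 10000,
                   high: int = 60000, attempts: int = 100) -> int:
    """Hypothetical helper: try random ports until one is not busy."""
    for _ in range(attempts):
        port = random.randint(low, high)
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"no free port found in {attempts} attempts")
```

A check like this is inherently racy, because another process can take the port between the probe and the actual server start. That is one reason a dedicated host per test, as in the HostRegistry() approach mentioned in c85a0615e0, can be more robust than probing ports on localhost.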