Refactor resource_gather.py to not create the initial cgroup when the process is already in it. This avoids nesting deeper and re-creating the same cgroup with each test.py execution while the terminal stays open.
Add creation of our own event loop in case one doesn't exist. This is needed to be able to work
both with test.py, which creates a loop, and with pytest, which doesn't.
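A minimal sketch of the loop-creation fallback, with a hypothetical helper name:
```python
import asyncio

def get_or_create_event_loop() -> asyncio.AbstractEventLoop:
    """Reuse the loop test.py already created, or make one under pytest."""
    try:
        return asyncio.get_running_loop()
    except RuntimeError:  # no loop is running in this thread
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop
```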
The query may also fail with a no_such_keyspace
exception, which generates the following CQL error:
```
Error from server: code=2200 [Invalid query] message="Can\'t find a keyspace test_1745198244144_qoohq"
```
Extend the pytest.raises match expression to include
this error as well.
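A hedged sketch of the change (the fixture names and the pre-existing half of the pattern are assumptions, not the actual cqlpy code); since `match` is a regular expression, an alternation accepts both errors:
```python
import pytest
from cassandra import InvalidRequest  # raised for CQL error code=2200

def test_query_after_keyspace_dropped(cql, keyspace, table):
    with pytest.raises(InvalidRequest, match="does not exist|find a keyspace"):
        cql.execute(f"SELECT * FROM {keyspace}.{table}")
```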
Fixes #23812
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes scylladb/scylladb#23875
Refactor our CMake flag handling to make it more flexible and reduce
repetition:
- Rename update_cxx_flags() to update_build_flags() to better reflect
its expanded purpose
- Generate CMake variable names internally based on configuration type
instead of requiring callers to specify full variable names
- Follow CMake's standard naming conventions for configuration-specific
flags, see
https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html#variable:CMAKE_%3CLANG%3E_FLAGS
- Prepare groundwork for handling linker flags in addition to compiler
flags in future changes
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#23842
Both test.py and test/cqlpy/run run many test functions against the same
Scylla process. In the resulting log file, it is hard to understand which
log messages are related to which test. In this patch, we log a message
(using the "/system/log" REST API) every time a test is started or ends.
The messages look like this:
INFO 2025-04-22 15:10:44,625 [shard 1:strm] api - /system/log:
test/cqlpy: Starting test_lwt.py::test_lwt_missing_row_with_static
...
INFO 2025-04-22 15:10:44,631 [shard 0:strm] api - /system/log:
test/cqlpy: Ended test_lwt.py::test_lwt_missing_row_with_static
We already had a similar feature in test/alternator, added three years
ago in commit b0371b6bf8. The implementation
is similar but not identical due to different available utility functions,
and in any case it's very simple.
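The mechanism can be sketched as a small autouse fixture (names here are illustrative, not the actual cqlpy code), assuming the REST API accepts the message as a query parameter:
```python
import pytest
import requests

@pytest.fixture(autouse=True)
def log_test_boundaries(request, rest_api_url):
    def log(message):
        requests.post(f"{rest_api_url}/system/log",
                      params={"message": message, "level": "info"},
                      timeout=1)
    log(f"test/cqlpy: Starting {request.node.nodeid}")
    yield
    log(f"test/cqlpy: Ended {request.node.nodeid}")
```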
While at it, this patch also fixes the has_rest_api() to timeout after
one second. Without this, if the REST API is blocked in a way that
a connection attempt just hangs, the tests can hang. With the new
timeout, the test will hang for a second, realize the REST API is
not available, and remember this decision (the next tests will not
wait one second again). We had the same bug in Alternator, and fixed
it in 758f8f01d7. This one second "pause"
will only happen if the REST API port is blocked - in the more typical
case the REST API port is just not listening but not blocked, and the
failure will be noticed immediately and won't wait a whole second.
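A minimal sketch of the idea, assuming a probe against some cheap REST endpoint (the path and helper name are illustrative):
```python
import functools
import requests

@functools.cache  # remember the answer so later tests skip the probe
def has_rest_api(rest_api_url: str) -> bool:
    try:
        # a blocked port now fails after one second instead of hanging
        requests.get(f"{rest_api_url}/", timeout=1)
        return True
    except requests.RequestException:
        return False
```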
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#23857
The test populates a table with 50k rows, creates a view on that table
and then compares the time spent in streaming vs. gossip scheduling
groups. It only takes 10s in dev mode on my machine, but is much slower
in debug mode in CI - building the view doesn't finish within 2 minutes.
The bigger the view to build, the more accurate the measurement;
moreover, the test scenario isn't interesting enough to be worth running
it in debug mode as this should be covered by other tests. Therefore,
just skip this test in debug mode.
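Sketched below, assuming the suite's skip_mode marker (the test name and reason are illustrative):
```python
import pytest

@pytest.mark.skip_mode('debug', 'view building is too slow in debug mode')
async def test_view_building_scheduling_groups(manager):
    ...
```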
Fixes: scylladb/scylladb#23862
Closes scylladb/scylladb#23866
This PR contains changes that do not add new functionality, only small refactorings of the existing code.
The most significant change, though, is switching the SQLite writer from a singleton to a thread-locking mechanism that will be needed later on.
This PR is an extraction of several commits from https://github.com/scylladb/scylladb/pull/22894, per a reviewer [request](https://github.com/scylladb/scylladb/pull/22894#pullrequestreview-2778582278).
Closes scylladb/scylladb#23867
* github.com:scylladb/scylladb:
test.py: move the readme file for LDAP tests to the correct location
test.py: eliminate deprecation warning for xml.etree.ElementTree.Element
test.py: align the behavior of max-failures parameter with pytest maxfail
test.py: fix typo in toxiproxy name parameter
test.py: add locking to the sqlite writer for resource gather
test.py: add sqlite datetime adapter for resource gather
test.py: change the parameter for get_modes_to_run()
This will allow transferring the existing max-failures values to
pytest without any modification. As a downside, test.py's logic for
handling these failures changes slightly.
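A sketch of the forwarding, under the assumption that test.py simply appends a pytest option (names are illustrative):
```python
def maxfail_args(max_failures: int) -> list[str]:
    # pytest stops after N failures when given --maxfail=N, so the
    # test.py value can be passed through unmodified
    return [f"--maxfail={max_failures}"] if max_failures else []
```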
SQLite locks the DB during writes, so it's not possible to write from
several threads at once. To be able to gather metrics in several threads, we
need a locking mechanism so that a thread will not try to
write metrics while another thread is performing a write.
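A minimal sketch of the locking idea (class and method names are assumptions):
```python
import sqlite3
import threading

class MetricsWriter:
    def __init__(self, path: str):
        self._lock = threading.Lock()
        # check_same_thread=False allows use from several threads,
        # which is safe here because every write takes the lock
        self._conn = sqlite3.connect(path, check_same_thread=False)

    def write(self, query: str, params: tuple) -> None:
        with self._lock:  # one writer at a time
            self._conn.execute(query, params)
            self._conn.commit()
```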
Change the parameter for get_modes_to_run() from session to config to
narrow the scope, and to prepare it for later use in methods that do not
have access to the session but do have access to the config object.
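Illustrative signatures only (the option names are assumptions); the point is that callers now need only the config object, not the whole pytest session:
```python
def get_modes_to_run(config) -> list[str]:
    modes = config.getoption("modes")
    return modes if modes else config.getoption("all_modes")
```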
This PR introduces a cleanup mechanism in s3_tests to remove uploaded objects after the test completes, ensuring a clean testing environment. Additionally, the recently added test has been refactored and split into smaller, more maintainable parts, improving readability and extending its coverage to include the "proxied" case.
As these changes primarily improve code aesthetics and maintainability, backporting is not necessary.
Refs: https://github.com/scylladb/scylladb/issues/23830
Closes scylladb/scylladb#23828
* github.com:scylladb/scylladb:
s3_tests: Improve and extend copy object test coverage
s3_tests: Implement post-test cleanup for uploaded objects
The helper in question is used in several different ways -- by handlers directly (most of the callers), as part of the wrap_ks_cf() helper, and by one of its overloads that unpacks the "cf" query parameter from the request. This PR generalizes most of the described callers, thus reducing the number of differently-looking ways API handlers parse the "keyspace" and "cf" request parameters.
Continuation of #22742
Closes scylladb/scylladb#23368
* github.com:scylladb/scylladb:
api: Squash two parse_table_infos into one
api: Generalize keyspaces:tables parsing a little bit more
api: Provide general pair<keyspace, vector<table>> parsing
api: Remove ks_cf_func and related code
The test/scylla_gdb suite needs Scylla to have been built with debug
symbols - which is NOT the case for the dev build. So the script
test/scylla_gdb/run attempts to recognize when a developer runs it
on an executable with the debug symbols missing - and prints a clear error.
Unfortunately, as we noticed in #10863, and again in #23832, because
wasmtime is compiled with debug symbols and linked with Scylla,
build/dev/scylla "pretends" to have debug symbols, foiling the check
in test/scylla_gdb/run. Reviewers rejected two solutions to this problem
(pull requests #10865 and #10923), so in pull request #10937 I added
a cosmetic solution just for test/scylla_gdb: in test/scylla_gdb/conftest.py
we check that there are **really** debug symbols that interest us,
and if not, exit immediately instead of failing each test separately.
For some reason, the sys.exit() we used is no longer effective - it
no longer exits pytest, so in this patch we use pytest.exit() instead.
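A sketch of the conftest.py logic (the symbol check is a hypothetical stand-in for the real one):
```python
import pytest

def have_debug_symbols() -> bool:
    """Hypothetical stand-in for the real symbol check."""
    return False  # placeholder

def pytest_configure(config):
    # pytest.exit() aborts the entire session, unlike the earlier
    # sys.exit(), whose SystemExit no longer stopped pytest here
    if not have_debug_symbols():
        pytest.exit("Scylla executable lacks debug symbols", returncode=1)
```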
Fixes #23832 (sort of, we leave build/dev/scylla with the fake claim
that it has debug symbols, but test/scylla_gdb will handle this
situation more gracefully).
Closes scylladb/scylladb#23834
The test `test_mv_write_to_dead_node` currently uses a timeout of 60
seconds for remove_node, after it was increased from 30 seconds to fix
scylladb/scylladb#22953. Apparently it is still too low, and it was
observed to fail in debug mode.
Normally remove_node uses a default timeout of TOPOLOGY_TIMEOUT = 1000
seconds, but the test requires a timeout which is shorter than 5
minutes, because it is a regression test for an issue where MV updates
hold topology changes for more than 5 minutes, and we want to verify in
the test that the topology change completes in less than 5 minutes.
To resolve the issue, we set the test to skip in debug mode, because the
remove node operation is unpredictably slow, and we increase the timeout
to 180 seconds which is hopefully enough time for remove_node in
non-debug modes, and still sufficient to satisfy the test requirements.
Fixes scylladb/scylladb#22530
Closes scylladb/scylladb#23833
Refactored the copy object test to enhance readability and maintainability.
The test was simplified and split into smaller, more focused parts.
Additionally, a "proxied" variant of the test was introduced to expand
coverage.
Fixes the following scenario:
1. Scale out adds new nodes to each rack
2. Table is created - all tablets are allocated to new nodes because they have low load
3. Rebalancing moves tablets from old nodes to new nodes - table balance for the new table is not fixed
We're wrong to try to equalize global load when allocating tablets,
and we should equalize per-table load instead, and let background load
balancing fix it in a fair way. It will add to the allocated storage
imbalance, but:
1. The table is initially empty, so doesn't impact actual storage imbalance.
2. It's more important to avoid overloading CPU on the nodes - imbalance hurts this aspect immediately.
3. If the table was created before imbalance was formed, we would end up in the same situation as in the problematic scenario after the patch.
4. It's the job of load balancing to keep up with storage growth, and if it can't, scale-out should kick in.
Until we have CPU-aware tablet allocation, and can thus prove we have
CPU capacity on the small nodes, we should respect per-table balance,
as this is the way in which we achieve full CPU utilization.
Fixes #23631
Backport to 2025.1 because load imbalance is a serious problem in production.
Closes scylladb/scylladb#23708
* github.com:scylladb/scylladb:
tablets: Equalize per-table balance when allocating tablets for a new table
load_sketch: Tolerate missing tablet_map when selecting for a given table
tests: tablets: Simplify tests by moving common code to topology_builder
Changing DC or rack on a node which was already bootstrapped is, in
case of vnodes, very unsafe (almost guaranteed to cause data loss or
unavailability), and is outright not supported if the cluster has
tablet-backed keyspaces. Moreover, the possibility of doing that
makes it impossible to uphold some of the invariants promised by
the RF-rack-valid flag, which is eventually going to become
unconditionally enabled.
Get rid of the above problems by removing the possibility of changing
the DC / rack of a node. A node will now fail to start if its snitch
reports a different DC or rack than the one that was reported during the
first boot.
Fixes: scylladb/scylladb#23278
Fixes: scylladb/scylladb#22869
Marking for backport to 2025.1, as this is a necessary part of the RF-rack-valid saga
Closes scylladb/scylladb#23800
* github.com:scylladb/scylladb:
doc: changing topology when changing snitches is no longer supported
test: cluster: introduce test_no_dc_rack_change
storage_service: don't update DC/rack in update_topology_with_local_metadata
main: make dc and rack immutable after bootstrap
test: cluster: remove test_snitch_change
There are a few reasons we may want NOT to use caching of pip deps:
1. When building a container, unless we specifically clean it up, it'll remain, even when we squash the image layers later.
2. When building a container, that cache is not useful, as we squash our containers later (so that layer is not cached really). And our CI cleans up the layers repo anyway.
3. Caching sometimes isn't great, and doesn't ensure we pick up the exact version (or latest) that we wish to...
This PR changes two locations in Scylla, both of which (also) build containers, so certainly relevant for 1, 2 above and possibly 3.
No real need to backport.
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Closes scylladb/scylladb#23822
The scylla_gdb tests verify, as a sanity check, that the executable
was built with debug information. They do so via file(1).
In Fedora 42, file(1) crashes on ELF files that have interpreter pathnames
larger than 128 characters[1]. This was later fixed[2], but the fix is not
in any release.
Work around the problem by using objdump instead of file.
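A sketch of the workaround (the exact invocation is an assumption): objdump lists section headers, and a .debug_info section indicates debug information, while being immune to the file(1) crash:
```python
import subprocess

def built_with_debug_info(executable: str) -> bool:
    out = subprocess.run(["objdump", "-h", executable],
                         capture_output=True, text=True, check=True).stdout
    return ".debug_info" in out
```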
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2354970
[2] b3384a1fbf
Closes scylladb/scylladb#23823
Currently, log files have information about the run_id twice:
cluster.object_store_test_backup.10.test_abort_restore_with_rpc_error.dev.10_cluster.log
However, sometimes the first run_id can be incorrect:
cluster.object_store_test_backup.1.test_abort_restore_with_rpc_error.dev.10_cluster.log
Remove the first run_id from the name, to avoid this issue and because
it's actually redundant.
Remove the creation of an empty file for the Scylla Manager log, since it is
redundant and was based on an incorrect assumption about the root cause of the failure.
Add an extension to the stacktrace file, so Jenkins will open it in a
new browser tab instead of downloading it.
Fixes: https://github.com/scylladb/scylladb/issues/23731
Closes scylladb/scylladb#23797
Update the regular expression in `check_node_log_for_failed_mutations` to avoid
false test failures when DEBUG-level logging is enabled.
Fixes scylladb/scylladb#23688
Closes scylladb/scylladb#23658
Fixes #23774
Test code falls into the same when_all issue as the http client did.
Avoid passing exceptions through it; instead, catch and report them
in the worker lambda.
Closes scylladb/scylladb#23778
Implement the CopyObject API to directly copy an S3 object from one location to another. This implementation incurs zero networking overhead on the client side, since the object is copied internally by the S3 machinery.
Usage example: backup of tiered SSTables - you already have SSTables on S3, so CopyObject is the ideal way to go.
No need to backport since we are adding new functionality for future use.
Closes scylladb/scylladb#23779
* github.com:scylladb/scylladb:
s3_client: implement S3 copy object
s3_client: improve exception message
s3_client: reposition local function for future use
This PR enhances S3 throughput by leveraging every available shard to upload backup files concurrently. By distributing the load across multiple shards, we significantly improve the upload performance. Each shard retrieves an SSTable and processes its files sequentially, ensuring efficient, file-by-file uploads.
To prevent uncontrolled fiber creation and potential resource exhaustion, the backup task employs a directory semaphore from the sstables_manager. This mechanism helps regulate concurrency at the directory level, ensuring stable and predictable performance during large-scale backup operations.
Refs #22460
Fixes: #22520
```
===========================================
Release build, master, smp-16, mem-32GiB
Bytes: 2342880184, backup time: 9.51 s
===========================================
Release build, this PR, smp-16, mem-32GiB
Bytes: 2342891015, backup time: 1.23 s
===========================================
```
Looks like it is at least 7.7x faster.
No backport needed since it (native backup) is still unused functionality
Closes scylladb/scylladb#23727
* github.com:scylladb/scylladb:
backup: Add test for invalid endpoint
backup_task: upload on all shards
backup_task: integrate sharded storage manager for upload
Commit 14bf09f447 added a single-chunk layout to `managed_bytes`, which makes the overhead of `managed_bytes` smaller in the common case of a small buffer.
But there was a bug in it. In the copy constructor of `managed_bytes`, a copy of a single-chunk `managed_bytes` is made single-chunk too.
But this is wrong, because the source of the copy and the target of the copy might have different preferred max contiguous allocation sizes.
In particular, if a `managed_bytes` of size between 13 kiB and 128 kiB is copied from the standard allocator into LSA, the resulting `managed_bytes` is a single chunk which violates LSA's preferred allocation size. (And therefore is placed by LSA in the standard allocator).
In other words, since Scylla 6.0, cache and memtable cells between 13 kiB and 128 kiB are getting allocated in the standard allocator rather than inside LSA segments.
Consequences of the bug:
1. Effective memory consumption of an affected cell is rounded up to the nearest power of 2.
2. With a pathological-enough allocation pattern (for example, one which somehow ends up placing a single 16 kiB memtable-owned allocation in every aligned 128 kiB span), memtable flushing could theoretically deadlock, because the allocator might be too fragmented to let the memtable grow by another 128 kiB segment, while keeping the sum of all allocations small enough to avoid triggering a flush. (Such an allocation pattern probably wouldn't happen in practice though).
3. It triggers a bug in reclaim which results in spurious allocation failures despite ample evictable memory.
There is a path in the reclaimer procedure where we check whether reclamation succeeded by checking that the number of free LSA segments grew.
But in the presence of evictable non-LSA allocations, this is wrong because the reclaim might have met its target by evicting the non-LSA allocations, in which case memory is returned directly to the standard allocator, rather than to the pool of free segments.
If that happens, the reclaimer wrongly returns `reclaimed_nothing` to Seastar, which fails the allocation.
Refs (possibly fixes) https://github.com/scylladb/scylladb/issues/21072
Fixes https://github.com/scylladb/scylladb/issues/22941
Fixes https://github.com/scylladb/scylladb/issues/22389
Fixes https://github.com/scylladb/scylladb/issues/23781
This is a regression fix, should be backported to all affected releases.
Closes scylladb/scylladb#23782
* github.com:scylladb/scylladb:
managed_bytes_test: add a reproducer for #23781
managed_bytes: in the copy constructor, respect the target preferred allocation size
There are two tests which test incremental read repair: one with row tombstones, the other with partition tombstones. The tests currently force vnodes by creating the test keyspace with {'enabled': false}. Even so, the tests were found to be flaky, so one of them is marked for skip. This commit makes the following changes:
* Make the tests use tablets by creating the test keyspace with tablets.
* Change the way the tests write data so it works with tablets: currently the tests use scylla-sstable write + upload but this won't work with tablets since upload with tablets implies --load-and-stream which means data is streamed to all replicas (no difference created between nodes). Switch to the classic stop-node + write to other replica with CL=ONE.
* Remove the skip added to the partition-tombstone test variant.
Fixes: #21179
Test improvement, no backport required.
Closes scylladb/scylladb#23167
* github.com:scylladb/scylladb:
wip
test/cluster/test_read_repair: make incremental test work with tablets
Update the "How to Switch Snitches" document to indicate that changing
topology (i.e. changing node's DC or rack) while changing the snitch is
no longer supported.
Remove a note which said that switching snitches is not supported with
tablets. It was introduced because of the concern that switching a
snitch might change the DC or rack of the node, for which our current tablet
load balancer is completely unprepared. Now that changing DC/rack is
forbidden, there doesn't seem to be anything related to snitches which
could cause trouble for tablets.
The DC/rack are now immutable and cannot be changed after restart, so
there is no need to update the node's system.topology entry with this
information on restart.
Changing DC or rack on a node which was already bootstrapped is, in
case of vnodes, very unsafe (almost guaranteed to cause data loss or
unavailability), and is outright not supported if the cluster has
tablet-backed keyspaces. Moreover, the possibility of doing that
makes it impossible to uphold some of the invariants promised by
the RF-rack-valid flag, which is eventually going to become
unconditionally enabled.
Get rid of the above problems by removing the possibility of changing
the DC / rack of a node. A node will now fail to start if its snitch
reports a different DC or rack than the one that was reported during the
first boot.
Fixes: scylladb/scylladb#23278
Fixes the following scenario:
1. Scale out adds new nodes to each rack
2. Table is created - all tablets are allocated to new nodes because they have low load
3. Rebalancing moves tablets from old nodes to new nodes - table balance for the new table is not fixed
We're wrong to try to equalize global load when allocating tablets,
and we should equalize per-table load instead, and let background load
balancing fix it in a fair way. It will add to the allocated storage
imbalance, but:
1. The table is initially empty, so doesn't impact actual storage imbalance.
2. It's more important to avoid overloading CPU on the nodes - imbalance hurts this aspect immediately.
3. If the table was created before imbalance was formed, we would end up in the same situation as in the problematic scenario after the patch.
4. It's the job of load balancing to keep up with storage growth, and if it can't, scale-out should kick in.
Until we have CPU-aware tablet allocation, and can thus prove we have
CPU capacity on the small nodes, we should respect per-table balance,
as this is the way in which we achieve full CPU utilization.
Fixes #23631
To simplify future usage in
network_topology_strategy::add_tablets_in_dc(), which invokes
populate() for a given table, which may be either new or preexisting.
* During the development phase, the backup functionality broke because we lacked a test that runs backup with an invalid endpoint. This commit adds a test to cover that scenario.
* Add a check that the expected error is propagated from a failing/aborted backup.
Use all shards to upload snapshot files to S3,
by using the sharded sstables_manager_for_table
infrastructure.
Refs #22460
Quick perf comparison
===========================================
Release build, master, smp-16, mem-32GiB
Bytes: 2342880184, backup time: 9.51 s
===========================================
Release build, this PR, smp-16, mem-32GiB
Bytes: 2342891015, backup time: 1.23 s
===========================================
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Co-authored-by: Ernest Zaslavsky <ernest.zaslavsky@scylladb.com>
Add missing awaits for the rebuild_repair and repair background actions.
Although the background actions hold the _async_gate
which is closed in topology_coordinator::run(),
stop() still needs to await all background action futures
and handle any errors they may have left behind.
Fixes #23755
* The issue has existed since 6.2
Closes scylladb/scylladb#17712
* github.com:scylladb/scylladb:
topology_coordinator: stop: await all background_action_holder:s
topology_coordinator: stop: improve error messages
topology_coordinator: stop: define stop_background_action helper
Instead of hardcoding PR_NUM=$1 and FORCE=$2. The current setup is
not very flexible, and one gets no feedback if the arguments are
incorrect or not recognized.
Add proper position-independent argument parsing using a classic
while/case loop.
Closes scylladb/scylladb#23623
This series adds support for reporting consumed capacity in BatchGetItem operations in Alternator.
It includes changes to the RCU accounting logic, exposing internal functionality to support batch-specific behavior, and adds corresponding tests for both simple and complex use cases involving multiple tables and consistency modes.
Need backporting to 2025.1, as RCU and WCU are not fully supported
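A hedged sketch of what the tests exercise (endpoint, credentials, and table names are made up): BatchGetItem with ReturnConsumedCapacity reports RCUs per table, with ConsistentRead affecting the per-read cost:
```python
import boto3

dynamodb = boto3.client('dynamodb', endpoint_url='http://localhost:8000',
                        region_name='us-east-1',
                        aws_access_key_id='alternator',
                        aws_secret_access_key='secret')
resp = dynamodb.batch_get_item(
    RequestItems={
        'tbl1': {'Keys': [{'p': {'S': 'a'}}], 'ConsistentRead': True},
        'tbl2': {'Keys': [{'p': {'S': 'b'}}], 'ConsistentRead': False},
    },
    ReturnConsumedCapacity='TOTAL')
print(resp['ConsumedCapacity'])  # one entry per table, with CapacityUnits
```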
Fixes #23690
Closes scylladb/scylladb#23691
* github.com:scylladb/scylladb:
test_returnconsumedcapacity.py: test RCU for batch get item
alternator/executor: Add RCU support for batch get items
alternator/consumed_capacity: make functionality public
Add support for the CopyObject API to enable direct copying of S3
objects between locations. This approach eliminates networking
overhead on the client side, as the operation is handled internally
by S3.
There are two tests which test incremental read repair: one with row
tombstones, the other with partition tombstones. The tests currently force
vnodes by creating the test keyspace with {'enabled': false}. Even so, the
tests were found to be flaky, so one of them is marked for skip.
This commit makes the following changes:
* Make the tests use tablets by creating the test keyspace with tablets.
* Change the way the tests write data so it works with tablets:
currently the tests use scylla-sstable write + upload but this won't
work with tablets since upload with tablets implies --load-and-stream
which means data is streamed to all replicas (no difference created
between nodes). Switch to the classic stop-node + write to other
replica with CL=ONE.
* Remove the skip added to the partition-tombstone test variant.
Also add tracing to the read-repair query, to make debugging the test
easier if it fails.
Fixes: #21179
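A sketch of the stop-node + CL=ONE write scheme described above (the manager API names follow test/cluster conventions but are assumptions here):
```python
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

async def write_to_single_replica(manager, cql, server):
    # stop one replica so it misses the write
    await manager.server_stop_gracefully(server.server_id)
    stmt = SimpleStatement("INSERT INTO ks.tbl (pk, ck) VALUES (0, 0)",
                           consistency_level=ConsistencyLevel.ONE)
    cql.execute(stmt)  # only the live replica receives the data
    # bring the node back so read repair has a difference to reconcile
    await manager.server_start(server.server_id)
```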