scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Petr Gusev	78aa36b257	check_internal_table_permissions: handle Paxos state tables CDC and $paxos tables are managed internally by Scylla. Users are already prohibited from running ALTER and DROP commands on CDC tables. In this commit, we extend the same restrictions to $paxos tables to prevent users from shooting themselves in the foot. Other commands are generally allowed for CDC and $paxos tables. An important distinction is that CDC tables are meant to be accessed directly by users, so appropriate permissions must be set for non-superusers. In contrast, $paxos tables are not intended for direct access by users. Therefore, this commit explicitly disallows non-superusers from accessing them. Superusers are still allowed access for debugging and troubleshooting purposes. Note that these restrictions apply even if explicit permissions have been granted. For example, a non-superuser may be granted SELECT permissions on a $paxos table, but the restriction above will still take precedence. We don't try to restrict users from giving permissions to $paxos tables for simplicity.	2025-07-24 19:48:08 +02:00
Petr Gusev	ec3c5f4cbc	client_state: extract check_internal_table_permissions This is a refactoring commit — it extracts the CDC permissions handling logic into a separate function: check_internal_table_permissions. This is a preparatory step for the next commit, where we'll handle paxos state tables similarly to CDC tables.	2025-07-24 19:48:08 +02:00
Petr Gusev	bb4e7a669f	paxos_store: handle base table removal Subscribe to on_before_drop_column_family to drop the associated Paxos state table when the corresponding user table is dropped.	2025-07-24 19:48:08 +02:00
Petr Gusev	1b70623908	database: get_base_table_for_tablet_colocation: handle paxos state table We need to mark paxos state table as colocated with the user table, so that the corresponding tablets are migrated/repaired together.	2025-07-24 19:48:08 +02:00
Petr Gusev	03aa2e4823	paxos_state: use node_local_only mode to access paxos state	2025-07-24 19:48:08 +02:00
Petr Gusev	ff1caa9798	query_options: add node_local_only mode We want to access the paxos state table only on the local node and shard (or shards in case of intranode_migration). In this commit we add a node_local_only flag to query_options, which makes that possible. This flag can be set for a query via make_internal_options. We handle this flag on the statements layer by forwarding it to either coordinator_query_options or coordinator_mutate_options.	2025-07-24 19:48:08 +02:00
Petr Gusev	65c7e36b7c	storage_proxy: handle node_local_only in query In this commit we support node_local_only flag in read code path in storage_proxy.	2025-07-24 19:48:08 +02:00
Petr Gusev	2d747d97b8	storage_proxy: handle node_local_only in mutate We add the remove_non_local_host_ids() helper, which will be used in the next commit to support the read path. HostIdVector concept is introduced to be able to handle both host_id_vector_replica_set and host_id_vector_topology_change uniformly. The storage_proxy_coordinator_mutate_options class is declared outside of storage_proxy to avoid C++ compiler complaints about default field initializers. In particular, some storage_proxy methods use this class for optional parameters with default values, which is not allowed when the class is defined inside storage_proxy.	2025-07-24 19:48:08 +02:00
Petr Gusev	7eb198f2cc	storage_proxy: introduce node_local_only flag Add a per-request flag that restricts query execution to the local node by filtering out all non-local replicas. Standard consistency level (CL) rules still apply: if the local node alone cannot satisfy the requested CL, an exception is thrown. This flag is required for Paxos state access, where reads and writes must target only the local node. As a side effect, this also enables the implementation of scylladb/scylladb#16478, which proposes a CQL extension to expose 'local mode' query execution to users. Support for this flag in storage_proxy's read and write code paths will be added in follow-up commits.	2025-07-24 19:48:08 +02:00
Petr Gusev	8e745137de	abstract_replication_strategy: remove unused using	2025-07-24 19:48:08 +02:00
Petr Gusev	4c1aca3927	storage_proxy: add coordinator_mutate_options In upcoming commits, we want to add a node_local_only flag to both read and write paths in storage_proxy. This requires passing the flag from query_processor to the part of storage_proxy where replica selection decisions are made. For reads, it's sufficient to add the flag to the existing coordinator_query_options class. For writes, there is no such options container, so we introduce coordinator_mutate_options in this commit. In the future, we may move some of the many mutate() method arguments into this container to simplify the code.	2025-07-24 19:48:08 +02:00
Petr Gusev	b6ccaffd45	storage_proxy: rename create_write_response_handler -> make_write_response_handler Most of the create_write_response_handler overloads follow the same signature pattern to satisfy the sp::mutate_prepare call. The one which doesn't follow it is invoked by others and is responsible for creating a concrete handler instance. In this refactoring commit we rename it to make_write_response_handler to reduce confusion.	2025-07-24 19:48:08 +02:00
Petr Gusev	db946edd1d	storage_proxy: simplify mutate_prepare This is a refactoring commit. We remove extra lambda parameters from mutate_prepare since the CreateWriteHandler lambda can simply capture them. We can't std::move(permit) in another mutate_prepare overload, because each handler wants its own copy of this permit.	2025-07-24 19:48:08 +02:00
Petr Gusev	ac4bc3f816	paxos_state: lazily create paxos state table We call paxos_store::ensure_initialized in the beginning of storage_proxy::cas to create a paxos state table for a user table if it doesn't exist. When the LWT coordinator sends RPCs to replicas, some of them may not yet have the paxos schema. In paxos_store::get_paxos_state_schema we just wait for them to appear, or throw 'no_such_column_family' if the base table was dropped.	2025-07-24 19:48:08 +02:00
Petr Gusev	3e0347c614	migration_manager: add timeout to start_group0_operation and announce Pass a timeout parameter through to start_operation() and add_entry(), respectively. This is a preparatory change for the next commit, which will use the timeout to properly handle timeouts during lazy creation of Paxos state tables.	2025-07-24 16:39:50 +02:00
Petr Gusev	519f40a95e	paxos_store: use non-internal queries Switch paxos_store from using internal queries to regular prepared queries, so that prepared statements are correctly updated when the base table is recreated. The do_execute_cql_with_timeout function is extracted to reduce code bloat when execute_cql_with_timeout template function is instantiated. We change return type of execute_cql_with_timeout to untyped_result_set since shared_ptr is not really needed here.	2025-07-24 16:39:50 +02:00
Petr Gusev	6caa1ae649	qp: make make_internal_options public In upcoming commits, we will switch paxos_store from using internal queries to regular prepared queries, so that prepared statements are correctly updated when the base table is recreated. To support this, we want to reuse the logic for converting parameters from vector<data_value_or_unset> to raw_value_vector_with_unset. This commit makes make_internal_options public to enable that reuse.	2025-07-24 16:39:50 +02:00
Petr Gusev	13f7266052	paxos_store: conditional cf_id filter We want to reuse the same queries to access system.paxos and the co-located table. A separate co-located table will be created for each user table, so we won't need the cf_id filter for them. In this commit we make the cf_id filter optional and apply it only if the state table is actually system.paxos.	2025-07-24 16:39:50 +02:00
Petr Gusev	370f91adb7	paxos_store: coroutinize This is another preparational step. We want to add more logic to paxos_store state access functions in the next commits, it's easier to do with coroutines. Pass ballot by value to delete_paxos_decision because paxos_state::prune is not a coroutine and the ballot parameter is destroyed when we return from it. The alternative solution -- pass by const reference to paxos_state::prune -- doesn't work because paxos_state::prune is called from a lambda in paxos_response_handler::prune, this lambda is not a coroutine and the 'ballot' field could be destroyed along with the body of this lambda as soon as we return from paxos_state::prune.	2025-07-24 16:39:50 +02:00
Petr Gusev	ab03badc15	feature_service: add LWT_WITH_TABLETS feature We will need this feature to determine if it's safe to enable LWTs for a tablet-based table.	2025-07-24 16:39:50 +02:00
Petr Gusev	8292ecf2e1	paxos_state: inline system_keyspace functions into paxos_store Prepares for reusing the same functions to access either system.paxos or a co-located table.	2025-07-24 16:39:50 +02:00
Petr Gusev	6e87a6cdb0	paxos_state: extract state access functions into paxos_store Introduce paxos_store abstraction to isolate Paxos state access. Prepares for supporting either system.paxos or a co-located table as the storage backend.	2025-07-24 16:39:50 +02:00
Gleb Natapov	d5e023bbad	topology coordinator: drop no longer needed token metadata barrier Currently we do token metadata barrier before accepting a replacing node. It was needed for the "replace with the same IP" case to make sure old request will not contact new node by mistake. But now since we address nodes by id this is no longer possible since old requests will use old id and will be rejected. Closes scylladb/scylladb#25047	2025-07-24 11:15:42 +02:00
Tomasz Grabiec	c9bf010d6d	Merge 'test.py: skip cleaning testlog' from Andrei Chekun Skip removing any artifacts between test.py invocations when -s is provided. Logs from the previous run will be overwritten if the tests are executed again. For example: 1. Execute tests A, B, C with parameter -s 2. All logs are present even if the tests pass 3. Execute test B with parameter -s 4. Logs for A and C are from the first run 5. Logs for B are from the most recent run Backport is not needed, since this is a framework enhancement. Closes scylladb/scylladb#24838 * github.com:scylladb/scylladb: test.py: skip cleaning artifacts when -s provided test.py: move deleting directory to prepare_dir	2025-07-24 09:46:42 +03:00
Gleb Natapov	ab6e328226	storage_proxy: preallocate write response handler hash table Currently it grows dynamically and triggers oversized allocation warning. Also it may be hard to find sufficient contiguous memory chunk after the system runs for a while. This patch pre-allocates enough memory for ~1M outstanding writes per shard. Fixes #24660 Fixes #24217 Closes scylladb/scylladb#25098	2025-07-24 09:46:42 +03:00
Patryk Jędrzejczak	f89ffe491a	Merge 'storage_service: cancel all write requests after stopping transports' from Sergey Zolotukhin When a node shuts down, in storage service, after storage_proxy RPCs are stopped, some write handlers within storage_proxy may still be waiting for background writes to complete. These handlers hold appropriate ERMs to block schema changes before the write finishes. After the RPCs are stopped, these writes cannot receive the replies anymore. If, at the same time, there are RPC commands executing `barrier_and_drain`, they may get stuck waiting for these ERM holders to finish, potentially blocking node shutdown until the writes time out. This change introduces cancellation of all outstanding write handlers from storage_service after the storage proxy RPCs were stopped. Fixes scylladb/scylladb#23665 Backport: since this fixes an issue that frequently causes issues in CI, backport to 2025.1, 2025.2, and 2025.3. Closes scylladb/scylladb#24714 * https://github.com/scylladb/scylladb: storage_service: Cancel all write requests on storage_proxy shutdown test: Add test for unfinished writes during shutdown and topology change	2025-07-24 09:46:42 +03:00
Gleb Natapov	ddc3b6dcf5	migration manager: assert that if schema pull is disabled the group0 is not in use_pre_raft_procedures state If schema pulls are disabled, group0 is used to bring the schema up to date by calling start_group0_operation(), which executes a raft read barrier internally, but if group0 is still in use_pre_raft_procedures, start_group0_operation() silently does nothing. Later, the code that assumes the schema is already up to date will fail and print warnings into the log. But since receiving queries while a node is in raft-enabled mode but group0 is still not configured is illegal, it is better to make those errors more visible by asserting on them during testing. Closes scylladb/scylladb#25112	2025-07-23 14:10:17 +02:00
Botond Dénes	b65a2e2303	Update seastar submodule * seastar 26badcb1...60b2e7da (42): > Revert "Fix incorrect defaults for io queue iops/bandwidth" > fair_queue: Ditch queue-wide accumulator reset on overflow > addr2line, scripts/stall-analyser: change the default tool to llvm-addr2line > Fix incorrect defaults for io queue iops/bandwidth > core/reactor: add cxx_exceptions() getter > gate: make destructor virtual > scripts/seastar-addr2line: change the default addr2line utility to llvm-addr2line > coding-style: Align example return types > reactor: Remove min_vruntime() declaration > reactor: Move enable_timer() method to private section > smp: fix missing span include > core: Don't keep internal errors counter on reactor > pollable_fd: Untangle shutdown() > io_queue: Remove deprecated statistics getters > fair_queue: Remove queued/executing resource counters > reactor: Move set_current_task() from public reactor API > util: make SEASTAR_ASSERT() failure generate SIGABRT > core: fix high CPU use at idle on high core count machines > Merge 'Move output IO throttler to IO queue level' from Pavel Emelyanov fair_queue: Move io_throttler to io_queue.hh fair_queue: Move metrics from to io_queue::stream fair_queue: Remove io_throttler from tests fair_queue_test: Remove io-throttler from fair-queue fair_queue: Remove capacity getters fair_queue: Move grab_result into io_queue::stream too fair_queue: Move throtting code to io_queue.cc fair_queue: Move throttling code to io_queue::stream class fair_queue: Open-code dispatch_requests() into users fair_queue: Split dispatch_requests() into top() and pop_front() fair_queue: Swap class push back and dispatch fair_queue: Configure forgiving factor externally fair_queue: Move replenisher kick to dispatch caller io_queue: Introduce io_queue::stream fair_queue: Merge two grab_capacity overloads fair_queue: Detatch outcoming capacity grabbing from main dispatch loop fair_queue: Move available tokens update into if branch io_queue: 
Rename make_fair_group_config into configure_throttler io_queue: Rename get_fair_group into get_throttler fair_queue: Rename fair_group -> io_throttler > http::reply: Add 308 (permanent redirect) and make pretty-print handle unknown values > Merge 'Relax reactor coupling with file_data_source_impl' from Pavel Emelyanov reactor: Relax friendship with file_data_source_impl fstream: Use direct io_stats reference > thread_pool: Relax coupling with reactor > reactor: Mark some IO classes management methods private > http: Deprecate json_exception > io_tester: Collect and report disk queue length samples > test/perf: Add context-switch measurer > http/client: Zero-copy forward content-length body into the underlying stream > json2code: Genrate move constructor and move-assignment operator > Merge 'Semi-mixed mode for output_stream' from Pavel Emelyanov output_stream: Support semi-mixed mode writing output_stream: Complete write(temporary_buffer) piggy-back-ing write(packet) iostream: Add friends for iostream tests packet: Mark bool cast operator const iostream: Document output_stream::write() methods > io_tester: Show metrics about requests split > reactor: add counter for internal errors > iotune: Print correct throughput units > core: add label to io_threaded_fallbacks to categorize operations > slab: correct allocation logic and enforce memory limits > Merge 'Fix for non-json http function_handlers' from Travis Downs httpd_test: add test for non-JSON function handler function_handlers: avoid implicit conversions http: do not always treat plain text reply as json > Merge 'tls: add ALPN support' from Łukasz Kurowski tls: add server-side ALPN support tls: add client-side ALPN support > Merge 'coroutine: experimental: generator: implement move and swap' from Benny Halevy coroutine: experimental: generator: implement move and swap coroutine: experimental: generator: unconstify buffer capacity > future: downgrade asserts > output_stream: Remove unused bits > Merge 'Upstream 
a couple of minor reactor optimizations' from Travis Downs Match type for pure_check_for_work Do not use std::function for check_for_work() > Handle ENOENT in getgrnam Includes scylla-gdb.py update by Pavel Emelyanov. Closes scylladb/scylladb#25094	2025-07-22 18:19:58 +02:00
Sergey Zolotukhin	e0dc73f52a	storage_service: Cancel all write requests on storage_proxy shutdown During a graceful node shutdown, RPC listeners are stopped in `storage_service::drain_on_shutdown` as one of the first steps. However, even after RPCs are shut down, some write handlers in `storage_proxy` may still be waiting for background writes to complete. These handlers retain the ERM. Since the RPC subsystem is no longer active, replies cannot be received, and if any RPC commands are concurrently executing `barrier_and_drain`, they may get stuck waiting for those writes. This can block the messaging server shutdown and delay the entire shutdown process until the write timeout occurs. This change introduces the cancellation of all outstanding write handlers in `storage_proxy` during shutdown to prevent unnecessary delays. Fixes scylladb/scylladb#23665	2025-07-22 15:03:30 +02:00
Sergey Zolotukhin	bc934827bc	test: Add test for unfinished writes during shutdown and topology change This test reproduces an issue where a topology change and an ongoing write query during query coordinator shutdown can cause the node to get stuck. When a node receives a write request, it creates a write handler that holds a copy of the current table's ERM (Effective Replication Map). The ERM ensures that no topology or schema changes occur while the request is being processed. After the query coordinator receives the required number of replica write ACKs to satisfy the consistency level (CL), it sends a reply to the client. However, the write response handler remains alive until all replicas respond — the remaining writes are handled in the background. During shutdown, when all network connections are closed, these responses can no longer be received. As a result, the write response handler is only destroyed once the write timeout is reached. This becomes problematic because the ERM held by the handler blocks topology or schema change commands from executing. Since shutdown waits for these commands to complete, this can lead to unnecessary delays in node shutdown and restarts, and occasional test case failures. Test for: scylladb/scylladb#23665	2025-07-22 15:03:13 +02:00
Ran Regev	3d82b9485e	docs: update nodetool restore documentation for --sstables-file-list Fixes: #25128 A leftover from #25077 Closes scylladb/scylladb#25129	2025-07-22 14:43:35 +02:00
Yaron Kaikov	4445c11c69	./github/workflows/conflict_reminder: improve workflow with weekly notifications - Change schedule from twice weekly (Mon/Thu) to once weekly (Mon only) - Extend notification cooldown period from 3 days to 1 week - Prevent notification spam while maintaining immediate conflict detection on pushes Fixes: https://github.com/scylladb/scylladb/issues/25130 Closes scylladb/scylladb#25131	2025-07-22 15:21:12 +03:00
Avi Kivity	e4c4141d97	test.py: don't crash on early cleanup of ScyllaServer If a test fails very early (still have to find why), test.py crashes while flushing a non-existent log_file, as shown below. To fix, initialize the property to None and check it during cleanup. ``` ================================================================================ [N/TOTAL] SUITE MODE RESULT TEST ------------------------------------------------------------------------------ 'ScyllaServer' object has no attribute 'log_file' test_cluster_features Traceback (most recent call last): File "/home/avi/scylla-maint/./test.py", line 816, in <module> sys.exit(asyncio.run(main())) ~~~~~~~~~~~^^^^^^^^ File "/usr/lib64/python3.13/asyncio/runners.py", line 195, in run return runner.run(main) ~~~~~~~~~~^^^^^^ File "/usr/lib64/python3.13/asyncio/runners.py", line 118, in run return self._loop.run_until_complete(task) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^ File "/usr/lib64/python3.13/asyncio/base_events.py", line 725, in run_until_complete return future.result() ~~~~~~~~~~~~~^^ File "/home/avi/scylla-maint/./test.py", line 523, in main total_tests_pytest, failed_pytest_tests = await run_all_tests(signaled, options) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/avi/scylla-maint/./test.py", line 452, in run_all_tests failed += await reap(done, pending, signaled) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/avi/scylla-maint/./test.py", line 418, in reap result = coro.result() File "/home/avi/scylla-maint/test/pylib/suite/python.py", line 143, in run return await super().run(test, options) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/avi/scylla-maint/test/pylib/suite/base.py", line 216, in run await test.run(options) File "/home/avi/scylla-maint/test/pylib/suite/topology.py", line 48, in run async with get_cluster_manager(self.uname, self.suite.clusters, str(self.suite.log_dir)) as manager: ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File 
"/usr/lib64/python3.13/contextlib.py", line 221, in __aexit__ await anext(self.gen) File "/home/avi/scylla-maint/test/pylib/scylla_cluster.py", line 2006, in get_cluster_manager await manager.stop() File "/home/avi/scylla-maint/test/pylib/scylla_cluster.py", line 1539, in stop await self.clusters.put(self.cluster, is_dirty=True) File "/home/avi/scylla-maint/test/pylib/pool.py", line 104, in put await self.destroy(obj) File "/home/avi/scylla-maint/test/pylib/suite/python.py", line 65, in recycle_cluster srv.log_file.close() ^^^^^^^^^^^^ AttributeError: 'ScyllaServer' object has no attribute 'log_file' ``` Closes scylladb/scylladb#24885	2025-07-22 12:39:01 +02:00
Avi Kivity	2db2b42556	sstables: version: drop custom operator<=> The default comparison for enums is equivalent and sufficient. Closes scylladb/scylladb#24888	2025-07-22 12:39:01 +02:00
Avi Kivity	e89f6c5586	config, main: make cpu scheduling mandatory CPU scheduling has been with us since `641aaba12c` (2017), and no one ever disables it. Likely nothing really works without it. Make it mandatory and mark the option unused. Closes scylladb/scylladb#24894	2025-07-22 12:39:01 +02:00
Avi Kivity	ee138217ba	alternator: simplify std::views::transform calls that extract a member from a class Rather than calling std::views::transform with a lambda that extracts a member from a class, call std::views::transform with a pointer-to-member to do the same thing. This results in more concise code. Closes scylladb/scylladb#25012	2025-07-22 12:39:01 +02:00
Jakub Smolar	6e0a063ce3	gdb: handle zero-size reads in managed_bytes Fixes: https://github.com/scylladb/scylladb/issues/25048 Closes scylladb/scylladb#25050	2025-07-22 12:39:01 +02:00
Nadav Har'El	298a0ec4de	test/cqlpy: in README.md, remind users of run-cassandra to set NODETOOL test/cqlpy/README.md explains how to run the cqlpy tests against Cassandra, and mentions that if you don't have "nodetool" in your path you need to set the NODETOOL variable. However, when giving a simple example how to use the run-cassandra script, we forgot to remind the user to set NODETOOL in addition to CASSANDRA, causing confusion for users who didn't know why tests were failing. So this patch fixes the section in test/cqlpy/README.md with the run-cassandra example to also set the NODETOOL environment variable, not just CASSANDRA. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#25051	2025-07-22 12:39:00 +02:00
Aleksandra Martyniuk	b5026edf49	tasks: change _finished_children type A parent task keeps a vector of statuses (task_essentials) of its finished children. When the number of children is large - for example because we have many tables and a child task is created for each table - we may hit an oversized allocation while adding a new child's essentials to the vector. Keep the children's task_essentials in a chunked_vector. Fixes: #25040. Closes scylladb/scylladb#25064	2025-07-22 12:39:00 +02:00
Pavel Emelyanov	d94be313c1	Merge 'test: audit: ignore cassandra user audit logs in AUTH tests' from Andrzej Jackowski Audit tests are vulnerable to noise from LOGIN queries (because AUTH audit logs can appear at any time). Most tests already use the `filter_out_noise` mechanism to remove this noise, but tests focused on AUTH verification did not, leading to sporadic failures. This change adds a filter to ignore AUTH logs generated by the default "cassandra" user, so tests only verify logs from the user created specifically for each test. Additionally, this PR: - Adds missing `nonlocal new_rows` statement that prevented some checks from being called - Adds a testcase for audit logs of `cassandra` user Fixes: https://github.com/scylladb/scylladb/issues/25069 Better backport those test changes to 2025.3. 2025.2 and earlier don't have `./cluster/dtest/audit_test.py`. Closes scylladb/scylladb#25111 * github.com:scylladb/scylladb: test: audit: add cassandra user test case test: audit: ignore cassandra user audit logs in AUTH tests test: audit: change names of `filter_out_noise` parameters test: audit: add missing `nonlocal new_rows` statement	2025-07-22 10:42:16 +03:00
Pavel Emelyanov	295165d8ea	Merge 's3_client: Enhance s3_client error handling' from Ernest Zaslavsky Enhance and fix error handling in the `chunked_download_source` to prevent errors from leaking out of the request callback. Also stop retrying on Seastar's side, since that would break the integrity of data, which may be downloaded more than once for the same range. Fixes: https://github.com/scylladb/scylladb/issues/25043 Should be backported to 2025.3 since we intend to release the native backup/restore feature Closes scylladb/scylladb#24883 * github.com:scylladb/scylladb: s3_client: Disable Seastar-level retries in HTTP client creation s3_test: Validate handling of non-`aws_error` exceptions s3_client: Improve error handling in chunked_download_source aws_error: Add factory method for `aws_error` from exception	2025-07-22 10:40:39 +03:00
Ran Regev	dd67d22825	nodetool restore: sstable list from a file Fixes: #25045 Added the ability to supply the list of files to restore from a given file. Mainly required for local testing. Signed-off-by: Ran Regev <ran.regev@scylladb.com> Closes scylladb/scylladb#25077	2025-07-22 09:11:02 +03:00
Ernest Zaslavsky	fc2c9dd290	s3_client: Disable Seastar-level retries in HTTP client creation Prevent Seastar from retrying HTTP requests to avoid buffer double-feed issues when an entire request is retried. This could cause data corruption in `chunked_download_source`. The change is global for every instance of `s3_client`, but it is still safe because: * Seastar's `http_client` resets connections regardless of retry behavior * `s3_client` retry logic handles all error types—exceptions, HTTP errors, and AWS-specific errors—via `http_retryable_client`	2025-07-21 17:03:23 +03:00
Ernest Zaslavsky	ba910b29ce	s3_test: Validate handling of non-`aws_error` exceptions Inject exceptions not wrapped in `aws_error` from request callback lambda to verify they are properly caught and handled.	2025-07-21 16:52:43 +03:00
Ernest Zaslavsky	b7ae6507cd	s3_client: Improve error handling in chunked_download_source Create aws_error from raised exceptions when possible and respond appropriately. Previously, non-aws_exception types leaked from the request handler and were treated as non-retryable, causing potential data corruption during download.	2025-07-21 16:49:47 +03:00
Ernest Zaslavsky	d53095d72f	aws_error: Add factory method for `aws_error` from exception Move `aws_error` creation logic out of `retryable_http_client` and into the `aws_error` class to support reuse across components.	2025-07-21 16:42:44 +03:00
Andrzej Jackowski	21aedeeafb	test: audit: add cassandra user test case Audit tests use the `filter_out_noise` function to remove noise from audit logs generated by user authentication. As a result, none of the existing tests covered audit logs for the default `cassandra` user. This change adds a test case for that user. Refs: scylladb/scylladb#25069	2025-07-21 14:54:20 +02:00
Andrzej Jackowski	aef6474537	test: audit: ignore cassandra user audit logs in AUTH tests Audit tests are vulnerable to noise from LOGIN queries (because AUTH audit logs can appear at any time). Most tests already use the `filter_out_noise` mechanism to remove this noise, but tests focused on AUTH verification did not, leading to sporadic failures. This change adds a filter to ignore AUTH logs generated by the default "cassandra" user, so tests only verify logs from the user created specifically for each test. Fixes: scylladb/scylladb#25069	2025-07-21 14:54:20 +02:00
Andrzej Jackowski	daf1c58e21	test: audit: change names of `filter_out_noise` parameters This is a refactoring commit that changes the names of the parameters of the `filter_out_noise` function, as well as names of related variables. The motivation for the change is the introduction of more complex filtering logic in the next commit of this patch series. Refs: scylladb/scylladb#25069	2025-07-21 14:54:01 +02:00
Andrzej Jackowski	e634a2cb4f	test: audit: add missing `nonlocal new_rows` statement The variable `new_rows` was not updated by the inner function `is_number_of_new_rows_correct` because the `nonlocal new_rows` statement was missing. As a result, `sorted_new_rows` was empty and certain checks were skipped. This change: - Introduces the missing `nonlocal new_rows` declaration - Adds an assertion verifying that the number of new rows matches the expected count - Fixes the incorrect variable name in the lambda used for row sorting	2025-07-21 14:53:48 +02:00
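
The last entry above (e634a2cb4f) describes an inner function that failed to update an outer variable because a `nonlocal` declaration was missing. The following is a minimal sketch of that Python scoping behaviour, not the actual test code: the function names `collect_without_nonlocal`, `collect_with_nonlocal` and `poll` are hypothetical, and only the variable name `new_rows` is taken from the commit message.

```python
def collect_without_nonlocal(batches):
    """Hypothetical helper: the inner function forgets `nonlocal`."""
    new_rows = []

    def poll(batch):
        # This assignment creates a variable local to poll();
        # the outer new_rows is left untouched.
        new_rows = batch
        return len(new_rows) > 0

    for batch in batches:
        poll(batch)
    return new_rows  # still the initial empty list


def collect_with_nonlocal(batches):
    """Hypothetical helper: same logic with the declaration in place."""
    new_rows = []

    def poll(batch):
        # `nonlocal` rebinds the variable of the enclosing function.
        nonlocal new_rows
        new_rows = batch
        return len(new_rows) > 0

    for batch in batches:
        poll(batch)
    return new_rows  # the last batch seen by poll()
```

In the first variant any check that later iterates over `new_rows` runs on an empty list and passes vacuously, which matches the commit's note that certain checks were skipped; asserting on the expected row count, as the commit does, makes such a silent failure visible.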

48597 Commits