scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 19:10:42 +00:00

Author	SHA1	Message	Date
Piotr Smaron	a2bbbc6904	auth: forbid modifying system ks by non-superusers Before this patch, granting a user MODIFY permissions on ALL KEYSPACES allowed the user to write to system tables, where the user could also set himself to "superuser" granting him all other permissions. After this patch, MODIFY permissions on ALL KEYSPACES is limited only to non-system keyspaces. Fixes: scylladb/scylladb#23218 Closes scylladb/scylladb#23219	2025-03-30 16:55:04 +03:00
Ferenc Szili	2c9b312b58	test: port of test and reproducer for resurrection during file based streaming This change ports test/cluster/test_resurrection.py from enterprise to master. Because the underlying issue deals with file based streaming, this test was a part of the enterprise repo. It contains the test and reproducer for the issue described below: When tablets are migrated with file-based streaming, we can have a situation where a tombstone is garbage collected before the data it shadows lands. For instance, if we have a tablet replica with 3 sstables: (1) an sstable containing an expired tombstone, (2) an sstable with additional data, and (3) an sstable containing data which is shadowed by the expired tombstone in sstable 1. If this tablet is migrated, and the sstables are streamed in the order listed above, the first two sstables can be compacted before the third sstable arrives. In that case, the expired tombstone will be garbage collected, and data in the third sstable will be resurrected after it arrives at the pending replica. The fix for the issue was merged in `b66479ea98`. This patch only ports the missing test. Closes scylladb/scylladb#23466	2025-03-30 13:39:40 +03:00
Andrzej Jackowski	b8adbcbc84	audit: fix empty query string in BATCH query Function modification_statement::add_raw() is never called, which makes query string in audit_info of batch queries empty. In enterprise branch, add_raw is called in Cql.g and those changes were never merged to master. This changes: - Add missing call of add_raw() to Cql.g - Include other related changes (from PR#3228 in scylla-enterprise) Fixes scylladb#23311 Closes scylladb/scylladb#23315	2025-03-30 13:37:11 +03:00
Michał Chojnowski	79a477ecb6	cmake: add the `-dynamic-linker=...` form to the -dynamic-linker regex On my system (Nix), the compiler produces a `-dynamic-linker=/nix/store/...` in the linker call scanned by get_padded_dynamic_linker_option. But the regex can't deal with the `=` there, it requires a ` `. Fix that. We also do the same in configure.py, and remove the Nix-specific hack which used to disable the entire mechanism. Closes scylladb/scylladb#22308	2025-03-30 11:58:47 +03:00
Kefu Chai	7814f6d374	github: improve seastar bad include check for better developer experience: - add inline annotations using problem matchers, see https://github.com/actions/toolkit/blob/main/docs/problem-matchers.md - use a single step for uploading both output files, because the `path` setting is actually passed to [@actions/glob](https://github.com/actions/toolkit/tree/main/packages/glob), i removed the double quotes and the leading "./" from the paths. - use "::error" workflow command to signify the failure, see https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#example-creating-an-annotation-for-an-error Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23310	2025-03-30 11:56:18 +03:00
Michał Jadwiszczak	0ee0696959	test/cqlpy/test_service_level_api: update to service levels on raft and remove flakiness Tests in `test_service_level_api` were written before scylladb/scylladb#16585 and they were doing 10s sleeps to wait for service level controller to update its configuration. Now performing a read barrier is sufficient to ensure SL configuration is up-to-date, which significantly reduces tests time (from ~60s to ~2-3s). Moreover, there was flakiness in the `test_switch_tenants` test. Until now, the test waited up to 60s for the connections to update their scheduling groups. However, it is difficult to determine how long the process might take because a connection may be blocked while waiting for the next request to be processed, and the scheduling group will be updated only after a request is processed (see `generic_server::connection::process_until_tenant_switch()`). To address this issue, 100 simple queries are executed so that connections on all shards process at least one request and update their scheduling groups. Fixes scylladb/scylladb#22768 Closes scylladb/scylladb#23381	2025-03-28 17:14:21 +03:00
Avi Kivity	6d7cb68aab	test: ldap: avoid io_uring Seastar reactor backend It tends to fail sometimes with ENOMEM: ``` ERROR 2025-03-24 01:05:22,983 [shard 0:sl:d] ldap_role_manager - error in reconnect: std::system_error (error C-Ares:4, server.that.will.never.exist.scylladb.com: Not found) ERROR 2025-03-24 01:05:30,984 [shard 0:sl:d] ldap_role_manager - error in reconnect: std::system_error (error C-Ares:4, server.that.will.never.exist.scylladb.com: Not found) ERROR 2025-03-24 01:05:47,123 [shard 0:main] storage_service - Shutting down communications due to I/O errors until operator intervention: Disk error: std::system_error (error system:12, Cannot allocate memory) ERROR 2025-03-24 01:05:47,139 [shard 0:main] table - failed to write sstable /scylladir/testlog/x86_64/debug/scylla-33787f64/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa/me-3got_1s5n_0lfls1y4z7vkkts07a-big-Data.db: storage_io_error (Storage I/O error: 12: Cannot allocate memory) ERROR 2025-03-24 01:05:47,140 [shard 0:main] table - Memtable flush failed due to: storage_io_error (Storage I/O error: 12: Cannot allocate memory). 
Aborting, at 0x30f5605 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x4514f14 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x4514b96 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x45165b1 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x4518dcf 0x3fde842 0x35dc5c6 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36c26ed /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36cdd0c /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36d2cd2 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36d0e56 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x327f47a /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x327c8f0 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar_testing.so+0x1cdd4 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar_testing.so+0x1c79c /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar_testing.so+0x1c69c /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar_testing.so+0x1c184 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x34b2674 0x314b8b6 /lib64/libc.so.6+0x70ba7 /lib64/libc.so.6+0xf4b8b -------- seastar::internal::coroutine_traits_base<void>::promise_type -------- seastar::internal::coroutine_traits_base<void>::promise_type -------- seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)> >(seastar::noncopyable_function<seastar::future<void> 
(seastar::future<void>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> -------- seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)> >(seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> -------- seastar::shared_future<>::shared_state Aborting on shard 0, in scheduling group main. Backtrace: 0x30f5605 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x384a0e4 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x3849db2 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x369bd84 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36d42a2 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x37a5ed9 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x37a61d5 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x37a601f /lib64/libc.so.6+0x1a04f /lib64/libc.so.6+0x72b53 /lib64/libc.so.6+0x19f9d /lib64/libc.so.6+0x1941 0x3fde8b1 0x35dc5c6 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36c26ed /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36cdd0c /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36d2cd2 
/jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x36d0e56 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x327f47a /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x327c8f0 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar_testing.so+0x1cdd4 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar_testing.so+0x1c79c /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar_testing.so+0x1c69c /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar_testing.so+0x1c184 /jenkins/workspace/scylla-master/next/scylla/build/debug/seastar/libseastar.so+0x34b2674 0x314b8b6 /lib64/libc.so.6+0x70ba7 /lib64/libc.so.6+0xf4b8b === TEST.PY SUMMARY START === Test exited with code -6 === TEST.PY SUMMARY END === === decoded === Backtrace: [Backtrace #0] __interceptor_backtrace at /mnt/clang_build/llvm-project-x86_64/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:4369 void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at ./build/debug/seastar/./seastar/include/seastar/util/backtrace.hh:70 seastar::backtrace_buffer::append_backtrace() at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:805 seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:838 seastar::print_with_backtrace(char const, bool) at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:850 seastar::sigabrt_action() at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:4004 seastar::install_oneshot_signal_handler<6, (void ()())(&seastar::sigabrt_action)>()::{lambda(int, siginfo_t, void)#1}::operator()(int, siginfo_t, void) const at 
./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:3981 seastar::install_oneshot_signal_handler<6, (void ()())(&seastar::sigabrt_action)>()::{lambda(int, siginfo_t, void)#1}::__invoke(int, siginfo_t, void) at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:3976 /lib64/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=c8c3fa52aaee3f5d73b6fd862e39e9d4c010b6ba, for GNU/Linux 3.2.0, not stripped ?? ??:0 printf_positional at ??:? ?? ??:0 ?? ??:0 replica::table::seal_active_memtable(replica::compaction_group&, replica::flush_permit&&)::$_0::operator()(std::function<seastar::future<void> ()>) const at ././replica/table.cc:1512 std::__n4861::coroutine_handle<seastar::internal::coroutine_traits_base<void>::promise_type>::resume() const at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/coroutine:242 (inlined by) seastar::internal::coroutine_traits_base<void>::promise_type::run_and_dispose() at ././seastar/include/seastar/core/coroutine.hh:122 seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:2616 seastar::reactor::run_some_tasks() at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:3088 seastar::reactor::do_run() at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:3256 seastar::reactor::run() at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/reactor.cc:3146 seastar::app_template::run_deprecated(int, char, std::function<void ()>&&) at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/app-template.cc:276 seastar::app_template::run(int, char, std::function<seastar::future<int> ()>&&) at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/app-template.cc:167 seastar::testing::test_runner::start_thread(int, char)::$_0::operator()() at 
./build/debug/seastar/./build/debug/seastar/./seastar/src/testing/test_runner.cc:77 void std::__invoke_impl<void, seastar::testing::test_runner::start_thread(int, char)::$_0&>(std::__invoke_other, seastar::testing::test_runner::start_thread(int, char)::$_0&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:61 std::enable_if<is_invocable_r_v<void, seastar::testing::test_runner::start_thread(int, char)::$_0&>, void>::type std::__invoke_r<void, seastar::testing::test_runner::start_thread(int, char)::$_0&>(seastar::testing::test_runner::start_thread(int, char)::$_0&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:111 std::_Function_handler<void (), seastar::testing::test_runner::start_thread(int, char)::$_0>::_M_invoke(std::_Any_data const&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:290 seastar::posix_thread::start_routine(void) at ./build/debug/seastar/./build/debug/seastar/./seastar/src/core/posix.cc:90 asan_thread_start(void*) at /mnt/clang_build/llvm-project-x86_64/compiler-rt/lib/asan/asan_interceptors.cpp:239 __vfscanf_internal at :? peek_token at ??:? ``` In `ce65164315`, we banned io_uring from tests, but missed the ldap tests. This extends coverage to ldap tests. I verified that the new options indeed reach the test. Refs #23411. Credit to Botond for recognizing the failure reason. Closes scylladb/scylladb#23422	2025-03-28 07:45:53 +02:00
Botond Dénes	bd8973a025	tools/scylla-nodetool: s/GetInt()/GetInt64()/ GetInt() was observed to fail when the integer JSON value overflows the int32_t type, which `GetInt()` uses for storage. When this happens, rapidjson will assign a distinct 64 bit integer type to the value, and attempting to access it as 32 bit integer triggers the wrong-type error, resulting in assert failure. This was hit in the field, where invoking nodetool netstats resulted in nodetool crashing when the streamed bytes amounts were higher than maxint. To avoid such bugs in the future, replace all usage of GetInt() in nodetool with GetInt64(), just to be sure. A reproducer is added for the nodetool netstats crash. Fixes: scylladb/scylladb#23394 Closes scylladb/scylladb#23395	2025-03-27 14:05:39 +02:00
Botond Dénes	d57e71837f	Merge 'Improve scoped restore test' from Pavel Emelyanov This PR includes several fixes to the currently flaky test_restore_with_streaming_scopes test. 1. Check that backup and restore APIs don't fail. Currently, if either of them fails, the test case fails anyway when the check finds that the data was not restored, but it's better to know what exactly failed 2. For restore API the test collects the list of sstables to restore from. Currently collecting this list races with background compaction and sometimes causes the restore API to fail, which in turn makes the whole test fail 3. Add a test case that validates that restore-from-missing-sstable fails nicely refs: #23189 No backport, as it's a relatively new test Closes scylladb/scylladb#23445 * github.com:scylladb/scylladb: test/backup: Validate that restoring from non-existing sstables fails test/backup: Collect sstables names after snapshot test/backup: Check that backup and restore succeed	2025-03-27 13:23:41 +02:00
Piotr Dulikowski	288216a89e	Merge 'Ignore wrapped exceptions `gate_closed_exception` and `rpc::closed_error` when node shuts down.' from Sergey Zolotukhin Normally, when a node is shutting down, `gate_closed_exception` and `rpc::closed_error` in `send_to_live_endpoints` should be ignored. However, if these exceptions are wrapped in a `nested_exception`, an error message is printed, causing tests to fail. This commit adds handling for nested exceptions in this case to prevent unnecessary error messages. Fixes scylladb/scylladb#23325 Fixes scylladb/scylladb#23305 Fixes scylladb/scylladb#21815 Backport: looks like this is quite a frequent issue, therefore backport to 2025.1. Closes scylladb/scylladb#23336 * github.com:scylladb/scylladb: database: Pass schema_ptr as const ref in `wrap_commitlog_add_error` database: Unify exception handling in `do_apply` and `apply_with_commitlog` storage_proxy: Ignore wrapped `gate_closed_exception` and `rpc::closed_error` when node shuts down. exceptions: Add `try_catch_nested` to universally handle nested exceptions of the same type.	2025-03-27 11:39:42 +01:00
Pavel Emelyanov	9f036d957a	Merge 'test/clqpy/test_tool.py: get_sstables_for_table(): exclude non-sealed sstables' from Botond Dénes Filter out sstables which don't have a TOC or have a temporary TOC. Such sstables are incomplete and can disappear if the compaction which writes them is interrupted. Fixes: #23203 This PR fixes a flaky test which is only on master, no backports required. Closes scylladb/scylladb#23450 * github.com:scylladb/scylladb: test/cqlpy/test_tools.py: test_scylla_sstable_query: reduce scope of no-compaction context test/clqpy/test_tool.py: get_sstables_for_table(): exclude non-sealed sstables	2025-03-27 09:45:07 +03:00
Tomasz Grabiec	8e506c5a8f	test: tablets: Fix flakiness due to ungraceful shutdown The test fails sporadically with: cassandra.ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed for test3.test2 - received 1 responses and 1 failures from 2 CL=QUORUM." info={'consistency': 'QUORUM', 'required_responses': 2, 'received_responses': 1, 'failures': 1} That's because a server is stopped in the middle of the workload. The server is stopped ungracefully which will cause some requests to time out. We should stop it gracefully to allow in-flight requests to finish. Fixes #20492 Closes scylladb/scylladb#23451	2025-03-27 09:44:07 +03:00
Avi Kivity	b292b5800b	Merge 'test.py: move starting LDAP service to dedicate method' from Andrei Chekun Move starting LDAP to the method where the rest of the services are started. This will unify the way of starting the 3rd party services. Fix LDAP test flakiness caused by failures to connect to the LDAP server. Capture stdout and stderr of toxiproxy-cli in case of errors. Related: https://github.com/scylladb/scylladb/pull/23333 This PR is based on https://github.com/scylladb/scylladb/pull/23221, so #23221 should be merged first. Closes scylladb/scylladb#23235 * github.com:scylladb/scylladb: test.py: Refactor nodetool/conftest test.py: Refactor test/pylib/cpp/ldap test.py: move starting LDAP service to dedicate method	2025-03-26 15:31:00 +02:00
Botond Dénes	801339bad9	test/cqlpy/test_tools.py: test_scylla_sstable_query: reduce scope of no-compaction context To just system.local, the table these tests operate on. No need to disable autocompaction for all of the system keyspace.	2025-03-26 09:19:38 -04:00
Botond Dénes	3ec863c4ce	test/clqpy/test_tool.py: get_sstables_for_table(): exclude non-sealed sstables Filter out sstables which don't have a TOC or have a temporary TOC. Such sstables are incomplete and can disappear if the compaction which writes them is interrupted.	2025-03-26 09:18:34 -04:00
Pavel Emelyanov	1da889f239	Merge 'Allow abort during join_cluster' from Benny Halevy Bootstrap or replace can take a long time, but since `feef7d3fa1`, the stop_signal is checked only in checkpoints, and in particular, abort isn't requested during join_cluster. Fixes #23222 * requires backport on top of https://github.com/scylladb/scylladb/pull/23184 Closes scylladb/scylladb#23306 * github.com:scylladb/scylladb: main: allow abort during join_cluster main: add checkpoint before joining cluster storage_service: add start_sys_dist_ks	2025-03-26 15:48:58 +03:00
Sergey Zolotukhin	d448f3de77	database: Pass schema_ptr as const ref in `wrap_commitlog_add_error`	2025-03-26 11:15:26 +01:00
Sergey Zolotukhin	0d9d0fe60e	database: Unify exception handling in `do_apply` and `apply_with_commitlog` Move exception wrapping logic from `do_apply` and `apply_with_commitlog` to `wrap_commitlog_add_error` to ensure consistent error handling.	2025-03-26 11:15:18 +01:00
Sergey Zolotukhin	b1e89246d4	storage_proxy: Ignore wrapped `gate_closed_exception` and `rpc::closed_error` when node shuts down. Normally, when a node is shutting down, `gate_closed_exception` and `rpc::closed_error` in `send_to_live_endpoints` should be ignored. However, if these exceptions are wrapped in a `nested_exception`, an error message is printed, causing tests to fail. This commit adds handling for nested exceptions in this case to prevent unnecessary error messages. Fixes scylladb/scylladb#23325	2025-03-26 11:15:16 +01:00
Sergey Zolotukhin	6abfed9817	exceptions: Add `try_catch_nested` to universally handle nested exceptions of the same type.	2025-03-26 11:15:13 +01:00
Evgeniy Naydanov	574c81eac6	test.py: random_failures: deselect topology ops for some injections After recent changes, #18640 and #19151 started to reproduce for the stop_after_sending_join_node_request and stop_after_bootstrapping_initial_raft_configuration error injections too. The solution is the same: deselect the tests. Fixes #23302 Closes scylladb/scylladb#23405	2025-03-26 12:07:12 +03:00
Pavel Emelyanov	38f37763d6	test/backup: Validate that restoring from non-existing sstables fails When restore API is called and is given a non-existing sstable (object name) the task should complete with failed status and some meaningful message in the error text. refs: #23189 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-26 10:55:42 +03:00
Pavel Emelyanov	02610a9072	test/backup: Collect sstables names after snapshot The scoped restore test works like this - populate table - flush it - collect list of sstables - take snapshot - backup - restore (with the list of sstables as argument) - check the data is back Steps 2 and 3 are racy -- in case compaction comes in the middle, the list of collected sstables would differ from those snapshotted (and backed up) which will later lead to restore failure due to missing sstable. Fix by collecting the list of sstables after taking snapshot, and collect those not from the datadir, but from the snapshot dir. fixes: #23189 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-26 10:40:54 +03:00
Pavel Emelyanov	08004fe470	test/backup: Check that backup and restore succeed The scoped-restore test calls backup and restore APIs on several nodes, but doesn't check if any of the operations actually succeeds. Sometimes they indeed don't and the test captures this, but in a weird manner -- the post-test check for data presence fails, because the expected data is not in fact in its place. It's more debugging-friendly if we know in advance if backup or restore fails, rather than see that some data is missing after (failed) restore. refs: #23189 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-25 19:45:56 +03:00
Gleb Natapov	0aa4a82c83	messaging_service: do not call uninitialized _address_to_host_id_mapper std::function During messaging_service object creation remove_rpc_client function may be called if prefer_local snitch setting is true. The caller does not provide host id, so _address_to_host_id_mapper is called to obtain it, but at this point the function is not initialized yet. The patch fixes the code to not call the function if not initialized. This is not a problem since during messaging_service creation there is no connection to drop. Fixes: #23353 Message-ID: <Z-J2KbBK8NoFNYZZ@scylladb.com>	2025-03-25 18:41:16 +02:00
Wojciech Mitros	88d3fc68b5	alter_table_statement: fix renaming multiple columns in tables with views When we rename columns in a table which has materialized views depending on it, we need to also rename them in the materialized views' WHERE clauses. Currently, we do that by creating a new WHERE clause after each rename, with the updated column. This is later converted to a mutation that overwrites the WHERE clause. After multiple renames, we have multiple mutations, each overwriting the WHERE clause with one column renamed. As a result, the final WHERE clause is one of the modified clauses with one column renamed. Instead, we should prepare one new WHERE clause which includes all the renamed columns. This patch accomplishes this by processing all the column renames first, and only preparing the new view schema with the new WHERE clause afterwards. This patch also includes a test reproducer for this scenario. Fixes scylladb/scylladb#22194 Closes scylladb/scylladb#23152	2025-03-25 09:58:58 +01:00
Michael Litvak	49b8cf2d1d	storage_service: fix tablet split of materialized views This fixes an issue where materialized view tablets are not split because they are not registered as split candidates by the storage service. The code in storage_service::replicate_to_all_cores was changed in `4bfa3060d0` to handle normal tables and view tables separately, but with that change register_tablet_split_candidate is applied only to normal tables and not every table like before. We fix it by registering view tables as well. We add a test to verify that split of MV tables works. Closes scylladb/scylladb#23335	2025-03-24 08:23:58 +01:00
Pavel Emelyanov	79b9626d16	Merge 'service: do not include unused headers ' from Kefu Chai these unused includes were identified by clang-include-cleaner. after auditing these source files, all of the reports have been confirmed. also, updated the "iwyu.yaml" (short for include what you use) workflow to include "service" and "raft" subdirectories to prevent future regressions of including unused headers in them. --- it's a cleanup, hence no need to backport. Closes scylladb/scylladb#23373 * github.com:scylladb/scylladb: .github: add "raft" and "service" subdirectories to CLEANER_DIR service: do not include unused headers	2025-03-24 10:20:15 +03:00
Avi Kivity	cc5fe542ed	test: ignore unused fmt::to_string() result fmt 11.1 apparently marks to_string() as [[nodiscard]]. Here we aren't interested in the result, so explicitly ignore it to avoid an error. Closes scylladb/scylladb#23403	2025-03-24 10:19:09 +03:00
Avi Kivity	9d49c3254f	install-dependencies.sh: disambiguate python magic package There are in fact two python magic packages: file-magic (which binds to libmagic and comes from the file package) and magic, an independent one. The name we use in install-dependencies.sh, python3-magic, resolves to file-magic. In Fedora 42, the resolution from the name python3-magic to file-magic was removed [1], and so install-dependencies.sh now tries to install the wrong magic package, which turns out not to coexist with the one we want anyway. Fix by naming python3-file-magic directly instead. Since this is what's installed in the current frozen toolchain, there's no need to regenerate it; we're just making the package list work in Fedora 42. [1] `81910b7d88` Closes scylladb/scylladb#23402	2025-03-24 10:18:27 +03:00
Avi Kivity	cd04ab1a4e	test: avoid spaces when defining user-defined literal operator Clang 20 complains when it sees a user-defined literal operator defined with a space before the underscore. Assume it's adhering to the standard and comply. Closes scylladb/scylladb#23401	2025-03-24 10:17:12 +03:00
Pavel Emelyanov	d436fb8045	Merge 'Fix EAR not applied on write to S3 (but on read).' from Calle Wilund Fixes #23225 Fixes #23185 Adds a "wrap_sink" (with default implementation) to sstables::file_io_extension, and moves extension wrapping of file and sink objects to storage level. (Wrapping/handling on sstable level would be problematic, because for file storage we typically re-use the sstable file objects for sinks, whereas for S3 we do not). This ensures we apply encryption on both read and write, whereas we previously only did so on read -> fail. Adds io wrapper objects for adapting file/sink for default implementation, as well as a proper encrypted sink implementation for EAR. Unit tests for io objects and a macro test for S3 encrypted storage included. Closes scylladb/scylladb#23261 * github.com:scylladb/scylladb: encryption: Add "wrap_sink" to encryption sstable extension encrypted_file_impl: Add encrypted_data_sink sstables::storage: Move wrapping sstable components to storage provider sstables::file_io_extension: Add a "wrap_sink" method. sstables::file_io_extension: Make sstable argument to "wrap" const utils: Add "io-wrappers", useful IO helper types	2025-03-24 10:12:46 +03:00
Artsiom Mishuta	8bb6414037	test.py: reuse clusters in Python suite PR https://github.com/scylladb/scylladb/pull/22274 was introduced because of CI instability, with the intent of marking the cluster dirty after each topology test. In fact, it affects only the Python suites, which are quite stable, and CI was stabilized by PR https://github.com/scylladb/scylladb/pull/22252. This PR brings back cluster reuse in the Python test suites. Closes scylladb/scylladb#23179	2025-03-23 20:08:36 +02:00
Kefu Chai	fdc5255eb8	build: disable DPDK for all release builds Previously, DPDK was enabled by default in standard release builds but disabled in "release-pgo" and "release-cs-pgo" builds. This inconsistency caused linking warnings during PGO phase 2, when trained profiles from non-DPDK builds were used with DPDK-enabled builds: ``` [1980/1983] LINK build/release/scylla ld.lld: warning: /home/avi/scylla-maint/build/release/seastar/libseastar.a(reactor.cc.o at 57829248): function control flow change detected (hash mismatch) _ZN7seastar7reactor14run_some_tasksEv Hash = 2095857468992035112 up to 0 count discarded ld.lld: warning: /home/avi/scylla-maint/build/release/seastar/libseastar.a(reactor.cc.o at 57829248): function control flow change detected (hash mismatch) _ZN7seastar7reactor6do_runEv Hash = 2184396189398169723 up to 50134372 count discarded ld.lld: warning: /home/avi/scylla-maint/build/release/seastar/libseastar.a(reactor.cc.o at 57829248): function control flow change detected (hash mismatch) _ZN7seastar18syscall_work_queue11submit_itemESt10unique_ptrINS0_9work_itemESt14default_deleteIS2_EE Hash = 1533150042646546219 up to 1979931 count discarded ``` Since DPDK is not used in production and increases build time, this change disables DPDK across all release build types. This both silences the warnings and improves build performance. Fixes #23323 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23391	2025-03-23 15:26:10 +02:00
Avi Kivity	9adfb91f46	Merge 'Introduce s3 data_source_impl for optimized object streaming' from Pavel Emelyanov Currently, to stream data from sstable component the sstables code uses file_data_source_impl. In case the component is on S3, the s3::readable_file is put into that data source. The data source is configured with 128k buffers and at most 4 read-ahead-s. With that configuration, downloading full object from S3 becomes too slow -- GET-ing file with 128k requests is not nice even with 4 parallel read-ahead-s. Better solution for S3 downloading is to request way larger chunk with one GET and then produce smaller, 128k or alike, buffers upon data arrival. This is what the newly introduced data source impl does -- it spawns a background GET and lets the upper input stream read buffers directly from the arriving body. This PR doesn't yet make sstable layer use the new sink, just introduces it and adds unit and perf tests. Testing (download speed, MB/s): file_input_stream (*), 1 socket: 4.996; file_input_stream (*), 2 sockets: 9.403; s3_data_source (**): 93.164. (*) The file_input_stream test renders 128k GETs and is configured to issue at most 4 read-ahead-s. (**) The s3_data_source uses at most 1 socket regardless of what perf-test configures it to. refs: #22458 Closes scylladb/scylladb#22907 * github.com:scylladb/scylladb: test: Extend s3-perf test with stream download one test/perf: Tune-up s3 test options parsing test: Add unit test for newly introduced download source s3/client: Introduce data_source_impl for object downloading s3/client: Detach format_range_header() helper	2025-03-23 14:22:04 +02:00
Pavel Emelyanov	ca3b604afa	test: Extend s3-perf test with stream download one Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:07 +03:00
Pavel Emelyanov	283e8e0706	test/perf: Tune-up s3 test options parsing Rename the `--upload bool` into `--operation string` one, so that new tests can be added in the future. Also rename run_download() to run_contiguous_get() because this is what the internals of this method do -- just GET contiguous ranges sequentially. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:07 +03:00
Pavel Emelyanov	bd313c581f	test: Add unit test for newly introduced download source Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:06 +03:00
Pavel Emelyanov	1f301b1c5d	s3/client: Introduce data_source_impl for object downloading The new data source implementation runs a single GET for the whole range specified and lends the body input_stream for the upper input_stream's get()-s. Eventually, getting the data from the body stream EOFs or fails. In either case, the existing body is closed and a new GET is spawned with the updated Range header so as not to include the bytes read so far. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:06 +03:00
Pavel Emelyanov	d47719f70e	s3/client: Detach format_range_header() helper The get_object_contiguous() formats the 'bytes=X-Y' one for its GET request. The very same code will be needed by next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:06 +03:00
Avi Kivity	7646e1448a	Merge 'cql3: Introduce RF-rack-valid keyspaces' from Dawid Mędrek This PR is an introductory step towards enforcing RF-rack-valid keyspaces in Scylla. The scope of changes: * defining RF-rack-valid keyspaces, * introducing a configuration option enforcing RF-rack-valid keyspaces, * restricting the CREATE and ALTER KEYSPACE statements so that they never lead to RF-rack invalid keyspaces, * during the initialization of a node, it verifies that all existing keyspaces are RF-rack-valid. If not, the initialization fails. We provide tests verifying that the changes behave as intended. --- Note that there are a number of things that still need to be implemented. That includes, for instance, restricting topology operations too. --- Implementation strategy (going beyond the scope of this PR): 1. Introduce the new configuration option `rf_rack_valid_keyspaces`. 2. Start enforcing RF-rack-validity in keyspaces if the option is enabled. 3. Adjust the tests: in the tree and out of it. Explicitly enable the option in all tests. 4. Once the tests have been adjusted, change the default value of the option to enabled. 5. Stop explicitly enabling the option in tests. 6. Get rid of the option. --- Fixes scylladb/scylladb#20356 Fixes scylladb/scylladb#23276 Fixes scylladb/scylladb#23300 --- Backport: this is part of the requirements for releasing 2025.1. Closes scylladb/scylladb#23138 * github.com:scylladb/scylladb: main: Refuse to start node when RF-rack-invalid keyspace exists cql3: Ensure that CREATE and ALTER never lead to RF-rack-invalid keyspaces db/config: Introduce RF-rack-valid keyspaces	2025-03-20 19:10:36 +02:00
Paweł Zakrzewski	0d14177409	audit/syslog: escape quotes and add explicit section names Before this change we outputted CSV-like structure, that looked like the following: Feb 27 12:31:30 scylla-audit: "10.200.200.41:0", "AUTH", "", "", "", "", "10.200.200.41:0", "cassandra", "false" While this is passably readable for humans, the ordering of fields is not clear and can be confusing. Furthermore, the `"` character (double quote) was not escaped. This is not an issue for CQL, but will be a problem for auditing Alternator, which will require logging JSON payloads. The new format will consist of key=value pairs and will escape the quote character, making it easy to parse programmatically. Feb 28 02:21:56 scylla-audit: node="10.200.200.41:0", category="AUTH", cl="", error="false", keyspace="", query="", client_ip="10.200.200.41:0", table="", username="cassandra" This is required for the auditing alternator feature. Closes scylladb/scylladb#23099	2025-03-20 19:55:51 +03:00
Calle Wilund	5c6337b887	encryption: Add "wrap_sink" to encryption sstable extension Creates a more efficient data_sink wrapper for encrypted output stream (S3).	2025-03-20 14:54:24 +00:00
Calle Wilund	9ac9813c62	encrypted_file_impl: Add encrypted_data_sink Adds a sibling type to encrypted file, a data_sink, that will write a data stream in the same block format as a file object would. Including end padding. For making encrypted data sink writing less cumbersome.	2025-03-20 14:54:24 +00:00
Calle Wilund	e02be77af7	sstables::storage: Move wrapping sstable components to storage provider Fixes #23225 Fixes #23185 Moved wrapping component files/sinks to storage provider. Also ensures to wrap data_sinks as well as actual files. This ensures that we actually write encryption if active.	2025-03-20 14:54:24 +00:00
Calle Wilund	d46dcbb769	sstables::file_io_extension: Add a "wrap_sink" method. Similar to wrap file, should wrap a data_sink (used for sstable writers), in obvious write-only, simple stream mode. Default impl will detect if we wrap files for this component, and if so, generate a file wrapper for the input sink, wrap this, and then wrap it in a file_data_sink_impl. This is obviously not efficient, so extensions used in actual non-test code should implement the method.	2025-03-20 14:54:22 +00:00
Calle Wilund	e100af5280	sstables::file_io_extension: Make sstable argument to "wrap" const This matches the signature of call sites. Since the only "real" extension to actually make a marker in the sstable will do so in the scylla component, which is writable even in a const sstable, this is ok.	2025-03-20 14:54:09 +00:00
Calle Wilund	98a6d0f79c	utils: Add "io-wrappers", useful IO helper types Mainly to add a somewhat functional file-impl wrapping a data_sink. This can implement a rudimentary, write-only, file based on any output sink. For testing, and because they fit there, place memory sink and source types there as well.	2025-03-20 14:54:09 +00:00
David Garcia	209ea2ea27	docs: update issues label Closes scylladb/scylladb#23304	2025-03-20 17:46:58 +03:00
Kefu Chai	c37149d106	test: stop using seastar::at_exit() seastar::at_exit() was marked deprecated recently, so let's use the recommended approach to perform cleanups. The following tests were updated in this change: - scylla perf-tablets: tested with scylla perf-tablets - scylla perf-row-cache-update: tested with scylla perf-row-cache-update - scylla perf-fast-forward: tested with scylla perf-fast-forward --populate --run-tests small-partition-skips \ --smp 1 scylla perf-fast-forward --run-tests small-partition-skips \ --smp 1 - scylla perf-load-balancing: tested with scylla perf-load-balancing --nodes 3 --tablets1 16 --tablets2 16 --rf1 3 --rf2 3 --shards 16 - unit/row_cache_stress_test: tested with row_cache_stress_test --seconds 10 - perf/perf_cache_eviction: tested with ./perf_cache_eviction --seconds 1 --smp 1 - perf/perf_row_cache_reads: tested with ./perf_row_cache_reads Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23356	2025-03-20 17:44:57 +03:00
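
The s3/client commits listed above (1f301b1c5d and d47719f70e) describe a download that issues one GET for a whole byte range and, when the body hits EOF or fails, issues a new GET whose Range header skips the bytes already read. The following is a minimal Python sketch of that idea, not the Scylla implementation (which is C++ on Seastar). Only the `bytes=X-Y` header shape and the helper name `format_range_header` come from the commit messages; its signature, `download_range`, the `fetch` callback, and `max_requests` are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable


def format_range_header(first: int, last: int) -> str:
    """Format an HTTP Range value 'bytes=X-Y' (both ends inclusive)."""
    if first < 0 or last < first:
        raise ValueError(f"invalid byte range {first}-{last}")
    return f"bytes={first}-{last}"


def download_range(
    fetch: Callable[[str], Iterable[bytes]],  # hypothetical: Range value -> body chunks
    first: int,
    last: int,
    max_requests: int = 8,  # assumption: bound the number of GETs so the sketch terminates
) -> bytes:
    """Read bytes first..last, re-requesting only the part not yet received."""
    received = bytearray()
    next_byte = first
    for _ in range(max_requests):
        if next_byte > last:
            break  # the whole range has arrived
        try:
            # One GET for everything still missing; consume the body as it arrives.
            for chunk in fetch(format_range_header(next_byte, last)):
                received += chunk
                next_byte += len(chunk)
        except ConnectionError:
            # The body failed part-way. Chunks read so far are kept, and the
            # next iteration asks only for the remainder.
            pass
    if next_byte <= last:
        raise OSError(f"gave up after {max_requests} requests at byte {next_byte}")
    return bytes(received)
```

Here `fetch` stands in for whatever performs the GET and yields the body in pieces. Because `next_byte` advances with every chunk, a retry after an early EOF or a connection error never re-downloads bytes that were already delivered.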

47201 Commits