Before this patch, granting a user MODIFY permissions on ALL KEYSPACES allowed the user to write to system tables, where the user could also set himself to "superuser" granting him all other permissions. After this patch, MODIFY permissions on ALL KEYSPACES is limited only to non-system keyspaces.
Fixes: scylladb/scylladb#23218Closesscylladb/scylladb#23219
This change ports test/cluster/test_resurrection.py from enterprise to
master. Because the underlying issue deals with file based streaming,
this test was a part of the enterprise repo. It contains the test and
reproducer for the issue described below:
When tablets are migrated with file-based streaming, we can have a situation
where a tombstone is garbage collected before the data it shadows lands. For
instance, if we have a tablet replica with 3 sstables:
1 sstable containing an expired tombstone
2 sstable with additional data
3 sstable containing data which is shadowed by the expired tombstone in sstable 1
If this tablet is migrated, and the sstables are streamed in the order listed
above, the first two sstables can be compacted before the third sstable arrives.
In that case, the expired tombstone will be garbage collected, and data in the
third sstable will be resurrected after it arrives to the pending replica.
The fix for the issue was merged in b66479ea98
This patch only ports the missing test.
Closesscylladb/scylladb#23466
Function modification_statement::add_raw() is never called, which
makes query string in audit_info of batch queries empty. In enterprise
branch, add_raw is called in Cql.g and those changes were never merged
to master.
This changes:
- Add missing call of add_raw() to Cql.g
- Include other related changes (from PR#3228 in scylla-enterprise)
Fixes scylladb#23311
Closesscylladb/scylladb#23315
On my system (Nix), the compiler produces a `-dynamic-linker=/nix/store/...` in
the linker call scanned by get_padded_dynamic_linker_option.
But the regex can't deal with the `=` there, it requires a ` `. Fix that.
We also do the same in configure.py, and remove the Nix-specific hack
which used to disable the entire mechanism.
Closesscylladb/scylladb#22308
Tests in `test_service_level_api` were written before
scylladb/scylladb#16585 and they were doing 10s sleeps to wait for
service level controller to update its configuration. Now performing
a read barrier is sufficient to ensure SL configuration is up-to-date,
which significantly reduces tests time (from ~60s to ~2-3s).
Moreover, there was flakiness in the `test_switch_tenants` test.
Until now, the test waited up to 60s for the connections to update
their scheduling groups. However, it is difficult to determine
how long the process might take because a connection may be blocked
while waiting for the next request to be processed,
and the scheduling group will be updated only after a request is processed
(see `generic_server::connection::process_until_tenant_switch()`).
To address this issue, 100 simple queries are executed so that
connections on all shards process at least one request
and update their scheduling groups.
Fixesscylladb/scylladb#22768Closesscylladb/scylladb#23381
GetInt() was observed to fail when the integer JSON value overflows the
int32_t type, which `GetInt()` uses for storage. When this happens,
rapidjson will assign a distinct 64 bit integer type to the value, and
attempting to access it as 32 bit integer triggers the wrong-type error,
resulting in assert failure. This was hit on the field where invoking
nodetool netstats resulted in nodetool crashing when the streamed bytes
amounts were higher than maxint.
To avoid such bugs in the future, replace all usage of GetInt() in
nodetool of GetInt64(), just to be sure.
A reproducer is added to the nodetool netstats crash.
Fixes: scylladb/scylladb#23394Closesscylladb/scylladb#23395
This PR includes several fixes to the nowadays flaky test_restore_with_streaming_scopes test.
1. Check that backup and restore APIs don't fail. Currently, if either of them does the test cases fails anyway checking that the data is not restored back, but it's better to know what exactly failed
2. For restore API the test collects the list of sstables to restore from. Currently collecting this list races with background compaction and sometimes leads to restore API to fail which, in turn, makes the whole test to fail
3. Add a test case that validates that restore-from-missing-sstable fails nicely
refs: #23189
No backport, as it's a relatively new test
Closesscylladb/scylladb#23445
* github.com:scylladb/scylladb:
test/backup: Validate that restoring from non-existing sstables fails
test/backup: Collect sstables names after snapshot
test/backup: Check that backup and restore succeed
Normally, when a node is shutting down, `gate_closed_exception` and `rpc::closed_error`
in `send_to_live_endpoints` should be ignored. However, if these exceptions are wrapped
in a `nested_exception`, an error message is printed, causing tests to fail.
This commit adds handling for nested exceptions in this case to prevent unnecessary
error messages.
Fixesscylladb/scylladb#23325Fixesscylladb/scylladb#23305Fixesscylladb/scylladb#21815
Backport: looks like this is quite a frequent issue, therefore backport to 2025.1.
Closesscylladb/scylladb#23336
* github.com:scylladb/scylladb:
database: Pass schema_ptr as const ref in `wrap_commitlog_add_error`
database: Unify exception handling in `do_apply` and `apply_with_commitlog`
storage_proxy: Ignore wrapped `gate_closed_exception` and `rpc::closed_error` when node shuts down.
exceptions: Add `try_catch_nested` to universally handle nested exceptions of the same type.
Filter out sstables which don't have a TOC or have a temporary TOC. Such sstables are incomplete and can dissapear if the compaction which writes them is interrupted.
Fixes: #23203
This PR fixes a flaky test which is only on master, no backports required.
Closesscylladb/scylladb#23450
* github.com:scylladb/scylladb:
test/cqlpy/test_tools.py: test_scylla_sstable_query: reduce scope of no-compaction context
test/clqpy/test_tool.py: get_sstables_for_table(): exclude non-sealed sstables
The test fails sporadically with:
cassandra.ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed for test3.test2 - received 1 responses and 1 failures from 2 CL=QUORUM." info={'consistency': 'QUORUM', 'required_responses': 2, 'received_responses': 1, 'failures': 1}
That's becase a server is stopped in the middle of the workload.
The server is stopped ungracefully which will cause some requests to
time out. We should stop it gracefully to allow in-flight requests to
finish.
Fixes#20492Closesscylladb/scylladb#23451
Move starting LDAP to the method where the rest of the services are started. This will unify the way of starting the 3rd party services.
Fix LDAP tests flakiness due not possible to connect to LDAP server.
Add catching stdout and stderr of toxiproxy-cli in case of errors
Related: https://github.com/scylladb/scylladb/pull/23333
This PR is based on https://github.com/scylladb/scylladb/pull/23221, so #23221 should be merged first.
Closesscylladb/scylladb#23235
* github.com:scylladb/scylladb:
test.py: Refactor nodetool/conftest
test.py: Refactor test/pylib/cpp/ldap
test.py: move starting LDAP service to dedicate method
Filter out sstables which don't have a TOC or have a temporary TOC. Such
sstables are incomplete and can dissapear if the compaction which writes
them is interrupted.
Bootstrap or replace can take a long time, but
since feef7d3fa1,
the stop_signal is checked only in checkpoints,
and in particular, abort isn't requested during
join_cluster.
Fixes#23222
* requires backport on top of https://github.com/scylladb/scylladb/pull/23184Closesscylladb/scylladb#23306
* github.com:scylladb/scylladb:
main: allow abort during join_cluster
main: add checkpoint before joining cluster
storage_service: add start_sys_dist_ks
Normally, when a node is shutting down, `gate_closed_exception` and `rpc::closed_error`
in `send_to_live_endpoints` should be ignored. However, if these exceptions are wrapped
in a `nested_exception`, an error message is printed, causing tests to fail.
This commit adds handling for nested exceptions in this case to prevent unnecessary
error messages.
Fixesscylladb/scylladb#23325
After recent changes #18640 and #19151 started to reproduce for
stop_after_sending_join_node_request and
stop_after_bootstrapping_initial_raft_configuration error injections too.
The solution is the same: deselect the tests.
Fixes#23302Closesscylladb/scylladb#23405
When restore API is called and is given a non-existing sstable (object
name) the task should complete with failed status and some meaningful
message in the error text.
refs: #23189
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The scoped restoer test works like this
- populate table
- flush it
- collect list of sstables
- take snapshot
- backup
- restore (with the list of sstables as argument)
- check the data is back
Steps 2 and 3 are racy -- in case compaction comes in the middle, the
list of collected sstables would differ from those snapshotted (and
backuped) which will later lead to restore failure due to missing
sstable.
Fix by collecting the list of sstables after taking snapshot, and
collect those not from the datadir, but from the snapshot dir.
fixes: #23189
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The scoped-restore test calls backup and restore APIs on several nodes,
but doesn't check if any of the operations actually succeeds. Sometimes
they indeed don't and test captures this, but in a weird manner -- the
post-test checks for data presense fails, because the expected data is
not in fact in its place.
It's more debugging-friendly if we know in advance if backup or restore
fails, rather than see that some data is missing after (failed) restore.
refs: #23189
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
During messaging_service object creation remove_rpc_client function may
be called if prefer_local snitch setting is true. The caller does not
provide host id, so _address_to_host_id_mapper is called to obtain it,
but at this point the function is not initialized yet.
The patch fixes the code to not call the function if not initialized.
This is not the problem since during messaging_service creation there
is no connection to drop.
Fixes: #23353
Message-ID: <Z-J2KbBK8NoFNYZZ@scylladb.com>
When we rename columns in a table which has materialized views depending
on it, we need to also rename them in the materialized views' WHERE
clauses.
Currently, we do that by creating a new WHERE clause after each rename,
with the updated column. This is later converted to a mutation that
overwrites the WHERE clause. After multiple renames, we have multiple
mutations, each overwriting the WHERE clause with one column renamed.
As a result, the final WHERE clause is one of the modified clauses with
one column renamed.
Instead, we should prepare one new WHERE clause which includes all the
renamed columns. This patch accomplishes this by processing all the
column renames first, and only preparing the new view schema with the
new WHERE clause afterwards.
This patch also includes a test reproducer for this scenario.
Fixesscylladb/scylladb#22194Closesscylladb/scylladb#23152
This fixes an issue where materialized view tablets are not split
because they are not registered as split candidates by the storage
service.
The code in storage_service::replicate_to_all_cores was changed in
4bfa3060d0 to handle normal tables and view tables separately, but with
that change register_tablet_split_candidate is applied only to normal
tables and not every table like before. We fix it by registering view
tables as well.
We add a test to verify that split of MV tables works.
Closesscylladb/scylladb#23335
these unused includes were identified by clang-include-cleaner. after auditing these source files, all of the reports have been confirmed. also, updated the "iwyu.yaml" (short for include what you use) workflow to include "service" and "raft" subdirectories to prevent future regressions of including unused headers in them.
---
it's a cleanup, hence no need to backport.
Closesscylladb/scylladb#23373
* github.com:scylladb/scylladb:
.github: add "raft" and "service" subdirectories to CLEANER_DIR
service: do not include unused headers
fmt 11.1 apparently marks to_string() as [[nodiscard]]. Here we aren't
interested in the result, so explicitly ignore it to avoid an error.
Closesscylladb/scylladb#23403
There are in fact two python magic packages, file-magic (that binds
to libmagic and comes from the file package), magic, an independent
one. The name we use in install-depedencies.sh, python3-magic,
resolves to file-magic.
In Fedora 42, the resolution from the name python3-magic to
file-magic was removed [1], and so install-dependencies.sh now tries
to install the wrong magic package, which turns out not to coexist
with the one we want anyway.
Fix by naming python3-file-magic directly instead. Since this is what's
installed in the current frozen toolchain, there's no need to
regenerate it; we're just making the package list work in Fedora 42.
[1] 81910b7d88Closesscylladb/scylladb#23402
Clang 20 complains when it sees a user-defined literal operator
defined with a space before the underscore. Assume it's adhering
to the standard and comply.
Closesscylladb/scylladb#23401
Fixes#23225Fixes#23185
Adds a "wrap_sink" (with default implementation) to sstables::file_io_extension, and moves
extension wrapping of file and sink objects to storage level.
(Wrapping/handling on sstable level would be problematic, because for file storage we typically re-use the sstable file objects for sinks, whereas for S3 we do not).
This ensures we apply encryption on both read and write, whereas we previously only did so on read -> fail.
Adds io wrapper objects for adapting file/sink for default implementation, as well as a proper encrypted sink implementation for EAR.
Unit tests for io objects and a macro test for S3 encrypted storage included.
Closesscylladb/scylladb#23261
* github.com:scylladb/scylladb:
encryption: Add "wrap_sink" to encryption sstable extension
encrypted_file_impl: Add encrypted_data_sink
sstables::storage: Move wrapping sstable components to storage provider
sstables::file_io_extension: Add a "wrap_sink" method.
sstables::file_io_extension: Make sstable argument to "wrap" const
utils: Add "io-wrappers", useful IO helper types
Previously, DPDK was enabled by default in standard release builds but disabled
in "release-pgo" and "release-cs-pgo" builds. This inconsistency caused linking
warnings during PGO phase 2, when trained profiles from non-DPDK builds were
used with DPDK-enabled builds:
```
[1980/1983] LINK build/release/scylla
ld.lld: warning: /home/avi/scylla-maint/build/release/seastar/libseastar.a(reactor.cc.o at 57829248): function control flow change detected (hash mismatch) _ZN7seastar7reactor14run_some_tasksEv Hash = 2095857468992035112 up to 0 count discarded
ld.lld: warning: /home/avi/scylla-maint/build/release/seastar/libseastar.a(reactor.cc.o at 57829248): function control flow change detected (hash mismatch) _ZN7seastar7reactor6do_runEv Hash = 2184396189398169723 up to 50134372 count discarded
ld.lld: warning: /home/avi/scylla-maint/build/release/seastar/libseastar.a(reactor.cc.o at 57829248): function control flow change detected (hash mismatch) _ZN7seastar18syscall_work_queue11submit_itemESt10unique_ptrINS0_9work_itemESt14default_deleteIS2_EE Hash = 1533150042646546219 up to 1979931 count discarded
```
Since DPDK is not used in production and increases build time, this
change disables DPDK across all release build types. This both silences
the warnings and improves build performance.
Fixes#23323
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#23391
Currently, to stream data from sstable component the sstables code uses file_data_source_impl. In case the component is on S3, the s3::readable_file is put into that data source. The data source is configured with 128k buffers and at most 4 read-ahead-s. With that configuration, downloading full object from S3 becomes too slow -- GET-ing file with 128k requests is not nice even with 4 parallel read-ahead-s.
Better solution for S3 downloading is to request way larger chunk with one GET and then produce smaller, 128k or alike, buffers upon data arrival. This is what the newly introduced data source impl does -- it spawns a background GET and lets the upper input stream read buffers directly from the arriving body.
This PR doesn't yet make sstable layer use the new sink, just introduces it and adds unit and perf tests.
Testing
|Test|Download speed, MB/s|
|-|-|
|file_input_stream (*), 1 socket | 4.996|
|file_input_stream (*), 2 sockets | 9.403|
|s3_data_source (**) | 93.164|
(*) The file_input_stream test renders 128k GETs and is configured to issue at most 4 read-ahead-s
(**) The s3_data_source uses at most 1 socket regardless of what perf-test configures it to
refs: #22458Closesscylladb/scylladb#22907
* github.com:scylladb/scylladb:
test: Extend s3-perf test with stream download one
test/perf: Tune-up s3 test options parsing
test: Add unit test for newly introduced download source
s3/client: Introduce data_source_impl for object downloading
s3/client: Detach format_range_header() helper
Rename the `--upload bool` into `--operation string` one, so that new
tests can be added in the future. Also rename run_download() to
run_contiguous_get() because this is what the internals of this method
do -- just GET contiguous ranges sequentially.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The new data source implementation runs a single GET for the whole range
specified and lends the body input_stream for the upper input_stream's
get()-s. Eventually, getting the data from the body stream EOFs or
fails. In either case, the existing body is closed and a new GET is
spawn with the updater Range header so that not to include the bytes
read so far.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The get_object_contiguous() formats the 'bytes=X-Y' one for its GET
request. The very same code will be needed by next patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This PR is an introductory step towards enforcing
RF-rack-valid keyspaces in Scylla.
The scope of changes:
* defining RF-rack-valid keyspaces,
* introducing a configuration option enforcing RF-rack-valid
keyspaces,
* restricting the CREATE and ALTER KEYSPACE statements
so that they never lead to RF-rack invalid keyspaces,
* during the initialization of a node, it verifies that all existing
keyspaces are RF-rack-valid. If not, the initialization fails.
We provide tests verifying that the changes behave as intended.
---
Note that there are a number of things that still need to be implemented.
That includes, for instance, restricting topology operations too.
---
Implementation strategy (going beyond the scope of this PR):
1. Introduce the new configuration option `rf_rack_valid_keyspaces`.
2. Start enforcing RF-rack-validity in keyspaces if the option is enabled.
3. Adjust the tests: in the tree and out of it. Explicitly enable the option in all tests.
4. Once the tests have been adjusted, change the default value of the option to enabled.
5. Stop explicitly enabling the option in tests.
6. Get rid of the option.
---
Fixesscylladb/scylladb#20356Fixesscylladb/scylladb#23276Fixesscylladb/scylladb#23300
---
Backport: this is part of the requirements for releasing 2025.1.
Closesscylladb/scylladb#23138
* github.com:scylladb/scylladb:
main: Refuse to start node when RF-rack-invalid keyspace exists
cql3: Ensure that CREATE and ALTER never lead to RF-rack-invalid keyspaces
db/config: Introduce RF-rack-valid keyspaces
Before this change we outputted CSV-like structure, that looked like the
following:
Feb 27 12:31:30 scylla-audit: "10.200.200.41:0", "AUTH", "", "", "", "", "10.200.200.41:0", "cassandra", "false"
While this is passably readable for humans, the ordering of fields is
not clear and can be confusing. Furthermore, the `"` character (double
quote) was not escaped. This is not an issue for CQL, but will be a
problem for auditing Alternator, which will require logging JSON
payloads.
The new format will consist of key=value pairs and will escape the quote
character, making it easy to parse programmatically.
Feb 28 02:21:56 scylla-audit: node="10.200.200.41:0", category="AUTH", cl="", error="false", keyspace="", query="", client_ip="10.200.200.41:0", table="", username="cassandra"
This is required for the auditing alternator feature.
Closesscylladb/scylladb#23099
Adds a sibling type to encrypted file, a data_sink, that
will write a data stream in the same block format as a file
object would. Including end padding.
For making encrypted data sink writing less cumbersome.
Fixes#23225Fixes#23185
Moved wrapping component files/sinks to storage provider. Also ensures
to wrap data_sinks as well as actual files. This ensures that we actually
write encryption if active.
Similar to wrap file, should wrap a data_sink (used for
sstable writers), in obvious write-only, simple stream
mode.
Default impl will detect if we wrap files for this component,
and if so, generate a file wrapper for the input sink, wrap
this, and the wrap it in a file_data_sink_impl.
This is obviously not efficient, so extensions used in actual
non-test code should implement the method.
This matches the signature of call sites. Since the only "real"
extension to actually make a marker in the sstable will do so in
the scylla component, which is writable even in a const sstable,
this is ok.
Mainly to add a somewhat functional file-impl wrapping
a data_sink. This can implement a rudimentary, write-only,
file based on any output sink.
For testing, and because they fit there, place memory
sink and source types there as well.