These tests check serialization and deserialization (including JSON),
basic inserts and selects, aggregate functions, element validation,
vector usage in user defined types and functions.
test_vector_between_user_types is a translated Apache Cassandra test
to check if it is handled properly internally.
This PR contains the missing part of a fix for scylladb/scylla-enterprise#4912 which was omitted during migration of workload prioritization to the source available repository. Even though the regression test for it was ported, it was silently made ineffective by a different fix (scylladb/scylla-enterprise#4764), so this PR also improves the test.
Fixes: scylladb/scylladb#22404
No need to backport - service levels are not yet a part of any source-available release.
Closesscylladb/scylladb#22416
* github.com:scylladb/scylladb:
test/auth_cluster: make test_service_level_metric_name_change useful
main: rename `cql_sg_stats` metrics on scheduling group rename
In order to reduce the dependency on external libraries, and for better integration with ranges in C++ standard library. let's use the homebrew `utils::views::unique()` before unique is accepted by the C++ standard.
---
it's a cleanup, hence no need to backport.
Closesscylladb/scylladb#22393
* github.com:scylladb/scylladb:
cql3, test: switch from boost::adaptors::uniqued to utils::views:unique
utils: implement drop-in replacement for replacing boost::adaptors::uniqued
The log file names created in `scylla_cluster.py` by
`ScyllaClusterManager`
and files to be collected in conftest.py by `manager` should be in
sync. This patch fixes the issue, originally introduced in
scylladb/scylladb#22192Fixesscylladb/scylladb#22387
Backports: 6.1 and 6.2.
Closesscylladb/scylladb#22415
In order to reduce the dependency on external libraries, and for better
integration with ranges in C++ standard library. let's use the homebrew
`utils::views::unique()` before unique is accepted by the C++ standard.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Add a custom implementation of boost::adaptors::uniqued that is compatible
with C++20 ranges library. This bridges the gap between Boost.Range and
the C++ standard library ranges until std::views::unique becomes available
in C++26. Currently, the unique view is included in
[P2214](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2760r0.html)
"A Plan for C++ Ranges Evolution", which targets C++26.
The implementation provides:
- A lazy view adaptor that presents unique consecutive elements
- No modification of source range
- Compatibility with C++20 range views and concepts
- Lighter header dependencies compared to Boost
This resolves compilation errors when piping C++20 range views to
boost::adaptors::uniqued, which fails due to concept requirements
mismatch. For example:
```c++
auto range = std::views::take(n) | boost::adaptors::uniqued; // fails
```
This change also offers us a lightweight solution in terms of smaller
header dependency.
While std::ranges::unique exists in C++23, it's an eager algorithm that
modifies the source range in-place, unlike boost::adaptors::uniqued which
is a lazy view. The proposed std::views::unique (P2214) targeting C++26
would provide this functionality, but is not yet available.
This implementation serves as an interim solution for filtering consecutive
duplicate elements using range views until std::views::unique is
standardized.
For more details on the differences between `std::ranges::unique` and
`boost::adaptors::uniqued`:
- boost::adaptors::uniqued is a view adaptor that creates a lazy view over the original range. It:
* Doesn't modify the source range
* Returns a view that presents unique consecutive elements
* Is non-destructive and lazy-evaluated
* Can be composed with other views
- std::ranges::unique is an algorithm that:
* Modifies the source range in-place
* Removes consecutive duplicates by shifting elements
* Returns an iterator to the new logical end
* Cannot be used as a view or composed with other range adaptors
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Currently, when we load a frozen schema into the registry, we lose
the base info if the schema was of a view. Because of that, in various
places we need to set the base info again, and in some codepaths we
may miss it completely, which may make us unable to process some
requests (for example, when executing reverse queries on views).
Even after setting the base info, we may still lose it if the schema
entry gets deactivated due to all `schema_ptr`s temporarily dying.
To fix this, this patch adds the base schema to the registry, alongside
the view schema. We store just the frozen base schema, so that we can
transfer it across shards. With the base schema, we can now set the base
info when returning the schema from the registry. As a result, we can now
assume that all view schemas returned by the registry have base_info set.
In this series we also make sure that the view schemas in the registry are
kept up-to-date in regards to base schema changes.
Fixes https://github.com/scylladb/scylladb/issues/21354
This issue is a bug, so adding backport labels 6.1 and 6.2
Closesscylladb/scylladb#21862
* github.com:scylladb/scylladb:
test: add test for schema registry maintaining base info for views
schema_registry: avoid setting base info when getting the schema from registry
schema_registry: update cached base schemas when updating a view
schema_registry: cache base schemas for views
db: set base info before adding schema to registry
As discussed in
https://github.com/scylladb/scylladb/issues/12263#issuecomment-1853576813,
compact storage tables are deprecated.
Yet, there's is nothing in the code that prevents users
from creating such tables.
This patch adds a live-updateable config option:
`enable_create_table_with_compact_storage`, set to
`false` by default, that require users to opt-in
in order to create new tables WITH COMPACT STORAGE.
Refs scylladb/scylladb#12263, scylladb/scylladb#16375
* Since this guardrail is an enhancement, no backport is needed
Closesscylladb/scylladb#16403
* github.com:scylladb/scylladb:
docs: ddl: document the deprecation of compact tables
test: enable_create_table_with_compact_storage for tests that need it
config: add enable_create_table_with_compact_storage
The test test_service_level_metric_name_change was originally introduced
to serve as a regression test for scylladb/scylla-enterprise#4912.
Before the fix, some per-scheduling-group metrics would not get adjusted
when the scheduling group gets renamed (which does happen for SL-managed
scheduling groups) and it would be possible to attempt to register
metrics with the same set of labels, resulting in an error.
However, in scylladb/scylla-enterprise#4764, another bug was fixed which
affected the test. Before a service level is created, a "test"
scheduling group can be created by service level controller if it is
unsure whether it is allowed to create more scheduling groups or not. If
creation of the scheduling group succeeds, it is put into the pool of
scheduling groups to be reused when a new service level is created.
Therefore, the node handling CREATE SERVICE LEVEL would always use the
scheduling group that was originally created for the sake of the test as
a SG for the new service level.
All of the above is intentional and was actually fixed by the
aforementioned issue. However, the test scheduling groups would always
get unique names and, therefore, the error would no longer reproduce.
However, the faulty logic that ran previously and caused the bug still
runs - when a node updates its service levels cache on group0 reload.
The test previously used only one node. Fix it by starting two nodes
instead of one at the beginning of the test and by serving all service
level commands to the first node - were the issue not fixed, the error
would get triggered on the second node.
The repair_time in system.tablets will be updated when repair runs
successfully. We can now use it to update the repair time for tombstone
gc, i.e, when the system.tablets.repair_time is propagated, call
gc_state.update_repair_time() on the node that is the owner of the
tablet.
Since b3b3e880d3 ("repair: Reduce hints and batchlog flush"), the
repair time that could be used for tombstone gc might be smaller than
when the repair is started, so the actual repair time for tombstone gc
is returned by the repair rpc call from the repair master node.
Fixes#17507
New feature. No backport is needed.
Closesscylladb/scylladb#21896
* github.com:scylladb/scylladb:
repair: Stop using rpc to update repair time for repairs scheduled by scheduler
repair: Wire repair_time in system.tablets for tombstone gc
test: Disable flush_cache_time for two tablet repair tests
test: Introduce guarantee_repair_time_next_second helper
repair: Return repair time for repair_service::repair_tablet
service: Add tablet_operation.hh
File based stream is a new feature that optimizes tablet movement
significantly. It streams the entire SSTable files without deserializing
SSTable files into mutation fragments and re-serializing them back into
SSTables on receiving nodes. As a result, less data is streamed over the
network, and less CPU is consumed, especially for data models that
contain small cells.
The following patches are imported from the scylla enterprise:
*) Merge 'Introduce file stream for tablet' from Asias He
This patch uses Seastar RPC stream interface to stream sstable files on
network for tablet migration.
It streams sstables instead of mutation fragments. The file based
stream has multiple advantages over the mutation streaming.
- No serialization or deserialization for mutation fragments
- No need to read and process each mutation fragments
- On wire data is more compact and smaller
In the test below, a significant speed up is observed.
Two nodes, 1 shard per node, 1 initial_tablets:
- Start node 1
- Insert 10M rows of data with c-s
- Bootstrap node 2
Node 1 will migration data to node2 with the file stream.
Test results:
1) File stream: bytes on wire = 1132006250 bytes, bw = 836MB/s
[shard 0:stre] stream_blob - stream_sstables[eadaa8e0-a4f2-4cc6-bf10-39ad1ce106b0]
Finished sending sstable_nr=2 files_nr=18 files={} range=(-1,9223372036854775807] bytes_sent=1132006250 stream_bw=836MB/s
[shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 1.08004s seconds
2) Mutation stream: bytes on wire = 3030004736 bytes, bw = 125410.87 KiB/s = 128MB/s
[shard 0:stre] stream_session - [Stream #406dc8b0-56b5-11ee-bc2d-000bf4871058]
Streaming plan for Tablet migration-ks1-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=2958989 KiB, 125410.87 KiB/s
[shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 23.5992s seconds
Test Summary:
File stream v.s. Mutation stream improvements
- Stream bandwidth = 836 / 128 (MB/s) = 6.53X
- Stream time = 23.60 / 1.08 (Seconds) = 21.85X
- Stream bytes on wire = 3030004736 / 1132006250 (Bytes)= 2.67X
Closes scylladb/scylla-enterprise#3438
* github.com:scylladb/scylla-enterprise:
tests: Add file_stream_test
streaming: Implement file stream for tablet
*) streaming: Use new take_storage_snapshot interface
The new take_storage_snapshot returns a file object instead of a file
name. This allows the file stream sender to read from the file even if
the file is deleted by compaction.
Closes scylladb/scylla-enterprise#3728
*) streaming: Protect unsupported file types for file stream
Currently, we assume the file streamed over the stream_blob rpc verb is
a sstable file. This patch rejects the unsupported file types on the
receiver side. This allows us to stream more file types later using the
current file stream infrastructure without worrying about old nodes
processing the new file types in the wrong way.
- The file_ops::noop is renamed to file_ops::stream_sstables to be
explicit about the file types
- A missing test_file_stream_error_injection is added to the idl
Fixes: #3846
Tests: test_unsupported_file_ops
Closesscylladb/scylla-enterprise#3847
*) idl: Add service::session_id id to idl
It will be used in the next patch.
Refs #3907
*) streaming: Protect file stream with topology_guard
Similar to "storage_service, tablets: Use session to guard tablet
streaming", this patch protects file stream with topology_guard.
Fixes#3907
*) streaming: Take service topology_guard under the try block
Taking the service::topology_guard could throw. Currently, it throws
outside the try block, so the rpc sink will not be closed, causing the
following assertion:
```
scylla: seastar/include/seastar/rpc/rpc_impl.hh:815: virtual
seastar::rpc::sink_impl<netw::serializer,
streaming::stream_blob_cmd_data>::~sink_impl() [Serializer =
netw::serializer, Out = <streaming::stream_blob_cmd_data>]: Assertion
`this->_con->get()->sink_closed()' failed.
```
To fix, move more code including the topology_guard taking code to the
try block.
Fixes https://github.com/scylladb/scylla-enterprise/issues/4106Closesscylladb/scylla-enterprise#4110
*) Merge 'Preserve original SSTable state with file based tablet migration' from Raphael "Raph" Carvalho
We're not preserving the SSTable state across file based migration, so
staging SSTables for example are being placed into main directory, and
consequently, we're mixing staging and non-staging data, losing the
ability to continue from where the old replica left off.
It's expected that the view update backlog is transferred from old
into new replica, as migration doesn't wait for leaving replica to
complete view update work (which can take long). Elasticity is preferred.
So this fix guarantees that the state of the SSTable will be preserved
by propagating it in form of subdirectory (each subdirectory is
statically mapped with a particular state).
The staging sstables aren't being registered into view update generator
yet, as that's supposed to be fixed in OSS (more details can be found
at https://github.com/scylladb/scylladb/issues/19149).
Fixes#4265.
Closesscylladb/scylla-enterprise#4267
* github.com:scylladb/scylla-enterprise:
tablet: Preserve original SSTable state with file based tablet migration
sstables: Add get method for sstable state
*) sstable: (Re-)add shareabled_components getter
*) Merge 'File streaming sstables: Use sstable source/sink to transfer snapshots' from Calle Wilund
Fixes#4246
Alternative approach/better separation of concern, transport vs. sstable layer. Builds on #4472, but fancier.
Ensures we transfer and pre-process scylla metadata for streamed
file blobs first, then properly apply receiving nodes local config
by using a source and sink layer exported from sstables, which
handles things like ordering, metadata filtering (on source) as well
as handling metadata and proper IO paths when writing data on
receiver node (sink).
This implementation maintains the statelessness of the current
design, and the delegated sink side will re-read and re-write the
metadata for each component processed. This is a little wasteful,
but the meta is small, and it is less error prone than trying to do
caching cross-shards etc. The transport is isolated from the
knowledge.
This is an alternative/complement to #4436 and #4472, fixing the
underlying issue. Note that while the layers/API:s here allows easy
fixing of other fundamental problems in the feature (such as
destination location etc), these are not included in the PR, to keep
it as close to the current behaviour as possible.
Closesscylladb/scylla-enterprise#4646
* github.com:scylladb/scylla-enterprise:
raft_tests: Copy/add a topology test with encryption
file streaming: Use sstable source/sink to transfer snapshots
sstables: Add source and sink objects + producers for transfering a snapshot
sstable::types: Add remove accessor for extension info in metadata
*) The change for error injection in merge commit 966ea5955dd8760:
File streaming now has "stream_mutation_fragments" error injection points
so test_table_dropped_during_streaming works with file streaming.
*) doc: document file-based streaming
This commit adds a description of the file-based streaming feature to the documentation.
It will be displayed in the docs using the scylladb_include_flag directive after
https://github.com/scylladb/scylladb/pull/20182 is merged, backported to branch-6.0,
and, in turn, branch-2024.2.
Refs https://github.com/scylladb/scylla-enterprise/issues/4585
Refs https://github.com/scylladb/scylla-enterprise/issues/4254Closesscylladb/scylla-enterprise#4587
*) doc: move File-based streaming to the Tablets source file-based-streaming
This commit moves the description of file-based streaming from a common include file
to the regular doc source file where tablets are described.
Closesscylladb/scylla-enterprise#4652
*) streaming: sstable_stream_sink_impl: abort: prevent null pointer dereference
Closesscylladb/scylladb#22034
The way that the "test/cqlpy/run --release" feature runs older Scylla
releases is that it takes *today*'s command line parameters and "fixes"
it to conform to what old releases took. This approach was easy to
implement (and the resulting "--release" feature is super useful), but
the downside is that we need to update this fixup code whenever we add
new options to the Scylla command line used by test/cqlpy/run.py.
Commit d04f376 made test/cqlpy/run.py use a new option
"--experimental-features=views-with-tablets", so now we need to remove
it when running older versions of Scylla. So this is what we do in this
patch.
Fixes#22349
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#22350
The small cqlpy test in this patch is a regression test for issue #14390,
which claimed that the Scylla-only "tombstone_gc" option is missing from
the output of "describe table".
This test shows that this report is *not* true, at least not when the
"server-side describe" is used. "test/cqlpy/run --release ..." shows
that this test passes on master and also for Scylla versions all the
way back to Scylla 5.2 (Scylla 5.1 did not support server-side
describe, so the test fails for that reason).
This suggests that the report in issue #14390 was for old-style
client-side (cqlsh) describe, which we no longer support, so this
issue can be closed.
Fixes#14390.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#22354
As discussed in
https://github.com/scylladb/scylladb/issues/12263#issuecomment-1853576813,
compact storage tables are deprecated.
Yet, there's is nothing in the code that prevents users
from creating such tables.
This patch adds a live-updateable config option:
`enable_create_table_with_compact_storage` that require users
to opt-in in order to create new tables WITH COMPACT STORAGE.
The option is currently set to `true` by default in db/config
to reduce the churn to tests and to `false` in scylla.yaml,
for new clusters.
TODO: once regressions tests that use compact storage
are converted to enable the option, change the default in
db/config to false.
A unit test was added to test/cql-pytest that
checks that the respective cql query fails as expected
with the default option or when it is explicitly set to `false`,
and that the query succeeds when the option is set to `true`.
Note that `check_restricted_table_properties` already
returns an optional warning, but it is only logged
but not returned in the `prepared_statement`.
Fixing that is out of the scope of this patch.
See https://github.com/scylladb/scylladb/issues/20945
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Replace remaining uses of boost::adaptors::transformed with std::views::transform
to reduce Boost dependencies, following the migration pattern established in
bab12e3a. This change addresses recently merged code that reintroduced Boost
header dependencies through boost::adaptors::transformed usage.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22365
In this change, tablet_virtual_task starts supporting tablet
resize (i.e. split and merge).
Users can see running resize tasks - finished tasks are not
presented with the task manager API.
A new task state "suspended" is added. If a resize was revoked,
it will appear to users as suspended. We assume that the resize was revoked
when the tablet number didn't change.
Fixes: #21366.
Fixes: #21367.
No backport, new feature
Closesscylladb/scylladb#21891
* github.com:scylladb/scylladb:
test: boost: check resize_task_info in tablet_test.cc
test: add tests to check revoked resize virtual tasks
test: add tests to check the list of resize virtual tasks
test: add tests to check spilt and merge virtual tasks status
test: test_tablet_tasks: generalize functions
replica: service: add split virtual task's children
replica: service: pass parent info down to storage_group::split
tasks: children of virtual tasks aren't internal by default
tasks: initialize shard in task_info ctor
service: extend tablet_virtual_task::abort
service: retrun status_helper struct from tablet_virtual_task::get_status_helper
service: extend tablet_virtual_task::wait
tasks: add suspended task state
service: extend tablet_virtual_task::get_status
service: extend tablet_virtual_task::contains
service: extend tablet_virtual_task::get_stats
service: add service::task_manager_module::get_nodes
tasks: add task_manager::get_nodes
tasks: drop noexcept from module::get_nodes
replica: service: add resize_task_info static column to system.tablets
locator: extend tablet_task_info to cover resize tasks
Introduces a comprehensive audit system to track database operations for security
and compliance purposes. This change includes:
Core Components:
- New audit subsystem for logging database operations
- Service level integration for proper resource management
- CQL statement tracking with operation categories
- Login process integration for tenant management
Key Features:
- Configurable audit logging (syslog/table)
- Operation categorization (QUERY/DML/DDL/DCL/AUTH/ADMIN)
- Selective auditing by keyspace/table
- Password sanitization in audit logs
- Service level shares support (1-1000) for workload prioritization
- Proper lifecycle management and cleanup
I ran the dtests for audit (manually enabled) and they pass.
The in-repo tests pass.
Notably, there should be no non-whitespace changes between this and scylla-enterprise
Fixesscylladb/scylla-enterprise#4999Closesscylladb/scylladb#22147
* github.com:scylladb/scylladb:
audit: Add shares support to service level management
audit: Add service level support to CQL login process
audit: Add support to CQL statements
audit: Integrate audit subsystem into Scylla main process
audit: Add documentation for the audit subsystem
audit: Add the audit subsystem
Now that all topology related code uses host ids there is not point to
maintain ip to id (and back) mappings in the token metadata. After the
patch the mapping will be maintained in the gossiper only. The rest of
the system will use host ids and in rare cases where translation is
needed (mostly for UX compatibility reasons) the translation will be
done using gossiper.
Fixes: scylladb/scylla#21777
* 'gleb/drop-ip-from-tm-v3' of github.com:scylladb/scylla-dev: (57 commits)
hint manager: do not translate ip to id in case hint manager is stopped already
locator: token_metadata: drop update_host_id() function that does nothing now
locator: topology: drop indexing by ips
repair: drop unneeded code
storage_service: use host_id to look for a node in on_alive handler
storage_proxy: translate ips to ids in forward array using gossiper
locator: topology: remove unused functions
storage_service: check for outdated ip in on_change notification in the peers table
storage_proxy: translate id to ip using address map in tablets's describe_ring code instead of taking one from the topology
topology coordinator: change connection dropping code to work on host ids
cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query
locator: drop unused function from tablet_effective_replication_map
api: view_build_statuses: do not use IP from the topology, but translate id to ip using address map instead
locator: token_metadata: remove unused ip based functions
locator: network_topology_strategy: use host_id based function to check number of endpoints in dcs
gossiper: drop get_unreachable_token_owners functions
storage_service: use gossiper to map ip to id in node_ops operations
storage_service: fix indentation after the last patch
storage_service: drop loops from node ops replace_prepare handling since there can be only one replacing node
token_metadata: drop no longer used functions
...
The sstable loader relied on the generation id to provide an efficient
hint about the shard that owns an sstable. But, this hint was rendered
ineffective with the introduction of UUID generation, as the shard id
was no longer embedded in the generation id. This also became suboptimal
with the introduction of tablets. Commit 0c77f77 addressed this issue by
reading the minimum from disk to determine sstable ownership but this
improvement was lost with commit 63f1969, which optimistically assumed
that hints would work most of the time, which isn't true.
This commit restores that change - shard id of a table is deduced by
reading minially from disk and then the sstable is fully loaded only if
it belongs to the local shard. This patch also adds a testcase to verify
that the sstable are loaded only in their respective shards.
Fixes#21015
This fixes a regression and should be backported.
Closesscylladb/scylladb#22263
* github.com:scylladb/scylladb:
sstable_directory: do not load remote sstables in process_descriptor
sstable_directory: update `load_sstable()` definition
sstable_directory: reintroduce `get_shards_for_this_sstable()`
The repair_time in system.tablets will be updated when repair runs
successfully. We can now use it to update the repair time for tombstone
gc, i.e, when the system.tablets.repair_time is propagated, call
gc_state.update_repair_time() on the node that is the owner of the
tablet.
Since b3b3e880d3 ("repair: Reduce hints and batchlog flush"), the
repair time that could be used for tombstone gc might be smaller than
when the repair is started, so the actual repair time for tombstone gc
is returned by the repair rpc call from the repair master node.
Fixes#17507
The cache of the hints and batchlog flush makes the exact repair time
check difficult in the test. Disabling it for two repair tests
that check the exact repair time.
For several years now, we have seen a strange, and very rare, flakiness
in Alternator tests described in issue #17564: We see all the test pass,
pytest declares them to have passed, and while Python is existing, it
crashes with a signal 11 (SIGSEGV). Because this happens exclusively in
test/alternator and never in the test/cqlpy, we suspect that something
that the test/alternator leaves behind but test/cqlpy does not, causes
some race and crashes during shutdown.
The immediate suspect is the boto3 library, or rather, the urllib3 library
which it uses. This is more-or-less the only thing that test/alternator
does which test/cqlpy doesn't. The urllib3 library keeps around pools of
reusable connections, and it's possible (although I don't actually have any
proof for it) that these open connections may cause a crash during shutdown.
So in this patch I add to the "dynamodb" and "dynamodbstreams" fixtures
(which all Alternator tests use to connect to the server), a teardown which
calls close() for the boto3 client object. This close() call percolates
down to calling clear() on urllib3's PoolManager. Hopefully, this will
make some difference in the chance to crash during shutdown - and if it
doesn't, it won't hurt.
Refs #17564Closesscylladb/scylladb#22341
We want to drop ips from token_metadata so move to use host id based
counterparts. Messaging service gets a function that maps from ips to id
when is starts listening.
wasm32-wasi has been removed in Rust 1.84 (Jan 5th, 2025). if one
compiles the tree with Rust 1.84 or up, following build failure is
expected:
```
[2/305] Building WASM /home/kefu/dev/scylladb/build/wasm/return_input.wasm
FAILED: wasm/return_input.wasm /home/kefu/dev/scylladb/build/wasm/return_input.wasm
cd /home/kefu/dev/scylladb/test/resource/wasm/rust && /usr/bin/cargo build --target=wasm32-wasi --example=return_input --locked --manifest-path=Cargo.toml --target-dir=/home/kefu/dev/scylladb/build/test/resource/wasm/rust && wasm-opt /home/kefu/dev/scylladb/build/test/resource/wasm/rust/wasm32-wasi//debug/examples/return_input.wasm -Oz -o /home/kefu/dev/scylladb/build/wasm/return_input.wasm && wasm-strip /home/kefu/dev/scylladb/build/wasm/return_input.wasm
error: failed to run `rustc` to learn about target-specific information
Caused by:
process didn't exit successfully: `rustc - --crate-name ___ --print=file-names --target wasm32-wasi --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=split-debuginfo --print=crate-name --print=cfg` (exit status: 1)
--- stderr
error: Error loading target specification: Could not find specification for target "wasm32-wasi". Run `rustc --print target-list` for a list of built-in targets
```
in order to workaround this issue, let's check for supported target,
and use wasm32-wasip1 if wasm32-wasi is not listed as the supported
target.
Refs #20878
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22320
When a node is bootstrapped and joined a cluster as a non-voter and changes it's role to a voter, errors can occur while committing a new Raft record, for instance, if the Raft leader changes during this time. These errors are not critical and should not cause a node crash, as the action can be retried.
Fixesscylladb/scylladb#20814
Backport: This issue occurs frequently and disrupts the CI workflow to some extent. Backports are needed for versions 6.1 and 6.2.
Closesscylladb/scylladb#22253
* github.com:scylladb/scylladb:
raft: refactor `remove_from_raft_config` to use a timed `modify_config` call.
raft: Refactor functions using `modify_config` to use a common wrapper for retrying.
raft: Handle non-critical config update errors in when changing status to voter.
test: Add test to check that a node does not fail on unknown commit status error when starting up.
raft: Add run_op_with_retry in raft_group0.
Introduces shares-based workload prioritization for service levels, allowing
fine-grained control over resource allocation between tenants. Key changes:
- Add shares option to service level configuration:
- Valid range: 1-1000 shares
- Default value: 1000 shares
- Enterprise-only feature gated by WORKLOAD_PRIORITIZATION feature flag
- Extend CQL interface:
- Add shares parameter to CREATE/ALTER SERVICE_LEVEL
- Add shares column to system_distributed.service_levels
- Add percentage calculation to LIST SERVICE_LEVELS
- Add shares to DESCRIBE EFFECTIVE SERVICE_LEVEL output
- Add validation:
- Enforce shares range (1-1000)
- Validate enterprise feature flag
- Handle unset/delete markers properly
- Update service level statements:
- Add shares validation to CREATE/ALTER operations
- Preserve shares through default value replacement
- Add proper decomposition for shares values in result sets
This change enables operators to control relative resource allocation between
tenants using proportional share scheduling, while maintaining backward
compatibility with existing service level configurations.
This change integrates service level functionality into the CQL authentication and connection handling:
- Add scheduling_group_name to client_data to track service level assignments
- Extend SASL challenge interface to expose authenticated username
- Modify connection processing to support tenant switching:
- Add switch_tenant() method to handle scheduling group changes
- Add process_until_tenant_switch() to handle request processing boundaries
- Implement no_tenant() default executor
- Add execute_under_tenant_type for scheduling group management
- Update connection lifecycle to properly handle service level changes:
- Initialize connections with default scheduling group
- Support dynamic scheduling group updates when service levels change
- Ensure proper cleanup of scheduling group assignments
The changes enable proper scheduling group assignment and management based on
authenticated users' service levels, while maintaining backward compatibility
for connections without service level assignments.
Adds core integration of the audit subsystem into Scylla's main process flow. Changes include:
- Import audit subsystem header
- Initialize audit system during server startup using configuration and token metadata
- Start audit system after API server initialization with query processor and memory manager
- Add proper shutdown sequence for audit system using RAII pattern
- Add error handling for audit system initialization failures
The audit system is now properly integrated into Scylla's lifecycle, ensuring:
- Correct initialization order relative to other subsystems
- Proper resource cleanup during shutdown
- Graceful error handling for initialization failures
This change introduces a new audit subsystem that allows tracking and logging of database operations for security and compliance purposes. Key features include:
- Configurable audit logging to either syslog or a dedicated system table (audit.audit_log)
- Selective auditing based on:
- Operation categories (QUERY, DML, DDL, DCL, AUTH, ADMIN)
- Specific keyspaces
- Specific tables
- New configuration options:
- audit: Controls audit destination (none/syslog/table)
- audit_categories: Comma-separated list of operation categories to audit
- audit_tables: Specific tables to audit
- audit_keyspaces: Specific keyspaces to audit
- audit_unix_socket_path: Path for syslog socket
- audit_syslog_write_buffer_size: Buffer size for syslog writes
The audit logs capture details including:
- Operation timestamp
- Node and client IP addresses
- Operation category and query
- Username
- Success/failure status
- Affected keyspace and table names
In this PR, we pair draining the view builder with its start.
To better understand what was done and why, let's first look at the
situation before this commit and the context of it:
(a) The following things happened in order:
1. The view builder would be constructed.
2. Right after that, a deferred lambda would be created to stop the
view builder during shutdown.
3. group0_service would be started.
4. A deferred lambda stopping group0_service would be created right
after that.
5. The view builder would be started.
(b) Because the view builder depends on group0_client, it couldn't be
started before starting group0_service. On the other hand, other
services depend on the view builder, e.g. the stream manager. That
makes changing the order of initialization a difficult problem,
so we want to avoid doing that unless we're sure it's the right
choice.
(c) Since the view builder uses group0_client, there was a possibility
of running into a segmentation fault issue in the following
scenario:
1. A call to `view_builder::mark_view_build_success()` is issued.
2. We stop group0_service.
3. `view_builder::mark_view_build_success()` calls
`announce_with_raft()`, which leads to a use-after-free because
group0_service has already been destroyed.
This very scenario took place in scylladb/scylladb#20772.
Initially, we decided to solve the issue by initializing
group0_service a bit earlier (scylladb/scylladb@7bad8378c7).
Unfortunately, it led to other issues described in scylladb/scylladb#21534,
so we revert that patch. These changes are the second attempt
to the problem where we want to solve it in a safer manner.
The solution we came up with is to pair the start of the view builder
with a deferred lambda that deinitializes it by calling
`view_builder::drain()`. No other component of the system should be
able to use the view builder anymore, so it's safe to do that.
Furthermore, that pairing makes the analysis of
initialization/deinitialization order much easier. We also solve the
aformentioned use-after-free issue because the view builder itself
will no longer attempt to use group0_client.
Note that we still pair a deferred lambda calling `view_builder::stop()`
with the construction of the view builder; that function will also call
`view_builder::drain()`. Another notable thing is `view_builder::drain()`
may be called earlier by `storage_service::do_drain()`. In other words,
these changes cover the situation when Scylla runs into a problem when
starting up.
Backport: The patch I'm reverting made it to 6.2, so we want to backport this one there too.
Fixesscylladb/scylladb#20772Fixesscylladb/scylladb#21534Closesscylladb/scylladb#21909
* github.com:scylladb/scylladb:
test/topology_custom: Add test for Scylla with disabled view building
main, view: Pair view builder drain with its start
Revert "main,cql_test_env: start group0_service before view_builder"
to voter.
When a node is bootstrapped and joins a cluster as a non-voter, errors can occur while committing
a new Raft record, for instance, if the Raft leader changes during this time. These errors are not
critical and should not cause a node crash, as the action can be retried.
Fixesscylladb/scylladb#20814
error when starting up.
Test that a node is starting successfully if while joining a cluster and becoming a voter, it
receives an unknown commit status error.
Test for scylladb/scylladb#20814
- use "foo not in bar" instead of "not foo in bar"
- test/pylib: use foo instead of `'{}'.format(foo)`
---
it's a cleanup, hence no need to backport.
Closesscylladb/scylladb#22066
* github.com:scylladb/scylladb:
test/pylib: use `foo` instead of `'{}'.format(foo)`
test/pylib: use "foo not in bar" instead of "not foo in bar"
As part of #18750, we added a CQL statement CREATE ROLE WITH SALTED HASH that prevented hashing a password when creating a role, effectively leading to inserting a hash given by the user directly into the database. In #21350, we noticed that Cassandra had implemented a CQL statement of similar semantics but different syntax. We decided to rename Scylla's statement to be compatible with Cassandra. Unfortunately, we didn't notice one more difference between what we had in Scylla and what was part of Cassandra.
Scylla's statement was originally supposed to only be used when restoring the schema and the user needn't have to be aware of its existence at all: the database produced a sequence of CQL statements that the user saved to a file and when a need to restore the schema arose, they would execute the contents of the file. That's why that although we documented the feature, it was only done in the necessary places. Those that weren't related to the backup & restore procedure were deliberately skipped.
Cassandra, on the other hand, added the statement for a different purpose (for details, see the relevant issue) and it was supposed to be used by the user by design. The statement is also documented as such.
Since we want to preserve compatibility with Cassandra, we document the statement and its semantics in the user documentation, explicitly implying that it can be used by the user.
We also add a test verifying that logging in works correctly.
Fixesscylladb/scylladb#21691
Backport: not needed. The relevant code didn't make it to 6.2 or any previous version of OSS.
Closesscylladb/scylladb#21752
* github.com:scylladb/scylladb:
docs: Update documentation on CREATE ROLE WITH HASHED PASSWORD
test/boost: Add test for creating roles with hashed passwords
these unused includes were identifier by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.
please note, because quite a few source files relied on
`utils/to_string.hh` to pull in the specialization of
`fmt::formatter<std::optional<T>>`, after removing
`#include <fmt/std.h>` from `utils/to_string.hh`, we have to
include `fmt/std.h` directly.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
memtable_flush_period test sets the flush period to 200ms and checks
whether the data is flushed after 500ms.
When flush period is set, the timer is armed with the given value.
On expiration, memtables are flushed and then the timer is rearmed.
There is no certainty that during 500ms the flush finishes, though.
Check if after 500ms flush has started. Wait until there is an sstable.
Fixes: #21965.
Closesscylladb/scylladb#22162
In Scylla there are two options that control IO bandwidth limit -- the /storage_service/(compaction|stream)_throughput REST API endpoints. The endpoints are partially implemented and have no counterparts in the nodetool.
This set implements the missing bits and adds tests for new functionality.
Closesscylladb/scylladb#21877
* github.com:scylladb/scylladb:
nodetool: Implement [gs]etstreamthroughput commands
nodetool: Implement [gs]etcompationthroughput commands
test: Add validation of how IO-updating endpoints work
api: Implement /storage_service/(stream|compaction)_throughput endpoints
api: Disqualify const config reference
api: Implement /storage_service/stream_throughput endpoint
api: Move stream throughput set/get endpoints from storage service block
api: Move set_compaction_throughput_mb_per_sec to config block
util: Include fmt/ranges.h in config_file.hh