Now that all topology related code uses host ids there is not point to
maintain ip to id (and back) mappings in the token metadata. After the
patch the mapping will be maintained in the gossiper only. The rest of
the system will use host ids and in rare cases where translation is
needed (mostly for UX compatibility reasons) the translation will be
done using gossiper.
Fixes: scylladb/scylla#21777
* 'gleb/drop-ip-from-tm-v3' of github.com:scylladb/scylla-dev: (57 commits)
hint manager: do not translate ip to id in case hint manager is stopped already
locator: token_metadata: drop update_host_id() function that does nothing now
locator: topology: drop indexing by ips
repair: drop unneeded code
storage_service: use host_id to look for a node in on_alive handler
storage_proxy: translate ips to ids in forward array using gossiper
locator: topology: remove unused functions
storage_service: check for outdated ip in on_change notification in the peers table
storage_proxy: translate id to ip using address map in tablets's describe_ring code instead of taking one from the topology
topology coordinator: change connection dropping code to work on host ids
cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query
locator: drop unused function from tablet_effective_replication_map
api: view_build_statuses: do not use IP from the topology, but translate id to ip using address map instead
locator: token_metadata: remove unused ip based functions
locator: network_topology_strategy: use host_id based function to check number of endpoints in dcs
gossiper: drop get_unreachable_token_owners functions
storage_service: use gossiper to map ip to id in node_ops operations
storage_service: fix indentation after the last patch
storage_service: drop loops from node ops replace_prepare handling since there can be only one replacing node
token_metadata: drop no longer used functions
...
The sstable loader relied on the generation id to provide an efficient
hint about the shard that owns an sstable. But, this hint was rendered
ineffective with the introduction of UUID generation, as the shard id
was no longer embedded in the generation id. This also became suboptimal
with the introduction of tablets. Commit 0c77f77 addressed this issue by
reading the minimum from disk to determine sstable ownership but this
improvement was lost with commit 63f1969, which optimistically assumed
that hints would work most of the time, which isn't true.
This commit restores that change - shard id of a table is deduced by
reading minially from disk and then the sstable is fully loaded only if
it belongs to the local shard. This patch also adds a testcase to verify
that the sstable are loaded only in their respective shards.
Fixes#21015
This fixes a regression and should be backported.
Closesscylladb/scylladb#22263
* github.com:scylladb/scylladb:
sstable_directory: do not load remote sstables in process_descriptor
sstable_directory: update `load_sstable()` definition
sstable_directory: reintroduce `get_shards_for_this_sstable()`
For several years now, we have seen a strange, and very rare, flakiness
in Alternator tests described in issue #17564: We see all the test pass,
pytest declares them to have passed, and while Python is existing, it
crashes with a signal 11 (SIGSEGV). Because this happens exclusively in
test/alternator and never in the test/cqlpy, we suspect that something
that the test/alternator leaves behind but test/cqlpy does not, causes
some race and crashes during shutdown.
The immediate suspect is the boto3 library, or rather, the urllib3 library
which it uses. This is more-or-less the only thing that test/alternator
does which test/cqlpy doesn't. The urllib3 library keeps around pools of
reusable connections, and it's possible (although I don't actually have any
proof for it) that these open connections may cause a crash during shutdown.
So in this patch I add to the "dynamodb" and "dynamodbstreams" fixtures
(which all Alternator tests use to connect to the server), a teardown which
calls close() for the boto3 client object. This close() call percolates
down to calling clear() on urllib3's PoolManager. Hopefully, this will
make some difference in the chance to crash during shutdown - and if it
doesn't, it won't hurt.
Refs #17564Closesscylladb/scylladb#22341
We want to drop ips from token_metadata so move to use host id based
counterparts. Messaging service gets a function that maps from ips to id
when is starts listening.
wasm32-wasi has been removed in Rust 1.84 (Jan 5th, 2025). if one
compiles the tree with Rust 1.84 or up, following build failure is
expected:
```
[2/305] Building WASM /home/kefu/dev/scylladb/build/wasm/return_input.wasm
FAILED: wasm/return_input.wasm /home/kefu/dev/scylladb/build/wasm/return_input.wasm
cd /home/kefu/dev/scylladb/test/resource/wasm/rust && /usr/bin/cargo build --target=wasm32-wasi --example=return_input --locked --manifest-path=Cargo.toml --target-dir=/home/kefu/dev/scylladb/build/test/resource/wasm/rust && wasm-opt /home/kefu/dev/scylladb/build/test/resource/wasm/rust/wasm32-wasi//debug/examples/return_input.wasm -Oz -o /home/kefu/dev/scylladb/build/wasm/return_input.wasm && wasm-strip /home/kefu/dev/scylladb/build/wasm/return_input.wasm
error: failed to run `rustc` to learn about target-specific information
Caused by:
process didn't exit successfully: `rustc - --crate-name ___ --print=file-names --target wasm32-wasi --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=split-debuginfo --print=crate-name --print=cfg` (exit status: 1)
--- stderr
error: Error loading target specification: Could not find specification for target "wasm32-wasi". Run `rustc --print target-list` for a list of built-in targets
```
in order to workaround this issue, let's check for supported target,
and use wasm32-wasip1 if wasm32-wasi is not listed as the supported
target.
Refs #20878
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22320
When a node is bootstrapped and joined a cluster as a non-voter and changes it's role to a voter, errors can occur while committing a new Raft record, for instance, if the Raft leader changes during this time. These errors are not critical and should not cause a node crash, as the action can be retried.
Fixesscylladb/scylladb#20814
Backport: This issue occurs frequently and disrupts the CI workflow to some extent. Backports are needed for versions 6.1 and 6.2.
Closesscylladb/scylladb#22253
* github.com:scylladb/scylladb:
raft: refactor `remove_from_raft_config` to use a timed `modify_config` call.
raft: Refactor functions using `modify_config` to use a common wrapper for retrying.
raft: Handle non-critical config update errors in when changing status to voter.
test: Add test to check that a node does not fail on unknown commit status error when starting up.
raft: Add run_op_with_retry in raft_group0.
In this PR, we pair draining the view builder with its start.
To better understand what was done and why, let's first look at the
situation before this commit and the context of it:
(a) The following things happened in order:
1. The view builder would be constructed.
2. Right after that, a deferred lambda would be created to stop the
view builder during shutdown.
3. group0_service would be started.
4. A deferred lambda stopping group0_service would be created right
after that.
5. The view builder would be started.
(b) Because the view builder depends on group0_client, it couldn't be
started before starting group0_service. On the other hand, other
services depend on the view builder, e.g. the stream manager. That
makes changing the order of initialization a difficult problem,
so we want to avoid doing that unless we're sure it's the right
choice.
(c) Since the view builder uses group0_client, there was a possibility
of running into a segmentation fault issue in the following
scenario:
1. A call to `view_builder::mark_view_build_success()` is issued.
2. We stop group0_service.
3. `view_builder::mark_view_build_success()` calls
`announce_with_raft()`, which leads to a use-after-free because
group0_service has already been destroyed.
This very scenario took place in scylladb/scylladb#20772.
Initially, we decided to solve the issue by initializing
group0_service a bit earlier (scylladb/scylladb@7bad8378c7).
Unfortunately, it led to other issues described in scylladb/scylladb#21534,
so we revert that patch. These changes are the second attempt
to the problem where we want to solve it in a safer manner.
The solution we came up with is to pair the start of the view builder
with a deferred lambda that deinitializes it by calling
`view_builder::drain()`. No other component of the system should be
able to use the view builder anymore, so it's safe to do that.
Furthermore, that pairing makes the analysis of
initialization/deinitialization order much easier. We also solve the
aformentioned use-after-free issue because the view builder itself
will no longer attempt to use group0_client.
Note that we still pair a deferred lambda calling `view_builder::stop()`
with the construction of the view builder; that function will also call
`view_builder::drain()`. Another notable thing is `view_builder::drain()`
may be called earlier by `storage_service::do_drain()`. In other words,
these changes cover the situation when Scylla runs into a problem when
starting up.
Backport: The patch I'm reverting made it to 6.2, so we want to backport this one there too.
Fixesscylladb/scylladb#20772Fixesscylladb/scylladb#21534Closesscylladb/scylladb#21909
* github.com:scylladb/scylladb:
test/topology_custom: Add test for Scylla with disabled view building
main, view: Pair view builder drain with its start
Revert "main,cql_test_env: start group0_service before view_builder"
to voter.
When a node is bootstrapped and joins a cluster as a non-voter, errors can occur while committing
a new Raft record, for instance, if the Raft leader changes during this time. These errors are not
critical and should not cause a node crash, as the action can be retried.
Fixesscylladb/scylladb#20814
error when starting up.
Test that a node is starting successfully if while joining a cluster and becoming a voter, it
receives an unknown commit status error.
Test for scylladb/scylladb#20814
- use "foo not in bar" instead of "not foo in bar"
- test/pylib: use foo instead of `'{}'.format(foo)`
---
it's a cleanup, hence no need to backport.
Closesscylladb/scylladb#22066
* github.com:scylladb/scylladb:
test/pylib: use `foo` instead of `'{}'.format(foo)`
test/pylib: use "foo not in bar" instead of "not foo in bar"
As part of #18750, we added a CQL statement CREATE ROLE WITH SALTED HASH that prevented hashing a password when creating a role, effectively leading to inserting a hash given by the user directly into the database. In #21350, we noticed that Cassandra had implemented a CQL statement of similar semantics but different syntax. We decided to rename Scylla's statement to be compatible with Cassandra. Unfortunately, we didn't notice one more difference between what we had in Scylla and what was part of Cassandra.
Scylla's statement was originally supposed to only be used when restoring the schema and the user needn't have to be aware of its existence at all: the database produced a sequence of CQL statements that the user saved to a file and when a need to restore the schema arose, they would execute the contents of the file. That's why that although we documented the feature, it was only done in the necessary places. Those that weren't related to the backup & restore procedure were deliberately skipped.
Cassandra, on the other hand, added the statement for a different purpose (for details, see the relevant issue) and it was supposed to be used by the user by design. The statement is also documented as such.
Since we want to preserve compatibility with Cassandra, we document the statement and its semantics in the user documentation, explicitly implying that it can be used by the user.
We also add a test verifying that logging in works correctly.
Fixesscylladb/scylladb#21691
Backport: not needed. The relevant code didn't make it to 6.2 or any previous version of OSS.
Closesscylladb/scylladb#21752
* github.com:scylladb/scylladb:
docs: Update documentation on CREATE ROLE WITH HASHED PASSWORD
test/boost: Add test for creating roles with hashed passwords
these unused includes were identifier by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.
please note, because quite a few source files relied on
`utils/to_string.hh` to pull in the specialization of
`fmt::formatter<std::optional<T>>`, after removing
`#include <fmt/std.h>` from `utils/to_string.hh`, we have to
include `fmt/std.h` directly.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
memtable_flush_period test sets the flush period to 200ms and checks
whether the data is flushed after 500ms.
When flush period is set, the timer is armed with the given value.
On expiration, memtables are flushed and then the timer is rearmed.
There is no certainty that during 500ms the flush finishes, though.
Check if after 500ms flush has started. Wait until there is an sstable.
Fixes: #21965.
Closesscylladb/scylladb#22162
In Scylla there are two options that control IO bandwidth limit -- the /storage_service/(compaction|stream)_throughput REST API endpoints. The endpoints are partially implemented and have no counterparts in the nodetool.
This set implements the missing bits and adds tests for new functionality.
Closesscylladb/scylladb#21877
* github.com:scylladb/scylladb:
nodetool: Implement [gs]etstreamthroughput commands
nodetool: Implement [gs]etcompationthroughput commands
test: Add validation of how IO-updating endpoints work
api: Implement /storage_service/(stream|compaction)_throughput endpoints
api: Disqualify const config reference
api: Implement /storage_service/stream_throughput endpoint
api: Move stream throughput set/get endpoints from storage service block
api: Move set_compaction_throughput_mb_per_sec to config block
util: Include fmt/ranges.h in config_file.hh
The sstable loader relied on the generation id to provide an efficient
hint about the shard that owns an sstable. But, this hint was rendered
ineffective with the introduction of UUID generation, as the shard id
was no longer embedded in the generation id. This also became suboptimal
with the introduction of tablets. Commit 0c77f77 addressed this issue by
reading the minimum from disk to determine sstable ownership but this
improvement was lost with commit 63f1969, which optimistically assumed
that hints would work most of the time, which isn't true.
This commit restores that change - shard id of a table is deduced by
reading minially from disk and then the sstable is fully loaded only if
it belongs to the local shard. This patch also adds a testcase to verify
that the sstable are loaded only in their respective shards.
Fixes#21015
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
This series adds WCU support for the Alternator update item.
This motivation behind it, is to have a rough estimation of what a similar operation would have taken from WCU perspective if used with DynamoDB.
The calculation is done while minimal overhead is the prime objective, the results are values that is less or equal to what it would have been in DynamoDB
** New feature, no need to backport. **
Closesscylladb/scylladb#21999
* github.com:scylladb/scylladb:
alternator/test_returnconsumedcapacity.py: update item
alternator/executor.cc: Add WCU for update_item
This update addresses an issue in the mutation diff calculation
algorithm used during read repair. Previously, the algorithm used
`token` as the hashmap key. Since `token` is calculated basing on the
Murmur3 hash function, it could generate duplicate values for different
partition keys, causing corruption in the affected rows' values.
Fixesscylladb/scylladb#19101
Since the issue affects all the relevant scylla versions, backport to: 6.1, 6.2
Closesscylladb/scylladb#21996
* github.com:scylladb/scylladb:
storage_proxy/read_repair: Remove redundant 'schema' parameter from `data_read_resolver::resolve` function.
storage_proxy/read_repair: Use `partition_key` instead of `token` key for mutation diff calculation hashmap.
test: Add test case for checking read repair diff calculation when having conflicting keys.
Since we manage ip to id mapping directly in gossiper now we need to
load the mapping on boot. We already do it anyway, but only due to a bug
which checks raft topology mode config before it is set, so the code
thinks that it is in the gossiper mode and loads peers table into the
gossiper and token metadata. Fix the bug and load peers into the gossiper
only since token metadata is managed by raft.
The series also removes address map related test that no longer checks
anything and replace it with unit test.
It also adds the dc/rack check to "join node" rpc. The check is done
during shadow round now, but for it to work it requires dc/rack to be
propagated through the gossiper and we want to eventually drop it.
Ref: scylladb/scylladb#21777
* 'load-peers' of https://github.com/gleb-cloudius/scylla:
topology coordinator: reject replace request if topology does not match
gossiper: fix the logic of shadow_round parameter
storage_service: do not add endpoint to the gossiper during topology loading.
storage_service: load peers into gossiper on boot in raft topology mode
storage_service: set raft topology change mode before using it in join_cluster
locator: drop inet_address usage to figure out per dc/rack replication
test: drop test_old_ip_notification_repro.py
test: address_map: check generation handling during entry addition
Developers using asyncio.gather() often assume that it waits for all futures (awaitables) givens.
But this isn't true when the return_exceptions parameter is False, which is the default.
In that case, as soon as one future completes with an exception, the gather() call will return this exception immediately, and some of the finished tasks may continue to run in the background.
This is bad for applications that use gather() to ensure that a list of background tasks has all completed.
So such applications must use asyncio.gather() with return_exceptions=True, to wait for all given futures to complete either successfully or unsuccessfully.
Closesscylladb/scylladb#22252
When destroying a test cluster, ScyllaCluster.stop() calls ScyllaServer.stop()
for each running server. Previously, non-zero exit status codes from scylla
servers were silently ignored during test teardown.
This change modifies the logging behavior to print the exit status code when
a scylla server exits with a non-zero status. This helps developers quickly
identify potential issues or unexpected terminations during test runs.
Differences in handling:
- Before: Non-zero exit codes were not logged
- After: Non-zero exit codes are printed, providing visibility into
server termination errors
This improvement aids in diagnosing intermittent test failures or
unexpected server shutdowns during test execution.
Refs #21742
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21934
We restore a snapshot of table by streaming the sstables of
the given snapshot of the table using
`sstable_streamer::stream_sstable_mutations()` in batches. This function
reads mutations from a set of sstables, and streams them to the target
nodes. Due to the limit of this function, we are not able to track the
progress in bytes.
Previously, progress tracking used individual sstables as units, which caused
inaccuracies with tablet-distributed tables, where:
- An sstable spanning multiple tablets could be counted multiple times
- Progress reporting could become misleading (e.g., showing "40" progress
for a table with 10 sstables)
This change introduces a more robust progress tracking method:
- Use "batch" as the unit of progress instead of individual sstables.
Each batch represents a tablet when restoring a table snapshot if
the tablet being restored is distributed with tablets. When it comes
to tables distributed with vnode, each batch represents an sstable.
- Stream sstables for each tablet separately, handling both partially and
fully contained sstables
- Calculate progress based on the total number of sstables being streamed
- Skip tablet IDs with no owned tokens
For vnode-distributed tables, the number of "batches" directly corresponds
to the number of sstables, ensuring:
- Consistent progress reporting across different table distribution models
- Simplified implementation
- Accurate representation of restore progress
The new approach provides a more reliable and uniform method of tracking
restoration progress across different table distribution strategies.
Also, Corrected the use of `_sstables.size()` in
`sstable_streamer::stream_sstables()`. It addressed a review comment
from Pavel that was inadvertently overlooked during previous rebasing
the commit of 5ab4932f34.
Fixesscylladb/scylladb#21816
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21841
Before this commit, there doesn't seem to have been a test verifying that
starting and shutting down Scylla behave correctly when the configuration
option `view_building` is set to false. In these changes, we add one.
In these changes, we pair draining the view builder with its start.
To better understand what was done and why, let's first look at the
situation before this commit and the context of it:
(a) The following things happened in order:
1. The view builder would be constructed.
2. Right after that, a deferred lambda would be created to stop the
view builder during shutdown.
3. group0_service would be started.
4. A deferred lambda stopping group0_service would be created right
after that.
5. The view builder would be started.
(b) Because the view builder depends on group0_client, it couldn't be
started before starting group0_service. On the other hand, other
services depend on the view builder, e.g. the stream manager. That
makes changing the order of initialization a difficult problem,
so we want to avoid doing that unless we're sure it's the right
choice.
(c) Since the view builder uses group0_client, there was a possibility
of running into a segmentation fault issue in the following
scenario:
1. A call to `view_builder::mark_view_build_success()` is issued.
2. We stop group0_service.
3. `view_builder::mark_view_build_success()` calls
`announce_with_raft()`, which leads to a use-after-free because
group0_service has already been destroyed.
This very scenario took place in scylladb/scylladb#20772.
Initially, we decided to solve the issue by initializing
group0_service a bit earlier (scylladb/scylladb@7bad8378c7).
Unfortunately, it led to other issues described in scylladb/scylladb#21534.
We reverted that change in the previous commit. These changes are the
second attempt to the problem where we want to solve it in a safer manner.
The solution we came up with is to pair the start of the view builder
with a deferred lambda that deinitializes it by calling
`view_builder::drain()`. No other component of the system should be
able to use the view builder anymore, so it's safe to do that.
Furthermore, that pairing makes the analysis of
initialization/deinitialization order much easier. We also solve the
aformentioned use-after-free issue because the view builder itself
will no longer attempt to use group0_client.
Note that we still pair a deferred lambda calling `view_builder::stop()`
with the construction of the view builder; that function will also call
`view_builder::drain()`. Another notable thing is `view_builder::drain()`
may be called earlier by `storage_service::do_drain()`. In other words,
these changes cover the situation when Scylla runs into a problem when
starting up.
Fixesscylladb/scylladb#20772
The patch solved a problem related to an initialization order
(scylladb/scylladb#20772), but we ran into another one: scylladb/scylladb#21534.
After moving the initialization of group0_service, it ended up being destroyed
AFTER the CDC generation service would. Since CDC generations are accessed
in `storage_service::topology_state_load()`:
```
for (const auto& gen_id : _topology_state_machine._topology.committed_cdc_generations) {
rtlogger.trace("topology_state_load: process committed cdc generation {}", gen_id);
co_await _cdc_gens.local().handle_cdc_generation(gen_id);
```
we started getting the following failure:
```
Service &seastar::sharded<cdc::generation_service>::local() [Service = cdc::generation_service]: Assertion `local_is_initialized()' failed.
```
We're reverting the patch to go back to a more stable version of Scylla
and in the following commit, we'll solve the original issue in a more
systematic way.
This reverts commit 7bad8378c7.
Fixes https://github.com/scylladb/scylla-enterprise/issues/5016#issuecomment-2558464631
EAR - encryption at rest. Allows on-disk file encryption of sstables and commitlog data.
Introduces OpenSSL based file level encrypted storage, managed via a set of providers
ranging from local files to cloud KMS providers.
For a more comprehensive explanation, see the included docs (or if possible, original
source tree).
Manual bulk merge of EAR feature from enterprise repo to main scylla repo.
Breaks some features apart, but main EAR is still a humongous commit, because to separate this
I would have to mess with code incrementally, adding time and risk.
This PR includes the local file gen tool, tests and also p11 validation.
Note: CI will not execute the full tests unless master CI is set to provide the same environment
as the enterprise one. Not sure about the status of this ATM.
Note: Includes code to compile against cryptsoft kmipc SDK, but not the SDK. If you happen to
check out this tree in the scylla folder and configure, it will be linked against and KMIP functionality
will be enabled, otherwise not.
Closesscylladb/scylladb#22233
* github.com:scylladb/scylladb:
docs: Add EAR docs
main/build: Add p11-kit and initialize
tools: Add local-file-key-generator tool
tests: Add EAR tests
tmpdir: shorten test tempdir path
EAR: port the ear feature from enterprise
cql_test_env: Add optional query timeout
schema/migration_manager: Add schema validate
sstables: add get_shared_components accessor
config/config_file: Add exports and definitions of config_type_for<>
Instantiated only on shard 0.
Currently, only subscribe from unit test
Manual unit test using loop mount was added.
Note that the test requires sudo access
and root access to /dev/loop, so it cannot
run in rootless podman instance, and it'd
fail with Permission denied.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#21523
This PR extends authentication with 2 mechanisms:
- a new role_manager subclass, which allows managing users via
LDAP server,
- a new authenticator, which delegates plaintext authentication
to a running saslauthd daemon.
The features have been ported from the enterprise repository
with their test.py tests and the documentation as part of
changing license to source available.
Fixes: scylladb/scylla-enterprise#5000Fixes: scylladb/scylla-enterprise#5001Closesscylladb/scylladb#22030
Add the test file name to `ScyllaClusterManager` log file names alongside the test function name.
This avoids race conditions when tests with the same function names are executed simultaneously.
Fixesscylladb/scylladb#21807
Backport: not needed since this is a fix in the testing scripts.
Closesscylladb/scylladb#22192
these unused includes were identifier by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22199
* seastar 3133ecdd...a9bef537 (24):
> file: add file_system_space
> future: avoid inheriting from future payload type
> treewide: include fmt/ostream.h for using fmt::print()
> build: remove messages used for debugging
> demos: Rename websocket demo to websocket_server demo
> demos: Add a way to set port from cmd line in websocket demo
> tls: Add optional builder + future-wait to cert reload callback + expose rebuild
> rwlock: add try_hold_{read,write}_lock methods
> json: add moving push to json_list
> github: add a step to build "check-include-style"
> build: add a target for checking include style
> scheduling_group: use map for key configs instead of vector
> scheduling_group: fix indentation
> scheduling_group: fix race between scheduling group and key creation
> http: Make request writing functions public
> http: Expose connection_factory implementations
> metrics: Use separate type for shared metadata
> file: unexpected throw from inside noexcept
> metrics: Internalize metric label sets
> thread: optimize maybe_yield
> reactor: fix crash in pending registration task after poller dtor
> net: Fix ipv6 socket_address comparision
> reactor, linux-aio: factor out get_smp_count() lambda
> reactor, linux-aio: restore "available_aio" meaning after "reserve_iocbs"
Fixed usage of seastar metric label sets due to:
scylladb/seastar@733420d57 Merge 'metrics: Internalize metric label sets' from Stephan Dollberg
Closesscylladb/scylladb#22076
We still have a number of issues to be solved for views with tablets.
Until they are fixed, we should prevent users from creating them,
and use the vnode-based views instead.
This patch prepares the feature for enabling views with tablets. The
feature is disabled by default, but currently it has no effect.
After all tests are adjusted to use the feature, we should depend
on the feature for deciding whether we can create materialized views
in tablet-enabled keyspaces.
The unit tests are adjusted to enable this feature explicitly, and it's
also added to the scylla sstable tool config - this tool treats all
tables as if they were tablet-based (surprisingly, with SimpleStrategy),
so for it to work on views, the new feature must be enabled.
Refs scylladb/scylladb#21832Closesscylladb/scylladb#21833
In case of error, repair will be moved into the end_repair stage. We
should not remove repair_task_info in this case because the repair task
requested by the user is not finished yet.
To fix, we should remove repair_task_info at the end of repair stage.
Tests are added to ensure failed repair is not reported as finished.
Closesscylladb/scylladb#21973
Test the limited voters feature by creating a cluster with 3 DCs, one of
them disproportionately larger than the others. The raft majority should
not be lost in case the large DC goes down.
Fixes: scylladb/scylla#21915
Refs: scylladb/scylla#18793Closesscylladb/scylladb#21901
mark the config parameter --commitlog-use-hard-size-limit as deprecated so the
default 'true' is always used, making the hard limit mandatory.
Fixesscylladb/scylladb#16471Closesscylladb/scylladb#21804
node_ops_virtual_task does not filter the entries of system.topology_request
and so it creates statuses of operations that aren't node ops.
Filter the entries used by node_ops_virtual_task. With this change, the status
of a bootstrap of the first node will not be visible.
Fixes: https://github.com/scylladb/scylladb/issues/22008.
Needs backport to 6.2 that introduced node_ops_virtual_task
Closesscylladb/scylladb#22009
* github.com:scylladb/scylladb:
test: truncate the table before node ops task checks
node_ops: rename a method that get node ops entries
node_ops: filter topology_requests entries
now that we are allowed to use C++23. we now have the luxury of using
`std::views::reverse`.
- replace `boost::adaptors::transformed` with `std::views::transform`
- remove unused `#include <boost/range/adaptor/reversed.hpp>`
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Main problem:
If we're draining the last node in a DC, we won't have a chance to
evaluate candidates and notice that constraints cannot be satisfied (N
< RF). Draining will succeed and node will be removed with replicas
still present on that node. This will cause later draining in the same
DC to fail when we will have 2 replicas which need relocaiton for a
given tablet.
The expected behvior is for draining to fail, because we cannot keep
the RF in the DC. This is consistent, for example, with what happens
when removing a node in a 2-node cluster with RF=2.
Fixes#21826
Secondary problem:
We allowed tablet_draining transition to be exited with undrained nodes, leaving replicas on nodes in the "left" state.
Third problem:
We removed DOWN nodes from the candidate node set, even when draining. This is not safe because it may lead to overload. This also makes the "main problem" more likely by extending it to the scenario when the DC is DOWN.
The overload part in not a problem in practice currently, since migrations will block on global topology barrier if there are DOWN nodes.
Closesscylladb/scylladb#21928
* github.com:scylladb/scylladb:
tablets: load_balancer: Fail when draining with no candidate nodes
tablets: load_balancer: Ignore skip_list when draining
tablets: topology_coordinator: Keep tablet_draining transition if nodes are not drained
Replace usages of `boost::algorithm::join()` with `fmt::join()` to improve
performance and reduce dependency on Boost. `fmt::join()` allows direct
formatting of ranges and tuples with custom separators without creating
intermediate strings.
When formatting comma-separated values into another string, fmt::join()
avoids the overhead of temporary string creation that
`boost::algorithm::join()` requires. This change also helps streamline
our dependencies by leveraging the existing fmt library instead of
Boost.Algorithm.
To avoid the ambiguity, some caller sites were updated to call
`seastar::format()` explicitly.
See also
- boost::algorithm::join():
https://www.boost.org/doc/libs/1_87_0/doc/html/string_algo/reference.html#doxygen.join_8hpp
- fmt::join():
https://fmt.dev/11.0/api/#ranges-api
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22082
This patch adds tests for return consumed capacity for update_item.
The tests cover: a simple update for a small object, a missing item, an
update with a very large attribute (where the attribute itself is more
than 1KB), and an update of a big item that uses read-before-write.
This patch undoes multiple mistakes done when introducing the test
for service levels in pull request #22031:
1. The PR introduced in test/alternator/run and test/alternator/suite.yaml
a permanent role and service level that the service-level test is
supposed to use. This was a mistake - the test can create the service
level for its own use, using CQL, it does not need to assume such a
service level already exists.
It's important to fix this to allow the service level test to run
against an installation of Scylla not set up by our own scripts.
Moreover, while the code in suite.yaml was correct, the code in
"run" was incorrect (used an outdated keyspace name). This patch
removes that incorrect code.
2. The PR introduced a duplicate "cql" fixture, copied verbatim
from test_cql_rbac.py (including a comment that was correct only
in the latter file :-)). Let's de-duplicate it, using the fixture
that I moved to conftest.py in the previous patch.
3. The PR used temporary_grant(). This needelessly complicated the test
and added even more duplicate code, and this patch removes all that
stuff. This test is about service levels, not RBAC and "grant".
This test should just use a superuser role that has the permissions
to do everything, and don't need to be granted specific permissions.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>