Commit Graph

8114 Commits

Author SHA1 Message Date
Kamil Braun
89ee2a6834 Merge 'drop ip addresses from token metadata' from Gleb
Now that all topology related code uses host ids there is not point to
maintain ip to id (and back) mappings in the token metadata. After the
patch the mapping will be maintained in the gossiper only. The rest of
the system will use host ids and in rare cases where translation is
needed (mostly for UX compatibility reasons) the translation will be
done using gossiper.

Fixes: scylladb/scylla#21777

* 'gleb/drop-ip-from-tm-v3' of github.com:scylladb/scylla-dev: (57 commits)
  hint manager: do not translate ip to id in case hint manager is stopped already
  locator: token_metadata: drop update_host_id() function that does nothing now
  locator: topology: drop indexing by ips
  repair: drop unneeded code
  storage_service: use host_id to look for a node in on_alive handler
  storage_proxy: translate ips to ids in forward array using gossiper
  locator: topology: remove unused functions
  storage_service: check for outdated ip in on_change notification in the peers table
  storage_proxy: translate id to ip using address map in tablets's describe_ring code instead of taking one from the topology
  topology coordinator: change connection dropping code to work on host ids
  cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query
  locator: drop unused function from tablet_effective_replication_map
  api: view_build_statuses: do not use IP from the topology, but translate id to ip using address map instead
  locator: token_metadata: remove unused ip based functions
  locator: network_topology_strategy: use host_id based function to check number of endpoints in dcs
  gossiper: drop get_unreachable_token_owners functions
  storage_service: use gossiper to map ip to id in node_ops operations
  storage_service: fix indentation after the last patch
  storage_service: drop loops from node ops replace_prepare handling since there can be only one replacing node
  token_metadata: drop no longer used functions
  ...
2025-01-17 11:00:52 +01:00
Pavel Emelyanov
14c3fbbf8c Merge 'sstable_directory: do not load remote unshared sstables in process_descriptor()' from Lakshmi Narayanan Sreethar
The sstable loader relied on the generation id to provide an efficient
hint about the shard that owns an sstable. But, this hint was rendered
ineffective with the introduction of UUID generation, as the shard id
was no longer embedded in the generation id. This also became suboptimal
with the introduction of tablets. Commit 0c77f77 addressed this issue by
reading the minimum from disk to determine sstable ownership but this
improvement was lost with commit 63f1969, which optimistically assumed
that hints would work most of the time, which isn't true.

This commit restores that change - shard id of a table is deduced by
reading minially from disk and then the sstable is fully loaded only if
it belongs to the local shard. This patch also adds a testcase to verify
that the sstable are loaded only in their respective shards.

Fixes #21015

This fixes a regression and should be backported.

Closes scylladb/scylladb#22263

* github.com:scylladb/scylladb:
  sstable_directory: do not load remote sstables in process_descriptor
  sstable_directory: update `load_sstable()` definition
  sstable_directory: reintroduce `get_shards_for_this_sstable()`
2025-01-17 11:17:54 +03:00
Nadav Har'El
955ac1b7b7 test/alternator: close boto3 client before shutting down
For several years now, we have seen a strange, and very rare, flakiness
in Alternator tests described in issue #17564: We see all the test pass,
pytest declares them to have passed, and while Python is existing, it
crashes with a signal 11 (SIGSEGV). Because this happens exclusively in
test/alternator and never in the test/cqlpy, we suspect that something
that the test/alternator leaves behind but test/cqlpy does not, causes
some race and crashes during shutdown.

The immediate suspect is the boto3 library, or rather, the urllib3 library
which it uses. This is more-or-less the only thing that test/alternator
does which test/cqlpy doesn't. The urllib3 library keeps around pools of
reusable connections, and it's possible (although I don't actually have any
proof for it) that these open connections may cause a crash during shutdown.

So in this patch I add to the "dynamodb" and "dynamodbstreams" fixtures
(which all Alternator tests use to connect to the server), a teardown which
calls close() for the boto3 client object. This close() call percolates
down to calling clear() on urllib3's PoolManager. Hopefully, this will
make some difference in the chance to crash during shutdown - and if it
doesn't, it won't hurt.

Refs #17564

Closes scylladb/scylladb#22341
2025-01-16 19:21:00 -05:00
Gleb Natapov
1e4b2f25dc locator: token_metadata: drop update_host_id() function that does nothing now 2025-01-16 16:37:08 +02:00
Gleb Natapov
50fb22c8f9 locator: topology: drop indexing by ips
Do not track id to ip mapping in the topology class any longer. There
are no remaining users.
2025-01-16 16:37:08 +02:00
Gleb Natapov
97f95f1dbd locator: token_metadata: remove unused ip based functions 2025-01-16 16:37:07 +02:00
Gleb Natapov
415e8de36e locator: topology: change get_datacenter_endpoints and get_datacenter_racks to return host ids and amend users 2025-01-16 16:37:06 +02:00
Gleb Natapov
8433947932 locator: topology: remove get_location overload that works on ip and its last users 2025-01-16 16:37:06 +02:00
Gleb Natapov
1b6e1456e5 messaging_service: drop the usage of ip based token_metadata APIs
We want to drop ips from token_metadata so move to use host id based
counterparts. Messaging service gets a function that maps from ips to id
when is starts listening.
2025-01-16 16:37:06 +02:00
Gleb Natapov
542360e825 test: drop inet_address usage from network_topology_strategy_test
Move the test to work on host ids. IPs will be dropped eventually.
2025-01-16 16:37:06 +02:00
Kefu Chai
8d7786cb0e build: cmake: use wasm32-wasip1 as an alternative of wasm32-wasi
wasm32-wasi has been removed in Rust 1.84 (Jan 5th, 2025). if one
compiles the tree with Rust 1.84 or up, following build failure is
expected:

```
[2/305] Building WASM /home/kefu/dev/scylladb/build/wasm/return_input.wasm
FAILED: wasm/return_input.wasm /home/kefu/dev/scylladb/build/wasm/return_input.wasm
cd /home/kefu/dev/scylladb/test/resource/wasm/rust && /usr/bin/cargo build --target=wasm32-wasi --example=return_input --locked --manifest-path=Cargo.toml --target-dir=/home/kefu/dev/scylladb/build/test/resource/wasm/rust && wasm-opt /home/kefu/dev/scylladb/build/test/resource/wasm/rust/wasm32-wasi//debug/examples/return_input.wasm -Oz -o /home/kefu/dev/scylladb/build/wasm/return_input.wasm && wasm-strip /home/kefu/dev/scylladb/build/wasm/return_input.wasm
error: failed to run `rustc` to learn about target-specific information

Caused by:
  process didn't exit successfully: `rustc - --crate-name ___ --print=file-names --target wasm32-wasi --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=split-debuginfo --print=crate-name --print=cfg` (exit status: 1)
  --- stderr
  error: Error loading target specification: Could not find specification for target "wasm32-wasi". Run `rustc --print target-list` for a list of built-in targets
```

in order to workaround this issue, let's check for supported target,
and use wasm32-wasip1 if wasm32-wasi is not listed as the supported
target.

Refs #20878

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22320
2025-01-16 16:28:29 +03:00
Botond Dénes
b2a03e03f7 Merge 'raft: Handle non-critical config update errors in when changing voter status.' from Sergey Zolotukhin
When a node is bootstrapped and joined a cluster as a non-voter and changes it's role to a voter, errors can occur while committing a new Raft record, for instance, if the Raft leader changes during this time. These errors are not critical and should not cause a node crash, as the action can be retried.

Fixes scylladb/scylladb#20814

Backport: This issue occurs frequently and disrupts the CI workflow to some extent. Backports are needed for versions 6.1 and 6.2.

Closes scylladb/scylladb#22253

* github.com:scylladb/scylladb:
  raft: refactor `remove_from_raft_config` to use a timed `modify_config` call.
  raft: Refactor functions using `modify_config` to use a common wrapper for retrying.
  raft: Handle non-critical config update errors in when changing status to voter.
  test: Add test to check that a node does not fail on unknown commit status error when starting up.
  raft: Add run_op_with_retry in raft_group0.
2025-01-16 11:00:47 +02:00
Piotr Dulikowski
72f28ce81e Merge 'main, view: Pair view builder drain with its start' from Dawid Mędrek
In this PR, we pair draining the view builder with its start.
To better understand what was done and why, let's first look at the
situation before this commit and the context of it:

(a) The following things happened in order:

    1. The view builder would be constructed.
    2. Right after that, a deferred lambda would be created to stop the
       view builder during shutdown.
    3. group0_service would be started.
    4. A deferred lambda stopping group0_service would be created right
       after that.
    5. The view builder would be started.

(b) Because the view builder depends on group0_client, it couldn't be
    started before starting group0_service. On the other hand, other
    services depend on the view builder, e.g. the stream manager. That
    makes changing the order of initialization a difficult problem,
    so we want to avoid doing that unless we're sure it's the right
    choice.

(c) Since the view builder uses group0_client, there was a possibility
    of running into a segmentation fault issue in the following
    scenario:

    1. A call to `view_builder::mark_view_build_success()` is issued.
    2. We stop group0_service.
    3. `view_builder::mark_view_build_success()` calls
       `announce_with_raft()`, which leads to a use-after-free because
       group0_service has already been destroyed.

      This very scenario took place in scylladb/scylladb#20772.

Initially, we decided to solve the issue by initializing
group0_service a bit earlier (scylladb/scylladb@7bad8378c7).
Unfortunately, it led to other issues described in scylladb/scylladb#21534,
so we revert that patch. These changes are the second attempt
to the problem where we want to solve it in a safer manner.

The solution we came up with is to pair the start of the view builder
with a deferred lambda that deinitializes it by calling
`view_builder::drain()`. No other component of the system should be
able to use the view builder anymore, so it's safe to do that.
Furthermore, that pairing makes the analysis of
initialization/deinitialization order much easier. We also solve the
aformentioned use-after-free issue because the view builder itself
will no longer attempt to use group0_client.

Note that we still pair a deferred lambda calling `view_builder::stop()`
with the construction of the view builder; that function will also call
`view_builder::drain()`. Another notable thing is `view_builder::drain()`
may be called earlier by `storage_service::do_drain()`. In other words,
these changes cover the situation when Scylla runs into a problem when
starting up.

Backport: The patch I'm reverting made it to 6.2, so we want to backport this one there too.

Fixes scylladb/scylladb#20772
Fixes scylladb/scylladb#21534

Closes scylladb/scylladb#21909

* github.com:scylladb/scylladb:
  test/topology_custom: Add test for Scylla with disabled view building
  main, view: Pair view builder drain with its start
  Revert "main,cql_test_env: start group0_service before view_builder"
2025-01-15 09:50:26 +01:00
Sergey Zolotukhin
8c48f7ad62 raft: Handle non-critical config update errors in when changing status
to voter.

When a node is bootstrapped and joins a cluster as a non-voter, errors can occur while committing
a new Raft record, for instance, if the Raft leader changes during this time. These errors are not
critical and should not cause a node crash, as the action can be retried.

Fixes scylladb/scylladb#20814
2025-01-15 09:49:15 +01:00
Sergey Zolotukhin
16053a86f0 test: Add test to check that a node does not fail on unknown commit status
error when starting up.

Test that a node is starting successfully if while joining a cluster and becoming a voter, it
receives an unknown commit status error.

Test for scylladb/scylladb#20814
2025-01-14 17:12:06 +01:00
Kamil Braun
2eac7a2d61 Merge 'test/pylib: two trivial cleanups' from Kefu Chai
- use "foo not in bar" instead of "not foo in bar"
- test/pylib: use foo instead of `'{}'.format(foo)`

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#22066

* github.com:scylladb/scylladb:
  test/pylib: use `foo` instead of `'{}'.format(foo)`
  test/pylib: use "foo not in bar" instead of "not foo in bar"
2025-01-14 16:27:44 +01:00
Nadav Har'El
15c252fd8f Merge 'docs: Update documentation on CREATE ROLE WITH HASHED PASSWORD' from Dawid Mędrek
As part of #18750, we added a CQL statement CREATE ROLE WITH SALTED HASH that prevented hashing a password when creating a role, effectively leading to inserting a hash given by the user directly into the database. In #21350, we noticed that Cassandra had implemented a CQL statement of similar semantics but different syntax. We decided to rename Scylla's statement to be compatible with Cassandra. Unfortunately, we didn't notice one more difference between what we had in Scylla and what was part of Cassandra.

Scylla's statement was originally supposed to only be used when restoring the schema and the user needn't have to be aware of its existence at all: the database produced a sequence of CQL statements that the user saved to a file and when a need to restore the schema arose, they would execute the contents of the file. That's why that although we documented the feature, it was only done in the necessary places. Those that weren't related to the backup & restore procedure were deliberately skipped.

Cassandra, on the other hand, added the statement for a different purpose (for details, see the relevant issue) and it was supposed to be used by the user by design. The statement is also documented as such.

Since we want to preserve compatibility with Cassandra, we document the statement and its semantics in the user documentation, explicitly implying that it can be used by the user.

We also add a test verifying that logging in works correctly.

Fixes scylladb/scylladb#21691

Backport: not needed. The relevant code didn't make it to 6.2 or any previous version of OSS.

Closes scylladb/scylladb#21752

* github.com:scylladb/scylladb:
  docs: Update documentation on CREATE ROLE WITH HASHED PASSWORD
  test/boost: Add test for creating roles with hashed passwords
2025-01-14 15:33:30 +02:00
Kefu Chai
7215d4bfe9 utils: do not include unused headers
these unused includes were identifier by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.

please note, because quite a few source files relied on
`utils/to_string.hh` to pull in the specialization of
`fmt::formatter<std::optional<T>>`, after removing
`#include <fmt/std.h>` from `utils/to_string.hh`, we have to
include `fmt/std.h` directly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-14 07:56:39 -05:00
Aleksandra Martyniuk
592512fd0f test: fix memtable_flush_period test
memtable_flush_period test sets the flush period to 200ms and checks
whether the data is flushed after 500ms.

When flush period is set, the timer is armed with the given value.
On expiration, memtables are flushed and then the timer is rearmed.

There is no certainty that during 500ms the flush finishes, though.

Check if after 500ms flush has started. Wait until there is an sstable.

Fixes: #21965.

Closes scylladb/scylladb#22162
2025-01-14 07:56:38 -05:00
Botond Dénes
686a997c04 Merge 'Complete implementation of configuring IO bandwidth limits' from Pavel Emelyanov
In Scylla there are two options that control IO bandwidth limit -- the /storage_service/(compaction|stream)_throughput REST API endpoints. The endpoints are partially implemented and have no counterparts in the nodetool.

This set implements the missing bits and adds tests for new functionality.

Closes scylladb/scylladb#21877

* github.com:scylladb/scylladb:
  nodetool: Implement [gs]etstreamthroughput commands
  nodetool: Implement [gs]etcompationthroughput commands
  test: Add validation of how IO-updating endpoints work
  api: Implement /storage_service/(stream|compaction)_throughput endpoints
  api: Disqualify const config reference
  api: Implement /storage_service/stream_throughput endpoint
  api: Move stream throughput set/get endpoints from storage service block
  api: Move set_compaction_throughput_mb_per_sec to config block
  util: Include fmt/ranges.h in config_file.hh
2025-01-14 07:56:38 -05:00
Aleksandra Martyniuk
94f4871352 test: start waiting for task before it gets aborted
Ensure that the repair task was aborted after wait API acknowledged
its existence.

Fixes: #22011.

Closes scylladb/scylladb#22012
2025-01-14 07:56:37 -05:00
Lakshmi Narayanan Sreethar
63100b34da sstable_directory: do not load remote sstables in process_descriptor
The sstable loader relied on the generation id to provide an efficient
hint about the shard that owns an sstable. But, this hint was rendered
ineffective with the introduction of UUID generation, as the shard id
was no longer embedded in the generation id. This also became suboptimal
with the introduction of tablets. Commit 0c77f77 addressed this issue by
reading the minimum from disk to determine sstable ownership but this
improvement was lost with commit 63f1969, which optimistically assumed
that hints would work most of the time, which isn't true.

This commit restores that change - shard id of a table is deduced by
reading minially from disk and then the sstable is fully loaded only if
it belongs to the local shard. This patch also adds a testcase to verify
that the sstable are loaded only in their respective shards.

Fixes #21015

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-01-13 20:01:30 +05:30
Nadav Har'El
321d0fd3b1 Merge 'Alternator: Add WCU suppport for update item' from Amnon Heiman
This series adds WCU support for the Alternator update item.
This motivation behind it, is to have a rough estimation of what a similar operation would have taken from WCU perspective if used with DynamoDB.

The calculation is done while minimal overhead is the prime objective, the results are values that is less or equal to what it would have been in DynamoDB

** New feature, no need to backport. **

Closes scylladb/scylladb#21999

* github.com:scylladb/scylladb:
  alternator/test_returnconsumedcapacity.py: update item
  alternator/executor.cc: Add WCU for update_item
2025-01-13 14:35:46 +02:00
Kamil Braun
48a4efba2f Merge 'Fix possible data corruption due to token keys clashing in read repair.' from Sergey Zolotukhin
This update addresses an issue in the mutation diff calculation
algorithm used during read repair. Previously, the algorithm used
`token` as the hashmap key. Since `token` is calculated basing on the
Murmur3 hash function, it could generate duplicate values for different
partition keys, causing corruption in the affected rows' values.

Fixes scylladb/scylladb#19101

Since the issue affects all the relevant scylla versions, backport to: 6.1, 6.2

Closes scylladb/scylladb#21996

* github.com:scylladb/scylladb:
  storage_proxy/read_repair: Remove redundant 'schema' parameter from `data_read_resolver::resolve` function.
  storage_proxy/read_repair: Use `partition_key` instead of `token` key for mutation diff calculation hashmap.
  test: Add test case for checking read repair diff calculation when having conflicting keys.
2025-01-13 10:54:34 +01:00
Kamil Braun
88a48f2355 Merge 'Load peers table into the gossiper on boot' from Gleb
Since we manage ip to id mapping directly in gossiper now we need to
load the mapping on boot. We already do it anyway, but only due to a bug
which checks raft topology mode config before it is set, so the code
thinks that it is in the gossiper mode and loads peers table into the
gossiper and token metadata. Fix the bug and load peers into the gossiper
only since token metadata is managed by raft.

The series also removes address map related test that no longer checks
anything and replace it with unit test.

It also adds the dc/rack check to "join node" rpc. The check is done
during shadow round now, but for it to work it requires dc/rack to be
propagated through the gossiper and we want to eventually drop it.

Ref: scylladb/scylladb#21777

* 'load-peers' of https://github.com/gleb-cloudius/scylla:
  topology coordinator: reject replace request if topology does not match
  gossiper: fix the logic of shadow_round parameter
  storage_service: do not add endpoint to the gossiper during topology loading.
  storage_service: load peers into gossiper on boot in raft topology mode
  storage_service: set raft topology change mode before using it in join_cluster
  locator: drop inet_address usage to figure out per dc/rack replication
  test: drop test_old_ip_notification_repro.py
  test: address_map: check generation handling during entry addition
2025-01-13 09:40:36 +01:00
Andrei Chekun
2aea2610e0 test.py: Wait for tasks finish before going further
Developers using asyncio.gather() often assume that it waits for all futures (awaitables) givens.
But this isn't true when the return_exceptions parameter is False, which is the default.
In that case, as soon as one future completes with an exception, the gather() call will return this exception immediately, and some of the finished tasks may continue to run in the background.
This is bad for applications that use gather() to ensure that a list of background tasks has all completed.
So such applications must use asyncio.gather() with return_exceptions=True, to wait for all given futures to complete either successfully or unsuccessfully.

Closes scylladb/scylladb#22252
2025-01-13 09:43:28 +02:00
Kefu Chai
752e6561fb test/pylib: log if scylla exits with non-zero status code
When destroying a test cluster, ScyllaCluster.stop() calls ScyllaServer.stop()
for each running server. Previously, non-zero exit status codes from scylla
servers were silently ignored during test teardown.

This change modifies the logging behavior to print the exit status code when
a scylla server exits with a non-zero status. This helps developers quickly
identify potential issues or unexpected terminations during test runs.

Differences in handling:
- Before: Non-zero exit codes were not logged
- After: Non-zero exit codes are printed, providing visibility into
  server termination errors

This improvement aids in diagnosing intermittent test failures or
unexpected server shutdowns during test execution.

Refs #21742

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21934
2025-01-13 09:09:43 +03:00
Kefu Chai
d815d7013c sstables_loader: report progress with the unit of batch
We restore a snapshot of table by streaming the sstables of
the given snapshot of the table using
`sstable_streamer::stream_sstable_mutations()` in batches. This function
reads mutations from a set of sstables, and streams them to the target
nodes. Due to the limit of this function, we are not able to track the
progress in bytes.

Previously, progress tracking used individual sstables as units, which caused
inaccuracies with tablet-distributed tables, where:
- An sstable spanning multiple tablets could be counted multiple times
- Progress reporting could become misleading (e.g., showing "40" progress
  for a table with 10 sstables)

This change introduces a more robust progress tracking method:
- Use "batch" as the unit of progress instead of individual sstables.
  Each batch represents a tablet when restoring a table snapshot if
  the tablet being restored is distributed with tablets. When it comes
  to tables distributed with vnode, each batch represents an sstable.
- Stream sstables for each tablet separately, handling both partially and
  fully contained sstables
- Calculate progress based on the total number of sstables being streamed
- Skip tablet IDs with no owned tokens

For vnode-distributed tables, the number of "batches" directly corresponds
to the number of sstables, ensuring:
- Consistent progress reporting across different table distribution models
- Simplified implementation
- Accurate representation of restore progress

The new approach provides a more reliable and uniform method of tracking
restoration progress across different table distribution strategies.

Also, Corrected the use of `_sstables.size()` in
`sstable_streamer::stream_sstables()`. It addressed a review comment
from Pavel that was inadvertently overlooked during previous rebasing
the commit of 5ab4932f34.

Fixes scylladb/scylladb#21816

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21841
2025-01-13 09:04:35 +03:00
Dawid Mędrek
d1f960eee2 test/topology_custom: Add test for Scylla with disabled view building
Before this commit, there doesn't seem to have been a test verifying that
starting and shutting down Scylla behave correctly when the configuration
option `view_building` is set to false. In these changes, we add one.
2025-01-13 00:41:27 +01:00
Dawid Mędrek
06ce976370 main, view: Pair view builder drain with its start
In these changes, we pair draining the view builder with its start.
To better understand what was done and why, let's first look at the
situation before this commit and the context of it:

(a) The following things happened in order:

    1. The view builder would be constructed.
    2. Right after that, a deferred lambda would be created to stop the
       view builder during shutdown.
    3. group0_service would be started.
    4. A deferred lambda stopping group0_service would be created right
       after that.
    5. The view builder would be started.

(b) Because the view builder depends on group0_client, it couldn't be
    started before starting group0_service. On the other hand, other
    services depend on the view builder, e.g. the stream manager. That
    makes changing the order of initialization a difficult problem,
    so we want to avoid doing that unless we're sure it's the right
    choice.

(c) Since the view builder uses group0_client, there was a possibility
    of running into a segmentation fault issue in the following
    scenario:

    1. A call to `view_builder::mark_view_build_success()` is issued.
    2. We stop group0_service.
    3. `view_builder::mark_view_build_success()` calls
       `announce_with_raft()`, which leads to a use-after-free because
       group0_service has already been destroyed.

      This very scenario took place in scylladb/scylladb#20772.

Initially, we decided to solve the issue by initializing
group0_service a bit earlier (scylladb/scylladb@7bad8378c7).
Unfortunately, it led to other issues described in scylladb/scylladb#21534.
We reverted that change in the previous commit. These changes are the
second attempt to the problem where we want to solve it in a safer manner.

The solution we came up with is to pair the start of the view builder
with a deferred lambda that deinitializes it by calling
`view_builder::drain()`. No other component of the system should be
able to use the view builder anymore, so it's safe to do that.
Furthermore, that pairing makes the analysis of
initialization/deinitialization order much easier. We also solve the
aformentioned use-after-free issue because the view builder itself
will no longer attempt to use group0_client.

Note that we still pair a deferred lambda calling `view_builder::stop()`
with the construction of the view builder; that function will also call
`view_builder::drain()`. Another notable thing is `view_builder::drain()`
may be called earlier by `storage_service::do_drain()`. In other words,
these changes cover the situation when Scylla runs into a problem when
starting up.

Fixes scylladb/scylladb#20772
2025-01-13 00:41:22 +01:00
Dawid Mędrek
a5715086a4 Revert "main,cql_test_env: start group0_service before view_builder"
The patch solved a problem related to an initialization order
(scylladb/scylladb#20772), but we ran into another one: scylladb/scylladb#21534.
After moving the initialization of group0_service, it ended up being destroyed
AFTER the CDC generation service would. Since CDC generations are accessed
in `storage_service::topology_state_load()`:

```
for (const auto& gen_id : _topology_state_machine._topology.committed_cdc_generations) {
    rtlogger.trace("topology_state_load: process committed cdc generation {}", gen_id);
    co_await _cdc_gens.local().handle_cdc_generation(gen_id);
```

we started getting the following failure:

```
Service &seastar::sharded<cdc::generation_service>::local() [Service = cdc::generation_service]: Assertion `local_is_initialized()' failed.
```

We're reverting the patch to go back to a more stable version of Scylla
and in the following commit, we'll solve the original issue in a more
systematic way.

This reverts commit 7bad8378c7.
2025-01-12 18:13:56 +01:00
Avi Kivity
814942505f Merge 'Introduce Encryption-at-Rest (EAR) for sstables and commitlog' from Calle Wilund
Fixes https://github.com/scylladb/scylla-enterprise/issues/5016#issuecomment-2558464631

EAR - encryption at rest. Allows on-disk file encryption of sstables and commitlog data.
Introduces OpenSSL based file level encrypted storage, managed via a set of providers
ranging from local files to cloud KMS providers.

For a more comprehensive explanation, see the included docs (or if possible, original
source tree).

Manual bulk merge of EAR feature from enterprise repo to main scylla repo.

Breaks some features apart, but main EAR is still a humongous commit, because to separate this
I would have to mess with code incrementally, adding time and risk.

This PR includes the local file gen tool, tests and also p11 validation.

Note: CI will not execute the full tests unless master CI is set to provide the same environment
as the enterprise one. Not sure about the status of this ATM.

Note: Includes code to compile against cryptsoft kmipc SDK, but not the SDK. If you happen to
check out this tree in the scylla folder and configure, it will be linked against and KMIP functionality
will be enabled, otherwise not.

Closes scylladb/scylladb#22233

* github.com:scylladb/scylladb:
  docs: Add EAR docs
  main/build: Add p11-kit and initialize
  tools: Add local-file-key-generator tool
  tests: Add EAR tests
  tmpdir: shorten test tempdir path
  EAR: port the ear feature from enterprise
  cql_test_env: Add optional query timeout
  schema/migration_manager: Add schema validate
  sstables: add get_shared_components accessor
  config/config_file: Add exports and definitions of config_type_for<>
2025-01-12 16:10:46 +02:00
Benny Halevy
8d2ff8a915 utils: add disk_space_monitor
Instantiated only on shard 0.
Currently, only subscribe from unit test

Manual unit test using loop mount was added.
Note that the test requires sudo access
and root access to /dev/loop, so it cannot
run in rootless podman instance, and it'd
fail with Permission denied.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#21523
2025-01-12 14:51:15 +02:00
Piotr Smaron
288f9b2b15 Introduce LDAP role manager & saslauthd authenticator
This PR extends authentication with 2 mechanisms:
- a new role_manager subclass, which allows managing users via
LDAP server,
- a new authenticator, which delegates plaintext authentication
to a running saslauthd daemon.

The features have been ported from the enterprise repository
with their test.py tests and the documentation as part of
changing license to source available.

Fixes: scylladb/scylla-enterprise#5000
Fixes: scylladb/scylla-enterprise#5001

Closes scylladb/scylladb#22030
2025-01-12 14:50:29 +02:00
Calle Wilund
c596ae6eb1 tests: Add EAR tests
Adds the migrated EAR/encryption tests.
Note: Until scylla CI is updated to provide all the proper
ENV vars, some tests will not execute.
2025-01-09 10:40:39 +00:00
Calle Wilund
ee62b61c84 tmpdir: shorten test tempdir path
To make certain python tests work in CI
2025-01-09 10:37:35 +00:00
Sergey Zolotukhin
2f1731c551 test: Include parent test name in ScyllaClusterManager log file names.
Add the test file name to `ScyllaClusterManager` log file names alongside the test function name.
This avoids race conditions when tests with the same function names are executed simultaneously.

Fixes scylladb/scylladb#21807

Backport: not needed since this is a fix in the testing scripts.

Closes scylladb/scylladb#22192
2025-01-08 15:42:31 +02:00
Calle Wilund
e734fc11ec cql_test_env: Add optional query timeout
Some tests need queries to actually fail.
2025-01-08 12:50:03 +00:00
Kefu Chai
d0a3311ced locator: do not include unused headers
these unused includes were identifier by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22199
2025-01-08 14:26:48 +02:00
Benny Halevy
e6efaa3b73 Update seastar submodule
* seastar 3133ecdd...a9bef537 (24):
  > file: add file_system_space
  > future: avoid inheriting from future payload type
  > treewide: include fmt/ostream.h for using fmt::print()
  > build: remove messages used for debugging
  > demos: Rename websocket demo to websocket_server demo
  > demos: Add a way to set port from cmd line in websocket demo
  > tls: Add optional builder + future-wait to cert reload callback + expose rebuild
  > rwlock: add try_hold_{read,write}_lock methods
  > json: add moving push to json_list
  > github: add a step to build "check-include-style"
  > build: add a target for checking include style
  > scheduling_group: use map for key configs instead of vector
  > scheduling_group: fix indentation
  > scheduling_group: fix race between scheduling group and key creation
  > http: Make request writing functions public
  > http: Expose connection_factory implementations
  > metrics: Use separate type for shared metadata
  > file: unexpected throw from inside noexcept
  > metrics: Internalize metric label sets
  > thread: optimize maybe_yield
  > reactor: fix crash in pending registration task after poller dtor
  > net: Fix ipv6 socket_address comparision
  > reactor, linux-aio: factor out get_smp_count() lambda
  > reactor, linux-aio: restore "available_aio" meaning after "reserve_iocbs"

Fixed usage of seastar metric label sets due to:
scylladb/seastar@733420d57 Merge 'metrics: Internalize metric label sets' from Stephan Dollberg

Closes scylladb/scylladb#22076
2025-01-08 09:37:16 +02:00
Wojciech Mitros
d04f376227 mv: add an experimental feature for creating views using tablets
We still have a number of issues to be solved for views with tablets.
Until they are fixed, we should prevent users from creating them,
and use the vnode-based views instead.

This patch prepares the feature for enabling views with tablets. The
feature is disabled by default, but currently it has no effect.
After all tests are adjusted to use the feature, we should depend
on the feature for deciding whether we can create materialized views
in tablet-enabled keyspaces.

The unit tests are adjusted to enable this feature explicitly, and it's
also added to the scylla sstable tool config - this tool treats all
tables as if they were tablet-based (surprisingly, with SimpleStrategy),
so for it to work on views, the new feature must be enabled.

Refs scylladb/scylladb#21832

Closes scylladb/scylladb#21833
2025-01-07 15:52:36 +01:00
Asias He
935dcd69fa repair: Remove repair_task_info only when repair is finished
In case of error, repair will be moved into the end_repair stage. We
should not remove repair_task_info in this case because the repair task
requested by the user is not finished yet.

To fix, we should remove repair_task_info at the end of repair stage.

Tests are added to ensure failed repair is not reported as finished.

Closes scylladb/scylladb#21973
2025-01-07 16:19:40 +02:00
Emil Maskovsky
2ac9ed2073 raft: test the limited voters feature
Test the limited voters feature by creating a cluster with 3 DCs, one of
them disproportionately larger than the others. The raft majority should
not be lost in case the large DC goes down.

Fixes: scylladb/scylla#21915

Refs: scylladb/scylla#18793

Closes scylladb/scylladb#21901
2025-01-07 15:09:49 +01:00
Michael Litvak
0617564123 db/commitlog: make the commit log hard limit mandatory
mark the config parameter --commitlog-use-hard-size-limit as deprecated so the
default 'true' is always used, making the hard limit mandatory.

Fixes scylladb/scylladb#16471

Closes scylladb/scylladb#21804
2025-01-07 15:03:56 +02:00
Botond Dénes
b3f8c4faa7 Merge 'node_ops: filter topology_requests entries shown by node_ops_virtual_task' from Aleksandra Martyniuk
node_ops_virtual_task does not filter the entries of system.topology_request
and so it creates statuses of operations that aren't node ops.

Filter the entries used by node_ops_virtual_task. With this change, the status
of a bootstrap of the first node will not be visible.

Fixes: https://github.com/scylladb/scylladb/issues/22008.

Needs backport to 6.2 that introduced node_ops_virtual_task

Closes scylladb/scylladb#22009

* github.com:scylladb/scylladb:
  test: truncate the table before node ops task checks
  node_ops: rename a method that get node ops entries
  node_ops: filter topology_requests entries
2025-01-07 14:17:01 +02:00
Kefu Chai
353b522ca0 treewide: migrate from boost::adaptors::reversed to std::views::reverse
now that we are allowed to use C++23. we now have the luxury of using
`std::views::reverse`.

- replace `boost::adaptors::transformed` with `std::views::transform`
- remove unused `#include <boost/range/adaptor/reversed.hpp>`

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-07 13:22:00 +02:00
Botond Dénes
69150f0680 Merge 'Fix edge case issues related to tablet draining ' from Tomasz Grabiec
Main problem:

If we're draining the last node in a DC, we won't have a chance to
evaluate candidates and notice that constraints cannot be satisfied (N
< RF). Draining will succeed and node will be removed with replicas
still present on that node. This will cause later draining in the same
DC to fail when we will have 2 replicas which need relocaiton for a
given tablet.

The expected behvior is for draining to fail, because we cannot keep
the RF in the DC. This is consistent, for example, with what happens
when removing a node in a 2-node cluster with RF=2.

Fixes #21826

Secondary problem:

We allowed tablet_draining transition to be exited with undrained nodes, leaving replicas on nodes in the "left" state.

Third problem:

We removed DOWN nodes from the candidate node set, even when draining. This is not safe because it may lead to overload. This also makes the "main problem" more likely by extending it to the scenario when the DC is DOWN.

The overload part in not a problem in practice currently, since migrations will block on global topology barrier if there are DOWN nodes.

Closes scylladb/scylladb#21928

* github.com:scylladb/scylladb:
  tablets: load_balancer: Fail when draining with no candidate nodes
  tablets: load_balancer: Ignore skip_list when draining
  tablets: topology_coordinator: Keep tablet_draining transition if nodes are not drained
2025-01-07 13:04:00 +02:00
Kefu Chai
e4463b11af treewide: replace boost::algorithm::join() with fmt::join()
Replace usages of `boost::algorithm::join()` with `fmt::join()` to improve
performance and reduce dependency on Boost. `fmt::join()` allows direct
formatting of ranges and tuples with custom separators without creating
intermediate strings.

When formatting comma-separated values into another string, fmt::join()
avoids the overhead of temporary string creation that
`boost::algorithm::join()` requires. This change also helps streamline
our dependencies by leveraging the existing fmt library instead of
Boost.Algorithm.

To avoid the ambiguity, some caller sites were updated to call
`seastar::format()` explicitly.

See also

- boost::algorithm::join():
  https://www.boost.org/doc/libs/1_87_0/doc/html/string_algo/reference.html#doxygen.join_8hpp
- fmt::join():
  https://fmt.dev/11.0/api/#ranges-api

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22082
2025-01-07 12:45:05 +02:00
Amnon Heiman
7390116620 alternator/test_returnconsumedcapacity.py: update item
This patch adds tests for return consumed capacity for update_item.

The tests cover: a simple update for a small object, a missing item, an
update with a very large attribute (where the attribute itself is more
than 1KB), and an update of a big item that uses read-before-write.
2025-01-06 09:55:17 +02:00
Nadav Har'El
e919794db8 test/alternator: fix mistakes introduced with test_service_levels.py
This patch undoes multiple mistakes done when introducing the test
for service levels in pull request #22031:

1. The PR introduced in test/alternator/run and test/alternator/suite.yaml
   a permanent role and service level that the service-level test is
   supposed to use. This was a mistake - the test can create the service
   level for its own use, using CQL, it does not need to assume such a
   service level already exists.
   It's important to fix this to allow the service level test to run
   against an installation of Scylla not set up by our own scripts.
   Moreover, while the code in suite.yaml was correct, the code in
   "run" was incorrect (used an outdated keyspace name). This patch
   removes that incorrect code.

2. The PR introduced a duplicate "cql" fixture, copied verbatim
   from test_cql_rbac.py (including a comment that was correct only
   in the latter file :-)). Let's de-duplicate it, using the fixture
   that I moved to conftest.py in the previous patch.

3. The PR used temporary_grant(). This needelessly complicated the test
   and added even more duplicate code, and this patch removes all that
   stuff. This test is about service levels, not RBAC and "grant".
   This test should just use a superuser role that has the permissions
   to do everything, and don't need to be granted specific permissions.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-01-05 19:40:14 +02:00