Commit Graph

42264 Commits

Author SHA1 Message Date
Botond Dénes
275ed9a9bc replica/mutation_dump: create_underlying_mutation_sources(): remove false move
transformed_cr is moved in a loop, in each iteration. This is harmless
because the variable is const and the move has no effect, yet it is
confusing to readers and triggers false positives in clang-tidy
(moved-from object reused). Remove it.

Fixes: #18322

Closes scylladb/scylladb#18348
2024-04-23 01:21:36 +02:00
Kamil Braun
e9285e5c04 Merge 'various fixes for topology coordinator' from Gleb
The series contains fixes for some problems found during scalability
testing and one clean up patch.

Ref: scylladb/scylladb#17545

* 'gleb/topology-fixes-v4' of github.com:scylladb/scylla-dev:
  gossiper: disable status check for endpoints in raft mode
  storage_service: introduce a setter for topology_change_kind
  topology coordinator: drop unused structure
  storage_service: yield in get_system_mutations
2024-04-22 17:37:47 +02:00
Calle Wilund
82d97da3e0 commitlog: Remove (benign) use-after-move
Fixes #18329

named_file::assign call uses old object "known_size" after a move
of the object. While this is wholly ok, since the attribute accessed
will not be modified/destroyed by the move, it causes warnings in
"tidy" runs, and might confuse or cause real errors should impl. change.

Closes scylladb/scylladb#18337
2024-04-22 17:20:19 +03:00
Kefu Chai
ff04375016 main: drop unused namespace alias
`fs` namespace alias was introduced in ff4d8b6e85, but we don't
use it anymore. so let's drop it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18308
2024-04-22 13:50:28 +03:00
Nadav Har'El
59b40484c8 Update seastar submodule
* seastar 8fabb30a...2b43417d (6):
  > future: deprecate future::get0()
  > build: do not export valgrind with export()
  > http: deprecate buggy path param[]
  > http/request: add get_path_param method
  > http/request: get_query_param refactor
  > http/util: add path_decode method

Refs #5883 (fixes https://github.com/scylladb/seastar/issues/725 and
provides a new API to read the decoded paths).

Closes scylladb/scylladb#18297
2024-04-22 11:12:49 +03:00
Gleb Natapov
06e6ed09ed gossiper: disable status check for endpoints in raft mode
Gossiper automatically removes endpoints that do not have tokens in
normal state and either do not send gossiper updates or are dead for a
long time. We do not need this with topology coordinator mode since in
this mode the coordinator is responsible to manage the set of nodes in
the cluster. In addition the patch disables quarantined endpoint
maintenance in gossiper in raft mode and uses left node list from the
topology coordinator to ignore updates for nodes that are no longer part
of the topology.
2024-04-21 16:36:07 +03:00
Gleb Natapov
0e3f92fa49 storage_service: introduce a setter for topology_change_kind
In the next patch we will extend it to have other side affects.
2024-04-21 16:36:07 +03:00
Gleb Natapov
040c6ca0c1 topology coordinator: drop unused structure 2024-04-21 16:36:07 +03:00
Gleb Natapov
d0a00f3489 storage_service: yield in get_system_mutations
Yield in a loop that converts a result to canonical_mutation. We
observed stalls for very large tables.
2024-04-21 16:36:07 +03:00
Avi Kivity
87b08c957f Merge 'treewide: drop FMT_DEPRECATED_OSTREAM macro and homebrew range formatters' from Kefu Chai
before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map optional and variant using {fmt} instead of the homebrew formatter based on operator<<.
with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Closes scylladb/scylladb#17968

* github.com:scylladb/scylladb:
  treewide: do not define FMT_DEPRECATED_OSTREAM
  treewide: include fmt/ranges.h and/or fmt/std.h
  utils/managed_bytes: add support for fmt::to_string() to bytes and friends
2024-04-20 22:25:00 +03:00
Mikołaj Grzebieluch
65cfb9b4e0 storage_service: skip wait_for_gossip_to_settle if topology changes are based on raft
Waiting for gossip to settle slows down the bootstrap of the cluster.
It is safe to disable it if the topology is based on Raft.

Fixes scylladb/scylladb#16055

Closes scylladb/scylladb#17960
2024-04-20 17:56:51 +02:00
Kefu Chai
372a4d1b79 treewide: do not define FMT_DEPRECATED_OSTREAM
since we do not rely on FMT_DEPRECATED_OSTREAM to define the
fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`.

in this change,

* utils: drop the range formatters in to_string.hh and to_string.c, as
  we don't use them anymore. and the tests for them in
  test/boost/string_format_test.cc are removed accordingly.
* utils: use fmt to print chunk_vector and small_vector. as
  we are not able to print the elements using operator<< anymore
  after switching to {fmt} formatters.
* test/boost: specialize fmt::details::is_std_string_like<bytes>
  due to a bug in {fmt} v9, {fmt} fails to format a range whose
  element type is `basic_sstring<uint8_t>`, as it considers it
  as a string-like type, but `basic_sstring<uint8_t>`'s char type
  is signed char, not char. this issue does not exist in {fmt} v10,
  so, in this change, we add a workaround to explicitly specialize
  the type trait to assure that {fmt} format this type using its
  `fmt::formatter` specialization instead of trying to format it
  as a string. also, {fmt}'s generic ranges formatter calls the
  pair formatter's `set_brackets()` and `set_separator()` methods
  when printing the range, but operator<< based formatter does not
  provide these method, we have to include this change in the change
  switching to {fmt}, otherwise the change specializing
  `fmt::details::is_std_string_like<bytes>` won't compile.
* test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends
  for comparing values. but without the operator<< based formatters,
  Boost.Test would not be able to print them. after removing
  the homebrew formatters, we need to use the generic
  `boost_test_print_type()` helper to do this job. so we are
  including `test_utils.hh` in tests so that we can print
  the formattable types.
* treewide: add "#include "utils/to_string.hh" where
  `fmt::formatter<optional<>>` is used.
* configure.py: do not define FMT_DEPRECATED_OSTREAM
* cmake: do not define FMT_DEPRECATED_OSTREAM

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:57:36 +08:00
Kefu Chai
a439ebcfce treewide: include fmt/ranges.h and/or fmt/std.h
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:16 +08:00
Kefu Chai
01f13850cb utils/managed_bytes: add support for fmt::to_string() to bytes and friends
in 3835ebfcdc, `fmt::formatter` were added to `bytes` and friend, but
their `format()` methods were intentionally implemented as plain
methods, which only acccept `fmt::format_context`. it was a decision
decision. the intention was to reduce the usage of template, to speed
up the compilation at the expense of dropping the support of other
appenders, notably the one used by `fmt::to_string()`, where the type
of "format_context" is not a `fmt::format_context`, but a string
appender. but it turns out we still have users in tests using
`fmt::to_string()`, to convert, for instance, `bytes` to `std::string`,

so, to make their life easier, we add the templated `format()` to
these types. an alternative is to change the callers to use something
like `fmt::format("{}", v)`, which is less convenient though.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:13 +08:00
Kefu Chai
5ab527e669 main: do not echo parsed options when calling scylla interactively
in 2f0f53ac, we added logging of parsed command line options so that we
can see how scylla is launched in case it fails to boot. but when scylla
is called interactively in console. this echo is a little bit annoying.
see following console session
```console
$ scylla --help-loggers
Scylla version 5.5.0~dev-0.20240419.3c9651adf297 with build-id 7dd6a110e608535e5c259a03548eda6517ab4bde starting ...
command used: "./RelWithDebInfo/scylla --help-loggers"
pid: 996503
parsed command line options: [help-loggers]
Available loggers:
    BatchStatement
    LeveledManifest
    alter_keyspace
    alter_table
...
```

so in this change, we check if the stdin is associated with a terminal
device, if that the case, we don't print the scylla version, parsed
command line and pid. and the interactive session looks like:

```console
$ scylla --help-loggers
Available loggers:
    BatchStatement
    LeveledManifest
    alter_keyspace
    alter_table
```
no more distracting information printed. the original behavior
can be tested like:

```console
$ : | ./RelWithDebInfo/scylla --help-loggers
```

assuming scylla is always launched with systemd, which connects
stdin to /dev/null. see
https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html#Logging%20and%20Standard%20Input/Output
. so this behavior is preserved with this change.

Refs #4203

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18309
2024-04-19 15:00:05 +03:00
Raphael S. Carvalho
223214439b compaction: Disconsider active tables in the hourly compaction reevaluation
This hourly reevaluation is there to help tablets that have very low
write activity, which can go a long time without flushing a memtable,
and it's important to reevaluate compaction as data can get expired.
Today it can happen that we reevaluate a table that is being compacted
actively, which is waste of cpu as the reevaluation will happen anyway
when there are changes to sstable set. This waste can be amplified with
a significant tablet count in a given shard.
Eventually, we could make the revaluation time per table based on
expiration histogram, but until we get there, let's avoid this waste
by only reevaluating tables that are compaction idle for more than 1h.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#18280
2024-04-19 14:33:40 +03:00
Kefu Chai
a5dae74aee doc: update nodetool setlogginglevel sample output with most recent loggers list
in order to reduce the confusion like:

> I cannot find foobar in the list, is it supported?

also, take this opportunity to use "console" instead of "shell" for
rendering the code block. it's a better fit in this case. since we
are using pygment for syntax highlighting,
see https://pygments.org/docs/lexers/#pygments.lexers.shell.BashSessionLexer
for details on the "console" lexer.

and add a prompt before the command line, so that "console" lexer
can render the command line and output better.

also, add a note explaining that user should refer the output of
`scylla` to see the list of logger classes.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18311
2024-04-19 13:25:39 +03:00
Kefu Chai
c04654e865 storage_service: capture this explicitly
clang-19 complains with `-Wdeprecated-this-capture`:

```
/home/kefu/dev/scylladb/service/storage_service.cc:5837:22: error: implicit capture of 'this' with a capture default of '=' is deprecated [-Werror,-Wdeprecated-this-capture]
 5837 |         auto* node = get_token_metadata().get_topology().find_node(dst.host);
      |                      ^
/home/kefu/dev/scylladb/service/storage_service.cc:5830:44: note: add an explicit capture of 'this' to capture '*this' by reference
 5830 |     co_await transit_tablet(table, token, [=] (const locator::tablet_map& tmap, api::timestamp_type write_timestamp) {
      |                                            ^
      |                                             , this
```

since https://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0806r2.html
was approved, see https://eel.is/c++draft/depr.capture.this. and newer
versions of C++ compilers implemented it, so we need to capture `this`
explicitly to be more standard compliant, and to be more future-proof
in this regard.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18306
2024-04-19 10:05:57 +03:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Avi Kivity
6e487a49aa Merge 'toolchain: support building an optimized clang' from Takuya ASADA
This is complete version of #12786, since I take over the issue from @mykaul.
Update from original version are:
- Support ARM64 build (disable BOLT for now since it doesn't functioning)
- Changed toolchain settings to get current scylla able to build (support WASM, etc)
- Stop git clone scylladb repo, create git-archive of current scylla directory and import it
-  Update Clang to 17.0.6
- Save entire clang directory for BUILD mode, not just /usr/bin/clang binary
- Implemented INSTALL_PREBUILT mode to install prebuilt image which built in BUILD mode

Note that this patch drops cross-build support of frozen toolchain, since building clang and scylla multiple time in qemu-user-static will very slow, it's not usable.
Instead, we should build the image for each architecture natively.

----

This is a different way attempting to combine building an optimized clang (using LTO, PGO and BOLT, based on compiling ScyllaDB) to dbuild. Per Avi's request, there are 3 options: skip this phase (which is the current default), build it and build + install it to the default path.

Fixes: #10985
Fixes: scylladb/scylla-enterprise#2539

Closes scylladb/scylladb#17196

* github.com:scylladb/scylladb:
  toolchain: support building an optimized clang
  configure.py: add --build-dir option
2024-04-18 19:20:23 +00:00
Anna Stuchlik
a3481a4566 doc: document the system_auth_v2 feature
This commit includes updates related to replacing system_auth with system_auth_v2.

- The keyspace name system_auth is renamed to system_auth_v2.
- The procedures are updated to account for system_auth_v2.
- No longer required system_auth RF changes are removed from procedures.
- The information is added that if the consistent topology updates feature
  was not enabled upon upgrade from 5.4, there are limitations or additional
  steps to do (depending on the procedure).
  The files with that kind of information are to be found in _common folders
  and included as needed.
- The upgrade guide has been updated to reflect system_auth_v2 and related impacts.

Closes scylladb/scylladb#18077
2024-04-18 18:33:49 +02:00
Kefu Chai
21b03d2ce3 topology_coordinator: remove unused variable
when compiling the tree with clang-19, it complains:

```
/home/kefu/dev/scylladb/service/topology_coordinator.cc:1968:31: error: variable 'reject' set but not used [-Werror,-Wunused-but-set-variable]
 1968 |                     if (auto* reject = std::get_if<join_node_response_params::rejected>(&validation_result)) {
      |                               ^
1 error generated.
```
so, despite that we evaluate the assignment statement to see it
evaluates to true or false, the compiler still believes that the
variable is not used. probably, the value of the statement is not
dependent on the value of the value being assigned. either way,
let's use `std::holds_alternative<..>` instead of `std::get_if<..>`,
to silence this warning, and the code is a little bit more compacted,
in the sense of less tokens in the `if` statement.

in order to be self-contained, we take the opportunity to
include `<variant>` in this source file, as a function declared
in this header is used.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18291
2024-04-18 18:04:56 +03:00
Amnon Heiman
e8410848a8 Update seastar submodule
This patch updates the seastar submodule to get the latest safety patch for the metric layer.
The latest patch allows manipulating metric_families early in the
start-up process and is safer if someone chooses to aggregate summaries.

* seastar f3058414...8fabb30a (4):
  > stall-analyser: improve stall pattern matching
  > TLS: Move background BYE handshake to engine::run_in_background
  > metrics.cc: Safer set_metric_family_configs
  > src/core/metrics.cc: handle SUMMARY add operator

Closes scylladb/scylladb#18293
2024-04-18 18:02:28 +03:00
Tomasz Grabiec
393cb54c01 Merge 'Generalize tablet transition API calls' from Pavel Emelyanov
Recently there had been added add_tablet_replica and del_tablet_replica API calls that copy big portion of the existing move_tablet API call's logic. This PR generalizes the common parts

Closes scylladb/scylladb#18272

* github.com:scylladb/scylladb:
  tablets: Generalize transition mutations preparation
  tablets: Generalize tablet-already-in-transition check
  tablets: Generalize raft communications for tablet transition API calls
  tablets: Drop src vs dst equality check from move_tablet()
2024-04-18 14:42:10 +02:00
Anna Stuchlik
ad81f9f56a doc: replace Scylla with ScyllaDB in Glossary
This commit replaces "Scylla" with "ScyllaDB" on the Glossary page.

The product has been rebranded as "ScyllaDB".

Closes scylladb/scylladb#18296
2024-04-18 14:59:23 +03:00
Kamil Braun
9c2a836607 Revert "Merge 'Drain view_builder in generic drain' from ScyllaDB"
This reverts commit 298a7fcbf2, reversing
changes made to 5cf53e670d.

The change made CI flaky.

Fixes: scylladb/scylladb#18278
2024-04-18 11:50:41 +02:00
Aleksandr Bykov
e8833c6f2a test: Kill coordinator during topology operation
If coordinator node was killed, restarted, become not
operatable during topology operation, new coordinator should be elected,
operation should be aborted and cluster should be rolled back

Error injection will be used to kill the coordinator before streaming
starts

Closes scylladb/scylladb#16197
2024-04-17 17:24:20 +02:00
Tomasz Grabiec
c6c8347493 migration_manager: Pull all of group0 state on repair
Current code uses non-raft path to pull the schema, which violates
group0 linearizability because the node will have latest schema but
miss group0 updates of other system tables. In particular,
system.tablets. This manifests as repair errors due to missing
tablet_map for a given table when trying to access it. Tablet map is
always created together with the table in the same group0 command.

When a node is bootstrapping, repair calls sync_schema() to make
sure local schema is up to date. This races with group0 catch up,
and if sync_schema() wins, repair may fail on misssing tablet map.

Fix by making sync_schema() do a group0 read barrier when in raft
mode.

Fixes #18002

Closes scylladb/scylladb#18175
2024-04-17 16:21:05 +02:00
Nadav Har'El
e78fc75323 Merge 'tools/scylla-nodetool: add doc link for getsstables and sstableinfo commands' from Botond Dénes
Just like all the other commands already have it. These commands didn't have documentation at the point where they were implemented, hence the missing doc link.

The links don't work yet, but they will work once we release 6.0 and the current master documentation is promoted to stable.

Closes scylladb/scylladb#18147

* github.com:scylladb/scylladb:
  tools/scylla-nodetool: fix typo: Fore -> For
  tools/scylla-nodetool: add doc link for getsstables and sstableinfo commands
2024-04-17 15:15:56 +03:00
Asias He
642f9a1966 repair: Improve estimated_partitions to reduce memory usage
Currently, we use the sum of the estimated_partitions from each
participant node as the estimated_partitions for sstable produced by
repair. This way, the estimated_partitions is the biggest possible
number of partitions repair would write.

Since repair will write only the difference between repair participant
nodes, using the biggest possible estimation will overestimate the
partitions written by repair, most of the time.

The problem is that overestimated partitions makes the bloom filter
consume more memory. It is observed that it causes OOM in the field.

This patch changes the estimation to use a fraction of the average
partitions per node instead of sum. It is still not a perfect estimation
but it already improves memory usage significantly.

Fixes #18140

Closes scylladb/scylladb#18141
2024-04-17 14:31:38 +03:00
Pavel Emelyanov
1b2cd56bcc tablets: Generalize transition mutations preparation
Tablet transition handlers prepare two mutations -- one for tablets
table, that sets transition state, transition mode and few others; and
another one for topology table that "activates" the tablet_migration
state for topology coordinator.

The latter is common to all three handlers.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-17 12:01:51 +03:00
Pavel Emelyanov
3beccb8165 tablets: Generalize tablet-already-in-transition check
Continuation of the previous patch -- there's a common sanity check of
tablet transition API handlers, namely that this tablet is not in
transition already.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-17 12:01:02 +03:00
Pavel Emelyanov
14923812ad tablets: Generalize raft communications for tablet transition API calls
There are three transition calls -- move, add replica and del replica --
and all three work similarly. In a loop they try to get guard for raft
operation, then perform sanity checks on topology state, then prepare
mutations and then try to apply them to raft. After the loop finishes
all three wait for transition for the given tablet to complete.

This patch generalizes the raft kicking loop and the transition
completion waiting code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-17 11:59:03 +03:00
Pavel Emelyanov
c4d538320e tablets: Drop src vs dst equality check from move_tablet()
The code here looks like this

    if src.host == dst.host
        throw "Local migration not possible"

    if src == dst
        co_return;

The 2nd check is apparently never satisfied -- if src == dst this means
that src.host == dst.host and it should have thrown already

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-17 11:57:10 +03:00
Kefu Chai
e431e7dc16 test: paritioner_test: print using fmt::print()
instead of using `operator<<`, use `fmt::print()` to
format and print, so we can ditch the `operator<<`-based formatters.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18259
2024-04-17 07:13:20 +03:00
Kefu Chai
0ff28b2a2a test: extract boost_test_print_type() into test_utils.hh
since Boost.Test relies on operator<< or `boost_test_print_type()`
to print the value of variables being compared, instead of defining
the fallback formatter of `boost_test_print_type()` for each
individual test, let's define it in `test/lib/test_utils.hh`, so
that it can be shared across tests.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18260
2024-04-17 07:12:39 +03:00
Kefu Chai
2bb8e7c3c3 utils: include "seastarx.hh" in composite_abort_source.hh
there is chance that `utils/small_vector.hh` does not include
`using namespace seastar`, and even if it does, we should not rely
on it. but if it does not, checkhh would fail. so let's include
"seastarx.hh" in this header, so it is self-contained.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18265
2024-04-17 07:11:01 +03:00
David Garcia
6707bc673c docs: update theme 1.7
Closes scylladb/scylladb#18252
2024-04-16 13:48:11 +02:00
Kamil Braun
eb9ba914a3 Merge 'Set dc and rack in gossiper when loaded from system.peers and load the ignored nodes state for replace' from Benny Halevy
The problem this series solves is correctly ignoring DOWN nodes state
when replacing a node.

When a node is replaced and there are other nodes that are down, the
replacing node is told to ignore those DOWN nodes using the
`ignore_dead_nodes_for_replace` option.

Since the replacing node is bootstrapping it starts with an empty
system.peers table so it has no notion about any node state and it
learns about all other nodes via gossip shadow round done in
`storage_service::prepare_replacement_info`.

Normally, since the DOWN nodes to ignore already joined the ring, the
remaining node will have their endpoint state already in gossip, but if
the whole cluster was restarted while those DOWN nodes did not start,
the remaining nodes will only have a partial endpoint state from them,
which is loaded from system.peers.

Currently, the partial endpoint state contains only `HOST_ID` and
`TOKENS`, and in particular it lacks `STATUS`, `DC`, and `RACK`.

The first part of this series loads also `DC` and `RACK` from
system.peers to make them available to the replacing node as they are
crucial for building a correct replication map with network topology
replication strategy.

But still, without a `STATUS` those nodes are not considered as normal
token owners yet, and they do not go through handle_state_normal which
adds them to the topology and token_metadata.

The second part of this series uses the endpoint state retrieved in the
gossip shadow round to explicitly add the ignored nodes' state to
topology (including dc and rack) and token_metadata (tokens) in
`prepare_replacement_info`.  If there are more DOWN nodes that are not
explicitly ignored replace will fail (as it should).

Fixes scylladb/scylladb#15787

Closes scylladb/scylladb#15788

* github.com:scylladb/scylladb:
  storage_service: join_token_ring: load ignored nodes state if replacing
  storage_service: replacement_info: return ignore_nodes state
  locator: host_id_or_endpoint: keep value as variant
  gms: endpoint_state: add getters for host_id, dc_rack, and tokens
  storage_service: topology_state_load: set local STATUS state using add_saved_endpoint
  gossiper: add_saved_endpoint: set dc and rack
  gossiper: add_saved_endpoint: fixup indentation
  gossiper: add_saved_endpoint: make host_id mandatory
  gossiper: add load_endpoint_state
  gossiper: start_gossiping: log local state
2024-04-16 10:27:36 +02:00
Pavel Emelyanov
2c3d6fe72f storage_proxy: Simplify create_hint_sync_point() code
It tries to call container().invoke_on_all() the hard way.
Calling it directly is not possible, because there's no
sharded::invoke_on_all() const overload

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18202
2024-04-16 07:26:06 +03:00
Nadav Har'El
a175e34375 cql-pytest: add instructions on how to get Cassandra
The cql-pytest framework allows running tests also against Cassandra,
but developers need to install Cassandra on their own because modern
distributions such as Fedora no longer carry a Cassandra package.

This patch adds clear and easy to follow (I think) instructions on how
to download a pre-compiled Cassadra, or alternatively how to download
and build Cassandra from source - and how either can be used with the
test/cql-pytest/run-cassandra script.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#18138
2024-04-16 07:23:36 +03:00
Botond Dénes
298a7fcbf2 Merge 'Drain view_builder in generic drain' from ScyllaDB
For view builder draining there's dedicated deferred action in main while all other services that need to be drained do it via storage_service. The latter is to unify shutdown for services and to make `nodetool drain` drain everything, not just some part of those. This PR makes view builder drain look the same. As a side effect it also moves `mark_existing_views_as_built` from storage service to view builder and generalizes this marking code inside view builder itself.

refs: #2737
refs: #2795

Closes scylladb/scylladb#16558

* github.com:scylladb/scylladb:
  storage_service: Drain view builder on drain too
  view_builder: Generalize mark_as_built(view_ptr) method
  view_builder: Move mark_existing_views_as_built from storage service
  storage_service: Add view_builder& reference
  main,cql_test_env: Move view_builder start up (and make unconditional)
2024-04-16 07:21:42 +03:00
Pavel Emelyanov
5cf53e670d replica: Remove unused ex variable from table::take_snapshot
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18215
2024-04-16 07:16:38 +03:00
Pavel Emelyanov
f17c594d21 large_data_handler: If-less statistics increment
The partitions_bigger_than_threshold is incremented only if the previous
check detects that the partition exceeds a threshold by its size. It's
done with an extra if, but it can be done without (explicit) condition
as bool type is guaranteed by the standard to convert into integers as
true = 1 and false = 0

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18217
2024-04-16 07:16:05 +03:00
Pavel Emelyanov
0f70d276d2 tools/scylla-sstable: Use shorter check is unordered_set contains a key
Currentl code counts the number of keys in it just to see if this number
is non-zero. Using .contains() method is better fit here

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18219
2024-04-16 07:14:48 +03:00
Pavel Emelyanov
1df7c2a0e9 topology_coordinator: Mark retake_node() const
Runaway from 4d83a8c12c

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18218
2024-04-16 07:13:07 +03:00
Pavel Emelyanov
05c4042511 api/lsa: Don't use database to perform invoke-on-all
The sharded<database> is used as a invoke_in_all() method provider,
there's no real need in database itself. Simple smp::invoke_on_all()
would work just as good.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18221
2024-04-16 07:12:40 +03:00
Pavel Emelyanov
4a6291dce5 test/sstable: Use .handle_exception_type() shortcut
Some tests want to ignore out_of_range exception in continuation and go
the longer route for that

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18216
2024-04-16 07:11:35 +03:00
Pavel Emelyanov
1612aa01ca cql3: Reserve vector with pk columns
When constructing a vector with partition key data, the size of that
vector is known beforehand

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18239
2024-04-16 07:06:07 +03:00
Pavel Emelyanov
f3edde7d2e api: Qualify callback commitlog* argument with const
There's a helper map-reducer that accepts a function to call on
commitlog. All callers accumulate statistics with it, so the commitlog
argument is const pointer.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18238
2024-04-16 07:02:31 +03:00