Commit Graph

42783 Commits

Author SHA1 Message Date
Benny Halevy
68544d5bb3 repair_service: debug stop
Seen the following unexplained assertion failure with
pytest -s -v --scylla-version=local_tarball --tablets repair_additional_test.py::TestRepairAdditional::test_repair_option_pr_multi_dc
```
INFO  2024-05-27 11:18:05,081 [shard 0:main] init - Shutting down repair service
INFO  2024-05-27 11:18:05,081 [shard 0:main] task_manager - Stopping module repair
INFO  2024-05-27 11:18:05,081 [shard 0:main] task_manager - Unregistered module repair
INFO  2024-05-27 11:18:05,081 [shard 1:main] task_manager - Stopping module repair
INFO  2024-05-27 11:18:05,081 [shard 1:main] task_manager - Unregistered module repair
scylla: repair/row_level.cc:3230: repair_service::~repair_service(): Assertion `_stopped' failed.
Aborting on shard 0.
Backtrace:
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3f040c
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x41c7a1
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x3dbaf
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x8e883
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x3dafd
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x2687e
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x2679a
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x36186
  0x26f2428
  0x10fb373
  0x10fc8b8
  0x10fc809
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x456c6d
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x456bcf
  0x10fc65b
  0x10fc5bc
  0x10808d0
  0x1080800
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3ff22f
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x4003b7
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3ff888
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x36dea8
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x36d0e2
  0x101cefa
  0x105a390
  0x101bde7
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x27b89
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x27c4a
  0x101a764
```

Decoded:
```
~repair_service at ./repair/row_level.cc:3230
~shared_ptr_count_for at ././seastar/include/seastar/core/shared_ptr.hh:491
 (inlined by) ~shared_ptr_count_for at ././seastar/include/seastar/core/shared_ptr.hh:491
~shared_ptr at ././seastar/include/seastar/core/shared_ptr.hh:569
 (inlined by) seastar::shared_ptr<repair_service>::operator=(seastar::shared_ptr<repair_service>&&) at ././seastar/include/seastar/core/shared_ptr.hh:582
 (inlined by) seastar::shared_ptr<repair_service>::operator=(decltype(nullptr)) at ././seastar/include/seastar/core/shared_ptr.hh:588
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:727
 (inlined by) seastar::future<void> seastar::futurize<seastar::future<void> >::invoke<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&>(seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&) at ././seastar/include/seastar/core/future.hh:2035
 (inlined by) seastar::futurize<std::invoke_result<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>::type>::type seastar::smp::submit_to<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::smp_submit_to_options, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) at ././seastar/include/seastar/core/smp.hh:367
seastar::futurize<std::invoke_result<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>::type>::type seastar::smp::submit_to<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) at ././seastar/include/seastar/core/smp.hh:394
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:725
 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>(std::__invoke_other, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>, seastar::future<void> >::type std::__invoke_r<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>(seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:114
 (inlined by) std::_Function_handler<seastar::future<void> (unsigned int), seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}>::_M_invoke(std::_Any_data const&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:290
```

FWIW, gdb crashed when opening the coredump.

This commit will help catch the issue earlier
when repair_service::stop() fails (and it must never fail)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 38845754c4)
2024-05-28 17:07:26 +00:00
Raphael S. Carvalho
13f8486cd7 replica: Fix tablet's compaction_groups_for_token_range() with unowned range
File-based tablet streaming calls every shard to return data of every
group that intersects with a given range.
After dynamic group allocation, that breaks as the tablet range will
only be present in a single shard, so an exception is thrown causing
migration to halt during streaming phase.
Ideally, only one shard is invoked, but that's out of the scope of this
fix and compaction_groups_for_token_range() should return empty result
if none of the local groups intersect with the range.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit eb8ef38543)

Closes scylladb/scylladb#18859
2024-05-27 15:20:04 +03:00
Kefu Chai
747ffd8776 migration_manager: do not reference moved-away smart pointer
this change is inspired by clang-tidy. it warns like:
```
[752/852] Building CXX object service/CMakeFiles/service.dir/migration_manager.cc.o
Warning: /home/runner/work/scylladb/scylladb/service/migration_manager.cc:891:71: warning: 'view' used after it was moved [bugprone-use-after-move]
  891 |             db.get_notifier().before_create_column_family(*keyspace, *view, mutations, ts);
      |                                                                       ^
/home/runner/work/scylladb/scylladb/service/migration_manager.cc:886:86: note: move occurred here
  886 |             auto mutations = db::schema_tables::make_create_view_mutations(keyspace, std::move(view), ts);
      |                                                                                      ^
```
in which,  `view` is an instance of view_ptr which is a type with the
semantics of shared pointer, it's backed by a member variable of
`seastar::lw_shared_ptr<const schema>`, whose move-ctor actually resets
the original instance. so we are actually accessing the moved-away
pointer in

```c++
db.get_notifier().before_create_column_family(*keyspace, *view, mutations, ts)
```

so, in this change, instead of moving away from `view`, we create
a copy, and pass the copy to
`db::schema_tables::make_create_view_mutations()`. this should be fine,
as the behavior of `db::schema_tables::make_create_view_mutations()`
does not rely on if the `view` passed to it is a moved away from it or not.

the change which introduced this use-after-move was 88a5ddabce

Refs 88a5ddabce
Fixes #18837
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
(cherry picked from commit 125464f2d9)

Closes scylladb/scylladb#18873
2024-05-27 15:18:29 +03:00
Anna Stuchlik
a87683c7be doc: remove outdated MV error from Troubleshooting
This commit removes the MV error message, which only
affect older versions of ScyllaDB, from the Troubleshooting section.

Fixes https://github.com/scylladb/scylladb/issues/17205

(cherry picked from commit 92bc8053e2)

Closes scylladb/scylladb#18855
2024-05-27 15:12:22 +03:00
Anna Stuchlik
eff7b0d42d doc: replace Raft-disabled with Raft-enabled procedure
This commit fixes the incorrect Raft-related information on the Handling Cluster Membership Change Failures page
introduced with https://github.com/scylladb/scylladb/pull/17500.

The page describes the procedure for when Raft is disabled. Since 6.0, Raft for consistent schema management
is enabled and mandatory (cannot be disabled), this commit adds the procedure for Raft-enabled setups.

(cherry picked from commit 6626d72520)

Closes scylladb/scylladb#18858
2024-05-27 15:11:09 +03:00
David Garcia
7dbcfe5a39 docs: docs: autogenerate metrics
Autogenerates metrics documentation using the scripts/get_description.py script introduced in #17479

docs: add beta
(cherry picked from commit 9eef3d6139)

Closes scylladb/scylladb#18857
2024-05-27 15:10:48 +03:00
Jenkins Promoter
d078bafa00 Update ScyllaDB version to: 6.0.0-rc1 2024-05-23 15:35:32 +03:00
Yaron Kaikov
1b4d5d02ef Update ScyllaDB version to: 6.0.0-rc0 scylla-6.0.0-rc0 scylla-6.0.0-rc0-candidate 2024-05-22 14:07:45 +03:00
Anna Stuchlik
a86fb293fe doc: update Raft information in 6.0
This commit updates the documentation about Raft in version 6.0.

- "Introduction": The outdated information about consistent topology updates not being supported
  is removed and replaced with the correct information.
- "Enabling Raft": The relevant information is moved to other sections. The irrelevant information
   is removed. The section no longer exists.
- "Verifying that the Raft upgrade procedure finished successfully" - moved under Schema
   (in the same document). I additionally removed the include saying that after you verify
   that schema on Raft is enabled, you MUST enable topology changes on Raft (it is not mandatory;
   also, it should be part of the upgrade guide, not the Raft document).
- Unnecessary or incorrect references to versions are removed.

Refs https://github.com/scylladb/scylladb/issues/18580

Closes scylladb/scylladb#18689
2024-05-21 11:45:36 +02:00
Anna Stuchlik
eefa4a7333 doc: replace 5.4-to-5.5 with 5.4-to-6.0 upgrade guide
This commit replaces the 5.4-to-5.5 upgrade guide with the 5.4-to-6.0 upgrade guide,
including the metrics update information.

The guide references the "Enable Consistent Topology Updates" document,
as enabling consistent topology updates is a new step when upgrading to version 6.0.

Also, a procedure for image upgrades has been added (as verified by @yaronkaikov).

Fixes scylladb/scylladb#18254
Fixes scylladb/scylladb#17896
Refs scylladb/scylladb#18580

Closes scylladb/scylladb#18728
2024-05-21 11:31:04 +02:00
Piotr Dulikowski
9820472277 main: introduce schema commitlog scheduling group
Currently, we do not explicitly set a scheduling group for the schema
commitlog which causes it to run in the default scheduling group (called
"main"). However:

- It is important and significant enough that it should run in a
  scheduling group that is separate from the main one,
- It should not run in the existing "commitlog" group as user writes may
  sometimes need to wait for schema commitlog writes (e.g. read barrier
  done to learn the schema necessary to interpret the user write) and we
  want to avoid priority inversion issues.

Therefore, introduce a new scheduling group dedicated to the schema
commitlog.

Fixes: scylladb/scylladb#15566

Closes scylladb/scylladb#18715
2024-05-21 11:29:57 +02:00
Kefu Chai
5db315930e sstables: fix a typo in comment: s/Mimicks/Mimics/
this typo was identified by the codespell workflow

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18781
2024-05-21 12:14:10 +03:00
Nadav Har'El
dcd26d8a16 Merge 'docs: update isolation.md' from Botond Dénes
Update `docs/dev/isolation.d`:
* Update the list of scheduling groups
* Remove IO priority groups (they were folded into scheduling groups)
* Add section on RPC isolation

Closes scylladb/scylladb#18749

* github.com:scylladb/scylladb:
  docs: isolation.md: add section on RPC call isolation
  docs: isolation.md: remove mention of IO priority groups
  docs: isolation.md: update scheduling group list, add aliases
2024-05-21 11:46:57 +03:00
Kefu Chai
44e85c7d79 build: "undo" the coverage compiling options added to abseil
we are not interseted in the code coverage of abseil library, so no need
to apply the compiling options enabling the coverage instrumentation
when building the abseil library.

moreover, since the path of the file passed to `-fprofile-list` is a relative
path. when building with coverage enabled, the build fails when building
abseil, like:

```
 /usr/lib64/ccache/clang++  -I/jenkins/workspace/scylla-master/scylla-ci/scylla/abseil -std=c++20 -I/jenkins/workspace/scylla-master/scylla-ci/scylla/seastar/include -I/jenkins/workspace/scylla-master/scylla-ci/scylla/build/debug/seastar/gen/include -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -DSEASTAR_API_LEVEL=7 -DSEASTAR_BUILD_SHARED_LIBS -DSEASTAR_SSTRING -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_DEBUG -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEBUG_PROMISE -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_TYPE_ERASE_MORE -DBOOST_NO_CXX98_FUNCTION_BASE -DFMT_SHARED -I/usr/include/p11-kit-1 -fprofile-instr-generate -fcoverage-mapping -fprofile-list=./coverage_sources.list -std=gnu++20 -Wall -Wextra -Wcast-qual -Wconversion -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wfor-loop-analysis -Wformat-security -Wgnu-redeclared-enum -Winfinite-recursion -Winvalid-constexpr -Wliteral-conversion -Wmissing-declarations -Woverlength-strings -Wpointer-arith -Wself-assign -Wshadow-all -Wshorten-64-to-32 -Wsign-conversion -Wstring-conversion -Wtautological-overlap-compare -Wtautological-unsigned-zero-compare -Wundef -Wuninitialized -Wunreachable-code -Wunused-comparison -Wunused-local-typedefs -Wunused-result -Wvla -Wwrite-strings -Wno-float-conversion -Wno-implicit-float-conversion -Wno-implicit-int-float-conversion -Wno-unknown-warning-option -DNOMINMAX -MD -MT absl/strings/CMakeFiles/strings.dir/str_cat.cc.o -MF absl/strings/CMakeFiles/strings.dir/str_cat.cc.o.d -o absl/strings/CMakeFiles/strings.dir/str_cat.cc.o -c /jenkins/workspace/scylla-master/scylla-ci/scylla/abseil/absl/strings/str_cat.cc
clang-16: error: no such file or directory: './coverage_sources.list'`
```

in this change, we just remove the compiling options enabling the
coverage instrumentation from the cflags when building abseil.

Fixes #18686
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18748
2024-05-21 11:43:16 +03:00
Botond Dénes
11fa79a537 docs: isolation.md: add section on RPC call isolation 2024-05-21 03:12:22 -04:00
Kefu Chai
86b988a70b test/lib: do not use variable which could be moved away
C++ standard does not define the order in which the parameters
passed to a function are evaluated. so in theory, in
```c++
reusable_sst(sst->get_schema(), std::move(sst));
```
`std::move(sst)` could be evaluated before `sst->get_schema`.
but please note, `std::move(sst)` does not move `sst`
away, it merely cast `sst` to a rvalue reference, it is
`reusable_sst()` which *could* move `sst` away by
consuming it. so following call is much more dangerous
than the above one:
```c++
reusable_sst(sst->get_schema(), modify_sst(std::move(sst)))
```
nevertheless, this usage is still confusing. so instead
of passing a copy of `sst` to `reusable_sst`.

this change is inspired by clang-tidy, it warns like:

```
Warning: /home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:25: warning: 'sst' used after it was moved [bugprone-use-after-move]
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |                         ^
/home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:44: note: move occurred here
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |                                            ^
/home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:25: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |
```

per the analysis above, this is a false alarm.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18775
2024-05-21 10:02:10 +03:00
Pavel Emelyanov
428e0bd7d4 locator: Remove unused lshift-operator for topology
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18714
2024-05-21 09:46:30 +03:00
Pavel Emelyanov
b24fb8dc87 inet_address: Remove to_sstring() in favor of fmt::to_string
The existing inet_address::to_string() calls fmt::format("{}", *this)
anyway. However, the to_string() method is declared in .cc file, while
form formatter is in the header and is equipeed with constexprs so
that converting an address to string is done as much as possible
compile-time.

Also, though minor, fmt::to_string(foo) is believed to be even faster
than fmt::format("{}", foo).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18712
2024-05-21 09:43:08 +03:00
Kefu Chai
b6e2d6868b build: add dependencies from binaries to abseil libraries
in 0b0e661a, we brought abseil submodule back. but we didn't update
the build.ninja rules properly -- we should have add the abseil
libraries to the dependencies of the binaries so that the abseil
libraries are always generated before a certain binary is built.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18753
2024-05-21 08:50:48 +03:00
Avi Kivity
33ec6ccea9 test: boost: chunked_vector_test: include <optional>
std::optional is used but not imported. This fails on libstdc++-14.

Closes scylladb/scylladb#18739
2024-05-21 07:37:11 +03:00
Yaron Kaikov
bc596a3e76 pull_request_template: clearify the template and remove checkbox verification
It seems that having the checkbox in the PR template and failing the action is confusing and not very clear. Let's remove it completely and just add to the template an explanation to explain the backport reason

Closes scylladb/scylladb#18708
2024-05-20 18:24:28 +03:00
Botond Dénes
f239339a29 Merge 'Improve modularity of some per-table API endpoints' from Pavel Emelyanov
There's a set of API endpoints that toggle per-table auto-compaction and tombstone-gc booleans. They all live in two different .cc files under api/ directory and duplicate code of each other. This PR generalizes those handlers, places them next to each other, fixes leak on stop and, as a nice side effect, enlightens database.hh header.

Closes scylladb/scylladb#18703

* github.com:scylladb/scylladb:
  api,database: Move auto-compaction toggle guard
  api: Move some table manipulation helpers from storage_service
  api: Move table-related calls from storage_service domain
  api: Reimplement some endpoints using existing helpers
  api: Lost unset of tombstone-gc endpoints
2024-05-20 18:01:54 +03:00
Avi Kivity
61505d057e Merge 'Sort user-defined types in describe statements' from Michał Jadwiszczak
User-defined types can depend on each other, creating directed acyclic graph.

In order to support restoring schema from `DESC SCHEMA`, UDTs should be
ordered topologically, not alphabetically as it was till now.

This patch changes the way UDTs are ordered in `DESC SCHEMA`/`DESC KEYSPACE <ks>` statements, so the output can be safely copy-pasted to restore the schema.

Fixes #18539

Closes scylladb/scylladb#18302

* github.com:scylladb/scylladb:
  test/cql-pytest/test_describe: add test for UDTs ordering
  cql3/statements/describe_statement: UDTs topological sorting
  cql3/statements/describe_statement: allow to skip alphabetical sorting
  types: add a method to get all referenced user types
  db/cql_type_parser: use generic topological sorting
  db/cql_type_parses: futurize raw_builder::build()
  test/boost: add test for topological sorting
  utils: introduce generic topological sorting algorithm
2024-05-20 16:58:17 +03:00
Pavel Emelyanov
159e44d08a test.py: Make it possible to avoid wildcard test names matching
There's a nasty scenario when this searching plays bad joke.

When CI picks up a new branch and notices, that a test had changed, it
spawns a custom job with test.py --repeat 100 $changed_test_name in
it. Next, when the test.py tries opt-in test name matching, it uses the
wildcard search and can pick up extra unwanted tests into the run.

To solve this, the case-selection syntax is extended. Now if the caller
specifies `suite/test::*` as test, the test file is selected by exact
name match, but the specific test-case is not selected, the `*` makes it
run all cases.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18704
2024-05-20 15:50:47 +02:00
Botond Dénes
e1c4e6c151 Merge 'sstables_manager: use maintenance scheduling group to run components reload fiber' from Lakshmi Narayanan Sreethar
PR https://github.com/scylladb/scylladb/pull/18186 introduced a fiber that reloads reclaimed bloom filters when memory becomes available. Use maintenance scheduling group to run that fiber instead of running it in the main scheduling group.

Fixes #18675

Closes scylladb/scylladb#18721

* github.com:scylladb/scylladb:
  sstables_manager: use maintenance scheduling group to run components reload fiber
  sstables_manager: add member to store maintenance scheduling group
2024-05-20 16:38:42 +03:00
Takuya ASADA
33af97ca5a dist/docker: revert dropping systemd package
On 7ce6962141 we dropped openssh-server,
it also dropped systemd package and caused an error on Scylla Operator
(#17787).

This reverts dropping systemd package and fix the issue.

Fix #17787

Closes scylladb/scylladb#18643
2024-05-20 16:38:15 +03:00
Andrei Chekun
bce53efd36 Enrich test results produced by test.py
This PR resolves issue with double count of the test result for topology tests. It will not appear in the consolidated report anymore.
Another fix is to provide a better view which test failed by modifying the test case name in the report enriching it with mode and run id, so making them unique across the run.

The scope of this change is:
1. Modify the test name to have run id in name
2. Add handlers to get logs of test.py and pytest in one file that are related to test, rather than to the full suite
3. Remove topology tests from aggregating them on a suite level in Junit results
4. Add a link to the logs related to the failed tests in Junit results, so it will be easier to navigate to all logs related to test
5. Gather logs related to the failed test to one directory for better logs investigation

Ref: scylladb/scylladb#17851

Closes scylladb/scylladb#18277
2024-05-20 15:33:57 +02:00
Avi Kivity
52fe351c31 Merge 'Balance tablets within nodes (intra-node migration)' from Tomasz Grabiec
This is needed to avoid severe imbalance between shards which can
happen when some table grows and is split. The inter-node balance can
be equal, so inter-node migration cannot fix the imbalance. Also, if RF=N
then there is not even a possibility of moving tablets around to fix the imbalance.
The only way to bring the system to balance is to move tablets within the nodes.

The system is not prepared for intra-node migration currently. Request coordination
is host-based, while for intra-node migration it should be (also) shard-based.
The solution employed here is to keep the coordination between nodes as-is,
and for intra-node migration storage_proxy-level coordinator is not aware of
the migration (no pending host). The replica-side request handler will be a
second-level coordinator which routes requests to shards, similar to how
the first-level coordinator routes them to hosts.

Tablet sharder is adjusted to handle intra-migration where a tablet
can have two replicas on the same host. For reads, sharder uses the
read selector to resolve the conflict. For writes, the write selector
is used.

The old shard_of() API is kept to represent shard for reads, and new
method is introduced to query the shards for writing:
shard_for_writes(). All writers should be switched to that API, which
is not done in this patch yet.

The request handler on replica side acts as a second-level
coordinator, using sharder to determine routing to shards. A given
sharder has a scope of a single topology version, a single
effective_replication_map_ptr, which should be kept alive during
writes.

perf-simple-query test results show no signs of regression:

Command: perf-simple-query -c1 -m1G --write --tablets --duration=10

Before:

> 83294.81 tps ( 59.5 allocs/op,  14.3 tasks/op,   53725 insns/op,        0 errors)
> 87756.72 tps ( 59.5 allocs/op,  14.3 tasks/op,   54049 insns/op,        0 errors)
> 86428.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54208 insns/op,        0 errors)
> 86211.38 tps ( 59.7 allocs/op,  14.3 tasks/op,   54219 insns/op,        0 errors)
> 86559.89 tps ( 59.6 allocs/op,  14.3 tasks/op,   54188 insns/op,        0 errors)
> 86609.39 tps ( 59.6 allocs/op,  14.3 tasks/op,   54117 insns/op,        0 errors)
> 87464.06 tps ( 59.5 allocs/op,  14.3 tasks/op,   54039 insns/op,        0 errors)
> 86185.43 tps ( 59.6 allocs/op,  14.3 tasks/op,   54169 insns/op,        0 errors)
> 86254.71 tps ( 59.6 allocs/op,  14.3 tasks/op,   54139 insns/op,        0 errors)
> 83395.35 tps ( 60.2 allocs/op,  14.4 tasks/op,   54693 insns/op,        0 errors)
>
> median 86428.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54208 insns/op,        0 errors)
> median absolute deviation: 243.04
> maximum: 87756.72
> minimum: 83294.81
>

After:

> 85523.06 tps ( 59.5 allocs/op,  14.3 tasks/op,   53872 insns/op,        0 errors)
> 89362.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54226 insns/op,        0 errors)
> 88167.55 tps ( 59.7 allocs/op,  14.3 tasks/op,   54400 insns/op,        0 errors)
> 87044.40 tps ( 59.7 allocs/op,  14.3 tasks/op,   54310 insns/op,        0 errors)
> 88344.50 tps ( 59.6 allocs/op,  14.3 tasks/op,   54289 insns/op,        0 errors)
> 88355.06 tps ( 59.6 allocs/op,  14.3 tasks/op,   54242 insns/op,        0 errors)
> 88725.46 tps ( 59.6 allocs/op,  14.3 tasks/op,   54230 insns/op,        0 errors)
> 88640.08 tps ( 59.6 allocs/op,  14.3 tasks/op,   54210 insns/op,        0 errors)
> 90306.31 tps ( 59.4 allocs/op,  14.3 tasks/op,   54043 insns/op,        0 errors)
> 87343.62 tps ( 59.8 allocs/op,  14.3 tasks/op,   54496 insns/op,        0 errors)
>
> median 88355.06 tps ( 59.6 allocs/op,  14.3 tasks/op,   54242 insns/op,        0 errors)
> median absolute deviation: 1007.41
> maximum: 90306.31
> minimum: 85523.06

Command (reads): perf-simple-query -c1 -m1G  --tablets --duration=10

Before:

> 95860.18 tps ( 63.1 allocs/op,  14.1 tasks/op,   42476 insns/op,        0 errors)
> 97537.69 tps ( 63.1 allocs/op,  14.1 tasks/op,   42454 insns/op,        0 errors)
> 97549.23 tps ( 63.1 allocs/op,  14.1 tasks/op,   42470 insns/op,        0 errors)
> 97511.29 tps ( 63.1 allocs/op,  14.1 tasks/op,   42470 insns/op,        0 errors)
> 97227.32 tps ( 63.1 allocs/op,  14.1 tasks/op,   42471 insns/op,        0 errors)
> 94031.94 tps ( 63.1 allocs/op,  14.1 tasks/op,   42441 insns/op,        0 errors)
> 96978.04 tps ( 63.1 allocs/op,  14.1 tasks/op,   42462 insns/op,        0 errors)
> 96401.70 tps ( 63.1 allocs/op,  14.1 tasks/op,   42473 insns/op,        0 errors)
> 96573.77 tps ( 63.1 allocs/op,  14.1 tasks/op,   42440 insns/op,        0 errors)
> 96340.54 tps ( 63.1 allocs/op,  14.1 tasks/op,   42468 insns/op,        0 errors)
>
> median 96978.04 tps ( 63.1 allocs/op,  14.1 tasks/op,   42462 insns/op,        0 errors)
> median absolute deviation: 571.20
> maximum: 97549.23
> minimum: 94031.94
>

After:

> 99794.67 tps ( 63.1 allocs/op,  14.1 tasks/op,   42471 insns/op,        0 errors)
> 101244.99 tps ( 63.1 allocs/op,  14.1 tasks/op,   42472 insns/op,        0 errors)
> 101128.37 tps ( 63.1 allocs/op,  14.1 tasks/op,   42485 insns/op,        0 errors)
> 101065.27 tps ( 63.1 allocs/op,  14.1 tasks/op,   42465 insns/op,        0 errors)
> 101212.98 tps ( 63.1 allocs/op,  14.1 tasks/op,   42456 insns/op,        0 errors)
> 101413.31 tps ( 63.1 allocs/op,  14.1 tasks/op,   42463 insns/op,        0 errors)
> 101464.92 tps ( 63.1 allocs/op,  14.1 tasks/op,   42466 insns/op,        0 errors)
> 101086.74 tps ( 63.1 allocs/op,  14.1 tasks/op,   42488 insns/op,        0 errors)
> 101559.09 tps ( 63.1 allocs/op,  14.1 tasks/op,   42468 insns/op,        0 errors)
> 100742.58 tps ( 63.1 allocs/op,  14.1 tasks/op,   42491 insns/op,        0 errors)
>
> median 101212.98 tps ( 63.1 allocs/op,  14.1 tasks/op,   42456 insns/op,        0 errors)
> median absolute deviation: 200.33
> maximum: 101559.09
> minimum: 99794.67
>

Fixes #16594

Closes scylladb/scylladb#18026

* github.com:scylladb/scylladb:
  Implement fast streaming for intra-node migration
  test: tablets_test: Test sharding during intra-node migration
  test: tablets_test: Check sharding also on the pending host
  test: py: tablets: Test writes concurrent with migration
  test: py: tablets: Test crash during intra-node migration
  api, storage_service: Introduce API to wait for topology to quiesce
  dht, replica: Remove deprecated sharder APIs
  test: Avoid using deprecated sharded API
  db: do_apply_many() avoid deprecated sharded API
  replica: mutation_dump: Avoid deprecated sharder API
  repair: Avoid deprecated sharder API
  table: Remove optimization which returns empty reader when key is not owned by the shard
  dht: is_single_shard: Avoid deprecated sharder API
  dht: split_range_to_single_shard: Work with static_sharder only
  dht: ring_position_range_sharder: Avoid deprecated sharder APIs
  dht: token: Avoid use of deprecated sharder API by switching to static_sharder
  selective_token_sharder: Avoid use of deprecated sharder API
  docs: Document tablet sharding vs tablet replica placement
  readers/multishard.cc: use shard_for_reads() instead of shard_of()
  multishard_mutation_query.cc: use shard_for_reads() instead of shard_of()
  storage_proxy: Extract common code to apply mutations on many shards according to sharder
  storage_proxy: Prepare per-partition rate-limiting for intra-node migration
  storage_proxy: Avoid shard_of() use in mutate_counter_on_leader_and_replicate()
  storage_proxy: Prepare mutate_hint() for intra-node tablet migration
  commitlog_replayer: Avoid deprecated sharder::shard_of()
  lwt: Avoid deprecated sharder::shard_of()
  compaction: Avoid deprecated sharder::shard_of()
  dht: Extract dht::static_sharder
  replica: Deprecate table::shard_of()
  locator: Deprecate effective_replication_map::shard_of()
  dht: Deprecate old sharder API: shard_of/next_shard/token_for_next_shard
  tests: tablets: py: Add intra-node migration test
  tests: tablets: Test that drained nodes are not balanced internally
  tests: tablets: Add checks of replica set validity to test_load_balancing_with_random_load
  tests: tablets: Verify that disabling balancing results in no intra-node migrations
  tests: tablets: Check that nodes are internally balanced
  tests: tablets: Improve debuggability by showing which rows are missing
  tablets, storage_service: Support intra-node migration in move_tablet() API
  tablet_allocator: Generate intra-node migration plan
  tablet_allocator: Extract make_internode_plan()
  tablet_allocator: Maintain candidate list and shard tablet count for target nodes
  tablet_allocator: Lift apply_load/can_accept_load lambdas to member functions
  tablets, streaming: Implement tablet streaming for intra-node migration
  dht, auto_refreshing_sharder: Allow overriding write selector
  multishard_writer: Handle intra-node migration
  storage_proxy: Handle intra-node tablet migration for writes
  tablets: Get rid of tablet_map::get_shard()
  tablets: Avoid tablet_map::get_shard in cleanup
  tablets: test: Use sharder instead of tablet_map::get_shard()
  tablets: tablet_sharder: Allow working with non-local host
  sharding: Prepare for intra-node-migration
  docs: Document sharder use for tablets
  tablets: Introduce tablet transition kind for intra-node migration
  tests: tablets: Fix use-after-move of skiplist in rebalance_tablets()
  sstables, gdb: Track readers in a linked list
  raft topology: Fix global token metadata barrier to not fence ahead of what is drained
2024-05-20 16:13:01 +03:00
Kefu Chai
a517fcf970 service/storage_proxy: capture tr_state by copy in handle_paxos_accept()
this change is inspired by following warning from clang-tidy

```
Warning: /home/runner/work/scylladb/scylladb/service/storage_proxy.cc:884:13: warning: 'tr_state' used after it was moved [bugprone-use-after-move]
  884 |         if (tr_state) {
      |             ^
/home/runner/work/scylladb/scylladb/service/storage_proxy.cc:872:139: note: move occurred here
  872 |         auto f = get_schema_for_read(proposal.update.schema_version(), src_addr, *timeout).then([&sp = _sp, &sys_ks = _sys_ks, tr_state = std::move(tr_state),
      |                                                                                                                                           ^
```

this is not a false positive. as `tr_state` is a captured by move for
constructing a variable in the captured list of a lambda which is in
turn passed to the expression evaluated to `f`.

even the expression itself is not evaluated yet when we reference
`tr_state` to check if it is empty after preparing the expression,
`tr_state` is already moved away into the captured variable. so
at that moment, the statement of `f = f.finally(...)` is never
evaluated, because `tr_state` is always empty by then.

so before this change, the trace message is never recorded.

in this change, we address this issue by capturing `tr_state` by
copying it. as `tr_state` is backed by a `lw_shared_ptr`, the overhead is
neglectable.

after this change, the tracing message is recorded.

the change introduced this issue was 548767f91e.

please note, we could coroutinize this function to improve its
readability, but since this is a fix and should be backported,
let's start with a minimal fix, and worry about the readability
in a follow-up change.

Refs 548767f91e
Fixes #18725
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18702
2024-05-20 12:58:49 +03:00
Kefu Chai
40ce52c3cc test: use generic boost_test_print_type()
in this change, we trade the `boost_test_print_type()` overloads
for the generic template of `boost_test_print_type()`, except for
those in the very small tests, which presumably want to keep
themselves relative self-contained.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18727
2024-05-20 12:56:20 +03:00
Botond Dénes
0e23cd45ad Merge 'feature: grandfather some old cluster features' from Avi Kivity
This series grandfathers the following features:

  MD_SSTABLE_FORMAT
  ME_SSTABLE feature
  VIEW_VIRTUAL_COLUMNS
  DIGEST_INSENSITIVE_TO_EXPIRY
  CDC
  NONFROZEN_UDTS
  PER_TABLE_PARTITIONERS
  PER_TABLE_CACHING
  DIGEST_FOR_NULL_VALUES
  CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX

Note that for the last (CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX) some code remains to support indexes created before the new feature was adopted.

Each patch names the version where the feature was introduced.

Closes scylladb/scylladb#18428

* github.com:scylladb/scylladb:
  feature, index: grandfather CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX
  feature: grandfather DIGEST_FOR_NULL_VALUES
  storage_proxy: drop use of MD5 as a digest algorithm
  feature: grandfather PER_TABLE_CACHING
  feature: grandfather LWT
  feature: grandfather HINTED_HANDOFF_SEPARATE_CONNECTION
  feature: grandfather PER_TABLE_PARTITIONERS
  test: schema_change_test: regenerate digest for PER_TABLE_PARTITIONERS
  test: test_schema_change_digest: drop unneeded reference digests
  feature: grandfather NONFROZEN_UDTS
  feature: grandfather CDC
  feature: grandfather DIGEST_INSENSITIVE_TO_EXPIRY
  feature: grandfather VIEW_VIRTUAL_COLUMNS
  feature: grandfather ME_SSTABLE feature
  feature: grandfather MD_SSTABLE_FORMAT
2024-05-20 11:48:07 +03:00
Botond Dénes
936a7e282b docs: isolation.md: remove mention of IO priority groups
They were folded into CPU scheduling groups, which now apply to both CPU
and IO.
2024-05-20 03:33:24 -04:00
Botond Dénes
8f61468322 docs: isolation.md: update scheduling group list, add aliases 2024-05-20 03:30:04 -04:00
Lakshmi Narayanan Sreethar
6f58768c46 sstables_manager: use maintenance scheduling group to run components reload fiber
Fixes #18675

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-19 15:23:45 +05:30
Lakshmi Narayanan Sreethar
79f6746298 sstables_manager: add member to store maintenance scheduling group
Store that maintenance scheduling group inside the sstables_manager. The
next patch will use this to run the components reloader fiber.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-19 15:23:45 +05:30
Avi Kivity
54a82fed6b feature, index: grandfather CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX
This feature corrected how we store the token in secondary indexes. It
was introduced in 7ff72b0ba5 (2020; 4.4) and can now be assumed present
everywhere. Note that we still support indexes created with the old format.
2024-05-18 00:24:11 +03:00
Avi Kivity
2fbd78c769 feature: grandfather DIGEST_FOR_NULL_VALUES
The DIGEST_FOR_NULL_VALUES feature was added in 21a77612b3 (2020; 4.4)
and can now be assumed to be always present. The hasher which it invoked
is removed.
2024-05-18 00:24:00 +03:00
Avi Kivity
879583c489 storage_proxy: drop use of MD5 as a digest algorithm
The XXHASH feature was introduced in 0bab3e59c2 (2017; 2.2) and made
mandatory in defe6f49df (2020; 4.4), but some vestiges remain.
Remove them now. Note that md5_hasher itself is still in use by
other components, so it cannot be removed.
2024-05-18 00:23:47 +03:00
Avi Kivity
7c264e8a71 feature: grandfather PER_TABLE_CACHING
The PER_TABLE_CACHING feature was added in 0475dab359 (2020; 4.2)
and can now be assumed to be always present.
2024-05-18 00:23:30 +03:00
Avi Kivity
d52c424a5f feature: grandfather LWT
LWT was make non-experimental in 9948f548a5 (2020; 4.1) and can now be
assumed to be always present.
2024-05-18 00:20:53 +03:00
Avi Kivity
93088d0921 feature: grandfather HINTED_HANDOFF_SEPARATE_CONNECTION
The HINTED_HANDOFF_SEPARATE_CONNECTION feature was introduced in 3a46b1bb2b (2019; 3.3)
and can be assumed always present.
2024-05-18 00:18:27 +03:00
Avi Kivity
3bead8cea0 feature: grandfather PER_TABLE_PARTITIONERS
The PER_TABLE_PARTITIONERS feature was added in 90df9a44ce (2020; 4.0)
and can now be assumed to be always present. We also remove the associated
schema_feature.
2024-05-18 00:15:07 +03:00
Avi Kivity
6b532fd40b test: schema_change_test: regenerate digest for PER_TABLE_PARTITIONERS
The first digest tested was generated without the PER_TABLE_PARTITIONERS
schema feature. We're about to make that feature mandatory, so we won't
be able (and won't need) to generate a digest without it.

Update the digest to include the feature. Note it wasn't untested before,
we have a test with schema_features::full().
2024-05-18 00:14:43 +03:00
Avi Kivity
c4d8b17f4c test: test_schema_change_digest: drop unneeded reference digests
digests[0] was used by the VIEW_VIRTUAL_COLUMNS feature, which
no longer exists.

digests[1] is the same as digests[2], so drop it.
2024-05-17 20:41:20 +03:00
Avi Kivity
93113da01b feature: grandfather NONFROZEN_UDTS
The NONFROZEN_UDTS feature was added in e74b5deb5d (2019; 3.2)
and can now be assumed to be always present.
2024-05-17 20:41:20 +03:00
Avi Kivity
c7d7ca2c23 feature: grandfather CDC
The CDC feature was made non-experimental in e9072542c1 (2020; 4.4)
and can now be assumed to be always present. We also remove the corresponding
schema_feature.
2024-05-17 20:41:20 +03:00
Avi Kivity
82ad2913ca feature: grandfather DIGEST_INSENSITIVE_TO_EXPIRY
The DIGEST_INSENSITIVE_TO_EXPIRY feature was added in 9de071d214 (2019; 3.2)
and can now be assumed to be always present. We enable the corresponding
schema_feature unconditionally.

We do not remove the corresponding schema feature, because it can be disabled
when the related TABLE_DIGEST_INSENSITIVE_TO_EXPIRY is present.
2024-05-17 20:41:19 +03:00
Avi Kivity
b5f6021a6b feature: grandfather VIEW_VIRTUAL_COLUMNS
The VIEW_VIRTUAL_COLUMNS feature was added in a108df09f9 (2019; 3.1)
and can now be assumed to be always present.

The corresponding schema_feature is removed. Note schema_features are not sent
over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.
2024-05-17 20:41:19 +03:00
Avi Kivity
7952200c8c feature: grandfather ME_SSTABLE feature
"me" format sstables were introduced in d370558279 (Jan 2022; 5.1)
and so can be assumed always present. The listener that checks when
the cluster understands ME_SSTABLE was removed and in its place
we default to sstable_version_types::me (and call on_enabled()
immediately).
2024-05-17 20:41:19 +03:00
Avi Kivity
6d0c0b542c feature: grandfather MD_SSTABLE_FORMAT
"md" sstable support was introduced in e8d7744040 (2020; 4.4)
and so can be assumed to be present on all versions we upgrade from.
Nothing appears to depend on it.
2024-05-17 20:41:19 +03:00