38914 Commits

Author SHA1 Message Date
Benny Halevy
8a56050507 main: handle abort_requested_exception on startup
Handle abort_requested_exception exactly like
sleep_aborted, as an expected error when startup
is aborted mid-way.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#15443
2023-09-18 15:05:52 +03:00
Botond Dénes
f7557a4891 Merge 'updating presto integration page documentation' from Guy Shtub

Closes scylladb/scylladb#15342

* github.com:scylladb/scylladb:
  Update integration-presto.rst
  Update integration-presto.rst
  Update docs/using-scylla/integrations/integration-presto.rst
  updating presto integration page
2023-09-18 14:41:16 +03:00
Botond Dénes
edb50c27ec Merge 'Use sstable_state in sstables populator' from Pavel Emelyanov
Some time ago, populating tables from sstables was reworked to use sstable states instead of full paths (#12707). Since then, a few places in the populator have remained that still operate on the state-based subdirectory name. This PR collects most of those loose ends.

refs: #13020

Closes scylladb/scylladb#15421

* github.com:scylladb/scylladb:
  distributed_loader: Print sstable state explicitly
  distributed_loader: Move check for the missing dir upper
  distributed_loader: Use state as _sstable_directories key
2023-09-18 14:38:49 +03:00
Kefu Chai
054beb6377 tests: tablets: do not compare signed integer with unsigned integer
when compiling the tests with -Wsign-compare, the compiler complains like:
```
/home/kefu/.local/bin/clang++ -DBOOST_ALL_DYN_LINK -DBOOST_NO_CXX98_FUNCTION_BASE -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_BROKEN_SOURCE_LOCATION -DSEASTAR_DEBUG -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_TESTING_MAIN -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/cmake/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/cmake/seastar/gen/include -isystem /home/kefu/dev/scylladb/build/cmake/rust -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-ignored-qualifiers -march=westmere  -Og -g -gz -std=gnu++20 -fvisibility=hidden -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Wno-error=unused-result "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o -MF test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o.d -o test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o -c /home/kefu/dev/scylladb/test/boost/tablets_test.cc
/home/kefu/dev/scylladb/test/boost/tablets_test.cc:1335:53: error: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
            for (int log2_tablets = 0; log2_tablets < tablet_count_bits; ++log2_tablets) {
                                       ~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~
```

in this case, it should be safe to use a signed int as the loop
variable to be compared with `tablet_count_bits`, but let's just
appease the compiler so we can enable the warning option project-wide
to prevent any potential issues caused by signed-unsigned comparison.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15449
2023-09-18 13:17:16 +02:00
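
The kind of change described above can be illustrated with a minimal sketch. The function `tablet_counts` is a hypothetical helper made up for this example and is not the code in `tablets_test.cc`; the only point is that a loop variable with the same unsigned type as the bound leaves `-Wsign-compare` nothing to flag.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, for illustration only: returns 2^0, 2^1, ...
// for every exponent below tablet_count_bits.
std::vector<std::size_t> tablet_counts(std::size_t tablet_count_bits) {
    // Shifting by the full width of the type would be undefined behaviour.
    assert(tablet_count_bits < std::numeric_limits<std::size_t>::digits);
    std::vector<std::size_t> counts;
    // The loop variable has the same type as the bound, so the comparison
    // is unsigned against unsigned and -Wsign-compare stays quiet.
    for (std::size_t log2_tablets = 0; log2_tablets < tablet_count_bits; ++log2_tablets) {
        counts.push_back(std::size_t{1} << log2_tablets);
    }
    return counts;
}
```

Calling `tablet_counts(3)` yields the three values 1, 2 and 4. Declaring `log2_tablets` as `int` would compile to the same loop here, but would trigger the diagnostic quoted in the commit message.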
Kamil Braun
bc6f7d1b20 Merge 'raft topology: add garbage collection for internal CDC generations table' from Patryk Jędrzejczak
We add garbage collection for the `CDC_GENERATIONS_V3` table to prevent
it from endlessly growing. This mechanism is especially needed because
we send the entire contents of `CDC_GENERATIONS_V3` as a part of the
group 0 snapshot.

The solution is to keep a clean-up candidate, which is one of the
already published CDC generations. The CDC generation publisher
introduced in #15281 continually uses this candidate to remove all
generations with timestamps not exceeding the candidate's and sets a new
candidate when needed.

We also add `test_cdc_generation_clearing.py` that verifies this new
mechanism.

Fixes #15323

Closes scylladb/scylladb#15413

* github.com:scylladb/scylladb:
  test: add test_cdc_generation_clearing
  raft topology: remove obsolete CDC generations
  raft topology: set CDC generation clean-up candidate
  topology_coordinator: refactor publish_oldest_cdc_generation
  system_keyspace: introduce decode_cdc_generation_id
  system_keyspace: add cleanup_candidate to CDC_GENERATIONS_V3
2023-09-18 11:30:10 +02:00
Pavel Emelyanov
30959fc9b1 lsa, test: Extend memory footprint test with per-type total sizes
When memory footprint test is over it prints total size taken by row
cache, memtable and sstables as well as individual objects' sizes. It's
also nice to know the details on the row-cache's individual objects.
This patch extends the printing with total size of allocated object
types according to migrator_fn types.

Sample output:

    mutation footprint:
     - in cache:     11040928
     - in memtable:  9142424
     - in sstable:
       mc:   2160000
       md:   2160000
       me:   2160000
     - frozen:       540
     - canonical:    827
     - query result: 342

     sizeof(cache_entry) = 64
     sizeof(memtable_entry) = 64
     sizeof(bptree::node) = 288
     sizeof(bptree::data) = 72
     -- sizeof(decorated_key) = 32
     -- sizeof(mutation_partition) = 96
     -- -- sizeof(_static_row) = 8
     -- -- sizeof(_rows) = 24
     -- -- sizeof(_row_tombstones) = 40

     sizeof(rows_entry) = 144
     sizeof(evictable) = 24
     sizeof(deletable_row) = 72
     sizeof(row) = 16
     radix_tree::inner_node::node_sizes =  48 80 144 272 528 1040
     radix_tree::leaf_node::node_sizes =  120 216 416 816 3104
     sizeof(atomic_cell_or_collection) = 16
     btree::linear_node_size(1) = 24
     btree::inner_node_size = 216
     btree::leaf_node_size = 120
    LSA stats:
      N18compact_radix_tree4treeI13cell_and_hashjE9leaf_nodeE: 360
      N5bplus4dataIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 5040
      N5bplus4nodeIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 19296
      17partition_version: 952416
      N11intrusive_b4nodeI10rows_entryXadL_ZNS1_5_linkEEENS1_11tri_compareELm12ELm20ELNS_10key_searchE0ELNS_10with_debugE0EEE: 317472
      10rows_entry: 1429056
      12blob_storage: 254

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#15434
2023-09-18 11:23:18 +02:00
Guy Shtub
5d833b2ee7 Update integration-presto.rst 2023-09-18 11:29:38 +03:00
Botond Dénes
bb7121a1fb Merge 'tools/scylla-nodetools: do not create unowned bpo::value ' from Kefu Chai
in other words, do not create a bpo::value unless it is transferred to an
option_description.

`boost::program_options::value()` creates a new typed_value<T> object,
without holding it with a shared_ptr. boost::program_options expects the
developer to construct a `bpo::option_description` right away from it.
and `boost::program_options::option_description` takes the ownership
of the `typed_value<T>*` raw pointer, and manages its life cycle with
a shared_ptr. but before passing it to a `bpo::option_description`,
the pointer created by `boost::program_options::value()` is still
a raw pointer.

before this change, we initialize `operations_with_func` as global
variables using `boost::program_options::value()`. but unfortunately,
we don't always initialize a `bpo::option_description` from it --
we only do this on demand when the corresponding subcommand is
called.

so, if the corresponding subcommand is not called, the created
`typed_value<T>` objects are leaked. hence LeakSanitizer warns us.

after this change, we create the option map as a static
local variable in a function so it is created on demand as well.
as an alternative, we could initialize the options map as a local
variable where it is used. but to be more consistent with how
`global_option` is specified, and to colocate them in a single
place, let's keep the existing code layout.

this change is quite similar to 374bed8c3d

Fixes https://github.com/scylladb/scylladb/issues/15429

Closes scylladb/scylladb#15430

* github.com:scylladb/scylladb:
  tools/scylla-nodetools: reindent
  tools/scylla-nodetools: do not create unowned bpo::value
2023-09-18 11:09:46 +03:00
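
The on-demand initialization described above can be sketched with the standard library alone. Everything in this block is an assumption made for illustration: `option_value` is a stand-in for `typed_value<T>`, and `get_operation_options`, `build_count` and the map contents are hypothetical names, not the actual scylla-nodetool code. The sketch only shows that a function-local static is constructed on first use rather than at program startup, and that values are owned from the moment they are created.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in for boost::program_options::typed_value<T>; hypothetical.
struct option_value {
    std::string name;
};

using option_map = std::map<std::string, std::vector<std::shared_ptr<option_value>>>;

// Counts how often the option map is built, to make the lazy
// construction observable.
inline int& build_count() {
    static int count = 0;
    return count;
}

// The map is a function-local static: it is constructed the first time
// this function is called, not during static initialization.
const option_map& get_operation_options() {
    static const option_map options = [] {
        ++build_count();
        option_map m;
        // Every value is owned by a shared_ptr as soon as it exists,
        // so nothing is left behind as an unowned raw pointer.
        m["compact"] = {std::make_shared<option_value>(option_value{"keyspace"})};
        return m;
    }();
    return options;
}
```

If no caller ever asks for the options, the lambda never runs and nothing is allocated. Repeated calls return the same map without rebuilding it.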
Kefu Chai
a51b14d4c4 sstables/metadata_collector: drop unused functions
column_stats::update_local_deletion_time() is not used anywhere,
what is being used is
`column_stats::update_local_deletion_time_and_tombstone_histogram(time_point)`.
while `update_local_deletion_time_and_tombstone_histogram(int32_t)`
is only used internally by a single caller.

neither is `column_stats::update(const deletion_time&)` used.

so let's drop them. and merge
`update_local_deletion_time_and_tombstone_histogram(int32_t)`
into its caller.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15189
2023-09-18 10:18:56 +03:00
Botond Dénes
b97778e4b2 Merge 'create-relocatable-package.py: do not assume "build" build directory' from Kefu Chai
in this series, we do not assume the existence of a "build" build directory, and prefer using the version files located under the directory specified with the `--build-dir` option.

Refs #15241

Closes scylladb/scylladb#15402

* github.com:scylladb/scylladb:
  create-relocatable-package.py: prefer $build_dir/SCYLLA-RELEASE-FILE
  create-relocatable-package.py: create SCYLLA-RELOCATABLE-FILE with tempfile
2023-09-18 09:07:37 +03:00
Kefu Chai
a03dc92cb5 tools/scylla-nodetools: reindent
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-18 13:57:37 +08:00
Kefu Chai
ed41c725f3 tools/scylla-nodetools: do not create unowned bpo::value
in other words, do not create a bpo::value unless it is transferred to an
option_description.

`boost::program_options::value()` creates a new typed_value<T> object,
without holding it with a shared_ptr. boost::program_options expects the
developer to construct a `bpo::option_description` right away from it.
and `boost::program_options::option_description` takes the ownership
of the `typed_value<T>*` raw pointer, and manages its life cycle with
a shared_ptr. but before passing it to a `bpo::option_description`,
the pointer created by `boost::program_options::value()` is still
a raw pointer.

before this change, we initialize `operations_with_func` as global
variables using `boost::program_options::value()`. but unfortunately,
we don't always initialize a `bpo::option_description` from it --
we only do this on demand when the corresponding subcommand is
called.

so, if the corresponding subcommand is not called, the created
`typed_value<T>` objects are leaked. hence LeakSanitizer warns us.

after this change, we create the option map as a static
local variable in a function so it is created on demand as well.
as an alternative, we could initialize the options map as a local
variable where it is used. but to be more consistent with how
`global_option` is specified, and to colocate them in a single
place, let's keep the existing code layout.

this change is quite similar to 374bed8c3d

Fixes #15429
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-18 13:57:37 +08:00
Kefu Chai
b350596656 docs: correct the code sample for checking service status
```console
$ journalctl --user start scylla-server -xe
Failed to add match 'start': Invalid argument
```

`journalctl` expects a match filter as its positional arguments.
but apparently, start is not a filter. we could use `--unit`
to specify a unit though, like:

```console
$ journalctl --user --unit scylla-server.service -xe
```

but it would flood the stdout with the logging messages printed
by scylla. this is not what a typical user expects. probably a better
user experience can be achieved using

```console
$ systemctl --user status scylla-server
```
which also prints the current status reported by the service, and
the command line arguments. they would be more informative in typical
use cases.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15390
2023-09-18 08:37:42 +03:00
Avi Kivity
67a0c865cf tools: toolchain: prepare: don't overwrite existing images
The docker/podman tooling is destructive: it will happily
overwrite images locally and on the server. If a maintainer
forgets to update tools/toolchain/image, this can result
in losing an older toolchain container image.

To prevent that, check that the image name is new.

Closes scylladb/scylladb#15397
2023-09-18 08:35:01 +03:00
Kefu Chai
a04fa0b41e conf: update commented out experimental_features
update commented out experimental_features to reflect the latest
experimental features:

- in 4f23eec4, "raft" was renamed to "consistent-topology-changes".
- in 2dedb5ea, "alternator-ttl" was moved out of experimental features.
- in 5b1421cc, "broadcast-tables" was added to experimental features.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15407
2023-09-18 08:31:01 +03:00
Guy Shtub
b8693636b8 Update integration-presto.rst
Removing link to forum, will be added as general footer
2023-09-18 06:50:11 +03:00
Guy Shtub
7d0691b348 Update docs/using-scylla/integrations/integration-presto.rst
Co-authored-by: Anna Stuchlik <37244380+annastuchlik@users.noreply.github.com>
2023-09-18 06:46:02 +03:00
Avi Kivity
4eb4ac4634 scripts: pull_gitgub_pr.sh: absolutize project reference
pull_gitgub_pr.sh adds a "Closes #xyz" tag so github can close
the pull request after next promotion. Convert it to an absolute
reference (scylladb/scylladb#xyz) so the commit can be cherry-picked
into another repository without the reference dangling.

Closes #15424
2023-09-15 19:29:50 +03:00
Kefu Chai
1e6b2eb4c8 tools/scylla-nodetool: mark format string as constexpr
this change changes `const` to `constexpr`, because the string literal
defined here is not only immutable, but also initialized at
compile-time, and can be used by constexpr expressions and functions.

this change is introduced to reduce the size of the change when moving
to compile-time format string in future. so far, seastar::format() does
not use the compile-time format string, but we have patches pending on
review implementing this. and the author of this change has local
branches implementing the changes on scylla side to support compile-time
format string, which practically replaces most of the `format()` calls
with `seastar::format()`.

without this change, if we use compile-time format check, compiler fails
like:

```
/home/kefu/dev/scylladb/tools/scylla-nodetool.cc:276:44: error: call to consteval function 'fmt::basic_format_string<char, const char *const &, seastar::basic_sstring<char, unsigned int, 15>>::basic_format_string<const char *, 0>' is not a constant expression
            .description = seastar::format(description_template, app_name, boost::algorithm::join(operations | boost::adaptors::transformed([] (const auto& op) {
                                           ^
/usr/include/fmt/core.h:3148:67: note: read of non-constexpr variable 'description_template' is not allowed in a constant expression
  FMT_CONSTEVAL FMT_INLINE basic_format_string(const S& s) : str_(s) {
                                                                  ^
/home/kefu/dev/scylladb/tools/scylla-nodetool.cc:276:44: note: in call to 'basic_format_string(description_template)'
            .description = seastar::format(description_template, app_name, boost::algorithm::join(operations | boost::adaptors::transformed([] (const auto& op) {
                                           ^
/home/kefu/dev/scylladb/tools/scylla-nodetool.cc:258:16: note: declared here
    const auto description_template =
               ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15432
2023-09-15 19:28:38 +03:00
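
The difference between `const` and `constexpr` that this commit relies on can be shown with a small standard-library sketch. It assumes nothing about `seastar::format()` or fmt: `count_placeholders` is a hypothetical checker written for this example, and the template text is illustrative rather than the real string from `scylla-nodetool.cc`.

```cpp
#include <cstddef>
#include <string_view>

// Counts "{}" placeholders in a format string. Being constexpr, it can
// be evaluated at compile time, but only when its argument is itself
// usable in a constant expression.
constexpr std::size_t count_placeholders(std::string_view fmt) {
    std::size_t n = 0;
    for (std::size_t i = 0; i + 1 < fmt.size(); ++i) {
        if (fmt[i] == '{' && fmt[i + 1] == '}') {
            ++n;
            ++i;  // skip the closing brace
        }
    }
    return n;
}

// Illustrative text only, not the real description template.
constexpr auto description_template = "Usage: {} OPERATION\n\nSupported operations:\n{}";

// This compiles because description_template is constexpr. Declared as
// a plain `const auto` (a const pointer), it could not be read here.
static_assert(count_placeholders(description_template) == 2);
```

A compile-time format check works along the same lines as the `static_assert`: the format string has to be readable during constant evaluation, which a merely `const` pointer variable is not.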
Kefu Chai
6c75dc4be8 tools/scylla-nodetool: do not compare unsigned with int
change the loop variable to `int` to silence warning like

```
/home/kefu/.local/bin/clang++ -DBOOST_NO_CXX98_FUNCTION_BASE -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_BROKEN_SOURCE_LOCATION -DSEASTAR_DEBUG -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/cmake/seastar/gen/include -I/home/kefu/dev/scylladb/build/cmake/gen -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-ignored-qualifiers -march=westmere  -Og -g -gz -std=gnu++20 -fvisibility=hidden -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Wno-error=unused-result "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o -MF tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o.d -o tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o -c /home/kefu/dev/scylladb/tools/scylla-nodetool.cc
/home/kefu/dev/scylladb/tools/scylla-nodetool.cc:215:28: error: comparison of integers of different signs: 'unsigned int' and 'int' [-Werror,-Wsign-compare]
    for (unsigned i = 0; i < argc; ++i) {
                         ~ ^ ~~~~
```

`i` is used as the index in a plain C-style array, it's perfectly fine
to use a signed integer as index in this case. as per C++ standard,

> The expression E1[E2] is identical (by definition) to *((E1)+(E2))

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15431
2023-09-15 19:28:14 +03:00
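
A minimal sketch of the pattern, assuming a hypothetical helper named `collect_args` that is not part of scylla-nodetool: since `argc` is an `int`, an `int` loop variable compares like with like, and indexing `argv` with a signed value is ordinary pointer arithmetic.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, for illustration only: copies argv into strings.
std::vector<std::string> collect_args(int argc, const char* const* argv) {
    std::vector<std::string> args;
    // Both sides of the comparison are int, so -Wsign-compare is silent.
    for (int i = 0; i < argc; ++i) {
        // argv[i] is *(argv + i); a signed index is fine here.
        args.emplace_back(argv[i]);
    }
    return args;
}
```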
Kefu Chai
30ef69fcb2 docs/dev/object_store: add more samples
in the hope of lowering the bar to testing the object store.

* add language specifier for better readability of the document.
  to highlight the config with YAML syntax
* add more specific comment on the AWS related settings
* explain that endpoint should match in the CREATE KEYSPACE
  statement and the one defined by the YAML configuration.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15433
2023-09-15 17:35:17 +03:00
Pavel Emelyanov
cce2752b64 Merge 'node_ops: move node_ops related classes to node_ops/' from Aleksandra Martyniuk
Move node_ops related classes to node_ops/ so that they
are consistently grouped and can be accessed from
many modules.

Closes #15351

* github.com:scylladb/scylladb:
  node_ops: extract classes related to node operations
  node_ops: repair: move node_ops_id to node_ops directory
2023-09-15 15:12:00 +03:00
Anna Stuchlik
fb635dccaa doc: add info - support for FIPS-compliant systems
This commit adds the information that ScyllaDB Enterprise
supports FIPS-compliant systems in versions
2023.1.1 and later.
The information is excluded from OSS docs with
the "only" directive, because the support was not
added in OSS.

This commit must be backported to branch-5.2 so that
it appears on version 2023.1 in the Enterprise docs.

Closes #15415
2023-09-15 11:08:34 +02:00
Patryk Jędrzejczak
840e1c5185 test: add test_cdc_generation_clearing
We add a test for the new CDC generation garbage collection
mechanism.
2023-09-15 09:28:32 +02:00
Patryk Jędrzejczak
0cc54e0da7 raft topology: remove obsolete CDC generations
We make the CDC generation publisher continually remove the
obsolete CDC generation data to prevent CDC_GENERATIONS_V3 from
endlessly growing. To achieve this, we use the clean-up candidate.
If it exists and can be safely removed, we remove it together with
all older CDC generations. We also mark the lack of a new
candidate. The next published CDC generation will become one.

Note this solution does not have any guarantee about "when"
it removes obsolete generations. Formally, it guarantees that
if there is a candidate that can be removed and the CDC generation
publisher attempts to remove it, all generations up to the
candidate are removed. In practice, when a new generation appears,
the publisher makes a new candidate or tries to remove an old
candidate, so obsolete generations can stay for a long time only
if no generation appears for a long time. But it is fine because
we only want to prevent CDC_GENERATIONS_V3 from growing too much.
Moreover, providing any guarantees would require a new wake-up
mechanism for the publisher, which would be hard to implement.
2023-09-15 09:26:58 +02:00
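
The candidate-based clean-up described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the topology coordinator's code: `generations_table` reduces CDC_GENERATIONS_V3 to a set of timestamps, `on_generation_published` is a hypothetical name, and `can_remove` is an assumed predicate because the commit message does not spell out when a candidate is safe to remove.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iterator>
#include <optional>
#include <set>

// Hypothetical stand-in for CDC_GENERATIONS_V3: generations are reduced
// to their timestamps, plus the optional clean-up candidate.
struct generations_table {
    std::set<std::int64_t> generations;
    std::optional<std::int64_t> cleanup_candidate;
};

// Models one publisher step for the generation with timestamp `published`.
// Returns the number of generations removed in this step.
std::size_t on_generation_published(generations_table& t, std::int64_t published,
                                    const std::function<bool(std::int64_t)>& can_remove) {
    if (!t.cleanup_candidate) {
        // No candidate yet: the generation being published becomes one.
        t.cleanup_candidate = published;
        return 0;
    }
    if (!can_remove(*t.cleanup_candidate)) {
        return 0;  // keep the candidate and retry on a later publication
    }
    // Remove every generation whose timestamp does not exceed the candidate's.
    const auto end = t.generations.upper_bound(*t.cleanup_candidate);
    const auto removed = static_cast<std::size_t>(std::distance(t.generations.begin(), end));
    t.generations.erase(t.generations.begin(), end);
    // Mark the lack of a candidate; the next published generation becomes one.
    t.cleanup_candidate.reset();
    return removed;
}
```

The sketch also shows why there is no timing guarantee: removal is only attempted when another generation is published, so obsolete rows stay as long as no new generation appears.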
Patryk Jędrzejczak
e375e769b9 raft topology: set CDC generation clean-up candidate
We want to use the clean-up candidates to remove the obsolete CDC
generation data, but first, we need to set suitable generations as
a candidate when there is no candidate. Since CDC generations must
be published before we remove them, a generation that is being
published is a good candidate.
2023-09-15 09:23:59 +02:00
Patryk Jędrzejczak
b84e097c28 topology_coordinator: refactor publish_oldest_cdc_generation
In the following commits, we add a new task for the CDC generation
publisher -- clearing obsolete CDC generation data. This task
can be done together with the publishing under one group 0 guard.
We refactor publish_oldest_cdc_generation to make it possible.
Now, this function is more like a command builder. It takes guard
by const reference and updates the vector of mutations and the
reason string. The CDC generation publisher uses them directly to
update the topology at the end after finishing building the
command. This logic will be more visible after adding the clearing
task.
2023-09-15 09:04:23 +02:00
Botond Dénes
b87660f90c tools/scylla-sstable: log where schema was obtained from
Currently, we only log anything about what was tried w.r.t. obtaining
the schema if it failed. Add a log message to the success path too, so
in case the wrong schema was successfully loaded, the user can find the
problem.
The log message is printed at debug level, so it doesn't disturb
output by default.

Fixes: #15384

Closes #15417
2023-09-14 23:09:30 +03:00
Botond Dénes
0f8b297d07 Merge 'build: cmake: add targets for building deb and rpm packages' from Kefu Chai
in this series,

- the build of unstripped package is fixed, and
- the targets for building deb and rpm packages are added. these targets build deb and rpm packages from the unstripped package.

Closes #15403

* github.com:scylladb/scylladb:
  build: cmake: add targets for building deb and rpm packages
  build: cmake: correct the paths used when building unstripped pkg
2023-09-14 18:22:30 +03:00
Kefu Chai
60db7f8ae3 doc: do not suggest "-node xxx" when running c-s
cassandra-stress connects to "localhost" by default. that's exactly the
use case when we install scylla using the unified installer. so do not
suggest "-node xxx" option. the "xxx" part is just confusing.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15411
2023-09-14 18:21:46 +03:00
Petr Gusev
6c3cc7d6e0 test_fence_hints: increase timeouts
We saw failures on CI in debug mode; probably the machine
running the test is shared and we were starved of some resources.

Fix #15285

Closes #15388
2023-09-14 16:22:50 +02:00
Avi Kivity
d9a453e72e Merge 'Introduce a scylla-native nodetool' from Botond Dénes
This series introduces a scylla-native nodetool. It is invokable via the main scylla executable, like the other native tools we have. It uses Seastar's new `http::client` to connect to the specified node and execute the desired commands.
For now a single command is implemented: `nodetool compact`, invokable as `scylla nodetool compact`. Once all the boilerplate is added to create a new tool, implementing a single command is not too bad in terms of code bloat. It is certainly not as clean as a Python implementation would be, but good enough. The advantages of a C++ implementation are that all of us in the core team know C++ and that it is shipped as part of the scylla executable.

Closes #14841

* github.com:scylladb/scylladb:
  test: add nodetool tests
  test.py: add ToolTestSuite and ToolTest
  tools/scylla-nodetool: implement compact operation
  tools/scylla-nodetool: implement basic scylla_rest_api_client
  tools: introduce scylla-nodetool
  utils: export dns_connection_factory from s3/client.cc to http.hh
  utils/s3/client: pass logger to dns_connection_factory in constructor
  tools/utils: tool_app_template::run_async(): also detect --help* as --help
2023-09-14 17:20:40 +03:00
Avi Kivity
a3d73bfba7 Merge 'Add support for decommission with tablets' from Tomasz Grabiec
Load balancer will recognize decommissioning nodes and will
move tablet replicas away from such nodes with highest priority.

Topology changes have now an extra step called "tablet draining" which
calls the load balancer. The step will execute tablet migration track
as long as there are nodes which require draining. It will not do regular
load balancing.

If load balancer is unable to find new tablet replicas, because RF
cannot be met or availability is at risk due to insufficient node
distribution in racks, it will throw an exception. Currently, topology
change will retry in a loop. We should make this error cause topology
change to be aborted. There is no infrastructure for
aborts yet, so this is not implemented.

Closes #15197

* github.com:scylladb/scylladb:
  tablets, raft topology: Add support for decommission with tablets
  tablet_allocator: Compute load sketch lazily
  tablet_allocator: Set node id correctly
  tablet_allocator: Make migration_plan a class
  tablets: Implement cleanup step
  storage_service, tablets: Prevent stale RPCs from running beyond their stage
  locator: Introduce tablet_metadata_guard
  locator, replica: Add a way to wait for table's effective_replication_map change
  storage_service, tablets: Extract do_tablet_operation() from stream_tablet()
  raft topology: Add break in the final case clause
  raft topology: Fix SIGSEGV when trace-level logging is enabled
  raft topology: Set node state in topology
  raft topology: Always set host id in topology
2023-09-14 17:16:23 +03:00
Kamil Braun
0564d000c6 Merge 'Validate compaction strategy options' from Aleksandra Martyniuk
When a column family's schema is changed, a new compaction
strategy type may be applied.

To make sure that it will behave as expected, the compaction
strategy needs to contain only the allowed options and values.
Methods that throw an exception on invalid options are added.

Fixes: #2336.

Closes #13956

* github.com:scylladb/scylladb:
  test: add test for compaction strategy validation
  compaction: unify exception messages
  compaction: cql3: validate options in check_restricted_table_properties
  compaction: validate options used in different compaction strategies
  compaction: validate common compaction strategy options
  compaction: split compaction_strategy_impl constructor
  compaction: validate size_tiered_compaction_strategy specific options
  compaction: validate time_window_compaction_strategy specific options
  compaction: add method to validate min and max threshold
  compaction: split size_tiered_compaction_strategy_options constructor
  compaction: make compaction strategy keys static constexpr
  compaction: use helpers in validate_* functions
  compaction: split time_window_compaction_strategy_options construtor
  compaction: add validate method to compaction_strategy_options
  time_window_compaction_strategy_options: make copy and move-able
  size_tiered_compaction_strategy_options: make copy and move-able
2023-09-14 16:11:52 +02:00
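
The idea of rejecting unknown options and invalid values can be sketched as below. This is not the validation code added by the series: `validate_compaction_options` and `parse_threshold` are hypothetical names, and the allowed keys, the defaults (4 and 32) and the rules checked are assumptions chosen to keep the example small.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Parses an integer option value, rejecting trailing garbage such as "4x".
int parse_threshold(const std::string& key, const std::string& value) {
    std::size_t pos = 0;
    int parsed = 0;
    try {
        parsed = std::stoi(value, &pos);
    } catch (const std::exception&) {
        throw std::invalid_argument(key + " must be an integer, got: " + value);
    }
    if (pos != value.size()) {
        throw std::invalid_argument(key + " must be an integer, got: " + value);
    }
    return parsed;
}

// Throws std::invalid_argument on an unknown key or an invalid value.
void validate_compaction_options(const std::map<std::string, std::string>& options) {
    // Assumed set of allowed keys, for illustration only.
    static const std::set<std::string> allowed = {"min_threshold", "max_threshold"};
    int min_threshold = 4;   // assumed default
    int max_threshold = 32;  // assumed default
    for (const auto& entry : options) {
        if (allowed.count(entry.first) == 0) {
            throw std::invalid_argument("invalid compaction strategy option: " + entry.first);
        }
        const int parsed = parse_threshold(entry.first, entry.second);
        if (entry.first == "min_threshold") {
            min_threshold = parsed;
        } else {
            max_threshold = parsed;
        }
    }
    if (min_threshold < 2) {
        throw std::invalid_argument("min_threshold must be at least 2");
    }
    if (max_threshold < min_threshold) {
        throw std::invalid_argument("max_threshold must not be below min_threshold");
    }
}
```

Running the checks when the schema is altered means a bad option is reported to the user up front instead of surfacing later inside the compaction strategy.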
Pavel Emelyanov
4370e6c8d0 distributed_loader: Print sstable state explicitly
When populating from a particular directory, populator code converts
state to subdir name, then prints the path. The conversion is pretty
much artificial; it's better to provide a printer for state and print
state explicitly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-14 16:41:26 +03:00
Pavel Emelyanov
b19e6a68f8 distributed_loader: Move check for the missing dir upper
The quarantine directory can be missing on the datadir and that's OK. In
order to check that and skip population the populator code uses two-step
logic -- first it checks if the directory exists and either puts the
sstable_directory object into the map or not. Later it checks the map and
decides whether to throw or not if the directory is missing.

Let's keep both check and throw in one place for brevity.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-14 16:39:56 +03:00
Pavel Emelyanov
74eef029e2 distributed_loader: Use state as _sstable_directories key
The populator maintains a map of path -> sstable_directory pairs one for
each subdirectory for every sstable state. The "path" is in fact not
used by the logic as it's just a subdirectory name for the state and the
rest of the core operates on state. So it's good to make the map of
directories also be indexed by the state.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-14 16:38:03 +03:00
Benny Halevy
a5a22fe5b7 tools/scylla-sstable: load_sstables: handle load errors
Currently, exceptions thrown from `sst->load` are unhandled,
resulting in, e.g.:
```
ERROR 2023-09-12 08:02:58,124 [shard 0:main] seastar - Exiting on unhandled exception: std::runtime_error (SSTable /home/bhalevy/.dtest/dtest-dxg4xdxg/test/node1/data/ks/cf-a3009f20512911ee8000d81cd2da3fd7/me-3g9b_0e0x_39vtt1y2rcqrffz55j-big-Data.db uses org.apache.cassandra.dht.Murmur3Partitioner partitioner which is different than com.scylladb.dht.CDCPartitioner partitioner used by the database)
```

Log the errors and exit the tool with non-zero status
in this case.

Fixes #15359

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15376
2023-09-14 14:27:38 +03:00
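
The behaviour described above, logging the error and exiting with a non-zero status instead of letting the exception escape, can be sketched as follows. The names are hypothetical: `load_all` is made up for this example and the `load` callback merely stands in for `sst->load`. The real tool is built on Seastar and does not look like this.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch: `load` stands in for sst->load and may throw.
// Returns a process exit status: 0 on success, 1 if a load failed.
int load_all(const std::vector<std::string>& paths,
             const std::function<void(const std::string&)>& load) {
    for (const auto& path : paths) {
        try {
            load(path);
        } catch (const std::exception& e) {
            // Report the failure instead of dying on an unhandled exception.
            std::cerr << "error: failed to load " << path << ": " << e.what() << '\n';
            return 1;
        }
    }
    return 0;
}
```

A caller would return this value from `main`, so scripts driving the tool can detect the failure from the exit status.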
Tomasz Grabiec
551cc0233d tablets, raft topology: Add support for decommission with tablets
Load balancer will recognize decommissioning nodes and will
move tablet replicas away from such nodes with highest priority.

Topology changes have now an extra step called "tablet draining" which
calls the load balancer. The step will execute tablet migration track
as long as there are nodes which require draining. It will not do regular
load balancing.

If load balancer is unable to find new tablet replicas, because RF
cannot be met or availability is at risk due to insufficient node
distribution in racks, it will throw an exception. Currently, topology
change will retry in a loop. We should make this error cause topology
change to be paused so that admin becomes aware of the problem and
issues an abort on the topology change. There is no infrastructure for
aborts yet, so this is not implemented.
2023-09-14 13:05:49 +02:00
Tomasz Grabiec
8565af4dd3 tablet_allocator: Compute load sketch lazily
This allows any node to act as a target later.
2023-09-14 13:04:49 +02:00
Tomasz Grabiec
1c595ab7f4 tablet_allocator: Set node id correctly
It was unset and unused.
2023-09-14 13:04:49 +02:00
Tomasz Grabiec
389573543e tablet_allocator: Make migration_plan a class
It will be extended with more fields so that load balancer can
communicate more information to the coordinator.
2023-09-14 13:04:47 +02:00
Tomasz Grabiec
d5539e080d tablets: Implement cleanup step
This change adds a stub for tablet cleanup on the replica side and wires
it into the tablet migration process.

The handling on replica side is incomplete because it doesn't remove
the actual data yet. It only flushes the memtables, so that all data
is in sstables and none requires a memtable flush.

This patch is necessary to make decommission work. Otherwise, a
memtable flush would happen when the decommissioned node is put in the
drained state (as in nodetool drain) and it would fail on missing host
id mapping (node is no longer in topology), which is examined by the
tablet sharder when producing sstable sharding metadata, leading to
an abort due to the failed memtable flush.
2023-09-14 12:45:10 +02:00
Tomasz Grabiec
5cf035878d storage_service, tablets: Prevent stale RPCs from running beyond their stage
Example scenario:

  1. coordinator A sends RPC #1 to trigger streaming
  2. coordinator fails over to B
  3. coordinator B performs streaming successfully
  4. RPC #1 arrives and starts streaming
  5. coordinator B commits the transition to the post-streaming stage
  6. coordinator B executes global token metadata barrier

We end up with streaming running despite the fact that the current
coordinator moved on. Currently, this won't happen, because streaming
holds on to erm. But we want to change that (see #14995), so that it
does not block barriers for migrations of other tablets. The same
problem applies to tablet cleanup.

The fix is to use tablet_metadata_guard around such long running
operations, which will hold on to erm so that in the above scenario
coordinator B will wait for it in step 6. The guard ensures that erm
doesn't block other migrations because it switches to the latest erm
if it's compatible. If it's not, it signals abort_source for the guard
so that such stale operation aborts soon and the barrier in step 6
doesn't wait for long.
2023-09-14 12:45:10 +02:00
Tomasz Grabiec
6a62aca3a9 locator: Introduce tablet_metadata_guard
Will be used to synchronize long-running tablet operations with
topology coordinator.

It blocks barriers like erm_ptr, but refreshes if change is
irrelevant, so behaves as if the erm_ptr's scope was narrowed down to
a single tablet.
2023-09-14 12:45:10 +02:00
Patryk Jędrzejczak
c0fd42ead4 system_keyspace: introduce decode_cdc_generation_id
The decode_cdc_generations_ids function allows us to decode
a vector of CDC generation IDs. After adding cleanup_candidate
to CDC_GENERATIONS_V3, we need a similar function that decodes
a single ID.
2023-09-14 12:09:14 +02:00
Patryk Jędrzejczak
6db325fb69 system_keyspace: add cleanup_candidate to CDC_GENERATIONS_V3
In the following commits, we implement a garbage collection for
CDC_GENERATIONS_V3. The first step is introducing the clean-up
candidate. It will be continually updated by the CDC generation
publisher and used to remove obsolete data.
2023-09-14 12:09:10 +02:00
Tomasz Grabiec
532ec84210 locator, replica: Add a way to wait for table's effective_replication_map change 2023-09-14 12:08:54 +02:00
Tomasz Grabiec
2c6785dc8f storage_service, tablets: Extract do_tablet_operation() from stream_tablet()
It will be shared with cleanup_tablet().

Minor changes:
  - ditch the redundant optional<> around shared_future<>
2023-09-14 12:08:52 +02:00
Tomasz Grabiec
e2c1f904c8 raft topology: Add break in the final case clause
To be safe in case we add more cases.
2023-09-14 12:07:59 +02:00