Commit Graph

38691 Commits

Author SHA1 Message Date
Nadav Har'El
ea56c8efcd test/alternator: reduce code duplication in test for list_append()
A reviewer noted that test_update_expression_list_append_non_list_arguments
has too much code duplication - the same long API call to run
"SET a = list_append(...)" was repeated many times.

So in this patch we add a short inner function "try_list_append" to
avoid this duplication.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes: #15298
2023-09-11 10:09:35 +03:00
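A minimal sketch of the refactoring described in ea56c8efcd above: one short inner function carries the long API call, so each test case shrinks to one line. `FakeTable` and `check_list_append_arguments` are hypothetical stand-ins and not the actual test code; only the name `try_list_append` and the "SET a = list_append(...)" expression come from the commit message.

```python
class FakeTable:
    """Hypothetical stand-in for a table object; it only records calls."""

    def __init__(self):
        self.calls = []

    def update_item(self, **kwargs):
        self.calls.append(kwargs)


def check_list_append_arguments(table, key):
    # The inner helper holds the long call once, instead of repeating it
    # for every combination of arguments.
    def try_list_append(args, values):
        table.update_item(
            Key=key,
            UpdateExpression=f"SET a = list_append({args})",
            ExpressionAttributeValues=values,
        )

    try_list_append(":val1, :val2", {":val1": [1], ":val2": [2]})
    try_list_append(":val1, :val2", {":val1": [1], ":val2": "not a list"})


table = FakeTable()
check_list_append_arguments(table, {"p": "some_key"})
```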
David Garcia
a14bcf7c6a docs: improve configuration properties reference
- Adds type for each option.
- Filters out unused / invalid values, moves them to a separate section.
- Adds the term "liveness" to the glossary.
- Removes unused and invalid properties from the docs.
- Updates to the latest version of pyaml.

docs: rename config template directive

Closes #15164
2023-09-11 09:47:16 +03:00
Botond Dénes
d92620868d Merge 'docs: improve command line samples in unified-installer.rst' from Kefu Chai
in this series, we try to improve `unified-installer.rst`

- encourage users to install a smaller package
- run `./install.sh` directly instead of relying on `sh` pointing to `bash`

Closes #15325

* github.com:scylladb/scylladb:
  doc: run install.sh directly
  doc: install headless jdk in sample command line
2023-09-11 09:34:14 +03:00
Botond Dénes
7385f93816 Merge 'Task manager repair tasks progress' from Aleksandra Martyniuk
Find progress of repair tasks based on the number of ranges
that have been repaired.

Fixes: [#1156](https://github.com/scylladb/scylla-enterprise/issues/1156).

Closes #14698

* github.com:scylladb/scylladb:
  test: repair tasks test
  repair: add methods making repair progress more precise
  tasks: make progress related methods virtual
  repair: add get_progress method to shard_repair_task_impl
  repair: add const noexcept qualifiers to shard_repair_task_impl::ranges_size()
  repair: log a name of a particular table repair is working on
  tasks: delete move and copy constructors from task_manager::task::impl
2023-09-11 09:32:23 +03:00
Raphael S. Carvalho
c7e02a1077 storage_service: Enforce tablet streaming runs on shard 0
SIGSEGV was caught during tablet streaming, and the reason was
that storage_service::_group0 (via set_group0()) is only set on
shard 0, therefore when streaming ran on any other shard,
it tried to dereference garbage, which resulted in the crash.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #15307
2023-09-08 20:45:13 +03:00
Kefu Chai
ce6464b649 sstable: do not call into sstable in filesystem_storage::open()
before this change, filesystem_storage::open() reuses
`sstable::make_component_file_writer()` to create the
temporary toc, which it renames to the
real TOC when sealing the sstable.

but this prevents us from reusing filesystem_storage in
yet another storage backend, as the

1. create temporary
2. rename temporary to toc

dance only applies to filesystem_storage. when
filesystem_storage calls into sstable, it calls `sst.make_component_file_writer()`,
which in turn calls the `_storage->make_component_sink()`.
but at this moment, `_storage` is not necessarily `filesystem_storage`
anymore. it could be a wrapper around `filesystem_storage`,
which is not aware of the create-rename dance and could do
a lot more than create a temporary file when asked to
"make_component_sink()".

if we really want to go this way by reusing sstable's API
in `filesystem_storage` to create a temporary toc, we will
have to rename whatever temporary toc component was created
by the wrapper backend to the toc with the seal() func. but
again, this rename op is only implemented in the
filesystem_storage backend. to mirror this operation in
the wrapper backend does not make sense at all -- it
does not have to be aware of the filesystem_storage's internals.

so in this change, instead of reusing the
`sstable::make_component_file_writer()`, we just inline
its implementation in filesystem_storage to avoid this
problem. this is also an improvement from the design
perspective, as the storage should not call into its
higher abstraction -- sstable.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14443
2023-09-08 19:57:39 +03:00
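The create-then-rename dance discussed in ce6464b649 above can be sketched as follows. This is only an illustration of the idea, assuming a toy `FilesystemStorage` class with hypothetical file names; it is not the C++ code from the commit. The point is that the storage writes and renames the temporary TOC itself, without calling back into a higher-level sstable object.

```python
import os
import tempfile
from pathlib import Path


class FilesystemStorage:
    """Toy stand-in for a filesystem-backed storage (hypothetical names)."""

    TOC = "TOC.txt"
    TEMPORARY_TOC = "TOC.txt.tmp"

    def __init__(self, directory: Path):
        self.directory = directory

    def open(self, components: list[str]) -> None:
        # Step 1: create the temporary TOC directly in the storage layer.
        tmp = self.directory / self.TEMPORARY_TOC
        tmp.write_text("\n".join(components) + "\n")

    def seal(self) -> None:
        # Step 2: rename the temporary TOC to the real TOC.
        os.replace(self.directory / self.TEMPORARY_TOC,
                   self.directory / self.TOC)


def demo() -> tuple[bool, bool, str]:
    with tempfile.TemporaryDirectory() as d:
        storage = FilesystemStorage(Path(d))
        storage.open(["Data.db", "Index.db"])
        visible_before_seal = (Path(d) / storage.TOC).exists()
        storage.seal()
        toc = Path(d) / storage.TOC
        return visible_before_seal, toc.exists(), toc.read_text()
```

The real TOC only becomes visible after `seal()`, which is what makes the rename step specific to a filesystem backend.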
Kefu Chai
ce291f4385 s3/client: do not use deprecated tls::connect() overload
seastar has deprecated the overload which accepts `server_name`,
let's use the one which accepts `tls::tls_options`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15324
2023-09-08 18:44:45 +03:00
Avi Kivity
0656810c28 Update tools/java submodule
* tools/java 585b30fda6...9dddad27bf (1):
  > install-dependencies.sh: do not install weak dependencies

Frozen toolchain regenerated.

Closes #15322
2023-09-08 17:22:07 +03:00
Kamil Braun
26d9a82636 Merge 'raft topology: replace publish_cdc_generation with a bg fiber' from Patryk Jędrzejczak
Currently, the topology coordinator has the
`topology::transition_state::publish_cdc_generation` state responsible
for publishing the already created CDC generations to the user-facing
description tables. This process cannot fail as it would cause some CDC
updates to be missed. On the other hand, we would like to abort the
`publish_cdc_generation` state when bootstrap aborts. Of course, we
could also wait until handling this state finishes, even in the case of
the bootstrap abort, but that would be inefficient. We don't want to
unnecessarily block topology operations by publishing CDC generations.

The solution proposed by this PR is to remove the
`publish_cdc_generation` state completely and introduce a new background
fiber of the topology coordinator -- `cdc_generation_publisher` -- that
continually publishes committed CDC generations.

Apart from introducing the CDC generation publisher, we add
`test_cdc_generation_publishing.py` that verifies its correctness and we
adapt other CDC tests to the new changes.

Fixes #15194

Closes #15281

* github.com:scylladb/scylladb:
  test: test_cdc: introduce wait_for_first_cdc_generation
  test: move cdc_streams_check_and_repair check
  test: add test_cdc_generation_publishing
  docs: remove information about publish_cdc_generation
  raft topology: introduce the CDC generation publisher
  system_keyspace: load unpublished_cdc_generations to topology
  raft topology: mark committed CDC generations as unpublished
  raft topology: add unpublished_cdc_generations to system.topology
2023-09-08 15:08:41 +02:00
Kamil Braun
8bff5843b5 Merge 'test: topology: add tests for gossiper/endpoint/live and gossiper/endpoint/down' from Aleksandra Martyniuk
Add tests for gossiper/endpoint/live and gossiper/endpoint/down
which run only in release mode.

Enable test_remove_node_with_concurrent_ddl and fix types and
variables names used by it, so that they can be reused in gossiper
test.

Fixes: #15223.

Closes #15244

* github.com:scylladb/scylladb:
  test: topology: add gossiper test
  test: fix types and variable names in wait_for_host_down
2023-09-08 12:43:11 +02:00
Nadav Har'El
548386a0bb treewide: reduce include of cql_statement.hh
ClangBuildAnalyzer reports cql3/cql_statement.hh as being one of the
most expensive header files in the project - being included (mostly
indirectly) in 129 source files, and costing a total of 844 CPU seconds
of compilation.

This patch is an attempt, only *partially* successful, to reduce the
number of times that cql_statement.hh is included. It succeeds in
lowering the number 129 to 99, but not less :-( One of the biggest
difficulties in reducing it further is that query_processor.hh includes
a lot of templated code, which needs stuff from cql_statement.hh.
The solution should be to un-template the functions in
query_processor.hh and move them from the header to a source file, but
this is beyond the scope of this patch and query_processor.hh appears
problematic in other respects as well.

Unfortunately the compilation speedup by this patch is negligible
(the `du -bc build/dev/**/*.o` metric shows less than 0.01% reduction).
Beyond the fact that this patch only removes 30% of the inclusions of
this header, it appears that most of the source files that no longer
include cql_statement.hh after this patch, included anyway many of the
other headers that cql_statement.hh included, so the saving is minimal.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #15212
2023-09-08 13:23:50 +03:00
Kefu Chai
7591b1b384 doc: run install.sh directly
strictly speaking, `sh` is not necessarily bash, while `install.sh`
is written in the Bash dialect and errors out if it is not executed
with Bash. also, we don't need to add "-x" when running the script; if
we had to, we should add it in `install.sh` rather than ask the user to
add this option. finally, `install.sh` is executable with a shebang line
using bash, so we can just execute it.

so, in this change, we just launch this script in the command line
sample.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-08 17:21:30 +08:00
Kefu Chai
1e0c7d14aa doc: install headless jdk in sample command line
in comparison with java-11-openjdk, java-11-openjdk-headless does not
offer audio and video support, and has fewer dependencies. for instance,
java-11-openjdk depends on the X11 libraries, and it also provides
icons representing JDK. but since scylla is a server side application,
we don't expect users to run a desktop on it, so there is no need to
support audio and video.

in this change, we just suggest a "smaller" package, which is
actually also a dependency of java-11-openjdk.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-08 17:21:30 +08:00
Patryk Jędrzejczak
23a4557662 test: test_cdc: introduce wait_for_first_cdc_generation
After introducing the CDC generation publisher,
test_cdc_log_entries_use_cdc_streams could (at least in theory)
fail by accessing system_distributed.cdc_streams_descriptions_v2
before the first CDC generation has been published.

To avoid flakiness, we simply wait until the first CDC generation
is published in a new function -- wait_for_first_cdc_generation.
2023-09-08 09:05:01 +02:00
Patryk Jędrzejczak
3a2c080cbe test: move cdc_streams_check_and_repair check
The part of test_topology_ops that tests the
cdc_streams_check_and_repair request could (at least in theory)
fail on
`assert(len(gen_timestamps) + 1 == len(new_gen_timestamps))`
after introducing the CDC generation publisher because we can
no longer assume that all previously committed CDC generations
have been published before sending the request.

To prevent flakiness, we move this part of the test to
test_cdc_generations_are_published. This test allows for ensuring
that all previous CDC generations have been published.
Additionally, checking cdc_streams_check_and_repair there is
simpler and arguably fits the test better.
2023-09-08 09:05:01 +02:00
Patryk Jędrzejczak
4ee68a47bb test: add test_cdc_generation_publishing
We add two test cases that test the new CDC generation publisher
to detect potential bugs like incorrect order of publications or
not publishing some generations at all.

The purpose of the second test case --
test_multiple_unpublished_cdc_generations -- is to enforce and test
a scenario when there are multiple unpublished CDC generations at
the same time. We expect that this is a rare case. The main fiber
of the topology coordinator would have to make much more progress
(like finishing two bootstraps) than the CDC generation publisher
fiber. Since multiple unpublished CDC generations might never
appear in other tests but could be handled incorrectly, having
such a test is valuable.
2023-09-08 09:05:01 +02:00
Patryk Jędrzejczak
2643ccc70e docs: remove information about publish_cdc_generation
We update documentation after replacing the
topology::transition_state::publish_cdc_generation state with
the CDC generation publisher fiber.
2023-09-08 09:05:01 +02:00
Patryk Jędrzejczak
fc1ee2cc14 raft topology: introduce the CDC generation publisher
Currently, the topology coordinator has the
topology::transition_state::publish_cdc_generation state
responsible for publishing the already created CDC generations
to the user-facing description tables. This process cannot fail
as it would cause some CDC updates to be missed. On the other
hand, we would like to abort the publish_cdc_generation state when
bootstrap aborts. Of course, we could also wait until handling this
state finishes, even in the case of the bootstrap abort, but that
would be inefficient. We don't want to unnecessarily block topology
operations by publishing CDC generations.

The solution is to remove the publish_cdc_generation state
completely and introduce a new background fiber of the topology
coordinator -- cdc_generation_publisher -- that continually
publishes committed CDC generations.

The implementation of the CDC generation publisher is very similar
to the main fiber of the topology coordinator. One noticeable
difference is that we don't catch raft::commit_status_unknown,
which is handled by raft_group0_client::add_entry.

Note that this modification changes the Raft-based topology a bit.
Previously, the publish_cdc_generation state had to end before
entering the next state -- write_both_read_old. Now, committed
CDC generations can theoretically be published at any time.
Although it is correct because the following states don't depend on
publish_cdc_generation, it can cause problems in tests. For example,
we can't assume now that a CDC generation is published just because
the bootstrap operation has finished.
2023-09-08 09:05:01 +02:00
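The behaviour described in fc1ee2cc14 above, a background fiber that keeps publishing committed generations while the main fiber moves on, can be modelled roughly with asyncio. This is a sketch under stated assumptions: the queue, the `published` list and the generation names are hypothetical, and appending to a list stands in for writing to the user-facing description tables.

```python
import asyncio


async def cdc_generation_publisher(unpublished: asyncio.Queue, published: list):
    # Background "fiber": publish committed generations in commit order.
    while True:
        gen = await unpublished.get()
        published.append(gen)  # stands in for the description-table write
        unpublished.task_done()


async def demo():
    unpublished = asyncio.Queue()
    published = []
    publisher = asyncio.create_task(
        cdc_generation_publisher(unpublished, published))

    # The "main fiber" commits two generations and does not wait
    # for either of them to be published.
    for gen in ("gen-1", "gen-2"):
        unpublished.put_nowait(gen)
    seen_right_after_commit = list(published)

    # A test that needs the generations must wait explicitly.
    await unpublished.join()
    publisher.cancel()
    try:
        await publisher
    except asyncio.CancelledError:
        pass
    return seen_right_after_commit, published
```

The snapshot taken right after committing is empty, which mirrors the note that a test can no longer assume a generation is published just because the operation that created it has finished.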
Patryk Jędrzejczak
d404443b54 system_keyspace: load unpublished_cdc_generations to topology
We extend service::topology with the list of unpublished CDC
generations and load its contents from system.topology. This step
is the last one in making unpublished CDC generations accessible
to the topology coordinator.

Note that when we load unpublished_cdc_generations, we don't
perform any sanity checks contrary to current_cdc_generation_uuid.
Every unpublished CDC generation was a current generation once,
and we checked it at that moment.
2023-09-08 09:05:01 +02:00
Patryk Jędrzejczak
bc726a066f raft topology: mark committed CDC generations as unpublished
We add committed CDC generations to unpublished_cdc_generations
so that we can load them to topology and properly handle them
in the following commits.
2023-09-08 09:05:01 +02:00
Patryk Jędrzejczak
5ed9d4db6d raft topology: add unpublished_cdc_generations to system.topology
In the following commits, we replace the
topology::transition_state::publish_cdc_generation state with
a background fiber that continually publishes committed CDC
generations. To make these generations accessible to the
topology coordinator, we store them in the new column of
system.topology -- unpublished_cdc_generations.
2023-09-08 09:05:01 +02:00
Israel Fruchter
3d082acd29 Update tools/cqlsh submodule
* tools/cqlsh 2254e920...66ae7eac (5):
  > switch from `ssl_options` to `ssl_context`
  > cqlsh should use cql v4 by default when connecting #44
  > Revert "Skip pp38-macosx wheel builds"
  > update to newer cibuildwheel
  > Skip pp38-macosx wheel builds

Closes #15308
2023-09-07 22:48:37 +03:00
Aleksandra Martyniuk
8a65477202 tasks: db: change default task_ttl value
If a test isn't going to use task manager or isn't interested in
statuses of finished tasks, then keeping them in the memory
for some time (currently 10s by default) after they are finished
is a memory waste.

Set default task_ttl value to zero. It can be changed by setting
--task-ttl-in-seconds or through the REST API (/task_manager/ttl).

In conf/scylla.yaml set task-ttl-in-seconds to 10.

Closes #15239
2023-09-07 12:42:29 +03:00
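As a rough model of what the task_ttl default in 8a65477202 above controls: finished tasks are kept around for task_ttl seconds, and with a TTL of zero nothing is retained. The `TaskRegistry` class below is hypothetical and takes the current time as an argument to keep the example deterministic; it is not the task manager's implementation.

```python
class TaskRegistry:
    """Hypothetical registry keeping finished tasks for task_ttl seconds."""

    def __init__(self, task_ttl: float = 0.0):
        self.task_ttl = task_ttl
        self._finished = {}  # task id -> time at which the task finished

    def finish(self, task_id: str, now: float) -> None:
        # With a zero TTL a finished task is not kept in memory at all.
        if self.task_ttl > 0:
            self._finished[task_id] = now

    def finished_tasks(self, now: float) -> list[str]:
        # Drop entries whose TTL has expired, then list what is left.
        self._finished = {
            task_id: ts for task_id, ts in self._finished.items()
            if now - ts < self.task_ttl
        }
        return sorted(self._finished)


default_registry = TaskRegistry()            # task_ttl defaults to zero
default_registry.finish("repair-1", now=0.0)

configured = TaskRegistry(task_ttl=10.0)     # e.g. a value set in a config file
configured.finish("repair-1", now=0.0)
```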
Nadav Har'El
42e26ab13b Merge 'Explicitly use do_with_cql_env_thread in query test' from Pavel Emelyanov
Some tests use non-threaded do_with_cql_env() and wrap the inner lambda with seastar::async(). The cql env already provides a helper for that.

Closes #15305

* github.com:scylladb/scylladb:
  cql_query_test: Fix indentation after previous patch
  cql_query_test: Use do_with_cql_env_thread() explicitly
2023-09-07 11:54:54 +03:00
Benny Halevy
c5e4dace8e gossiper: real_mark_alive: do not erase from unreachable_endpoints without holding lock
This code was supposed to be moved into
`mutate_live_and_unreachable_endpoints`
in 2c27297dbd
but it looks like the original statements were left
in place outside the mutate function.

This patch just removes the stale code since the required
logic is already done inside `mutate_live_and_unreachable_endpoints`.

Fixes scylladb/scylladb#15296

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15304
2023-09-07 10:02:49 +02:00
Nadav Har'El
c52e0fd333 test/alternator: avoid warnings about unverified HTTPS
The Alternator tests can run against HTTPS - namely when using
test/alternator/run with the "--https" option (local Alternator
configured with HTTPS) or "--aws" option (DynamoDB, using HTTPS).

In some cases we make these HTTPS requests with verify=False, to avoid
checking the SSL certificates. E.g., this is necessary for Alternator
with a self-signed certificate. Unfortunately, the urllib3 library adds
an ugly warning message when SSL certificate verification is disabled.

In the past we tried to disable these warnings, using the documented
urllib3.disable_warnings() function, but it didn't help. It turns out
that pytest has its own warning handling, so to disable warnings in
pytest we must say so in a special configuration parameter in pytest.ini.

So in this patch, we drop the disable_warnings call from conftest.py
(where it didn't help), and instead put a similar declaration in
pytest.ini. The disable_warnings call in the test/alternator/run
script needs to remain - it is run outside pytest, so pytest.ini
doesn't affect it.

After this patch, running test/alternator/run with --https or --aws
finishes without warnings, as desired.

Fixes #15287

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #15292
2023-09-07 07:23:57 +03:00
Tomasz Grabiec
dd57c53328 Merge 'Topology: use this host_id in is_configured_this_node' from Benny Halevy
Since 5d1f60439a we have
this node's host_id in topology config, so it can be used
to determine this node when adding it.

Prepare for extending the token_metadata interface
to provide host_id in update_topology.

We would like to compare the host_id first to be able to distinguish
this node from a node we're replacing that may have the same ip address
(but different host_id).

Closes #15297

* github.com:scylladb/scylladb:
  locator: topology: is_configured_this_node: delete spurious semicolumn
  locator: topology: is_configured_this_node: compare host_id first
2023-09-06 22:13:29 +02:00
Pavel Emelyanov
9da4668c71 cql_query_test: Fix indentation after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-06 16:54:25 +03:00
Pavel Emelyanov
84e30ab56c cql_query_test: Use do_with_cql_env_thread() explicitly
Some tests use non-threaded do_with_cql_env() and wrap the inner lambda
with seastar::async(). The cql env already provides a helper for that.

Indentation is deliberately left broken until next patch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-06 16:54:14 +03:00
Kefu Chai
1ed894170c sstables: throw at seeing invalid chunk_len
before this change, when running into a zero chunk_len, scylla
crashes with `assert(chunk_size != 0)`. but we can do better than
printing a backtrace like:
```
scylla: sstables/compress.cc:158: void
sstables::compression::segmented_offsets::init(uint32_t): Assertion `chunk_size != 0' failed.
```
so, in this change, a `malformed_sstable_exception` is thrown in place
of the `assert()`, which is supposed to verify programming
invariants, not to identify a corrupted data file.

Fixes #15265
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15264
2023-09-06 14:20:38 +03:00
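The distinction drawn in 1ed894170c above, assertions for programming invariants and exceptions for bad input data, can be illustrated like this. `MalformedSSTableError` and `chunk_count` are hypothetical names used only for the sketch; the real change is in the C++ file sstables/compress.cc.

```python
class MalformedSSTableError(Exception):
    """Hypothetical counterpart of malformed_sstable_exception."""


def chunk_count(data_len: int, chunk_len: int) -> int:
    # A zero chunk_len comes from the file on disk, so it signals corrupted
    # data rather than a programming error: raise instead of asserting.
    if chunk_len == 0:
        raise MalformedSSTableError("chunk_len is zero, file may be corrupted")
    # Round up so that a trailing partial chunk is counted.
    return (data_len + chunk_len - 1) // chunk_len


def try_chunk_count(data_len: int, chunk_len: int) -> str:
    try:
        return str(chunk_count(data_len, chunk_len))
    except MalformedSSTableError as e:
        return f"error: {e}"
```

A caller can catch the exception and report the bad file, which is not possible once an assertion has aborted the process.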
Nadav Har'El
5930637ad8 Merge 'task_manager: module: make_task: enter gate when the task is created' from Benny Halevy
Passing the gate_closed_exception to the task promise
ends up with an abandoned exception since no one is waiting
for it.

Instead, enter the gate when the task is made
so it will fail make_task if the gate is already closed.

Fixes scylladb/scylladb#15211

In addition, this series adds a private abort_source for each task_manager module
(chained to the main task_manager::abort_source) and abort is requested on task_manager::module::stop().

Gate holding in compaction_manager is hardened, and the series
makes sure to stop compaction_manager and task_manager in sstable_compaction_test cases.

Closes #15213

* github.com:scylladb/scylladb:
  compaction_manager: stop: close compaction_state:s gates
  compaction_manager: gracefully handle gate close
  task_manager: task: start: fixup indentation
  task_manager: module: make_task: enter gate when the task is created
  task_manaer: module: stop: request abort
  task_manager: task::impl: subscribe to module about_source
  test: compaction_manager_stop_and_drain_race_test: stop compaction and task managers
  test: simple_backlog_controller_test: stop compaction and task managers
2023-09-06 13:29:26 +03:00
Benny Halevy
574c7e349a locator: topology: is_configured_this_node: delete spurious semicolumn
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-06 12:24:09 +03:00
Benny Halevy
115462be17 locator: topology: is_configured_this_node: compare host_id first
Since 5d1f60439a we have
this node's host_id in topology config, so it can be used
to determine this node when adding it.

Prepare for extending the token_metadata interface
to provide host_id in update_topology.

We would like to compare the host_id first to be able to distinguish
this node from a node we're replacing that may have the same ip address
(but different host_id).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-06 12:24:09 +03:00
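The motivation in 115462be17 above, telling this node apart from a node being replaced that has the same IP address, can be sketched as follows. The `Node` and `TopologyConfig` types are hypothetical, and the fallback to comparing endpoints when a host_id is missing is an assumption made for the example, not something stated in the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    host_id: str   # empty string means "not known"
    endpoint: str


@dataclass(frozen=True)
class TopologyConfig:
    this_host_id: str
    this_endpoint: str


def is_configured_this_node(cfg: TopologyConfig, node: Node) -> bool:
    # Compare host_id first: a replaced node may share our IP address
    # but never our host_id.
    if cfg.this_host_id and node.host_id:
        return cfg.this_host_id == node.host_id
    # Assumed fallback when no host_id is available: compare endpoints.
    return cfg.this_endpoint == node.endpoint


cfg = TopologyConfig(this_host_id="new-id", this_endpoint="10.0.0.1")
this_node = Node(host_id="new-id", endpoint="10.0.0.1")
replaced_node = Node(host_id="old-id", endpoint="10.0.0.1")
```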
Nadav Har'El
cfc70810d3 test/alternator: more error-path tests for list_append() function
Improved the coverage of the tests for the list_append() function
in UpdateExpression - test that if one of its arguments is not a list,
including a missing attribute or item, it is reported as an error as
expected.

The new tests pass on both Alternator and DynamoDB.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #15291
2023-09-06 11:59:54 +03:00
Avi Kivity
f594175042 Merge 'build: extract generate_compdb() out' from Kefu Chai
instead of flattening the functions into the script, let's structure them into functions, so they can be reused and are more maintainable.

Refs #15241
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15242

* github.com:scylladb/scylladb:
  build: early return when appropriate
  build: extract generate_compdb() out
2023-09-05 20:54:06 +03:00
Dawid Medrek
c7fe5d7f94 utils/lister: Limit the API of scan_dir() to fs::path
Right now, the function allows for passing the path to a file as a seastar::sstring,
which is then converted to std::filesystem::path -- implicitly to the caller.
However, the function performs I/O, and there is no reason to accept any other type
than std::filesystem::path, especially because the conversion is straightforward.
Callers can perform it on their own.

This commit introduces the more constrained API.

Closes #15266
2023-09-05 20:50:42 +03:00
Nadav Har'El
1cbe60a7e3 Update seastar submodule
* seastar 6e80e84a...576ee47d (9):
  > http/client: Add "total new connections" metrics
  > semaphore: initialize wait_list in move ctor

Fixes #15253
Fixes #15263

  > tutorial: Add a missing argument in code example
  > sstring: format sstring without implicitly conversion
  > coroutine: Add a necessary include in generator.hh
  > tls: Move server name into tls_options
  > net/arp|ip: fix unused param warning in forward virtual method
  > net/ethernet: fix unused param ethernet_address::adjust_endianness
  > tls: Optionally skip client EOF wait

Closes #15273
2023-09-05 17:07:08 +03:00
Aleksandra Martyniuk
c96224e97d test: topology: add gossiper test
Add tests for gossiper/endpoint/live and gossiper/endpoint/down
which run only in release mode.
2023-09-05 15:04:26 +02:00
Aleksandra Martyniuk
ede8182dd4 test: fix types and variable names in wait_for_host_down
Fix types and variable names in ManagerClient::wait_for_host_down
and related methods.
2023-09-05 15:01:59 +02:00
Pavel Emelyanov
1ef4ba196b Merge 'Gossiper: mark const methods and remove dead code' from Benny Halevy
This series cleans up gossiper.
Methods that do not change the gossiper object are marked as const.
Dead code is removed.

Closes #15272

* github.com:scylladb/scylladb:
  gossiper: get_current* methods: mark as const
  gossiper: get_generation_for_nodes: mark as const
  gossiper: examine_gossiper: mark as const
  gossiper: request_all, send_all: mark as const
  gossiper: do_on_*notifications: mark as const
  utils: atomic_vector: mark for_each functions as const
  gossiper: compare_endpoint_startup: mark as const
  gossiper: get_state_for_version_bigger_than: mark as const
  gossiper: make_random_gossip_digest: delete dead legacy code
  gossiper: make_random_gossip_digest: mark as const
  gossiper: do_sort: mark as const
  gossiper: is* methods: mark as const
  gossiper: wait_for_gossip and friends: mark as const
  gossiper: drop unused dump_endpoint_state_map
  gossiper: remove unused shadow version members
2023-09-05 13:47:29 +03:00
Kefu Chai
f6cca741ea config: remove "experimental" option
"experimental" option was marked "Unused" in 64bc8d2f7d. but we
chose to keep it in hope that the upgrade test does not fail.
although the upgrade tests per se survived the "upgrade",
the tests exercising the experimental features are still
failing hard after it. they have not been updated to set the
"experimental-features" option, and are still relying on
"experimental" to enable all the experimental features under
test.

so, in this change, let's just drop the option so that
scylla can fail early at seeing this "experimental" option.
this should help us to identify the tests relying on it
quicker. as the "experimental" features should only be used
in development environments, this change should have no impact
on production.

Refs #15214
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15233
2023-09-05 10:09:04 +03:00
Benny Halevy
cfecb68245 compaction_manager: stop: close compaction_state:s gates
Make sure the compaction_state:s are idle before
they are destroyed. Although all tasks are stopped
in stop_ongoing_compactions, make sure there is
no fiber holding the compaction_state gate.

compaction_manager::remove now needs to close the
compaction_state gate and to stop_ongoing_compactions
only if the gate is not closed yet.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
Benny Halevy
96055414c7 compaction_manager: gracefully handle gate close
Check if the compaction_state gate is closed
along with _state != state::enabled and return early
in this case.

At this point entering the gate is guaranteed to succeed.
So enter the gate before calling `perform_compaction`
keeping the std::optional<gate_holder> throughout
the compaction task.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
Benny Halevy
a5b7f1a275 task_manager: task: start: fixup indentation
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
Benny Halevy
f9a7635390 task_manager: module: make_task: enter gate when the task is created
Passing the gate_closed_exception to the task promise in start()
ends up with an abandoned exception since no one is waiting
for it.

Instead, enter the gate when the task is made
so it will fail make_task if the gate is already closed.

Fixes scylladb/scylladb#15211

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
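The idea in f9a7635390 above, entering the gate when the task is made so that a closed gate fails make_task directly, can be modelled with a toy gate. All classes here are hypothetical simplifications written for this sketch; they are not Seastar's gate or the task manager's actual types.

```python
class GateClosedError(Exception):
    """Hypothetical counterpart of gate_closed_exception."""


class Gate:
    def __init__(self):
        self.closed = False
        self.holders = 0

    def enter(self) -> None:
        if self.closed:
            raise GateClosedError("gate closed")
        self.holders += 1

    def leave(self) -> None:
        self.holders -= 1

    def close(self) -> None:
        self.closed = True


class Task:
    def __init__(self, name: str, gate: Gate):
        self.name = name
        self._gate = gate

    def done(self) -> None:
        # The task holds the gate for its whole lifetime.
        self._gate.leave()


class Module:
    def __init__(self):
        self.gate = Gate()

    def make_task(self, name: str) -> Task:
        # Entering here means a closed gate raises to the caller of
        # make_task, who is in a position to handle the error.
        self.gate.enter()
        return Task(name, self.gate)


module = Module()
task = module.make_task("compaction")
holders_while_running = module.gate.holders
task.done()
module.gate.close()
try:
    module.make_task("late task")
    late_task_failed = False
except GateClosedError:
    late_task_failed = True
```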
Benny Halevy
51792d2292 task_manaer: module: stop: request abort
Have a private abort_source for every module
and request abort on stop() to signal all outstanding
tasks to abort (especially when they are sleeping
for the task_ttl).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
Benny Halevy
d7205db863 task_manager: task::impl: subscribe to module about_source
Rather than to the top-level task_manager abort_source,
to provide separation between task_manager modules
so each one can be aborted and stopped independently
of the others (in the next patch).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
Benny Halevy
062684eb1f test: compaction_manager_stop_and_drain_race_test: stop compaction and task managers
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
Benny Halevy
b9127f55ac test: simple_backlog_controller_test: stop compaction and task managers
The compaction_manager and task_manager should
be orderly stopped before they are destroyed.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-05 09:17:25 +03:00
Pavel Emelyanov
13a0c29618 storage_service: Remove query processor arg from join_cluster()
Since d42685d0cb, storage_service holds a query processor pointer
and can use it to join the cluster

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #15236
2023-09-05 07:30:37 +03:00