Commit Graph

38796 Commits

Author SHA1 Message Date
Kefu Chai
88a7bf2853 build: cmake: add targets for building deb and rpm packages
Refs #15241

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-14 13:00:04 +08:00
Kefu Chai
93faac0a0c build: cmake: correct the paths used when building unstripped pkg
in a0dcbb09c3, the newly introduced unstripped package does not build
at all. it was using the wrong paths. so, let's correct them.

Refs #15241

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-14 13:00:04 +08:00
Tomasz Grabiec
c27d212f4b api, storage_service: Recalculate table digests on relocal_schema api call
Currently, the API call recalculates only the per-node schema version. To
work around issues like #4485 we want to recalculate per-table
digests. One way to do that is to restart the node, but that's slow
and has an impact on availability.

Use like this:

  curl -X POST http://127.0.0.1:10000/storage_service/relocal_schema

Fixes #15380

Closes #15381
2023-09-13 18:27:57 +03:00
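
The curl invocation from c27d212f4b can also be issued from a script. The following is a minimal Python sketch, assuming the endpoint and default address shown in the commit message; the helper names `build_relocal_schema_request` and `trigger_relocal_schema` are made up for illustration and are not part of Scylla.

```python
import urllib.request


def build_relocal_schema_request(host: str = "127.0.0.1", port: int = 10000) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the commit message."""
    url = f"http://{host}:{port}/storage_service/relocal_schema"
    # method="POST" mirrors `curl -X POST`; no request body is sent.
    return urllib.request.Request(url, method="POST")


def trigger_relocal_schema(host: str = "127.0.0.1", port: int = 10000, timeout: float = 10.0) -> int:
    """Send the request to a running node and return the HTTP status code."""
    request = build_relocal_schema_request(host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `trigger_relocal_schema()` needs a node listening on the given address, so only the request construction can be checked offline.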
Avi Kivity
0a5d9532f9 Merge 'Sanitize batchlog manager start/stop' from Pavel Emelyanov
This code is now spread over main and differs in cql_test_env. The PR unifies both places and makes the manager start-stop look standard.

refs: #2795

Closes #15375

* github.com:scylladb/scylladb:
  batchlog_manager: Remove start() method
  batchlog_manager: Start replay loop in constructor
  main, cql_test_env: Start-stop batchlog manager in one "block"
  batchlog_manager: Move shard-0 check into batchlog_replay_loop()
  batchlog_manager: Fix drain() reentrability
2023-09-13 18:20:56 +03:00
Pavel Emelyanov
f9b09d4549 migration_manager: Register RPC verbs on start
There's a dedicated call to register migration manager's verbs somewhere
in the middle of main. However, until messaging service listening starts
it makes no difference when to register verbs.

This patch moves the verbs registration into the migration manager
constructor, so it is called by sharded<migration_manager>::start().

Unregistration happens in migration_manager::drain() and it's not
touched here.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #15367
2023-09-13 17:32:51 +03:00
Pavel Emelyanov
9dea26aa03 storage_service: Remove proxy arg from init_messaging_service_part()
It's only used to be carried along down to a handler and get
sharded<database> from. Storage service itself can provide it, and the
handler in question already uses it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #15368
2023-09-13 17:11:33 +03:00
Raphael S. Carvalho
c53b8fb1b5 storage_service: initialize group0 in ctor
there are a couple of places that check that group0 is not nullptr,
so let's set it to nullptr in the ctor, so that shards that don't
have it initialized will hit the assert, instead of failing
with a cryptic segfault error.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #15330
2023-09-13 14:51:24 +02:00
Botond Dénes
50e3448527 Merge 'unified: add --build-dir option and respect --pkgs' from Kefu Chai
in this series, `unified/build_unified.sh` is improved in a couple of respects:

1. add `--build-dir` option, so we don't hardwire the build directory to the naming convention of `build/$mode`.
2. `--pkgs` is respected. this allows the caller to specify the paths to the dist tarballs instead of hardwiring to the paths defined in this script

these changes give us more flexibility when building the unified package, and enable us to switch over to the CMake based building system.

Refs #15241

Closes #15377

* github.com:scylladb/scylladb:
  unified: respect --pkgs option
  unified: allow passing --pkgs with a semicolon-separated list
  unified: prefer SCYLLA-PRODUCT-FILE in build_dir
  unified: derive UNIFIED_PKG from --build-dir
  unified: add --build-dir option to build_unified.sh
2023-09-13 15:30:57 +03:00
Kamil Braun
a184b07cbb Merge 'raft topology: make CDC_GENERATIONS_V3 single-partition, timeuuid-sorted' from Patryk Jędrzejczak
We make the `CDC_GENERATIONS_V3` table single-partition and change the
clustering key from `range_end` to `(id, range_end)`. We also change the
type of `id` to `timeuuid` and ensure that a new generation always has
the highest `id`. These changes allow efficient clearing of obsolete CDC
generation data, which we need to prevent Raft-topology snapshots from
endlessly growing as we introduce new generations over time.

All this code is protected by an experimental feature flag. It includes
the definition of `CDC_GENERATIONS_V3`. The table is not created unless
the feature flag is enabled.

Fixes #15163

Closes #15319

* github.com:scylladb/scylladb:
  system_keyspace: rename cdc_generation_id_v2
  system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3
  cdc: generation: remove topology_description_generator
  cdc: do not create uuid in make_new_generation_data
  system_kayspace: make CDC_GENERATIONS_V3 single-partition
  cdc: generation: introduce get_common_cdc_generation_mutations
  cdc: generation: rename get_cdc_generation_mutations
2023-09-13 12:54:49 +02:00
Kefu Chai
bbb6e4f822 docs: s/tar xvfz tar/tar xvfz/ in command line sample
should not pass "tar" to tar, otherwise we'd have the following error:
```
tar (child): tar: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
```
as "tar" is not the compressed tarball we want to untar.

Fixes #15328
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15383
2023-09-13 13:37:38 +03:00
Piotr Dulikowski
66206207f9 gossiper: properly acquire lock_endpoint_update_semaphore in reset_endpoint_state_map
The `gossiper::reset_endpoint_state_map` function is supposed to acquire
a lock in order to serialize with `replicate_live_endpoints_on_change`.
The `lock_endpoint_update_semaphore` is called, but its result is a
future - and it is not co_awaited. Therefore, the lock has no effect.

This commit fixes the issue by adding missing co_await.

Fixes: #15361

Closes #15362
2023-09-13 10:03:47 +02:00
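
The bug class fixed in 66206207f9 can be illustrated with a loose Python `asyncio` analogy. This is not Seastar code, and the function names are made up for illustration. In both settings the lock call returns an awaitable, and if nothing waits on it, the code that follows runs without holding the lock.

```python
import asyncio


async def buggy_critical_section() -> bool:
    """Calls acquire() but never awaits it, so the lock is never taken."""
    lock = asyncio.Lock()
    pending = lock.acquire()   # only creates a coroutine object
    held = lock.locked()       # the "critical section" runs unprotected
    pending.close()            # discard the coroutine without running it
    return held


async def fixed_critical_section() -> bool:
    """Awaits acquire(), so the lock is held inside the critical section."""
    lock = asyncio.Lock()
    await lock.acquire()
    try:
        return lock.locked()
    finally:
        lock.release()


buggy_held = asyncio.run(buggy_critical_section())
fixed_held = asyncio.run(fixed_critical_section())
print(f"lock held without await: {buggy_held}, with await: {fixed_held}")
```

The analogy covers only the shape of the mistake; the actual fix in the commit is the added `co_await` in `gossiper::reset_endpoint_state_map`.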
Botond Dénes
7e7101c180 Revert "Merge 'database, storage_proxy: Reconcile pages with dead rows and partitions incrementally' from Botond Dénes"
This reverts commit 628e6ffd33, reversing
changes made to 45ec76cfbf.

The test included with this PR is flaky and often breaks CI.
Revert while a fix is found.

Fixes: #15371
2023-09-13 10:45:37 +03:00
Avi Kivity
2c810e221a Merge 'Gossiper: replace seastar threads with coroutines' from Benny Halevy
Many of the gossiper internal functions currently use seastar threads for historical reasons,
but since they are short living, the cost of spawning a seastar thread for them is excessive
and they can be simplified and made more efficient using coroutines.

Closes #15364

* github.com:scylladb/scylladb:
  gossiper: reindent do_stop_gossiping
  gossiper: coroutinize do_stop_gossiping
  gossiper: reindent assassinate_endpoint
  gossiper: coroutinize assassinate_endpoint
  gossiper: coroutinize handle_ack2_msg
  gossiper: handle_ack_msg: always log warning on exception
  gossiper: reindent handle_ack_msg
  gossiper: coroutinize handle_ack_msg
  gossiper: reindent handle_syn_msg
  gossiper: coroutinize handle_syn_msg
  gossiper: message handlers: no need to capture shared_from_this
  gossiper: add_local_application_state: throw internal error if endpoint state is not found
  gossiper: coroutinize add_local_application_state
2023-09-12 21:50:52 +03:00
Benny Halevy
47dc287efd gossiper: reindent do_stop_gossiping
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:33:09 +03:00
Benny Halevy
8fa65ed016 gossiper: coroutinize do_stop_gossiping
Simplify the function.  It does not need to spawn
a seastar thread.

While at it, declare it as private since it's called
only internally by the gossiper (and on shard 0).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:33:09 +03:00
Benny Halevy
a792babbda gossiper: reindent assassinate_endpoint
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:33:09 +03:00
Benny Halevy
5dbc168c03 gossiper: coroutinize assassinate_endpoint
It has no need to spawn a seastar thread.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:33:09 +03:00
Benny Halevy
29b9596050 gossiper: coroutinize handle_ack2_msg
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:33:09 +03:00
Benny Halevy
cc030a5040 gossiper: handle_ack_msg: always log warning on exception
Unlike handle_syn_msg, the warning is currently printed only
`if (_ack_handlers.contains(from.addr))`.
Unclear why. It is interesting in any case.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:32:40 +03:00
Benny Halevy
990ac23d19 gossiper: reindent handle_ack_msg
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:27:08 +03:00
Benny Halevy
2ca2118130 gossiper: coroutinize handle_ack_msg
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:26:03 +03:00
Benny Halevy
8c065bf023 gossiper: reindent handle_syn_msg
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:24:14 +03:00
Benny Halevy
264f4daded gossiper: coroutinize handle_syn_msg
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:23:09 +03:00
Benny Halevy
63ab5f1ab3 gossiper: message handlers: no need to capture shared_from_this
The handlers future is waited on under `background_msg`
which is closed in gossiper::stop so the instance is
already guaranteed to be kept valid.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:21:07 +03:00
Benny Halevy
8bfec81985 gossiper: add_local_application_state: throw internal error if endpoint state is not found
If the function is called too early, the first get_endpoint_state_ptr
would throw an exception that is later caught and degraded
into a warning.

But that endpoint_state should never disappear after yielding,
so call on_internal_error in that case.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:21:07 +03:00
Benny Halevy
d1c67300d4 gossiper: coroutinize add_local_application_state
There is no need for it to spawn a seastar thread.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-09-12 19:20:41 +03:00
Kefu Chai
75f458f2a5 unified: respect --pkgs option
let's provide the default value, only if user does not specify --pkgs.
otherwise the --pkgs option is always ignored.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 22:56:10 +08:00
Kefu Chai
84387e3856 unified: allow passing --pkgs with a semicolon-separated list
simpler than passing a space-separated list requiring escaping, which
is a source of headache.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 22:56:10 +08:00
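
The option handling described in 75f458f2a5 and 84387e3856 (use the default only when `--pkgs` is absent, and accept a semicolon-separated list) can be sketched as follows. The real script is a shell script; this Python version only mirrors the logic, and the entries in `DEFAULT_PKGS` are hypothetical placeholders rather than the paths the script actually uses.

```python
import argparse
from typing import List, Sequence

# Hypothetical placeholder defaults; the real script defines its own paths.
DEFAULT_PKGS = ["example-a.tar.gz", "example-b.tar.gz"]


def resolve_pkgs(argv: Sequence[str]) -> List[str]:
    """Return the package list, falling back to defaults only if --pkgs is absent."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--pkgs", default=None)
    args = parser.parse_args(argv)
    if args.pkgs is None:
        # The default applies only when the user did not pass --pkgs.
        return list(DEFAULT_PKGS)
    # Semicolons separate entries, so no shell escaping of spaces is needed.
    return [pkg for pkg in args.pkgs.split(";") if pkg]
```

With this shape, `resolve_pkgs(["--pkgs", "a.tar.gz;b.tar.gz"])` yields the two user-supplied paths and never touches the defaults.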
Kefu Chai
d9dcda9dd5 unified: prefer SCYLLA-PRODUCT-FILE in build_dir
unlike `configure.py`, the building system created by CMake does not
share the `SCYLLA-PRODUCT-FILE` across different builds. so we cannot
assume that build/SCYLLA-PRODUCT-FILE exists.

so, in this change, we check $BUILD_DIR/SCYLLA-PRODUCT-FILE first,
and fall back to $BUILD_DIR/../SCYLLA-PRODUCT-FILE. this should work
for both configure.py and CMake building system.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 22:56:10 +08:00
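
The lookup order described in d9dcda9dd5 can be sketched in Python as below. The real logic lives in a shell script; the function name `find_product_file` is made up for illustration, and only the two candidate locations come from the commit message.

```python
from pathlib import Path
from typing import Optional

PRODUCT_FILE = "SCYLLA-PRODUCT-FILE"


def find_product_file(build_dir: Path) -> Optional[Path]:
    """Prefer $BUILD_DIR/SCYLLA-PRODUCT-FILE, then $BUILD_DIR/../SCYLLA-PRODUCT-FILE."""
    candidates = [
        build_dir / PRODUCT_FILE,          # per-build file (CMake layout)
        build_dir / ".." / PRODUCT_FILE,   # shared file (configure.py layout)
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    # Neither location exists; the caller decides how to report this.
    return None
```

Checking the build directory first means a per-build file wins whenever both exist, which matches the "prefer" wording in the commit title.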
Kefu Chai
fea3a11716 unified: derive UNIFIED_PKG from --build-dir
we should respect the --build-dir if --unified-pkg is not specified,
and deduce the path to unified pkg from BUILD_DIR.

so, in this change, we deduce the path to unified pkg from BUILD_DIR
unless --unified-pkg is specified.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 22:56:10 +08:00
Kefu Chai
4bb5af763b unified: add --build-dir option to build_unified.sh
this allows build_unified.sh to generate unified pkg in specified
directory, instead of assuming the naming convention of build/$mode.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 22:56:10 +08:00
Pavel Emelyanov
d48aff5789 batchlog_manager: Remove start() method
It's now a no-op, can be dropped.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:37:52 +03:00
Pavel Emelyanov
3966a50ed4 batchlog_manager: Start replay loop in constructor
... and sanitize the future used on stop.

The loop in question is now started in .start(), but all callers now
construct the manager late enough, so the loop spawning can be moved.
This also calls for renaming the future member of the class and allows
to make it regular, not shared, future.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:35:53 +03:00
Pavel Emelyanov
512465288f main, cql_test_env: Start-stop batchlog manager in one "block"
Currently starting and stopping of b.m. is spread over main(). Keep them
close to each other.

Another trickery here is that calling b.m.::start() can only be done
after joining the cluster, because this start() spawns replay loop
which, in turn calls token_metadata::count_normal_token_owners() and if
the latter returns zero, the b.m. code uses it as a fraction denominator
and crashes.

With the above in mind, cql_test_env should start batchlog manager after
it "joins the ring" too. For now it doesn't make any difference, but
next patch will make use of it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:33:31 +03:00
Pavel Emelyanov
9f45778467 batchlog_manager: Move shard-0 check into batchlog_replay_loop()
Currently the only caller of it is the batchlog manager itself. It
checks for the shard-id to be zero, calls the method, then the method
asserts that it's run on shard-0.

Moving the check into the method removes the need for assertion and
makes further patching simpler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:32:12 +03:00
Pavel Emelyanov
38d0ea0916 batchlog_manager: Fix drain() reentrability
Currently drain() is called twice -- first time from
storage_service::drain() (on shutdown), second via
batchlog_manager::stop(). The routine is unintentionally re-entrant,
because:
- explicit check for not aborting the abort source twice
- breaking semaphore can be done multiple times
- co-await-ing of the _started future works because the future is shared

That's not extremely elegant, better to make the drain() bail out early
if it was already called.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:30:07 +03:00
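
The "bail out early if it was already called" approach from 38d0ea0916 can be sketched in a few lines of Python. The class name `Drainable` and the `teardown_runs` counter are hypothetical and exist only to make the behaviour observable; they do not correspond to members of `batchlog_manager`.

```python
class Drainable:
    """Toy object whose drain() is safe to call more than once."""

    def __init__(self) -> None:
        self._drained = False
        self.teardown_runs = 0  # counts how often the real teardown ran

    def drain(self) -> None:
        if self._drained:
            # A second call (e.g. from stop() after an earlier shutdown
            # drain) returns immediately instead of repeating teardown.
            return
        self._drained = True
        self.teardown_runs += 1


manager = Drainable()
manager.drain()  # first call performs the teardown
manager.drain()  # second call is a no-op
print(manager.teardown_runs)
```

An explicit flag makes the double call intentional rather than relying on each teardown step happening to tolerate repetition.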
Avi Kivity
a1b2ca6184 Merge 'build: cmake: package cqlsh and fix the noarch postfix of python3 package' from Kefu Chai
in this series, the packaging of tools modules is improved:

- package cqlsh also. as cqlsh should be redistributed as a part of the unified package
- use ${arch} in the postfix of the python3 package. the python3 package is not architecture independent.
- set the version with tilde for `Scylla_VERSION`, so it can be reused elsewhere.

Refs #15241

Closes #15369

* github.com:scylladb/scylladb:
  build: cmake: build cqlsh as a submodule
  build: cmake: always use the version with tilde
  build: cmake: build python3 dist tarball with arch postfix
  build: cmake: use the default comment message
2023-09-12 16:27:03 +03:00
David Garcia
5177ddac17 Support advanced db config scenarios
docs: skip html tags from description

Closes #15338
2023-09-12 15:29:16 +03:00
Tomasz Grabiec
6e83e54b0d Merge 'gossiper: get rid of uses_host_id' from Benny Halevy
This function practically returned true from inception.

In d38deef499
it started using messaging_service().knows_version(endpoint)
that also returns `true` unconditionally, to this day

So there's no point calling it since we can assume
that `uses_host_id` is true for all versions.

Closes #15343

* github.com:scylladb/scylladb:
  storage_service: fixup indentation after last patch
  gossiper: get rid of uses_host_id
2023-09-12 12:44:56 +02:00
Kefu Chai
571fab4179 build: cmake: build cqlsh as a submodule
since we also redistribute cqlsh, let's package it as well.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 18:18:31 +08:00
Kefu Chai
4ff5ce9933 build: cmake: always use the version with tilde
since we always use tilde ("~") in the version number,
let's just cache it as an internal variable in CMake.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 18:18:31 +08:00
Kefu Chai
111d20958e build: cmake: build python3 dist tarball with arch postfix
now that `configure.py` always generates python3 dist tarball with
${arch} postfix, let's mirror this behavior. as `build_unified.sh`
uses this naming convention.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 18:18:31 +08:00
Kefu Chai
760b7c8772 build: cmake: use the default comment message
it turns out "Generating submodule python3 in python3" is not
as informative as default one:
"/home/kefu/dev/scylladb/tools/python3/build/scylla-python3-5.4.0~dev-0.20230908.1668d434e458.noarch.tar.gz"
so let's drop the "COMMENT" argument.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-12 18:18:31 +08:00
Patryk Jędrzejczak
92209996b5 system_keyspace: rename cdc_generation_id_v2
Changing the second value of cdc_generation_id_v2 from uuid_type
to timeuuid_type made the name of cdc_generation_id_v2 unsuitable
because it does not match cdc::generation_id_v2 anymore.
2023-09-12 11:43:34 +02:00
Patryk Jędrzejczak
1c58c6336a system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3
We change the type of IDs in CDC_GENERATIONS_V3 to timeuuid to
give them a time-based order. We also change how we initialize
them so that the new CDC generation always has the highest ID.
This is the last step to enabling the efficient clearing of
obsolete CDC generation data.

Additionally, we change the types of current_cdc_generation_uuid,
new_cdc_generation_data_uuid and the second values of the elements
in unpublished_cdc_generations to timeuuid, so that they match id
in CDC_GENERATIONS_V3.
2023-09-12 11:43:34 +02:00
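
Why a time-based ID helps with clearing obsolete data, as described in 1c58c6336a, can be shown with a small in-memory Python sketch. It builds version 1 (time-based) UUIDs from explicit timestamps and selects every generation older than the current one. The helper names, the fixed node value, and the sample timestamps are assumptions for illustration; this is not how Scylla creates or stores generation IDs.

```python
import uuid
from typing import Iterable, List


def timeuuid_from_timestamp(ts_100ns: int, node: int = 0x123456789ABC, clock_seq: int = 0) -> uuid.UUID:
    """Build a version 1 UUID whose 60-bit timestamp field equals ts_100ns."""
    time_low = ts_100ns & 0xFFFFFFFF
    time_mid = (ts_100ns >> 32) & 0xFFFF
    time_hi_version = ((ts_100ns >> 48) & 0x0FFF) | (1 << 12)   # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80     # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def obsolete_generations(ids: Iterable[uuid.UUID], current: uuid.UUID) -> List[uuid.UUID]:
    """Return, oldest first, every generation ID older than the current one."""
    older = [gen for gen in ids if gen.time < current.time]
    return sorted(older, key=lambda gen: gen.time)


generation_ids = [timeuuid_from_timestamp(ts) for ts in (300, 100, 200)]
newest = max(generation_ids, key=lambda gen: gen.time)
to_clear = obsolete_generations(generation_ids, newest)
```

Because the newest generation always carries the highest timestamp, everything to be cleared forms one contiguous prefix of the time-ordered IDs.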
Patryk Jędrzejczak
fab066cffe cdc: generation: remove topology_description_generator
After moving the creation of uuid out of
make_new_generation_description, this function only calls the
topology_description_generator's constructor and its generate
method. We could remove this function, but we instead simplify
the code by removing the topology_description_generator class.
We can do this refactor because make_new_generation_description
is the only place using it. We inline its generate method into
make_new_generation_description and turn its private methods into
static functions.
2023-09-12 11:18:54 +02:00
Patryk Jędrzejczak
3bf4cac72e cdc: do not create uuid in make_new_generation_data
In a future commit, we change how we initialize uuid of the
new CDC generation in the Raft-based topology. It forces us to
move this initialization out of the make_new_generation_data
function shared between Raft-based and gossiper-based topologies.

We also rename make_new_generation_data to
make_new_generation_description since it only returns
cdc::topology_description now.
2023-09-12 11:18:38 +02:00
Patryk Jędrzejczak
2cd430ac80 system_kayspace: make CDC_GENERATIONS_V3 single-partition
We make CDC_GENERATIONS_V3 single-partition by adding the key
column and changing the clustering key from range_end to
(id, range_end). This is the first step to enabling the efficient
clearing of obsolete CDC generation data, which we need to prevent
Raft-topology snapshots from endlessly growing as we introduce new
generations over time. The next step is to change the type of the id
column to timeuuid. We do it in the following commits.

After making CDC_GENERATIONS_V3 single-partition, there is no easy
way of preserving the num_ranges column. As it is used only for
sanity checking, we remove it to simplify the implementation.
2023-09-12 09:51:45 +02:00
Patryk Jędrzejczak
29f54836d0 cdc: generation: introduce get_common_cdc_generation_mutations
In the following commit, we implement the
get_cdc_generation_mutations_v3 function very similar to
get_cdc_generation_mutations_v2. The only differences in creating
mutations between CDC_GENERATIONS_V2 and CDC_GENERATIONS_V3 are:
- a need to set the num_ranges cell for CDC_GENERATIONS_V2,
- different partition keys,
- different clustering keys.

To avoid code duplication, we introduce
get_common_cdc_generation_mutations, which does most of the work
shared by both functions.
2023-09-12 09:37:21 +02:00
Botond Dénes
bc4b3e4fa3 Merge 'build: cmake: add packaging support ' from Kefu Chai
this change allows CMake to build the dist tarball for a certain build.

Refs https://github.com/scylladb/scylladb/issues/15241

Closes #15352

* github.com:scylladb/scylladb:
  build: cmake: add packaging support
  build: cmake: enable build of seastar/apps/iotune
2023-09-12 09:59:53 +03:00