in a0dcbb09c3, the newly introduced unstripped package does not build
at all. it was using the wrong paths. so, let's correct them.
Refs #15241
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Currently, the API call recalculates only per-node schema version. To
work around issues like #4485, we want to recalculate per-table
digests. One way to do that is to restart the node, but that's slow
and has impact on availability.
Use like this:
curl -X POST http://127.0.0.1:10000/storage_service/relocal_schema
Fixes #15380
Closes #15381
This code is now spread over main and differs in cql_test_env. The PR unifies both places and makes the manager start-stop look standard
refs: #2795
Closes #15375
* github.com:scylladb/scylladb:
batchlog_manager: Remove start() method
batchlog_manager: Start replay loop in constructor
main, cql_test_env: Start-stop batchlog manager in one "block"
batchlog_manager: Move shard-0 check into batchlog_replay_loop()
batchlog_manager: Fix drain() reentrability
There's a dedicated call to register migration manager's verbs somewhere
in the middle of main. However, until messaging service listening starts
it makes no difference when to register verbs.
This patch moves the verbs registration into mig. manager constructor
so that it is called by sharded<migration_manager>::start().
Unregistration happens in migration_manager::drain() and it's not
touched here.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #15367
It's only used to be carried along down to a handler and get
sharded<database> from. Storage service itself can provide it, and the
handler in question already uses it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #15368
there are a couple of places that check group is not nullptr,
so let's set it to nullptr on ctor, so shards that don't
have it initialized will bump on assert, instead of failing
with a cryptic segfault error.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes #15330
in this series, `unified/build_unified.sh` is improved in a couple of respects:
1. add `--build-dir` option, so we don't hardwire the build directory to the naming convention of `build/$mode`.
2. `--pkgs` is respected. this allows the caller to specify the paths to the dist tarballs instead of hardwiring to the paths defined in this script
these changes give us more flexibility when building the unified package, and enable us to switch over to the CMake-based building system.
Refs #15241
Closes #15377
* github.com:scylladb/scylladb:
unified: respect --pkgs option
unified: allow passing --pkgs with a semicolon-separated list
unified: prefer SCYLLA-PRODUCT-FILE in build_dir
unified: derive UNIFIED_PKG from --build-dir
unified: add --build-dir option to build_unified.sh
We make the `CDC_GENERATIONS_V3` table single-partition and change the
clustering key from `range_end` to `(id, range_end)`. We also change the
type of `id` to `timeuuid` and ensure that a new generation always has
the highest `id`. These changes allow efficient clearing of obsolete CDC
generation data, which we need to prevent Raft-topology snapshots from
endlessly growing as we introduce new generations over time.
All this code is protected by an experimental feature flag. It includes
the definition of `CDC_GENERATIONS_V3`. The table is not created unless
the feature flag is enabled.
Fixes #15163
Closes #15319
* github.com:scylladb/scylladb:
system_keyspace: rename cdc_generation_id_v2
system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3
cdc: generation: remove topology_description_generator
cdc: do not create uuid in make_new_generation_data
system_kayspace: make CDC_GENERATIONS_V3 single-partition
cdc: generation: introduce get_common_cdc_generation_mutations
cdc: generation: rename get_cdc_generation_mutations
should not pass "tar" to tar, otherwise we'd have the following error:
```
tar (child): tar: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
```
as "tar" is not the compressed tarball we want to untar.
Fixes #15328
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15383
The `gossiper::reset_endpoint_state_map` function is supposed to acquire
a lock in order to serialize with `replicate_live_endpoints_on_change`.
The `lock_endpoint_update_semaphore` is called, but its result is a
future - and it is not co_awaited. Therefore, the lock has no effect.
This commit fixes the issue by adding missing co_await.
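The effect of a missing await can be shown outside Seastar. The following is only an analogy in Python asyncio, not the gossiper code; `worker`, `run` and the log format are made up for the illustration. Creating the lock-acquiring awaitable without awaiting it takes no lock, so the two critical sections interleave:

```python
import asyncio


async def worker(lock, log, name, use_await):
    if use_await:
        # Correct: suspend until the lock is actually held.
        await lock.acquire()
    else:
        # Buggy analogue: the awaitable is created but never awaited,
        # so the lock is never taken. Close it to silence the warning.
        lock.acquire().close()
    try:
        log.append(f"{name}:enter")
        await asyncio.sleep(0)  # yield point inside the critical section
        log.append(f"{name}:exit")
    finally:
        if use_await:
            lock.release()


async def run(use_await):
    lock = asyncio.Lock()
    log = []
    await asyncio.gather(
        worker(lock, log, "a", use_await),
        worker(lock, log, "b", use_await),
    )
    return log


if __name__ == "__main__":
    print(asyncio.run(run(use_await=False)))  # sections interleave
    print(asyncio.run(run(use_await=True)))   # sections are serialized
```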
Fixes: #15361
Closes #15362
This reverts commit 628e6ffd33, reversing
changes made to 45ec76cfbf.
The test included with this PR is flaky and often breaks CI.
Revert while a fix is found.
Fixes: #15371
Many of the gossiper internal functions currently use seastar threads for historical reasons,
but since they are short living, the cost of spawning a seastar thread for them is excessive
and they can be simplified and made more efficient using coroutines.
Closes #15364
* github.com:scylladb/scylladb:
gossiper: reindent do_stop_gossiping
gossiper: coroutinize do_stop_gossiping
gossiper: reindent assassinate_endpoint
gossiper: coroutinize assassinate_endpoint
gossiper: coroutinize handle_ack2_msg
gossiper: handle_ack_msg: always log warning on exception
gossiper: reindent handle_ack_msg
gossiper: coroutinize handle_ack_msg
gossiper: reindent handle_syn_msg
gossiper: coroutinize handle_syn_msg
gossiper: message handlers: no need to capture shared_from_this
gossiper: add_local_application_state: throw internal error if endpoint state is not found
gossiper: coroutinize add_local_application_state
Simplify the function. It does not need to spawn
a seastar thread.
While at it, declare it as private since it's called
only internally by the gossiper (and on shard 0).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Unlike handle_syn_msg, the warning is currently printed only
`if (_ack_handlers.contains(from.addr))`.
Unclear why. It is interesting in any case.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The handlers future is waited on under `background_msg`
which is closed in gossiper::stop so the instance is
already guaranteed to be kept valid.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
If the function is called too early, the first get_endpoint_state_ptr
would throw an exception that is later caught and degraded
into a warning.
But that endpoint_state should never disappear after yielding,
so call on_internal_error in that case.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
let's provide the default value, only if user does not specify --pkgs.
otherwise the --pkgs option is always ignored.
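A minimal sketch of the intended behavior, written in Python rather than the script's shell. `resolve_pkgs` and its arguments are hypothetical names, and the semicolon splitting follows the "semicolon-separated list" wording of this series:

```python
def resolve_pkgs(pkgs_arg, default_pkgs):
    """Return the list of dist tarballs to bundle.

    pkgs_arg is the raw value of --pkgs (None or "" when not given);
    default_pkgs is the built-in list, used only as a fallback.
    """
    if pkgs_arg:
        # The caller passed --pkgs: honor it and skip the defaults.
        return [p for p in pkgs_arg.split(";") if p]
    return list(default_pkgs)


if __name__ == "__main__":
    defaults = ["default-a.tar.gz", "default-b.tar.gz"]  # placeholder names
    print(resolve_pkgs(None, defaults))
    print(resolve_pkgs("x.tar.gz;y.tar.gz", defaults))
```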
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
unlike `configure.py`, the building system created by CMake does not
share the `SCYLLA-PRODUCT-FILE` across different builds. so we cannot
assume that build/SCYLLA-PRODUCT-FILE exists.
so, in this change, we check $BUILD_DIR/SCYLLA-PRODUCT-FILE first,
and fall back to $BUILD_DIR/../SCYLLA-PRODUCT-FILE. this should work
for both configure.py and CMake building system.
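The lookup order can be sketched as follows. This is an illustration in Python, not the shell code of the script, and `find_product_file` is a hypothetical helper:

```python
from pathlib import Path


def find_product_file(build_dir):
    """Return the first existing SCYLLA-PRODUCT-FILE, or None.

    Checks build_dir first, then its parent directory.
    """
    build_dir = Path(build_dir)
    candidates = (
        build_dir / "SCYLLA-PRODUCT-FILE",         # per-build file
        build_dir.parent / "SCYLLA-PRODUCT-FILE",  # file shared across builds
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Returning None leaves the decision about a missing file to the caller.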
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
we should respect the --build-dir if --unified-pkg is not specified,
and deduce the path to unified pkg from BUILD_DIR.
so, in this change, we deduce the path to unified pkg from BUILD_DIR
unless --unified-pkg is specified.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this allows build_unified.sh to generate unified pkg in specified
directory, instead of assuming the naming convention of build/$mode.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
... and sanitize the future used on stop.
The loop in question is now started in .start(), but all callers now
construct the manager late enough, so the loop spawning can be moved.
This also calls for renaming the future member of the class and allows
making it a regular, not shared, future.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently starting and stopping of b.m. is spread over main(). Keep it
close to each other.
Another trickery here is that calling b.m.::start() can only be done
after joining the cluster, because this start() spawns replay loop
which, in turn calls token_metadata::count_normal_token_owners() and if
the latter returns zero, the b.m. code uses it as a fraction denominator
and crashes.
With the above in mind, cql_test_env should start batchlog manager after
it "joins the ring" too. For now it doesn't make any difference, but
next patch will make use of it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently the only caller of it is the batchlog manager itself. It
checks for the shard-id to be zero, calls the method, then the method
asserts that it's run on shard-0.
Moving the check into the method removes the need for assertion and
makes further patching simpler.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently drain() is called twice -- first time from
storage_service::drain() (on shutdown), second via
batchlog_manager::stop(). The routine is unintentionally re-entrant,
because:
- explicit check for not aborting the abort source twice
- breaking semaphore can be done multiple times
- co-await-ing of the _started future works because the future is shared
That's not extremely elegant, better to make the drain() bail out early
if it was already called.
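The early bail-out pattern, sketched in Python asyncio as an analogy. `Manager`, `_drained` and `teardown_runs` are hypothetical names, not the batchlog manager's members:

```python
import asyncio


class Manager:
    def __init__(self):
        self._drained = False
        self.teardown_runs = 0  # counts how often the real teardown ran

    async def drain(self):
        # Bail out early if drain() was already called.
        if self._drained:
            return
        self._drained = True
        self.teardown_runs += 1
        await asyncio.sleep(0)  # stands in for waiting on the replay loop


async def drain_twice():
    m = Manager()
    await m.drain()  # e.g. on shutdown
    await m.drain()  # e.g. again from stop()
    return m.teardown_runs
```

Setting the flag before the first suspension point means a second call arriving mid-drain also returns at once.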
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
in this series, the packaging of tools modules is improved:
- package cqlsh also. as cqlsh should be redistributed as a part of the unified package
- use ${arch} in the postfix of the python3 package. the python3 package is not architecture independent.
- set the version with tilde for `Scylla_VERSION`, so it can be reused elsewhere.
Refs #15241
Closes #15369
* github.com:scylladb/scylladb:
build: cmake: build cqlsh as a submodule
build: cmake: always use the version with tilde
build: cmake: build python3 dist tarball with arch postfix
build: cmake: use the default comment message
This function practically returned true from inception.
In d38deef499
it started using messaging_service().knows_version(endpoint)
that also returns `true` unconditionally, to this day.
So there's no point calling it since we can assume
that `uses_host_id` is true for all versions.
Closes #15343
* github.com:scylladb/scylladb:
storage_service: fixup indentation after last patch
gossiper: get rid of uses_host_id
since we always use tilde ("~") in the version number,
let's just cache it as an internal variable in CMake.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
now that `configure.py` always generates the python3 dist tarball with
the ${arch} postfix, let's mirror this behavior, as `build_unified.sh`
uses this naming convention.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
it turns out "Generating submodule python3 in python3" is not
as informative as the default one:
"/home/kefu/dev/scylladb/tools/python3/build/scylla-python3-5.4.0~dev-0.20230908.1668d434e458.noarch.tar.gz"
so let's drop the "COMMENT" argument.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Changing the second value of cdc_generation_id_v2 from uuid_type
to timeuuid_type made the name of cdc_generation_id_v2 unsuitable
because it does not match cdc::generation_id_v2 anymore.
We change the type of IDs in CDC_GENERATIONS_V3 to timeuuid to
give them a time-based order. We also change how we initialize
them so that the new CDC generation always has the highest ID.
This is the last step to enabling the efficient clearing of
obsolete CDC generation data.
Additionally, we change the types of current_cdc_generation_uuid,
new_cdc_generation_data_uuid and the second values of the elements
in unpublished_cdc_generations to timeuuid, so that they match id
in CDC_GENERATIONS_V3.
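To illustrate what a time-based order buys us, here is a small Python sketch. It is not ScyllaDB code: `timeuuid_from_timestamp` and `obsolete_generations` are hypothetical helpers, and the timestamps are arbitrary. When IDs are ordered by their embedded timestamp, every generation older than the current one can be selected with a single comparison:

```python
import uuid


def timeuuid_from_timestamp(ts):
    """Build a version-1 UUID carrying the given 60-bit timestamp."""
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # version 1
    # Clock sequence and node are fixed to keep the example deterministic.
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, 0))


def obsolete_generations(ids, current):
    """IDs strictly older than `current`, oldest first."""
    older = (i for i in ids if i.time < current.time)
    return sorted(older, key=lambda i: i.time)


if __name__ == "__main__":
    g1 = timeuuid_from_timestamp(0xFFFFFFFF)
    g2 = timeuuid_from_timestamp(0x100000000)
    g3 = timeuuid_from_timestamp(0x100000001)
    print(obsolete_generations([g3, g2, g1], current=g3))
```

Note that Python's default `uuid.UUID` comparison is not time-based (the low timestamp bits come first), which is why the sketch sorts on `.time` explicitly.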
After moving the creation of uuid out of
make_new_generation_description, this function only calls the
topology_description_generator's constructor and its generate
method. We could remove this function, but we instead simplify
the code by removing the topology_description_generator class.
We can do this refactor because make_new_generation_description
is the only place using it. We inline its generate method into
make_new_generation_description and turn its private methods into
static functions.
In a future commit, we change how we initialize uuid of the
new CDC generation in the Raft-based topology. It forces us to
move this initialization out of the make_new_generation_data
function shared between Raft-based and gossiper-based topologies.
We also rename make_new_generation_data to
make_new_generation_description since it only returns
cdc::topology_description now.
We make CDC_GENERATIONS_V3 single-partition by adding the key
column and changing the clustering key from range_end to
(id, range_end). This is the first step to enabling the efficient
clearing of obsolete CDC generation data, which we need to prevent
Raft-topology snapshots from endlessly growing as we introduce new
generations over time. The next step is to change the type of the id
column to timeuuid. We do it in the following commits.
After making CDC_GENERATIONS_V3 single-partition, there is no easy
way of preserving the num_ranges column. As it is used only for
sanity checking, we remove it to simplify the implementation.
In the following commit, we implement the
get_cdc_generation_mutations_v3 function very similar to
get_cdc_generation_mutations_v2. The only differences in creating
mutations between CDC_GENERATIONS_V2 and CDC_GENERATIONS_V3 are:
- a need to set the num_ranges cell for CDC_GENERATIONS_V2,
- different partition keys,
- different clustering keys.
To avoid code duplication, we introduce
get_common_cdc_generation_mutations, which does most of the work
shared by both functions.
this change allows CMake to build the dist tarball for a certain build.
Refs https://github.com/scylladb/scylladb/issues/15241
Closes #15352
* github.com:scylladb/scylladb:
build: cmake: add packaging support
build: cmake: enable build of seastar/apps/iotune