The default error handler throws an exception, which means scylla-sstable will exit with exception if there is any problem in the configuration. Not even ScyllaDB itself is this harsh -- it will just log a warning for most errors. A tool should be much more lenient. So this patch passes an error handler which just logs all errors with debug level.
If reading an sstable fails, the user is expected to investigate turning debug-level logging on. When they do so, they will see any problems while reading the configuration (if it is relevant, e.g. when using EAR).
Fixes: #16538Closesscylladb/scylladb#16657
* github.com:scylladb/scylladb:
tools/scylla-sstable: pass error handler to utils::config_file::read_from_file()
tools/scylla-sstable: allow always passing --scylla-yaml-file option
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define a formatter for gms::heart_beat_state, and
remove its operator<<(). the only caller site of its operator<< is
updated to use `fmt::print()`
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16652
before this change, `std::invalid_argument` is thrown by
`bpo::notify(configuration)` in `app_template::run_deprecated()` when
invalid option is passed in via command line. `utils::named_value`
throws `std::invalid_argument` if the given value is not listed in
`_allowed_values`. but we don't handle `std::invalid_argument` in
`app_template::run_deprecated()`. so the application aborts with
unhandled exception if the specified argument is not allowed.
in this change, we convert the `std::invalid_argument` to a
derived class of `bpo::error` in the customized notify handler,
so that it can be handled in `app_template::run_deprecated()`.
because `name_value::operator()` is also used otherwhere, we
should not throw a bpo::error there. so its exception type
is preserved.
Fixes#16687
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16688
The strategy constructor prints the dc:rf at the end making the sstring
for it by hand. Modern fmt-based logger can format unordered_map-s on
its own. The message would look slightly different though:
Configured datacenter replicas are: foo:1 bar:2
into
Configured datacenter replicas are: {"foo": 1, "bar": 2}
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#16443
We've observed errors during shutdown like the following:
```
ERROR 2023-12-26 17:36:17,413 [shard 0:main] raft - [088f01a3-a18b-4821-b027-9f49e55c1926] applier fiber stopped because of the error: std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1230): std::runtime_error (forward_service is shutting down)
INFO 2023-12-26 17:36:17,413 [shard 0:strm] storage_service - raft_state_monitor_fiber aborted with raft::stopped_error (Raft instance is stopped)
ERROR 2023-12-26 17:36:17,413 [shard 0:strm] storage_service - raft topology: failed to fence previous coordinator raft::stopped_error (Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1230): std::runtime_error (forward_service is shutting down)")
```
some CQL statement execution was trying to use `forward_service` during
shutdown.
It turns out that the statement is in
`system_keyspace::load_topology_state`:
```
auto gen_rows = co_await execute_cql(
format("SELECT count(range_end) as cnt FROM {}.{} WHERE key = '{}' AND id = ?",
NAME, CDC_GENERATIONS_V3, cdc::CDC_GENERATIONS_V3_KEY),
gen_uuid);
```
It's querying a table in the `system` keyspace.
Pushing local table queries through `forward_service` doesn't make sense
as the data is not distributed. Excluding local tables from this logic
also fixes the shutdown error.
Fixesscylladb/scylladb#16570Closesscylladb/scylladb#16662
The removenode operation is defined to succeed only if the node
being removed is dead. Currently, we reject this operation on the
initiator side (in `storage_service::raft_removenode`) when the
failure detector considers the node being removed alive. However,
it is possible that even if the initiator considers the node dead,
the topology coordinator will consider it alive when handling the
topology request. For example, the topology coordinator can use
a bigger failure detector timeout, or the node being removed can
suddenly resurrect.
This PR makes the topology coordinator reject removenode if the
node being removed is considered alive. It also adds
`test_remove_alive_node` that verifies this change.
Fixesscylladb/scylladb#16109Closesscylladb/scylladb#16584
* github.com:scylladb/scylladb:
test: add test_remove_alive_node
topology_coordinator: reject removenode if the removed node is alive
test: ManagerClient: remove unused wait_for_host_down
test: remove_node: wait until the node being removed is dead
During a shutdown, we call `storage_service::stop_transport` first.
We may try to apply a Raft command after that, or still be in the
the process of applying a command. In such a case, the shutdown
process will hang because Raft retries replicating a command until
it succeeds even in the case of a network error. It will stop when
a corresponding abort source is set. However, if we pass `nullptr`
to a function like `add_entry`, it won't stop. The shutdown
process will hang forever.
We fix all places that incorrectly pass `nullptr`. These shutdown
hangs are not only theoretical. The incorrect `add_entry` call in
`update_topology_state` caused scylladb/scylladb#16435.
Additionally, we remove the default `nullptr` values in all member
functions of `server` and `raft_group0_client` to avoid similar bugs
in the future.
Fixesscylladb/scylladb#16435Closesscylladb/scylladb#16663
* github.com:scylladb/scylladb:
server, raft_group0_client: remove the default nullptr values
storage_service: make all Raft-based operations abortable
The default error handler throws an exception, which means
scylla-sstable will exit with exception if there is any problem in the
configuration. Not even ScyllaDB itself is this harsh -- it will just
log a warning for most errors. A tool should be much more lenient. So
this patch passes an error handler which just logs all errors with debug
level.
If reading an sstable fails, the user is expected to investigate turning
debug-level logging on. When they do so, they will see any problems
while reading the configuration (if it is relevant, e.g. when using EAR).
Fixes: #16538
Currently, if multiple schema sources are provided, the tool complains
about ambiguity, over which to consider. One of these option is
--scylla-yaml-file. However, we want to allow passing this option any
time, otherwise encrypted sstables cannot be read. So relax the multiple
schema source check to also allow this option to be used even when e.g.
--schema-file was used as the schema source.
The previous commit has fixed 5 bugs of the same type - incorrectly
passing the default nullptr to one of the changed functions. At
least some of these bugs wouldn't appear if there was no default
value. It's much harder to make this kind of a bug if you have to
write "nullptr". It's also much easier to detect it in review.
Moreover, these default values are rarely used outside tests.
Keeping them is just not worth the time spent on debugging.
During a shutdown, we call `storage_service::stop_transport` first.
We may try to apply a Raft command after that, or still be in the
the process of applying a command. In such a case, the shutdown
process will hang because Raft retries replicating a command until
it succeeds even in the case of a network error. It will stop when
a corresponding abort source is set. However, if we pass `nullptr`
to a function like `add_entry`, it won't stop. The shutdown
process will hang forever.
We fix all places that incorrectly pass `nullptr`. These shutdown
hangs are not only theoretical. The incorrect `add_entry` call in
`update_topology_state` caused scylladb/scylladb#16435.
The test test_filter_expression.py::test_filter_expression_precedence
is flaky - and can fail very rarely (so far we've only actually seen it
fail once). The problem is that the test generates items with random
clustering keys, chosen as an integer between 1 and 1 million, and there
is a chance (roughly 2/10,000) that two of the 20 items happen to have the
same key, so one of the items is "lost" and the comparison we do to the
expected truth fails.
The solution is to just use sequential keys, not random keys.
There is nothing to gain in this test by using random keys.
To make this test bug easy to reproduce, I temporarily changed
random_i()'s range from 1,000,000 to 3, and saw the test failing every
single run before this patch. After this patch - no longer using
random_i() for the keys - the test doesn't fail any more.
Fixes#16647
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#16649
Bootstrap cannot proceed if cdc generation propagation to all nodes
fails, so the patch series handles the error by rolling the ongoing
topology operation back.
* 'gleb/raft-cdc-failure' of github.com:scylladb/scylla-dev:
test: add test to check failure handling in cdc generation commit
storage_service: topology coordinator: rollback on failure to commit cdc generation
Currently, `add_saved_endpoint` is called from two paths: One, is when
loading states from system.peers in the join path (join_cluster,
join_token_ring), when `_raft_topology_change_enabled` is false, and the
other is from `storage_service::topology_state_load` when raft topology
changes are enabled.
In the later path, from `topology_state_load`, `add_saved_endpoint` is
called only if the endpoint_state does not exist yet. However, this is
checked without acquiring the endpoint_lock and so it races with the
gossiper, and once `add_saved_endpoint` acquires the lock, the endpoint
state may already be populated.
Since `add_saved_endpoint` applies local information about the endpoint
state (e.g. tokens, dc, rack), it uses the local heart_beat_version,
with generation=0 to update the endpoint states, and that is
incompatible with changes applies via gossip that will carry the
endpoint's generation and version, determining the state's update order.
This change makes sure that the endpoint state is never update in
`add_saved_endpoint` if it has non-zero generation. An internal error
exception is thrown if non-zero generation is found, and in the only
call site that might reach that state, in
`storage_service::topology_state_load`, the caller acquires the
endpoint_lock for checking for the existence of the endpoint_state,
calling `add_saved_endpoint` under the lock only if the endpoint_state
does not exist.
Fixes#16429Closesscylladb/scylladb#16432
* github.com:scylladb/scylladb:
gossiper: add_saved_endpoint: keep heart_beat_state if ep_state is found
storage_service: topology_state_load: lock endpoint for add_saved_endpoint
raft_group_registry: move on_alive error injection to gossiper
a209ae15 addresses that last -Wimplicit-int-float-conversion warning
in the tree, so we now have the luxury of enabling this warning.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16640
* seastar e0d515b6...70349b74 (33):
> util/log: drop unused function
> util/log, rpc, core: use compile-time formatting with fmtlib >= 8.0
> Fix edge case in memory sampler at OOM
> exp/geo distribution benchmark
> Additional allocation tests
> Remove null pointer check on free hot path
> Optimize final part of allocation hot path
> Optimize zero size checking in allocator
> memory: Optimize free fast path
> memory: Optimize small alloc alloation path
> memory: Limit alloc_sites size
> memory: Add general comment about sampling strategy
> memory: Use probabilistic sampler
> util: Adapt memory sampler to seastar
> util: Import Android Memory Sampler
> memory: Use separate small pool for tracking sampled allocations
> memory: Support enabling memory profiling at runtime
> util/source_location-compat: mark `source_location::current()` consteval
> build: use new behavior defined by CMP0155 when building C++ modules
> circleci: build with C++20 modules enabled
> seastar.cc: replace cryptopp with gnutls when building seastar modules
> alien: include used header
> seastar.cc: include used headers in the global purview
> docker: install clang-tools-17
> net/tcp: generate a random src_port hashed to current shard if smp::count > 1
> net, websocket: replace Crypto++ calls with GnuTLS
> README-DPDK.md: point user to DPDK's quick start guide
> reactor: print fatal error using logger as well
> Avoid ping-pong in spinlock::lock
> memory: Add allocator perf tests
> memory: Add a basic sized deletion test
> Prometheus: Disable Prometheus protobuf with a configuration
> treewide: bring back prometheus protobuf support
* test/manual/sstable_scan_footprint_test: update to adapt to the
breaking change of "memory: Use probabilistic sampler" in seastar
Closesscylladb/scylladb#16610
Boost::dynamic_linking was introduced as a compatibility target
which adds "BOOST_ALL_DYN_LINK" macro on Win32 platform. but since
Scylla only runs on Linux, there is no need to link against this
library.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16544
If for some reason an exception is thrown in compaction_manager::remove,
it might leave behind stale table pointers in _compaction_state. Fix
that by setting up a deffered action to perform the cleanup.
Fixes#16635
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closesscylladb/scylladb#16632
Refer to the added comment for details.
This problem was found by a compiler warning, and I'm fixing
it mainly to silence the warning. I didn't give any thought
to its effects in practice.
Fixes#13077Closesscylladb/scylladb#16625
[avi: changed Refs to Fixes]
we format `std::variant<std::monostate, seastar::timed_out_error,
raft::not_a_leader, raft::dropped_entry, raft::commit_status_unknown,
raft::conf_change_in_progress, raft::stopped_error, raft::not_a_member>`
in this source file. and currently, we format `std::variant<..>` using
the default-generated `fmt::formatter` from `operator<<`, so in order to
format it using {fmt}'s compile-time check enabled, we have to make the
`operator<<` overload for `std::variant<...>` visible from the caller
sites which format `std::variant<...>` using {fmt}.
in this change, the `operator<<` for `std::variant<...>` is moved to
from the middle of the source file to the top of it, so that it can
be found when the compiler looks up for a matched `fmt::formatter`
for `std::variant<...>`.
please note, we cannot use the `fmt::formatter` provided by `fmt/std.h`,
as its specialization for `std::variant` requires that all the types
of the variant is `is_formattable`. but the default generated formatter
for type `T` is not considered as the proof that `T` is formattable.
this should address the FTBFS with the latest seastar like:
```
/usr/include/fmt/core.h:2743:12: error: call to deleted constructor of 'conditional_t<has_formatter<mapped_type, context>::value, formatter<mapped_type, char_type>, fallback_formatter<stripped_type, char_type>>' (aka 'fmt::detail::fallback_formatter<std::variant<std::monostate, seastar::timed_out_error, raft::not_a_leader, raft::dropped_entry, raft::commit_status_unknown, raft::conf_change_in_progress, raft::stopped_error, raft::not_a_member>>')
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16616
we are using `seastar::format()` to format `append_entry` in
`append_reg_model`, so we have to provide a `fmt::formatter` for
these callers which format `append_entry`.
despite that, with FMT_DEPRECATED_OSTREAM, the formatter is defined
by fmt v9, we don't have it since fmt v10. so this change prepares us
for fmt v10.
Refs https://github.com/scylladb/scylladb/issues/13245Closesscylladb/scylladb#16614
* github.com:scylladb/scylladb:
test: randomized_nemesis_test: add formatter for append_entry
test: randomized_nemesis_test: move append_reg_model::entry out
it was a copy-pasta error introduced by 2508d339. the copyright
blob was copied from a C++ source code, but the CMake language
define the block comment is different from the C++ language.
let's use the line comment of CMake.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16615
In 7d5e22b43b ("replica: memtable: don't forget memtable
memory allocation statistics") we taught memtable_list to remember
learned memory allocation reserves so a new memtable inherits these
statistics from an older memtable. Share it now further across tablets
that belong to the same table as well. This helps the statistics be more
accurate for tablets that are migrated in, as they can share existing
tablet's memory allocation history.
Closesscylladb/scylladb#16571
* github.com:scylladb/scylladb:
table, memtable: share log-structured allocator statistics across all memtables in a table
memtable: consolidate _read_section, _allocating_section in a struct
Change the mutate_live_and_unreachable_endpoints procedure
so that the called `func` would mutate a cloned
`live_and_unreachable_endpoints` object in place.
Those are replicated to temporary copies on all shards
using `foreign<unique_ptr<>>` so that the would be
automatically freed on exception.
Only after all copies are made, they are applied
on all gossiper shards in a noexcept loop
and finally, a `on_success` function is called
to apply further side effects if everything else
was replicated successfully.
The latter is still susceptible to exceptions,
but we can live with those as long as `_live_endpoints`
and `_unreachable_endpoints` are synchronized on all shards.
With that, the read-only methods:
`get_live_members_synchronized` and
`get_unreachable_members_synchronized`
become trivial and they just return the required data
from shard 0.
Fixes#15089
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#16597
remove the unused #include headers from repair.hh, as they are not
directly used. after this change, task_manager_module.hh fails to
have access to stream_reason, so include it where it is used.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16618
it's observed that the mock server could return something not decodable
as JSON. so let's print out the response in the logging message in this case.
this should help us to understand the test failure better if it surfaces again.
Refs #16542
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16543
we are using `seastar::format()` to format `append_entry` in
`append_reg_model`, so we have to provide a `fmt::formatter` for
these callers which format `append_entry`.
despite that, with FMT_DEPRECATED_OSTREAM, the formatter is defined
by fmt v9, we don't have it since fmt v10. so this change prepares us
for fmt v10.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this change prepares for adding fmt::formatter for append_entry.
as we are using its formatter in the inline member functions of
`append_reg_model`. but its `fmt::formatter` can only be specialized out of
this class. and we don't have access to `format_as()` yet in {fmt} 9.1.0
which is shipped along with fedora38, which is in turn used for
our base build image.
so, in this change, `append_reg_model::entry` is extracted and renamed
to `append_entry`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Tablets metadata is quite expensive to generate (each data_value is
an allocation), so an old driver (without support for tablets) will
generate huge amounts of such notifications. This commit adds a way
to negotiate generation of the notification: a new driver will ask
for them, and an old driver won't get them. It uses the
OPTIONS/SUPPORTED/STARTUP protocol described in native_protocol_v4.spec.
Closesscylladb/scylladb#16611
seastar dropped the dependency to Crypto++, and it also removed
Findcryptopp.cmake from its `cmake` directory. but scylladb still
depends on this library. and it has been using the `Findcryptopp.cmake`
in seastar submodule for finding it.
after the removal of this file, scylladb would not be able to
use it anymore. so, we have to provide our own `Findcryptopp.cmake`.
Findcryptopp.cmake is copied from the Seastar project. So its
date of copyright is preserved. and it was licensed under Apache 2.0,
since we are creating a derivative work from it. let's relicense
it under Apache 2.0 and AGPL 3.0 or later.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16601
seastar::logger is using the compile-time format checking by default if
compiled using {fmt} 8.0 and up. and it requires the format string to be
consteval string, otherwise we have to use `fmt::runtime()` explicitly.
so adapt the change, let's use the consteval string when formatting
logging messages.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16612
in `alternator/auth.cc`, none of the symbols in "query" namespace
provided by the removed headers is used is used, so there is no
need to include this header file.
the same applies to other removed header files.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16603
In the Raft-based topology, we should never update token metadata
through gossip notifications. `storage_service::on_alive` and
`storage_service::on_remove` do it, so we ignore their parts that
touch token metadata.
Additionally, we improve some logs in other places where we ignore
the function because of using the Raft-based topology.
Fixesscylladb/scylladb#15732Closesscylladb/scylladb#16528
* github.com:scylladb/scylladb:
storage_service: handle_state_left, handle_state_normal: improve logs
raft topology: do not update token metadata in on_alive and on_remove
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define a formatter for `atomic_cell_view::printer`
and `atomic_cell::printer` respectively, and remove their operator<<().
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16602
In #16102, we added a test for concurrent bootstrap in the raft-based
topology. This test was running in CI for some time and
never failed. Now, we can believe that concurrent bootstrap is not
bugged or at least the probability of a failure is very low. Therefore,
we can safely make use of it in all tests using the raft-based topology.
This PR:
- makes all initial servers start concurrently in topology tests,
- replaces all multiple `server_add` calls with a single `servers_add`
call in tests using the raft-based topology,
- removes no longer needed `test_concurrent_bootstrap`.
The changes listed above:
- make running tests a bit faster due to concurrent bootstraps,
- make multiple tests test concurrent bootstrap previously tested by
a single test.
Fixesscylladb/scylladb#15423Closesscylladb/scylladb#16384
* github.com:scylladb/scylladb:
test: test_different_group0_ids: fix comments
test: remove test_concurrent_bootstrap
test: replace multiple server_add calls with servers_add
test: ScyllaCluster: start all initial servers concurrently
test: ManagerClient: servers_add: specify consistent-topology-changes assumption
Previously, the tablet information was sent to the drivers
in two pieces within the custom_payload. We had information
about the replicas under the `tablet_replicas` key and token range
information under `token_range`. These names were quite generic
and might have caused problems for other custom_payload users.
Additionally, dividing the information into two pieces raised
the question of what to do if one key is present while the other
is missing.
This commit changes the serialization mechanism to pack all information
under one specific name, `tablets-routing-v1`.
From: Sylwia Szunejko <sylwia.szunejko@scylladb.com>
Closesscylladb/scylladb#16148