Commit Graph

40525 Commits

Author SHA1 Message Date
Pavel Emelyanov
cdf5124003 Merge 'tools/scylla-sstable: pass error handler to utils::config_file::read_from_file()' from Botond Dénes
The default error handler throws an exception, which means scylla-sstable will exit with exception if there is any problem in the configuration. Not even ScyllaDB itself is this harsh -- it will just log a warning for most errors. A tool should be much more lenient. So this patch passes an error handler which just logs all errors with debug level.
If reading an sstable fails, the user is expected to investigate by turning debug-level logging on. When they do so, they will see any problems while reading the configuration (if it is relevant, e.g. when using EAR).

Fixes: #16538

Closes scylladb/scylladb#16657

* github.com:scylladb/scylladb:
  tools/scylla-sstable: pass error handler to utils::config_file::read_from_file()
  tools/scylla-sstable: allow always passing --scylla-yaml-file option
2024-01-09 14:28:49 +03:00
Kefu Chai
b91eb89ffa gms: heart_beat_state: add formatter for gms::heart_beat_state
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define a formatter for gms::heart_beat_state, and
remove its operator<<(). the only caller site of its operator<< is
updated to use `fmt::print()`

Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16652
2024-01-09 11:52:40 +02:00
Kefu Chai
cca786e847 gms: endpoint_state: fix a typo in comment
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16653
2024-01-09 11:51:49 +02:00
Kefu Chai
c1beba1f7d utils: config_file: throw bpo::invalid_option_value() when seeing invalid option
before this change, `std::invalid_argument` is thrown by
`bpo::notify(configuration)` in `app_template::run_deprecated()` when
invalid option is passed in via command line. `utils::named_value`
throws `std::invalid_argument` if the given value is not listed in
`_allowed_values`. but we don't handle `std::invalid_argument` in
`app_template::run_deprecated()`. so the application aborts with
unhandled exception if the specified argument is not allowed.

in this change, we convert the `std::invalid_argument` to a
derived class of `bpo::error` in the customized notify handler,
so that it can be handled in `app_template::run_deprecated()`.

because `named_value::operator()` is also used elsewhere, we
should not throw a bpo::error there. so its exception type
is preserved.
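
The conversion described above can be sketched with the standard library alone. This is only an illustration: `option_error` stands in for a class derived from `bpo::error`, and `checked_value` and `notify_option` are hypothetical names, not the actual Scylla code.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a class derived from bpo::error (hypothetical).
struct option_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical value holder: rejects values outside the allowed set
// with std::invalid_argument, and keeps doing so for all other callers.
struct checked_value {
    std::vector<std::string> allowed_values;
    std::string value;

    void operator()(const std::string& v) {
        if (std::find(allowed_values.begin(), allowed_values.end(), v) == allowed_values.end()) {
            throw std::invalid_argument("invalid value: " + v);
        }
        value = v;
    }
};

// Notify handler: the exception type is converted only here, at the
// command-line boundary, so the application can handle it as an option error.
void notify_option(checked_value& target, const std::string& v) {
    try {
        target(v);
    } catch (const std::invalid_argument& e) {
        throw option_error(e.what());
    }
}
```

Calling `checked_value::operator()` directly still throws `std::invalid_argument`; only the path through `notify_option` yields `option_error`.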

Fixes #16687
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16688
2024-01-09 11:49:06 +02:00
Kefu Chai
a6152cb87b sstables: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16666
2024-01-09 11:45:44 +02:00
Kefu Chai
be364d30fd db: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16664
2024-01-09 11:44:19 +02:00
Aleksandra Martyniuk
6f13e55187 tasks: call release_resources when task is finished
Call task_manager::task::impl::release_resources when task is finished
instead of putting the responsibility on user.

Closes scylladb/scylladb#16660
2024-01-09 11:41:54 +02:00
Pavel Emelyanov
cfeff893c6 network_topology_strategy: Print map of dc:rf pairs in one go
The strategy constructor prints the dc:rf at the end making the sstring
for it by hand. Modern fmt-based logger can format unordered_map-s on
its own. The message changes slightly though, from:

  Configured datacenter replicas are: foo:1 bar:2

to:

  Configured datacenter replicas are: {"foo": 1, "bar": 2}

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#16443
2024-01-09 11:30:49 +02:00
Kamil Braun
d93074e87e cql3: don't parallelize select aggregates to local tables
We've observed errors during shutdown like the following:
```
ERROR 2023-12-26 17:36:17,413 [shard 0:main] raft - [088f01a3-a18b-4821-b027-9f49e55c1926] applier fiber stopped because of the error: std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1230): std::runtime_error (forward_service is shutting down)
INFO  2023-12-26 17:36:17,413 [shard 0:strm] storage_service - raft_state_monitor_fiber aborted with raft::stopped_error (Raft instance is stopped)
ERROR 2023-12-26 17:36:17,413 [shard 0:strm] storage_service - raft topology: failed to fence previous coordinator raft::stopped_error (Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1230): std::runtime_error (forward_service is shutting down)")
```

some CQL statement execution was trying to use `forward_service` during
shutdown.

It turns out that the statement is in
`system_keyspace::load_topology_state`:
```
auto gen_rows = co_await execute_cql(
    format("SELECT count(range_end) as cnt FROM {}.{} WHERE key = '{}' AND id = ?",
           NAME, CDC_GENERATIONS_V3, cdc::CDC_GENERATIONS_V3_KEY),
    gen_uuid);
```
It's querying a table in the `system` keyspace.

Pushing local table queries through `forward_service` doesn't make sense
as the data is not distributed. Excluding local tables from this logic
also fixes the shutdown error.

Fixes scylladb/scylladb#16570

Closes scylladb/scylladb#16662
2024-01-08 14:44:22 -05:00
Kamil Braun
d4f4b58f3a Merge 'topology_coordinator: reject removenode if the removed node is alive' from Patryk Jędrzejczak
The removenode operation is defined to succeed only if the node
being removed is dead. Currently, we reject this operation on the
initiator side (in `storage_service::raft_removenode`) when the
failure detector considers the node being removed alive. However,
it is possible that even if the initiator considers the node dead,
the topology coordinator will consider it alive when handling the
topology request. For example, the topology coordinator can use
a bigger failure detector timeout, or the node being removed can
suddenly resurrect.

This PR makes the topology coordinator reject removenode if the
node being removed is considered alive. It also adds
`test_remove_alive_node` that verifies this change.

Fixes scylladb/scylladb#16109

Closes scylladb/scylladb#16584

* github.com:scylladb/scylladb:
  test: add test_remove_alive_node
  topology_coordinator: reject removenode if the removed node is alive
  test: ManagerClient: remove unused wait_for_host_down
  test: remove_node: wait until the node being removed is dead
2024-01-08 12:39:23 +01:00
Kamil Braun
d11e824802 Merge 'storage_service: make all Raft-based operations abortable' from Patryk Jędrzejczak
During a shutdown, we call `storage_service::stop_transport` first.
We may try to apply a Raft command after that, or still be in the
process of applying a command. In such a case, the shutdown
process will hang because Raft retries replicating a command until
it succeeds even in the case of a network error. It will stop when
a corresponding abort source is set. However, if we pass `nullptr`
to a function like `add_entry`, it won't stop. The shutdown
process will hang forever.

We fix all places that incorrectly pass `nullptr`. These shutdown
hangs are not only theoretical. The incorrect `add_entry` call in
`update_topology_state` caused scylladb/scylladb#16435.

Additionally, we remove the default `nullptr` values in all member
functions of `server` and `raft_group0_client` to avoid similar bugs
in the future.
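
A minimal sketch of why a retry loop needs an abort source, and why removing the default makes a missing one visible at the call site. `abort_source`, `aborted_error` and `add_entry_with_retry` here are simplified, hypothetical stand-ins, not the Raft or Seastar types.

```cpp
#include <stdexcept>

// Hypothetical stand-in for an abort source: a flag set by the shutdown path.
struct abort_source {
    bool requested = false;
};

struct aborted_error : std::runtime_error {
    aborted_error() : std::runtime_error("aborted") {}
};

// Retries until send() succeeds. The abort source parameter has no default
// value, so a caller that wants "never abort" has to write nullptr explicitly.
// With nullptr and a send() that keeps failing, this loop never terminates.
template <typename Send>
int add_entry_with_retry(Send send, const abort_source* as) {
    int attempts = 0;
    for (;;) {
        if (as && as->requested) {
            throw aborted_error();
        }
        ++attempts;
        if (send()) {
            return attempts;
        }
    }
}
```

With a non-null abort source, a shutdown that sets `requested` breaks the loop on the next iteration even if the send keeps failing.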

Fixes scylladb/scylladb#16435

Closes scylladb/scylladb#16663

* github.com:scylladb/scylladb:
  server, raft_group0_client: remove the default nullptr values
  storage_service: make all Raft-based operations abortable
2024-01-08 11:30:56 +01:00
Botond Dénes
9119bcbd67 tools/scylla-sstable: pass error handler to utils::config_file::read_from_file()
The default error handler throws an exception, which means
scylla-sstable will exit with exception if there is any problem in the
configuration. Not even ScyllaDB itself is this harsh -- it will just
log a warning for most errors. A tool should be much more lenient. So
this patch passes an error handler which just logs all errors with debug
level.
If reading an sstable fails, the user is expected to investigate by turning
debug-level logging on. When they do so, they will see any problems
while reading the configuration (if it is relevant, e.g. when using EAR).

Fixes: #16538
2024-01-08 02:18:15 -05:00
Botond Dénes
16791a63c9 tools/scylla-sstable: allow always passing --scylla-yaml-file option
Currently, if multiple schema sources are provided, the tool complains
about ambiguity over which one to consider. One of these options is
--scylla-yaml-file. However, we want to allow passing this option any
time, otherwise encrypted sstables cannot be read. So relax the multiple
schema source check to also allow this option to be used even when e.g.
--schema-file was used as the schema source.
2024-01-08 02:18:12 -05:00
Nadav Har'El
61395a3658 Update tools/java submodule
* tools/java b7ebfd38...e106b500 (3):
  > build.xml: update scylla-driver-core to 3.11.5.1
  > Use ReplicaOrdering.NEUTRAL in TokenAwarePolicy to respect RackAwareness
  > treewide: update "guava" package

Refs https://github.com/scylladb/scylladb/pull/16491
Refs https://github.com/scylladb/scylla-tools-java/pull/372
2024-01-07 15:12:15 +02:00
Patryk Jędrzejczak
df2034ebd7 server, raft_group0_client: remove the default nullptr values
The previous commit has fixed 5 bugs of the same type - incorrectly
passing the default nullptr to one of the changed functions. At
least some of these bugs wouldn't appear if there was no default
value. It's much harder to make this kind of a bug if you have to
write "nullptr". It's also much easier to detect it in review.

Moreover, these default values are rarely used outside tests.
Keeping them is just not worth the time spent on debugging.
2024-01-05 18:45:50 +01:00
Patryk Jędrzejczak
3d4af4ecf1 storage_service: make all Raft-based operations abortable
During a shutdown, we call `storage_service::stop_transport` first.
We may try to apply a Raft command after that, or still be in the
process of applying a command. In such a case, the shutdown
process will hang because Raft retries replicating a command until
it succeeds even in the case of a network error. It will stop when
a corresponding abort source is set. However, if we pass `nullptr`
to a function like `add_entry`, it won't stop. The shutdown
process will hang forever.

We fix all places that incorrectly pass `nullptr`. These shutdown
hangs are not only theoretical. The incorrect `add_entry` call in
`update_topology_state` caused scylladb/scylladb#16435.
2024-01-05 18:45:20 +01:00
Kefu Chai
7e84e03f52 gms: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

because of the removal of `#include "unimplemented.hh"`,
`service/migration_manager.cc` misses the definition of
`unimplemented::cause::VALIDATION`, so include the header where it is
used.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16654
2024-01-05 13:37:08 +02:00
Nadav Har'El
94580df1c5 test/alternator: fix flaky test in test_filter_expression.py
The test test_filter_expression.py::test_filter_expression_precedence
is flaky - and can fail very rarely (so far we've only actually seen it
fail once). The problem is that the test generates items with random
clustering keys, chosen as an integer between 1 and 1 million, and there
is a chance (roughly 2/10,000) that two of the 20 items happen to have the
same key, so one of the items is "lost" and the comparison we do to the
expected truth fails.
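
The "roughly 2/10,000" figure can be checked with a short calculation. With 20 keys there are 190 pairs, each colliding with probability 1/1,000,000, which gives about 1.9/10,000. The helper below is an illustrative sketch (the name `collision_probability` is not from the test) that computes the exact value, assuming keys are drawn uniformly and independently.

```cpp
#include <cstdint>

// Probability that at least two of `items` keys collide when each key is
// drawn uniformly and independently from `range` possible values.
double collision_probability(std::uint64_t items, std::uint64_t range) {
    double all_distinct = 1.0;
    for (std::uint64_t i = 0; i < items; ++i) {
        // The (i+1)-th key must avoid the i values already taken.
        all_distinct *= 1.0 - static_cast<double>(i) / static_cast<double>(range);
    }
    return 1.0 - all_distinct;
}
```

`collision_probability(20, 1000000)` comes out just under 0.00019. If the range holds only 3 distinct values, a collision among 20 keys is certain.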

The solution is to just use sequential keys, not random keys.
There is nothing to gain in this test by using random keys.

To make this test bug easy to reproduce, I temporarily changed
random_i()'s range from 1,000,000 to 3, and saw the test failing every
single run before this patch. After this patch - no longer using
random_i() for the keys - the test doesn't fail any more.

Fixes #16647

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#16649
2024-01-04 21:36:40 +02:00
Kamil Braun
bf068dd023 Merge handle error in cdc generation propagation during bootstrap from Gleb
Bootstrap cannot proceed if cdc generation propagation to all nodes
fails, so the patch series handles the error by rolling the ongoing
topology operation back.

* 'gleb/raft-cdc-failure' of github.com:scylladb/scylla-dev:
  test: add test to check failure handling in cdc generation commit
  storage_service: topology coordinator: rollback on failure to commit cdc generation
2024-01-04 15:38:51 +01:00
Kamil Braun
f942bf4a1f Merge 'Do not update endpoint state via gossiper::add_saved_endpoint once it was updated via gossip' from Benny Halevy
Currently, `add_saved_endpoint` is called from two paths:  One, is when
loading states from system.peers in the join path (join_cluster,
join_token_ring), when `_raft_topology_change_enabled` is false, and the
other is from `storage_service::topology_state_load` when raft topology
changes are enabled.

In the latter path, from `topology_state_load`, `add_saved_endpoint` is
called only if the endpoint_state does not exist yet.  However, this is
checked without acquiring the endpoint_lock and so it races with the
gossiper, and once `add_saved_endpoint` acquires the lock, the endpoint
state may already be populated.

Since `add_saved_endpoint` applies local information about the endpoint
state (e.g. tokens, dc, rack), it uses the local heart_beat_version,
with generation=0 to update the endpoint states, and that is
incompatible with changes applied via gossip that will carry the
endpoint's generation and version, determining the state's update order.

This change makes sure that the endpoint state is never updated in
`add_saved_endpoint` if it has non-zero generation.  An internal error
exception is thrown if non-zero generation is found, and in the only
call site that might reach that state, in
`storage_service::topology_state_load`, the caller acquires the
endpoint_lock for checking for the existence of the endpoint_state,
calling `add_saved_endpoint` under the lock only if the endpoint_state
does not exist.
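
The locking rule above can be illustrated with a simplified, self-contained sketch. `endpoint_table` and its members are hypothetical, and a single `std::mutex` stands in for the per-endpoint lock; this is not the gossiper's actual code.

```cpp
#include <map>
#include <mutex>
#include <stdexcept>
#include <string>

// Hypothetical, simplified state: generation 0 means "only locally saved data".
struct endpoint_state {
    int generation = 0;
    std::string dc;
};

class endpoint_table {
    std::mutex _lock;  // stand-in for the per-endpoint lock
    std::map<std::string, endpoint_state> _states;

    // Must be called with _lock held. Refuses to overwrite gossip state.
    void add_saved_locked(const std::string& ep, const std::string& dc) {
        auto it = _states.find(ep);
        if (it != _states.end() && it->second.generation != 0) {
            throw std::logic_error("saved state would overwrite gossip state for " + ep);
        }
        _states[ep] = endpoint_state{0, dc};
    }

public:
    // Gossip path: carries the endpoint's own generation.
    void apply_gossip(const std::string& ep, int generation, const std::string& dc) {
        std::lock_guard<std::mutex> guard(_lock);
        _states[ep] = endpoint_state{generation, dc};
    }

    // Saved-state path: the existence check and the insert share one
    // critical section, so gossip cannot slip in between them.
    bool add_saved_if_absent(const std::string& ep, const std::string& dc) {
        std::lock_guard<std::mutex> guard(_lock);
        if (_states.count(ep)) {
            return false;
        }
        add_saved_locked(ep, dc);
        return true;
    }

    int generation_of(const std::string& ep) {
        std::lock_guard<std::mutex> guard(_lock);
        return _states.at(ep).generation;
    }
};
```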

Fixes #16429

Closes scylladb/scylladb#16432

* github.com:scylladb/scylladb:
  gossiper: add_saved_endpoint: keep heart_beat_state if ep_state is found
  storage_service: topology_state_load: lock endpoint for add_saved_endpoint
  raft_group_registry: move on_alive error injection to gossiper
2024-01-04 14:47:10 +01:00
qiulijuan2
7fa2c33ba1 replica: remove duplicated function calling
set_skip_when_empty is called twice for the metric column_family_row_hits in replica/table.cc

fix: #16582

Signed-off-by: qiulijuan2<qiulijuan2_yewu@cmss.chinamobile.com>

Closes scylladb/scylladb#16581
2024-01-04 15:04:31 +02:00
Kefu Chai
ee28a1cf4b build: enable -Wimplicit-int-float-conversion
a209ae15 addresses the last -Wimplicit-int-float-conversion warning
in the tree, so we now have the luxury of enabling this warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16640
2024-01-04 12:45:23 +02:00
Kefu Chai
cf932888de Update seastar submodule
* seastar e0d515b6...70349b74 (33):
  > util/log: drop unused function
  > util/log, rpc, core: use compile-time formatting with fmtlib >= 8.0
  > Fix edge case in memory sampler at OOM
  > exp/geo distribution benchmark
  > Additional allocation tests
  > Remove null pointer check on free hot path
  > Optimize final part of allocation hot path
  > Optimize zero size checking in allocator
  > memory: Optimize free fast path
  > memory: Optimize small alloc alloation path
  > memory: Limit alloc_sites size
  > memory: Add general comment about sampling strategy
  > memory: Use probabilistic sampler
  > util: Adapt memory sampler to seastar
  > util: Import Android Memory Sampler
  > memory: Use separate small pool for tracking sampled allocations
  > memory: Support enabling memory profiling at runtime
  > util/source_location-compat: mark `source_location::current()` consteval
  > build: use new behavior defined by CMP0155 when building C++ modules
  > circleci: build with C++20 modules enabled
  > seastar.cc: replace cryptopp with gnutls when building seastar modules
  > alien: include used header
  > seastar.cc: include used headers in the global purview
  > docker: install clang-tools-17
  > net/tcp: generate a random src_port hashed to current shard if smp::count > 1
  > net, websocket: replace Crypto++ calls with GnuTLS
  > README-DPDK.md: point user to DPDK's quick start guide
  > reactor: print fatal error using logger as well
  > Avoid ping-pong in spinlock::lock
  > memory: Add allocator perf tests
  > memory: Add a basic sized deletion test
  > Prometheus: Disable Prometheus protobuf with a configuration
  > treewide: bring back prometheus protobuf support
* test/manual/sstable_scan_footprint_test: update to adapt to the
  breaking change of "memory: Use probabilistic sampler" in seastar

Closes scylladb/scylladb#16610
2024-01-04 09:36:53 +02:00
Kefu Chai
47d8edc0fc test.py: s/asyncio.get_event_loop()/asyncio.get_running_loop()/
the latter raises a RuntimeError if there is no running event loop,
while the former gets one from the default policy in this case.
in the use cases in test.py, there is always a running event loop,
when `asyncio.get_event_loop()` gets called. so let's use
the preferred `asyncio.get_running_loop()`.

see https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.get_event_loop

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16398
2024-01-04 08:39:49 +02:00
Kefu Chai
50cf62e186 build: cmake: do not link against Boost::dynamic_linking
Boost::dynamic_linking was introduced as a compatibility target
which adds "BOOST_ALL_DYN_LINK" macro on Win32 platform. but since
Scylla only runs on Linux, there is no need to link against this
library.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16544
2024-01-04 08:06:19 +02:00
Lakshmi Narayanan Sreethar
1d6eaf2985 compaction manager: remove: cleanup _compaction_state on exceptions
If for some reason an exception is thrown in compaction_manager::remove,
it might leave behind stale table pointers in _compaction_state. Fix
that by setting up a deferred action to perform the cleanup.
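
A minimal sketch of the deferred-action idea, using a hand-written scope guard rather than any Seastar utility. `deferred_action` and `registry` are hypothetical names for illustration only.

```cpp
#include <functional>
#include <set>
#include <stdexcept>
#include <utility>

// Minimal scope guard: runs the stored action when it goes out of scope,
// including during stack unwinding after an exception.
class deferred_action {
    std::function<void()> _action;
public:
    explicit deferred_action(std::function<void()> action) : _action(std::move(action)) {}
    deferred_action(const deferred_action&) = delete;
    deferred_action& operator=(const deferred_action&) = delete;
    ~deferred_action() { _action(); }
};

// Hypothetical registry keyed by table id.
struct registry {
    std::set<int> state;

    void remove(int table_id, bool fail) {
        // The erase happens on every exit path, normal or exceptional.
        deferred_action cleanup([this, table_id] { state.erase(table_id); });
        if (fail) {
            throw std::runtime_error("stop failed");
        }
    }
};
```

Whether `remove` returns normally or throws, the entry is erased, so no stale entry is left behind.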

Fixes #16635

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16632
2024-01-03 22:03:24 +02:00
Benny Halevy
9e8998109f gossiper: get_*_members_synchronized: acquire endpoint update semaphore
To ensure that the value they return is synchronized on all shards.

This got broken recently by 147f30caff.

Refs https://github.com/scylladb/scylladb/pull/16597#discussion_r1440445432

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#16629
2024-01-03 17:41:46 +01:00
Michał Chojnowski
a209ae1573 cql3: type_json: fix an edge case in float-to-int conversion
Refer to the added comment for details.

This problem was found by a compiler warning, and I'm fixing
it mainly to silence the warning. I didn't give any thought
to its effects in practice.

Fixes #13077

Closes scylladb/scylladb#16625

[avi: changed Refs to Fixes]
2024-01-03 17:59:01 +02:00
Kefu Chai
2ad532df43 test: randomized_nemesis_test: move std::variant formatter up
we format `std::variant<std::monostate, seastar::timed_out_error,
raft::not_a_leader, raft::dropped_entry, raft::commit_status_unknown,
raft::conf_change_in_progress, raft::stopped_error, raft::not_a_member>`
in this source file. and currently, we format `std::variant<..>` using
the default-generated `fmt::formatter` from `operator<<`, so in order to
format it using {fmt}'s compile-time check enabled, we have to make the
`operator<<` overload for `std::variant<...>` visible from the caller
sites which format `std::variant<...>` using {fmt}.

in this change, the `operator<<` for `std::variant<...>` is moved
from the middle of the source file to the top of it, so that it can
be found when the compiler looks up for a matched `fmt::formatter`
for `std::variant<...>`.

please note, we cannot use the `fmt::formatter` provided by `fmt/std.h`,
as its specialization for `std::variant` requires that all the types
of the variant are `is_formattable`. but the default-generated formatter
for type `T` is not considered proof that `T` is formattable.

this should address the FTBFS with the latest seastar like:

```
 /usr/include/fmt/core.h:2743:12: error: call to deleted constructor of 'conditional_t<has_formatter<mapped_type, context>::value, formatter<mapped_type, char_type>, fallback_formatter<stripped_type, char_type>>' (aka 'fmt::detail::fallback_formatter<std::variant<std::monostate, seastar::timed_out_error, raft::not_a_leader, raft::dropped_entry, raft::commit_status_unknown, raft::conf_change_in_progress, raft::stopped_error, raft::not_a_member>>')
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16616
2024-01-03 16:38:25 +01:00
Kefu Chai
2c394e3f6f tablets: remove unused #includes
the removed #include headers are not used, so let's drop their
`#include`s.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16619
2024-01-03 15:30:40 +01:00
Avi Kivity
20531872a7 Merge 'test: randomized_nemesis_test: add formatter for append_entry' from Kefu Chai
we are using `seastar::format()` to format `append_entry` in
`append_reg_model`, so we have to provide a `fmt::formatter` for
these callers which format `append_entry`.

although, with FMT_DEPRECATED_OSTREAM, the formatter is defined
by fmt v9, it is no longer provided as of fmt v10. so this change prepares us
for fmt v10.

Refs https://github.com/scylladb/scylladb/issues/13245

Closes scylladb/scylladb#16614

* github.com:scylladb/scylladb:
  test: randomized_nemesis_test: add formatter for append_entry
  test: randomized_nemesis_test: move append_reg_model::entry out
2024-01-03 15:06:33 +02:00
Kefu Chai
dde8f694f6 build: cmake: use # for line comment
it was a copy-paste error introduced by 2508d339. the copyright
blob was copied from a C++ source file, but the block comment syntax
of the CMake language differs from that of C++.

let's use the line comment of CMake.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16615
2024-01-03 15:05:00 +02:00
Tomasz Grabiec
715e062d4a Merge 'table, memtable: share log structured allocator statistics across all tablets in a table' from Avi Kivity
In 7d5e22b43b ("replica: memtable: don't forget memtable
memory allocation statistics") we taught memtable_list to remember
learned memory allocation reserves so a new memtable inherits these
statistics from an older memtable. Share it now further across tablets
that belong to the same table as well. This helps the statistics be more
accurate for tablets that are migrated in, as they can share existing
tablet's memory allocation history.

Closes scylladb/scylladb#16571

* github.com:scylladb/scylladb:
  table, memtable: share log-structured allocator statistics across all memtables in a table
  memtable: consolidate _read_section, _allocating_section in a struct
2024-01-03 14:03:40 +01:00
Avi Kivity
b8a0e3543e docs: ddl: document the initial_tablets replication strategy option
While the feature is experimental, this makes it easier to experiment
with it.

An example is provided.

Closes scylladb/scylladb#16193
2024-01-03 13:49:30 +01:00
Benny Halevy
147f30caff gossiper: mutate_live_and_unreachable_endpoints: make exception safe
Change the mutate_live_and_unreachable_endpoints procedure
so that the called `func` would mutate a cloned
`live_and_unreachable_endpoints` object in place.

Those are replicated to temporary copies on all shards
using `foreign<unique_ptr<>>` so that they would be
automatically freed on exception.

Only after all copies are made, they are applied
on all gossiper shards in a noexcept loop
and finally, an `on_success` function is called
to apply further side effects if everything else
was replicated successfully.

The latter is still susceptible to exceptions,
but we can live with those as long as `_live_endpoints`
and `_unreachable_endpoints` are synchronized on all shards.

With that, the read-only methods:
`get_live_members_synchronized` and
`get_unreachable_members_synchronized`
become trivial and they just return the required data
from shard 0.

Fixes #15089

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#16597
2024-01-03 14:46:10 +02:00
Benny Halevy
fadcef01f5 database: setup_scylla_memory_diagnostics_producer: replace infinity sign with unlimited string
The infinity unicode sign used for dumping read concurrency semaphore
state, `∞` may be misrendered.
For example: https://jenkins.scylladb.com/job/scylla-master/job/dtest-release/451/artifact/logs-full.release.011/1703288463175_materialized_views_test.py%3A%3ATestMaterializedViews%3A%3Atest_add_dc_during_mv_insert/node1.log
```
  Read Concurrency Semaphores:
    user: 0/100, 1K/9M, queued: 0
    streaming: 0/10, 0B/9M, queued: 0
    system: 0/10, 0B/9M, queued: 0
    compaction: 0/∞, 0B/∞
```

Instead, just print the word `unlimited`.

This was introduced in 34c213f9bb

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#16534
2024-01-03 14:46:10 +02:00
Kefu Chai
3e4159fece repair: remove unused #include
remove the unused #include headers from repair.hh, as they are not
directly used. after this change, task_manager_module.hh fails to
have access to stream_reason, so include it where it is used.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16618
2024-01-03 14:46:10 +02:00
Kefu Chai
1f4b5126f6 build: cmake: add comment explaining CMAKE_CXX_FLAGS_RELWITHDEBINFO
to clarify why we need to set this flagset instead of appending to it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16546
2024-01-03 14:46:10 +02:00
Kefu Chai
3ef0345b7f test/nodetool: log response from mock server when handling JSONDecodeError
it's observed that the mock server could return something not decodable
as JSON. so let's print out the response in the logging message in this case.
this should help us to understand the test failure better if it surfaces again.

Refs #16542
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16543
2024-01-03 14:46:10 +02:00
Kefu Chai
0484ac46af test: randomized_nemesis_test: add formatter for append_entry
we are using `seastar::format()` to format `append_entry` in
`append_reg_model`, so we have to provide a `fmt::formatter` for
these callers which format `append_entry`.

although, with FMT_DEPRECATED_OSTREAM, the formatter is defined
by fmt v9, it is no longer provided as of fmt v10. so this change prepares us
for fmt v10.

Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-01-03 08:38:43 +08:00
Kefu Chai
32e55731ab test: randomized_nemesis_test: move append_reg_model::entry out
this change prepares for adding fmt::formatter for append_entry.
as we are using its formatter in the inline member functions of
`append_reg_model`. but its `fmt::formatter` can only be specialized out of
this class. and we don't have access to `format_as()` yet in {fmt} 9.1.0
which is shipped along with fedora38, which is in turn used for
our base build image.

so, in this change, `append_reg_model::entry` is extracted and renamed
to `append_entry`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-01-03 08:38:43 +08:00
Sylwia Szunejko
91a5a41313 add a way to negotiate generation of the tablet info for drivers
Tablets metadata is quite expensive to generate (each data_value is
an allocation), so an old driver (without support for tablets) will
generate huge amounts of such notifications. This commit adds a way
to negotiate generation of the notification: a new driver will ask
for them, and an old driver won't get them. It uses the
OPTIONS/SUPPORTED/STARTUP protocol described in native_protocol_v4.spec.
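
The negotiation can be pictured roughly as follows. This is a sketch under assumptions: the extension key `EXAMPLE_TABLETS_ROUTING` is made up for illustration (the real key is not given here), and the three functions are hypothetical, not the server's or any driver's API.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical extension name, for illustration only.
const std::string tablets_extension = "EXAMPLE_TABLETS_ROUTING";

// SUPPORTED: the options the server advertises in reply to OPTIONS.
std::set<std::string> server_supported() {
    return {tablets_extension};
}

// STARTUP: a driver echoes back only the extensions it understands
// and that the server advertised.
std::map<std::string, std::string> make_startup(bool driver_understands_tablets,
                                                const std::set<std::string>& supported) {
    std::map<std::string, std::string> startup{{"CQL_VERSION", "3.0.0"}};
    if (driver_understands_tablets && supported.count(tablets_extension)) {
        startup[tablets_extension] = "";
    }
    return startup;
}

// The server generates tablet info only for connections that asked for it.
bool should_send_tablet_info(const std::map<std::string, std::string>& startup) {
    return startup.count(tablets_extension) > 0;
}
```

An old driver never puts the key in STARTUP, so the server skips the expensive generation for that connection.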

Closes scylladb/scylladb#16611
2024-01-02 20:00:50 +02:00
Kefu Chai
2508d33946 build: cmake: add Findcryptopp.cmake
seastar dropped the dependency to Crypto++, and it also removed
Findcryptopp.cmake from its `cmake` directory. but scylladb still
depends on this library. and it has been using the `Findcryptopp.cmake`
in seastar submodule for finding it.

after the removal of this file, scylladb would not be able to
use it anymore. so, we have to provide our own `Findcryptopp.cmake`.

Findcryptopp.cmake is copied from the Seastar project. So its
date of copyright is preserved. it was licensed under Apache 2.0;
since we are creating a derivative work from it, let's relicense
it under Apache 2.0 and AGPL 3.0 or later.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16601
2024-01-02 19:09:50 +02:00
Kefu Chai
34259a03d0 treewide: use consteval string as format string when formatting log message
seastar::logger is using the compile-time format checking by default if
compiled using {fmt} 8.0 and up. and it requires the format string to be
consteval string, otherwise we have to use `fmt::runtime()` explicitly.

so, to adapt to the change, let's use the consteval string when formatting
logging messages.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16612
2024-01-02 19:08:47 +02:00
Kefu Chai
64a227fba0 alternator/auth: remove unused #include
in `alternator/auth.cc`, none of the symbols in "query" namespace
provided by the removed headers is used, so there is no
need to include this header file.

the same applies to other removed header files.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16603
2024-01-02 17:50:59 +02:00
Kamil Braun
949658590f Merge 'raft topology: do not update token metadata in on_alive and on_remove' from Patryk Jędrzejczak
In the Raft-based topology, we should never update token metadata
through gossip notifications. `storage_service::on_alive` and
`storage_service::on_remove` do it, so we ignore their parts that
touch token metadata.

Additionally, we improve some logs in other places where we ignore
the function because of using the Raft-based topology.

Fixes scylladb/scylladb#15732

Closes scylladb/scylladb#16528

* github.com:scylladb/scylladb:
  storage_service: handle_state_left, handle_state_normal: improve logs
  raft topology: do not update token metadata in on_alive and on_remove
2024-01-02 16:08:50 +01:00
Kefu Chai
dd496afff3 mutation: add formatter for {atomic_cell_view,atomic_cell}::printer
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define a formatter for `atomic_cell_view::printer`
and `atomic_cell::printer` respectively, and remove their operator<<().

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16602
2024-01-02 16:14:42 +02:00
Kamil Braun
7f6955b883 Merge 'test: make use of concurrent bootstrap' from Patryk Jędrzejczak
In #16102, we added a test for concurrent bootstrap in the raft-based
topology. This test was running in CI for some time and
never failed. Now, we can believe that concurrent bootstrap is not
bugged or at least the probability of a failure is very low. Therefore,
we can safely make use of it in all tests using the raft-based topology.

This PR:
- makes all initial servers start concurrently in topology tests,
- replaces all multiple `server_add` calls with a single `servers_add`
  call in tests using the raft-based topology,
- removes no longer needed `test_concurrent_bootstrap`.

The changes listed above:
- make running tests a bit faster due to concurrent bootstraps,
- make multiple tests test concurrent bootstrap previously tested by
  a single test.

Fixes scylladb/scylladb#15423

Closes scylladb/scylladb#16384

* github.com:scylladb/scylladb:
  test: test_different_group0_ids: fix comments
  test: remove test_concurrent_bootstrap
  test: replace multiple server_add calls with servers_add
  test: ScyllaCluster: start all initial servers concurrently
  test: ManagerClient: servers_add: specify consistent-topology-changes assumption
2024-01-02 15:11:18 +01:00
Sylwia Szunejko
467d466f7e put all tablet info into one field of custom_payload and update docs
Previously, the tablet information was sent to the drivers
in two pieces within the custom_payload. We had information
about the replicas under the `tablet_replicas` key and token range
information under `token_range`. These names were quite generic
and might have caused problems for other custom_payload users.
Additionally, dividing the information into two pieces raised
the question of what to do if one key is present while the other
is missing.

This commit changes the serialization mechanism to pack all information
under one specific name, `tablets-routing-v1`.

From: Sylwia Szunejko <sylwia.szunejko@scylladb.com>

Closes scylladb/scylladb#16148
2024-01-02 14:35:37 +02:00
Patryk Jędrzejczak
215534d527 test: test_different_group0_ids: fix comments
The test disables consistent topology changes, not cluster
management.
2024-01-02 12:19:33 +01:00