in fmt v10, it does not cast unaligned<T> to T when formatting it,
instead it insists on finding a matched fmt::formatter<> specialization for it.
that's why we have FTBFS with fmt v10 when printing
these packed<T> variables with fmtlib v10.
in this change, we just cast them to the underlying types before
formatting them. because seastar::unaligned<T> does not provide
a method for accessing the raw value, neither does it provide
a type alias of the type of the underlying raw value, we have
to cast to the type without deducing it from the printed value.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16167
This bug manifested as delays in DDL statement execution, which had to
wait until streaming is finished so that the topology change
coordinator releases the guard.
The reason is that topology change coordinator didn't release the
group0 guard if there is no work to do with active migrations, and
awaits the condition variable without leaving the scope.
Fixes#16182Closesscylladb/scylladb#16183
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.
This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.
We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.
The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.
Fixes#7988
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#16121
This commit replaces the link to the OSS-only page
(the 5.2-to-5.4 upgrade guide not present in
the Enterprise docs) on the Raft page.
While providing the link to the specific upgrade
guide is more user-friendly, it causes build failures
of the Enterprise documentation. I've replaced
it with the link to the general Upgrade section.
The ".. only:: opensource" directive used to wrap
the OSS-only content correctly excludes the content
form the Enterprise docs - but it doesn't prevent
build warnings.
This commit must be backported to branch-5.4 to
prevent errors in all versions.
Closesscylladb/scylladb#16176
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define a formatter for
reconcilable_result::printer, and remove its operator<<().
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16186
* ./tools/jmx 05bb7b68...80ce5996 (4):
> StorageService: Normalize endpoint inetaddress strings to java form
Fixes#16039
> ColumnFamilyStore: only quote table names if necessary
> APIBuilder: allow quoted scope names
> ColumnFamilyStore: don't fail if there is a table with ":" in its name
Fixes#16153
* ./tools/java 10480342...26f5f71c (1):
> NodeProbe: allow addressing table name with colon in it
Also needed for #16153Closesscylladb/scylladb#16146
in Python, a raw string is created using 'r' or 'R' prefix. when
creating the regex using Python string, sometimes, we have to use
"\" to escape the parenthesis so the tools like "sed" can consider
the parenthesis as a capture group. but "\" is also used to escape
strings in Python, in order to put "\" as it is, we use "\" instead
of escaping "\" with "\\" which is obscure. when generating rules,
we use multiple-lines string and do not want to have an empty line
at the beginning of the string so added "\" continuation mark.
but we fail to escape some of the "\" in the string, and just put
"\(", despite that Python accepts it after failing to find a matched
escaped char for it, and interprets it as "\\(". this should still
be considered a misuse of oversight. with python's warning enabled,
one is able see its complaints.
in this change, we escape the "\" properly.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16179
this change addresses a regression introduced by
f4626f6b8e, which stopped notifying
systemd with the status that scylla is READY. without the
notification, systemd would wait in vain for the readiness of
scylla.
Refs f4626f6b8eFixes#16159
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16166
- remove some code that is obsolete in newer Scylla versions,
- fix some minor bugs. These bugs appear to be benign, there are no known issues caused by them, but fixing them is a good idea nevertheless,
- refactor some code for better maintainability.
Parts of this PR were extracted from https://github.com/scylladb/scylladb/pull/15331 (which was merged but later reverted), parts of it are new.
Closesscylladb/scylladb#16162
* github.com:scylladb/scylladb:
test/pylib: log_browsing: fix type hint
migration_manager: take `abort_source&` in get_schema_for_read/write
migration_manager: inline merge_schema_in_background
migration_manager: remove unused merge_schema_from overload
migration_manager: assume `canonical_mutation` support
migration_manager: add `std::move` to avoid a copy
schema_tables: refactor `scylla_tables(schema_features)`
schema_tables: pass `reload` flag when calling `merge_schema` cross-shard
system_keyspace: fix outdated comment
`run_first` lists in `suite.yaml` files provide a simple way to
shorten the tests' average running time by running the slowest
tests at first.
We update these lists, since they got outdated over time:
- `test_topology_ip` was renamed to `test_replace`
and changed suite,
- `test_tablets` changed suite,
- new slow tests were added:
- `test_cluster_features`,
- `test_raft_cluster_features`,
- `test_raft_ignore_nodes`,
- `test_read_repair`.
Closesscylladb/scylladb#16104
before we support incremental repair, these is no point have the
code path setting / getting it. and even worse, it incurs confusion.
so, in this change, we
* just set the field to 0,
* drop the corresponding field in metadata_collector, as we never
update it.
* add a comment to explain why this variable is initialized to 0
Fixes#16098
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16169
repair: Introduce small table optimization
*) Problem:
We have seen in the field it takes longer than expected to repair system tables
like system_auth which has a tiny amount of data but is replicated to all nodes
in the cluster. The cluster has multiple DCs. Each DC has multiple nodes. The
main reason for the slowness is that even if the amount of data is small,
repair has to walk though all the token ranges, that is num_tokens *
number_of_nodes_in_the_cluster. The overhead of the repair protocol for each
token range dominates due to the small amount of data per token range. Another
reason is the high network latency between DCs makes the RPC calls used to
repair consume more time.
*) Solution:
To solve this problem, a small table optimization for repair is introduced in
this patch. A new repair option is added to turn on this optimization.
- No token range to repair is needed by the user. It will repair all token
ranges automatically.
- Users only need to send the repair rest api to one of the nodes in the
cluster. It can be any of the nodes in the cluster.
- It does not require the RF to be configured to replicate to all nodes in the
cluster. This means it can work with any tables as long as the amount of data
is low, e.g., less than 100MiB per node.
*) Performance:
1)
3 DCs, each DC has 2 nodes, 6 nodes in the cluster. RF = {dc1: 2, dc2: 2, dc3: 2}
Before:
```
repair - repair[744cd573-2621-45e4-9b27-00634963d0bd]: stats:
repair_reason=repair, keyspace=system_auth, tables={roles, role_attributes,
role_members}, ranges_nr=1537, round_nr=4612,
round_nr_fast_path_already_synced=4611,
round_nr_fast_path_same_combined_hashes=0, round_nr_slow_path=1,
rpc_call_nr=115289, tx_hashes_nr=0, rx_hashes_nr=5, duration=1.5648403 seconds,
tx_row_nr=2, rx_row_nr=0, tx_row_bytes=356, rx_row_bytes=0,
row_from_disk_bytes={{127.0.14.1, 178}, {127.0.14.2, 178}, {127.0.14.3, 0},
{127.0.14.4, 0}, {127.0.14.5, 178}, {127.0.14.6, 178}},
row_from_disk_nr={{127.0.14.1, 1}, {127.0.14.2, 1}, {127.0.14.3, 0},
{127.0.14.4, 0}, {127.0.14.5, 1}, {127.0.14.6, 1}},
row_from_disk_bytes_per_sec={{127.0.14.1, 0.00010848}, {127.0.14.2,
0.00010848}, {127.0.14.3, 0}, {127.0.14.4, 0}, {127.0.14.5, 0.00010848},
{127.0.14.6, 0.00010848}} MiB/s, row_from_disk_rows_per_sec={{127.0.14.1,
0.639043}, {127.0.14.2, 0.639043}, {127.0.14.3, 0}, {127.0.14.4, 0},
{127.0.14.5, 0.639043}, {127.0.14.6, 0.639043}} Rows/s,
tx_row_nr_peer={{127.0.14.3, 1}, {127.0.14.4, 1}}, rx_row_nr_peer={}
```
After:
```
repair - repair[d6e544ba-cb68-4465-ab91-6980bcbb46a9]: stats:
repair_reason=repair, keyspace=system_auth, tables={roles, role_attributes,
role_members}, ranges_nr=1, round_nr=4, round_nr_fast_path_already_synced=4,
round_nr_fast_path_same_combined_hashes=0, round_nr_slow_path=0,
rpc_call_nr=80, tx_hashes_nr=0, rx_hashes_nr=0, duration=0.001459798 seconds,
tx_row_nr=0, rx_row_nr=0, tx_row_bytes=0, rx_row_bytes=0,
row_from_disk_bytes={{127.0.14.1, 178}, {127.0.14.2, 178}, {127.0.14.3, 178},
{127.0.14.4, 178}, {127.0.14.5, 178}, {127.0.14.6, 178}},
row_from_disk_nr={{127.0.14.1, 1}, {127.0.14.2, 1}, {127.0.14.3, 1},
{127.0.14.4, 1}, {127.0.14.5, 1}, {127.0.14.6, 1}},
row_from_disk_bytes_per_sec={{127.0.14.1, 0.116286}, {127.0.14.2, 0.116286},
{127.0.14.3, 0.116286}, {127.0.14.4, 0.116286}, {127.0.14.5, 0.116286},
{127.0.14.6, 0.116286}} MiB/s, row_from_disk_rows_per_sec={{127.0.14.1,
685.026}, {127.0.14.2, 685.026}, {127.0.14.3, 685.026}, {127.0.14.4, 685.026},
{127.0.14.5, 685.026}, {127.0.14.6, 685.026}} Rows/s, tx_row_nr_peer={},
rx_row_nr_peer={}
```
The time to finish repair difference = 1.5648403 seconds / 0.001459798 seconds = 1072X
2)
3 DCs, each DC has 2 nodes, 6 nodes in the cluster. RF = {dc1: 2, dc2: 2, dc3: 2}
Same test as above except 5ms delay is added to simulate multiple dc
network latency:
The time to repair is reduced from 333s to 0.2s.
333.26758 s / 0.22625381s = 1472.98
3)
3 DCs, each DC has 3 nodes, 9 nodes in the cluster. RF = {dc1: 3, dc2: 3, dc3: 3}
, 10 ms network latency
Before:
```
repair - repair[86124a4a-fd26-42ea-a078-437ca9e372df]: stats:
repair_reason=repair, keyspace=system_auth, tables={role_attributes,
role_members, roles}, ranges_nr=2305, round_nr=6916,
round_nr_fast_path_already_synced=6915,
round_nr_fast_path_same_combined_hashes=0, round_nr_slow_path=1,
rpc_call_nr=276630, tx_hashes_nr=0, rx_hashes_nr=8, duration=986.34015
seconds, tx_row_nr=7, rx_row_nr=0, tx_row_bytes=1246, rx_row_bytes=0,
row_from_disk_bytes={{127.0.57.1, 178}, {127.0.57.2, 178}, {127.0.57.3,
0}, {127.0.57.4, 0}, {127.0.57.5, 0}, {127.0.57.6, 0}, {127.0.57.7, 0},
{127.0.57.8, 0}, {127.0.57.9, 0}}, row_from_disk_nr={{127.0.57.1, 1},
{127.0.57.2, 1}, {127.0.57.3, 0}, {127.0.57.4, 0}, {127.0.57.5, 0},
{127.0.57.6, 0}, {127.0.57.7, 0}, {127.0.57.8, 0}, {127.0.57.9, 0}},
row_from_disk_bytes_per_sec={{127.0.57.1, 1.72105e-07}, {127.0.57.2,
1.72105e-07}, {127.0.57.3, 0}, {127.0.57.4, 0}, {127.0.57.5, 0},
{127.0.57.6, 0}, {127.0.57.7, 0}, {127.0.57.8, 0}, {127.0.57.9, 0}}
MiB/s, row_from_disk_rows_per_sec={{127.0.57.1, 0.00101385},
{127.0.57.2, 0.00101385}, {127.0.57.3, 0}, {127.0.57.4, 0},
{127.0.57.5, 0}, {127.0.57.6, 0}, {127.0.57.7, 0}, {127.0.57.8, 0},
{127.0.57.9, 0}} Rows/s, tx_row_nr_peer={{127.0.57.3, 1},
{127.0.57.4, 1}, {127.0.57.5, 1}, {127.0.57.6, 1}, {127.0.57.7, 1},
{127.0.57.8, 1}, {127.0.57.9, 1}}, rx_row_nr_peer={}
```
After:
```
repair - repair[07ebd571-63cb-4ef6-9465-6e5f1e98f04f]: stats:
repair_reason=repair, keyspace=system_auth, tables={role_attributes,
role_members, roles}, ranges_nr=1, round_nr=4,
round_nr_fast_path_already_synced=4,
round_nr_fast_path_same_combined_hashes=0, round_nr_slow_path=0,
rpc_call_nr=128, tx_hashes_nr=0, rx_hashes_nr=0, duration=1.6052915
seconds, tx_row_nr=0, rx_row_nr=0, tx_row_bytes=0, rx_row_bytes=0,
row_from_disk_bytes={{127.0.57.1, 178}, {127.0.57.2, 178}, {127.0.57.3,
178}, {127.0.57.4, 178}, {127.0.57.5, 178}, {127.0.57.6, 178},
{127.0.57.7, 178}, {127.0.57.8, 178}, {127.0.57.9, 178}},
row_from_disk_nr={{127.0.57.1, 1}, {127.0.57.2, 1}, {127.0.57.3, 1},
{127.0.57.4, 1}, {127.0.57.5, 1}, {127.0.57.6, 1}, {127.0.57.7, 1},
{127.0.57.8, 1}, {127.0.57.9, 1}},
row_from_disk_bytes_per_sec={{127.0.57.1, 0.00037793}, {127.0.57.2,
0.00037793}, {127.0.57.3, 0.00037793}, {127.0.57.4, 0.00037793},
{127.0.57.5, 0.00037793}, {127.0.57.6, 0.00037793}, {127.0.57.7,
0.00037793}, {127.0.57.8, 0.00037793}, {127.0.57.9, 0.00037793}}
MiB/s, row_from_disk_rows_per_sec={{127.0.57.1, 2.22634},
{127.0.57.2, 2.22634}, {127.0.57.3, 2.22634}, {127.0.57.4,
2.22634}, {127.0.57.5, 2.22634}, {127.0.57.6, 2.22634},
{127.0.57.7, 2.22634}, {127.0.57.8, 2.22634}, {127.0.57.9,
2.22634}} Rows/s, tx_row_nr_peer={}, rx_row_nr_peer={}
```
The time to repair is reduced from 986s (16 minutes) to 1.6s
*) Summary
So, a more than 1000X difference is observed for this common usage of
system table repair procedure.
Fixes#16011
Refs #15159Closesscylladb/scylladb#15974
* github.com:scylladb/scylladb:
repair: Introduce small table optimization
repair: Convert put_row_diff_with_rpc_stream to use coroutine
We add a test for concurrent bootstrap in the raft-based topology.
Additionally, we extend the testing framework with a new function -
`ManagerClient.servers_add`. It allows adding multiple servers
concurrently to a cluster.
This PR is the first step to fix#15423. After merging it, if the new test
doesn't fail for some time in CI, we can:
- use `ManagerClient.servers_add` in other tests wherever possible,
- start initial servers concurrently in all suites with
`initial_size > 0`.
Closesscylladb/scylladb#16102
* github.com:scylladb/scylladb:
test: topology: add test_concurrent_bootstrap
test: ManagerClient: introduce servers_add
test: ManagerClient: introduce _create_server_add_data
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define a formatter for
map<timestamp_type, vector<shared_sstable>>. since the operator<<
for this type is only used in the .cc file, and the only use case
of it is to provide the formatter for fmt, so the operator<< based
formatter is remove in this change.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16163
In topology on raft mode, the events "new node starts its group0 server"
and "new node is added to group0 configuration" are not synchronized
with each other. Therefore it might happen that the cluster starts
sending commands to the new node before the node starts its server. This
might lead to harmless, but ugly messages like:
INFO 2023-09-27 15:42:42,611 [shard 0:stat] rpc - client
127.0.0.1:56352 msg_id 2: exception "Raft group
b8542540-5d3b-11ee-99b8-1052801f2975 not found" in no_wait handler
ignored
In order to solve this, the failure detector verb is extended to report
information about whether group0 is alive. The raft rpc layer will drop
messages to nodes whose group0 is not seen as alive.
Tested by adding a delay before group0 is started on the joining node,
running all topology tests and grepping for the aforementioned log
messages.
Fixes: scylladb/scylladb#15853Fixes: scylladb/scylladb#15167Closesscylladb/scylladb#16071
* github.com:scylladb/scylladb:
raft: rpc: introduce destination_not_alive_error
raft: rpc: drop RPCs if the destination is not alive
raft: pass raft::failure_detector to raft_rpc
raft: transfer information about group0 liveness in direct_fd_ping
raft: add server::is_alive
We add a test for concurrent bootstrap support in the raft-based
topology.
The plan is to make this test temporary. In the future, we will:
- use ManagerClient.servers_add in other tests wherever possible,
- start initial servers concurrently in all suites with
initial_size > 0.
So, this test will not test anything unique.
We could make the changes proposed above now instead of adding
this small test. However, if we did that and it turned out that
concurrent bootstrap is flaky in CI, we would make almost every CI
run fail with many failures. We want to avoid such a situation.
Running only this test for some time in CI will reduce the risk
and make investigating any potential failures easier.
We add a new function - servers_add - that allows adding multiple
servers concurrently to a cluster. It makes use of a concurrent
bootstrap now supported in the raft-based topology.
servers_add doesn't have the replace_cfg parameter. The reason is
that we don't support concurrent replace operations, at least for
now.
There is an implementation detail in ScyllaCluster.add_servers. We
cannot simply do multiple calls to add_server concurrently. If we
did that in an empty cluster, every node would take itself as the
only seed and start a new cluster. To solve this, we introduce a
new field - initial_seed. It is used to choose one of the servers
as a seed for all servers added concurrently to an empty cluster.
Note that the add_server calls in asyncio.gather in add_servers
cannot race with each other when setting initial_seed because
there is only one thread.
In the future, we will also start all initial servers concurrently
in ScyllaCluster.install_and_start. The changes in this commit were
designed in a way that will make changing install_and_start easy.
before this change, in sstable_run_based_compaction_test, we check
every 4 sstables, to verify that we close the sstable to be replaced
in a batch of 4.
since the integer-based generation identifier is monotonically
incremental, we can assume that the identifiers of sstables are like
0, 1, 2, 3, .... so if the compaction consumes sstable in a
batch of 4, the identifier of the first one in the batch should
always be the multiple of 4. unfortunately, this test does not work
if we use uuid-based identifier.
but if we take a closer look at how we create the dataset, we can
have following facts:
1. the `compaction_descriptor` returned by
`sstable_run_based_compaction_strategy_for_tests` never
set `owned_ranges` in the returned descriptor
2. in `compaction::setup_sstable_reader`, `mutation_reader::forward::no`
is used, if `_owned_ranges_checker` is empty
3. `mutation_reader_merger` respects the `fwd_mr` passed to its
ctor, so it closes current sstable immediately when the underlying
mutation reader reaches the end of stream.
in other words, we close every sstable once it is fully consumed in
sstable_ompaction_test. and the reason why the existing test passes
is that we just sample the sstables whose generation id is a multiple
of 4. what happens when we perform compaction in this test is:
1. replace 5 with 33, closing 5
2. replace 6 with 34, closing 6
3. replace 7 with 35, closing 7
4. replace 8 with 36, closing 8 << let's check here.. good, go on!
5. replace 13 with 37, closing 13
...
8. replace 16 with 40, closing 16 << let's check here.. also, good, go on!
so, in this change, we just check all old sstables, to verify that
we close each of them once it is fully consumed.
Fixes https://github.com/scylladb/scylladb/issues/16073Closesscylladb/scylladb#16074
* github.com:scylladb/scylladb:
test/sstable_compaction_test: check every sstable replaced sstable
test/sstable_compaction_test: s/old_sstables.front()/old_sstable/
Support for `canonical_mutation`s was added way back in Scylla 3.2. A
lot of code in `migration_manager` is still checking whether the old
`frozen_mutations` are received or need to be sent.
We no longer need this code, since we don't support version skips during
upgrade (and certainly not upgrades like 3.2->5.4).
Leave a sanity checks in place, but otherwise delete the
`frozen_mutation` branches.
The `scylla_tables` function gives a different schema definition
for the `system_schema.scylla_tables` table, depending on whether
certain schema features are enabled or not.
The way it was implemented, we had to write `θ(2^n)` amount
of code and comments to handle `n` features.
Refactor it so that the amount of code we have to write to handle `n`
features is `θ(n)`.
In 0c86abab4d `merge_schema` obtained a new flag, `reload`.
Unfortunately, the flag was assigned a default value, which I think is
almost always a bad idea, and indeed it was in this case. When
`merge_schema` is called on shard different than 0, it recursively calls
itself on shard 0. That recursive call forgot to pass the `reload` flag.
Fix this.
Add a new destination_not_alive_error, thrown from two-way RPCs in case
when the RPC is not issued because the destination is not reported as
alive by the failure detector.
In snapshot transfer code, lower the verbosity of the message printed in
case it fails on the new error. This is done to prevent flakiness in the
CI - in case of slow runs, nodes might get spuriously marked as dead if
they are busy, and a message with the "error" verbosity can cause some
tests to fail.
The replace operation is defined to succeed only if the node being
replaced is dead. We should reject this operation when the failure
detector considers the node being replaced alive.
Apart from adding this change, this PR adds a test case -
`test_replacing_alive_node_fails` - that verifies it. A few testing
framework adjustments were necessary to implement this test and
to avoid flakiness in other tests that use the replace operation after
the change. From now, we need to ensure that all nodes see the
node being replaced as dead before starting the replace. Otherwise,
the check added in this PR could reject the replace.
Additionally, this PR changes the replace procedure in a way that
if the replacing node reuses the IP of the node being replaced, other
nodes can see it as alive only after the topology coordinator accepts
its join request. The replacing node may become alive before the
topology coordinator checks if the node being replaced is dead. If
that happens and the replacing node reuses the IP of the node being
replaced, the topology coordinator cannot know which of these two
nodes is alive and whether it should reject the join request.
Fixes#15863Closesscylladb/scylladb#15926
* github.com:scylladb/scylladb:
test: add test_replacing_alive_node_fails
raft topology: reject replace if the node being replaced is not dead
raft topology: add the gossiper ref to topology_coordinator
test: test_cluster_features: stop gracefully before replace
test: decrease failure_detector_timeout_in_ms in replace tests
test: move test_replace to topology_custom
test: server_add: wait until the node being replaced is dead
test: server_add: add support for expected errors
raft topology: join: delay advertising replacing node if it reuses IP
raft topology: join: fix a condition in validate_joining_node
the operator<<() based formatter is only used in its test, so
let's move it to where it is used.
we can always bring it back later if it is required in other places.
but better off implementing it as a fmt::formatter<> then.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16142
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define a formatter for
`cql_transport::messages::result_message::rows`
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16143
before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define a formatter for db::seed_provider_type.
please note, we are still formatting vector<db::seed_provider_type>
with the helper provided by seastar/core/sstring.hh, which uses
operator<<() to print the elements in the vector being printed.
so we have to keep the operator<< formatter before disabling
the generic formatter for vector<T>.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16138
generation_number's type is `generation_type`, which in turn is a
`utils::tagged_integer<struct generation_type_tag, int32_t>`,
which formats using either fmtlib which uses ostream_formatter backed by
operator<< . but `ostream_formatter` does not provide the specifier
support. so {:d} does apply to this type, when compiling with fmtlib
v10, it rejects the format specifier (the error is attached at the end
of the commit message).
so in this change, we just drop the format specifier. as fmtlib prints
`int32_t` as a decimal integer, so even if {:d} applied, it does not
change the behavior.
```
/home/kefu/dev/scylladb/gms/gossiper.cc:1798:35: error: call to consteval function 'fmt::basic_format_string<char, utils::tagged_tagged_integer<utils::final, gms::generation_type_tag, int> &, utils::tagged_tagged_integer<utils::final, gms::generation_type_tag, int> &>::basic_format_string<char[48], 0>' is not a constant expression
1798 | auto err = format("Remote generation {:d} != local generation {:d}", remote_gen, local_gen);
| ^
/usr/include/fmt/core.h:2322:31: note: non-constexpr function 'throw_format_error' cannot be used in a constant expression
2322 | if (!in(arg_type, set)) throw_format_error("invalid format specifier");
| ^
/usr/include/fmt/core.h:2395:14: note: in call to 'parse_presentation_type.operator()(1, 510)'
2395 | return parse_presentation_type(pres::dec, integral_set);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/fmt/core.h:2706:9: note: in call to 'parse_format_specs<char>(&"Remote generation {:d} != local generation {:d}"[20], &"Remote generation {:d} != local generation {:d}"[47], formatter<mapped_type, char_type>().formatter::specs_, checker(s).context_, 13)'
2706 | detail::parse_format_specs(ctx.begin(), ctx.end(), specs_, ctx, type);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/fmt/core.h:2561:10: note: in call to 'formatter<mapped_type, char_type>().parse<fmt::detail::compile_parse_context<char>>(checker(s).context_)'
2561 | return formatter<mapped_type, char_type>().parse(ctx);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/fmt/core.h:2647:39: note: in call to 'parse_format_specs<utils::tagged_tagged_integer<utils::final, gms::generation_type_tag, int>, fmt::detail::compile_parse_context<char>>(checker(s).context_)'
2647 | return id >= 0 && id < num_args ? parse_funcs_[id](context_) : begin;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/fmt/core.h:2485:15: note: in call to 'handler.on_format_specs(0, &"Remote generation {:d} != local generation {:d}"[20], &"Remote generation {:d} != local generation {:d}"[47])'
2485 | begin = handler.on_format_specs(adapter.arg_id, begin + 1, end);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/fmt/core.h:2541:13: note: in call to 'parse_replacement_field<char, fmt::detail::format_string_checker<char, utils::tagged_tagged_integer<utils::final, gms::generation_type_tag, int>, utils::tagged_tagged_integer<utils::final, gms::generation_type_tag, int>> &>(&"Remote generation {:d} != local generation {:d}"[19], &"Remote generation {:d} != local generation {:d}"[47], checker(s))'
2541 | begin = parse_replacement_field(p, end, handler);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/fmt/core.h:2769:7: note: in call to 'parse_format_string<true, char, fmt::detail::format_string_checker<char, utils::tagged_tagged_integer<utils::final, gms::generation_type_tag, int>, utils::tagged_tagged_integer<utils::final, gms::generation_type_tag, int>>>({&"Remote generation {:d} != local generation {:d}"[0], 47}, checker(s))'
2769 | detail::parse_format_string<true>(str_, checker(s));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/kefu/dev/scylladb/gms/gossiper.cc:1798:35: note: in call to 'basic_format_string<char[48], 0>("Remote generation {:d} != local generation {:d}")'
1798 | auto err = format("Remote generation {:d} != local generation {:d}", remote_gen, local_gen);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16126
Clang-18 starts to complain when a constexp value is casted to a
enum and the value is out of the range of the enum values. in this
case, boost intentially cast the out-of-range values to the
type to be casted. so silence this warning at this moment.
since `lexical_cast.hpp` is included in multiple places in the
source tree, this warning is disabled globally.
the warning look like:
```
In file included from /home/kefu/dev/scylladb/types/types.cc:9:
In file included from /usr/include/boost/lexical_cast.hpp:32:
In file included from /usr/include/boost/lexical_cast/try_lexical_convert.hpp:43:
In file included from /usr/include/boost/lexical_cast/detail/converter_numeric.hpp:36:
In file included from /usr/include/boost/numeric/conversion/cast.hpp:33:
In file included from /usr/include/boost/numeric/conversion/converter.hpp:13:
In file included from /usr/include/boost/numeric/conversion/conversion_traits.hpp:13:
In file included from /usr/include/boost/numeric/conversion/detail/conversion_traits.hpp:18:
In file included from /usr/include/boost/numeric/conversion/detail/int_float_mixture.hpp:19:
In file included from /usr/include/boost/mpl/integral_c.hpp:32:
/usr/include/boost/mpl/aux_/integral_wrapper.hpp:73:31: error: integer value -1 is outside the valid range of values [0, 3] for the enumeration type 'udt_buil
tin_mixture_enum' [-Wenum-constexpr-conversion]
73 | typedef AUX_WRAPPER_INST( BOOST_MPL_AUX_STATIC_CAST(AUX_WRAPPER_VALUE_TYPE, (value - 1)) ) prior;
| ^
/usr/include/boost/mpl/aux_/static_cast.hpp:24:47: note: expanded from macro 'BOOST_MPL_AUX_STATIC_CAST'
24 | # define BOOST_MPL_AUX_STATIC_CAST(T, expr) static_cast<T>(expr)
| ^
In file included from /home/kefu/dev/scylladb/types/types.cc:9:
In file included from /usr/include/boost/lexical_cast.hpp:32:
In file included from /usr/include/boost/lexical_cast/try_lexical_convert.hpp:43:
In file included from /usr/include/boost/lexical_cast/detail/converter_numeric.hpp:36:
In file included from /usr/include/boost/numeric/conversion/cast.hpp:33:
In file included from /usr/include/boost/numeric/conversion/converter.hpp:13:
In file included from /usr/include/boost/numeric/conversion/conversion_traits.hpp:13:
In file included from /usr/include/boost/numeric/conversion/detail/conversion_traits.hpp:18:
In file included from /usr/include/boost/numeric/conversion/detail/int_float_mixture.hpp:19:
In file included from /usr/include/boost/mpl/integral_c.hpp:32:
/usr/include/boost/mpl/aux_/integral_wrapper.hpp:73:31: error: integer value -1 is outside the valid range of values [0, 3] for the enumeration type 'int_float_mixture_enum' [-Wenum-constexpr-conversion]
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16082
this was a typo introduced by 781b7de5. which intended to add
-Wignored-qualifiers to the compiling options, but it ended up
adding -Wignore-qualifiers.
in this change, the typo is corrected.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16124
The write_mutations_to_database() decides if it needs to flush the
database by checking if the mutations came to system.topology table and
performing some more checks if they did. Overall this looks like
auto topo_schema = db.find_schema(system.topology)
if (target_schema != topo_schema)
return false;
// extra checks go here
However, the system.topology table exists only if the feature named
CONSISTENT_TOPOLOGY_CHANGES is enabled via commandline. If it's not, the
call to db.find_schema(system.topology) throws and the whole attempt to
write mutations throws too stopping raft state machine.
Since the intention is to check if the target schema is the topology
table, the presense of this table should come first.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#16089
Currently, the NIC prompt on scylla_setupshows up virtual devices such as
VLAN devices and bridge devices, but perftune.py does not support them.
To prevent causing error while running scylla_setup, we should stop listing
these devices from the NIC prompt.
closes#6757Closesscylladb/scylladb#15958
If the failure detector sees the destination as dead, there is no use to
send the RPC so drop it silently.
This only affects two-way RPCs and "request" one-way RPCs. The one-way
RPCs used as responses to other one-way RPCs are not affected.
Add a new variant of the reply to the direct_fd_ping which specifies
whether the local group0 is alive or not, and start actively using it.
There is no need to introduce a cluster feature. Due to how our
serialization framework works, nodes which do not recognize the new
variant will treat it as the existing std::monostate. The std::monostate
means "the node and group0 is alive"; nodes before the changes in this
commit would send a std::monostate anyway, so this is completely
transparent for the old nodes.
Add a method which reports whether given raft server is running.
In following commits, the information about whether the local raft
group 0 is running or not will be included in the response to the
failure detector ping, and the is_alive method will be used there.
add fmt formatter for `assignment_testable`.
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `assignment_testabe` without the help of `operator<<`.
since we are still printing the shared_ptr<assignment_testable> using
operator<<(.., const assignment_testable&), we cannot drop this operator
yet.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16127