While measuring #17149 with this test, some changes were applied. Here they are:
- keep initial_tablets number in output json's parameters section
- disable auto compaction
- add control over the amount of sstables generated for --bypass-cache case
Closes scylladb/scylladb#17473
* github.com:scylladb/scylladb:
perf_simple_query: Add --memtable-partitions option
perf_simple_query: Disable auto compaction
perf_simple_query: Keep number of initial tablets in output json
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* raft::election_tracker
* raft::votes
* raft::vote_result
and drop their operator<<:s.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17670
before this change, "ring" subcommand has two issues:
1. `--resolve-ip` option accepts a boolean argument, but this option
should be a switch, which does not accept any argument at all
2. it always prints the endpoint no matter if `--resolve-ip` is
specified or not. but it should print the resolved name, instead
of an IP address if `--resolve-ip` is specified.
in this change, both issues are addressed. and the test is updated
accordingly to exercise the case where `--resolve-ip` is used.
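As a loose illustration only (the real tool is written in C++; the Python `argparse` parsers and the `display_name` helper below are hypothetical and not the nodetool code), the difference between a switch and a boolean-valued option, and the intended printing rule, might be sketched as:

```python
import argparse

# Hypothetical parser: "--resolve-ip" as a switch that takes no argument.
switch_parser = argparse.ArgumentParser(prog="ring")
switch_parser.add_argument("--resolve-ip", action="store_true")

# Hypothetical parser: "--resolve-ip" as an option that demands a value,
# which is the behaviour described as issue 1.
valued_parser = argparse.ArgumentParser(prog="ring")
valued_parser.add_argument("--resolve-ip", type=lambda s: s.lower() == "true")


def display_name(endpoint, hostnames, resolve_ip):
    """Return the resolved name if requested and known, else the IP address."""
    if resolve_ip:
        return hostnames.get(endpoint, endpoint)
    return endpoint
```

With the switch form, `ring --resolve-ip` parses on its own, while the valued form would reject it for lacking an argument.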
Closes scylladb/scylladb#17553
* github.com:scylladb/scylladb:
tools/scylla-nodetool: print hostname if --resolve-ip is passed to "ring"
test/nodetool: calc max_width from all_hosts
test/nodetool: keep tokens as Host's member
test/nodetool: remove unused import
in `ScyllaServer::add_server()`, `self.create_server()` is called to
create a server, but if it raises, we would reference a local variable
of `server` which is not bound to any value, as `server` is not assigned
at that moment. if `ScyllaServer` is used by `ScyllaClusterManager`, we
would not be able to see the real exception apart from the error like
```
cannot access local variable 'server' where it is not associated with a
value
```
which is merely the error from the Python runtime.
in this change, `server` is always initialized, and we check for None
before dereferencing it.
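A minimal sketch of the pattern, assuming hypothetical `create` and `start` callables and a `stop()` method on the server (this is not the actual `add_server()` code):

```python
def add_server(create, start):
    """Create and start a server, cleaning up only if one was created."""
    server = None  # always bound, even if create() raises
    try:
        server = create()
        start(server)
    except Exception:
        # Without the initialization above, touching `server` here would
        # raise UnboundLocalError and hide the real exception.
        if server is not None:
            server.stop()
        raise
    return server
```

If `create()` raises, the original exception propagates unchanged instead of being masked by the unbound-variable error.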
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17693
The test cases in this suite need to start scylla with custom config options, restart it and call API on it. At the time the suite was created, none of this was possible with any library facility, so the suite carries its own version of the managed_cluster class that piggy-backs on cql-pytest scylla starting. Now test.py has a pretty flexible manager that provides all the scylla cluster management the object_store suite needs. This PR makes the suite use the manager client instead of the home-brew managed_cluster class.
refs: #16006
fixes: #16268
Closes scylladb/scylladb#17292
* github.com:scylladb/scylladb:
test/object_store: Remove unused managed_cluster (and other stuff)
test/object_store: Use tmpdir fixture in flush-retry case
test/object_store: Turn flush-retry case to use ManagerClient
test/object_store: Turn "misconfigured" case to use ManagerClient
test/object_store: Turn garbage-collect case to use ManagerClient
test/object_store: Turn basic case to use ManagerClient
test/object_store: Prepare to work with ManagerClient
These tests are inserting data into RF=3 tables, but used the default
consistency level which is taken from the default execution profile
which is set to LOCAL_QUORUM. The tests would then read with CL=ONE, so
we cannot give a guarantee that some of the data won't be missed. Fix
this by inserting the data with CL=ALL. (Do it for all RF cases for
simplicity.)
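The reasoning can be checked with the usual overlap rule: a read is guaranteed to see a write only if the number of replicas acknowledging the write plus the number of replicas read exceeds RF. A small sketch (the helper names are illustrative, not part of the tests):

```python
def quorum(rf):
    """Number of replicas in a quorum for the given replication factor."""
    return rf // 2 + 1


def read_sees_write(rf, write_acks, read_replicas):
    """True if every read set must overlap every acknowledged write set."""
    return write_acks + read_replicas > rf
```

For RF=3, a quorum write (2 acks) followed by a CL=ONE read gives 2 + 1 = 3, which does not exceed 3, whereas a CL=ALL write gives 3 + 1 = 4.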
Fixes scylladb/scylladb#17695
Closes scylladb/scylladb#17700
The test is booting nodes, and then immediately starts shutting down
nodes and removing them from the cluster. The shutting down and
removing may happen before the driver manages to connect to all nodes in the
cluster. In particular, the driver didn't yet connect to the last
bootstrapped node. Or it can even happen that the driver has connected,
but the control connection is established to the first node, and the
driver fetched topology from the first node when the first node didn't
yet consider the last node to be normal. So the driver decides to close
connection to the last node like this:
```
22:34:03.159 DEBUG> [control connection] Removing host not found in
peers metadata: <Host: 127.42.90.14:9042 datacenter1>
```
Eventually, at the end of the test, only the last node remains, all
other nodes have been removed or stopped. But the driver does not have a
connection to that last node.
Fix this problem by ensuring that:
- all nodes see each other as NORMAL,
- the driver has connected to all nodes
at the beginning of the test, before we start shutting down and removing
nodes.
Fixes scylladb/scylladb#16373
Closes scylladb/scylladb#17676
The following incompatibilities were identified by `listsnapshots_test.py` in dtests:
* Command doesn't bail out when there are no snapshots, instead it prints a meaningless empty report
* Formatting is incompatible
Both are fixed in this mini-series.
Closes scylladb/scylladb#17541
* github.com:scylladb/scylladb:
tools/scylla-nodetool: listsnapshots: make the formatting compatible with origin's
tools/scylla-nodetool: listsnapshots: bail out if there are no snapshots
before this change, "ring" subcommand has two issues:
1. `--resolve-ip` option accepts a boolean argument, but this option
should be a switch, which does not accept any argument at all
2. it always prints the endpoint no matter if `--resolve-ip` is
specified or not. but it should print the resolved name, instead
of an IP address if `--resolve-ip` is specified.
in this change, both issues are addressed. and the test is updated
accordingly to exercise the case where `--resolve-ip` is used.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
It might happen that multiple tablets co-habit the same shard, so we want load-and-stream to jump into a new streaming session for every tablet, such that the receiver will have the data properly segregated. That's similar to the treatment we gave to repair. Today, load-and-stream fails due to sstables spanning more than 1 tablet in the receiver.
Synchronization with migration is done by taking the replication map, so migrations cannot advance while streaming new data. A bug was fixed too: data must also be streamed to pending replicas, to handle the case where a migration is ongoing and new data must reach both the old and the new replica set. A test was added stressing this synchronization path.
Another bug was fixed in sstable loading, which expected the sharder to not be invalidated throughout the operation, but that assumption breaks during migrations.
Fixes #17315.
Closes scylladb/scylladb#17449
* github.com:scylladb/scylladb:
test: test_tablets: Add load-and-stream test
sstables_loader: Stream to pending tablet replica if needed
sstables_loader: Implement tablet based load-and-stream
sstables_loader: Virtualize sstable_streamer for tablet
sstables_loader: Avoid reallocations in vector
sstable_loader: Decouple sstable streaming from selection
sstables_loader: Introduce sstable_streamer
Fix online SSTable loading with concurrent tablet migration
When no keyspace is provided, request all keyspaces from the server,
then scrub all of them. This is what the legacy nodetool does; for some
reason this was missed when re-implementing scrub.
Closes scylladb/scylladb#17495
There are 4 barrier-only stages when migrating a tablet, and the test needs to fail the pending/leaving replica that handles the barrier in order to validate how the coordinator handles a dead node. Failing the barrier is done by suspending it with injection code and stopping the node without waking it up. The main difficulty here is how to tell one barrier RPC call from another, because they don't carry anything that could tell which stage the barrier is run for. This PR suggests that the barrier injection code looks directly into the system.tablets table for the transition stage; the stage is already there by the time the barrier is about to ack itself over RPC.
refs: #16527
Closes scylladb/scylladb#17450
* github.com:scylladb/scylladb:
topology.tablets_migration: Handle failed use_new
topology.tablets_migration: Handle failed write_both_read_new
topology.tablets_migration: Handle failed write_both_read_old
topology.tablets_migration: Handle failed allow_write_both_read_old
test/tablets_migration: Add conditional break-point into barrier handler
replica: Add helper to read tablet transition stage
topology_coordinator: Add action_failed() helper
The author (me) tried to be clever and fix the formatting, but then he
realized this just means a lot of unnecessary fighting with tests. So
this patch makes the formatting compatible with that of the legacy
nodetool:
* Use compatible rounding and precision formatting
* Use incorrect unit (KB instead of KiB)
* Align numbers to the left
* Add trailing white-space to "Snapshot Details: "
These two parameters are not used by the native nodetool, because
ScyllaDB itself doesn't support them. These should be just ignored and
indeed there was a unit test checking that this is the case. However,
due to a mistake in the unit test, this was not actually tested and
nodetool complained when seeing these params.
This patch fixes both the test and the native nodetool.
Closes scylladb/scylladb#17477
key_view::explode() contains a blatant use-after-free:
unless the input is already linearized, it returns a view to a local temporary buffer.
This is rare, because partition keys are usually not large enough to be fragmented.
But for a sufficiently large key, this bug causes a corrupted partition_key down
the line.
Fixes #17625
Closes scylladb/scylladb#17626
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* tablet_id
* tablet_replica
* tablet_metadata
* tablet_map
their operator<<:s are dropped
Refs scylladb/scylladb#13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17504
before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define formatters for
* position_range
* mutation_fragment
* range_tombstone_stream
* mutation_fragment_v2::printer
Refs #13245
Closes scylladb/scylladb#17521
* github.com:scylladb/scylladb:
mutation: add fmt::formatter for position_range
mutation: add fmt::formatter for mutation_fragment and range_tombstone_stream
mutation: add fmt::formatter for mutation_fragment_v2::printer
This stage doesn't need any special treatment, because we cannot revert
to old replicas and should proceed normally. The barrier itself won't
get stuck, because it already handles excluded/ignored nodes.
Just make the test validate it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Two options here -- either revert to old replicas by jumping into the
cleanup_target stage or proceed normally. The choice depends on which
replica set has fewer dead nodes.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
At this stage it can happen that target replica got some writes, so its
tablet needs to be cleaned up, so jump to cleanup_target stage.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are several transition stages that are executed by the topology
coordinator with the help of barrier-and-drain raft commands. For the
test to stop and remove a node while handling this stage it must inject
a break-point into barrier handler, wait for it to happen and then stop
the node without resuming the break-point. Then removenode from the
cluster.
The break-point suspends barrier handling when a specific tablet is in
specific transition stage. Tablet ID and desired stage are configured
via injector parameters.
With today's error-injection facilities, the way to suspend code
execution is by injecting a lambda that waits for a message from the
injection engine.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This patch series makes all auth writes serialized via raft. Reads stay
eventually consistent for performance reasons. To make the transition to the
new code easier, data is stored in a newly created keyspace: system_auth_v2.
Internally the difference is that instead of executing CQL directly for
writes we generate mutations and then announce them via raft group0.
Per-commit descriptions provide more implementation details.
Refs https://github.com/scylladb/scylladb/issues/16970
Fixes https://github.com/scylladb/scylladb/issues/11157
Closes scylladb/scylladb#16578
* github.com:scylladb/scylladb:
test: extend auth-v2 migration test to catch stale static
test: add auth-v2 migration test
test: add auth-v2 snapshot transfer test
test: auth: add tests for lost quorum and command splitting
test: pylib: disconnect driver before re-connection
test: adjust tests for auth-v2
auth: implement auth-v2 migration
auth: remove static from queries on auth-v2 path
auth: coroutinize functions in password_authenticator
auth: coroutinize functions in standard_role_manager
auth: coroutinize functions in default_authorizer
storage_service: add support for auth-v2 raft snapshots
storage_service: extract getting mutations in raft snapshot to a common function
auth: service: capture string_view by value
alternator: add support for auth-v2
auth: add auth-v2 write paths
auth: add raft_group0_client as dependency
cql3: auth: add a way to create mutations without executing
cql3: run auth DML writes on shard 0 and with raft guard
service: don't loose service_level_controller when bouncing client_state
auth: put system_auth and users consts in legacy namespace
cql3: parametrize keyspace name in auth related statements
auth: parametrize keyspace name in roles metadata helpers
auth: parametrize keyspace name in password_authenticator
auth: parametrize keyspace name in standard_role_manager
auth: remove redundant consts auth::meta::*::qualified_name
auth: parametrize keyspace name in default_authorizer
db: make all system_auth_v2 tables use schema commitlog
db: add system_auth_v2 tables
db: add system_auth_v2 keyspace
When a tool application is invoked with an unknown operation, an error
message is printed, which includes all the known operations, with all
their aliases. This is collected in `std::vector<std::string_view>`. The
problem is that the vector containing alias names is returned by
value, so the code ends up creating views to temporaries.
Fix this by returning the alias vector by const&.
Fixes: #17584
Closes scylladb/scylladb#17586
before this change, we failed to apply the filtering of tablestats
command in the right way:
1. `table_filter` failed to check if delimiter is npos before
extracting the cf component from the specified table name.
2. the stats should not include the keyspaces which are not
included by the filter.
3. the total number of tables in the stats report should contain
all tables no matter whether they are filtered or not.
in this change, all the problems above are addressed. and the tests
are updated to cover these use cases.
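The first issue can be sketched in Python, where `str.find()` returning -1 plays the role of `npos`. The `split_table_name` helper is hypothetical, and treating a name without a delimiter as a keyspace-only filter is an assumption, not a statement about the actual `table_filter` code:

```python
def split_table_name(spec, delimiter="."):
    """Split "ks.cf" into (ks, cf); return (ks, None) if there is no cf part."""
    pos = spec.find(delimiter)
    if pos == -1:
        # No delimiter: slicing with pos + 1 would silently yield the whole
        # string as the cf component, so handle this case explicitly.
        return spec, None
    return spec[:pos], spec[pos + 1:]
```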
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17468
For tables using tablet based replication strategies, the sstables should be reshaped only within the compaction groups they belong to. The shard_reshaping_compaction_task_impl now groups the sstables based on their compaction groups before reshaping them.
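The grouping step can be pictured with a short Python sketch. The real task is C++; `group_by_compaction_group` and the `group_id_of` callback are illustrative names only:

```python
from collections import defaultdict


def group_by_compaction_group(sstables, group_id_of):
    """Bucket sstables by compaction group id, preserving input order."""
    groups = defaultdict(list)
    for sst in sstables:
        groups[group_id_of(sst)].append(sst)
    return dict(groups)
```

Each bucket would then be reshaped on its own, so no reshape job mixes sstables from different compaction groups.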
Fixes https://github.com/scylladb/scylladb/issues/16966
Closes scylladb/scylladb#17395
* github.com:scylladb/scylladb:
test/topology_custom: add testcase to verify reshape with tablets
test/pylib/rest_client: add get_sstable_info, enable/disable_autocompaction
replica/distributed_loader: enable reshape for sstables
compaction: reshape sstables within compaction groups
replica/table : add method to get compaction group id for an sstable
compaction: reshape: update total reshaped size only on success
compaction: simplify exception handling in shard_reshaping_compaction_task_impl::run
- use API endpoint of /storage_service/toppartition/
- only print out the specified samplings.
- print "\n" separator between samplings
Closes scylladb/scylladb#17574
* github.com:scylladb/scylladb:
tools/scylla-nodetool: print separator between samplings
tools/scylla-nodetool: only print the specified sampling
tools/scylla-nodetool: use /storage_service/toppartition/
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* std::vector<data_type>
* column_identifier
* column_identifier_raw
* untyped_constant::type_class
and drop their operator<<:s
Refs #13245
Closes scylladb/scylladb#17538
* github.com:scylladb/scylladb:
cql3: add fmt::formatter for expression::printer
cql3: add fmt::formatter for raw_value{,_view}
cql3: add fmt::formatter for std::vector<data_type>
cql3: add fmt::formatter for untyped_constant::type_class
cql3: add fmt::formatter for column_identifier{,_row}
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* raw_value
* raw_value_view
`raw_value_view` 's operator<< is still being used by the generic
homebrew printer for vector<>, so it is preserved.
`raw_value` 's operator<< is still being used by the generic
homebrew printer for optional<>, so it's preserved as well.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
We decrease the server's request timeouts in topology tests so that
they are lower than the driver's timeout. Before, the driver could
time out its request before the server handled it successfully.
This problem caused scylladb/scylladb#15924.
Since scylladb/scylladb#15924 is the last issue mentioned in
scylladb/scylladb#15962, this PR also reenables background
writes in `test_topology_ops` with tablets disabled. The test
doesn't pass with tablets and background writes because of
scylladb/scylladb#17025. We will reenable background writes
with tablets after fixing that issue.
Fixes scylladb/scylladb#15924
Fixes scylladb/scylladb#15962
Closes scylladb/scylladb#17585
* github.com:scylladb/scylladb:
test: test_topology_ops: reenable background writes without tablets
test: test_topology_ops: run with and without tablets
test: topology: decrease the server's request timeouts
Tests that verify upgrading to the raft-based topology
(`test_topology_upgrade`, `test_topology_recovery_basic`,
`test_topology_recovery_majority_loss`) have flaky
`check_system_topology_and_cdc_generations_v3_consistency` calls.
`assert topo_results[0] == topo_res` can fail because of different
`unpublished_cdc_generations` on different nodes.
The upgrade procedure creates a new CDC generation, which is later
published by the CDC generation publisher. However, this can happen
after the upgrade procedure finishes. In tests, if publishing
happens just before querying `system.topology` in
`check_system_topology_and_cdc_generations_v3_consistency`, we can
observe different `unpublished_cdc_generations` on different nodes.
It is an expected and temporary inconsistency.
For the same reasons,
`check_system_topology_and_cdc_generations_v3_consistency` can
fail after adding a new node.
To make the tests not flaky, we wait until the CDC generation
publisher finishes its job. Then, all nodes should always have
equal (and empty) `unpublished_cdc_generations`.
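The waiting step follows the usual poll-until-deadline shape. A generic sketch, assuming an async predicate supplied by the test (`wait_until` is a hypothetical helper, not the function used in the tests):

```python
import asyncio
import time


async def wait_until(pred, timeout, period=0.1):
    """Poll the async predicate until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while not await pred():
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met before the deadline")
        await asyncio.sleep(period)
```

In this setting the predicate would report whether `unpublished_cdc_generations` is already empty on the queried node.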
Fixes scylladb/scylladb#17587
Fixes scylladb/scylladb#17600
Fixes scylladb/scylladb#17621
Closes scylladb/scylladb#17622
With auth-v2 we can log in even if quorum is lost. So the test
which checks that an error occurs in such a situation is deleted,
and the opposite test, which checks that logging in works, is
added.
Alternator doesn't do any writes to auth
tables, so it's simply a change of keyspace
name.
Docs will be updated later, when auth-v2
is enabled as default.
After fixing scylladb/scylladb#15924 in one of the previous
patches, we reenable background writes in `test_topology_ops`.
We also start background writes a bit later after adding all nodes.
Without this change and with tablets, the test fails with:
```
> await cql.run_async(f"CREATE TABLE tbl (pk int PRIMARY KEY, v int)")
E cassandra.protocol.ConfigurationException: <Error from server: code=2300
[Query invalid because of configuration issue] message="Datacenter
datacenter1 doesn't have enough nodes for replication_factor=3">
```
The change above makes the test a bit weaker, but we don't have to
worry about it. If adding nodes is bugged, other tests should
detect it.
Unfortunately, the test still doesn't pass with tablets and
background writes because of scylladb/scylladb#17025, so we keep
background writes disabled with tablets and leave FIXME.
Fixes scylladb/scylladb#15962
We decrease the server's request timeouts in topology tests so that
they are lower than the driver's timeout. Before, the driver could
time out its request before the server handled it successfully.
This problem caused scylladb/scylladb#15924.
A high server's request timeout can slow down the topology tests
(see the new comment in `make_scylla_conf`). We make the timeout
dependent on the testing mode to not slow down tests for no reason.
We don't touch the driver's request timeout. Decreasing it in some
modes would require too much effort for almost no improvement.
Fixes scylladb/scylladb#15924
Calling notify_left for old ip on topology change in raft mode
was a regression. In gossiper mode it didn't occur. In gossiper
mode the function handle_state_normal was responsible for spotting
IP addresses that weren't managing any parts of the data, and
it would then initiate their removal by calling remove_endpoint.
This removal process did not include calling notify_left.
Actually, notify_left was only supposed to be called (via excise) by
'real' removal procedures - removenode and decommission.
The redundant notify_left caused trouble in the scylla python driver.
The driver could receive REMOVED_NODE and NEW_NODE notifications
at the same time and their handling routines could race with each other.
In this commit we fix the problem by not calling notify_left if
the remove_ip lambda was called from the ip change code path.
Also, we add a test which verifies that the driver log doesn't
mention the REMOVED_NODE notification.
Fixes scylladb/scylladb#17444
Closes scylladb/scylladb#17561