The issues mentioned in the comment before are already fixed.
Unfortunately, there is another, opposite issue which this function can
be used for. The previous issue was about the existing driver session
not reconnecting. The current issue is about the existing driver session
reconnecting too much... (and in the middle of queries.)
Waiting for CQL connections is not enough. For the queries to succeed,
nodes must see each other. We have to wait for this, otherwise the test
will be flaky.
Fixes#17029Closesscylladb/scylladb#17040
We do not support tablet resharding yet. All tablet-related code assumes that the (host_id, shard) tablet replica is always valid. Violating this leads to undefined behaviour: errors in the tablet load balancer and potential crashes.
Avoid this by refusing to start if the need to resharding is detected. Be as lenient as possible: check all tablets with a replica on this node, and only refuse startup if at least one tablet has an invalid replica shard.
Startup will fail as:
ERROR 2024-01-26 07:03:06,931 [shard 0:main] init - Startup failed: std::runtime_error (Detected a tablet with invalid replica shard, reducing shard count with tablet-enabled tables is not yet supported. Replace the node instead.)
Refs: #16739Fixes: #16843Closesscylladb/scylladb#17008
* github.com:scylladb/scylladb:
test/topolgy_experimental_raft: test_tablets.py: add test for resharding
test/pylib: manager[_client]: add update_cmdline()
main: refuse startup when tablet resharding is required
locator: tablets: add check_tablet_replica_shards()
`db::config` is a class, that is used in many places across the code base. When it is changed, its clients' code need to be recompiled. It represents the configuration of the database. Some fields of the configuration that describe the location of directories may be empty. In such cases `db::config::setup_directories()` function is called - it modifies the provided configuration. Such modification is not good - it is better to keep `db::config` intact.
This PR:
- extends the public interface of utils::directories class to provide required directory paths to the users
- removes 'db::config::setup_directories()' to avoid altering the fields of configuration object
- replaces usages of db::config object with utils::directories object in places that require obtaining paths to dirs
Fixes: scylladb#5626
Closesscylladb/scylladb#16787
* github.com:scylladb/scylladb:
utils/directories: make utils::directories::set an internal type
db::config: keep dir paths unchanged
cql_transport/controler: use utils::directories to get paths of dirs
service/storage_proxy: use utils::directories to get paths of dirs
api/storage_service.cc: use utils::directories to get paths of dirs
tools/scylla-sstable.cc: use utils::directories to get paths
db/commitlog: do not use db::config to get dirs
Use utils::directories to get dirs paths in replica::database
Allow utils::directories to provide paths to dirs
Clean-up of utils::directories
When a node is in the `left_token_ring` state, we don't know how
it has ended up in this state. We cannot distinguish a node that
has finished decommissioning from a node that has failed bootstrap.
The main problem it causes is that we incorrectly send the
`barrier_and_drain` command to a node that has failed
bootstrapping or replacing. We must do it for a node that has
finished decommissioning because it could still coordinate
requests. However, since we cannot distinguish nodes in the
`left_token_ring` state, we must send the command to all of them.
This issue appeared in scylladb/scylladb#16797 and this PR is
a follow-up that fixes it.
The solution is changing `left_token_ring` from a node state
to a transition state.
Fixesscylladb/scylladb#16944Closesscylladb/scylladb#17009
* github.com:scylladb/scylladb:
docs: dev: topology-over-raft: document the left_token_ring state
topology_coordinator: adjust reason string in left_token_ring handler
raft topology: make left_token_ring a transition state
topology_coordinator: rollback_current_topology_op: remove unused exclude_nodes
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for `db::read_repair_decision`,
and drop its operator<<.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17033
This allows the user of `raft::server` to cause it to create a snapshot
and truncate the Raft log (leaving no trailing entries; in the future we
may extend the API to specify number of trailing entries left if
needed). In a later commit we'll add a REST endpoint to Scylla to
trigger group 0 snapshots.
One use case for this API is to create group 0 snapshots in Scylla
deployments which upgraded to Raft in version 5.2 and started with an
empty Raft log with no snapshot at the beginning. This causes problems,
e.g. when a new node bootstraps to the cluster, it will not receive a
snapshot that would contain both schema and group 0 history, which would
then lead to inconsistent schema state and trigger assertion failures as
observed in scylladb/scylladb#16683.
In 5.4 the logic of initial group 0 setup was changed to start the Raft
log with a snapshot at index 1 (ff386e7a44)
but a problem remains with these existing deployments coming from 5.2,
we need a way to trigger a snapshot in them (other than performing 1000
arbitrary schema changes).
Another potential use case in the future would be to trigger snapshots
based on external memory pressure in tablet Raft groups (for strongly
consistent tables).
The PR adds the API to `raft::server` and a HTTP endpoint that uses it.
In a follow-up PR, we plan to modify group 0 server startup logic to automatically
call this API if it sees that no snapshot is present yet (to automatically
fix the aforementioned 5.2 deployments once they upgrade.)
Closesscylladb/scylladb#16816
* github.com:scylladb/scylladb:
raft: remove `empty()` from `fsm_output`
test: add test for manual triggering of Raft snapshots
api: add HTTP endpoint to trigger Raft snapshots
raft: server: add `trigger_snapshot` API
raft: server: track last persisted snapshot descriptor index
raft: server: framework for handling server requests
raft: server: inline `poll_fsm_output`
raft: server: fix indentation
raft: server: move `io_fiber`'s processing of `batch` to a separate function
raft: move `poll_output()` from `fsm` to `server`
raft: move `_sm_events` from `fsm` to `server`
raft: fsm: remove constructor used only in tests
raft: fsm: move trace message from `poll_output` to `has_output`
raft: fsm: extract `has_output()`
raft: pass `max_trailing_entries` through `fsm_output` to `store_snapshot_descriptor`
raft: server: pass `*_aborted` to `set_exception` call
these words are either
* shortened words: strategy => strat, read_from_primary => fro
* or acronyms: node_or_data => nd
before we rename them with better names, let's just add them to the
ignore word list.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17002
Previously, utils::directories::set could have been used by
clients of utils::directories class to provide dirs for creation.
Due to moving the responsibility for providing paths of dirs from
db::config to utils::directories, such usage is no longer the case.
This change:
- defines utils::directories::set in utils/directories.cc to disallow
its usage by the clients of utils::directories
- makes utils::directories::create_and_verify() member function
private; now it is used only by the internals of the class
- introduces a new member function to utils::directories called
create_and_verify_sharded_directory() to limit the functionality
provided to clients
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change is intended to ensure, that
db::config fields related to directories
are not changed. To achieve that a member
function called setup_directories() is
removed.
The responsibility for directories paths
has been moved to utils::directories,
which may generate default paths if the
configuration does not provide a specific
value.
Fixes: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change replaces usage of db::config with
usage of utils::directories to get paths of
directories in cql_transport/controler.
Refs: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change replaces usage of db::config with
usage of utils::directories to get paths of
directories in service/storage_proxy.
Refs: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change replaces usage of db::config with usage
of utils::directories in api/storage_service.cc in
order to get the paths of directories.
Refs: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change replaces usage of db::config with usage
of utils::directories to get paths of directories
in tools/scylla-sstable.cc.
Refs: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change removes usage of db::config to
get path of commitlog_directory. Instead, it
introduces a new parameter to directly pass
the path to db::commitlog::config::from_db_config().
Refs: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change replaces the usage of db::config with
usage of utils::directories to get dirs paths in
replica::database class.
Moreover, it adjusts tests that require construction
of replica::database - its constructor has been
changed to accept utils::directories object.
Refs: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change extends utils::directories class in
the following way:
- adds new member variables that correspond to
fields from db::config that describe paths
of directories
- introduces a public interface to retrieve the
values of the new members
- allows construction of utils::directories
object based on db::config to setup internal
member variables related to paths to dirs
The new members of utils::directories are overriden
when the provided values are empty. The way of setting
paths is taken from db::config.
To ensure that the new logic works correctly
`utils_directories_test` has been created.
Refs: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change is intended to clean-up files in which
utils::directories class is defined to ease further
extensions.
The preparation consists of:
- removal of `using namespace` from directories.hh to
avoid namespace pollution in files, that include this
header
- explicit inclusion of headers, that were missing or
were implicitly included to ensure that directories.hh
is self-sufficient
- defining directories::set class outside of its parent
to improve readability
Refs: scylladb#5626
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Similar to the existing update_config(). Updates the command-line
arguments of the specified nodes, merging the new options into the
existing ones. Needs a restart to take effect.
We do not support tablet resharding yet. All tablet-related code assumes
that the (host_id, shard) tablet replica is always valid. Violating this
leads to undefined behaviour: errors in the tablet load balancer and
potential crashes.
Avoid this by refusing to start if the need to resharding is detected.
Be as lenient as possible: check all tablets with a replica on this node,
and only refuse startup if at least one tablet has an invalid replica
shard.
Startup will fail as:
ERROR 2024-01-26 07:03:06,931 [shard 0:main] init - Startup failed: std::runtime_error (Detected a tablet with invalid replica shard, reducing shard count with tablet-enabled tables is not yet supported. Replace the node instead.)
Checks that all tablets with a replica on the this node, have a valid
replica shard (< smp::count).
Will be used to check whether the node can start-up with the current
shard-count.
In one of the previous patches, we changed the `left_token_ring`
state from a node state to a transition state. We document it
in this patch. The node state wasn't documented, so there is
nothing to remove.
A node can be in the `left_token_ring` state after:
- a finished decommission,
- a failed bootstrap,
- a failed replace.
When a node is in the `left_token_ring` state, we don't know how
it has ended up in this state. We cannot distinguish a node that
has finished decommissioning from a node that has failed bootstrap.
The main problem it causes is that we incorrectly send the
`barrier_and_drain` command to a node that has failed
bootstrapping or replacing. We must do it for a node that has
finished decommissioning because it could still coordinate
requests. However, since we cannot distinguish nodes in the
`left_token_ring` state, we must send the command to all of them.
This issue appeared in scylladb/scylladb#16797 and this patch is
a follow-up that fixes it.
The solution is changing `left_token_ring` from a node state
to a transition state.
Regarding implementation, most of the changes are simple
refactoring. The less obvious are:
- Before this patch, in `system_keyspace::left_topology_state`, we
had to keep the ignored nodes' IDs for replace to ensure that the
replacing node will have access to it after moving to the
`left_token_ring` state, which happens when replace fails. We
don't need this workaround anymore. When we enter the new
`left_token_ring` transition state, the new node will still be in
the `decommissioning` state, so it won't lose its request param.
- Before this patch, a decommissioning node lost its tokens
while moving to the `left_token_ring` state. After the patch, it
loses tokens while still being in the `decommissioning` state. We
ensure that all `decommissioning` handlers correctly handle a node
that lost its tokens.
Moving the `left_token_ring` handler from `handle_node_transition`
to `handle_topology_transition` created a large diff. There are
only three changes:
- adding `auto node = get_node_to_work_on(std::move(guard));`,
- adding `builder.del_transition_state()`,
- changing error logged when `global_token_metadata_barrier` fails.
The `exclude_nodes` variable was unused, but it wasn't a bug.
The `left_token_ring` and `rollback_to_normal` handlers correctly
compute excluded nodes on their own.
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for dht::decorated_key and
repair_sync_boundary.
please note, before this change, repair_sync_boundary was using
the operator<< based formatter of `dht::decorated_key`, so we are
updating both of them in a single commit.
because we still use the homebrew generic formatter of vector<>
in to format vector<repair_sync_boundary> and vector<dht::decorated_key>,
so their operator<< are preserved.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16994
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
cassandra::ConsistencyLevel::type.
please note, the operator<< for `cassandra::ConsistencyLevel::type`
is generated using `thrift` command line tool, which does not emit
specialization for fmt::formatter yet, so we need to use
`fmt::ostream_formatter` to implement the formatter for this type.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17013
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for `db::replay_position`,
and drop its operator<<.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17014
This series does a similar change to Alternator as was done recently to CQL:
1. If the "tablets" experimental feature in enabled, new Alternator tables will use tablets automatically, without requiring an option on each new table. A default choice of initial_tablets is used. These choices can still be overridden per-table if the user wants to.
3. In particular, all test/alternator tests will also automatically run with tablets enabled
4. However, some tests will fail on tablets because they use features that haven't yet been implemented with tablets - namely Alternator Streams (Refs #16317) and Alternator TTL (Refs #16567). These tests will - until those features are implemented with tablets - continue to be run without tablets.
5. An option is added to the test/alternator/run to allow developers to manually run tests without tablets enabled, if they wish to (this option will be useful in the short term, and can be removed later).
Fixes#16355Closesscylladb/scylladb#16900
* github.com:scylladb/scylladb:
test/alternator: add "--vnodes" option to run script
alternator: use tablets by default, if available
test/alternator: run some tests without tablets
in general, user should save output of `DESC foo.bar` to a file,
and pass the path to the file as the argument of `--schema-file`
option of `scylla sstable` commands. the CQL statement generated
from `DESC` command always include the keyspace name of the table.
but in case user create the CQL statement manually and misses
the keyspace name. he/she would have following assertion failure
```
scylla: cql3/statements/cf_statement.cc:49: virtual const sstring &cql3::statements::raw::cf_statement::keyspace() const: Assertion `_cf_name->has_keyspace()' failed.
```
this is not a great user experience.
so, in this change, we check for the existence of keyspace before
looking it up. and throw a runtime error with a better error mesage.
so when the CQL statement does not have the keyspace name, the new
error message would look like:
```
error processing arguments: could not load schema via schema-file: std::runtime_error (tools::do_load_schemas(): CQL statement does not have keyspace specified)
```
since this check is only performed by `do_load_schemas()` which
care about the existence of keyspace, and it only expects the
CQL statement to create table/keyspace/type, we just override the
new `has_keyspace()` method of the corresponding types derived
from `cf_statement`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16981
In this commit, we postpone the start-up
of the hint manager until we obtain information
about other nodes in the cluster.
When we start the hint managers, one of the
things that happen is creating endpoint
managers -- structures managed by
db::hints::manager. Whether we create
an instance of endpoint manager depends on
the value returned by host_filter::can_hint_for,
which, in turn, may depend on the current state
of locator::topology.
If locator::topology is incomplete, some endpoint
managers may not be started even though they
should (because the target node IS part of the
cluster and we SHOULD send hints to it if there
are some).
The situation like that can happen because we
start the hint managers too early. This commit
aims to solve that problem. We only start
the hint managers when we've gathered information
about the other nodes in the cluster and created
the locator::topology using it.
Hinted Handoff is not negatively affected by these
changes since in between the previous point of
starting the hint managers and the current one,
all of the mutations performed by
service::storage_proxy target the local node, so
no hints would need to be generated anyway.
Fixesscylladb/scylladb#11870Closesscylladb/scylladb#16511
Anonymous namespace implies internal linkage for its members.
When it is defined in a header, then each translation unit,
which includes such header defines its own unique instance
of members of the unnamed namespace that are ODR-used within
that translation unit.
This can lead to unexpected results including code bloat
or undefined behavior due to ODR violations.
This PR removes unnamed namespaces from header files.
References:
- [CppCoreGuidelines: "SF.21: Don’t use an unnamed (anonymous) namespace in a header"](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#sf21-dont-use-an-unnamed-anonymous-namespace-in-a-header)
- [SEI CERT C++: "DCL59-CPP. Do not define an unnamed namespace in a header file"](https://wiki.sei.cmu.edu/confluence/display/cplusplus/DCL59-CPP.+Do+not+define+an+unnamed+namespace+in+a+header+file)
Closesscylladb/scylladb#16998
* github.com:scylladb/scylladb:
utils/config_file_impl.hh: remove anonymous namespace from header
mutation/mutation.hh: remove anonymous namespace from header
this RESTful API is a scylla specific extension and is only used
by scylla-nodetool. currently, the java-based nodetool does not use
it at all, so mark it with "scylla_only".
one can verify this change with:
```
pytest --mode=debug --nodetool=cassandra test_cleanup.py::test_cleanup
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17001
we should allow user to run nodetool tests without `test.py`. but there
are good chance that the host could be reused by multiple tests or
multiple users who could be using port 12345. by randomizing the IP and
port, they would have better chance to complete the test without running
into used port problem.
Closesscylladb/scylladb#16996
* github.com:scylladb/scylladb:
test/nodetool: return a randomized address if not running with unshare
test/nodetool: return an address from loopback_network fixture
In this mode, the node is not reachable from the outside, i.e.
* it refuses all incoming RPC connections,
* it does not join the cluster, thus
* all group0 operations are disabled (e.g. schema changes),
* all cluster-wide operations are disabled for this node (e.g. repair),
* other nodes see this node as dead,
* cannot read or write data from/to other nodes,
* it does not open Alternator and Redis transport ports and the TCP CQL port.
The only way to make CQL queries is to use the maintenance socket. The node serves only local data.
To start the node in maintenance mode, use the `--maintenance-mode true` flag or set `maintenance_mode: true` in the configuration file.
REST API works as usual, but some routes are disabled:
* authorization_cache
* failure_detector
* hinted_hand_off_manager
This PR also updates the maintenance socket documentation:
* add cqlsh usage to the documentation
* update the documentation to use `WhiteListRoundRobinPolicy`
Fixes#5489.
Closesscylladb/scylladb#15346
* github.com:scylladb/scylladb:
test.py: add test for maintenance mode
test.py: generalize usage of cluster_con
test.py: when connecting to node in maintenance mode use maintenance socket
docs: add maintenance mode documentation
main: add maintenance mode
main: move some REST routes initialization before joining group0
message_service: add sanity check that rpc connections are not created in the maintenance mode
raft_group0_client: disable group0 operations in the maintenance mode
service/storage_service: add start_maintenance_mode() method
storage_service: add MAINTENANCE option to mode enum
service/maintenance_mode: add maintenance_mode_enabled bool class
service/maintenance_mode: move maintenance_socket_enabled definition to seperate file
db/config: add maintenance mode flag
docs: add cqlsh usage to maintenance socket documentation
docs: update maintenance socket documentation to use WhiteListRoundRobinPolicy