This commit enables publishing documentation
from branch-5.4. The docs will be published
as UNSTABLE (the warning about version 5.4
being unstable will be displayed).
Closesscylladb/scylladb#15762
This PR improves the way of how handling failures is documented and accessible to the user.
- The Handling Failures section is moved from Raft to Troubleshooting.
- Two new topics about failure are added to Troubleshooting with a link to the Handling Failures page (Failure to Add, Remove, or Replace a Node, Failure to Update the Schema).
- A note is added to the add/remove/replace node procedures to indicate that a quorum is required.
See individual commits for more details.
Fixes https://github.com/scylladb/scylladb/issues/13149Closesscylladb/scylladb#15628
* github.com:scylladb/scylladb:
doc: add a note about Raft
doc: add the quorum requirement to procedures
doc: add more failure info to Troubleshooting
doc: move Handling Failures to Troubleshooting
This commit:
- Removes upgrade guides for versions older than 5.0.
The oldest one is from version 4.6 to 5.0.
- Adds the redirections for the removed pages.
Closesscylladb/scylladb#15709
This commit removes the information about the recommended way of upgrading ScyllaDB images - by updating ScyllaDB and OS packages in one step. This upgrade procedure is not supported (it was implemented, but then reverted).
The scope of this commit:
- Remove the information from the 5.0-to.-5.1 upgrade guide and replace with general info.
- Remove the information from the 4.6-to.-5.1 upgrade guide and replace with general info.
- Remove the information from the 5.x.y-to.-5.x.z upgrade guide and replace with general info.
- Remove the following files as no longer necessary (they were only created to incorporate the (invalid) information about image upgrade into the upgrade guides.
/upgrade/_common/upgrade-image-opensource.rst
/upgrade/_common/upgrade-guide-v5-patch-ubuntu-and-debian-p1.rst
/upgrade/_common/upgrade-guide-v5-patch-ubuntu-and-debian-p2.rst
/upgrade/_common/upgrade-guide-v5-patch-ubuntu-and-debian.rst
This PR is a continuation of https://github.com/scylladb/scylladb/pull/15739.
**This PR must be backported to branch-5.2 and branch-5.1.**
Closesscylladb/scylladb#15740
* github.com:scylladb/scylladb:
doc: remove wrong image upgrade info (5.x.y-to-5.x.y)
doc: remove wrong image upgrade info (4.6-to-5.0)
doc: remove wrong image upgrade info (5.0-to-5.1)
This commit removes the information about
the recommended way of upgrading ScyllaDB
images - by updating ScyllaDB and OS packages
in one step.
This upgrade procedure is not supported
(it was implemented, but then reverted).
The scope of this commit:
- Remove the information from the 5.1-to.-5.2
upgrade guide and replace with general info.
- Remove the information from the Image Upgrade
page.
- Remove outdated info (about previous releases)
from the Image Upgrade page.
- Rename "AMI Upgrade" as "Image Upgrade"
in the page tree.
Refs: https://github.com/scylladb/scylladb/issues/15733Closesscylladb/scylladb#15739
This commit removes the invalid information about
the recommended way of upgrading ScyllaDB
images (by updating ScyllaDB and OS packages
in one step) from the 5.x.y-to-5.x.y upgrade guide.
This upgrade procedure is not supported (it was
implemented, but then reverted).
Refs https://github.com/scylladb/scylladb/issues/15733
In addition, the following files are removed as no longer
necessary (they were only created to incorporate the (invalid)
information about image upgrade into the upgrade guides.
/upgrade/_common/upgrade-image-opensource.rst
/upgrade/_common/upgrade-guide-v5-patch-ubuntu-and-debian-p1.rst
/upgrade/_common/upgrade-guide-v5-patch-ubuntu-and-debian-p2.rst
/upgrade/_common/upgrade-guide-v5-patch-ubuntu-and-debian.rst
This commit removes the invalid information about
the recommended way of upgrading ScyllaDB
images (by updating ScyllaDB and OS packages
in one step) from the 4.6-to-5.0 upgrade guide.
This upgrade procedure is not supported (it was
implemented, but then reverted).
Refs https://github.com/scylladb/scylladb/issues/15733
This commit removes the invalid information about
the recommended way of upgrading ScyllaDB
images (by updating ScyllaDB and OS packages
in one step) from the 5.0-to-5.1 upgrade guide.
This upgrade procedure is not supported (it was
implemented, but then reverted).
Refs https://github.com/scylladb/scylladb/issues/15733
This commit adds a note to specify
that the information on the Handling
Failures page only refers to clusters
with Raft enabled.
Also, the comment is included to remove
the note in future versions.
This commit adds new pages with reference to
Handling Node Failures to Troubleshooting.
The pages are:
- Failure to Add, Remove, or Replace a Node
(in the Cluster section)
- Failure to Update the Schema
(in the Data Modeling section)
This commit moves the content of the Handling
Failures section on the Raft page to the new
Handling Node Failures page in the Troubleshooting
section.
Background:
When Raft was experimental, the Handling Failures
section was only applicable to clusters
where Raft was explicitly enabled.
Now that Raft is the default, the information
about handling failures is relevant to
all users.
This PR updates the information on the ScyllaDB vs. Cassandra compatibility. It covers the information from https://github.com/scylladb/scylladb/issues/15563, but there could more more to fix.
@tzach @scylladb/scylla-maint Please review this PR and the page covering our compatibility with Cassandra and let me know if you see anything else that needs to be fixed.
I've added the updates with separate commits in case you want to backport some info (e.g. about AzureSnitch).
Fixes https://github.com/scylladb/scylladb/issues/15563Closesscylladb/scylladb#15582
* github.com:scylladb/scylladb:
doc: deprecate Thrift in Cassandra compatibility
doc: remove row/key cache from Cassandra compatibility
doc: add AzureSnitch to Cassandra compatibility
The sentence says that if table args are provided compaction will run on
all tables. This is ambigous, so the sentence is rephrased to specify
that compaction will only run on the provided tables.
Closesscylladb/scylladb#15394
This PR implements a new procedure for joining nodes to group 0, based on the description in the "Cluster features on Raft (v2)" document. This is a continuation of the previous PRs related to cluster features on raft (https://github.com/scylladb/scylladb/pull/14722, https://github.com/scylladb/scylladb/pull/14232), and the last piece necessary to replace cluster feature checks in gossip.
Current implementation relies on gossip shadow round to fetch the set of enabled features, determine whether the node supports all of the enabled features, and joins only if it is safe. As we are moving management of cluster features to group 0, we encounter a problem: the contents of group 0 itself may depend on features, hence it is not safe to join it unless we perform the feature check which depends on information in group 0. Hence, we have a dependency cycle.
In order to solve this problem, the algorithm for joining group 0 is modified, and verification of features and other parameters is offloaded to an existing node in group 0. Instead of directly asking the discovery leader to unconditionally add the node to the configuration with `GROUP0_MODIFY_CONFIG`, two different RPCs are added: `JOIN_NODE_REQUEST` and `JOIN_NODE_RESPONSE`. The main idea is as follows:
- The new node sends `JOIN_NODE_REQUEST` to the discovery leader. It sends a bunch of information describing the node, including supported cluster features. The discovery leader verifies some of the parameters and adds the node in the `none` state to `system.topology`.
- The topology coordinator picks up the request for the node to be joined (i.e. the node in `none` state), verifies its properties - including cluster features - and then:
- If the node is accepted, the coordinator transitions it to `boostrap`/`replace` state and transitions the topology to `join_group0` state. The node is added to group 0 and then `JOIN_NODE_RESPONSE` is sent to it with information that the node was accepted.
- Otherwise, the node is moved to `left` state, told by the coordinator via `JOIN_NODE_RESPONSE` that it was rejected and it shuts down.
The procedure is not retryable - if a node fails to do it from start to end and crashes in between, it will not be allowed to retry it with the same host_id - `JOIN_NODE_REQUEST` will fail. The data directory must be cleared before attempting to add it again (so that a new host_id is generated).
More details about the procedure and the RPC are described in `topology-over-raft.md`.
Fixes: #15152Closesscylladb/scylladb#15196
* github.com:scylladb/scylladb:
tests: mark test_blocked_bootstrap as skipped
storage_service: do not check features in shadow round
storage_service: remove raft_{boostrap,replace}
topology_coordinator: relax the check in enable_features
raft_group0: insert replaced node info before server setup
storage_service: use join node rpc to join the cluster
topology_coordinator: handle joining nodes
topology_state_machine: add join_group0 state
storage_service: add join node RPC handlers
raft: expose current_leader in raft::server
storage_service: extract wait_for_live_nodes_timeout constant
raft_group0: abstract out node joining handshake
storage_service: pass raft_topology_change_enabled on rpc init
rpc: add new join handshake verbs
docs: document the new join procedure
topology_state_machine: add supported_features to replica_state
storage_service: check destination host ID in raft verbs
group_state_machine: take reference to raft address map
raft_group0: expose joined_group0
This commit adds AzureSnitch (together with a link
to the AzureSnitch description) to the Cassandra
compatibility page.
In addition, the Sniches table is fixed.
This PR replaces a link to a section of the ScyllaDB website with little information about ScyllaDB vs. Cassandra with a link to
a documentation section where Cassandra compatibility is covered in detail.
In addition, it removes outdated or irrelevant information about versions from the Cassandra compatibility page.
Now that the documentation is versioned, we shouldn't add such information to the content.
Fixes https://github.com/scylladb/scylla-enterprise/issues/3454Closesscylladb/scylladb#15562
* github.com:scylladb/scylladb:
doc: remove outdated/irrelevant version info
doc: replace the link to Cassandra compatibility
This commit removes outdated or irrelevant
information about versions from the Cassandra
compatibility page.
Now that the documentation is versioned, we
shouldn't add such information to the content.
Some "Additional Information" section headings
appear on the page tree in the left sidebar
because of their incorrect underline.
This commit fixes the problem by replacing title
underline with section underline.
Closesscylladb/scylladb#15550
This commit replaces a link to a section of
the ScyllaDB website with little information
about ScyllaDB vs. Cassandra with a link to
a documentation section where Cassandra
compatibility is covered in detail.
in this series, we rename s3 credential related variable and option names so they are more consistent with AWS's official document. this should help with the maintainability.
Closesscylladb/scylladb#15529
* github.com:scylladb/scylladb:
main.cc: rename aws option
utils/s3/creds: rename aws_config member variables
So far generic describe (`DESC <name>`) followed Cassandra implementation and it only described keyspace/table/view/index.
This commit adds UDT/UDF/UDA to generic describe.
Fixes: #14170Closesscylladb/scylladb#14334
* github.com:scylladb/scylladb:
docs:cql: add information about generic describe
cql-pytest:test_describe: add test for generic UDT/UDF/UDA desc
cql3:statements:describe_statement: include UDT/UDF/UDA in generic describe
- s/aws_key/aws_access_key_id/
- s/aws_secret/aws_secret_access_key/
- s/aws_token/aws_session_token/
rename them to more popular names, these names are also used by
boto's API. this should improve the readability and consistency.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
```console
$ journalctl --user start scylla-server -xe
Failed to add match 'start': Invalid argument
```
`journalctl` expects a match filter as its positional arguments.
but apparently, start is not a filter. we could use `--unit`
to specify a unit though, like:
```console
$ journalctl --user --unit scylla-server.service -xe
```
but it would flood the stdout with the logging messages printed
by scylla. this is not what a typical user expects. probably a better
use experience can be achieved using
```console
$ systemctl --user status scylla-server
```
which also print the current status reported by the service, and
the command line arguments. they would be more informative in typical
use cases.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15390
in hope to lower the bar to testing object store.
* add language specifier for better readability of the document.
to highlight the config with YAML syntax
* add more specific comment on the AWS related settings
* explain that endpoint should match in the CREATE KEYSPACE
statement and the one defined by the YAML configuration.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15433
This commit adds the information that ScyllaDB Enterprise
supports FIPS-compliant systems in versions
2023.1.1 and later.
The information is excluded from OSS docs with
the "only" directive, because the support was not
added in OSS.
This commit must be backported to branch-5.2 so that
it appears on version 2023.1 in the Enterprise docs.
Closes#15415
cassandra-stress connects to "localhost" by default. that's exactly the
use case when we install scylla using the unified installer. so do not
suggest "-node xxx" option. the "xxx" part is but confusing.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15411
Load balancer will recognize decommissioning nodes and will
move tablet replicas away from such nodes with highest priority.
Topology changes have now an extra step called "tablet draining" which
calls the load balancer. The step will execute tablet migration track
as long as there are nodes which require draining. It will not do regular
load balancing.
If load balancer is unable to find new tablet replicas, because RF
cannot be met or availability is at risk due to insufficient node
distribution in racks, it will throw an exception. Currently, topology
change will retry in a loop. We should make this error cause topology
change to be aborted. There is no infrastructure for
aborts yet, so this is not implemented.
Closes#15197
* github.com:scylladb/scylladb:
tablets, raft topology: Add support for decommission with tablets
tablet_allocator: Compute load sketch lazily
tablet_allocator: Set node id correctly
tablet_allocator: Make migration_plan a class
tablets: Implement cleanup step
storage_service, tablets: Prevent stale RPCs from running beyond their stage
locator: Introduce tablet_metadata_guard
locator, replica: Add a way to wait for table's effective_replication_map change
storage_service, tablets: Extract do_tablet_operation() from stream_tablet()
raft topology: Add break in the final case clause
raft topology: Fix SIGSEGV when trace-level logging is enabled
raft topology: Set node state in topology
raft topology: Always set host id in topology
Load balancer will recognize decommissioning nodes and will
move tablet replicas away from such nodes with highest priority.
Topology changes have now an extra step called "tablet draining" which
calls the load balancer. The step will execute tablet migration track
as long as there are nodes which require draining. It will not do regular
load balancing.
If load balancer is unable to find new tablet replicas, because RF
cannot be met or availability is at risk due to insufficient node
distribution in racks, it will throw an exception. Currently, topology
change will retry in a loop. We should make this error cause topology
change to be paused so that admin becomes aware of the problem and
issues an abort on the topology change. There is no infrastructure for
aborts yet, so this is not implemented.
we get the path object storage config like:
```c++
db::config::get_conf_sub("object_storage.yaml").native()
```
so, the default path should be $SCYLLA_CONF/object_storage.yaml.
in this change, it is corrected.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15406
We make the `CDC_GENERATIONS_V3` table single-partition and change the
clustering key from `range_end` to `(id, range_end)`. We also change the
type of `id` to `timeuuid` and ensure that a new generation always has
the highest `id`. These changes allow efficient clearing of obsolete CDC
generation data, which we need to prevent Raft-topology snapshots from
endlessly growing as we introduce new generations over time.
All this code is protected by an experimental feature flag. It includes
the definition of `CDC_GENERATIONS_V3`. The table is not created unless
the feature flag is enabled.
Fixes#15163Closes#15319
* github.com:scylladb/scylladb:
system_keyspace: rename cdc_generation_id_v2
system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3
cdc: generation: remove topology_description_generator
cdc: do not create uuid in make_new_generation_data
system_kayspace: make CDC_GENERATIONS_V3 single-partition
cdc: generation: introduce get_common_cdc_generation_mutations
cdc: generation: rename get_cdc_generation_mutations
should not "tar" to tar, otherwise we'd have following error:
```
tar (child): tar: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
```
as "tar" is not the compressed tarball we want to untar.
Fixes#15328
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15383