Commit Graph

1153 Commits

Author SHA1 Message Date
Piotr Dulikowski
1104f8b00f docs: document raft topology upgrade and recovery 2024-02-07 09:54:54 +01:00
David Garcia
ad1c9ae452 docs: fix logging in images extensions
Adds a missing logging import in the file scylladb_common_images extension, which prevents the enterprise build from building.

Additionally, it standardizes logging handling across the extensions and removes "ami" references in Azure and GCP extensions.

Closes scylladb/scylladb#17137
2024-02-06 13:00:37 +02:00
Botond Dénes
115ee4e1f5 Merge 'doc: remove the OSS and Enterprise Features pages' from Anna Stuchlik
This PR removes the following pages:
- ScyllaDB Open Source Features
- ScyllaDB Enterprise Features

They were outdated, incomplete, and misleading. They were also redundant, as the per-release updates are added as Release Notes.

With this update, the features listed on the removed pages are added under the common page: ScyllaDB Features.

In addition, a reference to the Enterprise-only Features section is added.

Note: No redirections are added because no file paths or URLs are changed with this PR.

Fixes https://github.com/scylladb/scylladb/issues/13485

Refs https://github.com/scylladb/scylladb/issues/16496

(nobackport)

Closes scylladb/scylladb#17150

* github.com:scylladb/scylladb:
  Update docs/using-scylla/features.rst
  doc: remove the OSS and Enterprise Features pages
2024-02-06 08:17:18 +02:00
Botond Dénes
edb983d165 Merge 'doc: add the 5.4-to-2024.1 upgrade guide' from Anna Stuchlik
This PR:
- Adds the upgrade guide from ScyllaDB Open Source 5.4 to ScyllaDB Enterprise 2024.1. Note: The need to include the "Restore system tables" step in rollback has been confirmed; see https://github.com/scylladb/scylladb/issues/11907#issuecomment-1842657959.
- Removes the 5.1-to-2022.2 upgrade guide (unsupported versions).

Fixes https://github.com/scylladb/scylladb/issues/16445

Closes scylladb/scylladb#16887

* github.com:scylladb/scylladb:
  doc: fix the OSS version number
  doc: metric updates between 2024.1. and 5.4
  doc: remove the 5.1-to-2022.2 upgrade guide
  doc: add the 5.4-to-2024.1 upgrade guide
2024-02-06 08:16:05 +02:00
Anna Stuchlik
d6723134ab doc: fix the OSS version number
Replace "5.2" with "5.4", as this is
the 5.4-to-2024.1 upgrade guide.
2024-02-05 21:10:50 +01:00
Kamil Braun
968d1e3e78 Merge 'raft topology: make rollback_to_normal a transition state' from Patryk Jędrzejczak
After changing `left_token_ring` from a node state to a transition
state in scylladb/scylladb#17009, we do the same for
`rollback_to_normal`. `rollback_to_normal` was created as a node
state because `left_token_ring` was a node state.

This change will allow us to distinguish a failed removenode from
a failed decommission in the `rollback_to_normal` handler.
Currently, we use the same logic for both of them, so it's not
required. However, this might change, as it has happened with the
decommission and the failed bootstrap/replace in the
`left_token_ring` state (scylladb/scylladb#16797). We are making
this change now because it would be much harder after branching.

Fixes scylladb/scylladb#17032

Closes scylladb/scylladb#17136

* github.com:scylladb/scylladb:
  docs: dev: topology-over-raft: align indentation
  docs: dev: topology-over-raft: document the rollback_to_normal state
  topology_coordinator: improve logs in rollback_to_normal handler
  raft topology: make rollback_to_normal a transition state
2024-02-05 16:30:20 +01:00
Anna Stuchlik
6d6c400b77 doc: metric updates between 2024.1. and 5.4
This commit adds the information about
metrics updates between these two versions.

Fixes https://github.com/scylladb/scylladb/issues/16446
2024-02-05 16:24:16 +01:00
Anna Stuchlik
1e9c7ab6d1 Update docs/using-scylla/features.rst
Co-authored-by: Tzach Livyatan <tzach.livyatan@gmail.com>
2024-02-05 14:44:31 +01:00
Anna Stuchlik
f7afa6773f doc: remove the OSS and Enterprise Features pages
This commit removes the following pages:
- ScyllaDB Open Source Features
- ScyllaDB Enterprise Features

They were outdated, incomplete, and misleading.
They were also redundant, as the per-release
updates are added as Release Notes.

With this update, the features listed on the removed
pages are added under the common page: ScyllaDB Features.

Note: No redirections are added, because no file paths
or URLs are changed with this commit.

Fixes https://github.com/scylladb/scylladb/issues/13485

Refs https://github.com/scylladb/scylladb/issues/16496
2024-02-04 20:55:40 +01:00
Patryk Jędrzejczak
2687204c7f docs: dev: topology-over-raft: align indentation 2024-02-02 16:55:28 +01:00
Patryk Jędrzejczak
fdd3c3a280 docs: dev: topology-over-raft: document the rollback_to_normal state
In one of the previous patches, we changed the `rollback_to_normal`
state from a node state to a transition state. We document it
in this patch. The node state wasn't documented, so there is
nothing to remove.
2024-02-02 16:55:28 +01:00
Kefu Chai
792fa4441e docs: s/ontop/on top/
this misspelling is identified by codespell. ontop cannot be found
on merriam-webster, but "on top" can, see
https://www.merriam-webster.com/dictionary/on%20top, so let's
replace ontop with "on top".

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17127
2024-02-02 15:20:40 +01:00
Avi Kivity
c8397f0287 Merge 'Implement tablet splitting' from Raphael "Raph" Carvalho
The motivation for tablet resizing is that we want to keep the average tablet size reasonable, such that load rebalancing can remain efficient. Too large tablet makes migration inefficient, therefore slowing down the balancer.

If the avg size grows beyond the upper bound (split threshold), then balancer decides to split. Split spans all tablets of a table, due to power-of-two constraint.

Likewise, if the avg size decreases below the lower bound (merge threshold), then merge takes place in order to grow the avg size. Merge is not implemented yet, although this series lays foundation for it to be impĺemented later on.

A resize decision can be revoked if the avg size changes and the decision is no longer needed. For example, let's say table is being split and avg size drops below the target size (which is 50% of split threshold and 100% of merge one). That means after split, the avg size would drop below the merge threshold, causing a merge after split, which is wasteful, so it's better to just cancel the split.

Tablet metadata gains 2 new fields for managing this:
resize_type: resize decision type, can be either of "merge", "split", or "none".
resize_seq_number: a sequence number that works as the global identifier of the decision (monotonically increasing, increased by 1 on every new decision emitted by the coordinator).

A new RPC was implemented to pull stats from each table replica, such that load balancer can calculate the avg tablet size and know the "split status", for a given table. Avg size is aggregated carefully while taking RF of each DC into account (which might differ).
When a table is done splitting its storage, it loads (mirror) the resize_seq_number from tablet metadata into its local state (in another words, my split status is ready). If a table is split ready, coordinator will see that table's seq number is the same as the one in tablet metadata. Helps to distinguish stale decisions from the latest one (in case decisions are revoked and re-emited later on). Also, it's aggregated carefully, by taking the minimum among all replicas, so coordinator will only update topology when all replicas are ready.

When load balancer emits split decision, replicas will listen to need to split with a "split monitor" that is awakened once a table has replication metadata updated and detects the need for split (i.e. resize_type field is "split").
The split monitor will start splitting of compaction groups (using mechanism introduced here: 081f30d149) for the table. And once splitting work is completed, the table updates its local state as having completed split.

When coordinator pulls the split status of all replicas for a table via RPC, the balancer can see whether that table is ready for "finalizing" the decision, which is about updating tablet metadata to split each tablet into two. Once table replicas have their replication metadata updated with the new tablet count, they can update appropriately their set of compaction groups (that were previously split in the preparation step).

Fixes #16536.

Closes scylladb/scylladb#16580

* github.com:scylladb/scylladb:
  test/topology_experimental_raft: Add tablet split test
  replica: Bypass reshape on boot with tablets temporarily
  replica: Fix table::compaction_group_for_sstable() for tablet streaming
  test/topology_experimental_raft: Disable load balancer in test fencing
  replica: Remap compaction groups when tablet split is finalized
  service: Split tablet map when split request is finalized
  replica: Update table split status if completed split compaction work
  storage_service: Implement split monitor
  topology_cordinator: Generate updates for resize decisions made by balancer
  load_balancer: Introduce metrics for resize decisions
  db: Make target tablet size a live-updateable config option
  load_balancer: Implement resize decisions
  service: Wire table_resize_plan into migration_plan
  service: Introduce table_resize_plan
  tablet_mutation_builder: Add set_resize_decision()
  topology_coordinator: Wire load stats into load balancer
  storage_service: Allow tablet split and migration to happen concurrently
  topology_coordinator: Periodically retrieve table_load_stats
  locator: Introduce topology::get_datacenter_nodes()
  storage_service: Implement table_load_stats RPC
  replica: Expose table_load_stats in table
  replica: Introduce storage_group::live_disk_space_used()
  locator: Introduce table_load_stats
  tablets: Add resize decision metadata to tablet metadata
  locator: Introduce resize_decision
2024-01-31 13:59:56 +02:00
Kamil Braun
0912d2a2c6 Merge 'raft topology: make left_token_ring a transition state' from Patryk Jędrzejczak
When a node is in the `left_token_ring` state, we don't know how
it has ended up in this state. We cannot distinguish a node that
has finished decommissioning from a node that has failed bootstrap.

The main problem it causes is that we incorrectly send the
`barrier_and_drain` command to a node that has failed
bootstrapping or replacing. We must do it for a node that has
finished decommissioning because it could still coordinate
requests. However, since we cannot distinguish nodes in the
`left_token_ring` state, we must send the command to all of them.
This issue appeared in scylladb/scylladb#16797 and this PR is
a follow-up that fixes it.

The solution is changing `left_token_ring` from a node state
to a transition state.

Fixes scylladb/scylladb#16944

Closes scylladb/scylladb#17009

* github.com:scylladb/scylladb:
  docs: dev: topology-over-raft: document the left_token_ring state
  topology_coordinator: adjust reason string in left_token_ring handler
  raft topology: make left_token_ring a transition state
  topology_coordinator: rollback_current_topology_op: remove unused exclude_nodes
2024-01-29 15:29:01 +01:00
Beni Peled
8009170d3a docs: update the installation instructions with the new gpg 2024 key
Closes scylladb/scylladb#17019
2024-01-29 14:37:25 +02:00
Patryk Jędrzejczak
7c10cae6c4 docs: dev: topology-over-raft: document the left_token_ring state
In one of the previous patches, we changed the `left_token_ring`
state from a node state to a transition state. We document it
in this patch. The node state wasn't documented, so there is
nothing to remove.
2024-01-29 10:39:07 +01:00
Tzach Livyatan
06a9a925a5 Update link to sizing / pricing calc
Closes scylladb/scylladb#17015
2024-01-29 11:07:20 +02:00
Anna Stuchlik
dfa88ccc28 doc: document nodetool resetlocalschema
This adds the documentation for the nodetool resetlocalschema
command.
The syntax description is based on the description for Cassandra
and the ScyllaDB help for nodetool.

Fixes https://github.com/scylladb/scylladb/issues/16286

Closes scylladb/scylladb#16790
2024-01-28 21:09:02 +01:00
Kefu Chai
9ee6c00c84 docs: fix misspellings
these misspellings are identified by codespell.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17005
2024-01-26 13:14:21 +02:00
Kamil Braun
4f736894e1 Merge 'Add maintenance mode' from Mikołaj Grzebieluch
In this mode, the node is not reachable from the outside, i.e.
* it refuses all incoming RPC connections,
* it does not join the cluster, thus
  * all group0 operations are disabled (e.g. schema changes),
  * all cluster-wide operations are disabled for this node (e.g. repair),
  * other nodes see this node as dead,
  * cannot read or write data from/to other nodes,
* it does not open Alternator and Redis transport ports and the TCP CQL port.

The only way to make CQL queries is to use the maintenance socket. The node serves only local data.

To start the node in maintenance mode, use the `--maintenance-mode true` flag or set `maintenance_mode: true` in the configuration file.

REST API works as usual, but some routes are disabled:
* authorization_cache
* failure_detector
* hinted_hand_off_manager

This PR also updates the maintenance socket documentation:
* add cqlsh usage to the documentation
* update the documentation to use `WhiteListRoundRobinPolicy`

Fixes #5489.

Closes scylladb/scylladb#15346

* github.com:scylladb/scylladb:
  test.py: add test for maintenance mode
  test.py: generalize usage of cluster_con
  test.py: when connecting to node in maintenance mode use maintenance socket
  docs: add maintenance mode documentation
  main: add maintenance mode
  main: move some REST routes initialization before joining group0
  message_service: add sanity check that rpc connections are not created in the maintenance mode
  raft_group0_client: disable group0 operations in the maintenance mode
  service/storage_service: add start_maintenance_mode() method
  storage_service: add MAINTENANCE option to mode enum
  service/maintenance_mode: add maintenance_mode_enabled bool class
  service/maintenance_mode: move maintenance_socket_enabled definition to seperate file
  db/config: add maintenance mode flag
  docs: add cqlsh usage to maintenance socket documentation
  docs: update maintenance socket documentation to use WhiteListRoundRobinPolicy
2024-01-26 11:02:34 +01:00
Raphael S. Carvalho
7ed5b44d52 load_balancer: Implement resize decisions
This implements the ability in load balancer to emit split or merge
requests, cancel ongoing ones if they're no longer needed, and
also finalize those that are ready for the topology changes.

That's all based on average tablet size, collected by coordinator
from all nodes, and split and merge thresholds.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2024-01-25 18:36:08 -03:00
Raphael S. Carvalho
0d5ba1ee4b tablets: Add resize decision metadata to tablet metadata
The new metadata describes the ongoing resize operation (can be either
of merge, split or none) that spans tablets of a given table.
That's managed by group0, so down nodes will be able to see the
decision when they come back up and see the changes to the
metadata.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2024-01-25 18:36:06 -03:00
Mikołaj Grzebieluch
9c07a189e8 docs: add maintenance mode documentation 2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch
81ef9fc91e docs: add cqlsh usage to maintenance socket documentation
After https://github.com/scylladb/scylla-cqlsh/pull/67, the user can use
cqlsh to connect to the node by maintenance socket.
2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch
2c34d9fcd8 docs: update maintenance socket documentation to use WhiteListRoundRobinPolicy
After https://github.com/scylladb/python-driver/pull/287, the user can use
WhiteListRoundRobinPolicy to connect to the node by maintenance socket.
2024-01-25 14:52:24 +01:00
Avi Kivity
69d597075a Merge 'tablets: Add support for removenode and replace handling' from Tomasz Grabiec
New tablet replicas are allocated and rebuilt synchronously with node
operations. They are safely rebuilt from all existing replicas.
The list of ignored nodes passed to node operations is respected.

Tablet scheduler is responsible for scheduling tablet rebuilding transition which
changes the replicas set. The infrastructure for handling decommission
in tablet scheduler is reused for this.

Scheduling is done incrementally, respecting per-shard load
limits. Rebuilding transitions are recognized by load calculation to
affect all tablet replicas.

New kind of tablet transition is introduced called "rebuild" which
adds new tablet replica and rebuilds it from existing replicas. Other
than that, the transition goes through the same stages as regular
migration to ensure safe synchronization with request coordinators.

In this PR we simply stream from all tablet replicas. Later we should
switch to calling repair to avoid sending excessive amounts of data.

Fixes https://github.com/scylladb/scylladb/issues/16690.

Closes scylladb/scylladb#16894

* github.com:scylladb/scylladb:
  tests: tablets: Add tests for removenode and replace
  tablets: Add support for removenode and replace handling
  topology_coordinator: tablets: Do not fail in a tight loop
  topology_coordinator: tablets: Avoid warnings about ignored failured future
  storage_service, topology: Track excluded state in locator::topology
  raft topology: Introduce param-less topology::get_excluded_nodes()
  raft topology: Move get_excluded_nodes() to topology
  tablets: load_balancer: Generalize load tracking
  tablets: Introduce get_migration_streaming_info() which works on migration request
  tablets: Move migration_to_transition_info() to tablets.hh
  tablets: Extract get_new_replicas() which works on migraiton request
  tablets: Move tablet_migration_info to tablets.hh
  tablets: Store transition kind per tablet
2024-01-25 14:49:43 +02:00
Botond Dénes
7bb3ed7f23 docs/operating-scylla: scylla-sstable.rst: fix checksum list
Add empty line before list of different checksums in
validate-checksums's description. Otherwise the list is not rendered.

Closes scylladb/scylladb#16401
2024-01-24 16:34:13 +01:00
Avi Kivity
4a57b67634 docs: add a rough diagram of module interaction
It is incomplete and maybe inaccurate, but it is a start.

Closes scylladb/scylladb#16903
2024-01-23 18:08:48 +02:00
David Garcia
77822fc51d chore: add azure and gcp images extensions
Closes scylladb/scylladb#16942
2024-01-23 16:06:40 +02:00
Anna Stuchlik
9076a944c5 doc: improve the ScyllaDB for Developers page
This commit improves the developer-oriented section
of the core documentation:

- Added links to the developer sections in the new
  Get Started guide (Develop with ScyllaDB and
  Tutorials and Example Projects) for ease of access.

- Replaced the outdated Learn to Use ScyllaDB page with
  a link to the up-to-date page in the Get Started guide.
  This involves removing the learn.rst file and adding
  an appropriate redirection.

- Removed the Apache Copyrights, as this page does not
  need it.

- Removed the Features panel box as there was only one
  feature listed, which looked weird. Also, we are in
  the process of removing the Features section.

Closes scylladb/scylladb#16800
2024-01-23 10:06:31 +02:00
Tomasz Grabiec
4a06ffb43c tablets: Store transition kind per tablet
Will be used to distinguish regular migration from rebuild, repair and
RF change.
2024-01-23 01:12:57 +01:00
David Garcia
f3eeba8cc6 docs: parse config.cc properties as rst text
This enhancement formats descriptions in config.cc using the standard markup language reStructuredText (RST).

By doing so, it improves the rendering of these descriptions in the documentation, allowing you to use various directives like admonitions, code blocks, ordered lists, and more.

Closes scylladb/scylladb#16311
2024-01-22 16:40:18 +02:00
Anna Stuchlik
a462b914cb doc: add 2024.1 to the OSS vs. Enterprise matrix
This commit adds the information that
ScyllaDB Enterprise 2024.1 is based
on ScyllaDB Open Source 5.4
to the OSS vs. Enterprise matrix.

Closes scylladb/scylladb#16880
2024-01-22 09:25:08 +02:00
Avi Kivity
3092e3a5dc Merge 'doc: improvements to the Create Cluster page' from Anna Stuchlik
This PR:
- Removes the redundant information about previous versions from the Create Cluster page.
- Fixes language mistakes on that page, and replaces "Scylla" with "ScyllaDB".

(nobackport)

Closes scylladb/scylladb#16885

* github.com:scylladb/scylladb:
  doc: fix the language on the Create Cluster page
  doc: remove reduntant info about old versions
2024-01-21 18:18:32 +02:00
Anna Stuchlik
652cf1fa70 doc: remove the 5.1-to-2022.2 upgrade guide
This commit removes the 5.1-to-2022.2 upgrade
guide - the upgrade guide for versions we
no longer support.
We should remove it while adding the 5.4-to-2024.1
upgrade guide (the previous commit).
2024-01-19 18:33:08 +01:00
Anna Stuchlik
3c17fca363 doc: add the 5.4-to-2024.1 upgrade guide
This commit adds the upgrade guide from
ScyllaDB Open Source 5.4 to ScyllaDB
Enterprise 2024.1.

The need to include the "Restore system tables"
step in rollback has been confirmed; see
https://github.com/scylladb/scylladb/issues/11907#issuecomment-1842657959

Fixes https://github.com/scylladb/scylladb/issues/16445
2024-01-19 18:23:37 +01:00
Anna Stuchlik
d345a893d6 doc: fix the language on the Create Cluster page
This commit fixes language mistakes on
the Create Cluster page, and replaces
"Scylla" with "ScyllaDB".
2024-01-19 17:21:12 +01:00
Anna Stuchlik
af669dd7ae doc: remove reduntant info about old versions
This commit removes the information about
old versions, which is reduntant in the next
upcoming version.
2024-01-19 17:06:34 +01:00
Anna Stuchlik
b1ba904c49 doc: remove upgrade for unsupported versions
This commit removes the upgrade guides
from ScyllaDB Open Source to Enterprise
for versions we no longer support.

In addition, it removes a link to
one of the removed pages from
the Troubleshooting section (the link is
redundant).

Closes scylladb/scylladb#16249
2024-01-19 15:59:35 +02:00
Eliran Sinvani
32d8dadf1a Add code coverage documentation
Add `docs/dev/code-coverage.md` with explanations about how to work with
the different tools added for coverage reporting and cli options added
to `configure.py` and `test.py`

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2024-01-18 11:11:34 +02:00
David Garcia
f555a2cb05 docs: dynamic include based on flag
docs: extend include options

Closes scylladb/scylladb#16753
2024-01-17 09:33:40 +02:00
Tomasz Grabiec
49026dc319 Merge 'Turn on tablets on keyspace by default when the feature is enabled' from Pavel Emelyanov
To enable tablets replication one needs to turn on the (experimental) feature and specify the `initial_tablets: N` option when creating a keyspace. We want tablets to become default in the future and allow users to explicitly opt it out if they want to.

This PR solves this by changing the CREATE KEYSPACE syntax wrt tablets options. Now there's a new TABLETS options map and the usage is

* `CREATE KEYSPACE ...` will turn tablets on or off based on cluster feature being enabled/disabled
* `CREATE KEYSPACE ... WITH TABLETS = { 'enabled': false }` will turn tablets off regardless of what
* `CREATE KEYSPACE ... WITH TABLETS = { 'enabled': true }` will try to enable tablets with default configuration
* `CREATE KEYSPACE ... WITH TABLETS = { 'initial': <int> }` is now the replacement for `REPLICATION = { ... 'initial_tablets': <int> }` thing

fixes: #16319

Closes scylladb/scylladb#16364

* github.com:scylladb/scylladb:
  code: Enable tablets if cluster feature is enabled
  test: Turn off tablets feature by default
  test: Move test_tablet_drain_failure_during_decommission to another suite
  test/tablets: Enable tables for real on test keyspace
  test/tablets: Make timestamp local
  cql3: Add feature service to as_ks_metadata_update()
  cql3: Add feature service to ks_prop_defs::as_ks_metadata()
  cql3: Add feature service to get_keyspace_metadata()
  cql: Add tablets on/off switch to CREATE KEYSPACE
  cql: Move initial_tablets from REPLICATION to TABLETS in DDL
  network_topology_strategy: Estimate initial_tablets if 0 is set
2024-01-16 00:15:10 +01:00
Anna Stuchlik
af1405e517 doc: remove support for CentOS 7
This commit removes support for CentOS 7
from the docs.

The change applies to version 5.4,so it
must be backported to branch-5.4.

Refs https://github.com/scylladb/scylla-enterprise/issues/3502

In addition, this commit removes the information
about Amazon Linux and Oracle Linux, unnecessarily added
without request, and there's no clarity over which versions
should be documented.

Closes scylladb/scylladb#16279
2024-01-15 15:37:29 +02:00
Anna Stuchlik
bca39b2a93 doc: remove Serverless from the Drivers page
This commit removes the information about ScyllaDB Cloud Serverless,
which is no longer valid.

Closes scylladb/scylladb#16700
2024-01-15 15:36:51 +02:00
Pavel Emelyanov
6cb3055059 cql: Add tablets on/off switch to CREATE KEYSPACE
Now the user can do

  CREATE KEYSPACE ... WITH TABLETS = { 'enabled': false }

to turn tablets off. It will be useful in the future to opt-out keyspace
from tablets when they will be turned on by default based on cluster
features only.

Also one can do just

  CREATE KEYSPACE ... WITH TABLETS = { 'enabled': true }

and let Scylla select the initial tablets value by its own

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-01-15 13:12:11 +03:00
Pavel Emelyanov
941f6d8fca cql: Move initial_tablets from REPLICATION to TABLETS in DDL
This patch changes the syntax of enabling tablets from

  CREATE KEYSPACE ... WITH REPLICATION = { ..., 'initial_tablets': <int> }

to be

  CREATE KEYSPACE ... WITH TABLETS = { 'initial': <int> }

and updates all tests accordingly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-01-15 13:04:48 +03:00
Kefu Chai
7abd263ee6 db/config.cc: do not respect sstable_format option
"me" sstable format includes an important feature of storing the
`host_id` of the local node when writing sstables. The is crucial
for validating the sstable's `replay_position` in stats metadata as
it is valid only on the originating node and shard (#10080), therefor
we would like to make the `me` format mandatory.

before making `me` mandatory, we need to stop handling `sstable_format`
option if it is "md".

in this change

- gms/feature_service: do not disable `ME_SSTABLE_FORMAT` even if
  `sstable_format` is configured with "md". and in that case, instead,
  a warning is printed in the logging message to note that
  this setting is not valid anymore.
- docs/architecture/sstable: note that "me" is used by default now.

after this change, "sstable_format" will only accept "me" if it's
explicitly configured. and when a server with this change joins a
cluster, it uses "md" if the any of the node in the cluster still has
`sstable_format`. practically, this change makes "me" mandatory
in a 6.x cluster, assuming this change will be included in 6.x
releases.

Fixes #16551
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-01-11 22:43:05 +08:00
Avi Kivity
b8a0e3543e docs: ddl: document the initial_tablets replication strategy option
While the feature is experimental, this makes it easier to experiment
with it.

An example is provided.

Closes scylladb/scylladb#16193
2024-01-03 13:49:30 +01:00
Sylwia Szunejko
91a5a41313 add a way to negotiate generation of the tablet info for drivers
Tablets metadata is quite expensive to generate (each data_value is
an allocation), so an old driver (without support for tablets) will
generate huge amounts of such notifications. This commit adds a way
to negotiate generation of the notification: a new driver will ask
for them, and an old driver won't get them. It uses the
OPTIONS/SUPPORTED/STARTUP protocol described in native_protocol_v4.spec.

Closes scylladb/scylladb#16611
2024-01-02 20:00:50 +02:00
Sylwia Szunejko
467d466f7e put all tablet info into one field of custom_payload and update docs
Previously, the tablet information was sent to the drivers
in two pieces within the custom_payload. We had information
about the replicas under the `tablet_replicas` key and token range
information under `token_range`. These names were quite generic
and might have caused problems for other custom_payload users.
Additionally, dividing the information into two pieces raised
the question of what to do if one key is present while the other
is missing.

This commit changes the serialization mechanism to pack all information
under one specific name, `tablets-routing-v1`.

From: Sylwia Szunejko <sylwia.szunejko@scylladb.com>

Closes scylladb/scylladb#16148
2024-01-02 14:35:37 +02:00