The mechanics of the restore are as follows:
- A /storage_service/tablets/restore API is called with (keyspace, table, endpoint, bucket, manifests) parameters
- First, it populates the system_distributed.snapshot_sstables table with the data read from the manifests
- Then it emplaces a bunch of tablet transitions (of a new "restore" kind), one for each tablet
- The topology coordinator handles the "restore" transition by calling a new RESTORE_TABLET RPC against all the current tablet replicas
- Each replica handles the RPC verb by
  - Reading the snapshot_sstables table
  - Filtering the sstable infos it read against the current node and the tablet being handled
  - Downloading and attaching the filtered sstables
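
The replica-side filtering step above can be sketched roughly as follows. This is only an illustration, not the actual sstables_loader code: the `SstableInfo` class and `sstables_for_tablet` function are hypothetical names, the rows mirror a subset of the snapshot_sstables columns, and selecting by token-range overlap is an assumption (the real helper may instead require an sstable to be fully contained in the tablet).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SstableInfo:
    # Hypothetical mirror of a subset of the snapshot_sstables columns.
    datacenter: str
    rack: str
    id: str
    first_token: int
    last_token: int
    toc_name: str


def sstables_for_tablet(rows, datacenter, rack, tablet_first, tablet_last):
    """Keep rows for this node's dc/rack whose inclusive token range
    overlaps the tablet's inclusive range [tablet_first, tablet_last].

    Overlap (rather than containment) is an assumption of this sketch.
    """
    return [
        r for r in rows
        if r.datacenter == datacenter
        and r.rack == rack
        and r.first_token <= tablet_last
        and r.last_token >= tablet_first
    ]


# Made-up sample rows, for illustration only.
rows = [
    SstableInfo("dc1", "r1", "a", -100, -1, "me-a-big-TOC.txt"),
    SstableInfo("dc1", "r1", "b", 0, 50, "me-b-big-TOC.txt"),
    SstableInfo("dc1", "r2", "c", 0, 50, "me-c-big-TOC.txt"),
]
selected = sstables_for_tablet(rows, "dc1", "r1", 0, 99)
print([r.id for r in selected])
```

Only the sstables selected this way would then be downloaded and attached by the replica.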
This PR includes the system_distributed.snapshot_sstables table from @robertbindar and preparation work from @kreuzerkrieg that extracts raw sstable downloading and attaching from the existing generic sstables loading code.
This is the first step towards SCYLLADB-197 and lacks many things. In particular:
- the API only works for single-DC cluster
- the caller needs to "lock" tablet boundaries with min/max tablet count
- not abortable
- no progress tracking
- sub-optimal (re-kicking the API on restore will re-download everything)
- not re-attachable (if the API node dies, restoration proceeds, but the caller cannot "wait" for it to complete via another node)
- nodes download sstables in the maintenance/streaming sched group (should be moved to maintenance/backup)
Other follow-up items:
- have an actual swagger object specification for `backup_location`
Closes #28436
Closes #28657
Closes #28773
Closes scylladb/scylladb#28763
* github.com:scylladb/scylladb:
test: Add test for backup vs migration race
test: Restore resilience test
sstables_loader: Fail tablet-restore task if not all sstables were downloaded
sstables_loader: mark sstables as downloaded after attaching
sstables_loader: return shared_sstable from attach_sstable
db: add update_sstable_download_status method
db: add downloaded column to snapshot_sstables
db: extract snapshot_sstables TTL into class constant
test: Add a test for tablet-aware restore
tablets: Implement tablet-aware cluster-wide restore
messaging: Add RESTORE_TABLET RPC verb
sstables_loader: Add method to download and attach sstables for a tablet
tablets: Add restore_config to tablet_transition_info
sstables_loader: Add restore_tablets task skeleton
test: Add rest_client helper to kick newly introduced API endpoint
api: Add /storage_service/tablets/restore endpoint skeleton
sstables_loader: Add keyspace and table arguments to manifest loading helper
sstables_loader_helpers: just reformat the code
sstables_loader_helpers: generalize argument and variable names
sstables_loader_helpers: generalize get_sstables_for_tablet
sstables_loader_helpers: add token getters for tablet filtering
sstables_loader_helpers: remove underscores from struct members
sstables_loader: move download_sstable and get_sstables_for_tablet
sstables_loader: extract single-tablet SST filtering
sstables_loader: make download_sstable static
sstables_loader: fix formatting of the new `download_sstable` function
sstables_loader: extract single SST download into a function
sstables_loader: add shard_id to minimal_sst_info
sstables_loader: add function for parsing backup manifests
split utility functions for creating test data from database_test
export make_storage_options_config from lib/test_services
rjson: Add helpers for conversions to dht::token and sstable_id
Add system_distributed_keyspace.snapshot_sstables
add get_system_distributed_keyspace to cql_test_env
code: Add system_distributed_keyspace dependency to sstables_loader
storage_service: Export handle_raft_rpc() helper
storage_service: Export do_tablet_operation()
storage_service: Split transit_tablet() into two
tablets: Add braces around tablet_transition_kind::repair switch
Currently, the manifest advertises "powof2", which is wrong for
arbitrary count and boundaries.
Introduce a new kind of layout called "arbitrary", and produce it if
the tablet map doesn't conform to "powof2" layout.
We should also produce tablet boundaries in this case, but that's
worked on in a different PR: https://github.com/scylladb/scylladb/pull/28525
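
As a rough illustration of the distinction, the check could look like the sketch below. The function names are hypothetical, and the definition of a conforming "powof2" layout used here (a power-of-two tablet count whose boundaries split the signed 64-bit token space evenly) is an assumption of the sketch, not taken from the patch.

```python
def even_split_last_tokens(tablet_count):
    """Last token of each tablet when the signed 64-bit token space is
    split evenly. Assumes tablet_count is a power of two."""
    log2_count = tablet_count.bit_length() - 1
    last_tokens = []
    for i in range(tablet_count):
        # End of the i-th slice in unsigned space, then shift to signed.
        unsigned_last = ((i + 1) << (64 - log2_count)) - 1
        last_tokens.append(unsigned_last - 2**63)
    return last_tokens


def layout_kind(last_tokens):
    """Return "powof2" if the boundaries match an even power-of-two split,
    otherwise "arbitrary"."""
    n = len(last_tokens)
    is_pow2 = n > 0 and (n & (n - 1)) == 0
    if is_pow2 and list(last_tokens) == even_split_last_tokens(n):
        return "powof2"
    return "arbitrary"


print(layout_kind([-1, 2**63 - 1]))
print(layout_kind([-5, 2**63 - 1]))
```

The second call uses a power-of-two count with a shifted boundary, which is why the count alone is not enough to advertise "powof2".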
This patch adds the snapshot_sstables table with the following
schema:
```cql
CREATE TABLE system_distributed.snapshot_sstables (
    snapshot_name text,
    "keyspace" text, "table" text,
    datacenter text, rack text,
    id uuid,
    first_token bigint, last_token bigint,
    toc_name text, prefix text,
    PRIMARY KEY ((snapshot_name, "keyspace", "table", datacenter, rack), first_token, id));
```
The table will be populated by the coordinator node during the restore
phase (and later on during the backup phase to accommodate live-restore).
The content of this table is meant to be consumed by the restore worker nodes,
which will use this data to filter sstables and download them file by file.
Fixes SCYLLADB-263
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
Adds a "sstables" array member to manifest.json.
For each sstable, keep the following metadata:
- id - a uuid for the sstable (the sstable identifier
  if the use-sstable-identifier option was used, otherwise
  the sstable uuid generation)
- toc_name - the name of the TOC.txt file
- data_size and index_size - in bytes
- first_token and last_token - the tokens of the sstable's first and last keys.
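
A consumer of this array might read it along the lines of the sketch below. The sample values are made up, `load_sstables` is a hypothetical helper, and representing tokens and sizes as JSON integers is an assumption; only the field names come from the description above.

```python
import json

# Made-up manifest fragment; only the field names follow the description.
manifest_text = """
{
  "sstables": [
    {"id": "67e35000-d8c6-11ee-8b4e-2f3b5a8c1d20",
     "toc_name": "me-example-big-TOC.txt",
     "data_size": 4096, "index_size": 128,
     "first_token": -9000, "last_token": 1234}
  ]
}
"""

REQUIRED_FIELDS = {"id", "toc_name", "data_size", "index_size",
                   "first_token", "last_token"}


def load_sstables(text):
    """Parse the "sstables" array and check each entry has the listed fields."""
    entries = json.loads(text).get("sstables", [])
    for entry in entries:
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            raise ValueError(f"sstable entry missing fields: {sorted(missing)}")
        if entry["first_token"] > entry["last_token"]:
            raise ValueError(f"inverted token range in {entry['toc_name']}")
    return entries


sstables = load_sstables(manifest_text)
total_bytes = sum(s["data_size"] + s["index_size"] for s in sstables)
print(len(sstables), total_bytes)
```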
Fixes: SCYLLADB-196
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Add a table member to manifest.json with the keyspace_name,
table_name, table_id, and tablets_type. For tablets-enabled tables, get
the tablet_count on each shard and write the minimum to manifest.json.
For vnodes-based tables, tablet_count=0.
For now, `tablets_type` may be either `none` for vnodes tables, or
`powof2` for tablets tables. In the future, when we support arbitrary
tablet boundaries, this will be reflected here, and it is likely we
would back up the whole tablets map separately to get all tablet boundaries.
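
The rule for filling in the table member can be sketched as below. `table_manifest_entry` is a hypothetical helper and the sample identifiers are made up; the sketch only shows the `none`/`powof2` choice and the minimum-over-shards tablet count described above.

```python
def table_manifest_entry(keyspace_name, table_name, table_id,
                         per_shard_tablet_counts=None):
    """Build the "table" member. Pass per-shard tablet counts for
    tablets-enabled tables, or None (or an empty list) for vnodes tables."""
    if per_shard_tablet_counts:
        tablets_type = "powof2"
        # Shards may report different counts; the minimum is recorded.
        tablet_count = min(per_shard_tablet_counts)
    else:
        tablets_type = "none"
        tablet_count = 0
    return {
        "keyspace_name": keyspace_name,
        "table_name": table_name,
        "table_id": table_id,
        "tablets_type": tablets_type,
        "tablet_count": tablet_count,
    }


tablets_entry = table_manifest_entry("ks", "t", "example-table-id", [8, 8, 4])
vnodes_entry = table_manifest_entry("ks", "t", "example-table-id")
print(tablets_entry["tablet_count"], vnodes_entry["tablets_type"])
```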
Fixes SCYLLADB-195
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Add metadata about the node: host_id, datacenter, and rack.
This enables dc- or rack- aware restore.
Today this information is "encoded" into the snapshot hierarchy
prefixes. If all manifest files were stored in a flat
directory, we'd need to encode that metadata in the object name,
so it is better for the manifest contents to be self-descriptive.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Add metadata about the manifest itself:
A version and the manifest scope (currently "node",
but in the future, may also be "shard", or "tablet")
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Validate the manifest.json format by loading it using rjson::parse
and then validate its contents to ensure it lists exactly the
SSTables present in the snapshot directory.
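
In spirit, the check resembles the sketch below, which uses Python's `json` module where the test uses rjson::parse. `validate_manifest` is a hypothetical name, and identifying SSTables by their `-TOC.txt` file names under an "sstables" array is an assumption of the sketch.

```python
import json


def validate_manifest(manifest_text, snapshot_dir_files):
    """Check that the manifest parses and lists exactly the SSTables
    (identified here by TOC file name) present in the snapshot directory."""
    manifest = json.loads(manifest_text)  # raises ValueError if malformed
    listed = {s["toc_name"] for s in manifest["sstables"]}
    present = {f for f in snapshot_dir_files if f.endswith("-TOC.txt")}
    missing = present - listed      # on disk but not in the manifest
    unexpected = listed - present   # in the manifest but not on disk
    if missing or unexpected:
        raise ValueError(
            f"manifest mismatch: missing={sorted(missing)}, "
            f"unexpected={sorted(unexpected)}")
    return listed


# Made-up file names, for illustration only.
files = ["me-1-big-TOC.txt", "me-1-big-Data.db", "manifest.json"]
good = '{"sstables": [{"toc_name": "me-1-big-TOC.txt"}]}'
print(sorted(validate_manifest(good, files)))
```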
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This reverts commit 1bb897c7ca, reversing
changes made to 954f2cbd2f. It makes
incompatible changes to the object storage configuration format, breaking
tests [1]. It's likely that it doesn't break any production configuration,
but we can't be sure.
Fixes #27966
Closes scylladb/scylladb#27969
Augments the object storage document with config options etc for
using GS instead of S3.
TODO: add proper gsutil command line examples for manually managing
GCP storage.
This patch intends to give an overview of where, when and how we store
data in S3 and provide a quick set of commands
which help gain local access to the data in case there is a need for
manual intervention.
The patch also collects in the same place links/descriptions for all
formats we use in S3.
Fixes #22438
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
Closes scylladb/scylladb#24323
This change also removes the `object_storage.yaml` file
altogether and adds tests for fetching the endpoints
via the `v2/config/object_storage_endpoints` REST api.
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
this commit moves the object storage configuration guide from the developer
documentation to the user-facing admin documentation. the change reflects
the increasing importance of object storage integration in user-facing
features.
in this change:
- move relevant content from `docs/dev/object_storage.md` to
`docs/operating-scylla/admin.rst`
- reformat the content from Markdown to reStructuredText (RST)
- reword and restructure the content to be more user-friendly
- add explanations and context suitable for a broader audience
this change makes the object storage configuration information more
accessible to Scylla administrators and end-users, supporting the adoption
of new features built on top of object storage integration.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
with this parameter, the "backup" API can back up the given table, which
enables it to be a drop-in replacement for the existing rclone API used by
scylla manager.
in this change:
* api/storage_service: add "table" parameter to "backup" API.
* snapshot_ctl: compose the full path of the snapshot directory in
`snapshot_ctl::start_backup`. since we have all the information
for composing the snapshot directory, and all the `backup_task_impl`
class is interested in is the snapshot directory, we just pass
the path to it instead of the individual components of the directory.
* backup_task_impl: instead of scanning the whole keyspace recursively,
only scan the specified snapshot directory.
Fixes scylladb/scylladb#20636
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Currently the doc assumes that object storage can only be used to keep
sstables on it. This is going to change, so restructure the doc to allow for
more usage scenarios.
Someone thought that they actually represent real keys (the 'EXAMPLE' in their name was not enough).
Converted them so that they are, as clearly as can be, example data.
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Closes scylladb/scylladb#18565
- s/aws_key/aws_access_key_id/
- s/aws_secret/aws_secret_access_key/
- s/aws_token/aws_session_token/
rename them to more popular names, these names are also used by
boto's API. this should improve the readability and consistency.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
in hope to lower the bar to testing object store.
* add a language specifier to highlight the config with YAML syntax,
for better readability of the document.
* add more specific comments on the AWS related settings
* explain that the endpoint in the CREATE KEYSPACE statement should
match the one defined by the YAML configuration.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15433
we get the path of the object storage config like:
```c++
db::config::get_conf_sub("object_storage.yaml").native()
```
so, the default path should be $SCYLLA_CONF/object_storage.yaml.
in this change, it is corrected.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15406