25 Commits

Author SHA1 Message Date
Benny Halevy
d6557764b9 snapshot: keep per-sstable metadata in manifest.json
Adds a "sstables" array member to manifest.json.
For each sstables, keep the following metadata:
id - a uuid for the sstable (the sstable identifier
if the use-sstable-identifier option was used, otherwise
the sstable uuid generation)
toc_name - the name of the TOC.txt file
data_size and index_size - in bytes
first_token and last_token - of the sstable first and last keys.

Fixes: SCYLLADB-196

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2026-01-22 09:42:52 +02:00
Benny Halevy
dc9093303d snapshot: add table info and tablet_count to manifest.json
Add a table member to manifest.json with the keyspace_name,
table_name, table_id, tablets_type, and, for tablets-enabled tables, get
tablet_count on each shard and write the minimum to manifest.json.
For vnodes-based tables, tablet_count=0.

For now, `tablets_type` may be either `none` for vnodes tables, or
`powof2` for tablets tables.  In the future, when we support arbitrary
tablt boundaries, this will be reflected here, and it is likely we
would backup the whole tablets map sperately to get all tablet boundaries.

Fixes SCYLLADB-195

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2026-01-22 09:36:52 +02:00
Benny Halevy
91df129e21 snapshot: add basic support for snapshot ttl in manifest.json
Store the snapshot `created_at` time and an optional
`expires_at` time.

Fixes SCYLLADB-189

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2026-01-22 09:12:56 +02:00
Benny Halevy
d9fc3b1c11 snapshot: add snapshot name in manifest.json
Store the snapshot tag in the manifest file.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2026-01-22 09:12:56 +02:00
Benny Halevy
0d82e56078 snapshot: add node info to manifest.json
Add metadata about the node: host_id, datacenter, and rack.
This enables dc- or rack- aware restore.
Today this information is "encoded" into the snapshot hierarchy
prefixes, but if all manifest files would be stored in a flat
directory, we'd need to encode that metadata in the object name,
but it'd be better for the manifest contents to be self descriptive.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2026-01-22 09:12:56 +02:00
Benny Halevy
24040efc54 snapshot: add manifest info to manifest.json
Add metadata about the manifest itself:
A version and the manifest scope (currently "node",
but in the future, may also be "shard", or "tablet")

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2026-01-22 09:12:56 +02:00
Benny Halevy
9e0f5410ae test: database_test: snapshot_works: add validate_manifest
Validate the manifest.json format by loading it using rjson::parse
and then validate its contents to ensure it lists exactly the
SSTables present in the snapshot directory.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2026-01-22 09:12:56 +02:00
Pavel Emelyanov
bd225784bd docs: Update docs according to new endpoints config option format
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2026-01-13 13:24:06 +03:00
Avi Kivity
0df85c8ae8 Revert "Merge 'Unify configuration of object storage endpoints' from Pavel Emelyanov"
This reverts commit 1bb897c7ca, reversing
changes made to 954f2cbd2f. It makes
incompatible changes to the object storage configuration format, breaking
tests [1]. It's likely that it doesn't break any production configuration,
but we can't be sure.

Fixes #27966

Closes scylladb/scylladb#27969
2026-01-05 08:53:41 +02:00
Pavel Emelyanov
f93cafac51 docs: Update docs according to new endpoints config option format
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-12-10 15:33:47 +03:00
Calle Wilund
68c109c2df docs::dev::object_storage: Add some initial info on GS storage
Augments the object storage document with config options etc for
using GS instead of S3.

TODO: add proper gsutil command line examples for manual managing of
GCP storage.
2025-10-13 08:53:28 +00:00
Robert Bindar
1dd37ba47a Add dev documentation for manipulating s3 data manually
This patch intends to give an overview of where, when and how we store
data in S3 and provide a quick set of commands
which help gain local access to the data in case there is a need for
manual intervention.

The patch also collects in the same place links/descriptions for all
formats we use in S3.

Fixes #22438

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>

Closes scylladb/scylladb#24323
2025-06-17 13:21:30 +03:00
Robert Bindar
e3a3508960 Move object_storage.yaml endpoints to scylla.yaml
This change also removes the `object_storage.yaml` file
altogether and adds tests for fetching the endpoints
via the `v2/config/object_storage_endpoints` REST api.

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
2025-03-31 13:39:39 +03:00
Ernest Zaslavsky
29e60288de docs: update the object_storage.md and admin.rst
Added additional options and best practices for AWS authentication.
2025-02-05 14:57:19 +02:00
Kefu Chai
9bd9ee9f36 docs: promote object storage configuration to user-facing documentation
this commit moves the object storage configuration guide from the developer
documentation to the user-facing admin documentation. the change reflects
the increasing importance of object storage integration in user-facing
features.

in this change:

- move relevant content from `docs/dev/object_storage.md` to
  `docs/operating-scylla/admin.rst`
- reformat the content from Markdown to reStructuredText (RST)
- reword and restructure the content to be more user-friendly
- add explanations and context suitable for a broader audience

this change makes the object storage configuration information more
accessible to Scylla administrators and end-users, supporting the adoption
of new features built on top of object storage integration.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-22 18:26:19 +08:00
Kefu Chai
d663b6c13b treewide: add "table" parameter to "backup" API
with this parameter, "backup" API can backup the given table, this
enables it to be a drop-in replacement of existing rclone API used by
scylla manager.

in this change:

* api/storage_service: add "table" parameter to "backup" API.
* snapshot_ctl: compose the full path of the snapshot directory in
  `snapshot_ctl::start_backup`. since we have all the information
  for composing the snapshot directory, and what the `backup_task_impl`
  class is interested is but the snapshot directory, we just pass
  the path to it instead the individual components of the directory.
* backup_task_impl: instead of scan the whole keyspace recursively,
  only scan the specified snapshot directory.

Fixes scylladb/scylladb#20636
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-25 09:11:26 +08:00
Ernest Zaslavsky
924325fd25 treewide: add "prefix" parameter to backup API
Allow the caller to pass the prefix when performing backup and restore

Fixes scylladb/scylladb#20335

Closes scylladb/scylladb#20413
2024-09-18 08:25:00 +03:00
Pavel Emelyanov
245cc852dd docs: Document the new backup method
Add the new /storage_service/backup endpoint to object_storage.md as yet
another way to use S3 from Scylla.
2024-08-22 19:47:06 +03:00
Pavel Emelyanov
8949d73cd9 docs: Update object_storage.md with AWS_ environment
Commit 51c53d8db6 made it possible to configure object storage endpoint
creds via environment. Mention this in the docs.
2024-08-22 14:08:21 +03:00
Pavel Emelyanov
d3f9865d2f docs: Restructure object_storage.md
Currently the doc assumes that object storage can only be used to keep
sstables on it. It's going to change, restructure the doc to allow for
more usage scenarios.
2024-08-22 14:08:21 +03:00
Yaniv Michael Kaul
124064844f docs/dev/object_stroage.md: convert example AWS keys to be more innocent
Someone thought that they actually represent real keys (the 'EXAMPLE' in their name was not enough).
Converted them to be as clear as can be, example data.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#18565
2024-05-09 08:26:43 +03:00
Kefu Chai
f3f31f0c65 main.cc: rename aws option
- s/aws_key/aws_access_key_id/
- s/aws_secret/aws_secret_access_key/
- s/aws_token/aws_session_token/

rename them to more popular names, these names are also used by
boto's API. this should improve the readability and consistency.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-23 14:31:32 +08:00
Kefu Chai
30ef69fcb2 docs/dev/object_store: add more samples
in hope to lower the bar to testing object store.

* add language specifier for better readability of the document.
  to highlight the config with YAML syntax
* add more specific comment on the AWS related settings
* explain that endpoint should match in the CREATE KEYSPACE
  statement and the one defined by the YAML configuration.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15433
2023-09-15 17:35:17 +03:00
Kefu Chai
60c293ed7d doc/dev: correct the path to object_storage.yaml
we get the path object storage config like:

```c++
db::config::get_conf_sub("object_storage.yaml").native()
```
so, the default path should be $SCYLLA_CONF/object_storage.yaml.

in this change, it is corrected.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15406
2023-09-14 10:40:55 +03:00
Pavel Emelyanov
0b18e3bff9 doc: Add a document describing how to configure S3 backend
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:23:38 +03:00