The test shows how to restore previously backed up table:
- backup
- truncate to get rid of existing sstables
- start restore with the new API method
- wait for the task to finish
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This adds minimal implementation of the start-backup API call.
The method starts a task that uploads all files from the given keyspace's snapshot to the requested endpoint/bucket. Arguments are:
- endpoint -- the ID in object_store.yaml config file
- bucket -- the target bucket to put objects into
- keyspace -- the keyspace to work on
- snapshot -- the method assumes that the snapshot had been already taken and only copies sstables from it
The task runs in the background, its task_id is returned from the method once it's spawned and it should be used via /task_manager API to track the task execution and completion (hint: it's good to have non-zero TTL value to make sure fast backups don't finish before the caller manages to call wait_task API).
Sstables components are scanned for all tables in the keyspace and are uploaded into the /bucket/${cf_name}/${snapshot_name}/ path.
refs: #18391Closesscylladb/scylladb#19890
* github.com:scylladb/scylladb:
tools/scylla-nodetool: add backup integration
docs: Document the new backup method
test/object_store: Test that backup task is abortable
test/object_store: Add simple backup test
test/object_store: Move format_tuples()
test/pylib: Add more methods to rest client
backup-task: Make it abortable (almost)
code: Introduce backup API method
database: Export parse_table_directory_name() helper
database: Introduce format_table_directory_name() helper
snapshot-ctl: Add config to snapshot_ctl
snapshot-ctl: Add sstables::storage_manager dependency
snapshot-ctl: Maintain task manager module
snapshot-ctl: Add "snapshots" logger
snapshot-ctl: Outline stop() method and constructor
snapshot-ctl: Inline run_snapshot_list<>
test/cql_test_env: Export task manager from cql test env
task_manager: Print task ttl on start (for debugging)
docs: Update object_storage.md with AWS_ environment
docs: Restructure object_storage.md
Increase pool size changes were recently reverted because of the flakiness for the test_gossip_boot test. Test started
to fail on adding the node to the cluster without any issues in the Scylla log file. In test logs it looked like the
installation process for the new node just hanged. After investigating the problem, I've found out that the issue is that
test.py was draining the io_executor pool for cleaning the directory during install that was set to eight workers. So
to fix the issue, io_executor pool should be increased to more or less the same ratio as it was: doubled cluster pool size.
Closesscylladb/scylladb#20276
It starts similarly to simpl backup test, but injects a pause into the
task once a single file is scheduled for upload, then aborts the task,
waits for it to fail, and check that _not_ all files are uploaded.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The test shows how to backup a keyspace:
- flush
- take snapshot
- start backup with the new API method
- wait for the task to finish
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Instead of plain http request, use the power of boto3 package. The
recently added get_s3_resource() facilitates creating one
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It creates boto3.resource object that points to endpoint maintained
by s3_server argument (that tests obtain via fixture). This allows
using boto3 to access S3 bucket from local minio server.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Incorrect passing of the artifacts_dir_url parameter from test.py to pytest leads to the situation when it will pass None as a string and pytest will generate incorrect URL.
In CI test always executed with option --repeat=3 that leads to generate 3 test results with the same name. Junit plugin in CI cannot distinguish correctly the difference between these results. In case when we have two passes and one fail, the link to test result will sometimes be redirected to the incorrect one because the test name is the same.
To fix this ReportPlugin added that will be responsible to modify the test case name during junit report generation adding to the test name mode and run id.
Fixes: https://github.com/scylladb/scylladb/issues/17851
Fixes: https://github.com/scylladb/scylladb/issues/15973
This PR resolves issue with double count of the test result for topology tests. It will not appear in the consolidated report anymore.
Another fix is to provide a better view which test failed by modifying the test case name in the report enriching it with mode and run id, so making them unique across the run.
The scope of this change is:
1. Modify the test name to have run id in name
2. Add handlers to get logs of test.py and pytest in one file that are related to test, rather than to the full suite
3. Remove topology tests from aggregating them on a suite level in Junit results
4. Add a link to the logs related to the failed tests in Junit results, so it will be easier to navigate to all logs related to test
5. Gather logs related to the failed test to one directory for better logs investigation
Ref: scylladb/scylladb#17851Closesscylladb/scylladb#18277
Now all test cases use pylib manager client to manipulate cluster
While at it -- drop more unused bits from suite .py files
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now when the test case in question is not using ManagerCluster, there's
no point in using test_tempdir either and the temporary object-store
config can be generated in generic temporary directory
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In the middle this test case needs to force scylla server reload its
configs. Currently manager API requires that some existing config option
is provided as an argument, but in this test case scylla.yaml remains
intact. So it satisfies the API with non-chaning option.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This case is a bit tricky, as it needs to know where scylla's workdir
is, so it replaces the use of test_tempdir with the call to manager to
get server's workdir.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This includes
- marking the suite as Topology
- import needed fixtures and options from topology conftest
- configuring the zero initial cluster size and anonymous auth
- marking all test cases as skipped, as they no longer work after above
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
The test case is
- start scylla with broken object storage endpoint config
- create and populate s3-backed keyspace
- try flushing it (API call would hang, so do it in the background)
- wait for a few seconds, then fix the config
- wait for the flush to finish and stop scylla
- start scylla again and check that the keyspace is properly populated
Nice side effect of this test is that once flush fails (due to broken
config) it tries to remove the not-yet-sealed sstables and (!) fails
again, for the same reason. So during the restart there happen to be
several sstables in "creating" state with no stored objects, so this
additionally tests one more g.c. corner case
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The Cluster wrapper used by object_store test already has the ability to
access cluster via CQL and via API. Add the sugar to make the cluster
re-read its scylla.yaml and other configs
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The pylib minio server does that already. A test case added by the next
patch would need to have both cases as path, not as string
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In an attempt to create a non-local keyspace with unknown endpoint,
there should pop up the configuration exception.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We're going to ban creation of a keyspace with S3 type in case the
requested endpoint is not configured. The problem is that this test case
of cql-pytest needs such keyspace to be created and in order to provide
the object storage configuration we'd need to touch the generic scylla
cluster management which is an overill for generic cql-pytest case.
Simpler solution is to make object_store test suite perform all the
S3-related checks, including the way DESCRIBE for S3-backed ks works.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
All two and the upcoming third test cases in the test create the very
same ks.cf pair with the very same sequence of steps. Generalize them.
For the basic test case also tune up the way "expected" rows are
calculated -- now they are SELECT-ed right after insertion and the size
is checked to be non zero. Not _exactly_ the same check, but it's good
enough for basic testing purposes.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#15986
When creating a S3-backed keyspace its storage dir shouldn't be made.
Also it shouldn't be "resurrected" by boot-time loader of existing
keyspaces.
For extra confidence check that the system keyspace's directory does
exists where the test expects keyspaces' directories to appear.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
before this change, the tempdir is always nuked no matter if the
test succceds. but sometimes, it would be important to check
scylla's sstables after the test finishes.
so, in this change, an option named `--keep-tmp` is added so
we can optionally preserve the temp directory. this option is off
by default.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15949
instead of referencing the elements in tuple with their indexes, use
pattern matching to capture them. for better readability.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
it is printed when pytest passes it down as a fixture as part of
the logging message. it would help with debugging a object_store test.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15817
before this change, we create a new UUID for a new sstable managed
by the s3_storage, and we use the string representation of UUID
defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for
naming the objects stored on s3_storage. but this representation is
not what we are using for storing sstables on local filesystem when
the option of "uuid_sstable_identifiers_enabled" is enabled. instead,
we are using a base36-based representation which is shorter.
to be consistent with the naming of the sstables created for local
filesystem, and more importantly, to simplify the interaction between
the local copy of sstables and those stored on object storage, we should
use the same string representation of the sstable identifier.
so, in this change:
1. instead of creating a new UUID, just reuse the generation of the
sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As
we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined
by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the
base36-based one (implemented by
fmt::formatter<sstables::generation_type>)
4. enable the `uuid_sstable_identifers` cluster feature if it is
enabled in the `test_env_config`, so that it the sstable manager
can enable the uuid-based uuid when creating a new uuid for
sstable.
5. throw if the generation of sstable is not UUID-based when
accessing / manipulating an sstable with S3 storage backend. as
the S3 storage backend now relies on this option. as, otherwise
we'd have sstables with key like s3://bucket/number/basename, which
is just unable to serve as a unique id for sstable if the bucket is
shared across multiple tables.
Fixes#14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
pytest changes the test's sys.stdout and sys.stderr to the
captured fds when it captures the outputs of the test. so we
are not able to get the STDOUT_FILENO and STDERR_FILENO in C
by querying `sys.stdout.fileno()` and `sys.stderr.fileno()`.
their return values are not 1 and 2 anymore, unless pytest
is started with "-s".
so, to ensure that we always redirect the child process's
outputs to the log file. we need to use 1 and 2 for accessing
the well-known fds, which are the ones used by the child
process, when it writes to stdout and stderr.
this change should address the problem that the log file is
always empty, unless "-s" is specified.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15560
Test cases kick scylla to force keyspaces flush (to have the objects on
object store) by hand. Equip the wrapped cluster object with the REST
API class instance for convenience
The assertion for 200 return status code is dropped, REST client does it
behind the scenes
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Test cases use temporary cluster object which is, in fact, cql cluster.
In the future there will be the need to perform more actions on it
rather than just querying it with cql client, so wrap the cluster with
an extendable object
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The test-case creates a S3-backed ks, populates it with table and data,
then forces flush to make sstables appear on the backend. Then it
updates the registry by marking all the objects as 'removing' so that on
next boot they will be garbage-collected.
After reboot check that the table is "empty" and also validate that the
backend doesn't have the corresponding objects on board for real
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently minio applies anonymous public policy for the test bucket and
all tests just use unsigned S3 requests. This patch generates a policy
for the temporary minio user and removes the anon public one. All tests
are updated respectively to use the provided key:secret pair.
The use-https bit is off by default as minio still starts with plain
http. That's OK for now, all tests are local and have no secret data
anyway
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
instead of hardwiring the names in multiple places, let's just
keep them in a single place as variables, and reference them by
these variables instead of their values.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, object_store/test_basic.py create a config file
for specifying the object storage settings, and pass the path of this
file as the argument of `--object-storage-config-file` option when
running scylla. we have the same requirement when testing scylla
with minio server, where we launch a minio server and manually
create a the config file and feed it to scylla.
to ease the preparation work, let's consolidate by creating the
config file in `minio_server.py`, so it always creates the config
file and put it in its tempdir. since object_store/test_basic.py
can also run against an S3 bucket, the fixture implemented
object_store/conftest.py is updated accordingly to reuse the
helper exposed by MinioServer to create the config file when it
is not available.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
should have been use `ignore_errors=True` to ignore
the error. this issue has not poped up, because
we haven't run into the case where the log file
does not exist.
this was a regression introduced by
d4ee84ee1e
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15063
since MinioServer find a free port by itself, there is no need to
provide it an IP address for it anymore -- we can always use
127.0.0.1.
so, in this change, we just drop the HostRegistry parameter passed
to the constructor of MinioServer, and pass the host address in place
of it.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, if the object_store test fails, the tempdir
will be preserved. and if our CI test pipeline is used to perform
the test, the test job would scan for the artifacts, and if the
test in question fails, it would take over 1 hour to scan the tempdir.
to alleviate the pain, let's just keep the scylla logging file
no matter the test fails or succeeds. so that jenkins can scan the
artifacts faster if the test fails.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14880
in 46616712, we tried to keep the tmpdir only if the test failed,
and keep up to 1 of them using the recently introduced
option of `tmp_path_retention_count`. but it turns out this option
is not supported by the pytest used by our jenkins nodes, where we
have pytest 6.2.5. this is the one shipped along with fedora 36.
so, in this change, the tempdir is removed if the test completes
without failures. as the tempdir contains huge number of files,
and jenkins is quite slow scanning them. after nuking the tempdir,
jenkins will be much faster when scanning for the artifacts.
Fixes#14690
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14772
by default, up to 3 temporary directories are kept by pytest.
but we run only a single time for each of the $TMPDIR. per
our recent observation, it takes a lot more time for jenkins
to scan the tempdir if we use it for scylla's rundir.
so, to alleviate this symptom, we just keep up to one failed
session in the tempdir. if the test passes, the tempdir
created by pytest will be nuked. normally it is located at
scylladb/testlog/${mode}/pytest-of-$(whoami).
see also
https://docs.pytest.org/en/7.3.x/reference/reference.html#confval-tmp_path_retention_policy
Refs #14690
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14735
[xemul: Withdrawing from PR's comments
object_store is the only test which
is using tmpdir fixture
starts / stops scylla by itself
and put the rundir of scylla in its own tmpdir
we don't register the step of cleaning up [the temp dir] using the utilities provided by
cql-pytest. we rely on pytest to perform the cleanup. while cql-pytest performs the
cleanup using a global registry.
]
these comments or docstrings are not in-sync with the code they
are supposed to explain. so let's update them accordingly.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14545