When creating an S3-backed keyspace its storage dir shouldn't be made.
Also it shouldn't be "resurrected" by the boot-time loader of existing
keyspaces.
For extra confidence check that the system keyspace's directory does
exist where the test expects keyspaces' directories to appear.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
before this change, the tempdir is always nuked no matter whether the
test succeeds. but sometimes, it would be important to check
scylla's sstables after the test finishes.
so, in this change, an option named `--keep-tmp` is added so
we can optionally preserve the temp directory. this option is off
by default.
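a minimal sketch of how such an option could be wired up, assuming it is registered through pytest's `pytest_addoption` hook. `cleanup_tempdir()` is a hypothetical helper introduced only for illustration, not the name used in the test:

```python
import shutil


def pytest_addoption(parser):
    # registers the flag; it stays off unless passed on the command line
    parser.addoption("--keep-tmp", action="store_true", default=False,
                     help="preserve the temp directory after the test finishes")


def cleanup_tempdir(path, keep_tmp):
    # hypothetical helper: remove the directory unless asked to keep it
    if keep_tmp:
        return False
    shutil.rmtree(path, ignore_errors=True)
    return True
```

in a fixture, the flag's value would typically be read with `request.config.getoption("--keep-tmp")`.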
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#15949
instead of referencing the elements in tuple with their indexes, use
pattern matching to capture them. for better readability.
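for illustration, the difference looks roughly like this. the `(host, port)` tuple is a made-up example, not the tuple touched by this change:

```python
# hypothetical (host, port) pair, used only for illustration
endpoint = ("127.0.0.1", 9000)

# index-based access: the reader has to remember what [0] and [1] mean
by_index = f"{endpoint[0]}:{endpoint[1]}"

# unpacking names each element at the point of use
host, port = endpoint
by_name = f"{host}:{port}"
```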
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
it is printed when pytest passes it down as a fixture as part of
the logging message. it would help with debugging an object_store test.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#15817
before this change, we create a new UUID for a new sstable managed
by the s3_storage, and we use the string representation of UUID
defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for
naming the objects stored on s3_storage. but this representation is
not what we are using for storing sstables on local filesystem when
the option of "uuid_sstable_identifiers_enabled" is enabled. instead,
we are using a base36-based representation which is shorter.
to be consistent with the naming of the sstables created for local
filesystem, and more importantly, to simplify the interaction between
the local copy of sstables and those stored on object storage, we should
use the same string representation of the sstable identifier.
so, in this change:
1. instead of creating a new UUID, just reuse the generation of the
sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As
we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined
by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the
base36-based one (implemented by
fmt::formatter<sstables::generation_type>)
4. enable the `uuid_sstable_identifers` cluster feature if it is
enabled in the `test_env_config`, so that the sstable manager
can use the uuid-based generation when creating a new generation
for an sstable.
5. throw if the generation of sstable is not UUID-based when
accessing / manipulating an sstable with S3 storage backend, as
the S3 storage backend now relies on this option. otherwise
we'd have sstables with keys like s3://bucket/number/basename, which
cannot serve as a unique id for an sstable if the bucket is
shared across multiple tables.
Fixes #14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
pytest changes the test's sys.stdout and sys.stderr to the
captured fds when it captures the outputs of the test. so we
are not able to get the STDOUT_FILENO and STDERR_FILENO in C
by querying `sys.stdout.fileno()` and `sys.stderr.fileno()`.
their return values are not 1 and 2 anymore, unless pytest
is started with "-s".
so, to ensure that we always redirect the child process's
outputs to the log file, we need to use 1 and 2 for accessing
the well-known fds, which are the ones used by the child
process when it writes to stdout and stderr.
this change should address the problem that the log file is
always empty, unless "-s" is specified.
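a sketch of the idea, assuming the child is spawned with `subprocess` and the redirection happens in a `preexec_fn`. `run_with_log()` is a hypothetical helper, not the function changed here:

```python
import os
import subprocess

STDOUT_FILENO = 1
STDERR_FILENO = 2


def run_with_log(argv, log_path):
    """Run argv with its stdout and stderr redirected to log_path."""
    log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)

    def redirect():
        # runs in the child between fork and exec; use the literal fds,
        # not sys.stdout.fileno(), which pytest may have swapped out
        os.dup2(log_fd, STDOUT_FILENO)
        os.dup2(log_fd, STDERR_FILENO)

    try:
        return subprocess.run(argv, preexec_fn=redirect).returncode
    finally:
        os.close(log_fd)
```

passing `stdout=` and `stderr=` file objects to `subprocess` would be another way to get the same effect.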
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#15560
Test cases kick scylla to force keyspaces flush (to have the objects on
object store) by hand. Equip the wrapped cluster object with the REST
API class instance for convenience
The assertion for 200 return status code is dropped, REST client does it
behind the scenes
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Test cases use temporary cluster object which is, in fact, cql cluster.
In the future there will be the need to perform more actions on it
rather than just querying it with cql client, so wrap the cluster with
an extendable object
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The test-case creates an S3-backed ks, populates it with table and data,
then forces flush to make sstables appear on the backend. Then it
updates the registry by marking all the objects as 'removing' so that on
next boot they will be garbage-collected.
After reboot check that the table is "empty" and also validate that the
backend doesn't have the corresponding objects on board for real
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently minio applies anonymous public policy for the test bucket and
all tests just use unsigned S3 requests. This patch generates a policy
for the temporary minio user and removes the anon public one. All tests
are updated respectively to use the provided key:secret pair.
The use-https bit is off by default as minio still starts with plain
http. That's OK for now, all tests are local and have no secret data
anyway
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
instead of hardwiring the names in multiple places, let's just
keep them in a single place as variables, and reference them by
these variables instead of their values.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, object_store/test_basic.py creates a config file
for specifying the object storage settings, and passes the path of this
file as the argument of the `--object-storage-config-file` option when
running scylla. we have the same requirement when testing scylla
with minio server, where we launch a minio server and manually
create the config file and feed it to scylla.
to ease the preparation work, let's consolidate by creating the
config file in `minio_server.py`, so it always creates the config
file and puts it in its tempdir. since object_store/test_basic.py
can also run against an S3 bucket, the fixture implemented in
object_store/conftest.py is updated accordingly to reuse the
helper exposed by MinioServer to create the config file when it
is not available.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
should have used `ignore_errors=True` to ignore
the error. this issue has not popped up, because
we haven't run into the case where the log file
does not exist.
this was a regression introduced by
d4ee84ee1e
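for reference, this is how the keyword behaves, assuming the call in question is `shutil.rmtree()` (the message does not name the function). the path below is made up for the example:

```python
import os
import shutil
import tempfile

# a path that is guaranteed not to exist
missing = os.path.join(tempfile.mkdtemp(), "no-such-dir")

# without ignore_errors, a missing path raises FileNotFoundError
try:
    shutil.rmtree(missing)
    raised = False
except FileNotFoundError:
    raised = True

# with ignore_errors=True the same call silently does nothing
shutil.rmtree(missing, ignore_errors=True)
```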
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15063
since MinioServer finds a free port by itself, there is no need to
provide it with an IP address anymore -- we can always use
127.0.0.1.
so, in this change, we just drop the HostRegistry parameter passed
to the constructor of MinioServer, and pass the host address in place
of it.
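one common way to pick a free port on a fixed address is to bind to port 0 and let the kernel choose. this is only a sketch of that technique; MinioServer may do it differently, and `find_free_port()` is a hypothetical name:

```python
import socket


def find_free_port(host="127.0.0.1"):
    # binding to port 0 asks the kernel for any unused port
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]
```

note that the port is released when the socket closes, so another process could grab it before the server binds to it.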
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, if the object_store test fails, the tempdir
will be preserved. and if our CI test pipeline is used to perform
the test, the test job would scan for the artifacts, and if the
test in question fails, it would take over 1 hour to scan the tempdir.
to alleviate the pain, let's keep only the scylla logging file,
whether the test fails or succeeds, so that jenkins can scan the
artifacts faster if the test fails.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14880
in 46616712, we tried to keep the tmpdir only if the test failed,
and keep up to 1 of them using the recently introduced
option of `tmp_path_retention_count`. but it turns out this option
is not supported by the pytest used by our jenkins nodes, where we
have pytest 6.2.5. this is the one shipped along with fedora 36.
so, in this change, the tempdir is removed if the test completes
without failures, as the tempdir contains a huge number of files,
and jenkins is quite slow scanning them. after nuking the tempdir,
jenkins will be much faster when scanning for the artifacts.
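the "remove on success, keep on failure" behaviour can be sketched with a plain context manager. `tempdir_kept_on_failure()` is a hypothetical name, and the real change lives in a pytest fixture rather than in a helper like this:

```python
import contextlib
import shutil
import tempfile


@contextlib.contextmanager
def tempdir_kept_on_failure(prefix="scylla-test-"):
    path = tempfile.mkdtemp(prefix=prefix)
    yield path
    # only reached when the body of the `with` block did not raise
    shutil.rmtree(path, ignore_errors=True)
```

if the body raises, the exception propagates out of the `yield` and the directory is left in place for inspection.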
Fixes #14690
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14772
by default, up to 3 temporary directories are kept by pytest.
but we run only a single time for each of the $TMPDIR. per
our recent observation, it takes a lot more time for jenkins
to scan the tempdir if we use it for scylla's rundir.
so, to alleviate this symptom, we just keep up to one failed
session in the tempdir. if the test passes, the tempdir
created by pytest will be nuked. normally it is located at
scylladb/testlog/${mode}/pytest-of-$(whoami).
see also
https://docs.pytest.org/en/7.3.x/reference/reference.html#confval-tmp_path_retention_policy
Refs #14690
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14735
[xemul: Withdrawing from PR's comments
object_store is the only test which
- is using tmpdir fixture
- starts / stops scylla by itself
- and puts the rundir of scylla in its own tmpdir
we don't register the step of cleaning up [the temp dir] using the utilities provided by
cql-pytest. we rely on pytest to perform the cleanup. while cql-pytest performs the
cleanup using a global registry.
]
these comments or docstrings are not in-sync with the code they
are supposed to explain. so let's update them accordingly.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14545
let's just use cluster.contact_points for retrieving the IP address
of the scylla node in this single-node cluster. so the name of
managed_cluster() is less weird.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
instead of using a single run to perform the test, restructure
it into a pytest based test suite with a single test case.
this should allow us to add more tests exercising the object-storage
and cached/tiered storage in the future.
* add fixtures so they can be reused by tests
* use tmpdir fixture for managing the tmpdir, see
https://docs.pytest.org/en/6.2.x/tmpdir.html#the-tmpdir-fixture
* perform part of the teardown in the "test_tempdir()" fixture
* change the type of test from "Run" to "Python"
* rename "run" to "test_basic.py"
* optionally start the minio server if the settings are not
found in command line or env variables, so that the tests are
self-contained without the fixture setup by test.py.
* instead of sys.exit(), use assert statement, as this is
what pytest uses.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
* define a dedicated S3_server class which duck types MinioServer.
it will be used to represent S3 server in place of MinioServer if
S3 is used for testing
* prepare object_storage.yaml in get_scylla_with_s3(), so it is more
clear that we are using the same set of settings for launching
scylla
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
replace the restart_with_dir() with kill_with_dir(), so
that we can simplify the usage of managed_cluster() by enabling it
to start and stop the single-node cluster. with this change, the caller
does not need to run the scylla and pass its pid to this function
any more.
since the restart_with_dir() call is superseded by managed_cluster(),
which tears down the cluster, teardown() is now only responsible to
print out the log file.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
to match with another call of managed_cluster(), so it's clear that
we are just reusing test_tempdir.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
for setting up the cluster and tearing it down.
this helps to indent the code so that the lifecycle of the cluster
is visually explicit.
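a sketch of the pattern, assuming the setup and teardown are wrapped with `contextlib.contextmanager`. the `start` and `stop` callables are hypothetical stand-ins and do not reflect the real signature of managed_cluster():

```python
import contextlib


@contextlib.contextmanager
def managed_cluster_sketch(start, stop):
    # `start` and `stop` are hypothetical callables supplied by the caller
    handle = start()
    try:
        yield handle
    finally:
        # teardown runs even if the body of the `with` block raises
        stop(handle)
```

everything indented under `with managed_cluster_sketch(...) as cluster:` then runs while the cluster is up.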
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
instead of hardwiring the dataset in test, let's define them with
variables and use the variables instead.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
in order to make data set for testing more visible, format_tuples() is
introduced for formatting a dict into a set of structured values
consumable by CQL.
this function is added to test/cql-pytest/util.py in hope that it
can be reused by other tests using CQL.
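a rough idea of what such a helper might look like. the output shape (a CQL map literal with quoted keys and values) and the name `format_tuples_sketch()` are assumptions, so the real format_tuples() may differ:

```python
def format_tuples_sketch(values):
    """Render a dict as a CQL map literal with quoted keys and values."""
    body = ", ".join(f"'{key}': '{value}'" for key, value in values.items())
    return "{ " + body + " }"
```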
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
As described in https://github.com/scylladb/scylladb/issues/8638,
we're moving away from `SimpleStrategy`, in the future
it will become deprecated.
We should remove all uses of it and replace them
with `NetworkTopologyStrategy`.
This change replaces `SimpleStrategy` with
`NetworkTopologyStrategy` in all unit tests,
or at least in the ones where it was reasonable to do so.
Some of the tests were written explicitly to test the
`SimpleStrategy` strategy, or changing the keyspace from
`SimpleStrategy` to `NetworkTopologyStrategy`.
These tests were left intact.
It's still a feature that is supported,
even if it's slowly getting deprecated.
The typical way to use `NetworkTopologyStrategy` is
to specify a replication factor for each datacenter.
This could be a bit cumbersome, we would have to fetch
the list of datacenters, set the repfactors, etc.
Luckily there is another way - we can just specify
a replication factor to use for each existing
datacenter, like this:
```cql
CREATE KEYSPACE {} WITH REPLICATION =
{'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1};
```
This makes the change rather straightforward - just replace all
instances of `'SimpleStrategy'` with `'NetworkTopologyStrategy'`.
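In a Python test, building such a statement could look like the sketch below. `create_keyspace_stmt()` is a hypothetical helper for illustration and is not taken from the test suite:

```python
def create_keyspace_stmt(name, replication_factor=1):
    # doubled braces produce literal { and } in an f-string
    return (
        f"CREATE KEYSPACE {name} WITH REPLICATION = "
        f"{{'class': 'NetworkTopologyStrategy', "
        f"'replication_factor': {replication_factor}}}"
    )
```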
Refs: https://github.com/scylladb/scylladb/issues/8638
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Closes #13990
Currently the code temporarily assumes that the endpoint port is 9000.
This is what tests' local minio is started with. This patch keeps the
port number on endpoint config and makes test get the port number from
minio starting code via environment.
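On the test side this could look roughly as follows. The variable name `S3_TEST_ENDPOINT_PORT` is made up for the sketch, since the message does not name the variable, and the fallback to 9000 is only there to mirror the old assumption:

```python
import os

# hypothetical variable name; the real one is not given in the message
PORT_ENV_VAR = "S3_TEST_ENDPOINT_PORT"


def endpoint_port(environ=os.environ, default=9000):
    # environment values are strings, so convert before use
    return int(environ.get(PORT_ENV_VAR, default))
```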
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In order to access real S3 bucket, the client should use signed requests
over https. Partially this is due to security considerations, partially
this is unavoidable, because multipart-uploading is banned for unsigned
requests on the S3. Also, signed requests over plain http require
signing the payload as well, which is a bit troublesome, so it's better
to stick to secure https and keep payload unsigned.
To prepare signed requests the code needs to know three things:
- aws key
- aws secret
- aws region name
The latter could be derived from the endpoint URL, but it's simpler to
configure it explicitly, all the more so because S3 URLs without a
region name in them are also an option we might want to use some time.
To keep the described configuration the proposed place is the
object_storage.yaml file with the format
  endpoints:
    - name: a.b.c
      port: 443
      aws_key: 12345
      aws_secret: abcdefghijklmnop
      ...
When loaded, the map gets into db::config and later will be propagated
down to sstables code (see next patch).
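For tests, a file in this format could be produced with a few lines of Python. This is a sketch under the assumption that only the keys spelled out above are needed. Whatever the "..." stands for is left out, and `render_object_storage_config()` is a hypothetical name:

```python
def render_object_storage_config(endpoints):
    """Render the endpoints list in the YAML layout described above."""
    lines = ["endpoints:"]
    for ep in endpoints:
        lines.append(f"  - name: {ep['name']}")
        # only the keys shown in the message; elided keys are not guessed
        for key in ("port", "aws_key", "aws_secret"):
            lines.append(f"    {key}: {ep[key]}")
    return "\n".join(lines) + "\n"
```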
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
the temporary directory holding the log file collecting the scylla
subprocess's output is specified by the test itself, and it is
`test_tempdir`. but unfortunately, cql-pytest/run.py is not aware
of this. so `cleanup_all()` is not able to print out the logging
messages at exit. as, please note, cql-pytest/run.py always
collects the "log" file under the directory created using `pid_to_dir()`,
where pid is that of the spawned subprocess. but `object_store/run` uses
the main process's pid for its reusable tempdir.
so, with this change, we also register a cleanup func to print out
the logging message when the test exits.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
we only need to declare a variable with `global` when we need to
write to it, but if we just want to read it, there is no need to
declare it. because the way python looks up a variable
when reading from it enables python to find the global variables
(and apparently the functions!). but when we assign a variable in
python, the interpreter would have to tell in which scope the
variable lives. by default the local scope is used, and a new
variable is added to `locals()`.
but in this case, we just read from it. so no need to add the
`global` statement.
see also https://docs.python.org/3/reference/simple_stmts.html#global
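a small self-contained illustration of the rule. the names are made up for the example:

```python
counter = 0


def read_counter():
    # reading: the lookup falls through to module scope, no `global` needed
    return counter


def bump_counter():
    # assigning: without `global`, `counter` would be treated as a local
    global counter
    counter += 1


def shadow_counter():
    # plain assignment creates a new local; the module-level value is untouched
    counter = 100
    return counter
```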
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Using it the pylib minio code exports the minio address for tests. This
creates unneeded WTFs when running the test over AWS S3, so it's better
to rename the variable not to mention MINIO at all.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Local test.py runs minio with the public 'testbucket' bucket and all
test cases know that. This series adds an ability to run tests over real
S3 so the bucket name should be configurable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The test does
- starts scylla (over stable directory)
- creates S3-backed keyspace (minio is up and running by test.py
already)
- creates table in that keyspace and populates it with several rows
- flushes the keyspace to make sstables hit the storage
- checks that the ownership table is populated properly
- restarts scylla
- makes sure old entries exist
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>