- Remove `ScyllaCluster.__getitem__()` (pending request by @kbr- in a previous pull request), for this remove all direct access to servers from caller code
- Increase Python driver timeouts (req by @nyh)
- Improve `ManagerClient` API requests: use `http+unix://<sockname>/<resource>` instead of `http://localhost/<resource>` and callers of the helper method only pass the resource
- Improve lint and type hints
Closes#11305
* github.com:scylladb/scylladb:
test.py: remove ScyllaCluster.__getitem__()
test.py: ScyllaCluster check kesypace with any server
test.py: ScyllaCluster server error log method
test.py: ScyllaCluster read_server_log()
test.py: save log point for all running servers
test.py: ScyllaCluster provide endpoint
test.py: build host param after before_test
test.py: manager client disable lint warnings
test.py: scylla cluster lint and type hint fixes
test.py: increase more timeouts
test.py: ManagerClient improve API HTTP requests
Provide server error logs to caller (test.py).
Avoids direct access to list of servers.
To be done later: pick the failed server. For now it just provides the
log of one server.
While there, fix type hints.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Instead of accessing the first server, now test.py asks ScyllaCluster
for the server log.
In a later commit, ScyllaCluster will pick the appropriate server.
Also removes another direct access to the list of servers we want to get
rid of.
For error reporting, before a test a mark of the log point in time is
saved. Previously, only the log of the first server was saved. Now it's
done for all running servers.
While there, remove direct access to servers on test.py.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
If no server started, there is no server in the cluster list. So only
build the pytest --host param after before_test check is done.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Preparing for topology tests with changing clusters, run before and
after checks per test case.
Change scope of pytest fixtures to function as we need them per test
casse.
Add server and client API logic.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Add an API via Unix socket to Manager so pytests can query information
about the cluster. Requests are managed by ManagerClient helper class.
The socket is placed inside a unique temporary directory for the
Manager (as safe temporary socket filename is not possible in Python).
Initial API services are manager up, cluster up, if cluster is dirty,
cql port, configured replicas (RF), and list of host ids.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Instead of only using last started server as seed, use all started
servers as seed for new servers.
This also avoids tracking last server's state.
Pass empty list instead of None.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Preparing to cycle clusters modified (dirty) and use multiple clusters
per topology pytest, introduce Topology tests and Manager class to
handle clusters.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
For scylla servers, keep default PasswordAuthenticator and
CassandraAuthorizer but allow this to be configurable per test suite.
Use AllowAll* for topology test suite.
Disabling authentication avoids complications later for topology tests
as system_auth kespace starts with RF=1 and tests take down nodes. The
keyspace would need to change RF and run repair. Using AllowAll avoids
this problem altogether.
A different cql fixture is created without auth for topology tests.
Topology tests require servers without auth from scylla.yaml conf.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
The code in test.py using a ScyllaCluster is getting a server id and
taking logs from only the first server.
If there is a failure in another server it's not reported properly.
And CQL connection will go only to the first server.
Also, it might be better to have ScyllaCluster to handle these matters
and be more opaque.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Previously, if pytest itself failed (e.g. bad import or unexpected
parameter), there was no output file but test.py tried to copy it and
failed.
Change the logic of handling the output file to first check if the
file is there. Then if it's worth keeping it, *move* it to the test
directory for easier comparison and maintenance. Else, if it's not worth
keeping, discard it.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Closes#11193
Fix mixing of log filename and log summary in error reporting for
CQLApprovalTest and PythonTest.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Closes#11125
cql-pytest contains subdirectories with tests
ported from Cassandra. It's desirable to preserve
the same layout and file names for these tests as in
the original source tree. To do that, add support for recursive
search of tests to PythonTestSuite. The log files for
the tests which are found recursively are created in subdirs
of the test tmpdir.
While implementing the feature, switch to using pathlib,
since a) it supports rglob (recursive glob) and b) it
was requested in one of the earlier reviews.
Closes#11018
Add test and server logs, as well as the unidiff, to
XML output. This makes jenkins reports nicer.
While on it, debug & fix bugs in handling of flaky tests:
- the reset would reset a flaky test even after the last attempt
fails, so it would be impossible to see what happened to it
- the args needed to be reset as well, since execution modifies
them
- we would say that we're going to retry the flaky test when in
fact it was the last attempt to run it and no more retries were
planned
Today, if you want to reproduce a rare condition using the same RNG seed
reported, you cannot use test.py which provides useful infrastructure
and will have to run the tests manually instead.
So let's extend test.py to allow optional forwarding of RNG seed to
boost tests only, as other suites don't support the seed option.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220615223657.142110-1-raphaelsc@scylladb.com>
The idea is that a flaky test can be marked as flaky
rather than disabled to make sure it passes in CI.
This reduces chances of a regression being added
while the flakiness is being resolved and the number
of disabled tests doesn't grow.
Introduce reset() hierarchy, which is similar to __init__(),
i.e. allows to reset test execution state before retrying it.
Useful for retrying flaky tests.
If output is a not a tty, verbose is set automatically.
If the output is a tty, one has to request --verbose.
However, a part of test.py verbosity was ignoring --verbose
and looking only at the terminal type.
Before this patch, approval tests (test/cql/*) were using a C++
application called cql_repl, which is a seastar app running
Scylla, reading commands from the standard input and producing
results in Json format in the standard output. The rationale for
this was to avoid running a standalone Scylla which could leak
more resources such as open sockets.
Now that other suites already start and stop Scylla servers, it
makes more sense to run CQL commands in approval tests against an
existing running server. It saves us from building a one more
binary and allows to better format the output. Specifically, we
would like to see Scylla output in tabular format in approval
tests, which is difficult to do when C++ formatting libraries
are used.
Track running tests in the suite.
Cleanup after each suite (after all tests
in the suite end).
Cleanup all artifacts before exit. Don't drop server logs if
there is at least one failed test.
Search for test cases in parallel.
This speeds up the search for test cases from 30 to 4-5
seconds in absence of test case cache and from 4 to 3
seconds if case cache is present.
test.py runs each unit test's test case in a separate process.
The list of test cases is built at start, by running --list-cases
for each unit test. The output is cached, so that if one uses --repeat
option, we don't list the cases again and again.
The cache, however, was only useful for --repeat, because it was only
caching the last tests' output, not all tests output, so if I, for example,
run tests like:
./test.py foo bar foo
.. the cache was unused. Make the cache global which simplifies its
logic and makes it work in more cases.
To run tests in a given mode we will need to start off scylla
clusters, which we would want to pool and reuse between many tests.
TestSuite class was designed to share resources of common tests.
One can't pool together scylla servers compiled with different
tests, so create an own TestSuite instance for each mode.
It's good practice to use linters and style formatters for
all scripted languages. Python community is more strict
about formatting guidelines than others, and using
formatters (like flake8 or black) is almost universally
accepted.
test.py was adhering to flake8 standards at some point,
but later this was spoiled by random commits.
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
The test suite names seen by Jenkins are suboptimal: there is
no distinction between modes, and the ".cc" suffix of file names
is interpreted as a class name, which is converted to a tree node
that must be clicked to expand. Massage the names to remove
unnecessary information and add the mode.
Closes#9696
The recent parallelization of boost unit tests caused an increase
in xml result files. This is challenging to Jenkins, since it
appears to use rpc-over-ssh to read the result files, and as a result
it takes more than an hour to read all result files when the Jenkins
main node is not on the same continent as the agent.
To fix this, merge the result files in test.py and leave one result
file per mode. Later we can leave one result file overall (integrating
the mode into the testsuite name), but that can wait.
Tested on a local Jenkins instance (just reading the result files,
not the entire build).
Closes#9668
The option accepts taskset-style cpulist and limits the launched tests
respectively. When specified, the default number of jobs is adjusted
accordingly, if --jobs is given it overrides this "default" as expected.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>