Search for test cases in parallel.
This speeds up the search for test cases from 30 to 4-5
seconds in the absence of a test case cache and from 4 to 3
seconds if the case cache is present.
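A minimal sketch of the idea, not the actual test.py code. The list_cases
coroutine is a hypothetical stand-in for running a test binary with
--list-cases; the point is only that all listings are awaited together
rather than one after another.

```python
import asyncio

async def list_cases(test_name):
    # Hypothetical stand-in for running "<binary> --list-cases";
    # the sleep simulates the time spent waiting for the process.
    await asyncio.sleep(0.01)
    return [f"{test_name}_case_{i}" for i in range(2)]

async def find_all_cases(test_names):
    # Launch all listings at once and wait for them together,
    # instead of awaiting each one in turn.
    results = await asyncio.gather(*(list_cases(name) for name in test_names))
    return dict(zip(test_names, results))

cases = asyncio.run(find_all_cases(["foo", "bar"]))
```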
test.py runs each unit test's test case in a separate process.
The list of test cases is built at start, by running --list-cases
for each unit test. The output is cached, so that if one uses --repeat
option, we don't list the cases again and again.
The cache, however, was only useful for --repeat, because it was only
caching the last test's output, not all tests' output, so if I, for example,
run tests like:
./test.py foo bar foo
... the cache was unused. Make the cache global, which simplifies its
logic and makes it work in more cases.
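A rough sketch of a global cache, assuming a hypothetical run_list_cases
helper in place of the real --list-cases invocation. With the
"./test.py foo bar foo" example, the second "foo" is served from the cache.

```python
_case_cache = {}   # test name -> list of case names, shared by all tests
calls = []         # records how often the (expensive) listing actually runs

def run_list_cases(test_name):
    # Hypothetical stand-in for launching the binary with --list-cases.
    calls.append(test_name)
    return [f"{test_name}.case1", f"{test_name}.case2"]

def get_cases(test_name):
    # Consult the global cache first; list only on a miss.
    if test_name not in _case_cache:
        _case_cache[test_name] = run_list_cases(test_name)
    return _case_cache[test_name]

for name in ["foo", "bar", "foo"]:
    get_cases(name)
```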
To run tests in a given mode we will need to start off scylla
clusters, which we would want to pool and reuse between many tests.
TestSuite class was designed to share resources of common tests.
One can't pool together scylla servers compiled in different
modes, so create a separate TestSuite instance for each mode.
It's good practice to use linters and style formatters for
all scripted languages. The Python community is more strict
about formatting guidelines than others, and using
formatters (like flake8 or black) is almost universally
accepted.
test.py was adhering to flake8 standards at some point,
but later this was spoiled by random commits.
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes were applied mechanically with a script, except for
licenses/README.md.
Closes #9937
The test suite names seen by Jenkins are suboptimal: there is
no distinction between modes, and the ".cc" suffix of file names
is interpreted as a class name, which is converted to a tree node
that must be clicked to expand. Massage the names to remove
unnecessary information and add the mode.
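One way the renaming could look, as a sketch only. The function name and
the "<name>-<mode>" output format are assumptions made for illustration;
the commit text does not spell out the exact format.

```python
def jenkins_suite_name(file_name, mode):
    # Drop a trailing ".cc" so Jenkins does not treat "cc" as a class name,
    # then append the build mode to tell modes apart.
    suffix = ".cc"
    base = file_name[:-len(suffix)] if file_name.endswith(suffix) else file_name
    # A dash is used (not a dot) to avoid creating another tree level.
    return f"{base}-{mode}"
```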
Closes #9696
The recent parallelization of boost unit tests caused an increase
in xml result files. This is challenging to Jenkins, since it
appears to use rpc-over-ssh to read the result files, and as a result
it takes more than an hour to read all result files when the Jenkins
main node is not on the same continent as the agent.
To fix this, merge the result files in test.py and leave one result
file per mode. Later we can leave one result file overall (integrating
the mode into the testsuite name), but that can wait.
Tested on a local Jenkins instance (just reading the result files,
not the entire build).
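A sketch of merging per-test result documents into one, using the standard
xml.etree.ElementTree module. The TestLog/TestSuite element names are used
for illustration; the real merge in test.py may differ in detail.

```python
import xml.etree.ElementTree as ET

def merge_results(xml_documents):
    # Collect the children of every per-test root element
    # under a single new root, producing one document per mode.
    merged = ET.Element("TestLog")
    for doc in xml_documents:
        root = ET.fromstring(doc)
        merged.extend(list(root))
    return merged

docs = [
    '<TestLog><TestSuite name="a"/></TestLog>',
    '<TestLog><TestSuite name="b"/></TestLog>',
]
merged = merge_results(docs)
names = [suite.get("name") for suite in merged]
```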
Closes #9668
The option accepts a taskset-style cpulist and limits the launched tests
accordingly. When specified, the default number of jobs is adjusted
to match; if --jobs is given, it overrides this "default" as expected.
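A sketch of parsing such a list, with hypothetical helper names. It handles
plain ids and ranges ("0-3,8") but not the stride syntax taskset also
accepts. Using one job per allowed CPU as the default is an assumption of
this sketch.

```python
def parse_cpulist(cpulist):
    # Expand a taskset-style list such as "0-3,8" into a sorted list of CPU ids.
    cpus = set()
    for part in cpulist.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def default_jobs(cpus, jobs=None):
    # An explicit --jobs value wins; otherwise use one job per allowed CPU
    # (an assumption made for this sketch).
    return jobs if jobs else len(cpus)
```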
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Next patch will need to know if the --jobs option was specified or the
caller is OK with the default. One way to achieve it is to keep 0 as the
default and set the default value afterwards.
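The approach can be sketched with argparse as follows. The parse_jobs
helper and the computed_default parameter are hypothetical names.

```python
import argparse

def parse_jobs(argv, computed_default):
    parser = argparse.ArgumentParser()
    # 0 means "not specified"; a real default is filled in after parsing.
    parser.add_argument("--jobs", "-j", type=int, default=0)
    args = parser.parse_args(argv)
    jobs_given = args.jobs != 0
    if not jobs_given:
        args.jobs = computed_default
    return args.jobs, jobs_given
```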
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There were a few missing bits before making this the default.
- default max number of AIOs, now tests are run with the greatly
reduced value
- 1.5 hours single case from database_test, now it's split and
scales with --parallel-cases
- suite add_test methods called in a loop for --repeat options,
patch #1 from this set fixes it
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The add_test method of a suite can be called several times in a
row e.g. in case of --repeat option or because there are more
than one custom_args entries in the suite.yaml file. In any case
it's pointless to re-collect the test cases by launching the
test binary again, it's much faster (and 100% safe) to keep the
list of cases from the previous call and re-use it if the test
name matches.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Seastar's default limit of 10,000 iocbs per shard is too low for
some workloads (it places an upper bound on the number of idle
connections, above which a crash occurs). Use the new Seastar
feature to raise the default to 50,000.
Also multiply the global reservation by 5, and round it upwards
so the number is less weird. This prevents io_setup() from failing.
For tests, the reservation is reduced since they don't create large
numbers of connections. This reduces surprise test failures when they
are run on machines that haven't been adjusted.
Fixes #9051
Closes #9052
The parallelism is achieved by listing the content of each (boost)
test and by adding a test for each case found appending the
'--run_test={case_name}' option.
Also, a few tests (logalloc and memtable) have cases that depend on
each other (the former explicitly stated this in the head comment),
so these are marked as "no_parallel_cases" in the suite.yaml file.
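A sketch of the expansion step, with a hypothetical expand_test helper. It
only builds argument lists; how the case list is obtained and how the
arguments are passed on to the binary is left out.

```python
def expand_test(binary, cases, no_parallel_cases=False):
    # Return one argument list per entry to run.
    if no_parallel_cases or not cases:
        # Cases depend on each other: run the whole binary as one test.
        return [[binary]]
    return [[binary, f"--run_test={case}"] for case in cases]
```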
In dev mode tests need 2m 5s to run by default. With parallelism
(and updated long-running tests list) -- 1m 35s.
In debug mode there are 6 slow _cases_ that overrun 30 minutes.
They finish last and deserve some special (incremental) care. All
the other tests run ~1h by default vs ~25m in parallel.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This means adding the casename argument to its describing class
and handling it:
1. appending to the shortname
2. adding the --run_test= argument to boost args
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method in question is in charge of creating a single
entry in the list of tests to be run. The BoostTestSuite's
method is about to create several entries and this patch
prepares it for this:
- makes it distinguish individual arguments
- lets it select the test.id value itself
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When running tests in parallel-cases mode the test.uname must
include the case name to make different log and xml files for
different runs and to show which exact case is run when shown
by the tabular-output. At the same time the test shortname
identifies the binary with the whole test.
This patch makes class Test treat the shortname argument as
a dot-separated string where the 0th component is the binary
with the test and the rest is how test identifies itself.
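For illustration, the split could be done like this (split_shortname is a
hypothetical helper, not a name from test.py):

```python
def split_shortname(shortname):
    # The 0th dot-separated component names the binary;
    # anything after it is how the test identifies itself (e.g. the case name).
    binary, _, rest = shortname.partition(".")
    return binary, rest
```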
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This change solves several issues that would arise with the
case-by-case run.
First, the currently printed name is "$binary_name.$id". For
case-by-case run the binary name would coincide for many cases
and it will be inconvenient to identify the test case. So
the test's uname is printed instead.
Second, the test's uname doesn't contain suite name (unlike the
test binary name which does), so this patch also adds the
explicit suite name back as a separate column (like MODE)
Third, the testname + casename string length will be far above
the allocated 50 characters, so the test name is moved to the
tail of the line.
Fourth, the total number of cases is 2100+, the field of 7
characters is not enough to print it, so it's extended.
Finally the test.py output would look like this for parallel run:
================================================================================
[N/TOTAL] SUITE MODE RESULT TEST
------------------------------------------------------------------------------
[1/2108] raft dev [ PASS ] etcd_test.test_progress_leader.40 0.06s
[2/2108] raft dev [ PASS ] etcd_test.test_vote_from_any_state.45 0.03s
[3/2108] raft dev [ PASS ] etcd_test.test_progress_flow_control.43 0.04s
[4/2108] raft dev [ PASS ] etcd_test.test_progress_resume_by_append_resp.41 0.05s
[5/2108] raft dev [ PASS ] etcd_test.test_leader_election_overwrite_newer_logs.44 0.04s
[6/2108] raft dev [ PASS ] etcd_test.test_progress_paused.42 0.05s
[7/2108] raft dev [ PASS ] etcd_test.test_log_replication_2.47 0.06s
...
or like this for regular:
================================================================================
[N/TOTAL] SUITE MODE RESULT TEST
------------------------------------------------------------------------------
[1/184] raft dev [ PASS ] fsm_test.41 0.06s
[2/184] raft dev [ PASS ] etcd_test.40 0.06s
[3/184] cql dev [ PASS ] cassandra_cql_test.2 1.87s
[4/184] unit dev [ PASS ] btree_stress_test.30 1.82s
...
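A sketch of how such a line could be formatted. The column widths and the
format_result helper are assumptions for illustration, not the values used
by test.py.

```python
def format_result(n, total, suite, mode, result, uname, seconds):
    # Fixed-width columns come first; the variable-length test name goes last
    # so that long "test.case.id" names cannot break the alignment.
    counter = f"[{n}/{total}]"
    return f"{counter:<12}{suite:<10}{mode:<8}[ {result} ] {uname} {seconds:.2f}s"

line = format_result(1, 2108, "raft", "dev", "PASS",
                     "etcd_test.test_progress_leader.40", 0.06)
```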
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Since fce124bd90 ("Merge "Introduce flat_mutation_reader_v2" from
Tomasz") tests involving mutation_reader are a lot slower due to
the new API testing. On slower machines it's enough to time out.
Work is underway to improve the situation, and it will also revert back
to the original timing once the flat_mutation_reader_v2 work is done,
but meanwhile, increase the timeout.
Closes #9046
Instead of attempting to universally set the proper environment
necessary for tests to generate profiling data such that coverage.py can
process it, allow each Test subclass to set up the environment as needed
by the specific Test variant.
With this we now have support for all current test types, including cql,
cql-pytest and alternator tests.
* Add ability to skip tests in individual modes using "skip_in_<mode>".
* Add ability to allow tests in specific modes using "run_in_<mode>".
* Rename "skip_in_debug_mode" to "skip_in_debug_modes", because there
is an actual mode named "debug" and this is confusing.
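A sketch of how the per-mode lists could be evaluated. The cfg dict mimics
a parsed suite.yaml; the key names follow the text above, while the
should_run helper and the precedence between the keys are assumptions.

```python
def should_run(test_name, mode, cfg):
    # An explicit skip for this mode always wins.
    if test_name in cfg.get(f"skip_in_{mode}", []):
        return False
    for key, tests in cfg.items():
        # A test listed under run_in_<other mode> is restricted to the
        # modes that list it.
        if key.startswith("run_in_") and key != f"run_in_{mode}" and test_name in tests:
            return test_name in cfg.get(f"run_in_{mode}", [])
    return True

cfg = {"skip_in_release": ["slow_test"], "run_in_dev": ["dev_only_test"]}
```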
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Add support for the newly added coverage mode. When --mode=coverage,
also invoke the coverage generation report script to produce a coverage
report after having run the tests.
There are still some rough edges, alternator and cql tests don't work.
8a8589038c ("test: increase quota for tests to 6GB") increased
the quota for tests from 2GB to 6GB. I later found that the increased
requirement is related to the page size: Address Sanitizer allocates
at least a page per object, and so if the page size is larger the
memory requirement is also larger.
Make use of this by only increasing the quota if the page size
is greater than 4096 (I've only seen 4096 and 65536 in the wild).
This allows greater parallelism when the page size is small.
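A sketch of the page-size check, using os.sysconf from the standard
library. The helper names are hypothetical, and deriving the job count by
dividing total memory by the quota is an assumption of this sketch.

```python
import os

GiB = 1024 ** 3

def per_test_memory(page_size=None):
    # Query the system page size unless one is supplied (useful for testing).
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    # Larger pages make the sanitizer's per-object page allocation costlier.
    return 6 * GiB if page_size > 4096 else 2 * GiB

def max_parallel_tests(total_memory, page_size=None):
    # Never go below one job, even on small machines.
    return max(1, total_memory // per_test_memory(page_size))
```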
Closes #8371
Tests are short-lived and use a small amount of data. They
are also often run repeatedly, and the data is deleted immediately
after the test. This is a good scenario for using the kernel page
cache, as it can cache read-only data from test to test, and avoid
spilling write data to disk if it is deleted quickly.
Acknowledge this by using the new --kernel-page-cache option for
tests.
This is expected to help on large machines, where the disk can be
overloaded. Smaller machines with NVMe disks probably will not see
a difference.
Closes #8347
The patch which introduces build-dependent testing
has a regression: it quietly filters out all tests
which are not part of ninja output. Since ninja
doesn't build any CQL tests (including CQL-pytest),
all such tests were quietly disabled.
Fix the regression by only doing the filtering
in unit and boost test suites.
test: dev (unit), dev + --build-raft
Message-Id: <20201119224008.185250-1-kostja@scylladb.com>
When unit tests fail, test.py dumps their output on the screen. It is impossible
to read this output from the terminal, all the more so since the logs are saved in
the testlog/ directory anyway. At the same time the names of the failed tests are all left
_before_ these logs, and if the terminal history is not large enough, it becomes
quite annoying to find the names out.
The proposal is not to spoil the terminal with raw logs -- just names and summaries.
Logs themselves are at testlog/$mode/$name_of_the_test.log
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20201031154518.22257-1-xemul@scylladb.com>
Some ARM cores are slow, and trip our current timeout of 3000
seconds in debug mode. Quadrupling the timeout is enough to make
debug-mode tests pass on those machines.
Since the timeout's role is to catch rare infinite loops in unsupervised
testing, increasing the timeout has no ill effect (other than to
delay the report of the failure).
Closes #7518
test.py estimates the amount of memory needed per test
in order not to overload the machine, but it underestimates
badly and so machines with many cores but not a lot of memory
fail the tests (in debug mode principally) due to running out
of memory.
Increase the estimate from 2GB per test to 6GB.
Closes #7499
rapidjson has a harmless (but true) ubsan violation. It was fixed
in 16872af889.
Since rapidjson hasn't released since 2016, we're unlikely to see
the fix, so suppress it to prevent the tests failing. In any case
the violation is harmless.
gcc's ubsan doesn't object to the addition.
Closes #7357
For suite.yaml, add an extra configuration option, disable.
Tests in this list will be disabled for all modes.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
This patch adds a "--list" option to test.py that shows a list of tests
instead of executing them. This is useful for people and scripts, which
need to discover the tests that will be run. For example, when Jenkins
needs to store failing tests, it can use "test.py --list" to figure out
what to archive.
Message-Id: <20200916135714.89350-1-penberg@scylladb.com>
Export TMPDIR environment variable pointing at a subdir of testlog.
This variable is used by seastar/scylla tests to create
a subdirectory with temporary test data. Normally a test cleans
up the temporary directory, but if it crashes or is killed the
directory remains.
By resetting the default location from /tmp to testlog/{mode}
we allow test.py to consolidate all test artefacts in a single
place.
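A sketch of building such an environment. The make_test_env helper and its
testlog_dir parameter are hypothetical names.

```python
import os

def make_test_env(mode, testlog_dir="testlog"):
    # Copy the current environment and point TMPDIR at testlog/<mode>,
    # so leftovers of crashed tests end up next to the logs.
    env = dict(os.environ)
    env["TMPDIR"] = os.path.join(testlog_dir, mode)
    return env
```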
Fixes #6062, "test.py uses tmpfs"
When tests are run in parallel, it is hard to tell how much time each test
ran. The time difference between consecutive printouts (indicating a test's
end) says nothing about the test's duration.
This patch adds in "--verbose" mode, at the end of each test result, the
duration in seconds (in wall-clock time) of the test. For example,
$ ./test.py --mode dev --verbose alternator
================================================================================
[N/TOTAL] TEST MODE RESULT
------------------------------------------------------------------------------
[1/2] boost/alternator_base64_test dev [ PASS ] 0.02s
[2/2] alternator/run dev [ PASS ] 26.57s
These durations are useful for recognizing tests which are especially slow,
or runs where all the tests are unusually slow (which might indicate some
sort of misconfiguration of the test machine).
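A sketch of measuring and printing a wall-clock duration. The timed and
format_duration helpers are hypothetical; time.monotonic is used here so
the measurement is unaffected by system clock adjustments.

```python
import time

def timed(run):
    # Measure the wall-clock duration of one test run.
    start = time.monotonic()
    result = run()
    return result, time.monotonic() - start

def format_duration(seconds):
    # Two decimals, matching the "0.02s" style shown above.
    return f"{seconds:.2f}s"

result, elapsed = timed(lambda: "PASS")
```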
Fixes #6759
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200706142109.438905-1-nyh@scylladb.com>
The problem is that this option is defined in the seastar testing wrapper,
while no unit tests use it; they all just start themselves with app.run() and
would complain about an unknown option.
"Would", because nowadays every single test in it declares its own options
in suite.yaml, that override test.py's defaults. Once an option-less unit
test is added (B+ tree ones) it will complain.
The proposal is to remove this option from the defaults. If any unit test
uses the seastar testing wrappers and needs this option, it can add it
to the suite.yaml.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200709084602.8386-1-xemul@scylladb.com>
* seastar a6c8105443...7664f991b9 (13):
> gate: add try_enter and try_with_gate
> Merge "Manage reference counts in the file API" from Rafael
> cmake: Refactor a bit of duplicated code
> stream: Delete _sub
> future: Add a rethrow_exception to future_state_base
> future: Use a new seastar::nested_exception in finally
> cmake: only apply C++ compile options to C++ language
> testing: Enable fail-on-abandoned-failed-futures by default
> future: Correct a few hypercorrect uses of std::forward
> futures_test: Test using future::then with functions
> Merge "io-queue: A set of cleanups collected so far" from Pavel E
> tmp_file: Replace futurize_apply with futurize_invoke
> future: Replace promise::set_coroutine with forward_state_and_schedule
Contains update to tests from Rafael:
tests: Update for fail-on-abandoned-failed-futures's new default
This depends on the corresponding change in seastar.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Currently test.py has three different places it checks whether stdout is
a tty. This patch centralizes these into a single global variable. This
ensures consistency and makes it easier to override it later with a
command-line switch, should we want to.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200529101124.123925-1-bdenes@scylladb.com>
Boost test uses colored output by default, even when the output of the
test is redirected to a file. This makes the output quite hard to read
for example in Jenkins. This patch fixes this by disabling the colored
output when stdout is not a tty. This is in line with the colored output
of configure.py itself, which is also enabled only if stdout is a tty.
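A sketch of the check, assuming Boost.Test's color_output runtime
parameter. The boost_color_args helper is a hypothetical name, and the
exact argument test.py passes may differ.

```python
import sys

def boost_color_args(stdout_is_tty=None):
    # Decide once whether stdout is a terminal, unless told explicitly.
    if stdout_is_tty is None:
        stdout_is_tty = sys.stdout.isatty()
    # Colors only make sense on a terminal, not in a redirected log.
    return ["--color_output={}".format("true" if stdout_is_tty else "false")]
```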
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200526112857.76131-1-bdenes@scylladb.com>
Print the test command line and the UBSAN and ASAN env settings to the log
so the run can be easily reproduced (optionally providing --random-seed=XXX,
which is printed by scylla unit tests when they start).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200513110959.32015-1-bhalevy@scylladb.com>
The Alternator test's run script, test/alternator/run, runs Scylla.
By default, it chooses the last built Scylla executable build/*/scylla.
However, test.py has a "mode" option, that should be able to choose which
build mode to run. Before this patch, this mode option wasn't honored by
the Alternator test, so a "test.py alternator/run" would run the same
Scylla binary (the one last built) three times, instead of running each
of the three build modes.
We fix this in this patch: test.py now passes the "SCYLLA" environment
variable to the test/alternator/run script, indicating the location of the
Scylla binary with the appropriate build mode. The script already supported
this environment variable to override its default choice of Scylla binary.
In test.py, we add to the run_test() function an optional "env" parameter
which can be used to pass additional environment variables to the test.
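A simplified, synchronous sketch of such an "env" parameter. The real
run_test() in test.py has a different signature; here a child Python
process stands in for test/alternator/run and the binary path is only an
example value.

```python
import os
import subprocess
import sys

def run_test(cmd, env=None):
    # Start from the inherited environment and overlay the extra variables.
    full_env = dict(os.environ)
    full_env.update(env or {})
    return subprocess.run(cmd, env=full_env, capture_output=True, text=True)

# A child process stands in for the run script and echoes the variable.
child = [sys.executable, "-c", "import os; print(os.environ['SCYLLA'])"]
result = run_test(child, env={"SCYLLA": "build/dev/scylla"})
```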
Fixes #6286
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200427131958.28248-1-nyh@scylladb.com>
Assume that "Run" tests can take the --junit-xml=<path> option, and
pass it to ask the test to write an XML summary of the run to a file
like testlog/dev/xml/run.1.xunit.xml.
This option is honored by the Alternator tests.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>