`linearizing_input_stream` allows transparently reading linearized
values from a fragmented buffer. This is done by linearizing on-the-fly
only those read values that happen to be split across multiple
fragments. This reduces the size of the largest allocation from the size
of the entire buffer (when the entire buffer is linearized) to the size
of the largest read value. This is a huge gain when the buffer contains
many small objects, and a modest gain when the buffer contains a few
large objects. But even in the worst case the size of the largest
allocation will be less than or equal to that in the case where the
entire buffer is linearized.
This stream is planned to be used as glue code between the fragmented
cell value and the collection deserialization code which expects to be
reading linearized values.
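The idea can be illustrated with a minimal Python sketch. It is not the
C++ implementation: the class name `LinearizingReader`, its `read()`
method and the `copies` counter are hypothetical and exist only for
illustration. Reads that fit in the current fragment return a view
without copying, and only reads that straddle a fragment boundary are
copied into a contiguous buffer.

```python
class LinearizingReader:
    """Reads values from a list of byte fragments (illustrative only)."""

    def __init__(self, fragments):
        # Drop empty fragments so the current fragment always has data.
        self._fragments = [memoryview(f) for f in fragments if len(f) > 0]
        self._index = 0   # index of the current fragment
        self._offset = 0  # read position inside the current fragment
        self.copies = 0   # number of reads that had to be linearized

    def read(self, size):
        if size == 0:
            return memoryview(b"")
        if self._index >= len(self._fragments):
            raise EOFError("read past end of buffer")
        current = self._fragments[self._index]
        if self._offset + size <= len(current):
            # Fast path: the value lies in one fragment, return a view.
            view = current[self._offset:self._offset + size]
            self._advance(size)
            return view
        # Slow path: the value spans fragments, so copy only this value.
        out = bytearray()
        while len(out) < size:
            if self._index >= len(self._fragments):
                raise EOFError("read past end of buffer")
            current = self._fragments[self._index]
            take = min(size - len(out), len(current) - self._offset)
            out += current[self._offset:self._offset + take]
            self._advance(take)
        self.copies += 1
        return memoryview(bytes(out))

    def _advance(self, count):
        # Move the read position, stepping to the next fragment when done.
        self._offset += count
        if self._offset == len(self._fragments[self._index]):
            self._index += 1
            self._offset = 0
```

In this sketch the largest allocation is bounded by the largest value
that happens to cross a fragment boundary, never by the whole buffer.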
Use asyncio as a more modern way to work with concurrency.
Process signals in the event loop and terminate all outstanding
tests before exiting.
Breaking change: this commit requires Python 3.7 or
newer to run this script. The patch adds a version
check and a message to enforce it.
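A rough sketch of the approach, not the actual test.py code: the names
`run_one`, `run_all` and the `cancel_after` hook are hypothetical, and
`asyncio.sleep` stands in for running a real test process. Signal
handlers are installed on the event loop, and when one fires all
outstanding tasks are cancelled before returning.

```python
import asyncio
import signal
import sys

if sys.version_info < (3, 7):
    sys.exit("Python 3.7 or newer is required to run this script")


async def run_one(name, delay):
    # Stand-in for launching a test process and waiting for it to finish.
    await asyncio.sleep(delay)
    return name


async def run_all(tests, cancel_after=None):
    """Run (name, delay) tests concurrently; stop early on SIGINT/SIGTERM."""
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    installed = []
    for signo in (signal.SIGINT, signal.SIGTERM):
        try:
            loop.add_signal_handler(signo, stop.set)
            installed.append(signo)
        except (NotImplementedError, RuntimeError, ValueError):
            pass  # e.g. Windows, or not running in the main thread
    if cancel_after is not None:
        # Hypothetical demo hook: behave as if a signal had arrived.
        loop.call_later(cancel_after, stop.set)
    tasks = [asyncio.ensure_future(run_one(n, d)) for n, d in tests]
    stopper = asyncio.ensure_future(stop.wait())
    try:
        pending = set(tasks)
        while pending and not stop.is_set():
            done, _ = await asyncio.wait(
                pending | {stopper}, return_when=asyncio.FIRST_COMPLETED)
            pending -= done
        # Terminate everything still outstanding before exiting.
        for task in pending:
            task.cancel()
        results = await asyncio.gather(*tasks, return_exceptions=True)
    finally:
        stopper.cancel()
        for signo in installed:
            loop.remove_signal_handler(signo)
    # Keep only the tests that ran to completion.
    return [r for r in results if not isinstance(r, BaseException)]
```

For example, `asyncio.run(run_all([("fast", 0.01), ("slow", 5)],
cancel_after=0.2))` returns only the test that finished before the
simulated interruption.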
The UnitTest class juggles the name 'args' quite a bit to
construct the command line for a unit test, so let's keep
the harness command line arguments and the unit test command line
arguments apart by consistently calling the harness command line
arguments 'options', and unit test command line arguments 'args'.
Rename usage() to parse_cmd_line().
Create unique UnitTest objects in find_tests() for each found match,
including repeat, to ensure each test has its own unique id.
This will also be used to store execution state in the test.
It somewhat stands in the way of using asyncio.
This patch also implements a more comprehensive
fix for #5303, since we not only have --repeat, but
run some tests in different configurations, in which
case xml output is also overwritten.
Currently, we overwrite the same XML output file for each test repeat
cycle. This can cause invalid XML to be generated if the XML contents
don't match exactly for every iteration.
Fix the problem by appending the test repeat cycle in the XML filename
as follows:
$ ./test.py --repeat 3 --name vint_serialization_test --mode dev --jenkins jenkins_test
$ ls -1 *.xml
jenkins_test.release.vint_serialization_test.0.boost.xml
jenkins_test.release.vint_serialization_test.1.boost.xml
jenkins_test.release.vint_serialization_test.2.boost.xml
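As a sketch of the naming scheme seen in the listing above (the helper
name `xml_output_path` and its parameters are hypothetical, not taken
from test.py), the repeat cycle becomes one more component of the file
name:

```python
def xml_output_path(jenkins_prefix, mode, test_name, repeat_id, kind="boost"):
    """Build a per-repeat XML file name so that cycles never collide."""
    return "{}.{}.{}.{}.{}.xml".format(
        jenkins_prefix, mode, test_name, repeat_id, kind)


# One distinct file per repeat cycle.
paths = [xml_output_path("jenkins_test", "release",
                         "vint_serialization_test", i) for i in range(3)]
```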
Fixes #5303.
Message-Id: <20191119092048.16419-1-penberg@scylladb.com>
"
This patch series adds only UDF support, UDA will be in the next patch series.
With this all CQL types are mapped to Lua. Right now we set up a new
Lua state and copy the values for each argument and return value. This
will be optimized once profiled.
We require --experimental to enable UDF in case there is some change
to the table format.
"
* 'espindola/udf-only-v4' of https://github.com/espindola/scylla: (65 commits)
Lua: Document the conversions between Lua and CQL
Lua: Implement decimal subtraction
Lua: Implement decimal addition
Lua: Implement support for returning decimal
Lua: Implement decimal to string conversion
Lua: Implement decimal to floating point conversion
Lua: Implement support for decimal arguments
Lua: Implement support for returning varint
Lua: Implement support for returning duration
Lua: Implement support for duration arguments
Lua: Implement support for returning inet
Lua: Implement support for inet arguments
Lua: Implement support for returning time
Lua: Implement support for time arguments
Lua: Implement support for returning timeuuid
Lua: Implement support for returning uuid
Lua: Implement support for uuid and timeuuid arguments
Lua: Implement support for returning date
Lua: Implement support for date arguments
Lua: Implement support for returning timestamp
...
Add mode_list rule to ninja build and use it by default when searching
for tests in test.py.
Now it is no longer necessary to explicitly specify the test mode when
invoking test.py.
(cherry picked from commit a211ff30c7f2de12166d8f6f10d259207b462d4b)
With this it is possible to create user defined functions and
aggregates and they are saved to disk and the schema change is
propagated.
It is just not possible to call them yet.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
At the moment, this test only checks that table
creation and alteration sets cdc_options property
on a table correctly.
Future patches will extend this test to cover more
CDC aspects.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
If the user specifies an output file name using "--xunit=<filename>",
test.py will write the test results of non-boost tests to the file in the XUnit XML format.
Every boost test creates its own results file already.
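A minimal sketch of producing such a report with the standard library,
assuming results arrive as (name, success, output) tuples. The function
names and the suite name are hypothetical, and the exact elements that
test.py emits may differ.

```python
import xml.etree.ElementTree as ET


def build_xunit_report(results):
    """Serialize (name, success, output) tuples as an XUnit-style suite."""
    failures = sum(1 for _, ok, _ in results if not ok)
    suite = ET.Element("testsuite", name="non-boost tests",
                       tests=str(len(results)), failures=str(failures))
    for name, ok, output in results:
        case = ET.SubElement(suite, "testcase", name=name)
        if not ok:
            # Failed tests carry their captured output in a <failure> node.
            failure = ET.SubElement(case, "failure", message="test failed")
            failure.text = output
    return ET.tostring(suite, encoding="utf-8")


def write_xunit_report(results, path):
    # Write the serialized report to the file given via --xunit.
    with open(path, "wb") as f:
        f.write(build_xunit_report(results))
```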
Resolves #4680.
Signed-off-by: Kamil Braun <kbraun@scylladb.com>
This adds a '--repeat N' command line option to test.py, which can be
used to execute the tests N times. This is useful for finding flaky
tests, for example.
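A sketch of how such an option could be wired up with argparse. The
helper `expand_runs` and the default of 1 are assumptions made for
illustration, not the actual test.py code.

```python
import argparse


def parse_cmd_line(argv):
    parser = argparse.ArgumentParser(description="hypothetical test runner")
    parser.add_argument("--repeat", type=int, default=1, metavar="N",
                        help="number of times to execute the tests")
    return parser.parse_args(argv)


def expand_runs(test_names, repeat):
    # Each (name, cycle) pair is one independent execution of a test.
    return [(name, cycle) for name in test_names for cycle in range(repeat)]


options = parse_cmd_line(["--repeat", "3"])
runs = expand_runs(["vint_serialization_test"], options.repeat)
```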
Message-Id: <20190710092115.15960-1-penberg@scylladb.com>
Currently there is a single mutation_writer: `multishard_writer`,
however in the next patch we are going to add another one. This is the
right moment to move these into a common namespace (and folder), as we
have way too much stuff scattered already in the top-level namespace
(and folder).
Also rename `tests/multishard_writer_test.cc` to
`tests/mutation_writer_test.cc`; this test suite will be the home of all
the different mutation writers' unit test cases.
Tests without custom flags were already being run with -m2G. Tests
with custom flags have to manually specify it, but some were missing
it. This could cause tests to fail with std::bad_alloc when two
concurrent tests tried to allocate all the memory.
This patch adds -m2G to all tests that were missing it.
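The rule can be sketched as a helper that appends the limit only when a
test's custom flags do not already set one. The helper name is
hypothetical, and the check assumes the limit is passed as `-m<size>` or
`--memory`; the patch itself simply adds the flag to the affected tests.

```python
def with_memory_limit(custom_args, limit="2G"):
    """Return custom_args, adding -m<limit> if no memory flag is present."""
    has_limit = any(a.startswith("-m") or a.startswith("--memory")
                    for a in custom_args)
    if has_limit:
        return list(custom_args)
    return list(custom_args) + ["-m" + limit]
```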
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190620002921.101481-1-espindola@scylladb.com>
Running tests in debug mode takes 25:22.08 on my machine. Using
sanitize instead takes that down to 10:46.39.
The mode is opt-in, in that it must be explicitly selected with
"configure.py --mode=sanitize" or "ninja sanitize". It must also be
explicitly passed to test.py.
Unfortunately building with asan, optimizations and debug info is
very slow and there is nothing like -gline-tables-only in gcc.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190617170007.44117-1-espindola@scylladb.com>
Currently, we only account for memory when deciding how many unit tests to run
concurrently. This can cause CPU overcommit when running test.py on machines with
a lot of memory but few cores.
This overcommit can cause timeouts in tests that are time-sensitive (bad practice,
but can happen) and makes the desktop sluggish.
Improve by allocating at least one logical core per running test.
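The resulting limit can be sketched as the smaller of the memory-based
and the CPU-based job counts. The function name and the 2 GiB per-test
figure are assumptions made for illustration.

```python
def default_num_jobs(total_memory_bytes, cpu_count,
                     memory_per_test=2 * 1024**3):
    """Pick a job count that overcommits neither memory nor logical cores."""
    by_memory = total_memory_bytes // memory_per_test
    # At least one job, at most one running test per logical core.
    return max(1, min(by_memory, cpu_count))
```

With 64 GiB of memory and 4 logical cores, the memory-only rule would
allow 32 concurrent tests, while this sketch caps the count at 4.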
Reviewed-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190227132516.22147-1-avi@scylladb.com>
Allow the --mode argument to ./configure.py and ./test.py to be repeated. This
is to allow continuous integration to configure only debug and release, leaving dev
to developers.
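A repeatable option maps naturally onto argparse's `append` action. The
sketch below is illustrative only: the mode list and the fallback to all
modes when no --mode is given are assumptions.

```python
import argparse

ALL_MODES = ["debug", "release", "dev"]  # assumed set of build modes


def parse_modes(argv):
    parser = argparse.ArgumentParser()
    # action="append" collects every occurrence of --mode into a list.
    parser.add_argument("--mode", action="append", choices=ALL_MODES,
                        dest="modes")
    options = parser.parse_args(argv)
    # Assumption: with no --mode given, fall back to every mode.
    return options.modes or ALL_MODES
```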
Message-Id: <20190214162736.16443-1-avi@scylladb.com>
Each `*_test.cc` file must be compiled separately so that there is only
one definition of `main`.
This change correctly defines an independent `sstable_datafile_test`
from `sstable_datafile_test.cc` and adds that test to the existing
suite.
In c++17 there are standard ways of requesting aligned memory, so
seastar doesn't need to provide one.
This patch is in preparation for removing with_alignment from seastar.
Tests: unit (debug)
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190107191019.22295-1-espindola@scylladb.com>
The current timeout is way too small for debug builds. Currently
jenkins runs avoid the problem by increasing the timeout by 100x. This
patch increases it by 10x, which seems to be sufficient to run the
tests on most desktop machines.
Message-Id: <20190107191413.22531-1-espindola@scylladb.com>
Implementation of the nodetool toppartitions query, which samples the most frequent PKs in
read/write operations over a period of time.
Content:
- data_listener classes: mechanism that interfaces with mutation readers in database and table classes,
- toppartition_query and toppartition_data_listener classes to implement toppartition-specific query (this
interfaces with data_listeners and the REST api),
- REST api for toppartitions query.
Uses Top-k structure for handling stream summary statistics (based on implementation in C*, see #2811).
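For readers unfamiliar with stream summaries, here is a minimal Python
sketch of a Space-Saving style counter. It assumes the Top-k structure
follows that general approach, which this text does not confirm; the
class and method names are hypothetical and this is not the top_k code
from the series.

```python
class SpaceSaving:
    """Tracks approximate counts for at most `capacity` distinct items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counters = {}  # item -> (count, max_overestimation_error)

    def append(self, item, inc=1):
        if item in self.counters:
            count, error = self.counters[item]
            self.counters[item] = (count + inc, error)
        elif len(self.counters) < self.capacity:
            self.counters[item] = (inc, 0)
        else:
            # Evict the smallest counter; the newcomer inherits its count
            # as the upper bound on how much it may be overestimated.
            victim = min(self.counters, key=lambda k: self.counters[k][0])
            min_count, _ = self.counters.pop(victim)
            self.counters[item] = (min_count + inc, min_count)

    def top(self, k):
        ranked = sorted(self.counters.items(),
                        key=lambda kv: kv[1][0], reverse=True)
        return [(item, count, error) for item, (count, error) in ranked[:k]]
```

Memory use stays bounded by `capacity` regardless of how many distinct
partition keys are sampled, at the cost of approximate counts for items
that entered by eviction.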
What's still missing:
- JMX interface to nodetool (interface customization may be required),
- Querying #rows and #bytes (currently, only #partitions is supported).
Fixes #2811
* https://github.com/avikivity/scylla rafie_toppartitions_v7.1:
top_k: whitespace and minor fixes
top_k: map template arguments
top_k: std::list -> chunked_vector
top_k: support for appending top_k results
nodetool toppartitions: refactor table::config constructor
nodetool toppartitions: data listeners
nodetool toppartitions: add data_listeners to database/table
nodetool toppartitions: fully_qualified_cf_name
nodetool toppartitions: Toppartitions query implementation
nodetool toppartitions: Toppartitions query REST API
nodetool toppartitions: nodetool-toppartitions script
Add data_listeners member to database.
Adds data_listeners* to table::config, to be used by table methods to invoke listeners.
Install on_read() listener in table::make_reader().
Install on_write() listener in database::apply_in_memory().
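The listener mechanism can be pictured with a small Python sketch. This
is a conceptual illustration only: the class names, method signatures
and the counting listener are hypothetical and do not mirror the C++
interfaces.

```python
from collections import Counter


class DataListener:
    """Base class; subclasses override the hooks they care about."""

    def on_read(self, table, key):
        pass

    def on_write(self, table, key):
        pass


class DataListeners:
    """Registry that the read and write paths notify."""

    def __init__(self):
        self._listeners = []

    def install(self, listener):
        self._listeners.append(listener)

    def uninstall(self, listener):
        self._listeners.remove(listener)

    def on_read(self, table, key):
        for listener in self._listeners:
            listener.on_read(table, key)

    def on_write(self, table, key):
        for listener in self._listeners:
            listener.on_write(table, key)


class CountingListener(DataListener):
    """Hypothetical listener that counts accesses to one table."""

    def __init__(self, table):
        self.table = table
        self.reads = Counter()
        self.writes = Counter()

    def on_read(self, table, key):
        if table == self.table:
            self.reads[key] += 1

    def on_write(self, table, key):
        if table == self.table:
            self.writes[key] += 1
```

A query would install its listener for the sampling period, let the
read and write paths call `on_read()` and `on_write()`, and uninstall it
when done.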
Tests: Unit (release)
Signed-off-by: Rafi Einstein <rafie@scylladb.com>
"
This series optimises the read path by replacing some usages of
std::vector with utils::small_vector. The motivation for this change was
an observation that memory allocation functions are pointed out by the
profiler as the ones where we spend the most time and, while they have a
large number of callers, storage allocation for some vectors was close to
the top. The gains are not huge, since the problem is a lot of things
adding up and not a single slow thing, but we need to start with
something.
Unfortunately, the performance of boost::container::small_vector is
quite disappointing so a new implementation of a small_vector was
introduced.
perf_simple_query -c4 --duration 60, medians:
./perf_before ./perf_after diff
read 343086.80 360720.53 5.1%
Tests: unit(release, small_vector in debug)
"
* tag 'small_vector/v2.1' of https://github.com/pdziepak/scylla:
partition_slice: use small_vector for column_ids
mutation_fragment_merger: use small_vector
auth: use small_vector in resource
auth: avoid list-initialisation of vectors
idl: serialiser: add serialiser for utils::small_vector
idl: serialiser: deduplicate vector serialisers
utils: introduce small_vector
intrusive_set_external_comparator: make iterator nothrow move constructible
mutation_fragment_merger: value-initialise iterator