48 Commits

Author SHA1 Message Date
Botond Dénes
272da51f80 test/lib/mutation_source_test: remove upgrade_to_v2 tests
We don't have any upgrade_to_v2() left in production code, so no need to
keep testing it. Removing it from this test paves the way for removing
it for good (not in this series).
2022-04-28 14:12:24 +03:00
Benny Halevy
b3e2bbe5bd test: random_mutation_generator: make more interesting range tombstones
Include also singular prefix and semi-bounded range tombstones.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-04-04 17:34:49 +03:00
Botond Dénes
4243cd395d test: export squash_mutations() into lib/mutation_source_test.hh
This method used to be a static one in
boost/flat_mutation_reader_test.cc. Turns out it is useful for other
tests based on the mutation source test suite, so move it into the
header of the latter to make it accessible.
2022-03-17 08:08:01 +02:00
Mikołaj Sielużycki
7ce0d380d4 readers: Update tests to use make_queue_reader_v2.
Closes #10220
2022-03-15 13:56:50 +02:00
Mikołaj Sielużycki
1d84a254c0 flat_mutation_reader: Split readers by file and remove unnecessary includes.
The flat_mutation_reader files were conflated and contained multiple
readers, which were not strictly necessary. Splitting optimizes both
iterative compilation times, as touching rarely used readers doesn't
recompile large chunks of codebase. Total compilation times are also
improved, as the sizes of flat_mutation_reader.hh and
flat_mutation_reader_v2.hh have been reduced and those files are
included by many files in the codebase.

With changes

real	29m14.051s
user	168m39.071s
sys	5m13.443s

Without changes

real	30m36.203s
user	175m43.354s
sys	5m26.376s

Closes #10194
2022-03-14 13:20:25 +02:00
Benny Halevy
26b1be0b8f test: lib: random_mutation_generator: accept optional random seed
Provide an easy way to instrument a particular test case to use
a given random number seed (that's currently already printed to
the test log).

Refs #5349

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210907114537.3464004-1-bhalevy@scylladb.com>
2022-03-14 13:09:36 +02:00
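The optional-seed idea from commit 26b1be0b8f can be illustrated with a small sketch. This is not the Scylla `random_mutation_generator`; `pick_seed` and `toy_generator` are hypothetical names, and the only assumption is that a generator seeded with a known value replays the same sequence.

```cpp
#include <cstdint>
#include <optional>
#include <random>

// Hypothetical helper (not the Scylla API): use the caller's seed if one
// was given, otherwise draw a fresh one from std::random_device.
inline std::uint32_t pick_seed(std::optional<std::uint32_t> seed = std::nullopt) {
    return seed ? *seed : std::random_device{}();
}

// Hypothetical generator: remembers its seed so a test can log it and a
// later run can pass the same value back in to reproduce the sequence.
class toy_generator {
    std::uint32_t _seed;   // declared before _engine so it is initialized first
    std::mt19937 _engine;
public:
    explicit toy_generator(std::optional<std::uint32_t> seed = std::nullopt)
        : _seed(pick_seed(seed)), _engine(_seed) {}

    std::uint32_t seed() const { return _seed; }
    std::uint32_t next() { return static_cast<std::uint32_t>(_engine()); }
};
```

A failing test would log `seed()` and a follow-up run would construct `toy_generator(logged_seed)` to get the same values.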
Botond Dénes
6544da342a test/lib/mutation_source_test: log name of each run_mutation_source()
Although we have a log in run_mutation_reader_tests(), it is useful to
know where it was called from, when trying to find the test scenario
that failed.
2022-03-10 06:46:46 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Asias He
a8ad385ecd repair: Get rid of the gc_grace_seconds
The gc_grace_seconds is a very fragile and broken design inherited from
Cassandra. Deleted data can be resurrected if cluster wide repair is not
performed within gc_grace_seconds. This design pushes the job of keeping
the database consistent to the user. In practice, it is very hard to
guarantee repair is performed within gc_grace_seconds all the time. For
example, repair workload has the lowest priority in the system which can
be slowed down by the higher priority workload, so that there is no
guarantee when a repair can finish. A gc_grace_seconds value that
used to work might not work after data volume grows in a cluster. Users
might want to avoid running repair during a specific period where
latency is the top priority for their business.

To solve this problem, an automatic mechanism to protect against data
resurrection is proposed and implemented. The main idea is to remove the
tombstone only after the range that covers the tombstone is repaired.

In this patch, a new table option tombstone_gc is added. The option is
used to configure tombstone gc mode. For example:

1) GC a tombstone after gc_grace_seconds

cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ;

This is the default mode. If no tombstone_gc option is specified by the
user, the old gc_grace_seconds based gc will be used.

2) Never GC a tombstone

cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'};

3) GC a tombstone immediately

cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'};

4) GC a tombstone after repair

cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'};

In addition to the 'mode' option, another option 'propagation_delay_in_seconds'
is added. It defines the max time a write could possibly be delayed before it
eventually arrives at a node.

A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc
option can only be used after the whole cluster supports the new
feature. A mixed cluster works with no problem.

Tests: compaction_test.py, ninja test

Fixes #3560

[avi: resolve conflicts vs data_dictionary]
2022-01-04 19:48:14 +02:00
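The four tombstone_gc modes described in commit a8ad385ecd can be sketched as a single decision function. This is a simplified model, not the Scylla implementation: `gc_inputs` and `can_gc` are hypothetical names, and the way the propagation delay is subtracted from the last repair time in `repair` mode is an assumption made for illustration.

```cpp
#include <chrono>
#include <optional>

using gc_clock = std::chrono::system_clock;

enum class tombstone_gc_mode { timeout, disabled, immediate, repair };

// Hypothetical bundle of the per-table settings and repair state.
struct gc_inputs {
    tombstone_gc_mode mode;
    std::chrono::seconds gc_grace_seconds;
    std::chrono::seconds propagation_delay;             // 'propagation_delay_in_seconds'
    std::optional<gc_clock::time_point> last_repair;    // empty if the range was never repaired
};

// Returns true if a tombstone with the given deletion time may be purged.
inline bool can_gc(const gc_inputs& in,
                   gc_clock::time_point deletion_time,
                   gc_clock::time_point now) {
    switch (in.mode) {
    case tombstone_gc_mode::timeout:
        // Classic behaviour: purge once gc_grace_seconds have elapsed.
        return deletion_time + in.gc_grace_seconds <= now;
    case tombstone_gc_mode::disabled:
        return false;
    case tombstone_gc_mode::immediate:
        return true;
    case tombstone_gc_mode::repair:
        // Assumption: only tombstones older than (last repair - delay) are
        // covered by that repair; an unrepaired range is never purged.
        return in.last_repair.has_value()
            && deletion_time < *in.last_repair - in.propagation_delay;
    }
    return false;
}
```

In this model, `repair` mode never purges a tombstone in a range that has not been repaired, which is the property the commit message relies on to prevent data resurrection.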
Avi Kivity
9e74556413 Merge 'Support reverse reads in the row cache natively' from Tomasz Grabiec
This change makes row cache support reverse reads natively so that reversing wrappers are not needed when reading from cache and thus the read can be executed efficiently, with similar cost as the forward-order read.

The database is serving reverse reads from cache by default after this. Before, it was bypassing cache by default after 703aed3277.

Refs: #1413

Tests:

  - unit [dev]
  - manual query with build/dev/scylla and cache tracing on

Closes #9454

* github.com:scylladb/scylla:
  tests: row_cache: Extend test_concurrent_reads_and_eviction to run reverse queries
  row_cache: partition_snapshot_row_cursor: Print more details about the current version vector
  row_cache: Improve trace-level logging
  config: Use cache for reversed reads by default
  config: Adjust reversed_reads_auto_bypass_cache description
  row_cache: Support reverse reads natively
  mvcc: partition_snapshot: Support slicing range tombstones in reverse
  test: flat_mutation_reader_assertions: Consume expected range tombstones before end_of_partition
  row_cache: Log produced range tombstones
  test: Make produces_range_tombstone() report ck_ranges
  tests: lib: random_mutation_generator: Extract make_random_range_tombstone()
  partition_snapshot_row_cursor: Support reverse iteration
  utils: immutable-collection: Make movable
  intrusive_btree: Make default-initialized iterator cast to false
2021-12-29 16:53:25 +02:00
Tomasz Grabiec
26ed0081a4 tests: lib: random_mutation_generator: Extract make_random_range_tombstone() 2021-12-19 22:41:35 +01:00
Botond Dénes
20e45987b5 test/lib/mutation_source_test: don't force v1 reader in reverse run
Currently in the reverse run we wrap the test-provided mutation-source
and create a v1 reader with it, forcing a conversion if the
mutation-source has a v2 factory. Worse still, if the test is v2 native,
there will be a double conversion. This patch fixes this by creating a
wrapper mutation-source appropriate to the version of the underlying
factory of the wrapped mutation-source.
2021-12-10 15:48:49 +02:00
Tomasz Grabiec
3226c5bf9d Merge 'sstables: mx: enable position fast-forwarding in reverse mode' from Kamil Braun
Most of the machinery was already implemented since it was used when
jumping between clustering ranges of a query slice. We need only perform
one additional thing when performing an index skip during
fast-forwarding: reset the stored range tombstone in the consumer (which
may only be stored in fast-forwarding mode, so it didn't matter that it
wasn't reset earlier). Comments were added to explain the details.

As a preparation for the change, we extend the sstable reversing reader
random schema test with a fast-forwarding test and include some minor
fixes.

Fixes #9427.

Closes #9484

* github.com:scylladb/scylla:
  query-request: add comment about clustering ranges with non-full prefix key bounds
  sstables: mx: enable position fast-forwarding in reverse mode
  test: sstable_conforms_to_mutation_source_test: extend `test_sstable_reversing_reader_random_schema` with fast-forwarding
  test: sstable_conforms_to_mutation_source_test: fix `vector::erase` call
  test: mutation_source_test: extract `forwardable_reader_to_mutation` function
  test: random_schema: fix clustering column printing in `random_schema::cql`
2021-11-29 16:01:53 +01:00
Mikołaj Sielużycki
44f4ea38c5 test: Future-proof reader conversions tests.
Query time must be fetched after populate. If compaction is executed
during populate it may be executed with timestamp later than query_time.
This would cause the test's expected compaction and compaction during
populate to be executed at different time points producing different
results. The result would be sporadic test failures depending on relative
timing of those operations. If no other mutations happen after populate,
and query_time is later than the compaction time during population, we're
guaranteed to have the same results.
Message-Id: <20211123134808.105068-1-mikolaj.sieluzycki@scylladb.com>
2021-11-24 21:01:57 +01:00
Kamil Braun
3abcbf6875 test: mutation_source_test: extract forwardable_reader_to_mutation function
The function shall be used in other places as well.
2021-11-15 17:32:17 +01:00
Michael Livshin
4941e2ec41 tests: fix range tombstone checking and deal with the fallout
flat_reader_assertions::produces_range_tombstone() does not actually
check range tombstones beyond the fact that they are in fact range
tombstones (unless non-empty ck_ranges is passed).

Fixing the immediate problem reveals that:

* The assertion logic is not flexible enough to deal with
  creatively-split or creatively-overlapping range tombstones.

* Some existing tests involving range tombstones are in fact wrong:
  some assertions may (at least with some readers) refer to wrong
  tombstones entirely, while others assert wrong things about right
  tombstones.

* Range tombstones in pre-made sstables (such as those read by
  sstable_3_x_test) have deletion time drift, and that now has to be
  somehow dealt with.

This patch (which is not split into smaller ones because that would
either generate an unreasonable amount of work towards ensuring
bisectability or entail "temporarily" disabling problematic tests,
which is cheating) contains the following changes:

* flat_reader_assertions check range tombstones more carefully, by
  accumulating both expected and actually-read range tombstones into
  lists and comparing those lists when a partition ends (or when the
  assertion object is destroyed).

* flat_reader_assertions::may_produce_tombstones() can take
  constraining ck_ranges.

* Both flat_reader_assertions and flat_reader_assertions_v2 can be
  instructed to ignore tombstone deletion times, to help with tests that
  read pre-made sstables.

* Affected tests are changed to reflect reality.  Most changes to
  tests make sense; the only one I am not completely sure about is in
  test_uncompressed_filtering_and_forwarding_range_tombstones_read.

Fixes #9470

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2021-11-08 00:56:39 +02:00
Botond Dénes
6a76e12768 mutation_partition: row: make row marker shadowing symmetric
Currently row marker shadowing the shadowable tombstone is only checked
in `apply(row_marker)`. This means that shadowing will only be checked
if the shadowable tombstone and row marker are set in the correct order.
This at the very least can cause flakiness in tests when a mutation
produced just the right way has a shadowable tombstone that can be
eliminated when the mutation is reconstructed in a different way,
leading to artificial differences when comparing those mutations.

This patch fixes this by checking shadowing in
`apply(shadowable_tombstone)` too, making the shadowing check symmetric.

There is still one vulnerability left: `row_marker& row_marker()`, which
allows overwriting the marker without triggering the corresponding
checks. We cannot remove this overload as it is used by compaction so we
just add a comment to it warning that `maybe_shadow()` has to be manually
invoked if it is used to mutate the marker (compaction takes care of
that). A caller which didn't do the manual check is
mutation_source_test: this patch updates it to use `apply(row_marker)`
instead.

Fixes: #9483

Tests: unit(dev)

Closes #9519
2021-10-26 20:40:31 +02:00
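The order-dependence fixed by commit 6a76e12768 can be shown with a toy model. `toy_row` below is hypothetical and much simpler than Scylla's row types: it assumes a shadowable tombstone is simply dropped when the row marker has a newer timestamp. The point is only that running the check in both apply paths makes the result independent of application order.

```cpp
#include <cstdint>
#include <optional>

using timestamp_t = std::int64_t;

// Toy model (not Scylla's deletable_row): a row marker and a shadowable
// tombstone, each reduced to a timestamp.
class toy_row {
    std::optional<timestamp_t> _marker;
    std::optional<timestamp_t> _shadowable_tombstone;

    // Assumed rule: a marker newer than the shadowable tombstone shadows it.
    void maybe_shadow() {
        if (_marker && _shadowable_tombstone && *_marker > *_shadowable_tombstone) {
            _shadowable_tombstone.reset();
        }
    }
public:
    void apply_marker(timestamp_t ts) {
        if (!_marker || ts > *_marker) {
            _marker = ts;
        }
        maybe_shadow();
    }

    void apply_shadowable_tombstone(timestamp_t ts) {
        if (!_shadowable_tombstone || ts > *_shadowable_tombstone) {
            _shadowable_tombstone = ts;
        }
        maybe_shadow();  // the symmetric check: without it, order would matter
    }

    bool has_shadowable_tombstone() const { return _shadowable_tombstone.has_value(); }

    friend bool operator==(const toy_row& a, const toy_row& b) {
        return a._marker == b._marker
            && a._shadowable_tombstone == b._shadowable_tombstone;
    }
};
```

Applying the tombstone and the marker in either order now yields equal rows, which is what mutation comparisons in tests depend on.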
Pavel Emelyanov
1e09a2c925 test: Split run_mutation_source_tests
There are 4 flavours of mutation source tests that are all run
sequentially -- plain, reversed and upgrade/downgrade ones that
check v1<->v2 conversions.

This patch splits them all into individual calls so that some
tests may want to have dedicated cases for each. "By default" they
are all run as they were.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-10-05 11:51:43 +03:00
Botond Dénes
bc49c27a06 test/lib: mutation_source_test: test reading in reverse
To ensure all mutation sources uniformly support the current API of
reverse reading: reversed schema and half-reversed slice. This test will
also ensure that once we switch to native-reverse slice, all
mutation-sources will keep on working.
2021-09-29 12:15:48 +03:00
Botond Dénes
c71a281e6b test/lib/mutation_source_test: add consistent log to all methods
Most test methods log their own name either via testlog.info() or
BOOST_TEST_MESSAGE() so failures can be more easily located. Not all do
however. This commit fixes this and also converts all those using
BOOST_TEST_MESSAGE() for this to testlog.info(), for consistency.
2021-09-09 15:42:15 +03:00
Botond Dénes
74a22a706b mutation_rebuilder: make it standalone
Not requiring a wrapper object to become usable.
2021-09-09 15:42:15 +03:00
Benny Halevy
4476800493 flat_mutation_reader: get rid of timeout parameter
Now that the timeout is taken from the reader_permit.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 16:30:51 +03:00
Juliusz Stasiewicz
38b8a6ce2c test/boost: run_mutation_source_tests on streaming virtual table
Tests that require inter-partition forwarding are excluded.
2021-07-20 14:19:17 +02:00
Michał Radwański
67d99e02a7 flat_mutation_reader: downgrade_to_v1 - reset state of rt_assembler
The downgrade_to_v1 didn't reset the state of range tombstone assembler
in case of the calls to next_partition or fast_forward_to, which caused
a situation where the closing range tombstone change is cleared from the
buffer before being emitted, without notifying the assembler. This patch
fixes the behaviour in fast_forward_to as well.

Fixes #9022
2021-07-19 15:54:26 +02:00
Botond Dénes
0e78399051 test/lib: migrate off the global test reader semaphore 2021-07-08 15:28:39 +03:00
Tomasz Grabiec
3fcd1f43ba tests: mutation_source_test: Run tests with conversions inserted in the middle 2021-06-16 00:23:49 +02:00
Tomasz Grabiec
cddcba27de tests: mutation_source_tests: Unroll run_flat_mutation_reader_tests()
All readers are now flat so there is no need for this grouping.

Will be needed for the next patch, which needs a single function with
all test cases.
2021-06-16 00:23:49 +02:00
Tomasz Grabiec
ffb616fef6 tests: Add tests for flat_mutation_reader_v2 2021-06-16 00:23:49 +02:00
Pavel Solodovnikov
76bea23174 treewide: reduce header interdependencies
Use forward declarations wherever possible.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>

Closes #8813
2021-06-07 15:58:35 +03:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Benny Halevy
5dce9997ff test/lib: mutation_source_test: close readers
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Kamil Braun
7ffb0d826b clustering_order_reader_merger: handle empty readers
The merger could return end-of-stream if some (but not all) of the
underlying readers were empty (i.e. not even returning a
`partition_start`). This could happen in places where it was used
(`time_series_sstable_set::create_single_key_sstable_reader`) if we
opened an sstable which did not have the queried partition but passed
all the filters (specifically, the bloom filter returned a false
positive for this sstable).

The commit also extends the random tests for the merger to include empty
readers and adds an explicit test case that catches this bug (in a
limited scope: when we merge a single empty reader).

It also modifies `test_twcs_single_key_reader_filtering` (regression
test for #8432) because the time where the clustering key filter is
invoked changes (some invocations move from the constructor of the
merger to operator()). I checked manually that it still catches the bug
when I reintroduce it.

Fixes #8445.

Closes #8446
2021-04-12 10:34:52 +03:00
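The bug class addressed by commit 7ffb0d826b can be illustrated with a toy k-way merge. This is not `clustering_order_reader_merger`; `toy_merger` is a hypothetical stand-in where each "reader" is a sorted vector of integers. The sketch shows the intended behaviour: an empty input is skipped, and end-of-stream is reported only once every input is exhausted.

```cpp
#include <cstddef>
#include <optional>
#include <queue>
#include <vector>

// Toy merger over sorted vectors. The vectors passed to the constructor
// must outlive the merger, since only pointers to them are stored.
class toy_merger {
    struct cursor {
        const std::vector<int>* src;
        std::size_t pos;
    };
    // Orders the heap so that the cursor with the smallest current value is on top.
    struct later {
        bool operator()(const cursor& a, const cursor& b) const {
            return (*a.src)[a.pos] > (*b.src)[b.pos];
        }
    };
    std::priority_queue<cursor, std::vector<cursor>, later> _heap;
public:
    explicit toy_merger(const std::vector<std::vector<int>>& readers) {
        for (const auto& r : readers) {
            // An empty reader contributes nothing; it must not end the stream.
            if (!r.empty()) {
                _heap.push({&r, 0});
            }
        }
    }

    // Returns the next value in order, or std::nullopt at end-of-stream.
    std::optional<int> next() {
        if (_heap.empty()) {
            return std::nullopt;
        }
        cursor c = _heap.top();
        _heap.pop();
        const int value = (*c.src)[c.pos];
        if (++c.pos < c.src->size()) {
            _heap.push(c);
        }
        return value;
    }
};
```

Mixing empty and non-empty inputs, as the extended random tests in the commit do, is the case worth exercising.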
Konstantin Osipov
c83cf1f965 uuid: switch the API to use std::chrono
A follow up for the patch for #7611. This change was requested
during review and moved out of #7611 to reduce its scope.

The patch switches UUID_gen API from using plain integers to
hold time units to units from std::chrono.

For one, we plan to switch the entire code base to std::chrono units,
to ensure type safety. Secondly, using std::chrono units allows us to
increase code reuse with template metaprogramming and remove a few
UUID_gen functions that became redundant as a result.

* switch get_time_UUID(), unix_timestamp(), get_time_UUID_raw(),
  min_time_UUID(), max_time_UUID(), create_time_safe() to
  std::chrono
* remove unused variant of from_unix_timestamp()
* remove unused get_time_UUID_bytes(), create_time_unsafe(),
  redundant get_adjusted_timestamp()
* inline get_raw_UUID_bytes()
* collapse two similar implementations of get_time_UUID()
* switch internal constants to std::chrono
* remove unnecessary unique_ptr from UUID_gen::_instance
Message-Id: <20210406130152.3237914-2-kostja@scylladb.com>
2021-04-06 17:12:54 +03:00
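The type-safety argument in commit c83cf1f965 can be sketched as follows. The function below is a hypothetical stand-in, not the real UUID_gen signature; it assumes only the standard time-UUID convention of counting 100 ns intervals since 1582-10-15. With std::chrono parameters the unit is part of the type, so the millisecond-to-100ns conversion is done by the library instead of by hand-written multipliers.

```cpp
#include <chrono>
#include <cstdint>
#include <ratio>

// 100-nanosecond ticks, the resolution of time-based UUID timestamps.
using decimicroseconds = std::chrono::duration<std::int64_t, std::ratio<1, 10'000'000>>;

// Distance from the UUID epoch (1582-10-15) to the Unix epoch (1970-01-01),
// expressed in 100 ns ticks.
constexpr decimicroseconds uuid_epoch_offset{122192928000000000LL};

// Hypothetical helper: converts time since the Unix epoch to a UUID timestamp.
// Callers may pass seconds or milliseconds; both convert losslessly.
constexpr decimicroseconds to_uuid_timestamp(std::chrono::milliseconds since_unix_epoch) {
    return uuid_epoch_offset + since_unix_epoch;
}
```

Passing a bare integer to `to_uuid_timestamp` does not compile, because `std::chrono::duration` has an explicit constructor from its representation type; that is the kind of mistake the plain-integer API could not catch.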
Pavel Emelyanov
1bdfa355ea row: Remove old storages
Now that the 3rd storage type (radix tree) is all in, old
storage can be safely removed.  The result is:

1. memory footprint

sizeof(class row):  112 => 16 bytes
sizeof(rows_entry): 126 => 120 bytes

the "in cache" value depends on the number of cells:

num of cells     master       patch
         1       752         656
         2       808         712
         3       864         768
         4       920         824
         5       968         936
         6      1136         992
         ...
         16     1840        1672
         17     1904        1992  (+88)
         18     1976        2048  (+72)
         19     2048        2104  (+56)
         20     2120        2160  (+40)
         21     2184        2208  (+24)
         22     2256        2264  ( +8)
         23     2328        2320
         ...
         32     2960        2808

After 32 cells the storage switches into rbtree with
24-bytes per-cell overhead and the radix tree improvement
rocketlaunches

           64     7872        6056
           128   15040        9512
           256   29376       18568

2. perf_mutation test is enhanced by this series and the
   results differ depending on the number of columns used

                    tps value
--column-count    master   patch
          1       59.9k    57.6k  (-3.8%)
          2       59.9k    57.5k
          4       59.8k    57.6k
          8       57.6k    57.7k  <- eq
         16       56.3k    57.6k
         32       53.2k    57.4k  (+7.9%)

A note on this. Last time the 1-column test was ~5% worse, which
was explained by the inline storage of 5 cells that's present in the
current implementation and was absent in the radix tree.

An attempt to make inline storage for small radix trees
resulted in complete loss of memory footprint gain, but gave
fraction of percent to perf_mutation performance. So this
version doesn't have inline nodes.

The 1.2% improvement from v2 surprisingly came from the
tree::clone_from() which in v2 was work-around-ed by slow
walk+emplace sequence while this version has the optimized
API call for cloning.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-02-15 20:35:06 +03:00
Benny Halevy
5e41228fe8 test: everywhere: use seastar::testing::local_random_engine
Use the thread_local seastar::testing::local_random_engine
in all seastar tests so they can be reproduced using
the --random-seed option.

Test: unit(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210112103713.578301-2-bhalevy@scylladb.com>
2021-01-13 11:07:29 +02:00
Avi Kivity
8fc0bbd487 test: lib: don't ignore future in compare_readers()
A fast_forward_to() call is not waited on in compare_readers(). Since
this is called in a thread, add a future::get() call to wait for it.
2020-12-07 16:50:20 +02:00
Kamil Braun
4f7e2bf920 mutation_reader_test: test clustering_order_reader_merger in memory 2020-11-30 11:55:44 +01:00
Piotr Jastrzebski
c001374636 codebase wide: replace count with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
`count` function was often used in various ways.

`contains` not only expresses the intent of the code better but also
does it in more unified way.

This commit replaces all the occurrences of the `count` with the
`contains`.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
2020-08-15 20:26:02 +03:00
Botond Dénes
1d48442ae7 test/lib/mutation_source_test: test-monotonic-positions: test the reader-under-test
Instead of always testing `flat_mutation_reader_from_mutations()`.

Tests: unit(dev, debug)

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200812073406.1681250-1-bdenes@scylladb.com>
2020-08-12 10:52:26 +03:00
Botond Dénes
d68ac8bf18 treewide: remove all uses of no_reader_permit() 2020-05-28 11:34:35 +03:00
Botond Dénes
1ab45e15a0 test: lib/mutation_source_test: make data compaction friendly
Currently the mutation source test suite may generate data that is
compactable. This poses a problem for the next patch, where we want to
use it to test `compacting_reader`, a reader which compacts data as it
reads it. When the input is compactable, this will introduce artificial
differences, failing the tests.
To allow also testing such readers, make sure data is not compactable,
i.e. compacting it will not change it.
The goal of the mutation source test suite is not to exercise compaction
logic, so this will not take anything away from its value.
2020-03-16 13:58:13 +02:00
Botond Dénes
c4fab16723 test: random_mutation_generator: add generate_uncompactable mode
The random mutation generator currently generates data and tombstones
with random timestamps selected from a pre-determined range. This
results in mutations where tombstones often cover each other and data.
There is nothing wrong with this, as this is how real data is too.
However for certain tests this is problematic, as compacting the
mutations will result in different mutations. To cater for these users
too, introduce a `generate_uncompactable` option. When set to `yes`, the
generated mutations will be uncompactable, i.e. no tombstone will cover
lower-level tombstones and no tombstone will cover data. The mutations
will not change after being compacted.
2020-03-16 13:58:13 +02:00
Tomasz Grabiec
d5557023f6 Merge "Stop using BOOST_TEST_MESSAGE() in unit tests" from Kostja
Stop using BOOST_TEST_MESSAGE() in unit tests, it bloats test XML
output. Use Scylla logger instead.

Test: unit (debug, dev, release)
2020-03-05 13:27:30 +01:00
Konstantin Osipov
ff3f9cb7cf test: stop using BOOST_TEST_MESSAGE() for logging
We use boost test logging primarily to generate nice XML xunit
files used in Jenkins. These XML files can be bloated
with messages from BOOST_TEST_MESSAGE(), hundreds of megabytes
of build archives, on every build.

Let's use seastar logger for test logging instead, reserving
the use of boost log facilities for boost test markup information.
2020-03-05 11:38:11 +03:00
Botond Dénes
8b908a9aba test: lib/mutation_source_test: log the name of the test-method
Most test-methods log a message with their names upon entering them.
This helps in identifying, in the logs, the test-method a failure
happened in. Two methods were missing this log line, so add it.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200304155235.46170-1-bdenes@scylladb.com>
2020-03-04 18:16:21 +02:00
Piotr Jastrzebski
ca4a89d239 dht: add dht::decorate_key
and replace all dht::global_partitioner().decorate_key
with dht::decorate_key

It is an improvement because dht::decorate_key takes schema
and uses it to obtain partitioner instead of using global
partitioner as it was before.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-17 10:59:06 +01:00
Botond Dénes
dfc8b2fc45 treewide: replace reader_resource_tracker with reader_permit
The former was never really more than a reader_permit with one
additional method. Currently using it doesn't even save one from any
includes. Now that readers will be using reader_permit we would have to
pass down both to mutation_source. Instead get rid of
reader_resource_tracker and just use reader_permit. Instead of making it
a last and optional parameter that is easy to ignore, make it a
first class parameter, right after schema, to signify that permits are
now a prominent part of the reader API.

This -- mostly mechanical -- patch essentially refactors mutation_source
to ask for the reader_permit instead of reader_resource_tracker and
updates all usage sites.
2020-01-28 08:13:16 +02:00
Konstantin Osipov
1c8736f998 tests: move all test source files to their new locations
1. Move tests to test (using singular seems to be a convention
   in the rest of the code base)
2. Move boost tests to test/boost, other
   (non-boost) unit tests to test/unit, tests which are
   expected to be run manually to test/manual.

Update configure.py and test.py with new paths to tests.
2019-12-16 17:47:42 +03:00