scylladb/tests/mutation_assertions.hh
Tomasz Grabiec 41ede08a1d mutation_reader: Allow range tombstones with same position in the fragment stream
When we get two range tombstones with the same lower bound from
different data sources (e.g. two sstables), which need to be combined
into a single stream, they need to be de-overlapped, because each
mutation fragment in the stream must have a distinct position. If we
have range tombstones [1, 10) and [1, 20), the result of that
de-overlapping will be [1, 10) and [10, 20). The problem is that if
the stream corresponds to a clustering slice with an upper bound
greater than 1 but lower than 10, the second range tombstone would
appear to be out of the query range. This violates assumptions
currently made by some consumers, such as the cache populator.

One effect of this is that a reader may miss rows in the range (1, 10)
(after the start of the first range tombstone and before the start of
the second one). This happens when the second range tombstone is the
last fragment read for a discontinuous range in cache, reading stops
at that point because the buffer is full, and the cache entry is
evicted before reading resumes, so we fall back to reading from the
sstable reader again. There could be more cases in which this
violation resurfaces.

There is also a related bug in mutation_fragment_merger. If the reader
is in forwarding mode and the current range is [1, 5], the reader
would still emit range_tombstone([10, 20)). If that reader is later
fast-forwarded to another range, say [6, 8], it may produce fragments
with smaller positions than ones already emitted, violating the
monotonicity of fragment positions in the stream.

A similar bug was also present in partition_snapshot_flat_reader.

Possible solutions:

 1) relax the assumption (in cache) that streams contain only relevant
 range tombstones, and only require that they contain at least all
 relevant tombstones

 2) allow subsequent range tombstones in a stream to share the same
 starting position (position is weakly monotonic), then we don't need
 to de-overlap the tombstones in readers.

 3) teach combining readers about query restrictions so that they can
 drop fragments which fall outside the range

 4) force leaf readers to trim all range tombstones to query restrictions

This patch implements solution no. 2. It simplifies combining readers,
which no longer need to accumulate and trim range tombstones.

I don't like solution 3, because it makes combining readers more
complicated, slower, and harder to construct properly (currently
combining readers don't need to know the restrictions of the leaf
streams).

Solution 4 is confined to implementations of leaf readers, but it also
has the disadvantage of making those more complicated and slower.

Fixes #3093.
2017-12-22 11:06:20 +01:00


/*
* Copyright (C) 2015 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once

#include "mutation.hh"

class mutation_partition_assertion {
    schema_ptr _schema;
    const mutation_partition& _m;
public:
    mutation_partition_assertion(schema_ptr s, const mutation_partition& m)
        : _schema(s)
        , _m(m)
    { }

    // If ck_ranges is passed, verifies only that information relevant for ck_ranges matches.
    mutation_partition_assertion& is_equal_to(const mutation_partition& other,
            const stdx::optional<query::clustering_row_ranges>& ck_ranges = {}) {
        return is_equal_to(*_schema, other, ck_ranges);
    }

    // If ck_ranges is passed, verifies only that information relevant for ck_ranges matches.
    mutation_partition_assertion& is_equal_to(const schema& s, const mutation_partition& other,
            const stdx::optional<query::clustering_row_ranges>& ck_ranges = {}) {
        if (ck_ranges) {
            mutation_partition_assertion(_schema, _m.sliced(*_schema, *ck_ranges))
                .is_equal_to(s, other.sliced(s, *ck_ranges));
            return *this;
        }
        if (!_m.equal(*_schema, other, s)) {
            BOOST_FAIL(sprint("Mutations differ, expected %s\n ...but got: %s", other, _m));
        }
        if (!other.equal(s, _m, *_schema)) {
            BOOST_FAIL(sprint("Mutation inequality is not symmetric for %s\n ...and: %s", other, _m));
        }
        return *this;
    }

    mutation_partition_assertion& is_not_equal_to(const mutation_partition& other) {
        return is_not_equal_to(*_schema, other);
    }

    mutation_partition_assertion& is_not_equal_to(const schema& s, const mutation_partition& other) {
        if (_m.equal(*_schema, other, s)) {
            BOOST_FAIL(sprint("Mutations equal but expected to differ: %s\n ...and: %s", other, _m));
        }
        return *this;
    }

    mutation_partition_assertion& has_same_continuity(const mutation_partition& other) {
        if (!_m.equal_continuity(*_schema, other)) {
            BOOST_FAIL(sprint("Continuity doesn't match: %s\n ...and: %s", other, _m));
        }
        return *this;
    }

    mutation_partition_assertion& is_continuous(const position_range& r, is_continuous cont = is_continuous::yes) {
        if (!_m.check_continuity(*_schema, r, cont)) {
            BOOST_FAIL(sprint("Expected range %s to be %s in %s", r, cont ? "continuous" : "discontinuous", _m));
        }
        return *this;
    }
};
static inline
mutation_partition_assertion assert_that(schema_ptr s, const mutation_partition& mp) {
    return {std::move(s), mp};
}
class mutation_assertion {
    mutation _m;
public:
    mutation_assertion(mutation m)
        : _m(std::move(m))
    { }

    // If ck_ranges is passed, verifies only that information relevant for ck_ranges matches.
    mutation_assertion& is_equal_to(const mutation& other, const stdx::optional<query::clustering_row_ranges>& ck_ranges = {}) {
        if (ck_ranges) {
            mutation_assertion(_m.sliced(*ck_ranges)).is_equal_to(other.sliced(*ck_ranges));
            return *this;
        }
        if (_m != other) {
            BOOST_FAIL(sprint("Mutations differ, expected %s\n ...but got: %s", other, _m));
        }
        if (other != _m) {
            BOOST_FAIL(sprint("Mutation inequality is not symmetric for %s\n ...and: %s", other, _m));
        }
        return *this;
    }

    mutation_assertion& is_not_equal_to(const mutation& other) {
        if (_m == other) {
            BOOST_FAIL(sprint("Mutations equal but expected to differ: %s\n ...and: %s", other, _m));
        }
        return *this;
    }

    mutation_assertion& has_schema(schema_ptr s) {
        if (_m.schema() != s) {
            BOOST_FAIL(sprint("Expected mutation of schema %s, but got %s", *s, *_m.schema()));
        }
        return *this;
    }

    mutation_assertion& has_same_continuity(const mutation& other) {
        assert_that(_m.schema(), _m.partition()).has_same_continuity(other.partition());
        return *this;
    }

    mutation_assertion& is_continuous(const position_range& r, is_continuous cont = is_continuous::yes) {
        assert_that(_m.schema(), _m.partition()).is_continuous(r, cont);
        return *this;
    }

    // Verifies that mutation data remains unchanged when upgraded to the new schema
    void is_upgrade_equivalent(schema_ptr new_schema) {
        mutation m2 = _m;
        m2.upgrade(new_schema);
        BOOST_REQUIRE(m2.schema() == new_schema);
        mutation_assertion(m2).is_equal_to(_m);

        mutation m3 = m2;
        m3.upgrade(_m.schema());
        BOOST_REQUIRE(m3.schema() == _m.schema());
        mutation_assertion(m3).is_equal_to(_m);
        mutation_assertion(m3).is_equal_to(m2);
    }
};
static inline
mutation_assertion assert_that(mutation m) {
    return { std::move(m) };
}

static inline
mutation_assertion assert_that(streamed_mutation sm) {
    auto mo = mutation_from_streamed_mutation(std::move(sm)).get0();
    return { std::move(*mo) };
}
class mutation_opt_assertions {
    mutation_opt _mo;
public:
    mutation_opt_assertions(mutation_opt mo) : _mo(std::move(mo)) {}

    mutation_assertion has_mutation() {
        if (!_mo) {
            BOOST_FAIL("Expected engaged mutation_opt, but it was disengaged");
        }
        return { *_mo };
    }

    void has_no_mutation() {
        if (_mo) {
            BOOST_FAIL("Expected disengaged mutation_opt");
        }
    }
};

static inline
mutation_opt_assertions assert_that(mutation_opt mo) {
    return { std::move(mo) };
}

static inline
mutation_opt_assertions assert_that(streamed_mutation_opt smo) {
    auto mo = mutation_from_streamed_mutation(std::move(smo)).get0();
    return { std::move(mo) };
}
class streamed_mutation_assertions {
    streamed_mutation _sm;
    clustering_key::equality _ck_eq;
public:
    streamed_mutation_assertions(streamed_mutation sm)
        : _sm(std::move(sm)), _ck_eq(*_sm.schema()) { }

    streamed_mutation_assertions& produces_static_row() {
        auto mfopt = _sm().get0();
        if (!mfopt) {
            BOOST_FAIL("Expected static row, got end of stream");
        }
        if (mfopt->mutation_fragment_kind() != mutation_fragment::kind::static_row) {
            BOOST_FAIL(sprint("Expected static row, got: %s", mfopt->mutation_fragment_kind()));
        }
        return *this;
    }

    streamed_mutation_assertions& produces(mutation_fragment::kind k, std::vector<int> ck_elements) {
        std::vector<bytes> ck_bytes;
        for (auto&& e : ck_elements) {
            ck_bytes.emplace_back(int32_type->decompose(e));
        }
        auto ck = clustering_key_prefix::from_exploded(*_sm.schema(), std::move(ck_bytes));
        auto mfopt = _sm().get0();
        if (!mfopt) {
            BOOST_FAIL(sprint("Expected mutation fragment %s, got end of stream", ck));
        }
        if (mfopt->mutation_fragment_kind() != k) {
            BOOST_FAIL(sprint("Expected mutation fragment kind %s, got: %s", k, mfopt->mutation_fragment_kind()));
        }
        if (!_ck_eq(mfopt->key(), ck)) {
            BOOST_FAIL(sprint("Expected key %s, got: %s", ck, mfopt->key()));
        }
        return *this;
    }

    streamed_mutation_assertions& produces(mutation_fragment mf) {
        auto mfopt = _sm().get0();
        if (!mfopt) {
            BOOST_FAIL(sprint("Expected mutation fragment %s, got end of stream", mf));
        }
        if (!mfopt->equal(*_sm.schema(), mf)) {
            BOOST_FAIL(sprint("Expected %s, but got %s", mf, *mfopt));
        }
        return *this;
    }

    streamed_mutation_assertions& produces(const mutation& m) {
        assert_that(mutation_from_streamed_mutation(_sm).get0()).is_equal_to(m);
        return *this;
    }

    streamed_mutation_assertions& produces_only(const std::deque<mutation_fragment>& fragments) {
        for (auto&& f : fragments) {
            produces(f);
        }
        produces_end_of_stream();
        return *this;
    }

    streamed_mutation_assertions& produces_row_with_key(const clustering_key& ck) {
        BOOST_TEST_MESSAGE(sprint("Expect %s", ck));
        auto mfo = _sm().get0();
        if (!mfo) {
            BOOST_FAIL(sprint("Expected row with key %s, but got end of stream", ck));
        }
        if (!mfo->is_clustering_row()) {
            BOOST_FAIL(sprint("Expected row with key %s, but got %s", ck, *mfo));
        }
        auto& actual = mfo->as_clustering_row().key();
        if (!actual.equal(*_sm.schema(), ck)) {
            BOOST_FAIL(sprint("Expected row with key %s, but key is %s", ck, actual));
        }
        return *this;
    }

    // If ck_ranges is passed, verifies only that information relevant for ck_ranges matches.
    streamed_mutation_assertions& produces_range_tombstone(const range_tombstone& rt, const query::clustering_row_ranges& ck_ranges = {}) {
        BOOST_TEST_MESSAGE(sprint("Expect %s", rt));
        auto mfo = _sm().get0();
        if (!mfo) {
            BOOST_FAIL(sprint("Expected range tombstone %s, but got end of stream", rt));
        }
        if (!mfo->is_range_tombstone()) {
            BOOST_FAIL(sprint("Expected range tombstone %s, but got %s", rt, *mfo));
        }
        const schema& s = *_sm.schema();
        range_tombstone_list actual_list(s);
        position_in_partition::equal_compare eq(s);
        // Subsequent range tombstones may share the same starting position;
        // merge all of them into one list so the comparison below is
        // insensitive to how the tombstone was split across fragments.
        while (mutation_fragment* next = _sm.peek().get0()) {
            if (!next->is_range_tombstone() || !eq(next->position(), mfo->position())) {
                break;
            }
            actual_list.apply(s, _sm().get0()->as_range_tombstone());
        }
        actual_list.apply(s, mfo->as_range_tombstone());
        {
            range_tombstone_list expected_list(s);
            expected_list.apply(s, rt);
            actual_list.trim(s, ck_ranges);
            expected_list.trim(s, ck_ranges);
            if (!actual_list.equal(s, expected_list)) {
                BOOST_FAIL(sprint("Expected %s, but got %s", expected_list, actual_list));
            }
        }
        return *this;
    }

    streamed_mutation_assertions& fwd_to(const clustering_key& ck1, const clustering_key& ck2) {
        return fwd_to(position_range{
            position_in_partition(position_in_partition::clustering_row_tag_t(), ck1),
            position_in_partition(position_in_partition::clustering_row_tag_t(), ck2)
        });
    }

    streamed_mutation_assertions& fwd_to(position_range range) {
        BOOST_TEST_MESSAGE(sprint("Forwarding to %s", range));
        _sm.fast_forward_to(std::move(range)).get();
        return *this;
    }

    streamed_mutation_assertions& produces_end_of_stream() {
        auto mfopt = _sm().get0();
        if (mfopt) {
            BOOST_FAIL(sprint("Expected end of stream, got: %s", *mfopt));
        }
        return *this;
    }

    void has_monotonic_positions() {
        position_in_partition::less_compare less(*_sm.schema());
        mutation_fragment_opt prev;
        for (;;) {
            mutation_fragment_opt mfo = _sm().get0();
            if (!mfo) {
                break;
            }
            if (prev && less(mfo->position(), prev->position())) {
                BOOST_FAIL(sprint("previous fragment has greater position: prev=%s, current=%s", *prev, *mfo));
            }
            prev = std::move(mfo);
        }
    }
};
static inline
streamed_mutation_assertions assert_that_stream(streamed_mutation sm) {
    return streamed_mutation_assertions(std::move(sm));
}