As flat mutation reader {up,down}grades get added to the write path,
comparing range-tombstone-containing (at least) sstables byte-by-byte
to a reference is starting to seem like a fool's errand.
* When a flat mutation reader is {up,down}graded, information may get
lost while splitting range tombstones. Making those splits revertable
should in theory be possible but would surely make {up,down}graders
slower and more complex, and may also possibly entail adding
information to in-memory representation of range tombstones and
range rombstone changes. Such investment for the sake of 7 unit tests
does not seem wise, given that the plan is to get rid of reader
{up,down}grade logic once the move to flat mutation reader v2 is
completed.
* All affected tests also validate their written sstables
semantically.
* At least some of the offending reference sstables are not
"canonical" wrt range tombstones to begin with -- they contain range
tombstones that overlap with clustering rows. The fact that Scylla
does not "canonicalize" those in some way seems purely incidental.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Some legacy `mc` SSTables (created in Scylla 3.0) may contain incorrect
serialization headers, which don't wrap frozen UDTs nested inside collections
with the FrozenType<...> tag. When reading such SSTable,
Scylla would detect a mismatch between the schema saved in schema
tables (which correctly wraps UDTs in the FrozenType<...> tag) and the schema
from the serialization header (which doesn't have these tags).
SSTables created in Scylla versions 3.1 and above, in particular in
Scylla versions that contain this commit, create correct serialization
headers (which wrap UDTs in the FrozenType<...> tag).
This commit does two things:
1. for all SSTables created after this commit, include a new feature
flag, CorrectUDTsInCollections, presence of which implies that frozen
UDTs inside collections have the FrozenType<...> tag.
2. when reading a Scylla SSTable without the feature flag, we assume that UDTs
nested inside collections are always frozen, even if they don't have
the tag. This assumption is safe to be made, because at the time of
this commit, Scylla does not allow non-frozen (multi-cell) types inside
collections or UDTs, and because of point 1 above.
There is one edge case not covered: if we don't know whether the SSTable
comes from Scylla or from C*. In that case we won't make the assumption
described in 2. Therefore, if we get a mismatch between schema and
serialization headers of a table which we couldn't confirm to come from
Scylla, we will still reject the table. If any user encounters such an
issue (unlikely), we will have to use another solution, e.g. using a
separate tool to rewrite the SSTable.
Fixes#6130.
The plan is to move the unstructured content of tests/ directory
into the following directories of test/:
test/lib - shared header and source files for unit tests
test/boost - boost unit tests
test/unit - non-boost unit tests
test/manual - tests intended to be run manually
test/resource - binary test resources and configuration files
In order to not break git bisect and preserve the file history,
first move most of the header files and resources.
Update paths to these files in .cc files, which are not moved.