A bare 'bool' carries no meaning on its own, which makes template
instantiation sites hard to read:
    tuple_type<true>
To improve on that, we can introduce an enum class which is meaningful
in every context:
    tuple_type<allow_prefixes::yes>
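The idea above can be sketched as follows. This is a minimal
illustration, not the actual class: the enumerator names, the default
argument and the accepts_prefixes() helper are assumptions made for the
example.

```cpp
// Enumerator names are assumed; the commit only shows allow_prefixes::yes.
enum class allow_prefixes { no, yes };

// Simplified placeholder for tuple_type. The real class has many more
// members; accepts_prefixes() is a hypothetical helper for this sketch.
template <allow_prefixes AllowPrefixes = allow_prefixes::no>
struct tuple_type {
    static constexpr bool accepts_prefixes() {
        return AllowPrefixes == allow_prefixes::yes;
    }
};

// The instantiation site now documents itself.
static_assert(tuple_type<allow_prefixes::yes>::accepts_prefixes(),
              "prefixes allowed");
static_assert(!tuple_type<allow_prefixes::no>::accepts_prefixes(),
              "prefixes rejected");
```

Because an enum class does not convert implicitly from bool or int,
writing tuple_type<true> against this sketch fails to compile instead
of silently picking a mode.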
Origin uses CompositeType to serialize composite keys, and that type
uses a 16-bit integer to encode the length. If that is enough for
Origin, it is enough for us.
Because the length precedes the component contents, serialized tuples
are not byte-order comparable: the prefix breaks lexicographical
ordering. "abc" should sort before "b", but with a length prefix
"<3>abc" compares greater than "<1>b". Only single-element tuples can
be compared in byte order, because they have no length component.
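The ordering problem can be reproduced with a small sketch. The helper
names are hypothetical, and the big-endian layout of the 16-bit length
is an assumption; the text above only says that the length comes first.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical helper: prepend a 16-bit length to a component.
// Big-endian byte order is an assumption made for this example.
// Components longer than 65535 bytes are not handled.
std::string with_length_prefix(const std::string& component) {
    const auto len = static_cast<std::uint16_t>(component.size());
    std::string out;
    out.push_back(static_cast<char>(len >> 8));
    out.push_back(static_cast<char>(len & 0xff));
    out += component;
    return out;
}

// Plain byte-order comparison, treating every byte as unsigned.
bool bytes_less(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](char x, char y) {
            return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
        });
}
```

With these helpers, bytes_less("abc", "b") holds, while
bytes_less(with_length_prefix("abc"), with_length_prefix("b")) does
not, because the length byte 3 is compared against 1 before any
content byte is reached.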
Spotted during code review.
This optimization has two benefits:
1) We don't store the length of the last component, so the
representation is now shorter.
2) A single-element tuple is serialized exactly as the component it
holds, which allows us to optimize conversions for such keys.
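A minimal sketch of the layout described above, assuming a big-endian
16-bit length and a component count that the reader knows up front
(both are assumptions for the example, and serialize_tuple is a
hypothetical name):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Every component except the last is preceded by its 16-bit length.
// The last component simply extends to the end of the buffer.
// Components longer than 65535 bytes are not handled in this sketch.
std::string serialize_tuple(const std::vector<std::string>& components) {
    std::string out;
    for (std::size_t i = 0; i < components.size(); ++i) {
        const std::string& c = components[i];
        const bool last = (i + 1 == components.size());
        if (!last) {
            const auto len = static_cast<std::uint16_t>(c.size());
            out.push_back(static_cast<char>(len >> 8));
            out.push_back(static_cast<char>(len & 0xff));
        }
        out += c;
    }
    return out;
}
```

For a single-element tuple the loop never writes a length, so the
output is byte-for-byte the component itself, which is the property
the second point relies on.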
tuple_type is for managing our internal representation of keys. It
shares some interface with abstract_type, but the latter is a basis
for types of data stored in cells. tuple_type does not need to hide
behind a virtual interface.
Note: there is a TupleType in Origin, but it serves a different purpose.
Holding keys and their prefixes as "bytes" is error-prone. It's easy
to mix them up (or use the wrong types). This change adds wrappers for
keys, with accessors meant to make misuse as difficult as possible.
Prefix and full keys are now distinguished. Places which assumed that
the representation is the same (it currently is) were changed not to
do so. This will allow us to introduce more compact storage for non-prefix
keys.
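One way to picture the wrappers and the prefix/full-key distinction is
the sketch below. All names here (key_wrapper, full_key, key_prefix,
representation()) are hypothetical, and 'bytes' is replaced by
std::string to keep the example self-contained.

```cpp
#include <string>
#include <type_traits>
#include <utility>

using bytes = std::string;  // stand-in for the real 'bytes' type

// A thin wrapper; the Tag parameter makes each instantiation a
// distinct, mutually non-convertible type.
template <typename Tag>
class key_wrapper {
    bytes _bytes;
public:
    explicit key_wrapper(bytes b) : _bytes(std::move(b)) {}
    const bytes& representation() const { return _bytes; }
};

struct full_key_tag {};
struct key_prefix_tag {};
using full_key = key_wrapper<full_key_tag>;
using key_prefix = key_wrapper<key_prefix_tag>;

// Raw bytes and prefixes cannot be passed where a full key is expected.
static_assert(!std::is_convertible<bytes, full_key>::value,
              "construction from raw bytes must be explicit");
static_assert(!std::is_convertible<key_prefix, full_key>::value,
              "a prefix is not a full key");
```

Since the two types only share a template, the full-key representation
could later diverge from the prefix one without touching callers that
go through the accessor.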
Instead of using the inefficient std::ostream, use our own 'bytes'
iterator class. Compute the length of the byte buffer ahead of time,
then serialize the objects into it.
Gives a ~5x speedup over previous results (which sometimes did not
even finish in reasonable time).
[avi: add missing include]
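The two-pass approach (size first, then write) might look roughly like
this. It is a sketch under assumptions: std::string stands in for
'bytes', a plain std::string iterator stands in for the custom
iterator class, and every component gets a big-endian 16-bit length.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

std::string serialize_presized(const std::vector<std::string>& components) {
    // Pass 1: compute the exact size of the output buffer.
    std::size_t total = 0;
    for (const auto& c : components) {
        total += 2 + c.size();  // 2 bytes of length + payload
    }

    // Pass 2: allocate once and write through an output iterator.
    std::string out(total, '\0');
    auto it = out.begin();
    for (const auto& c : components) {
        const auto len = static_cast<std::uint16_t>(c.size());
        *it++ = static_cast<char>(len >> 8);
        *it++ = static_cast<char>(len & 0xff);
        it = std::copy(c.begin(), c.end(), it);
    }
    return out;
}
```

The buffer is allocated exactly once and never grows, which avoids the
reallocation and stream overhead of building the result incrementally.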
deserialize_value() is slow because it involves multiple allocations
and copies. Internal operations such as compare() or hash() don't need
all those heavy transformations. Now that those functions work on
bytes_view, we can iterate over component values in place.
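In-place iteration can be sketched as below. This assumes C++17, uses
std::string_view as a stand-in for bytes_view, assumes every component
carries a big-endian 16-bit length, and compares components as raw
bytes rather than through their types. read_component and
compare_tuples are hypothetical names, and the input is assumed to be
well-formed.

```cpp
#include <cstddef>
#include <string_view>

using bytes_view = std::string_view;  // stand-in for the real bytes_view

// Returns a view of the next component and advances v past it.
// No bytes are copied; the result points into the original buffer.
inline bytes_view read_component(bytes_view& v) {
    const auto hi = static_cast<unsigned char>(v[0]);
    const auto lo = static_cast<unsigned char>(v[1]);
    const std::size_t len = (static_cast<std::size_t>(hi) << 8) | lo;
    const bytes_view component = v.substr(2, len);
    v.remove_prefix(2 + len);
    return component;
}

// Component-wise comparison without deserializing or allocating.
// Returns -1, 0 or 1; a tuple that runs out first sorts first.
inline int compare_tuples(bytes_view a, bytes_view b) {
    while (!a.empty() && !b.empty()) {
        const int r = read_component(a).compare(read_component(b));
        if (r != 0) {
            return r < 0 ? -1 : 1;
        }
    }
    if (a.empty() && b.empty()) {
        return 0;
    }
    return a.empty() ? -1 : 1;
}
```

Each step only adjusts a pointer and a length, so a comparison or hash
touches the serialized bytes exactly once.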