Schemas using compact storage can have clustering keys with the trailing
components not set and effectively being a clustering key prefixes
instead of full clustering keys.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
We use boost::any to convert to and from database values (stored in
serlialized form) and native C++ values. boost::any captures information
about the data type (how to copy/move/delete etc.) and stores it inside
the boost::any instance. We later retrieve the real value using
boost::any_cast.
However, data_value (which has a boost::any member) already has type
information as a data_type instance. By teaching data_type intances about
the corresponding native type, we can elimiante the use of boost::any.
While boost::any is evil and eliminating it improves efficiency somewhat,
the real goal is growing native type support in data_type. We will use that
later to store native types in the cache, enabling O(log n) access to
collections, O(1) access to tuples, and more efficient large blob support.
We need a container which can be used with compacting
allocators. "bytes" can't be used with compacting allocator because it
can't handle its external storage being moved.
make_empty() is used from thrift to create a clustering_key for a
table's row without clustering key columns. The implementation was
misleading because it seemed to be handling any number of components in
the key while only no-component case is supposed to work.
We now have partition_key_view, clustering_key_view, etc.
Database APIs will be extended to also accept views.
This will alows us to avoid allocations in certain scenarios.
From Pekka:
"This patch series converts LegacySchemaTables keyspace merging code to
C++. After this series, keyspaces are actually created as demonstrated
by the newly added test in cql_query_test.cc."
This automatically exposes them in partition_key and clustering_key too.
The iterators return bytes_view to components.
For example:
schema s;
partition_key k;
for (bytes_view component : boost::make_iterator_range(key.begin(s), key.end(s))) {
// ...
}
The 'bool' type doesn't hold any meaning on its own, which makes the
template instantiation sites not very readable:
tuple_type<true>
To improve that, we can introduce an enum class which is meaningful in
every context:
tuple_type<allow_prefixes::yes>
There are two sides of this optimization:
1) We don't store the length of the last component, so the
representation is now shorter.
2) A single-element tuple is serialized exactly as the component it
holds, which allows us to optimize conversions for such keys.
tuple_type is for managing our internal representation of keys. It
shares some interface with abstract_type, but the latter is a basis
for types of data stored in cells. tuple_type does not need to hide
behind a virtual interface.
Note: there is a TupleType in Origin, but it serves a different purpose.
Holding keys and their prefixes as "bytes" is error prone. It's easy
to mix them up (or use wrong types). This change adds wrappers for
keys with accessors which are meant to make misuses as difficult as
possible.
Prefix and full keys are now distinguished. Places which assumed that
the representation is the same (it currently is) were changed not to
do so. This will allow us to introduce more compact storage for non-prefix
keys.