bytes and sstring are distinct types, since their internal buffers are of
different length, but bytes_view is an alias of sstring_view, which makes
it possible of objects of different types to leak across the abstraction
boundary.
Fix this by making bytes a basic_sstring<int8_t, ...> instead of using char.
int8_t is a 'signed char', which is a distinct type from char, so now
bytes_view is a distinct type from sstring_view.
uint8_t would have been an even better choice, but that diverges from Origin
and would have required an audit.
This patch converts (for very small value of 'converts') some
replication related classes. Only static topology is supported (it is
created in keyspace::create_replication_strategy()). During mutation
no replication is done, since messaging service is not ready yet,
only endpoints are calculated.
* database now holds all keyspace + column family object
* column families are mapped by uuid, either generated or explicit
* lookup by name tuples or uuid
* finder functions now return refs + throws on missing obj
Holding keys and their prefixes as "bytes" is error prone. It's easy
to mix them up (or use wrong types). This change adds wrappers for
keys with accessors which are meant to make misuses as difficult as
possible.
Prefix and full keys are now distinguished. Places which assumed that
the representation is the same (it currently is) were changed not to
do so. This will allow us to introduce more compact storage for non-prefix
keys.
We use bytes for many different things, and it is easy to get confused as
to what format the data is actually in.
Fix that for atomic_cell by proving wrappers. atomic_cell::one corresponds
to a bytes object holding exactly one atomic cell, and atomic_cell::view is
a bytes_view to an atomic_cell. The static functions of atomic_cell itself
are privatized to prevent the unwashed masses from using them on the wrong
objects.
Since a row entry can hold either a an atomic cell, or a collection,
depending on the schema, also introduce a variant type
atomic_cell_or_collection and allow the user to pick the type explicitly.
Internally both are stored as bytes object.
Storing cells as boost::any objects makes us use expensive
boost::any_cast to access the data. This change replaces boost::any
with bytes object which holds the value in serialized form (the same
as will be used for on-wire format).
If the cell type is atomic, you use fields accessors defined in
atomic_cell class, eg like this:
if (column.type.is_atomic()) {
if (atomic_cell::is_live(c) {
auto timestamp = atomic_cell::timestamp(c);
...
}
}
Eventually we could switch to a more officient semi-serialized form
with native byte order but I don't want to introduce it just yet for
simplicity.
Add database::shard_of() to compute the shard hosting the partition
(with a simplistic algorithm, but perhaps not too bad).
Convert non-metadata invoke_on_all() and local calls on the database
to use shard_of().
s/database/distributed<database>/ everywhere.
Use simple distribution rules: writes are broadcast, reads are local.
This causes tremendous data duplication, but will change soon.
Futures hold either a value or an exception; thrift uses two separate
function objects to signal completion, one for success, the other for
an exception.
Add a helper to pass the result of a future to either of these.
For simplicity partition data is stored using the same object which is
used for mutations: mutation_partition. Later we can introduce a more
efficient version.
Regular columns may have names of arbitrary type. See
https://issues.apache.org/jira/browse/CASSANDRA-8178
Primary key columns are UTF8.
This change also does some refactoring of the schema object to make
the change easier to digest (more encapsulation).
Cassandra allows even regular columns to be treated as a sorted map
(column name -> value), accessing it with get_slice(), so sort the column
names to support this.