Don't return foreign_ptr<>, which is not copyable, so that we can use
map_difference for maps with result_set in them.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
The immediate motivation for introducing frozen_mutation is the inability
to deserialize the current "mutation" object, which needs a schema reference
at the time it's constructed. It needs the schema to initialize its
internal maps with proper key comparators, which depend on the schema.
frozen_mutation is an immutable, compact form of a mutation. It
doesn't use complex in-memory structures; data is stored in a linear
buffer. With frozen_mutation, the schema needs to be supplied only at
the time the mutation partition is visited. Therefore it can be trivially
deserialized without a schema.
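To illustrate the idea, here is a minimal sketch and not the actual
frozen_mutation code. The names toy_schema and toy_frozen_mutation and the
[column index][length][bytes] layout are assumptions made up for this example.
It only shows that a linear buffer can be deserialized without a schema, and
that the schema is consulted only when the contents are visited.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a schema: it only knows column names.
struct toy_schema {
    std::vector<std::string> column_names;
};

// Hypothetical frozen form. Cells are stored back to back in one buffer as
// [column index: u32 LE][value length: u32 LE][value bytes].
class toy_frozen_mutation {
    std::vector<std::uint8_t> _buf;

    explicit toy_frozen_mutation(std::vector<std::uint8_t> b) : _buf(std::move(b)) {}

    static void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
        for (int i = 0; i < 4; ++i) {
            out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
        }
    }
    static std::uint32_t get_u32(const std::vector<std::uint8_t>& in, std::size_t pos) {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) {
            v |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
        }
        return v;
    }
public:
    // Build the immutable buffer once from (column index, value) pairs.
    static toy_frozen_mutation freeze(
            const std::vector<std::pair<std::uint32_t, std::string>>& cells) {
        std::vector<std::uint8_t> buf;
        for (const auto& c : cells) {
            put_u32(buf, c.first);
            put_u32(buf, static_cast<std::uint32_t>(c.second.size()));
            buf.insert(buf.end(), c.second.begin(), c.second.end());
        }
        return toy_frozen_mutation(std::move(buf));
    }
    // Deserialization needs no schema: the bytes are adopted as they are.
    static toy_frozen_mutation from_bytes(std::vector<std::uint8_t> bytes) {
        return toy_frozen_mutation(std::move(bytes));
    }
    const std::vector<std::uint8_t>& representation() const { return _buf; }

    // The schema is needed only here, to interpret the column indexes.
    void visit(const toy_schema& s,
               const std::function<void(const std::string&, const std::string&)>& visitor) const {
        std::size_t pos = 0;
        while (pos < _buf.size()) {
            assert(pos + 8 <= _buf.size());
            std::uint32_t column = get_u32(_buf, pos);
            std::uint32_t len = get_u32(_buf, pos + 4);
            pos += 8;
            assert(column < s.column_names.size() && pos + len <= _buf.size());
            visitor(s.column_names[column],
                    std::string(_buf.begin() + pos, _buf.begin() + pos + len));
            pos += len;
        }
    }
};
```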
Use commit log in database, from Calle:
"Initial" usage of the commitlog in database mutation path.
* A commitlog is created in "work" dirs when initializing the db
from a datadir. However, since we have neither disk data storage
nor replay capability yet (and no real db config), the settings
basically just serialize in memory, write the result to
disk and then discard it. So in fact, pointless. But at least we are
using the log...
* Moved the actual "apply" of mutation into database. If a commitlog
is active, add an entry to it before applying the mutation.
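The ordering described above (log first, then apply) can be sketched roughly
as follows. This is a simplified, synchronous illustration. toy_mutation,
toy_commitlog and toy_database are hypothetical names, and the "key=value"
serialization is an assumption for the example, not the real entry format.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical mutation: a single key/value write.
struct toy_mutation {
    std::string key;
    std::string value;
};

// Hypothetical commitlog: just remembers serialized entries in order.
struct toy_commitlog {
    std::vector<std::string> entries;
    void add_entry(const std::string& serialized) { entries.push_back(serialized); }
};

class toy_database {
    std::optional<toy_commitlog> _commitlog;  // empty when no log is active
    std::map<std::string, std::string> _data;
public:
    void enable_commitlog() { _commitlog.emplace(); }

    void apply(const toy_mutation& m) {
        assert(!m.key.empty());
        // If a commitlog is active, record the mutation before applying it.
        if (_commitlog) {
            _commitlog->add_entry(m.key + "=" + m.value);
        }
        _data[m.key] = m.value;
    }

    std::size_t logged_entries() const {
        return _commitlog ? _commitlog->entries.size() : 0;
    }
    const std::string& get(const std::string& key) const { return _data.at(key); }
};
```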
Origin supports chaining multiple mutations but we don't. Therefore,
return a vector of mutations for 'create keyspace'.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Partitions should be ordered using Origin's ordering, which is first
by token, then by Origin's representation of the key. That is the
natural ordering of decorated_key.
This also changes mutation class to hold decorated_key, to avoid
decoration overhead at different layers.
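A rough sketch of such an ordering, assuming a 64-bit signed token and a plain
byte-string key (both are assumptions for illustration; toy_decorated_key is a
hypothetical name, not the real class):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical decorated key: the token computed once, stored with the key.
struct toy_decorated_key {
    std::int64_t token;
    std::string key;
};

// Order first by token, then by the key representation.
inline bool operator<(const toy_decorated_key& a, const toy_decorated_key& b) {
    return std::tie(a.token, a.key) < std::tie(b.token, b.key);
}

// Partitions kept in a map iterate in decorated-key order without
// recomputing tokens at each layer.
using toy_partition_map = std::map<toy_decorated_key, std::string>;
```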
This patch converts (for a very small value of 'converts') some
replication-related classes. Only static topology is supported (it is
created in keyspace::create_replication_strategy()). During mutation
no replication is done, since the messaging service is not ready yet;
only endpoints are calculated.
* database now holds all keyspace + column family objects
* column families are mapped by uuid, either generated or explicit
* lookup by name tuples or uuid
* finder functions now return refs and throw on missing objects
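The lookup scheme in the list above might look roughly like this. It is a
sketch under assumptions: toy_uuid is a plain integer standing in for a real
UUID, generated ids come from a counter, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

using toy_uuid = std::uint64_t;  // stand-in for a real UUID type

struct toy_column_family {
    std::string keyspace;
    std::string name;
};

class toy_database {
    std::map<toy_uuid, toy_column_family> _column_families;
    std::map<std::pair<std::string, std::string>, toy_uuid> _ks_cf_to_uuid;
    toy_uuid _next_generated = 1;
public:
    // The uuid is either supplied explicitly or generated.
    toy_uuid add_column_family(const std::string& ks, const std::string& cf,
                               std::optional<toy_uuid> explicit_id = std::nullopt) {
        while (_column_families.count(_next_generated)) {
            ++_next_generated;  // skip ids already taken by explicit uuids
        }
        toy_uuid id = explicit_id ? *explicit_id : _next_generated++;
        if (_column_families.count(id) || _ks_cf_to_uuid.count(std::make_pair(ks, cf))) {
            throw std::invalid_argument("column family already exists");
        }
        _column_families.emplace(id, toy_column_family{ks, cf});
        _ks_cf_to_uuid.emplace(std::make_pair(ks, cf), id);
        return id;
    }

    // Finders return values or references and throw on a missing object.
    toy_uuid find_uuid(const std::string& ks, const std::string& cf) const {
        auto it = _ks_cf_to_uuid.find(std::make_pair(ks, cf));
        if (it == _ks_cf_to_uuid.end()) {
            throw std::out_of_range("no such column family: " + ks + "." + cf);
        }
        return it->second;
    }
    toy_column_family& find_column_family(toy_uuid id) {
        auto it = _column_families.find(id);
        if (it == _column_families.end()) {
            throw std::out_of_range("no such column family uuid");
        }
        return it->second;
    }
    toy_column_family& find_column_family(const std::string& ks, const std::string& cf) {
        return find_column_family(find_uuid(ks, cf));
    }
};
```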
Cassandra added support for user-specified query handlers that replace
the default QueryProcessor in CASSANDRA-6659. We don't really
need that now and as we're C++, we cannot even support existing custom
query handlers. Therefore, remove the QueryHandler class and references
to it.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Pass a reference to storage_proxy and apply mutations in
legacy_schema_tables::merge_schema().
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
This patch adds initial support for PREPARE and EXECUTE requests which
are used by the CQL binary protocol for prepared statements. The use of
prepared statements gives a nice single core performance boost of roughly
2.4x for Urchin (op rate 31728 vs. 75033 below):
    $ ./build/release/seastar --data data --smp 1

    $ ./tools/bin/cassandra-stress write -mode cql3 simplenative -rate threads=32

    Results:
    op rate                   : 31728
    partition rate            : 31728
    row rate                  : 31728
    latency mean              : 1.0
    latency median            : 0.9
    latency 95th percentile   : 1.8
    latency 99th percentile   : 1.8
    latency 99.9th percentile : 5.6
    latency max               : 181.7
    Total operation time      : 00:00:30
    END

    $ ./tools/bin/cassandra-stress write -mode cql3 simplenative prepared -rate threads=32

    Results:
    op rate                   : 75033
    partition rate            : 75033
    row rate                  : 75033
    latency mean              : 0.4
    latency median            : 0.4
    latency 95th percentile   : 0.7
    latency 99th percentile   : 0.8
    latency 99.9th percentile : 3.4
    latency max               : 205.0
    Total operation time      : 00:00:30
    END
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Holding keys and their prefixes as "bytes" is error prone. It's easy
to mix them up (or use wrong types). This change adds wrappers for
keys with accessors which are meant to make misuses as difficult as
possible.
Prefix and full keys are now distinguished. Places which assumed that
the representation is the same (it currently is) were changed not to
do so. This will allow us to introduce more compact storage for non-prefix
keys.
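A minimal sketch of the wrapper idea, with hypothetical names
(toy_clustering_key, toy_clustering_prefix) and a bytewise prefix check that is
only an assumption for the example. The point is that raw bytes, prefixes and
full keys cannot be silently mixed up:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

using bytes = std::string;  // stand-in for the raw byte container

// Full key: constructible only through the named factory.
class toy_clustering_key {
    bytes _bytes;
    explicit toy_clustering_key(bytes b) : _bytes(std::move(b)) {}
public:
    static toy_clustering_key from_bytes(bytes b) {
        return toy_clustering_key(std::move(b));
    }
    const bytes& representation() const { return _bytes; }
};

// Prefix of a key: a distinct type, even though the representation is
// currently the same.
class toy_clustering_prefix {
    bytes _bytes;
    explicit toy_clustering_prefix(bytes b) : _bytes(std::move(b)) {}
public:
    static toy_clustering_prefix from_bytes(bytes b) {
        return toy_clustering_prefix(std::move(b));
    }
    const bytes& representation() const { return _bytes; }

    bool is_prefix_of(const toy_clustering_key& k) const {
        return k.representation().compare(0, _bytes.size(), _bytes) == 0;
    }
};

// Misuse is rejected at compile time: no implicit conversions exist.
static_assert(!std::is_convertible<bytes, toy_clustering_key>::value,
              "raw bytes must not convert to a key");
static_assert(!std::is_convertible<toy_clustering_prefix, toy_clustering_key>::value,
              "a prefix must not convert to a full key");
```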
Add database::shard_of() to compute the shard hosting the partition
(with a simplistic algorithm, but perhaps not too bad).
Convert non-metadata invoke_on_all() and local calls on the database
to use shard_of().
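The commit does not spell out the algorithm. One simplistic possibility,
shown purely as an assumption, is to take the partition's token modulo the
number of shards (toy_shard_of is a hypothetical free function, not the real
database::shard_of()):

```cpp
#include <cassert>
#include <cstdint>

// Assumed mapping: token modulo shard count. Every token maps to exactly
// one shard in [0, shard_count).
inline unsigned toy_shard_of(std::uint64_t token, unsigned shard_count) {
    assert(shard_count > 0);
    return static_cast<unsigned>(token % shard_count);
}
```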
s/database/distributed<database>/ everywhere.
Use simple distribution rules: writes are broadcast, reads are local.
This causes tremendous data duplication, but will change soon.
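A toy model of these rules, using a vector of per-shard maps in place of
distributed<database> (all names here are hypothetical). It makes the
duplication visible: every shard ends up with a copy of every write.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

class toy_distributed_database {
    // One independent store per shard.
    std::vector<std::map<std::string, std::string>> _shards;
public:
    explicit toy_distributed_database(unsigned shard_count) : _shards(shard_count) {
        assert(shard_count > 0);
    }

    // Writes are broadcast: every shard applies the same write.
    void write(const std::string& key, const std::string& value) {
        for (auto& shard : _shards) {
            shard[key] = value;
        }
    }

    // Reads are local: only the calling shard's store is consulted.
    std::optional<std::string> read(unsigned local_shard, const std::string& key) const {
        const auto& shard = _shards.at(local_shard);
        auto it = shard.find(key);
        if (it == shard.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Number of shards holding a copy of the key.
    std::size_t copies_of(const std::string& key) const {
        std::size_t n = 0;
        for (const auto& shard : _shards) {
            n += shard.count(key);
        }
        return n;
    }
};
```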
With replication, we want the contents of the mutation to be available
to multiple replicas.
(In this context, we will replicate the mutation to all shards in the same
node, as a temporary step in sharding a node; but the issue also occurs
when replicating to other nodes).