To prepare a user-defined type, we need to look up its name in the keyspace.
While we get the keyspace name as an argument to prepare(), it is useless
without the database instance.
Fix the problem by passing a database reference along with the keyspace.
This precolates through the class structure, so most cql3 raw types end up
receiving this treatment.
Origin gets along without it by using a singleton. We can't do this due
to sharding (we could use a thread-local instance, but that's ugly too).
Hopefully the transition to a visitor will clean this up.
This gives about 30% increase in tps in:
build/release/tests/perf/perf_simple_query -c1 --query-single-key
This patch switches query result format from a structured one to a
serialized one. The problems with structured format are:
- high level of indirection (vector of vectors of vectors of blobs), which
is not CPU cache friendly
- high allocation rate due to fine-grained object structure
On replica side, the query results are probably going to be serialized
in the transport layer anyway, so this change only subtracts
work. There is no processing of the query results on replica other
than concatenation in case of range queries. If query results are
collected in serialized form from different cores, we can concatenate
them without copying by simply appending the fragments into the
packet. This optimization is not implemented yet.
On coordinator side, the query results would have to be parsed from
the transport layer buffers anyway, so this also doesn't add work, but
again saves allocations and copying. The CQL server doesn't need
complex data structures to process the results, it just goes over it
linearly consuming it. This patch provides views, iterators and
visitors for consuming query results in serialized form. Currently the
iterators assume that the buffer is contiguous but we could easily
relax this in future so that we can avoid linearization of data
received from seastar sockets.
The coordinator side could be optimized even further for CQL queries
which do not need processing (eg. select * from cf where ...) we
could make the replica send the query results in the format which is
expected by the CQL binary protocol client. So in the typical case the
coordinator would just pass the data using zero-copy to the client,
prepending a header.
We do need structure for prefetched rows (needed by list
manipulations), and this change adds query result post-processing
which converts serialized query result into a structured one, tailored
particularly for prefetched rows needs.
This change also introduces partition_slice options. In some queries
(maybe even in typical ones), we don't need to send partition or
clustering keys back to the client, because they are already specified
in the query request, and not queried for. The query results hold now
keys as optional elements. Also, meta-data like cell timestamp and
ttl is now also optional. It is only needed if the query has
writetime() or ttl() functions in it, which it typically won't have.
'keys' and 'prefix' are used twice in the same expression, and as the language
does not guarantee any ordering in this case, any moves are illegal.
Get rid of them.
Preparation for range queries, from Tomasz:
"This series adds static typic for different key variants.
It also changes clustered row map to boost implementation which allows to use
heterogenous keys, so that we can lookup a row by a full prefix without
reserializing it.
Similar change is made to row prefix tombstones."
Query processor needs to store prepared statements as part of a client
session for PREPARE and EXECUTE requests. Switch from unique_ptr to
shared_ptr in preparation for that.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Holding keys and their prefixes as "bytes" is error prone. It's easy
to mix them up (or use wrong types). This change adds wrappers for
keys with accessors which are meant to make misuses as difficult as
possible.
Prefix and full keys are now distinguished. Places which assumed that
the representation is the same (it currently is) were changed not to
do so. This will allow us to introduce more compact storage for non-prefix
keys.
Regular columns may have names of arbitrary type. See
https://issues.apache.org/jira/browse/CASSANDRA-8178
Primary key columns are UTF8.
This change also does some refactoring of the schema object to make
the change easier to digest (more encapsulation).
The method may defer so the result is wrapped in future<>.
I think we don't need to wrap arguments in shared_ptr<> because they
may come from the request state object.