Commit Graph

25 Commits

Author SHA1 Message Date
Paweł Dziepak
f5fff734ed cq3: update_parameters: add getters for ttl and expiry
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-07-30 14:10:06 +02:00
Pekka Enberg
d50139351f cql3: Use pragma once everywhere
There's no benefit to using C include guards so switch to pragma once
everywhere for consistency.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-05-12 16:32:56 +03:00
Tomasz Grabiec
b1e45e4401 db: Store ttl in atomic_cell
Origin does that, so should we. Both ttl and expiry time are stored in
sstables. The value of ttl seems to be used to calculate the read
digest (expiry is not used for that).

The API for creating atomic_cells changed a bit.

To create a non-expiring cell:

  atomic_cell::make_live(timestamp, value);

To create an expiring cell:

  atomic_cell::make_live(timestamp, value, expiry, ttl);

or:

  // Expiry is calculated based on current clock reading
  atomic_cell::make_live(timestamp, value, ttl_optional);
2015-05-06 19:42:38 +02:00
Tomasz Grabiec
5ba1486ae7 db: Rename "ttl" to "expiry" when it's used as time point
To avoid confusion with "ttl" the duration.
2015-05-06 17:27:22 +02:00
Tomasz Grabiec
731a63e371 schema: Embed raw_schema inside schema
Public fields got encapsulated.
2015-04-24 18:01:01 +02:00
Tomasz Grabiec
00f99cefd4 db: split query.hh to reduce header dependencies 2015-04-15 20:44:59 +02:00
Tomasz Grabiec
878a740b9d db: Write query results in serialized form
This gives about 30% increase in tps in:

  build/release/tests/perf/perf_simple_query -c1 --query-single-key

This patch switches query result format from a structured one to a
serialized one. The problems with structured format are:

  - high level of indirection (vector of vectors of vectors of blobs), which
    is not CPU cache friendly

  - high allocation rate due to fine-grained object structure

On replica side, the query results are probably going to be serialized
in the transport layer anyway, so this change only subtracts
work. There is no processing of the query results on replica other
than concatenation in case of range queries. If query results are
collected in serialized form from different cores, we can concatenate
them without copying by simply appending the fragments into the
packet. This optimization is not implemented yet.

On coordinator side, the query results would have to be parsed from
the transport layer buffers anyway, so this also doesn't add work, but
again saves allocations and copying. The CQL server doesn't need
complex data structures to process the results, it just goes over it
linearly consuming it. This patch provides views, iterators and
visitors for consuming query results in serialized form. Currently the
iterators assume that the buffer is contiguous but we could easily
relax this in future so that we can avoid linearization of data
received from seastar sockets.

The coordinator side could be optimized even further for CQL queries
which do not need processing (eg. select * from cf where ...)  we
could make the replica send the query results in the format which is
expected by the CQL binary protocol client. So in the typical case the
coordinator would just pass the data using zero-copy to the client,
prepending a header.

We do need structure for prefetched rows (needed by list
manipulations), and this change adds query result post-processing
which converts serialized query result into a structured one, tailored
particularly for prefetched rows needs.

This change also introduces partition_slice options. In some queries
(maybe even in typical ones), we don't need to send partition or
clustering keys back to the client, because they are already specified
in the query request, and not queried for. The query results hold now
keys as optional elements. Also, meta-data like cell timestamp and
ttl is now also optional. It is only needed if the query has
writetime() or ttl() functions in it, which it typically won't have.
2015-04-15 20:44:50 +02:00
Tomasz Grabiec
f469dd1234 cql3: Fix setter_by_index::execute() to work with empty cells
The list cell may be not set, in which case we should reurn an error
to the user. Current implementation of get_prefetched_list() was
returning collection_mutaion::view in this case, which had an empty
view. Deserialization is not prepared to get an empty view though. I
think we can stick with having non-empty views in the general case and
return an optional in get_prefetched_list().
2015-03-30 09:07:01 +02:00
Tomasz Grabiec
6487e82561 cql3: Fix make_tombstone_just_before()
The timestamp should be decremented, not local deletion time.
2015-03-30 09:07:01 +02:00
Tomasz Grabiec
f30fe80d07 cql3: update_parameters: Add timestamp attribute getter 2015-03-30 09:07:00 +02:00
Tomasz Grabiec
2902395129 Relax includes 2015-03-30 09:01:59 +02:00
Avi Kivity
fec79ac147 cql: introduce update_parameters::make_tombstone_just_before()
Addresses a quirk in how collection tombstones are compared vs. cells:
a timestamp tie goes to the tombstone.  Matches origin.
2015-03-23 21:54:22 +02:00
Avi Kivity
4fb7aba0f5 cql3: convert update_parameters::get_prefetched_list() to C++ 2015-03-23 21:54:14 +02:00
Avi Kivity
65a2c68df2 cql3: fix update_parameters prefetched rows type
We may be required to work on multiple rows (IN (row_key_1, row_key_2)) so
use a query::result for prefetched rows.
2015-03-23 19:34:29 +02:00
Tomasz Grabiec
bdbd5547e3 db: Cleanup key names
clustering_key::one -> clustering_key
clustering_key::prefix::one -> clustering_key_prefix
partition_key::one -> partition_key
clustering_prefix -> exploded_clustering_prefix
2015-03-20 18:59:29 +01:00
Tomasz Grabiec
90298af614 db: Cleanup atomic_cell naming
atomic_cell -> atomic_cell_type
atomic_cell::one -> atomic_cell
atomic_cell::view -> atomic_cell_view
2015-03-20 18:59:29 +01:00
Tomasz Grabiec
1b1af8cdfd db: Introduce types to hold keys
Holding keys and their prefixes as "bytes" is error prone. It's easy
to mix them up (or use wrong types). This change adds wrappers for
keys with accessors which are meant to make misuses as difficult as
possible.

Prefix and full keys are now distinguished. Places which assumed that
the representation is the same (it currently is) were changed not to
do so. This will allow us to introduce more compact storage for non-prefix
keys.
2015-03-17 15:56:29 +01:00
Avi Kivity
a49330095a db: wrap bytes in atomic_cell format
We use bytes for many different things, and it is easy to get confused as
to what format the data is actually in.

Fix that for atomic_cell by proving wrappers.  atomic_cell::one corresponds
to a bytes object holding exactly one atomic cell, and atomic_cell::view is
a bytes_view to an atomic_cell.  The static functions of atomic_cell itself
are privatized to prevent the unwashed masses from using them on the wrong
objects.

Since a row entry can hold either a an atomic cell, or a collection,
depending on the schema, also introduce a variant type
atomic_cell_or_collection and allow the user to pick the type explicitly.
Internally both are stored as bytes object.
2015-03-04 15:49:35 +02:00
Tomasz Grabiec
74295a9759 db: Use opaque bytes for cell values instead of boost::any
Storing cells as boost::any objects makes us use expensive
boost::any_cast to access the data. This change replaces boost::any
with bytes object which holds the value in serialized form (the same
as will be used for on-wire format).

If the cell type is atomic, you use fields accessors defined in
atomic_cell class, eg like this:

if (column.type.is_atomic()) {
   if (atomic_cell::is_live(c) {
      auto timestamp = atomic_cell::timestamp(c);
      ...
   }
}

Eventually we could switch to a more officient semi-serialized form
with native byte order but I don't want to introduce it just yet for
simplicity.
2015-02-27 10:59:43 +01:00
Tomasz Grabiec
d5a7f37c45 db: Merge api.hh into database.hh 2015-02-09 10:28:44 +01:00
Tomasz Grabiec
800ba79efa db: Drop api:: namespace from mutation model classes
In preparation for merging into database.hh
2015-02-09 10:28:44 +01:00
Tomasz Grabiec
3645e983c0 db: Refactor atomic_cell structure
The new structure has common timestamp field extracted, it has the
same meaning for both live and dead cells. It will make it easier to
merge cells this way.
2015-02-09 10:28:44 +01:00
Tomasz Grabiec
5710a99f44 cql3: Fix mis-overrides of cql_statement::execute*()
The method may defer so the result is wrapped in future<>.

I think we don't need to wrap arguments in shared_ptr<> because they
may come from the request state object.
2015-02-04 10:28:51 +01:00
Tomasz Grabiec
64128dc117 cql3: Convert UpdateParameters 2015-01-29 18:55:24 +01:00
Pekka Enberg
a9930ac12c cql3: Convert UpdateParameters to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-01-20 11:47:38 +02:00