Commit Graph

55 Commits

Author SHA1 Message Date
Piotr Jastrzebski
ea449c9cce Replace sstables::mutation_reader with ::mutation_reader
This will make migration to flat_mutation_reader much
easier and sstables::mutation_reader is going away with
this migration anyway.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-15 10:40:01 +01:00
Avi Kivity
f7023501d6 treewide: use shared_sstable, make_sstable in place of lw_shared_ptr<sstable>
Since shared_sstable is going to be its own type soon, we can't use the old alias.
2017-09-12 10:43:05 +03:00
Avi Kivity
5ebb15b9d4 sstable_mutation_test: add missing include 2017-09-12 10:43:05 +03:00
Duarte Nunes
4c9206ba2f tests/sstable_mutation_test: Don't use moved-from object
Fix a bug introduced in dbbb9e93d and exposed by gcc6 by not using a
moved-from object. Twice.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170802161033.4213-1-duarte@scylladb.com>
2017-08-03 09:45:49 +03:00
Duarte Nunes
dbbb9e93da tests/sstable_mutation_test: Test promoted index blocks are monotonic
Reproduces #2333

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 18:23:58 +02:00
Tomasz Grabiec
a9237c1666 schema: Revert back to the 1.7 layout of static compact tables in memory
We are using C* 3.x compatible layout in schema tables but want to
keep using the 1.7 layout in memory for compatibility during rolling
upgrade. This patch switches the schema and schema_builder classes
back to the old layout. Translation of layout happens when converting
to/from schema mutations.

Notable changes:

 1) Includes a revert of commit 6260f31e08
    "thrift: Update CQL mapping of static CFs".

 2) Brings back the "default_validation_class" schema attribute. In v3
    it can be dervied from column definitions, but in v2 it can't, so
    we have to store it.

 3) legacy_schema_migrator and schema_builder don't have to do
    conversions to v3, this is now handled by the v3_columns
    class. schema_builder works with the same layout as schema, that
    is v2.

 4) Includes a revert of commit 66991a7ccb
    "v3 schema test fixes"

Fixes #2555.
2017-07-19 09:52:15 +02:00
Avi Kivity
555621b537 Disentable memtables from sstables
Remove sstable::write_components(memtable), replacing it with a helper.

Fixes #2354
Message-Id: <20170624142639.16662-1-avi@scylladb.com>
2017-06-26 09:37:11 +02:00
Tomasz Grabiec
f3a6d94398 sstables: Introduce sstable::as_mutation_source()
Adaptors extracted from existing testing code.
Message-Id: <1495729508-30081-1-git-send-email-tgrabiec@scylladb.com>
2017-05-25 19:30:20 +03:00
Tomasz Grabiec
3c509308ab range_tombstone_list: Merge adjacent range tombstones in apply()
Needed for equivalence to work correctly with difference and addition:

  m1 + (m2 - m1) = m1 + m2

Fixes #2158.
2017-05-23 13:16:03 +02:00
Calle Wilund
66991a7ccb v3 schema test fixes 2017-05-10 16:44:48 +00:00
Tomasz Grabiec
fd5dbe04b5 tests: sstables: Test more configutaions of sstable writer in test_sstable_conforms_to_mutation_source()
Test different versions of the format, and different promoted index
block sizes.  The size of 1 is especially important, it will put each
fragment in a separate block, exposing various issues with promoted
index handling.
2017-04-27 18:43:49 +02:00
Duarte Nunes
4e693383f7 mutation_partion: Use row_tombstone
This patch replaces the current row tombstone representation by a
row_tombstone.

The intent of the patch is thus to reify the idea of shadowable
tombstones, that up until now we considered all materialized view row
tombstones to be.

We need to distinguish shadowable from non-shadowable row tombstones
to support scenarios such as, when inserting to a table with a
materialzied view:

1. insert into base (p, v1, v2) values (3, 1, 3) using timestamp 1
2. delete from base using timestamp 2 where p = 3
3. insert into base (p, v1) values (3, 1) using timestamp 3

These should yield a view row where v2 is definitely null, but with
the current implementation, v2 will pop back with its value v2=3@TS=1,
even though its dead in the base row. This is because the row
tombstone inserted at 2) is a shadowable one.

This patch only addresses the memory representation of such
row_tombstones.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-04-25 11:46:33 +02:00
Tomasz Grabiec
ad1e69c4c5 tests: Move as_mutation_source() helper to header 2017-03-10 14:42:22 +01:00
Tomasz Grabiec
892d4a2165 db: Enable creating forwardable readers via mutation_source
Right now all mutation source implementations will use
make_forwardable() wrapper.
2017-02-23 18:50:44 +01:00
Tomasz Grabiec
2b8bd10dca tests: Pass all mutation source parameters 2017-02-13 20:52:49 +01:00
Piotr Jastrzebski
b159e08764 intrusive_set: rename size() to calculate_size()
This hopefully will make it more apparent that
the time complexity of this method is O(N) not O(1).

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 12:21:43 +01:00
Piotr Jastrzebski
4bbe05dd47 mutation_partition: take schema in find_row and clustered_row
This will allow intrusive set implementation that does not
store schema.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 11:26:03 +01:00
Avi Kivity
1d9ee358f1 Revert "Merge "Reduce the size of mutation_partition" from Piotr"
This reverts commit aa392810ff, reversing
changes made to a24ff47c637e6a5fd158099b8a65f1191fc2d023; it uses
boost::intrusive::detail directly, which it must not, and doesn't compile on
all boost versions as a consequence.
2016-12-25 16:07:48 +02:00
Piotr Jastrzebski
345ed5b6ff intrusive_set: rename size() to calculate_size()
This hopefully will make it more apparent that
the time complexity of this method is O(N) not O(1).

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-12-23 11:32:13 +01:00
Piotr Jastrzebski
2af6ff68d9 mutation_partition: take schema in find_row and clustered_row
This will allow intrusive set implementation that does not
store schema.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-12-23 11:29:07 +01:00
Asias He
e5485f3ea6 Get rid of query::partition_range
Use dht::partition_range instead
2016-12-19 08:09:25 +08:00
Asias He
85034c1b57 Convert to use dht::partition_range 2016-12-19 08:04:30 +08:00
Avi Kivity
a35136533d Convert ring_position and token ranges to be nonwrapping
Wrapping ranges are a pain, so we are moving wrap handling to the edges.

Since cql can't generate wrapping ranges, this means thrift and the ring
maintenance code; also range->ring transformations need to merge the first
and last ranges.

Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>
2016-11-02 21:04:11 +02:00
Avi Kivity
7faf2eed2f build: support for linking statically with boost
Remove assumptions in the build system about dynamically linked boost unit
tests.  Includes seastar update which would have otherwise broken the
build.
2016-10-26 08:51:21 +03:00
Paweł Dziepak
f49a9e0d64 sstables: drop unused read_range_rows() overload
That overload was used only by unit test and violated guarantee that
partition range lives until mutation reader is done.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-10-19 15:29:08 +01:00
Raphael S. Carvalho
1f31223f32 sstables: store schema in sstable object
That will be needed for optimization that will store decorated keys
in the sstable object, and also for a subsequent work that will
detect wrong metadata (min/max column names) by looking at columns
in the schema. As schema is stored in sstable, there's no longer
a need to store ks and cf names in it.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-09-02 10:49:17 -03:00
Piotr Jastrzebski
bb0c4c3c40 Fix compilation errors
query::range parameter in mutation_partiton::range
has to be changed to nonwrapping_range.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <36e444bfe90586f8d3b08ca36d8dc13d5898ef97.1471347402.git.piotr@scylladb.com>
2016-08-16 12:49:54 +01:00
Paweł Dziepak
5ba4cd1a0b sstables: enable_lw_shared_from_this for sstable
sstable has member functions that create objects which need to extend
lifetime of the sstable (for example mutation_readers), the easiest way
to achieve that is to enable_lw_shared_from_this for sstable.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-29 15:51:12 +01:00
Duarte Nunes
6fc6adbdeb sstable_mutation_test: Test non-compound cell name
This patch adds a test case for reading non-compound cell names,
validating that such a cell is not incorrectly marked as static.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1469616205-4550-5-git-send-email-duarte@scylladb.com>
2016-07-28 11:12:20 +02:00
Paweł Dziepak
b6f78a8e2f sstable: make sstable reads return streamed_mutation
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:50 +01:00
Paweł Dziepak
737eb73499 mutation_reader: make readers return streamed_mutations
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:50 +01:00
Paweł Dziepak
11f43a8e91 tests/sstable: drop sstable_range_wrapping_reader
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:49 +01:00
Duarte Nunes
70083efee2 sstables: Read and write range tombstone bounds
This patch uses the composite_marker to add inclusiveness information
to the prefixes of a range tombstone.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:59 +02:00
Duarte Nunes
7628e403a3 sstables: Drop code for tombstone merging
Since Scylla now supports proper range tombstones, the code for
reading ranges from sstables and converting them to overlapping
tombstones is no longer necessary, and is, in fact, wasteful as
the internal representation converts overlapping tombstones back to
ranges.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:59 +02:00
Duarte Nunes
91aac30f12 mutations: Row tombstones are now a set of ranges
This patch changes the type of the mutation partition's row_tombstones
to be a range_tombstone_list, so that they are now represented as a
set of disjoint ranges. All of its usages are updated accordingly.

Fixes #1155

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:59 +02:00
Duarte Nunes
dc8319ed91 keys: Remove schema argument from make_empty
An empty key is independent of the schema.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:36 +02:00
Nadav Har'El
92ef11ffaa stables_mutation_test: more compare keys not representations
Commit 0fc4c36952 missed one place where
keys were compared using their byte representation. Fix that.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459778074-10759-4-git-send-email-nyh@scylladb.com>
2016-04-11 11:36:17 +03:00
Nadav Har'El
9f9353ae5b sstable_mutation_test: another test for range tombstone merging
This is even a more elaborate tombstone merging unit test, with
3 levels of nesting, which did not pass with older range-tombstone
merging algorithms, and works with the current one.

I started with deletion of three nested levels of row -
aaa, aaa:bbb, and aaa:bbb::ccc. I then complicated the sstable
even further by adding additional middle-points with the same
timestamps (which we saw happening in some real-life sstables),
resulting in:

[
{"key": "pk",
 "cells": [["aaa:_","aaa:bba:_",1459438519943668,"t",1459438519],
           ["aaa:bba:_","aaa:bbb:_",1459438519943668,"t",1459438519],
           ["aaa:bbb:_","aaa:bbb:ccb:_",1459438519950348,"t",1459438519],
           ["aaa:bbb:ccb:_","aaa:bbb:ccc:_",1459438519950348,"t",1459438519],
           ["aaa:bbb:ccc:_","aaa:bbb:ccc:!",1459438519958850,"t",1459438519],
           ["aaa:bbb:ccc:!","aaa:bbb:ddd:!",1459438519950348,"t",1459438519],
           ["aaa:bbb:ddd:!","aaa:bbb:!",1459438519950348,"t",1459438519],
           ["aaa:bbb:!","aaa:!",1459438519943668,"t",1459438519]]}
]

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459778074-10759-3-git-send-email-nyh@scylladb.com>
2016-04-11 11:35:59 +03:00
Nadav Har'El
77a793048e sstable_mutation_test: strengthen tombstone_merging test
In the tombstone_merging test, we expected one row tombstone. But we did
not verify that in addition to that row tombstone, there is no other rows
(deleted or otherwise). It turns out that in the onld merging algorithm,
we did produce additional deleted rows which shouldn't have been there.

So this patch adds a test that there are no such additional deleted rows
beyond the one row tombstone we expect. The test passes with the new
range tombstone merging algorithm.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459778074-10759-2-git-send-email-nyh@scylladb.com>
2016-04-11 11:35:46 +03:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Tomasz Grabiec
0fc4c36952 tests: sstable_mutation_test: Compare keys not representations
Representation is opaque at this level of abstraction.

Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459508193-7086-1-git-send-email-tgrabiec@scylladb.com>
2016-04-03 11:39:03 +03:00
Nadav Har'El
6c4ee49bd3 sstables: another test for range tombstone merging
This is another unit test for range tombstone merging, introduced in commit
0fc9a5ee4d and rewritten in commit
99ecda3c96.

In this test, a single large deletion was broken up into several smaller
ranges, all with the same time stamps, so we should recombine them into
one row tombstone, instead of failing the read.

The sstable in this test case was artificially created using json2sstable.
We don't know how yet to produce such a case using Cassandra 2, but we
have seen a similar occurance in the wild, in a real SSTable.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459429243-15821-1-git-send-email-nyh@scylladb.com>
2016-04-01 11:55:14 +02:00
Nadav Har'El
99ecda3c96 sstables: overhaul range tombstone reading
Until recently, we believed that range tombstones we read from sstables will
always be for entire rows (or more generalized clustering-key prefixes),
not for arbitrary ranges. But as we found out, because Cassandra insists
that range tombstones do not overlap, it may take two overlapping row
tombstones and convert them into three range tombstones which look like
general ranges (see the patch for a more detailed example).

Not only do we need to accept such "split" range tombstones, we also need
to convert them back to our internal representation which, in the above
example, involves two overlapping tombstones. This is what this patch does.

This patch also contains a test for this case: We created in Cassandra
an sstable with two overlapping deletions, and verify that when we read
it to Scylla, we get these two overlapping deletions - despite the
sstable file actually having contained three non-overlapping tombstones.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <b7c07466074bf0db6457323af8622bb5210bb86a.1459399004.git.glauber@scylladb.com>
2016-03-31 12:49:50 +03:00
Benoît Canet
1fb9a48ac5 exception: Optionally shutdown communication on I/O errors.
I/O errors cannot be fixed by Scylla the only solution
is to shutdown the database communications.

Signed-off-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <1458154098-9977-1-git-send-email-benoit@scylladb.com>
2016-03-17 15:02:52 +02:00
Tomasz Grabiec
095efd01d6 keys: Make from_exploded() and components() work without schema
For simplicity, we want to have keys serializable and deserializable
without schema for now. We will serialize keys in a generic form of a
vector of components where the format of components is specified by
CQL binary protocol. So conversion between keys and vector of
components needs to be possible to do without schema.

We may want to make keys schema-dependent back in the future to apply
space optimizations specific to column types. Existing code should
still pass schema& to construct and access the key when possible.

One optimization had to be reverted in this change - avoidance of
storing key length (2 bytes) for single-component partition keys. One
consequence of this, in addition to a bit larger keys, is that we can
no longer avoid copy when constructing single-component partition keys
from a ready "bytes" object.

I haven't noticed any significant performance difference in:

  tests/perf/perf_simple_query -c1 --write

It does ~130K tps on my machine.
2016-02-10 14:35:13 +01:00
Tomasz Grabiec
b777cc9565 tests: Fix tests to not rely on key representation 2016-02-10 14:35:13 +01:00
Glauber Costa
58fdae33bd mutation_source: turn it into a class
Its definition as a lambda function is inconvenient, because it does not allow
us to use default values for parameters.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-01-25 15:20:38 -05:00
Tomasz Grabiec
4e5a52d6fa db: Make read interface schema version aware
The intent is to make data returned by queries always conform to a
single schema version, which is requested by the client. For CQL
queries, for example, we want to use the same schema which was used to
compile the query. The other node expects to receive data conforming
to the requested schema.

Interface on shard level accepts schema_ptr, across nodes we use
table_schema_version UUID. To transfer schema_ptr across shards, we
use global_schema_ptr.

Because schema is identified with UUID across nodes, requestors must
be prepared for being queried for the definition of the schema. They
must hold a live schema_ptr around the request. This guarantees that
schema_registry will always know about the requested version. This is
not an issue because for queries the requestor needs to hold on to the
schema anyway to be able to interpret the results. But care must be
taken to always use the same schema version for making the request and
parsing the results.

Schema requesting across nodes is currently stubbed (throws runtime
exception).
2016-01-11 10:34:52 +01:00
Avi Kivity
47499dcf18 data_value: make conversion from bytes explicit
Since bytes is a very generic value that is returned from many calls,
it is easy to pass it by mistake to a function expecting a data_value,
and to get a wrong result.  It is impossible for the data_value constructor
to know if the argument is a genuine bytes variable, a data_value of another
type, but serialized, or some other serialized data type.

To prevent misuse, make the data_value(bytes) constructor
(and complementary data_value(optional<bytes>) explicit.
2015-11-13 17:12:29 +02:00
Avi Kivity
2c3591cbd9 data_value de-any-fication
We use boost::any to convert to and from database values (stored in
serlialized form) and native C++ values.  boost::any captures information
about the data type (how to copy/move/delete etc.) and stores it inside
the boost::any instance.  We later retrieve the real value using
boost::any_cast.

However, data_value (which has a boost::any member) already has type
information as a data_type instance.  By teaching data_type intances about
the corresponding native type, we can elimiante the use of boost::any.

While boost::any is evil and eliminating it improves efficiency somewhat,
the real goal is growing native type support in data_type.  We will use that
later to store native types in the cache, enabling O(log n) access to
collections, O(1) access to tuples, and more efficient large blob support.
2015-10-30 17:38:51 +01:00