64 Commits

Author SHA1 Message Date
Tomasz Grabiec
6bf1c6014f mvcc: partition_snapshot_row_cursor: Mark allocation points
This marks places which may allocate but not always do as allocation
points to increase effectiveness of testing.
2017-11-13 20:55:13 +01:00
Tomasz Grabiec
d76b141b34 tests: Extract mvcc tests to separate file 2017-09-13 17:47:04 +02:00
Tomasz Grabiec
2df6f356b1 mvcc: Store LSA region reference in partition_snapshot
Will be useful for improving encapsulation.
2017-09-13 17:38:08 +02:00
Avi Kivity
9b540eccb0 database: remove dependency on compaction.hh and compaction_manager.hh 2017-09-11 20:09:45 +03:00
Piotr Jastrzebski
c602ffd610 Make Scylla ttl expiration behave like in Cassandra
Fixes #2497

[tgrabiec: reworked the title]

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <2f5a99dce6ef11fe0ef135c9fa0592078fc9a056.1502886874.git.piotr@scylladb.com>
2017-08-21 14:25:45 +02:00
Tomasz Grabiec
fb62dfab02 tests: mvcc: Introduce test_schema_upgrade_preserves_continuity 2017-06-24 18:06:11 +02:00
Tomasz Grabiec
164989a574 tests: mvcc: Add test for partition_entry::apply_to_incomplete() 2017-06-24 18:06:11 +02:00
Tomasz Grabiec
db053ef902 tests: Add test for continuity merging rules 2017-06-24 18:06:11 +02:00
Tomasz Grabiec
804f46f684 mutation: Make compare_*_for_merge() consistent with equals()
equals() considers expiring cells to be different from non-expiring cells,
but compare_row_marker_for_merge() considers them equal. Fix the latter to
pick expiring cells. The choice was arbitrary.
2017-05-23 13:35:03 +02:00
Tomasz Grabiec
c1475a8eb2 tests: mutation: Improve assertion failure message 2017-05-23 13:16:03 +02:00
Tomasz Grabiec
d15880b3b7 tests: Use default equality in test_mutation_diff_with_random_generator 2017-05-23 13:16:03 +02:00
Tomasz Grabiec
ef4c7c458c tests: mutation: Check commutativity of mutation addition 2017-05-23 12:11:12 +02:00
Duarte Nunes
4e693383f7 mutation_partition: Use row_tombstone
This patch replaces the current row tombstone representation by a
row_tombstone.

The intent of the patch is thus to reify the idea of shadowable
tombstones, that up until now we considered all materialized view row
tombstones to be.

We need to distinguish shadowable from non-shadowable row tombstones
to support scenarios such as, when inserting to a table with a
materialized view:

1. insert into base (p, v1, v2) values (3, 1, 3) using timestamp 1
2. delete from base using timestamp 2 where p = 3
3. insert into base (p, v1) values (3, 1) using timestamp 3

These should yield a view row where v2 is definitely null, but with
the current implementation, v2 will pop back with its value v2=3@TS=1,
even though it's dead in the base row. This is because the row
tombstone inserted at 2) is a shadowable one.

This patch only addresses the memory representation of such
row_tombstones.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-04-25 11:46:33 +02:00
Duarte Nunes
392403b5b3 row_marker: Mark constructors explicit
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-04-25 11:43:04 +02:00
Avi Kivity
6d9e18fd61 logalloc: reduce descriptor overhead
Every lsa-allocated object is prefixed by a header that contains information
needed to free or migrate it.  This includes its size (for freeing) and
an 8-byte migrator (for migrating).  Together with some flags, the overhead
is 14 bytes (16 bytes if the default alignment is used).

This patch reduces the header size to 1 byte (8 bytes if the default alignment
is used).  It uses the following techniques:

 - ULEB128-like encoding (actually more like ULEB64) so a live object's header
   can typically be stored using 1 byte
 - indirection, so that migrators can be encoded in a small index pointing
   to a migrator table, rather than using an 8-byte pointer; this exploits
   the fact that only a small number of types are stored in LSA
 - moving the responsibility for determining an object's size to its
   migrator, rather than storing it in the header; this exploits the fact
   that the migrator stores type information, and object size is in fact
   information about the type
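
The first two techniques can be illustrated with a minimal sketch. This is not the logalloc code: the names `encode_header` and `decode_header` are hypothetical, and the layout (standard ULEB128, 7 payload bits per byte with the high bit as a continuation flag) is an assumption, since the commit only says the encoding is "ULEB128-like". The encoded value stands in for a small index into a migrator table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode a small migrator-table index as a
// ULEB128-style varint. Each byte carries 7 payload bits; a set high
// bit means "more bytes follow".
std::vector<uint8_t> encode_header(uint64_t index) {
    std::vector<uint8_t> out;
    do {
        uint8_t byte = index & 0x7f;
        index >>= 7;
        if (index != 0) {
            byte |= 0x80; // continuation flag
        }
        out.push_back(byte);
    } while (index != 0);
    return out;
}

// Hypothetical helper: decode the varint starting at `p`.
// `len` receives the number of header bytes consumed.
uint64_t decode_header(const uint8_t* p, std::size_t& len) {
    uint64_t value = 0;
    unsigned shift = 0;
    len = 0;
    while (true) {
        assert(shift < 64); // guard against malformed input
        uint8_t byte = p[len++];
        value |= uint64_t(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) {
            break;
        }
        shift += 7;
    }
    return value;
}
```

Any index below 128 fits in a single byte, which is why a header can typically be 1 byte when only a small number of types are registered.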

The patch improves the results of memory_footprint_test as follows:

Before:

 - in cache:     976
 - in memtable:  947

After:

mutation footprint:
 - in cache:     880
 - in memtable:  858

A reduction of about 10%.  Further reductions are possible by reducing the
alignment of lsa objects.

logalloc_test was adjusted to free more objects, since with the lower
footprint, rounding errors (to full segments) are different and caused
false errors to be detected.

Missing: adjustments to scylla-gdb.py; will be done after we agree on the
new descriptor's format.
2017-04-24 12:23:12 +02:00
Duarte Nunes
143136647a mutation_test: Add more test cases for difference()
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 14:34:01 +01:00
Paweł Dziepak
04b80272f2 cell_locker: add metrics for lock acquisition 2017-03-02 09:05:12 +00:00
Paweł Dziepak
4ffe0401ee test/mutation_source: specify whether to generate counter mutations
Tests using random mutation generator should be provided with both
counter and non-counter mutations to ensure that both cases are
sufficiently covered. However, mixed schemas (with both counter and
non-counter columns) are not allowed so the RMG has to be explicitly
told whether to use counter or non-counter schema.
2017-02-07 15:17:14 +00:00
Piotr Jastrzebski
4bbe05dd47 mutation_partition: take schema in find_row and clustered_row
This will allow intrusive set implementation that does not
store schema.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 11:26:03 +01:00
Avi Kivity
1d9ee358f1 Revert "Merge "Reduce the size of mutation_partition" from Piotr"
This reverts commit aa392810ff, reversing
changes made to a24ff47c637e6a5fd158099b8a65f1191fc2d023; it uses
boost::intrusive::detail directly, which it must not, and doesn't compile on
all boost versions as a consequence.
2016-12-25 16:07:48 +02:00
Piotr Jastrzebski
2af6ff68d9 mutation_partition: take schema in find_row and clustered_row
This will allow intrusive set implementation that does not
store schema.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-12-23 11:29:07 +01:00
Asias He
e5485f3ea6 Get rid of query::partition_range
Use dht::partition_range instead
2016-12-19 08:09:25 +08:00
Avi Kivity
7faf2eed2f build: support for linking statically with boost
Remove assumptions in the build system about dynamically linked boost unit
tests.  Includes seastar update which would have otherwise broken the
build.
2016-10-26 08:51:21 +03:00
Glauber Costa
28e3f2f6ee LSA: export information about object memory footprint
We allocate objects of a certain size, but we use a bit more memory to hold
them.  To get a clearer picture of how much memory an object will cost us, we
need help from the allocator. This patch exports an interface that allows users
to query a specific allocator to get that information.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Paweł Dziepak
6012a7e733 mutation_partition: fix iterator invalidation in trim_rows
Reversed iterators are adaptors for 'normal' iterators. These underlying
iterators point to different objects than the reversed iterators
themselves.

The consequence of this is that removing an element pointed to by a
reversed iterator may invalidate a reversed iterator which points to a
completely different object.

This is what happens in trim_rows for reversed queries. Erasing a row
can invalidate the end iterator, and the loop would then fail to stop.

The solution is to introduce the
reversal_traits::erase_dispose_and_update_end() function, which erases and
disposes of the object pointed to by a given iterator, but also takes a
reference to an end iterator and updates it if necessary to make sure
that it stays valid.
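
The hazard and the fix can be sketched with standard containers. This is an illustration only, assuming `std::set` in place of the intrusive container used by mutation_partition; `erase_and_update_end` and `trim_reversed` are hypothetical names, not the real reversal_traits API, and no disposer is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>

using row_set = std::set<int>;
using reverse_it = row_set::reverse_iterator;

// Erase the element that `it` refers to. A reverse iterator wraps a
// forward iterator (base()) pointing one past the element it
// dereferences to, so `end` may be wrapping the very node being erased.
// In that case `end` is re-seated before it can be used again.
reverse_it erase_and_update_end(row_set& rows, reverse_it it, reverse_it& end) {
    assert(it != end);
    auto to_erase = std::prev(it.base());   // node that *it refers to
    bool end_affected = (end.base() == to_erase);
    auto next = rows.erase(to_erase);       // forward successor of the erased node
    if (end_affected) {
        end = reverse_it(next);
    }
    return reverse_it(next);
}

// Erase every row in the reversed range [first, last).
std::size_t trim_reversed(row_set& rows, reverse_it first, reverse_it last) {
    std::size_t erased = 0;
    while (first != last) {
        first = erase_and_update_end(rows, first, last);
        ++erased;
    }
    return erased;
}
```

Without the `end_affected` branch, erasing the last row of the range would leave `last` wrapping an erased node, and the `first != last` comparison would no longer be valid.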

Fixes #1609.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1472080609-11642-1-git-send-email-pdziepak@scylladb.com>
2016-08-25 16:52:35 +03:00
Piotr Jastrzebski
bb0c4c3c40 Fix compilation errors
The query::range parameter in mutation_partition::range
has to be changed to nonwrapping_range.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <36e444bfe90586f8d3b08ca36d8dc13d5898ef97.1471347402.git.piotr@scylladb.com>
2016-08-16 12:49:54 +01:00
Tomasz Grabiec
8c4b5e4283 db: Avoid checking bloom filters during compaction
Checking bloom filters of sstables to compute max purgeable timestamp
for compaction is expensive in terms of CPU time. We can avoid
calculating it if we're not about to GC any tombstone.

This patch changes compacting functions to accept a function instead
of a ready value for max_purgeable.
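
A minimal sketch of the idea, under stated assumptions: `cell`, `compact` and `get_max_purgeable` are hypothetical names, and the purge rule used here (drop a tombstone whose timestamp is below max_purgeable) is a simplification rather than the actual compaction logic. The point is only that the expensive computation runs lazily, and not at all when no tombstone is seen.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

using timestamp = int64_t;

struct cell {
    timestamp ts;
    bool is_tombstone;
};

// Returns the cells that survive compaction. `get_max_purgeable` stands
// for the expensive computation (e.g. consulting bloom filters); it is
// called at most once, and only when the first tombstone is encountered.
std::vector<cell> compact(const std::vector<cell>& cells,
                          const std::function<timestamp()>& get_max_purgeable) {
    std::optional<timestamp> max_purgeable; // computed on demand
    std::vector<cell> out;
    for (const cell& c : cells) {
        if (c.is_tombstone) {
            if (!max_purgeable) {
                max_purgeable = get_max_purgeable();
            }
            if (c.ts < *max_purgeable) {
                continue; // assumed rule: tombstone is old enough to drop
            }
        }
        out.push_back(c);
    }
    return out;
}
```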

I verified that bloom filter operations no longer appear on flame
graphs during compaction-heavy workload (without tombstones).

Refs #1322.
2016-07-10 09:54:20 +02:00
Paweł Dziepak
983321f194 tests/mutation: do not create memtable on stack
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:51 +01:00
Paweł Dziepak
e4ae7894d4 tests/mutation: test slicing mutations
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:51 +01:00
Paweł Dziepak
737eb73499 mutation_reader: make readers return streamed_mutations
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:50 +01:00
Duarte Nunes
dc8319ed91 keys: Remove schema argument from make_empty
An empty key is independent of the schema.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:36 +02:00
Duarte Nunes
a15ed3c60f mutation_test: Specify tmp data dir
Otherwise we attempt to create sstable files under /.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1464618602-1124-1-git-send-email-duarte@scylladb.com>
2016-05-30 20:34:47 +02:00
Avi Kivity
db03295c8a Merge "Fix query digest mismatch" from Tomasz
"Currently data query digest includes cells and tombstones which may have
expired or be covered by higher-level tombstones. This causes digest
mismatch between replicas if some elements are compacted on one of the
nodes and not on others. This mismatch triggers read-repair which doesn't
resolve because mutations received by mutation queries are not differing,
they are compacted already.

The fix adds compacting step before writing and digesting query results by
reusing the algorithm used by mutation query. This is not the most optimal
way to fix this. The compaction step could be folded with the query writing,
there is redundancy in both steps. However such change carries more risk,
and thus was postponed.

perf_simple_query test (cassandra-stress-like partitions) shows regression
from 83k to 77k (7%) ops/s.

Fixes #1165."
2016-04-08 12:13:29 +03:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Tomasz Grabiec
474a35ba6b tests: Add test for query digest calculation 2016-04-07 19:57:19 +02:00
Tomasz Grabiec
5d768d0681 tests: mutation_test: Move mutation generator to mutation_source_test.hh
So that it can be reused.
2016-04-07 19:57:19 +02:00
Tomasz Grabiec
30d25bc47a tests: mutation_test: Add test case for querying of expired cells 2016-04-07 19:57:19 +02:00
Tomasz Grabiec
2fbb55929d mutation_test: Add allocation failure stress test for apply()
The test injects allocation failures at every allocation site during
apply(). Only allocations through allocation_strategy are instrumented,
but currently those should include all allocations in the apply() path.

The target and source mutations are randomized.
2016-03-21 21:49:53 +01:00
Tomasz Grabiec
8ede27f9c6 mutation_test: Add more apply() tests 2016-03-21 21:49:53 +01:00
Tomasz Grabiec
36575d9f01 mutation_test: Hoist make_blob() to a function 2016-03-21 21:49:53 +01:00
Tomasz Grabiec
4c85d06df7 mutation_test: Make make_blob() return different blob each time
random_bytes was constructed with the same seed each time.
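
The kind of bug described can be sketched as follows. This is not the test's code: `make_blob_same_seed` and `make_blob` are hypothetical, and `std::mt19937` is assumed in place of whatever generator the test uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

using bytes = std::vector<uint8_t>;

// Buggy shape: the engine is re-created with the same seed on every
// call, so every call produces the same byte sequence.
bytes make_blob_same_seed(std::size_t n) {
    std::mt19937 engine(0);
    std::uniform_int_distribution<int> dist(0, 255);
    bytes b(n);
    for (auto& x : b) {
        x = static_cast<uint8_t>(dist(engine));
    }
    return b;
}

// Fixed shape: the engine outlives the call, so its state advances and
// successive calls draw different bytes.
bytes make_blob(std::size_t n) {
    static thread_local std::mt19937 engine(0);
    std::uniform_int_distribution<int> dist(0, 255);
    bytes b(n);
    for (auto& x : b) {
        x = static_cast<uint8_t>(dist(engine));
    }
    return b;
}
```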
2016-03-21 21:49:53 +01:00
Tomasz Grabiec
19b3df9f0f mutation_test: Fix use-after-free
The problem was that verify_row() was returning a future which was not
waited on. Fix by running the code in a thread.
2016-03-21 21:49:53 +01:00
Benoît Canet
1fb9a48ac5 exception: Optionally shutdown communication on I/O errors.
I/O errors cannot be fixed by Scylla; the only solution
is to shut down the database communications.

Signed-off-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <1458154098-9977-1-git-send-email-benoit@scylladb.com>
2016-03-17 15:02:52 +02:00
Glauber Costa
a339296385 database: turn sstable generation number into an optional
This patch makes sure that every time we need to create a new generation number,
the very first step in the creation of a new SSTable, the respective CF is already
initialized and populated. Failure to do so can lead to data being overwritten.
Extensive details about why this is important can be found
in Scylla's GitHub Issue #1014.

Nothing should be writing to SSTables before we have the chance to populate the
existing SSTables and calculate what the next generation number should be.

However, if that happens, we want to protect against it in a way that does not
involve overwriting existing tables. This is one of the ways to do it: every
column family starts in an unwriteable state, and when it can finally be written
to, we mark it as writeable.
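
The "starts unwriteable" state can be sketched with `std::optional`. The class and method names below are hypothetical, and throwing on misuse is an assumption; the commit does not say how the real column family reports a premature write.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>

class column_family_sketch {
    // Empty until existing sstables have been scanned: "not writeable yet".
    std::optional<int64_t> _generation;
public:
    // Called explicitly once on-disk sstables are populated.
    // `highest_existing` is the highest generation found (0 if none).
    void mark_ready_for_writes(int64_t highest_existing) {
        _generation = highest_existing;
    }

    // Hands out the next generation number, refusing to do so before the
    // column family is marked writeable, so existing files cannot be
    // overwritten by accident.
    int64_t next_generation() {
        if (!_generation) {
            throw std::runtime_error("column family is not writeable yet");
        }
        return ++*_generation;
    }
};
```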

Note that this *cannot* be a part of add_column_family. That adds a column family
to a db in memory only, and if anybody is about to write to a CF, that was most
likely already called. We need to call this explicitly when we are sure we're ready
to issue disk operations safely.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-10 21:06:05 -05:00
Tomasz Grabiec
4e5a52d6fa db: Make read interface schema version aware
The intent is to make data returned by queries always conform to a
single schema version, which is requested by the client. For CQL
queries, for example, we want to use the same schema which was used to
compile the query. The other node expects to receive data conforming
to the requested schema.

Interface on shard level accepts schema_ptr, across nodes we use
table_schema_version UUID. To transfer schema_ptr across shards, we
use global_schema_ptr.

Because schema is identified with UUID across nodes, requestors must
be prepared for being queried for the definition of the schema. They
must hold a live schema_ptr around the request. This guarantees that
schema_registry will always know about the requested version. This is
not an issue because for queries the requestor needs to hold on to the
schema anyway to be able to interpret the results. But care must be
taken to always use the same schema version for making the request and
parsing the results.

Schema requesting across nodes is currently stubbed (throws runtime
exception).
2016-01-11 10:34:52 +01:00
Tomasz Grabiec
036974e19b Make mutation interfaces support multiple versions
Schema is tracked in memtable and cache per-entry. Entries are
upgraded lazily on access. Incoming mutations are upgraded to table's
current schema on given shard.

Mutating nodes need to keep schema_ptr alive in case schema version is
requested by target node.
2016-01-11 10:34:51 +01:00
Tomasz Grabiec
5184381a0b memtable: Deconstify memtable in readers
We want to upgrade entries on read and for that we need mutating
permission.
2016-01-11 10:34:51 +01:00
Tomasz Grabiec
3e447e4ad1 tests: mutation_test: Add tests for equality and hashing 2016-01-11 10:34:50 +01:00
Tomasz Grabiec
4b92ef01fc test: Add tests for mutation upgrade 2016-01-08 21:10:26 +01:00
Raphael S. Carvalho
03eee06784 remove empty rows in mutation_partition::do_compact
do_compact() wasn't removing an empty row that is covered by a
tombstone. As a result, an empty partition could be written to a
sstable. To solve this problem, let's make trim_rows remove a
row that is considered to be empty. A row is empty if it has no
tombstone, no marker and no cells.
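
The emptiness rule stated above can be written down as a small sketch. `row_sketch`, `rows_sketch` and `remove_empty_rows` are hypothetical stand-ins using `std::map`; they are not the mutation_partition types, and the real trim_rows does more than this.

```cpp
#include <cstddef>
#include <map>

struct row_sketch {
    bool has_tombstone;
    bool has_marker;
    std::size_t cell_count;

    // A row is empty if it has no tombstone, no marker and no cells.
    bool empty() const {
        return !has_tombstone && !has_marker && cell_count == 0;
    }
};

using rows_sketch = std::map<int, row_sketch>;

// Drop every empty row; returns how many rows were removed.
std::size_t remove_empty_rows(rows_sketch& rows) {
    std::size_t removed = 0;
    for (auto it = rows.begin(); it != rows.end();) {
        if (it->second.empty()) {
            it = rows.erase(it); // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```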

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-01-05 15:19:21 +01:00