Commit Graph

11716 Commits

Author SHA1 Message Date
Duarte Nunes
4eca7632ec sstables: Replace composite fields with raw bytes
This patch fixes a regression introduced in
f81329be60, which made keys compound by
default when using a particular ctor, in turn leading to mismatches
when comparing the same key built with functions that properly
consider compoundness.

As a temporary fix, the sstable::key and sstable::key_view classes
store raw bytes instead of a composite.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1468339295-3924-1-git-send-email-duarte@scylladb.com>
2016-07-12 18:08:04 +02:00
Duarte Nunes
f013425bb5 query: Ensure timestamp is last param in read_command
Since the timestamp is not serialized, it must always be the last
parameter of query::read_command. This patch reorders it with the
partition_limit parameters and updates callers that specified a
timestamp argument.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1468312334-10623-1-git-send-email-duarte@scylladb.com>
2016-07-12 10:41:54 +01:00
Amnon Heiman
41546747d8 scylla-server.service: Start the scylla-housekeeping
This makes scylla-server try to start the scylla-housekeeping service.

Failing to start the service will not interfere with the scylla-server
start.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-07-12 12:32:52 +03:00
Amnon Heiman
0eba2b8fd5 scylla.spec.in: Pack the scylla-housekeeping service
This change packs and installs the scylla-housekeeping service on
Red Hat-like systems.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-07-12 12:32:48 +03:00
Tomasz Grabiec
c5e3c9bc35 Merge branch 'duarten/composite-v7' from git@github.com:duarten/scylla.git
From Duarte:

This patchset adds a representation of a legacy composite
value to compound_compat.hh and replaces the one in
sstables/key.hh. This patchset is needed for the thrift series.
2016-07-12 10:49:02 +02:00
Amnon Heiman
6d5049d90b Adding the scylla-housekeeping service
The scylla-housekeeping service is responsible for recurrent tasks.

It is currently set to run daily and report if the version is correct.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-07-12 11:47:04 +03:00
Amnon Heiman
30efdabf55 Introducing the scylla-housekeeping script
scylla-housekeeping is a script that checks for and reports hardware and software issues.

Its first phase checks for a newer version and reports if the installed
version is old.

To see the available options run
scylla-housekeeping help
2016-07-12 11:12:43 +03:00
Glauber Costa
73a70e6d0a config: Use Scylla in user visible options
We have imported most of our data about config options from Cassandra.  Due to
that, many options that mention the database by name are still using
"Cassandra".

Especially for the user-visible options, which users see directly, we
should really be using Scylla here.

This patch was created by automatically replacing every occurrence of "Cassandra"
with "Scylla" and then discarding the replacements where the change didn't
make sense (such as unused options and mentions of the Cassandra documentation).

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <1423e1d7e36874a1f46bd091aec96dcb4d8482d9.1468267193.git.glauber@scylladb.com>
2016-07-12 09:18:17 +03:00
Duarte Nunes
f81329be60 sstables: sstables::key delegates to composite
The sstables::key class now delegates much of its functionality
to the composite class. All existing behavior is preserved.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-07-11 23:37:33 +02:00
Gleb Natapov
726b79ea91 messaging_service: enable internode_compression option
Use LZ4 for internode compression if enabled.

Message-Id: <20160711141734.GZ18455@scylladb.com>
2016-07-11 18:30:21 +03:00
Avi Kivity
201f585ab6 Merge seastar upstream
* seastar e7a7d41...e660d54 (1):
  > rpc: add factory class for lz4 compressor
2016-07-11 18:29:43 +03:00
Glauber Costa
f7706d51d1 scyllatop: fix typo
Keyborad -> Keyboard

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <349f20fd69be2f2e05ae0b7800e34a336cd2472b.1468248179.git.glauber@scylladb.com>
2016-07-11 18:27:49 +03:00
Duarte Nunes
ad8ff1df7e sstables: Replace composite class
This patch replaces the sstables::composite class with the one in
compound_compat.hh.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-07-11 16:55:11 +02:00
Duarte Nunes
0b87d16699 composite: Add unit tests
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-07-11 16:55:11 +02:00
Duarte Nunes
b179d8d378 compound_compat: Parse legacy compound values
This patch adds support for parsing legacy compound values by
introducing the composite class, a wrapper around a sequence of bytes
serialized in the legacy format for compounds. Compound values can be
sent through the thrift API.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-07-11 16:55:07 +02:00
Avi Kivity
9b08ddb639 Merge seastar upstream
* seastar 9267dfa...e7a7d41 (3):
  > Merge "Compression support for RPC" from Gleb
  > reactor: allow sleeping while disk aio is pending
  > sstring: add resize method
2016-07-11 16:23:29 +03:00
Calle Wilund
4ab03e98cf commitlog: Ensure we don't end up in a loop when we must wait for alloc
Continuation reordering could cause us to repeatedly see the
segment-local flag var even though actual write/sync ops are done.
Can cause wild recursion without actual delayed continuation ->
SOE.

Fix by also checking queue status, since this is the wait object.

Message-Id: <1468234873-13581-1-git-send-email-calle@scylladb.com>
2016-07-11 14:12:38 +03:00
Calle Wilund
14b0fe23c5 commitlog: Ensure we don't end up in a loop when we must wait for alloc
Continuation reordering could cause us to repeatedly see the 
segment-local flag var even though actual write/sync ops are done. 
Can cause wild recursion without actual delayed continuation ->
SOE. 

Fix by also checking queue status, since this is the wait object.
2016-07-11 07:45:36 +00:00
Avi Kivity
f126efd7f2 transport: encode user-defined type metadata
Right now we fall back to tuples, which confuses the client.

Fixes #1443.

Reviewed-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1468167120-1945-1-git-send-email-avi@scylladb.com>
2016-07-11 08:51:17 +03:00
Takuya ASADA
d2caa486ba dist/redhat/centos_dep: disable go and ada language on scylla-gcc package, since ScyllaDB never use them
The centos-master Jenkins job failed while building libgo, but we don't need the Go language, so let's disable it in the scylla-gcc package.
We never use Ada either, so disable it too.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1468166660-23323-1-git-send-email-syuu@scylladb.com>
2016-07-10 19:12:52 +03:00
Avi Kivity
24e3026e32 Merge "compaction manager refactoring" from Raphael 2016-07-10 17:16:23 +03:00
Tomasz Grabiec
6a1f9a9b97 db: Improve logging
Message-Id: <1467997671-16570-1-git-send-email-tgrabiec@scylladb.com>
2016-07-10 16:15:03 +03:00
Avi Kivity
b5bef73ad2 Merge "Avoiding checking bloom filters during compaction" from Tomasz
"Checking bloom filters of sstables to compute max purgeable timestamp
for compaction is expensive in terms of CPU time. We can avoid
calculating it if we're not about to GC any tombstone.

This patch changes compacting functions to accept a function instead
of a ready value for max_purgeable.

I verified that bloom filter operations no longer appear on flame
graphs during compaction-heavy workload (without tombstones).

Refs #1322."
2016-07-10 11:33:41 +03:00
Tomasz Grabiec
8c4b5e4283 db: Avoiding checking bloom filters during compaction
Checking bloom filters of sstables to compute max purgeable timestamp
for compaction is expensive in terms of CPU time. We can avoid
calculating it if we're not about to GC any tombstone.

This patch changes compacting functions to accept a function instead
of a ready value for max_purgeable.

I verified that bloom filter operations no longer appear on flame
graphs during compaction-heavy workload (without tombstones).

Refs #1322.
2016-07-10 09:54:20 +02:00
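The idea in 8c4b5e4283 above, passing a function instead of a ready value for max_purgeable, can be sketched roughly as follows. This is only an illustration under assumptions, not Scylla's code: `cell`, `compact` and `timestamp_type` are hypothetical names, and the purge rule (drop a tombstone whose timestamp is below max_purgeable) is a simplification. The point is that the expensive computation runs only when a tombstone is actually encountered.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in types, not Scylla's actual classes.
using timestamp_type = std::int64_t;

struct cell {
    timestamp_type timestamp;
    bool is_tombstone;
};

// Drops tombstones older than the max purgeable timestamp.
// get_max_purgeable stands in for the expensive bloom-filter-based
// computation; it is called at most once, and only if a tombstone is seen.
std::vector<cell> compact(const std::vector<cell>& cells,
                          const std::function<timestamp_type()>& get_max_purgeable) {
    assert(static_cast<bool>(get_max_purgeable)); // a callable must be supplied
    std::optional<timestamp_type> max_purgeable;  // cached after first use
    std::vector<cell> out;
    for (const cell& c : cells) {
        if (c.is_tombstone) {
            if (!max_purgeable) {
                max_purgeable = get_max_purgeable();
            }
            if (c.timestamp < *max_purgeable) {
                continue; // tombstone can be garbage-collected
            }
        }
        out.push_back(c);
    }
    return out;
}
```

With an input that contains no tombstones, the callable is never invoked, which corresponds to the tombstone-free workload mentioned in the commit message.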
Tomasz Grabiec
c0233c877d db: Avoid out-of-memory when flushing cannot keep up
memtable_list::seal_on_overflow() is called on each mutation to check
if the current memtable should be flushed. It will call
memtable_list::seal_active_memtable() when that is the case.

The number of concurrent seals is guarded by a semaphore, starting
from commit 0f64eb7e7d, and allows
at most 4 of them.

If there are 4 flushes already pending, every incoming mutation will
enqueue a new flush task on the semaphore's wait list, without waiting
for it. The wait queue can grow without bounds, eventually leading to
out-of-memory.

The fix is to seal the memtable immediately to satisfy the should_flush()
condition, but limit the concurrency of actual flushes. This way the wait
queue size on the semaphore is bounded by the number of memtables pending
a flush, which is fairly limited.

Message-Id: <1467997652-16513-1-git-send-email-tgrabiec@scylladb.com>
2016-07-10 10:53:51 +03:00
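The fix in c0233c877d above can be modeled with a small single-threaded sketch. It is not the Scylla implementation (which uses Seastar futures and a semaphore). The class `memtable_list_model` and its members are hypothetical, and flush completion is signalled by an explicit `flush_done()` call. What it illustrates is that sealing happens immediately, while only a bounded number of flushes run at once, so the waiting queue grows per sealed memtable rather than per mutation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical, single-threaded model of the behaviour described above.
class memtable_list_model {
    std::size_t _flush_threshold;
    std::size_t _max_concurrent_flushes;
    std::size_t _active_size = 0;
    std::size_t _running_flushes = 0;
    std::deque<std::size_t> _sealed_waiting; // sizes of sealed memtables not yet flushing
public:
    memtable_list_model(std::size_t flush_threshold, std::size_t max_concurrent_flushes)
        : _flush_threshold(flush_threshold)
        , _max_concurrent_flushes(max_concurrent_flushes) {}

    // Called for every mutation. Seals immediately once the threshold is
    // reached, so later mutations land in a fresh (empty) memtable.
    void apply(std::size_t mutation_size) {
        _active_size += mutation_size;
        if (_active_size >= _flush_threshold) {
            _sealed_waiting.push_back(_active_size);
            _active_size = 0;
            start_flushes();
        }
    }

    // Signals that one running flush has finished.
    void flush_done() {
        assert(_running_flushes > 0);
        --_running_flushes;
        start_flushes();
    }

    std::size_t waiting() const { return _sealed_waiting.size(); }
    std::size_t running() const { return _running_flushes; }

private:
    // Starts as many waiting flushes as the concurrency limit allows.
    void start_flushes() {
        while (_running_flushes < _max_concurrent_flushes && !_sealed_waiting.empty()) {
            _sealed_waiting.pop_front();
            ++_running_flushes;
        }
    }
};
```

In this model, 100 mutations of size 1 against a threshold of 10 produce only 10 sealed memtables, of which at most 4 flush concurrently when the limit is 4.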
Tomasz Grabiec
74ff30a31a mutation_reader: Introduce stable_flattened_mutations_consumer adaptor
Needed to make compact_mutation class non-movable later. It is used in
do_with, so needs to be movable. Will be solved by using this adaptor.
2016-07-09 22:31:28 +02:00
Tomasz Grabiec
fb44f895b2 mutation_reader: Name template parameters after concepts
With so many consumer concepts out there, it is confusing to name
parameters using the generic "Consumer" name. Let's name them after
(already defined) concepts: CompactedMutationsConsumer, FlattenedConsumer.
2016-07-09 22:31:27 +02:00
Raphael S. Carvalho
ed5e7e6842 compaction: refactor compaction manager
Previously, the same function was used to handle both regular compaction
and cleanup requests. That's bad because a lot of conditions were
added for both compaction types to live in the same function.
Now, cleanup and regular compaction will live in different functions.
They share a lot of code, so helper functions were introduced.
This change is also important for user-initiated compaction that
will go through compaction manager in the future.
Code is also a lot easier to read now.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 16:37:53 -03:00
Raphael S. Carvalho
da6a2b429d compaction: add functions to register and deregister compacting sstables
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 16:00:51 -03:00
Raphael S. Carvalho
4d6dce8ec9 compaction: add helper function to get candidates for strategy
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 15:06:14 -03:00
Raphael S. Carvalho
e38f66c6fe database: make certain column family functions const qualified
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 15:05:22 -03:00
Raphael S. Carvalho
bfc5376548 compaction: remove gate from compaction manager task
There is no longer a need to use gate for regular termination of
fiber that runs compaction. Now, we only set task->stopping to
true, ask for compaction termination, and wait for its future to
resolve. Code is simplified a lot with this change.

Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 15:05:10 -03:00
Paweł Dziepak
cba996a3ea Merge "Implement missing functions for byte_ordered_partitioner" from Asias 2016-07-08 10:49:25 +01:00
Asias He
f4389349e4 config: Enable partitioner option
Enable the --partitioner option so that the user can choose a partitioner other
than the default Murmur3Partitioner. Currently, only Murmur3Partitioner
and ByteOrderedPartitioner are supported. When an unsupported partitioner
is specified, an error will be propagated to the user.
2016-07-08 17:44:55 +08:00
Asias He
9c27b5c46e byte_ordered_partitioner: Implement missing describe_ownership and midpoint
In order to support ByteOrderedPartitioner, we need to implement the
missing describe_ownership and midpoint functions in the
byte_ordered_partitioner class.

As a start, this patch uses a simple method based on node token distance
to calculate ownership. C* uses a more complicated method based on key samples.
We can switch to what C* does later.

Tests are added to tests/partitioner_test.cc.

Fixes #1378
2016-07-08 17:44:55 +08:00
Asias He
e0949a8f4f storage_service: Exit shadow round state if it fails
If a node fails to talk to any seed node, shadow round will fail. We
should exit shadow round state before we continue.

This issue was spotted by the
consistency_test.TestConsistency.data_query_digest_test dtest.
Message-Id: <ba0613532a69bac369ca316ab61d907b320c8e68.1467963674.git.asias@scylladb.com>
2016-07-08 10:05:07 +01:00
Avi Kivity
8dab93a853 sstables: fix low disk utilization with compression and small chunk lengths
As Nadav notes we use the chunk length as the buffer size for the compressed
stream too.

Fix by using it only for the outer (uncompressed) stream; the inner
(compressed) stream uses the sstable buffer size, 128 kiB.

Fixes #1402.
Message-Id: <1467910556-5759-1-git-send-email-avi@scylladb.com>
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
2016-07-07 18:13:30 +01:00
Vlad Zolotarov
f2bf453be2 database: revive mutation retry in case of replay_position_reordered_exception
The logic that would retry applying a mutation in case of
a replay_position_reordered_exception error was broken by
commit 0c31f3e626
Author: Glauber Costa <glauber@scylladb.com>
Date:   Wed Apr 20 19:09:21 2016 -0400

    database: move memtable throttler to the LSA throttler

This patch makes it work again.

Fixes #1439

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1467893342-30559-1-git-send-email-vladz@cloudius-systems.com>
2016-07-07 15:00:35 +02:00
Tomasz Grabiec
de429d6a53 Merge branch 'dev/pdziepak/streamed-mutations-streaming/v3'
Support for streaming of large partitions from Paweł:

This series converts streaming to streaming_mutations so that there
is no need to store the full mutation in memory in order to send or
receive it.

The first several patches add a way of estimating mutation fragment
memory usage and introduce fragment_and_freeze() which produces
a stream of reasonably sized frozen mutations from a single streamed
mutation.

The second part of this patchset makes sure that streaming mutations
in fragments doesn't break isolation guarantees. This is achieved by
delaying visibility of sstables produced by streaming until the
streaming is completed. However, our current receiving code merges
mutations from all streaming plans together thus making it impossible
to track which data was received from a particular streaming plan.
The solution to that problem is to introduce an additional flag to
STREAM_MUTATION verb which informs the receiver whether the mutation
is fragmented and care must be taken to preserve isolation. Small
mutations behave as before, with writes from different stream
plans coalesced, while big mutations are handled separately for each
streaming task.
2016-07-07 13:23:39 +02:00
Paweł Dziepak
d9eb4d8028 streaming: use fragment_and_freeze() to send mutations
Commit 206955e4 "streaming: Reduce memory usage when sending mutations"
moved the streaming mutation limiter from do_send_mutations() to
send_mutations(). The reason for that was that send_mutations() did full
mutation copies. That's no longer the case, and the streaming limiter should
be moved back to do_send_mutations() in order to provide back pressure to
fragment_and_freeze().

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:36 +01:00
Paweł Dziepak
32a5de7a1f db: handle receiving fragmented mutations
If mutations are fragmented during streaming, special care must be
taken so that isolation guarantees are not broken.

Mutations received with the "fragmented" flag set are applied to a memtable
that is used only by that particular streaming task, and the sstables
created by flushing such memtables are not made visible until the task
is complete. Also, if the streaming fails, all data is dropped.

This means that fragmented mutations cannot benefit from coalescing of
writes from multiple streaming plans, hence the separate way of handling
them so that there is no loss of performance for small partitions.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:35 +01:00
Paweł Dziepak
f2ae31711e streaming: inform CF when streaming fails
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:35 +01:00
Paweł Dziepak
4031c0ed8f streaming: pass plan_id to column family for apply and flush
plan_id is needed to keep track of the origin of mutations so that if
they are fragmented all fragments are made visible at the same time,
when that particular streaming plan_id completes.

Basically, each streaming plan that sends big (fragmented) mutations is
going to have its own memtables and a list of sstables which will get
flushed and made visible when that plan completes (or dropped if it
fails).

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:35 +01:00
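The per-plan bookkeeping described in 4031c0ed8f and 32a5de7a1f above can be sketched as below. This is a simplified model under assumptions, not the actual column family code. `column_family_model`, its methods, the integer `plan_id` and the use of strings for fragments are all hypothetical. It only shows the visibility rule: data staged for a plan becomes visible all at once when the plan completes, and is dropped if the plan fails.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model; real plan ids are not plain integers.
using plan_id = int;

class column_family_model {
    std::vector<std::string> _visible;                      // data readers can see
    std::map<plan_id, std::vector<std::string>> _staged;    // per-plan, not yet visible
public:
    // A fragmented mutation is staged under the plan it came from.
    void apply_fragment(plan_id plan, std::string fragment) {
        _staged[plan].push_back(std::move(fragment));
    }

    // On success, everything staged for the plan becomes visible together.
    void plan_completed(plan_id plan) {
        auto it = _staged.find(plan);
        if (it == _staged.end()) {
            return;
        }
        for (auto& f : it->second) {
            _visible.push_back(std::move(f));
        }
        _staged.erase(it);
    }

    // On failure, everything staged for the plan is dropped.
    void plan_failed(plan_id plan) {
        _staged.erase(plan);
    }

    const std::vector<std::string>& visible() const { return _visible; }
};
```

Keeping the staging area keyed by plan is what makes it possible to tell which data came from which streaming plan, which the merge message for this series notes was not possible when all plans were merged together.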
Paweł Dziepak
51ec7a7285 db: wait for ongoing flushes at end of streaming
When flush_streaming_mutations() is called at the end of streaming it is
supposed to flush all data and then invalidate cache ranges. However, if
there are already some memtable flushes in progress it won't wait for them.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:35 +01:00
Paweł Dziepak
5bc51821fe sstables: allow writing unsealed sstables
The purpose of this patch is to split the actions of writing sstable and
sealing it. As long as the sstable is unsealed it is considered
incomplete and is going to be removed on reboot.

Such functionality is needed in order to defer visibility of sstables
created during streaming until the streaming is complete.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:35 +01:00
Paweł Dziepak
a7b6c1110f sstables: do not require seal_sstable() to be run in thread
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:35 +01:00
Paweł Dziepak
4e34bd4e8a tests/streamed_mutation: test fragment_and_freeze()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:35 +01:00
Paweł Dziepak
19629e95e2 frozen_mutation: add fragment_and_freeze()
fragment_and_freeze() produces a stream of frozen mutations from a
single streamed_mutation.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:18:30 +01:00
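For 19629e95e2 above, the following is a rough sketch of what producing "reasonably sized" pieces from one large mutation could look like. The commit message does not state the batching policy, so the size-limit rule here is an assumption, and `fragment_model`, `frozen_batch` and `fragment_and_freeze_model` are hypothetical names rather than the real API. The per-fragment `memory_usage` field mirrors the memory_usage() estimators added elsewhere in the series.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a mutation fragment and a frozen mutation.
struct fragment_model {
    std::size_t memory_usage; // estimated size of this fragment
};
using frozen_batch = std::vector<fragment_model>;

// Groups consecutive fragments into batches whose total estimated size
// stays within batch_limit. A single fragment larger than the limit
// still gets a batch of its own.
std::vector<frozen_batch> fragment_and_freeze_model(const std::vector<fragment_model>& fragments,
                                                    std::size_t batch_limit) {
    std::vector<frozen_batch> out;
    frozen_batch current;
    std::size_t current_size = 0;
    for (const fragment_model& f : fragments) {
        if (!current.empty() && current_size + f.memory_usage > batch_limit) {
            out.push_back(std::move(current));
            current.clear(); // reset the moved-from vector to a known empty state
            current_size = 0;
        }
        current.push_back(f);
        current_size += f.memory_usage;
    }
    if (!current.empty()) {
        out.push_back(std::move(current));
    }
    return out;
}
```

The real function produces a stream rather than a fully materialized vector, which is what allows back pressure from the streaming limiter. The vector here just keeps the sketch short.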
Paweł Dziepak
820bd6c9bc streamed_mutation: add mutation_fragment::memory_usage()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:17:25 +01:00
Paweł Dziepak
23d0bfd065 mutation_partition: add row::memory_usage()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:17:25 +01:00