Commit Graph

8470 Commits

Author SHA1 Message Date
Nadav Har'El
7dc843fc1c repair: stop ongoing repairs during shutdown
When shutting down a node gracefully, this patch asks all ongoing repairs
started on this node to stop as soon as possible (without completing
their work), and then waits for these repairs to finish (with failure,
usually, because they didn't complete).

We need to do this, because if the repair loop continues to run while we
start destructing the various services it relies on, it can crash (as
reported in #699, although the specific crash reported there no longer
occurs after some changes in the streaming code). Additionally, it is
important that to stop the ongoing repair, and not wait for it to complete
its normal operation, because that can take a very long time, and shutdown
is supposed to not take more than a few seconds.

Fixes #699.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1455218873-6201-1-git-send-email-nyh@scylladb.com>
2016-02-14 16:52:41 +02:00
Raphael S. Carvalho
a487ef1ff3 sstables: improve log message when a sstable is sealed
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <e391243212d83347b1b50c728bee24f6a2ecc950.1455230788.git.raphaelsc@scylladb.com>
2016-02-14 12:05:16 +02:00
Tomasz Grabiec
456275e06a storage_proxy: Simplify condition
Message-Id: <1455288472-30538-1-git-send-email-tgrabiec@scylladb.com>
2016-02-14 11:22:15 +02:00
Tomasz Grabiec
321287dd7c cql3: Fix crash when parsing collection condition
Happened when parsing a statement like this:

 DELETE FROM tmap WHERE k=0 IF m[null] = 'foo'

Message-Id: <1455294896-15184-1-git-send-email-tgrabiec@scylladb.com>
2016-02-14 11:21:10 +02:00
Takuya ASADA
3697cee76d dist: switch AMI base image to 'CentOS7-Base2', uses CentOS official kernel
On previous CentOS base image, it accsidently uses non-standard kernel from elrepo.
This replaces base image to new one, contains CentOS default kernel.

Fixes #890

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1455398903-2865-1-git-send-email-syuu@scylladb.com>
2016-02-14 10:15:27 +02:00
Tomasz Grabiec
efdbc3d6d7 abstract_replication_strategy: Fix generation of token ranges
We can't move-from in the loop because the subject will be empty in
all but the first iteration.

Fixes crash during node stratup:

  "Exiting on unhandled exception of type 'runtime_exception': runtime error: Invalid token. Should have size 8, has size 0"

Fixes update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_add_node_1_test (and probably others)

Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>
2016-02-12 19:38:36 +01:00
Shlomi Livne
f938e1d303 dist: start scylla with SCYLLA_IO
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <d93a7b41a285fcde796c5681479a328f1efac0c3.1455188901.git.shlomi@scylladb.com>
2016-02-11 17:01:03 +02:00
Shlomi Livne
5494135ddd dist: update SCYLLA_IO with params for AMI
Add setting of --num-io-queues, --max-io-requests for AMI

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <b94a63154a91c8568e194d7221b9ffc7d7813ebc.1455188901.git.shlomi@scylladb.com>
2016-02-11 17:01:02 +02:00
Shlomi Livne
5cae2560a3 dist: introduce SCYLLA_IO
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <6490d049fd23a335bb0a95cac3e8a4c08c61166e.1455188901.git.shlomi@scylladb.com>
2016-02-11 17:01:02 +02:00
Shlomi Livne
d8cdf76e70 dist: change setting of scylla home from "-d" to "-r"
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <53dcd9d1daa0194de3f889b67788d9c21d1e474d.1455188901.git.shlomi@scylladb.com>
2016-02-11 17:00:37 +02:00
Avi Kivity
3c4f67f3e6 build: require boost > 1.55
See #898.

Add checks both for boost being installed, and for the correct version.
Message-Id: <1455193574-24959-1-git-send-email-avi@scylladb.com>
2016-02-11 15:15:49 +02:00
Avi Kivity
9249d45ae1 Update scylla-ami submodule
* dist/ami/files/scylla-ami b2724be...b3b85be (1):
  > adding --stop-services
2016-02-11 12:24:17 +02:00
Avi Kivity
5834815ed9 Merge seastar upstream
* seastar 14c9991...353b1a1 (2):
  > scripts: posix_net_conf.sh: Change the way we learn NIC's IRQ numbers
  > gate: protect against calling close() more than once
2016-02-11 12:23:51 +02:00
Takuya ASADA
09b1ec6103 dist: attach ephemeral disks on AMI by default
To attach maximum number of ephemeral disks available on the instance, specify 8.
On AMI creation, it will be reduce to available number.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1454439628-2882-1-git-send-email-syuu@scylladb.com>
2016-02-11 12:21:09 +02:00
Takuya ASADA
16e6db42e1 dist: abandon to start scylla-server when it's disabled from AMI userdata
Support AMi's --stop-services, prevent startup scylla-server (and scylla-jmx, since it's dependent on scylla-server)

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1454492729-11876-1-git-send-email-syuu@scylladb.com>
2016-02-11 12:21:08 +02:00
Takuya ASADA
f227b3faac dist: On AMI, mark root disk with delete_on_termination
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1454513308-12384-1-git-send-email-syuu@scylladb.com>
2016-02-11 12:19:28 +02:00
Takuya ASADA
33309f667e dist: enable enhanced networking on AMI
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1454971289-21369-1-git-send-email-syuu@scylladb.com>
2016-02-11 12:18:48 +02:00
Raphael S. Carvalho
ed61fe5831 sstables: make compaction stop report user-friendly
When scylla stopped an ongoing compaction, the event was reported
as an error. This patch introduces a specialized exception for
compaction stop so that the event can be handled appropriately.

Before:
ERROR [shard 0] compaction_manager - compaction failed: read exception:
std::runtime_error (Compaction for keyspace1/standard1 was deliberately
stopped.)

After:
INFO  [shard 0] compaction_manager - compaction info: Compaction for
keyspace1/standard1 was stopped due to shutdown.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <1f85d4e5c24d23a1b4e7e0370a2cffc97cbc6d44.1455034236.git.raphaelsc@scylladb.com>
2016-02-11 12:16:53 +02:00
Takuya ASADA
8d8130f9c9 dist: fix typo on build_ami.sh
We should always run scylla_setup, not just for locally built rpm

Fixes #897

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1455103519-13780-1-git-send-email-syuu@scylladb.com>
2016-02-11 11:56:11 +02:00
Shlomi Livne
64f8d5a50e dist: update packer location
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <3c33ea073f702e00b789930fce9befef03ad9e88.1455178900.git.shlomi@scylladb.com>
2016-02-11 11:52:56 +02:00
Avi Kivity
bfbf89ee31 Merge "Serialize keys in a form independent of in-memory representation" from Tomasz
"This series changes the on-wire definitions of keys to be of the following form:

  class partition_key {
     std::vector<bytes> exploded();
  };

Keys are therefore collections of components. The components are serialized according
to the format specified in the CQL binary protocol. No bit depends now on how we store keys in memory.

Constructing keys from components currently requires a schema reference,
which makes it not possible to deserialize or serialize the keys automatically
by RPC. To avoid those complications, compound_type was changed so that
it can be constructed and components can be iterated over without schema.
Because of this, partition_key size increased by 2 bytes."
2016-02-10 17:54:42 +02:00
Tomasz Grabiec
b74301302c tests: Add test for key serialization 2016-02-10 15:22:56 +01:00
Tomasz Grabiec
3e2c1840d8 idl: Make key definitions independent of in-memory representation 2016-02-10 15:22:56 +01:00
Tomasz Grabiec
428fce3828 compound: Optimize serialize_single() 2016-02-10 15:22:56 +01:00
Tomasz Grabiec
0cc2832a76 keys: Allow constructing from a range 2016-02-10 15:22:56 +01:00
Tomasz Grabiec
3ffcb998fb keys: Enable serialization from a range not just a vector 2016-02-10 14:35:14 +01:00
Tomasz Grabiec
095efd01d6 keys: Make from_exploded() and components() work without schema
For simplicity, we want to have keys serializable and deserializable
without schema for now. We will serialize keys in a generic form of a
vector of components where the format of components is specified by
CQL binary protocol. So conversion between keys and vector of
components needs to be possible to do without schema.

We may want to make keys schema-dependent back in the future to apply
space optimizations specific to column types. Existing code should
still pass schema& to construct and access the key when possible.

One optimization had to be reverted in this change - avoidance of
storing key length (2 bytes) for single-component partition keys. One
consequence of this, in addition to a bit larger keys, is that we can
no longer avoid copy when constructing single-component partition keys
from a ready "bytes" object.

I haven't noticed any significant performance difference in:

  tests/perf/perf_simple_query -c1 --write

It does ~130K tps on my machine.
2016-02-10 14:35:13 +01:00
Tomasz Grabiec
31312722d1 compound: Reduce duplication 2016-02-10 14:35:13 +01:00
Tomasz Grabiec
085d148d6f compound: Remove unused methods 2016-02-10 14:35:13 +01:00
Tomasz Grabiec
b777cc9565 tests: Fix tests to not rely on key representation 2016-02-10 14:35:13 +01:00
Asias He
6d0407503b locator: Do not generate wrap-around ranges
Like we did in commit d54c77d5d0,
make the remaining functions in abstract_replication_strategy return
non-wrap-around ranges.

This fixes:

ERROR [shard 0] stream_session - [Stream #f0b7fda0-cf3e-11e5-b6c4-000000000000]
stream_transfer_task: Fail to send to 127.0.0.4:0: std::runtime_error (Not implemented: WRAP_AROUND)

in streaming.
Message-Id: <514d2a9a1d3b868d213464c8858ac5162c0338d8.1455093643.git.asias@scylladb.com>
2016-02-10 10:03:31 +01:00
Avi Kivity
9f3061ade8 Revert "streaming: Send mutations on all shards"
This reverts commit 31d439213c.

Fixes #894.

Conflicts:
    streaming/stream_manager.cc

(may have undone part of 63a5aa6122)
2016-02-09 18:26:14 +02:00
Calle Wilund
873f87430d database: Check sstable dir name UUID part when populating CF
Fixes #870
Only load sstables from CF directories that match the current
CF uuid.
Message-Id: <1454938450-4338-1-git-send-email-calle@scylladb.com>
2016-02-08 14:48:19 +01:00
Calle Wilund
2ffd7d7b99 stream_manager: Change construction to make gcc 4.9 happy
gcc 4.9 complains about the type{ val, val } construction of
type with implicit default constructor, i.e. member = initial
declarations. gcc 5 does not (and possibly rightly so).
However, we still (implicitly) claim to support gcc 4.9 so
why not just change this particular instance.

Message-Id: <1454921328-1106-1-git-send-email-calle@scylladb.com>
2016-02-08 10:54:48 +02:00
Paweł Dziepak
c90ec731c8 transport: do not close gate at connection shutdown
connection::_pending_requests_gate is responsible for keeping connection
objects alive as long as there are outstanding requests and is closed
in connection::proccess() when needed. Closing it in connection::shutdown()
as well may cause the gate to be closed twice what is a bug.

Fixes #690.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1454596390-23239-1-git-send-email-pdziepak@scylladb.com>
2016-02-07 20:07:23 +02:00
Avi Kivity
8b0a26f06d build: support for alternative versions of libsystemd pkgconfig
While pkgconfig is supposed to be a distribution and version neutral way
of detecting packages, it doesn't always work this way.  The sd_notify()
manual page documents that sd_notify is available via the libsystemd
package, but on centos 7.0 it is only available via the libsystemd-daemon
package (on centos 7.1+ it works as expected).

Fix by allowing for alternate version of package names, testing each one
until a match is found.

Fixes #879.

Message-Id: <1454858862-5239-1-git-send-email-avi@scylladb.com>
2016-02-07 17:36:57 +02:00
Avi Kivity
ad58663c96 row_cache: reindent 2016-02-07 13:25:29 +02:00
Asias He
31d439213c streaming: Send mutations on all shards
Currently, only the shard where the stream_plan is created on will send
streaing mutations. To utilize all the available cores, we can make each
shard send mutations which it is responsbile for. On the receiver side,
we do not forward the mutations to the shard where the stream_session is
created, so that we can avoid unnecessary forwarding.

Note: the downside is that it is now harder to:

1) to track number of bytes sent and received
2) to update the keep alive timer upon receive of the STREAM_MUTATION

To fix, we now store the sent/recieved bytes info on all shards. When
the keep alive timer expires, we check if any progress has been made.

Hopefully, this patch will make the streaming much faster and in turn
make the repair/decommission/adding a node faster.

Refs: https://github.com/scylladb/scylla/issues/849

Tested with decommission/repair dtest.

Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>
2016-02-07 10:57:51 +02:00
Gleb Natapov
63a5aa6122 prevent superfluous frozen_mutation copying
Sometimes frozen_mutation is copied while it can be moved instead. Fix
those cases.

Message-Id: <20160204165708.GI6705@scylladb.com>
2016-02-07 10:54:16 +02:00
Erich Keane
4197ceeedb raw_statement::is_reversed rewrite to avoid VLA
The is_reversed function uses a variable length array, which isn't
spec-abiding C++.  Additionally, the Clang compiler doesn't allow them
with non-POD types, so this function wouldn't compile.

After reading through the function it seems that the array wasn't
necessary as the check could be calculated inline rather than
separately.  This version should be more performant (since it no longer
requires the VLA lookup performance hit) while taking up less memory in
all but the smallest of edge-cases (when the clustering_key_size *
sizeof(optional<bool>) < sizeof(size_type) - sizeof(uint32_t) +
sizeof(bool).

This patch uses  relation_order_unsupported it assure that the exception
order is consistent with the preivous version.  The throw would
otherwise be moved into the initial for-loop.

There are two derrivations in behavior:
The first is the initial assert.  It however should not change the apparent
behavior besides causing orderings() to be looked up 2x in debug
situations.

The second is the conversion of is_reversed_ from an optional to a bool.
The result is that the final return value is now well-defined to be
false in the release-condition where orderings().size() == 0, rather
than be the ill-defined *is_reversed_ that was there previously.

Signed-off-by: Erich Keane <erich.keane@verizon.net>
Message-Id: <1454546285-16076-4-git-send-email-erich.keane@verizon.net>
2016-02-07 10:38:17 +02:00
Erich Keane
49842aacd9 managed_vector: maybe_constructed ctor to non-constexpr
Clang enforces that a union's constexpr CTOR must initialize
one of the members.  The spec is seemingly silent as to what
the rule on this is, however, making this non-constexpr results in clang
accepting the constructor.

Signed-off-by: Erich Keane <erich.keane@verizon.net>
Message-Id: <1454604300-1673-1-git-send-email-erich.keane@verizon.net>
2016-02-07 10:30:45 +02:00
Erich Keane
e87019843f Fix PHI_FACTOR definition to be spec compliant
PHI_FACTOR is a constexpr variable that is defined using std::log.
Though G++ has a constexpr version of std::log, this itself is not spec
complaint (in fact, Clang enforces this).  See C++ Spec 26.8 for the
definition of std::log and 17.6.5.6 for the rule regarding adding
constexpr where it isn't specified.

This patch replaces the std::log statement with a version from math.h
that contains the exact value (M_LOG10El).

Signed-off-by: Erich Keane <erich.keane@verizon.net>
Message-Id: <1454603285-32677-1-git-send-email-erich.keane@verizon.net>
2016-02-04 18:33:44 +02:00
Avi Kivity
c85f6c4df1 Merge seastar upstream
* seastar 661ccd9...14c9991 (1):
  > reactor: use correct open_flags when opening a file without DMA support

Fixes #871.
2016-02-04 18:17:04 +02:00
Gleb Natapov
77d47c0c4b optimize serialization of array/vector of integral types
Array of integral types on little endian machine can be memcpyed into/out
of a buffer instead of serialized/deserialized element by element.

Message-Id: <20160204155425.GC6705@scylladb.com>
2016-02-04 18:01:14 +02:00
Avi Kivity
91fbb81477 Merge seastar upstream
* seastar f8beab9...661ccd9 (1):
  > Merge "Use swapcontext() with AddressSanitizer" from Paweł
2016-02-04 17:30:15 +02:00
Paweł Dziepak
ababdfc9e2 tests/batchlog: use proper batchlog version
Since 42e3999a00 "Check batchlog version
before replaying" there is a version check in batchlog replay.
However, the test wasn't updated and still used some arbitrary version
number which caused it to fail.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1454595368-21670-1-git-send-email-pdziepak@scylladb.com>
2016-02-04 16:50:45 +02:00
Gleb Natapov
049ae37d08 storage_proxy: change collectd to show foreground mutation instead of overall mutation count
It is much easier to see what is going on this way otherwise graphs for
bg mutations and overall mutations are very close with usual scaling for
many workloads.

Message-Id: <20160204083452.GH6705@scylladb.com>
2016-02-04 14:58:56 +02:00
Gleb Natapov
a9e4afd8d2 Drop query-result.hh from database.hh
It is not needed there but causes a lot of recompilation when changed.

Message-Id: <1454496142-14537-3-git-send-email-gleb@scylladb.com>
2016-02-04 13:22:27 +02:00
Gleb Natapov
2ae1ae2d18 Cleanup messaging_service.hh includes a bit.
Forward declare some classes instead.

Message-Id: <1454496142-14537-2-git-send-email-gleb@scylladb.com>
2016-02-04 13:22:24 +02:00
Avi Kivity
f3ca597a01 Merge "Sstable cleanup fixes" from Tomasz
"  - Added waiting for async cleanup on clean shutdown

  - Crash in the middle of sstable removal doesn't leave system in a non-bootable state"
2016-02-04 12:36:13 +02:00