Commit Graph

6565 Commits

Author SHA1 Message Date
Gleb Natapov
d53be0a91e Move operator<< for std::exception_ptr to std namespace and make it get const
If the operator is not in std namespace it cannot be found in non global
contexts.
2015-09-27 14:16:35 +03:00
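The lookup rule behind d53be0a91e can be illustrated with a small sketch. It deliberately uses a made-up namespace `lib` instead of `std`, and every name in it (`lib::error`, `app::local`, `app::describe`) is hypothetical. Inside `app`, unqualified lookup for `operator<<` stops at `app::operator<<` and never reaches the global namespace, so the overload for `lib::error` is found only because it lives in the namespace of its argument type (argument-dependent lookup).

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace lib {
// Hypothetical type standing in for std::exception_ptr.
struct error {
    std::string what;
};

// Declared next to the type, so argument-dependent lookup can find it
// from any namespace.
std::ostream& operator<<(std::ostream& os, const error& e) {
    return os << "error: " << e.what;
}
} // namespace lib

namespace app {
struct local {};

// This overload hides any operator<< declared in the global namespace
// for unqualified calls made inside app.
std::ostream& operator<<(std::ostream& os, const local&) {
    return os << "local";
}

std::string describe(const lib::error& e) {
    std::ostringstream os;
    os << e; // resolved through ADL to lib::operator<<
    return os.str();
}
} // namespace app
```

Had the overload for `lib::error` been declared in the global namespace instead, the call inside `app::describe` would not compile, which matches the "non global contexts" problem the commit message describes.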
Takuya ASADA
b2630db514 dist: remove rpm dependency to libvirt
This is for testing virtio mode; since we don't officially recommend using virtio mode, we should drop it.

Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-25 17:14:37 -07:00
Gleb Natapov
140641689b messaging: do not use rpc client in error state
Using rpc client in error state will result in a message loss. Try to
reconnect instead.
2015-09-24 17:50:51 +02:00
Raphael S. Carvalho
ce855577b6 add compaction stats to collectd
With this change, we can see the number and length of compaction
activity per shard from collectd.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-24 16:51:11 +02:00
Asias He
e77cea382e rpm: Improve rpm build scripts
This lets us build in a CentOS container.
2015-09-23 21:42:51 -07:00
Tomasz Grabiec
1b1cfd2cbf tests: Introduce tests/memory_footprint_test 2015-09-23 21:27:44 -07:00
Tomasz Grabiec
d033cdcefe db: Move "Populating Keyspace ..." message from WARN to INFO level
WARN level is for messages which should draw the log reader's attention;
journalctl highlights them, for example. Populating a keyspace is a
fairly normal thing, so it should be logged at a lower level.
2015-09-23 15:28:44 +02:00
Avi Kivity
b3b6fc2f39 Merge branch 'branch-0.9' 2015-09-23 06:27:55 -07:00
Avi Kivity
2b8a3c3f81 Merge seastar upstream
* seastar 66569fd...5fe596a (1):
  > memory: tolerate mbind failures

Fixes #390.
2015-09-23 06:27:23 -07:00
Asias He
ea007485d8 ami: Copy the rpm we just built only
If there are multiple previous build/rpms, the build will fail like

cp a.rpm b.rpm c.rpm

c.rpm is not a directory.
2015-09-22 21:06:43 -07:00
Takuya ASADA
9c3db5cfa3 dist: change /var/lib/scylla/*/ permission to 755
Make them readable from other users.

Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-22 21:06:42 -07:00
Asias He
91b0019b50 rpm: Setup irq, network queue binding and cpuset
When scylla is deployed on AWS's c4.8xlarge or c4.8xlarge large
instances, we can apply irq and network queue binding to achieve better
performance.

Also make scylla skip using cpu0 which will be busy serving network
interrupts under high workload.
2015-09-22 21:00:59 -07:00
Avi Kivity
36c3439fae Merge branch 'branch-0.9' 2015-09-22 05:33:32 -07:00
Avi Kivity
cbc0aa4916 README: drop 'urchin' codename in favor of Scylla 2015-09-22 05:32:55 -07:00
Pekka Enberg
3d0106aa69 dist/docker: Limit Scylla to a single CPU for now
Limit to single CPU to work around abysmal performance...

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-22 05:10:38 -07:00
Avi Kivity
d69cb91bea dist: add libasan and libubsan as dependencies
Fixes mock Fedora 22 build; it needed them for the -fsanitize=vptr
detection.
2015-09-22 04:46:33 -07:00
Takuya ASADA
df995d1815 dist: change license to AGPL
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-22 10:39:39 +02:00
Paweł Dziepak
34e66e60c1 main: disable thrift by default
Fixes #205.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-22 09:48:44 +02:00
Asias He
9ab0c1e321 README: Remove ubuntu build instructions
Scylla needs thrift and antlr3 which are not provided by ubuntu. We need
to compile them from source in order to build scylla. For now, let's
only support build on Fedora.
2015-09-21 18:25:18 -07:00
Avi Kivity
99e19a9f73 Merge branch 'branch-0.9' 2015-09-21 17:03:47 -07:00
Avi Kivity
b57e170aae Merge seastar upstream
* seastar 5c68145...66569fd (2):
  > scripts: posix_net_conf.sh: posix_net_conf.sh configure the AWS NIC's IRQs affinities and RPS
  > scripts: Scripts allowing to run a command in DPDK environment.
2015-09-21 16:59:39 -07:00
Avi Kivity
37344c19e7 version: update for next cycle 2015-09-22 00:41:57 +03:00
Avi Kivity
eca0228f15 Merge branch 'branch-0.9' 2015-09-22 00:40:52 +03:00
Shlomi Livne
37bcfb0249 release: bump up version to 0.9
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-09-22 00:39:47 +03:00
Pekka Enberg
569efa2c4c dist/docker: ScyllaDB Docker image
Add a Dockerfile for building a ScyllaDB Docker image. The image is
based on Fedora 22 and ScyllaDB is installed from our RPM repository.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-22 00:22:01 +03:00
Takuya ASADA
83d05df9b7 dist: move ComboAMI related code to scylla-ami
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-22 00:17:42 +03:00
Tomasz Grabiec
83dbea5b3a Merge branch 'branch-0.9'
    tests: Fix row_cache_alloc_stress
    dist: remove conflicts with cassandra21 to allow side by side rpm installation
    dist: update ami base image id to one that supports enhanced networking
2015-09-21 23:06:35 +02:00
Tomasz Grabiec
8085a04771 tests: Fix row_cache_alloc_stress
Since row_cache::populate() uses allocating_section now, the trick
with populating under reclaim lock no longer works, resulting in an
assertion failure inside allocating_section:

row_cache_alloc_stress: utils/logalloc.hh:289: auto logalloc::allocating_section::operator()(logalloc::region&, Func&&) [with Func = row_cache::populate(const mutation&)::<lambda()>::<lambda()>]: Assertion `r.reclaiming_enabled()' failed.

Use the trick with populating until eviction is detected by comparing
region occupancy.
2015-09-21 23:01:52 +02:00
Shlomi Livne
0758117854 dist: remove conflicts with cassandra21 to allow side by side rpm installation
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-09-21 20:58:45 +02:00
Shlomi Livne
a2313bc7b6 dist: update ami base image id to one that supports enhanced networking
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-09-21 20:58:45 +02:00
Tomasz Grabiec
a588c72ef2 Merge branch 'branch-0.9'
Changes:

    transport: fix poller removal
    dist: Add CentOS packaging
    row_cache: Use allocating_section in row_cache::populate()
2015-09-21 20:28:21 +02:00
Gleb Natapov
a15c062b5d transport: fix poller removal
During cql connection removal we wait for all outstanding sends to
complete by waiting for the _ready_to_respond future to resolve, but if
at this point the connection is in _pending_responders then the poller may
call do_flush() and try to reuse the same _ready_to_respond future that
already has a continuation attached to it. The fix is to remove the
connection from the poller before waiting for _ready_to_respond. Special
measures must be taken to prevent the connection from being added to the
poller again, so we set _flush_requested to avoid exactly that.
2015-09-21 17:48:00 +02:00
Takuya ASADA
710442f9fa dist: Add CentOS packaging
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-21 13:27:14 +03:00
Tomasz Grabiec
4712af2c21 row_cache: Use allocating_section in row_cache::populate()
Cache has a tendency to eat up all available memory. It is evicted
on-demand, but this happens at certain points in time (during large
allocation requests). Small allocations which are served from small
object pools won't usually trigger this. Large allocations happen, for
example, when an LSA region needs a new segment, e.g. when the row cache is
populated. If large allocations happen for a certain period only inside
row_cache::update(), then eviction will not be able to make forward
progress because cache's LSA region is locked inside
row_cache::update(). While it's locked, data can't be evicted from
it.

The solution is to use allocating_section.

Fixes #376.
2015-09-21 13:25:13 +03:00
Pekka Enberg
6cef7d8270 db/schema_tables: Fix calculate_schema_digest()
map_reduce() can run the reducer out-of-order which breaks the MD5 hash.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

Fixes #357. [tgrabiec]
2015-09-21 11:51:17 +02:00
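Why reducer ordering matters for 6cef7d8270 can be shown with a minimal sketch. It uses FNV-1a purely as a stand-in for an order-sensitive hash such as MD5, and the function names are hypothetical. Sorting the inputs before hashing is one way to get a deterministic digest; it is not necessarily the fix that the commit applied.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// FNV-1a update step: the result depends on the order in which bytes
// are fed in, just like MD5.
inline std::uint64_t fnv1a_update(std::uint64_t h, const std::string& s) {
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL; // FNV-1a 64-bit prime
    }
    return h;
}

// Hashes the parts in whatever order they arrive, as an out-of-order
// reducer would.
inline std::uint64_t digest_in_arrival_order(const std::vector<std::string>& parts) {
    std::uint64_t h = 14695981039346656037ULL; // FNV-1a 64-bit offset basis
    for (const auto& p : parts) {
        h = fnv1a_update(h, p);
    }
    return h;
}

// Puts the parts into a fixed order first, so the digest no longer
// depends on arrival order.
inline std::uint64_t digest_sorted(std::vector<std::string> parts) {
    std::sort(parts.begin(), parts.end());
    return digest_in_arrival_order(parts);
}
```

Two nodes that feed the same parts in different orders disagree under `digest_in_arrival_order` but agree under `digest_sorted`.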
Nadav Har'El
1a4c8db71a scanning_reader: fix bug on still-being-written memtable
scanning_reader has a bug in its range support when it iterates over a
memtable which is still open, and thus might still be modified between
calls to the read function.

This caused, among other things, issue #368 - where repair was reading
a memtable which was still open and being written to (by a stream from a
remote node).

The problem is that scanning_reader has an optimization so it can avoid
comparing the current partition with the range's end on every iteration:
It finds, once, a pointer to the element past the end of the range (the
so-called "upper bound"), and saves this pointer in _end. Then at every
iteration, we can just compare pointers.

But if partitions are added to the memtable, the _end we saved is no longer
relevant: It still points to a valid partition, but this partition, which
was once the first partition *after* the range, may now be preceded by
many new partitions, which may now be returned despite being after the
range's end.

The fix is to re-calculate "_end" if partitions were added to the memtable.
Moreover, we also need to re-calculate "_i" in this case - the current code
calculates in one iteration a pointer, _i, to the element to be returned in
the *next* iteration. If additional partitions were added in the meantime,
we may need to return them.

Because it's impossible to delete partitions from a memtable (just to
add new ones or modify existing ones), we can trivially figure out if
new partitions were added, using _memtable->partition_count(). Because
boost::intrusive::set defaults to constant_time_size(true), using this
count is efficient.

Fixes #368.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-09-20 15:08:08 +02:00
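The fix described in 1a4c8db71a can be sketched on a plain `std::set<int>`. This is an illustration under stated assumptions, not the Scylla code: `range_reader` and its members are hypothetical names, integer keys stand in for partitions, and `std::set::size()` plays the role of `_memtable->partition_count()`. The sketch relies on the same property the commit message states, namely that elements are only ever added, never removed.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Reads keys in [first, last] from a set that may grow between calls.
class range_reader {
    const std::set<int>& _keys;
    int _first;
    int _last;
    std::set<int>::const_iterator _i;   // next element to return
    std::set<int>::const_iterator _end; // cached "upper bound" of the range
    std::size_t _seen_count;            // container size when _i/_end were computed
    bool _has_returned = false;
    int _last_returned = 0;

    // If elements were added since the last call, the cached iterators may
    // be stale: recompute both the end and the next position.
    void refresh_if_grown() {
        if (_keys.size() == _seen_count) {
            return;
        }
        _seen_count = _keys.size();
        _end = _keys.upper_bound(_last);
        _i = _has_returned ? _keys.upper_bound(_last_returned)
                           : _keys.lower_bound(_first);
    }

public:
    range_reader(const std::set<int>& keys, int first, int last)
        : _keys(keys), _first(first), _last(last)
        , _i(keys.lower_bound(first)), _end(keys.upper_bound(last))
        , _seen_count(keys.size()) {
        assert(first <= last);
    }

    // Stores the next key in `out` and returns true, or returns false
    // once the range is exhausted.
    bool next(int& out) {
        refresh_if_grown();
        if (_i == _end) {
            return false;
        }
        out = *_i;
        _last_returned = out;
        _has_returned = true;
        ++_i;
        return true;
    }
};
```

Without `refresh_if_grown()`, a key inserted after the range's end but before the cached `_end` would be returned even though it lies outside the range, and a key inserted before the cached `_i` would be skipped.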
Avi Kivity
00299460f3 thrift: move connection closed error to logger, debug level 2015-09-20 15:48:45 +03:00
Avi Kivity
a87e591f5e Merge seastar upstream
* seastar 86ffe72...5c68145 (1):
  > reactor: add header to resolve FALLOC_FL_PUNCH_HOLE/FALLOC_FL_KEEP_SIZE on CentOS7
2015-09-20 13:15:33 +03:00
Takuya ASADA
f8ecd338b8 cql3: replace PRId32 with %d on sprint()
fixes #374

Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-20 13:15:24 +03:00
Nadav Har'El
6a655bc5a6 Fix typo in stream_init_message.hh
The debug build uncovered this typo. It was setting a class member from
itself (with an undefined value) instead of from the parameter, which I was
surprised the compiler didn't catch at compile time.

Discovered in issue #368.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-09-20 11:42:50 +03:00
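The kind of typo described in 6a655bc5a6 can be sketched as follows. The type and member names are hypothetical and are not taken from stream_init_message.hh; the buggy form is shown only in a comment because reading an uninitialized member is undefined behavior.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical message type; not the real stream_init_message.
struct init_message {
    std::string description;

    // Buggy form (do not use): the parameter is spelled `description_`, so
    // this initializer would read the still-uninitialized member itself.
    //   explicit init_message(std::string description_)
    //       : description(description) {}

    // Fixed form: initialize the member from the parameter.
    explicit init_message(std::string description_)
        : description(std::move(description_)) {}
};
```

Whether a compiler diagnoses the buggy form depends on the compiler and the warning flags in use.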
Gleb Natapov
6b300e517e rpm: set ulimits in systemd scylla config
According to https://bugzilla.redhat.com/show_bug.cgi?id=754285
limits.conf is ignored by systemd during service launch. Set limits
in systemd unit file instead.
2015-09-20 10:46:57 +03:00
Avi Kivity
afda54a083 Add the AGPL license 2015-09-20 10:45:35 +03:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Avi Kivity
9ff4cc8e7c configure.py: add copyright 2015-09-20 10:36:17 +03:00
Avi Kivity
bee1cf1352 consumer.hh: tidy up copyright 2015-09-20 10:35:41 +03:00
Avi Kivity
987294a412 Add missing copyrights 2015-09-20 10:16:11 +03:00
Asias He
eead846712 messaging_service: Make gossip use standalone tcp connection
For unknown reasons, I saw gossip syn messages get rpc timeout errors when
the cluster is under heavy cassandra-stress load.

Using a standalone tcp connection seems to fix the issue.
2015-09-19 10:17:42 +03:00
Shlomi Livne
4ba3580fa7 dist: aws ami install scylla-tools
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-09-19 10:16:33 +03:00
Avi Kivity
8f4eb7cc51 Merge "Fixes for building aws ami" from Shlomi 2015-09-19 10:15:42 +03:00
Raphael S. Carvalho
4d31e08299 conf: reenable partitioner in scylla.yaml
It's needed for compaction_delete_test dtest.
Otherwise, it will fail with:

Missing directive: partitioner
Fatal configuration error; unable to start. See log for stacktrace.

FAIL

======================================================================
FAIL: compaction_delete_test (compaction_test.TestCompaction_with_SizeTieredCompactionStrategy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/urchin_world/urchin-dtest/compaction_test.py", line 50, in compaction_delete_test
    self.assertEqual(numfound, 10)
AssertionError: 0 != 10

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-19 10:04:05 +03:00