Commit Graph

10483 Commits

Author SHA1 Message Date
Nadav Har'El
ee7ec10b11 CQL parser: "CREATE MATERIALIZED VIEW" statement
This patch adds the parsing for the "CREATE MATERIALIZED VIEW" statement,
following Cassandra 3 syntax. For example:

   CREATE MATERIALIZED VIEW building_by_city
   AS SELECT * FROM buildings
   WHERE city IS NOT NULL
   PRIMARY KEY(city, name);

It also adds the "IS NOT NULL" operator needed for this purpose.
As in Cassandra, "IS NOT NULL" can only be used for materialized
view creation, and not in a normal SELECT. It can only be used with
the NULL operand (i.e., "IS NOT 3" will be a syntax error).

The current implementation of this statement just does some sanity
checking (such as to verify that "city" is a valid column name and that
the "building" base table exists), complains that materialized views are
not yet supported:

SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] message="Failed parsing statement: [CREATE MATERIALIZED VIEW building_by_city AS
SELECT * FROM buildings
WHERE city IS NOT NULL
PRIMARY KEY(city, name);] reason: unsupported operation: Materialized views not yet supported">

As mentioned above, the "IS NOT NULL" restriction is not allowed in
ordinary selects not creating a materialized views:

SELECT * FROM buildings WHERE city IS NOT NULL;
InvalidRequest: code=2200 [Invalid query] message="restriction 'city IS NOT null' is only supported in materialized view creation"

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1475742927-30695-1-git-send-email-nyh@scylladb.com>
2016-10-06 15:42:37 +03:00
Glauber Costa
7146776d7c fix sstable tests by not using the flush_reader if no region_group
The latest virtual dirty patches broke the SSTable tests. The reason for
this is that those tests will flush synthetic memtables that do not have
a region_group attached to it.

Normally in cases like this we would just give the flush_reader an empty
region group. However, the memtable class constructor takes a
region_group pointer and that can be null according to the interface.
So we must conditionally test it.

If there isn't a region_group involved, the virtual dirty accounting
should be disabled: after all, we won't even have the baseline memory
to begin with.

One of the approaches to fix this could be to just provide null
accounter classes to be used as a surrogate for the accounting classes
in this case. However, since this is mostly used for tests, a much
simpler way is to just revert back to the scanning reader in that case.

The scanning reader is similar enough to the flush_reader, except that
it can handle partial ranges, slices, and delegate accesses to an
sstable post-flush. We don't need any of that, but as argued above,
there is no need to remove it either.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
Message-Id: <1475667271-60806-1-git-send-email-glommer@scylladb.com>
2016-10-05 12:44:21 +01:00
Avi Kivity
c94fb1bf12 build: reduce inclusions of messaging_service.hh
Remove inclusions from header files (primary offender is fb_utilities.hh)
and introduce new messaging_service_fwd.hh to reduce rebuilds when the
messaging service changes.

Message-Id: <1475584615-22836-1-git-send-email-avi@scylladb.com>
2016-10-05 11:46:49 +03:00
Avi Kivity
f8118d9fc2 Merge "Virtual dirty memory management" from Glauber
"Description:
============

Scylla currently suffers from a brick wall behavior of the request throttler.
Requests pile up until we reach the dirty memory limit, at which point we stop
serving them until we have freed enough memory to allow for more requests.

The problem is that freeing dirty memory means writing an SSTable to completion.
That can take a long time, even if we are blessed with great disks. Those long
waiting times can and will translate into timeouts. That is bad behavior.

What this patch does is introduce one form of virtual dirty memory accounting.
Instead of allowing 100 % of the dirty memory to be filled up until we stop
accepting requests, we will do that when we reach 50 % of memory. However,
instead of releasing requests only when an SSTable is fully written, we start
releasing them when some memory was written.

The practical effect of that, is that once we reach 50 % occupancy in our dirty
memory region, we will bring the system from CPU speed to disk speed, and will
start accepting requests only at the rate we are able to write memory back.

Results
=======

With this patchset running a load big enough to easily saturate the disk,
(commitlog disabled to highlight the effects of the memtable writer), I am able
to run scylla for many minutes, with timeouts occurring only when I run out of
disk space, whereas without this patch a swarm of timeouts would start merely 2
seconds after the load started - and would never get stable.

In V2, I have sent a set of graphs illustrating the performance of this solution.
This version does not have any significant differences in that front.

For details, please refer to
https://groups.google.com/d/msg/scylladb-dev/iCvD-3Z-QqY/EM8KUh_MAQAJ

Accuracy of the accounting:
---------------------------
It is important for us to be as accurate as possible when accounting freed
memory, since every byte we mark as freed may allow one or more requests to be
executed.  I have measured the accuracy of this approach (ignoring padding,
object size for the mutation fragments) to be 99.83 % of used memory in the
test workload I have ran (large, 65k mutations). Memtables under this circumnstance
tend to have a very high occupancy ratio because throttle breeds idle, and idle
breeds compact-on-idle.

Known Issues:
-------------

A lot of time can be elapsed between destroying the flush_reader and actually
releasing memory. The release of memory only happens when the SSTable is fully
sealed, and we have to flush the files, as well as finish writing all SSTable
components at this point. This happened in practice with a buggy kernel that
would result in flushes taking a long time.

After that is fixed, this is just a theoretical problem and in practice it
shouldn't matter given the time we expect those operations to take."

* 'virtual-dirty-v6' of github.com:glommer/scylla:
  database: allow virtual dirty memory management
  streamed_mutation: make _buffer private
  add accounting of memory read to partition_snapshot_reader
  move partition_snapshot_reader code to header file
  LSA: allow a group to query its own region group
  memtables: split scanning reader in two
  sstables: use special reader for writing a memtable
  LSA: export information about object memory footprint
  LSA: export information about size of the throttle queue
  database: export virtual dirty bytes region group
2016-10-04 20:57:52 +03:00
Avi Kivity
cc33c8b4ba Merge seastar upstream
* seastar 18f7bb8...f937fb0 (5):
  > Merge "Fix signal mask corruption" from Tomasz
  > core/memory: Avoid violating strict aliasing when accessing allocation sites
  > core/memory: Avoid indirection when storing allocation sites
  > core/memory: Add a way to disable abort on allocation failure in some scope
  > core/sharded: Allow mapper to take the service by non-const reference
2016-10-04 20:08:57 +03:00
Glauber Costa
f89a67c75c database: allow virtual dirty memory management
Scylla currently suffers from a brick wall behavior of the request throttler.
Requests pile up until we reach the dirty memory limit, at which point we stop
serving them until we have freed enough memory to allow for more requests.

The problem is that freeing dirty memory means writing an SSTable to completion.
That can take a long time, even if we are blessed with great disks. Those long
waiting times can and will translate into timeouts. That is bad behavior.

What this patch does is introduce one form of virtual dirty memory accounting.
Instead of allowing 100 % of the dirty memory to be filled up until we stop
accepting requests, we will do that when we reach 50 % of memory. However,
instead of releasing requests only when an SSTable is fully written, we start
releasing them when some memory was written.

The practical effect of that is that once we reach 50 % occupancy in our dirty
memory region, we will bring the system from CPU speed to disk speed, and will
start accepting requests only at the rate we are able to write memory back.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Glauber Costa
7b6e8a2526 streamed_mutation: make _buffer private
It is currently protected, but now all users go through
push_mutation_fragment().  So we can safely move its visibility to guarantee
that it stays that way.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Glauber Costa
1db245b52d add accounting of memory read to partition_snapshot_reader
By default, we don't do any accounting. By specializing this class and providing
an accounter class, we can account how much memory are we reading as we read
through the elements.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Glauber Costa
452eb95943 move partition_snapshot_reader code to header file
This is so we can template it without worrying about declaring the
specializations in the .cc file.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Glauber Costa
86aa0b830d LSA: allow a group to query its own region group
Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Glauber Costa
eee15578fb memtables: split scanning reader in two
The code that is common will live in its own reader, the iterator_reader.  All
friendly private access to memtable attributes and methods happen through the
iterator reader.

After this patch, we are now left with the scanning_reader - same as always,
but now implemented on top of the iterator_reader, and a flush_reader, which
will be used by SSTable flushes only.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Glauber Costa
16886eeb96 sstables: use special reader for writing a memtable
Right now the special reader doesn't do much, but the idea is that we will
soon replace it will a reader that specializes in flush, and is in turn able
to provide read-side on-flush functionality like virtual dirty.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Glauber Costa
28e3f2f6ee LSA: export information about object memory footprint
We allocate objects of a certain size, but we use a bit more memory to hold
them.  To get a clerer picture about how much memory will an object cost us, we
need help from the allocator. This patch exports an interface that allow users
to query into a specific allocator to get that information.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Pekka Enberg
c3bebea1ef dist/docker: Add '--listen-address' to 'docker run'
Add a '--listen-address' command line parameter to the Docker image,
which can be used to set Scylla's listen address.

Refs #1723

Message-Id: <1475485165-6772-1-git-send-email-penberg@scylladb.com>
2016-10-04 13:57:55 +03:00
Marius
876775a52c dist/docker/ubuntu: refactored $IP/listen_address
In order to allow Scylla’s docker container to handle multiple network
interfaces, the start-scylla script was refactored:

- `$IP` is now called `$SCYLLA_LISTEN_ADDRESS`, so it is less likely to
   be confused or interfere with other environment variables.
- `$SCYLLA_LISTEN_ADDRESS` now checks its value and also tries to
   resolve a hostname, if no IP was set to it.
- `$SCYLLA_LISTEN_DEVICE` can now be set as environment variable and
   contain any available NIC device name (e.g. `eth0`). The script
   automatically retrieves the IP address from the device.

Usage:

1. With `$SCYLLA_LISTEN_ADDRESS` as IP:
`docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_ADDRESS=192.168.1.100 scylladb/scylla`

2. With `$SCYLLA_LISTEN_ADDRESS` as hostname:
`docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_ADDRESS=containername.network.lan scylladb/scylla`

3. With `$SCYLLA_LISTEN_DEVICE`:
`docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_DEVICE=eth0 scylladb/scylla`

Message-Id: <20161003151230.67672-1-marius@twostairs.com>
2016-10-04 13:56:55 +03:00
Raphael S. Carvalho
747b42299c database: remove unused code
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <95e1ed590c9e45d15f19a84824a4dce05aefdab8.1475528611.git.raphaelsc@scylladb.com>
2016-10-04 09:26:43 +03:00
Paweł Dziepak
7599ef6fde query_pager: fix splitting range at the end bound
Currently, the code responsible for calculating ranges for the next
request could produce a wrap-around partition range. For example, if the
original range was (unimportant, A] and the last partition key A then
the output range would be (A, A].

This patch adds checks to make sure that in such cases the range is
removed.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1475497244-2790-1-git-send-email-pdziepak@scylladb.com>
2016-10-03 19:33:42 +02:00
Avi Kivity
8747054d10 exceptions: mark function called before construction static
cassandra_exception::prepare_message() is called from derived classes'
constructors before the base cassnadra_exception object is constructed.
This is technically illegal but harmless.  Fix by marking the function
static.

Found by clang.
2016-10-03 16:29:02 +03:00
Calle Wilund
5b815b81b4 auth::password_authenticator: Ensure exceptions are processed in continuation
Fixes #1718 (even more)
Message-Id: <1475497389-27016-1-git-send-email-calle@scylladb.com>
2016-10-03 14:49:59 +02:00
Pekka Enberg
f3cd21c8f1 Merge seastar upstream
* seastar 0e60722...18f7bb8 (1):
  > core/memory: Fix compilation errors
2016-10-03 12:54:38 +03:00
Calle Wilund
d24d0f8f90 auth::password_authenticator: "authenticate" should not throw undeclared excpt
Fixes #1718

Message-Id: <1475487331-25927-1-git-send-email-calle@scylladb.com>
2016-10-03 12:53:30 +03:00
Avi Kivity
a51804eca8 Merge "token_restriction: Deal with minimum tokens" from Duarte
"This patch set ensures we can correctly handle queries
where the minimum token is specified."

* 'min-token/v3' of github.com:duarten/scylla:
  cql_query_test: Add test case for min/max token bounds
  token_restriction: Deal with minimum tokens
  partitioner: Parse token from bytes
2016-10-02 12:32:40 +03:00
Avi Kivity
5071f4c0bf Merge seastar upstream
* seastar 9e1d5db...0e60722 (9):
  > core/memory: Replace assert with bad_alloc in allocate_large()
  > chunked_fifo: avoid direct use of sized operator delete
  > memory: fix build without heap profiler
  > xen: initialize port::_sem
  > Merge "Make input streams skippable" from Paweł
  > semaphore: require explict setting for start value
  > prometheus: remove invalid chars from meric names
  > core/memory: Introduce heap profiler
  > util/backtrace: Mark noexcept if func() doesn't throw
2016-10-02 11:43:22 +03:00
Vlad Zolotarov
7e180c7bd3 tracing: introduce the tracing::global_trace_state_ptr class
This object, similarly to a global_schema_ptr, allows to dynamically
create the trace_state_ptr objects on different shards in a context
of the original tracing session.

This object would create a secondary tracing session object from the
original trace_state_ptr object when a trace_state_ptr object is needed
on a "remote" shard, similarly to what we do when we need it on a remote
Node.

Fixes #1678
Fixes #1647

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1474387767-21910-1-git-send-email-vladz@cloudius-systems.com>
2016-10-02 11:31:37 +03:00
Takuya ASADA
15b156c9d4 dist/common/scripts/scylla_io_setup: describe how to set developer mode when validation tests failed
Describe how to set developer mode, not to confuse users.
Fixes #1701

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1475167584-18092-1-git-send-email-syuu@scylladb.com>
2016-10-02 10:58:38 +03:00
Avi Kivity
58ddfea18f Merge "Fixes for leveled compaction strategy" from Raphael
* 'lcs_fixes' of github.com:raphaelsc/scylla:
  lcs: fix starvation at higher levels
  lcs: fix broken token range distribution at higher levels
2016-10-01 21:34:21 +03:00
Takuya ASADA
9639cc840e dist/redhat: add missing build time dependency for libunwind
There was missing dependency for libunwind, so add it.
Fixes #1722

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1475260099-25881-1-git-send-email-syuu@scylladb.com>
2016-09-30 21:33:39 +03:00
Takuya ASADA
c89d9599b1 dist/ubuntu: add missing build time dependency for libunwind
There was missing dependency for libunwind, so add it.
Fixes #1721

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1475255706-26434-1-git-send-email-syuu@scylladb.com>
2016-09-30 21:33:21 +03:00
Raphael S. Carvalho
a8ab4b8f37 lcs: fix starvation at higher levels
When max sstable size is increased, higher levels are suffering from
starvation because we decide to compact a given level if the following
calculation results in a number greater than 1.001:
level_size(L) / max_size_for_level_l(L)

Fixes #1720.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-09-30 14:09:49 -03:00
Raphael S. Carvalho
a3bf7558f2 lcs: fix broken token range distribution at higher levels
Uniform token range distribution across sstables in a level > 1 was broken,
because we were only choosing sstable with lowest first key, when compacting
a level > 0. This resulted in performance problem because L1->L2 may have a
huge overlap over time, for example.
Last compacted key will now be stored for each level to ensure sort of
"round robin" selection of sstables for compactions at level >= 1.
That's also done by C*, and they were once affected by it as described in
https://issues.apache.org/jira/browse/CASSANDRA-6284.

Fixes #1719.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-09-30 14:09:16 -03:00
Paweł Dziepak
eb1fcf3ecc query_pagers: fix clustering key range calculation
Paging code assumes that clustering row range [a, a] contains only one
row which may not be true. Another problem is that it tries to use
range<> interface for dealing with clustering key ranges which doesn't
work because of the lack of correct comparator.

Refs #1446.
Fixes #1684.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1475236805-16223-1-git-send-email-pdziepak@scylladb.com>
2016-09-30 17:32:59 +02:00
Tomasz Grabiec
7e25b958ac transport: Extend request memory footprint accounting to also cover execution
CQL server is supposed to throttle requests so that they don't
overflow memory. The problem is that it currently accounts for
request's memory only around reading of its frame from the connection
and not actual request execution. As a result too many requests may be
allowed to execute and we may run out of memory.

Fixes #1708.
Message-Id: <1475149302-11517-1-git-send-email-tgrabiec@scylladb.com>
2016-09-30 14:23:14 +01:00
Duarte Nunes
72af476397 cql_query_test: Add test case for min/max token bounds
This patch adds a test case for specifying the minimum and maximum
tokens in a cql3 query.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-30 11:45:45 +00:00
Duarte Nunes
98b4814894 token_restriction: Deal with minimum tokens
This patch fixes a bug where queries such as the following are not
handled properly:

"SELECT * FROM ks.cf WHERE token(id) >
9207857967443869328 AND token(id) <= -9223372036854775808"

Here -9223372036854775808 represents the minimum token, which we were
just translating into a token with kind::key, thus returning incorrect
results.

Ref #1139
Ref #693
Fixes #1717

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-30 11:17:08 +00:00
Duarte Nunes
862f51cddf partitioner: Parse token from bytes
This patch adds the from_bytes() function to the i_partitioner class,
whose purpose is parse a particular token and explicitly handle the
case when the minimum token is specified.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-30 11:17:02 +00:00
Duarte Nunes
0c8f280af7 partition_key_view: Implement operator<<
The operator is declared, but it isn't implemented. This patch fixes
that.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1475225647-3800-1-git-send-email-duarte@scylladb.com>
2016-09-30 10:54:54 +02:00
Duarte Nunes
a36888f3cb storage_service: Convert token through partitioner
This patch ensures we use the partitioner to convert a token to
sstring instead of casting.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1475179683-28552-1-git-send-email-duarte@scylladb.com>
2016-09-30 10:54:26 +02:00
Tomasz Grabiec
91b1bada55 Merge seastar upstream
* seastar 5b7252d...9e1d5db (5):
  > prometheus: prevent illegal prometheus names
  > scollectd: raw_to_value should not use network order
  > semaphore: Introduce get_units()
  > core::scollectd: truncate the identifiers fields on a 63 characters boundary
  > Merge "Fix ASAN errors in debug builds" from Tomasz
2016-09-29 13:23:24 +02:00
Asias He
511f8aeb91 gossip: Do not remove failure_detector history on remove_endpoint
Otherwise a node could wrongly think the decommissioned node is still
alive and not evict it from the gossip membership.

Backport: CASSANDRA-10371

7877d6f Don't remove FailureDetector history on removeEndpoint

Fixes #1714
Message-Id: <f7f6f1eec2aab1b97a2e568acfd756cca7fc463a.1475112303.git.asias@scylladb.com>
2016-09-29 13:00:47 +03:00
Asias He
a6d6341627 streaming: Add total_{incoming,outgoing}_bytes collectd metrics
It reflects number of bytes sent or received per second in streaming.

To use it:

$ tools/scyllatop/scyllatop.py "*streaming*"

Refs #1655
Message-Id: <5f7943cb2b459db5ed4bd8d7365532ea201ad2d9.1475116963.git.asias@scylladb.com>
2016-09-29 11:54:32 +02:00
Asias He
a6529ad582 repair: Fix split_and_add
Before: the range is split only once, so it is split into 2 sub ranges

  INFO  2016-09-29 15:52:43,625 [shard 0] repair - target_partitions=100, estimated_partitions=537, ranges.size=2,
  range=(8993553141924659802, 8997061146192366917] ->
  ranges={
  (8993553141924659802, 8995307144058513359], (8995307144058513359, 8997061146192366917]}

After: the range is split mulitple times, resulting 16 sub ranges.

  INFO  2016-09-29 15:55:07,934 [shard 0] repair - target_partitions=100, estimated_partitions=67, ranges.size=16,
  range=(8993553141924659802, 8997061146192366917] ->
  ranges={
  (8993553141924659802, 8993772392191391496], (8993772392191391496, 8993991642458123191],
  (8993991642458123191, 8994210892724854885], (8994210892724854885, 8994430142991586580],
  (8994430142991586580, 8994649393258318274], (8994649393258318274, 8994868643525049969],
  (8994868643525049969, 8995087893791781664], (8995087893791781664, 8995307144058513359],
  (8995307144058513359, 8995526394325245053], (8995526394325245053, 8995745644591976748],
  (8995745644591976748, 8995964894858708443], (8995964894858708443, 8996184145125440138],
  (8996184145125440138, 8996403395392171832], (8996403395392171832, 8996622645658903527],
  (8996622645658903527, 8996841895925635222], (8996841895925635222, 8997061146192366917]}

Without this patch, repair can do checksum with a range with a lot of
partitions, not the expected less than 100 partitions per checksum. This
can lead to unncessary data transfer since the checksum is too coarse.
For instacne, as above, if the checksum of 1 out of 537 partitions is
different, the whole 527 partitions will be synced.

Fixes #1613
Message-Id: <0775c20c485c105df5f10bd685048227f074c365.1475137029.git.asias@scylladb.com>
2016-09-29 10:09:25 +01:00
Pekka Enberg
20dccb4bf7 transport/server: Fix CQL Snappy compression failure
The snappy_compress() function expects the "compressed_length" parameter
to contain the actual output buffer length but now we're passing random
garbage from the stack.

Fixes #1711
Message-Id: <1475132127-316-1-git-send-email-penberg@scylladb.com>
2016-09-29 09:29:51 +01:00
Asias He
774d16306f gossip: Use lowres_clock for scheduled_gossip_task
The timer is fired once per second. Using low resolution clock is enough.
Message-Id: <1f21514e975afea6ac5c9dde18a881a41561da70.1475130948.git.asias@scylladb.com>
2016-09-29 10:03:14 +03:00
Piotr Jastrzebski
1948ec8061 Update README.md
Add --init to git submodules update.
It's needed for fmt.
Add libunwind-devel dependency do dnf install.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <4918237f91d985649c195035c02b2dd9e9a1ff68.1475087373.git.piotr@scylladb.com>
2016-09-29 10:02:34 +03:00
Gleb Natapov
32989d1e66 Merge seastar upstream
* seastar 2b55789...5b7252d (3):
  > Merge "rpc: serialize large messages into fragmented memory" from Gleb
  > Merge "Print backtrace on SIGSEGV and SIGABRT" from Tomasz
  > test_runner: avoid nested optionals

Includes patch from Gleb to adapt to seastar changes.
2016-09-28 17:34:16 +03:00
Pekka Enberg
9ea24c9d2b Merge "repair: less stream_plan and less streaming traffic" from Asias
"This series improves repair by

 1) using less streaming sessions

 2) reducing unnecessary streaming traffic

 3) fixing a hang during shutdown

 See commit log for "repair: Reduce stream_plan usage", "repair: Reduce
 unnecessary streaming traffic" and "streaming: Fail streaming sessions
 during shutdown" for details.

 Tested with repair_additional_test.py."
2016-09-28 09:54:15 +03:00
Glauber Costa
f5fd6bd714 LSA: export information about size of the throttle queue
Also add information about for how long has the oldest been sitting in the
queue. This is part of the backpressure work to allow us to throttle incoming
requests if we won't have memory to process them. Shortages can happen in all
sorts of places, and it is useful when designing and testing the solutions to
know where they are, and how bad they are.

This counter is named for consistency after similar counters from transport/.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-09-27 12:09:08 -04:00
Glauber Costa
aa6a96d09b database: export virtual dirty bytes region group
Currently, we export the region group where memtables are placed as dirty bytes.
Upcoming patches will optimistically mark some bytes in this region as free, a
scheme we know as "virtual dirty".

We are still interested in knowing the real state of the dirty region, so we
will keep track of the bytes virtually freed and split the counters in two.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-09-27 12:09:08 -04:00
Gleb Natapov
c95df8f053 messaging_service: use correct value for listen_to_bc_address is a constructor used by tests
Also make sure to not listen on the same exact address twice in case
listen_address == broadcast_address. Scylla configuration code does not
allow such thing to be configured, but better to be safe.

Message-Id: <20160927102316.GO32178@scylladb.com>
2016-09-27 11:27:23 +01:00
Pekka Enberg
e35166af10 Merge "gossip: Fix expire_time for gossip membership removal" from Asias
"We currently use steady_lock which is not consistent on nodes in the cluster.
 Use system_clock for it.

 Fixes #1704"
2016-09-27 11:46:09 +03:00