10468 Commits

Author SHA1 Message Date
Pekka Enberg
c3bebea1ef dist/docker: Add '--listen-address' to 'docker run'
Add a '--listen-address' command line parameter to the Docker image,
which can be used to set Scylla's listen address.

Refs #1723

Message-Id: <1475485165-6772-1-git-send-email-penberg@scylladb.com>
2016-10-04 13:57:55 +03:00
Marius
876775a52c dist/docker/ubuntu: refactored $IP/listen_address
In order to allow Scylla’s docker container to handle multiple network
interfaces, the start-scylla script was refactored:

- `$IP` is now called `$SCYLLA_LISTEN_ADDRESS`, so it is less likely to
   be confused or interfere with other environment variables.
- `$SCYLLA_LISTEN_ADDRESS` now checks its value and also tries to
   resolve a hostname, if no IP was set to it.
- `$SCYLLA_LISTEN_DEVICE` can now be set as environment variable and
   contain any available NIC device name (e.g. `eth0`). The script
   automatically retrieves the IP address from the device.
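The address check described above could be sketched in Python roughly as follows. This is only an illustration: the real start-scylla script is a shell script that is not shown here, and `resolve_listen_address` is a hypothetical name.

```python
import ipaddress
import socket


def resolve_listen_address(value):
    """Return `value` if it is an IP literal, otherwise resolve it as a hostname."""
    try:
        # Accepts both IPv4 and IPv6 literals and normalizes them.
        return str(ipaddress.ip_address(value))
    except ValueError:
        # Not an IP literal: assume it is a hostname and ask the resolver.
        return socket.gethostbyname(value)
```

For example, `resolve_listen_address("192.168.1.100")` returns the address unchanged, while a hostname is looked up through the system resolver.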

Usage:

1. With `$SCYLLA_LISTEN_ADDRESS` as IP:
`docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_ADDRESS=192.168.1.100 scylladb/scylla`

2. With `$SCYLLA_LISTEN_ADDRESS` as hostname:
`docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_ADDRESS=containername.network.lan scylladb/scylla`

3. With `$SCYLLA_LISTEN_DEVICE`:
`docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_DEVICE=eth0 scylladb/scylla`

Message-Id: <20161003151230.67672-1-marius@twostairs.com>
2016-10-04 13:56:55 +03:00
Raphael S. Carvalho
747b42299c database: remove unused code
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <95e1ed590c9e45d15f19a84824a4dce05aefdab8.1475528611.git.raphaelsc@scylladb.com>
2016-10-04 09:26:43 +03:00
Paweł Dziepak
7599ef6fde query_pager: fix splitting range at the end bound
Currently, the code responsible for calculating ranges for the next
request could produce a wrap-around partition range. For example, if the
original range was (unimportant, A] and the last partition key A then
the output range would be (A, A].

This patch adds checks to make sure that in such cases the range is
removed.
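A minimal Python sketch of the kind of check described here, assuming keys are plain comparable values and a range is written as (exclusive start, inclusive end]. The function name is hypothetical and this is not the query_pager code itself.

```python
def remaining_range(last_key, end_inclusive):
    """Return the range (last_key, end_inclusive] still to be read, or None if empty.

    Without the check, last_key == end_inclusive would yield (A, A], which
    reads as a wrap-around range instead of an empty one.
    """
    if last_key >= end_inclusive:
        return None  # nothing left in this range, so drop it
    return (last_key, end_inclusive)
```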

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1475497244-2790-1-git-send-email-pdziepak@scylladb.com>
2016-10-03 19:33:42 +02:00
Avi Kivity
8747054d10 exceptions: mark function called before construction static
cassandra_exception::prepare_message() is called from derived classes'
constructors before the base cassandra_exception object is constructed.
This is technically illegal but harmless.  Fix by marking the function
static.

Found by clang.
2016-10-03 16:29:02 +03:00
Calle Wilund
5b815b81b4 auth::password_authenticator: Ensure exceptions are processed in continuation
Fixes #1718 (even more)
Message-Id: <1475497389-27016-1-git-send-email-calle@scylladb.com>
2016-10-03 14:49:59 +02:00
Pekka Enberg
f3cd21c8f1 Merge seastar upstream
* seastar 0e60722...18f7bb8 (1):
  > core/memory: Fix compilation errors
2016-10-03 12:54:38 +03:00
Calle Wilund
d24d0f8f90 auth::password_authenticator: "authenticate" should not throw undeclared excpt
Fixes #1718

Message-Id: <1475487331-25927-1-git-send-email-calle@scylladb.com>
2016-10-03 12:53:30 +03:00
Avi Kivity
a51804eca8 Merge "token_restriction: Deal with minimum tokens" from Duarte
"This patch set ensures we can correctly handle queries
where the minimum token is specified."

* 'min-token/v3' of github.com:duarten/scylla:
  cql_query_test: Add test case for min/max token bounds
  token_restriction: Deal with minimum tokens
  partitioner: Parse token from bytes
2016-10-02 12:32:40 +03:00
Avi Kivity
5071f4c0bf Merge seastar upstream
* seastar 9e1d5db...0e60722 (9):
  > core/memory: Replace assert with bad_alloc in allocate_large()
  > chunked_fifo: avoid direct use of sized operator delete
  > memory: fix build without heap profiler
  > xen: initialize port::_sem
  > Merge "Make input streams skippable" from Paweł
  > semaphore: require explicit setting for start value
  > prometheus: remove invalid chars from metric names
  > core/memory: Introduce heap profiler
  > util/backtrace: Mark noexcept if func() doesn't throw
2016-10-02 11:43:22 +03:00
Vlad Zolotarov
7e180c7bd3 tracing: introduce the tracing::global_trace_state_ptr class
This object, similarly to a global_schema_ptr, allows to dynamically
create the trace_state_ptr objects on different shards in a context
of the original tracing session.

This object would create a secondary tracing session object from the
original trace_state_ptr object when a trace_state_ptr object is needed
on a "remote" shard, similarly to what we do when we need it on a remote
node.

Fixes #1678
Fixes #1647

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1474387767-21910-1-git-send-email-vladz@cloudius-systems.com>
2016-10-02 11:31:37 +03:00
Takuya ASADA
15b156c9d4 dist/common/scripts/scylla_io_setup: describe how to set developer mode when validation tests failed
Describe how to set developer mode, not to confuse users.
Fixes #1701

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1475167584-18092-1-git-send-email-syuu@scylladb.com>
2016-10-02 10:58:38 +03:00
Avi Kivity
58ddfea18f Merge "Fixes for leveled compaction strategy" from Raphael
* 'lcs_fixes' of github.com:raphaelsc/scylla:
  lcs: fix starvation at higher levels
  lcs: fix broken token range distribution at higher levels
2016-10-01 21:34:21 +03:00
Takuya ASADA
9639cc840e dist/redhat: add missing build time dependency for libunwind
A build-time dependency on libunwind was missing, so add it.
Fixes #1722

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1475260099-25881-1-git-send-email-syuu@scylladb.com>
2016-09-30 21:33:39 +03:00
Takuya ASADA
c89d9599b1 dist/ubuntu: add missing build time dependency for libunwind
A build-time dependency on libunwind was missing, so add it.
Fixes #1721

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1475255706-26434-1-git-send-email-syuu@scylladb.com>
2016-09-30 21:33:21 +03:00
Raphael S. Carvalho
a8ab4b8f37 lcs: fix starvation at higher levels
When max sstable size is increased, higher levels are suffering from
starvation because we decide to compact a given level if the following
calculation results in a number greater than 1.001:
level_size(L) / max_size_for_level_l(L)
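The compaction trigger quoted above can be written out as a small Python sketch. The 1.001 threshold and the ratio come from the message; the function names and the way the two sizes are passed in are assumptions made for illustration.

```python
COMPACTION_SCORE_THRESHOLD = 1.001  # threshold quoted in the commit message


def level_score(level_size, max_size_for_level):
    """Ratio of the bytes currently in a level to the bytes it is allowed to hold."""
    return level_size / max_size_for_level


def needs_compaction(level_size, max_size_for_level):
    """A level is a compaction candidate only when its score exceeds the threshold."""
    return level_score(level_size, max_size_for_level) > COMPACTION_SCORE_THRESHOLD
```

If the allowed size of a level grows with the max sstable size, the same amount of data yields a lower score, which is one way a higher level can stop qualifying for compaction.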

Fixes #1720.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-09-30 14:09:49 -03:00
Raphael S. Carvalho
a3bf7558f2 lcs: fix broken token range distribution at higher levels
Uniform token range distribution across sstables in a level > 1 was broken,
because we were only choosing the sstable with the lowest first key when
compacting a level > 0. This caused a performance problem because, for example,
L1->L2 may accumulate a huge overlap over time.
The last compacted key will now be stored for each level to ensure a sort of
"round robin" selection of sstables for compactions at level >= 1.
That's also done by C*, and they were once affected by it as described in
https://issues.apache.org/jira/browse/CASSANDRA-6284.
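The "round robin" selection can be illustrated with a short Python sketch. It assumes each sstable in a level is identified by its first key, that the keys are sorted, and that the function name is hypothetical; it is not the actual strategy code.

```python
import bisect


def pick_next_sstable(first_keys_sorted, last_compacted_key):
    """Pick the first sstable whose first key is after the last compacted key.

    Wraps around to the start of the level when no such sstable exists, so
    every sstable in the level eventually gets selected.
    """
    if not first_keys_sorted:
        return None
    if last_compacted_key is None:
        return first_keys_sorted[0]
    i = bisect.bisect_right(first_keys_sorted, last_compacted_key)
    return first_keys_sorted[i % len(first_keys_sorted)]
```

Always picking the lowest first key corresponds to ignoring `last_compacted_key`, which keeps selecting from the same end of the token range.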

Fixes #1719.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-09-30 14:09:16 -03:00
Paweł Dziepak
eb1fcf3ecc query_pagers: fix clustering key range calculation
Paging code assumes that clustering row range [a, a] contains only one
row which may not be true. Another problem is that it tries to use
range<> interface for dealing with clustering key ranges which doesn't
work because of the lack of correct comparator.

Refs #1446.
Fixes #1684.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1475236805-16223-1-git-send-email-pdziepak@scylladb.com>
2016-09-30 17:32:59 +02:00
Tomasz Grabiec
7e25b958ac transport: Extend request memory footprint accounting to also cover execution
CQL server is supposed to throttle requests so that they don't
overflow memory. The problem is that it currently accounts for
request's memory only around reading of its frame from the connection
and not actual request execution. As a result too many requests may be
allowed to execute and we may run out of memory.
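The difference between accounting only around frame reading and accounting through execution can be sketched in Python. The class and function names are hypothetical, and the real server uses asynchronous primitives rather than this synchronous simplification.

```python
class MemoryBudget:
    """Tracks how many bytes of request memory may still be admitted."""

    def __init__(self, total_bytes):
        self.available = total_bytes

    def try_admit(self, size):
        if size > self.available:
            return False  # the caller should throttle this request
        self.available -= size
        return True

    def release(self, size):
        self.available += size


def handle_request(budget, size, execute):
    """Hold the request's memory reservation until execution has finished."""
    if not budget.try_admit(size):
        return None
    try:
        return execute()
    finally:
        # Releasing here, rather than right after the frame is read, keeps
        # the request counted against the budget while it executes.
        budget.release(size)
```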

Fixes #1708.
Message-Id: <1475149302-11517-1-git-send-email-tgrabiec@scylladb.com>
2016-09-30 14:23:14 +01:00
Duarte Nunes
72af476397 cql_query_test: Add test case for min/max token bounds
This patch adds a test case for specifying the minimum and maximum
tokens in a cql3 query.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-30 11:45:45 +00:00
Duarte Nunes
98b4814894 token_restriction: Deal with minimum tokens
This patch fixes a bug where queries such as the following are not
handled properly:

"SELECT * FROM ks.cf WHERE token(id) >
9207857967443869328 AND token(id) <= -9223372036854775808"

Here -9223372036854775808 represents the minimum token, which we were
just translating into a token with kind::key, thus returning incorrect
results.
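A hedged Python sketch of the special case: the value -9223372036854775808 is the smallest signed 64-bit integer, and it has to map to a distinct "minimum" token rather than to an ordinary key token. The enum and function names below are illustrative and are not the partitioner's actual API.

```python
from enum import Enum

MIN_TOKEN_VALUE = -2**63  # -9223372036854775808


class TokenKind(Enum):
    MINIMUM = "minimum"  # sorts before every key token
    KEY = "key"


def token_from_long(value):
    """Map a 64-bit token value to (kind, value), treating the minimum specially."""
    if value == MIN_TOKEN_VALUE:
        return (TokenKind.MINIMUM, None)
    return (TokenKind.KEY, value)
```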

Ref #1139
Ref #693
Fixes #1717

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-30 11:17:08 +00:00
Duarte Nunes
862f51cddf partitioner: Parse token from bytes
This patch adds the from_bytes() function to the i_partitioner class,
whose purpose is to parse a particular token and explicitly handle the
case when the minimum token is specified.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-30 11:17:02 +00:00
Duarte Nunes
0c8f280af7 partition_key_view: Implement operator<<
The operator is declared, but it isn't implemented. This patch fixes
that.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1475225647-3800-1-git-send-email-duarte@scylladb.com>
2016-09-30 10:54:54 +02:00
Duarte Nunes
a36888f3cb storage_service: Convert token through partitioner
This patch ensures we use the partitioner to convert a token to
sstring instead of casting.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1475179683-28552-1-git-send-email-duarte@scylladb.com>
2016-09-30 10:54:26 +02:00
Tomasz Grabiec
91b1bada55 Merge seastar upstream
* seastar 5b7252d...9e1d5db (5):
  > prometheus: prevent illegal prometheus names
  > scollectd: raw_to_value should not use network order
  > semaphore: Introduce get_units()
  > core::scollectd: truncate the identifiers fields on a 63 characters boundary
  > Merge "Fix ASAN errors in debug builds" from Tomasz
2016-09-29 13:23:24 +02:00
Asias He
511f8aeb91 gossip: Do not remove failure_detector history on remove_endpoint
Otherwise a node could wrongly think the decommissioned node is still
alive and not evict it from the gossip membership.

Backport: CASSANDRA-10371

7877d6f Don't remove FailureDetector history on removeEndpoint

Fixes #1714
Message-Id: <f7f6f1eec2aab1b97a2e568acfd756cca7fc463a.1475112303.git.asias@scylladb.com>
2016-09-29 13:00:47 +03:00
Asias He
a6d6341627 streaming: Add total_{incoming,outgoing}_bytes collectd metrics
They reflect the number of bytes sent or received per second in streaming.

To use it:

$ tools/scyllatop/scyllatop.py "*streaming*"

Refs #1655
Message-Id: <5f7943cb2b459db5ed4bd8d7365532ea201ad2d9.1475116963.git.asias@scylladb.com>
2016-09-29 11:54:32 +02:00
Asias He
a6529ad582 repair: Fix split_and_add
Before: the range is split only once, so it is split into 2 sub ranges

  INFO  2016-09-29 15:52:43,625 [shard 0] repair - target_partitions=100, estimated_partitions=537, ranges.size=2,
  range=(8993553141924659802, 8997061146192366917] ->
  ranges={
  (8993553141924659802, 8995307144058513359], (8995307144058513359, 8997061146192366917]}

After: the range is split multiple times, resulting in 16 sub ranges.

  INFO  2016-09-29 15:55:07,934 [shard 0] repair - target_partitions=100, estimated_partitions=67, ranges.size=16,
  range=(8993553141924659802, 8997061146192366917] ->
  ranges={
  (8993553141924659802, 8993772392191391496], (8993772392191391496, 8993991642458123191],
  (8993991642458123191, 8994210892724854885], (8994210892724854885, 8994430142991586580],
  (8994430142991586580, 8994649393258318274], (8994649393258318274, 8994868643525049969],
  (8994868643525049969, 8995087893791781664], (8995087893791781664, 8995307144058513359],
  (8995307144058513359, 8995526394325245053], (8995526394325245053, 8995745644591976748],
  (8995745644591976748, 8995964894858708443], (8995964894858708443, 8996184145125440138],
  (8996184145125440138, 8996403395392171832], (8996403395392171832, 8996622645658903527],
  (8996622645658903527, 8996841895925635222], (8996841895925635222, 8997061146192366917]}

Without this patch, repair can checksum a range containing a lot of
partitions, not the expected fewer than 100 partitions per checksum. This
can lead to unnecessary data transfer since the checksum is too coarse.
For instance, as above, if the checksum of 1 out of 537 partitions is
different, all 537 partitions will be synced.
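Repeated splitting can be sketched in Python as below. This is an illustration under assumptions, not the repair code: it halves every sub range at its midpoint and halves the partition estimate each round until the estimate is at or below the target. The stopping rule in the actual patch may differ, so the number of sub ranges it produces need not match the log above.

```python
def split_range(start, end, estimated_partitions, target_partitions):
    """Split (start, end] into equal halves until the estimate fits the target."""
    ranges = [(start, end)]
    estimate = estimated_partitions
    while estimate > target_partitions:
        halved = []
        for lo, hi in ranges:
            mid = (lo + hi) // 2
            if mid == lo or mid == hi:
                return ranges  # the range is too narrow to split further
            halved.append((lo, mid))
            halved.append((mid, hi))
        ranges = halved
        estimate /= 2
    return ranges
```

A single halving of the range from the log gives the midpoint 8995307144058513359, the same boundary that appears in the "Before" output.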

Fixes #1613
Message-Id: <0775c20c485c105df5f10bd685048227f074c365.1475137029.git.asias@scylladb.com>
2016-09-29 10:09:25 +01:00
Pekka Enberg
20dccb4bf7 transport/server: Fix CQL Snappy compression failure
The snappy_compress() function expects the "compressed_length" parameter
to contain the actual output buffer length but now we're passing random
garbage from the stack.

Fixes #1711
Message-Id: <1475132127-316-1-git-send-email-penberg@scylladb.com>
2016-09-29 09:29:51 +01:00
Asias He
774d16306f gossip: Use lowres_clock for scheduled_gossip_task
The timer fires once per second, so a low-resolution clock is enough.
Message-Id: <1f21514e975afea6ac5c9dde18a881a41561da70.1475130948.git.asias@scylladb.com>
2016-09-29 10:03:14 +03:00
Piotr Jastrzebski
1948ec8061 Update README.md
Add --init to git submodules update.
It's needed for fmt.
Add libunwind-devel dependency to dnf install.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <4918237f91d985649c195035c02b2dd9e9a1ff68.1475087373.git.piotr@scylladb.com>
2016-09-29 10:02:34 +03:00
Gleb Natapov
32989d1e66 Merge seastar upstream
* seastar 2b55789...5b7252d (3):
  > Merge "rpc: serialize large messages into fragmented memory" from Gleb
  > Merge "Print backtrace on SIGSEGV and SIGABRT" from Tomasz
  > test_runner: avoid nested optionals

Includes patch from Gleb to adapt to seastar changes.
2016-09-28 17:34:16 +03:00
Pekka Enberg
9ea24c9d2b Merge "repair: less stream_plan and less streaming traffic" from Asias
"This series improves repair by

 1) using less streaming sessions

 2) reducing unnecessary streaming traffic

 3) fixing a hang during shutdown

 See commit log for "repair: Reduce stream_plan usage", "repair: Reduce
 unnecessary streaming traffic" and "streaming: Fail streaming sessions
 during shutdown" for details.

 Tested with repair_additional_test.py."
2016-09-28 09:54:15 +03:00
Gleb Natapov
c95df8f053 messaging_service: use correct value for listen_to_bc_address in a constructor used by tests
Also make sure to not listen on the same exact address twice in case
listen_address == broadcast_address. Scylla configuration code does not
allow such thing to be configured, but better to be safe.

Message-Id: <20160927102316.GO32178@scylladb.com>
2016-09-27 11:27:23 +01:00
Pekka Enberg
e35166af10 Merge "gossip: Fix expire_time for gossip membership removal" from Asias
"We currently use steady_clock which is not consistent on nodes in the cluster.
 Use system_clock for it.

 Fixes #1704"
2016-09-27 11:46:09 +03:00
Asias He
1292341d77 gossip: Improve the expire time logging
Print when the node will be removed from gossip membership, e.g.,

INFO  2016-09-27 08:54:49,262 [shard 0] gossip - Node 127.0.0.3 will be
removed from gossip at [2016-09-30 08:54:48]: (expire = 1475196888294489339,
now = 1474937689262295270, diff = 259199 seconds)
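The numbers in this log line can be checked with a few lines of Python. The sketch assumes that `expire` and `now` are nanoseconds since the Unix epoch and that the bracketed time was printed in a UTC+8 timezone (matching the commit's +08:00 offset); both are inferences from the values, not something the message states.

```python
from datetime import datetime, timedelta, timezone

NS_PER_SECOND = 10**9

expire_ns = 1475196888294489339
now_ns = 1474937689262295270

# Whole seconds until the node is removed from gossip membership.
diff_seconds = (expire_ns - now_ns) // NS_PER_SECOND

# Wall-clock removal time rendered in the assumed UTC+8 timezone.
utc_plus_8 = timezone(timedelta(hours=8))
expire_local = datetime.fromtimestamp(expire_ns // NS_PER_SECOND, utc_plus_8)
expire_text = expire_local.strftime("%Y-%m-%d %H:%M:%S")
```

Under those assumptions, `diff_seconds` is 259199 (one second short of three days) and `expire_text` is `2016-09-30 08:54:48`, in agreement with the log line.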
2016-09-27 16:42:35 +08:00
Asias He
f0d3084c8b gossip: Switch to use system_clock
The expire time which is used to decide when to remove a node from
gossip membership is gossiped around the cluster. We switched to steady
clock in the past. In order to have a consistent time_point in all the
nodes in the cluster, we have to use wall clock. Switch to use
system_clock for gossip.

Fixes #1704
2016-09-27 16:42:13 +08:00
Avi Kivity
bfa9aa5d23 Merge "Installing the node_exporter" from Amnon
"The Prometheus project and its subprojects do not have RPM/DEB packaging yet,
but they do provide binaries for download.

This series adds an installation script that downloads, installs, and runs
node_exporter as a service. For OSes that use systemd, a spec file is ready
and will be packaged with the system. For Ubuntu, a service file will be created
when running the installer.

After this series, running node_exporter_install leaves node_exporter running
as a service on the machine."
2016-09-27 11:00:00 +03:00
Asias He
802c25e67b repair: Switch to use make_streaming_reader in checksum calculation
In patch ac619820 (streaming: Switch to use make_streaming_reader), we
switched to use make_streaming_reader for streaming. In repair, the
checksum phase also uses a mutation reader. For the same reasons (no
pollution of the row cache, bounded new data after the reader is created),
switch repair checksum calculation to use make_streaming_reader too.

Fixes #382
Fixes #1682

Message-Id: <9e0ecda861bb0b6f690da5e2378b208159ffa41c.1474933195.git.asias@scylladb.com>
2016-09-27 10:58:31 +03:00
Tomasz Grabiec
c03568d687 Merge tag 'asias/read_data_from_sstable_in_streaming/v2' from seastar-dev.git
From Asias:

With this series, streaming and repair are improved:

    - streaming and repair will no longer pollute the row cache on the sender
      side. Currently, we risk evicting all the frequently-queried
      partitions from the cache when an operation like repair reads entire
      sstables and floods the row cache with swathes of cold data that it
      reads from disk.

    - less data will be sent because the reader will only return data that
      existed before the reader was created, plus a bounded amount
      of writes which arrive later. This helps reduce the streaming time
      when new data is being inserted all the time while streaming is
      in progress, e.g., adding a new node while there is a lot of CQL write
      workload.

Fixes #382 and #1682
2016-09-26 11:30:12 +02:00
Asias He
ac6198208b streaming: Switch to use make_streaming_reader
Using make_streaming_reader for streaming on the sender side has
the following advantages:

- streaming and repair will no longer pollute the row cache on the sender
  side. Currently, we risk evicting all the frequently-queried
  partitions from the cache when an operation like repair reads entire
  sstables and floods the row cache with swathes of cold data that it
  reads from disk.

- less data will be sent because the reader will only return data that
  existed before the reader was created, plus a bounded amount
  of writes which arrive later. This helps reduce the streaming time
  when new data is being inserted all the time while streaming is
  in progress, e.g., adding a new node while there is a lot of CQL write
  workload.

Fixes #382
Fixes #1682
2016-09-26 16:12:56 +08:00
Asias He
b505e34062 database: Introduce make_streaming_reader
The make_streaming_reader returns a combined mutation reader that reads
mutations from sstables and memtables. The memtable reader handles
memtable flushing automatically so no special handling is needed here.

It will be used by streaming soon.
2016-09-26 16:02:48 +08:00
Asias He
e5a5a9ba15 repair: Rename sync_ranges to request_transfer_ranges
To reflect the fact that the function does not sync the ranges but adds
the ranges to request from a peer or transfer to a peer.
2016-09-26 16:00:07 +08:00
Takuya ASADA
d38aa6570f dist/common/scripts/scylla_setup: do not ask to select disks when there's no free disk
When there's no free disk, it asks to select disks from empty list:

"Please select disks from following list:
type 'done' to finish selection. selected:"

We should avoid asking and abort RAID setup instead.

Fixes #1673

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1474429218-28382-1-git-send-email-syuu@scylladb.com>
2016-09-26 08:53:54 +03:00
Gleb Natapov
26ae8e8365 implement listen_on_broadcast_address option
When using multiple physical network interfaces, set this to true to
listen on broadcast_address in addition to the listen_address, allowing
nodes to communicate in both interfaces.  Ignore this property if the
network configuration automatically routes between the public and
private networks such as EC2.
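How the option affects the set of addresses to listen on can be sketched in Python. The function name is hypothetical, and the duplicate check is an extra guard for the case where both addresses are equal; this is not the messaging service code.

```python
def addresses_to_listen_on(listen_address, broadcast_address,
                           listen_on_broadcast_address):
    """Return the addresses to bind, never listing the same address twice."""
    addresses = [listen_address]
    if listen_on_broadcast_address and broadcast_address != listen_address:
        addresses.append(broadcast_address)
    return addresses
```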

Message-Id: <20160921094810.GA28654@scylladb.com>
2016-09-26 08:49:54 +03:00
Asias He
f377a3b7ac streaming: Fail streaming sessions during shutdown
Fixes repair_additional_test.py:RepairAdditionalTest.repair_kill_3_test

The test does:

- Insert data on node1 only
- Insert data on node2 only
- Run repair on node1 and stop node1
  once "starting user-requested repair" is seen

The repair shutdown code may wait for the stream session to complete for
a very long time if node 1 finishes sending data to node2 and is waiting
for node2 to send data to it, when node1 is stopped. The stream session
will not be closed in this case until stream session _keep_alive_timeout
(10 minutes) expires. Instead of waiting for the stream_session keep
alive timer to expire, we can fail all the stream sessions during
shutdown.

Before 1 - The bad case (repair shutdown will last for 10 minutes):

  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:23:56,618 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:23:58,626 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:23:58,626 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:23:58,626 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:23:58,626 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:23:58,669 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:25:56,624 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] The session 0x600021516c00 made no progress with peer 127.0.0.2

Before 2 - The good case:

  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:18:34,155 [shard 0] messaging_service - Retry verb=19 to 127.0.0.2:0, retry=10: rpc::closed_error (connection is closed)
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] COMPLETE_MESSAGE for 127.0.0.2 has failed: rpc::closed_error (connection is closed)
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Streaming error occurred
  INFO  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Session with 127.0.0.2 is complete, state=FAILED
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] bytes_sent = 0, bytes_received = 245000
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Stream failed, peers={127.0.0.2}
  WARN  2016-09-21 16:18:34,155 [shard 0] repair - repair's stream failed: streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:18:34,155 [shard 0] repair - repair 1 failed - streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:18:34,155 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:18:34,155 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:18:34,156 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:18:34,156 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:18:34,156 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:18:34,199 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:18:34,199 [shard 0] repair - Completed shutdown of repair
  INFO  2016-09-21 16:18:34,199 [shard 0] compaction_manager - Asked to stop
  INFO  2016-09-21 16:18:34,199 [shard 1] compaction_manager - Asked to stop

After:

  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:06:21,685 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Session with 127.0.0.2 is complete, state=FAILED
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - stream_manager stopped
  INFO  2016-09-21 16:06:23,688 [shard 1] storage_service - stream_manager stopped
  INFO  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] bytes_sent = 0, bytes_received = 25725
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: shutdown stream_manager done
  WARN  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Stream failed, peers={127.0.0.2}
  WARN  2016-09-21 16:06:23,688 [shard 0] repair - repair's stream failed: streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:06:23,688 [shard 0] repair - repair 1 failed - streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:06:23,688 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:06:23,688 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:06:23,688 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:06:23,688 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:06:23,774 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:06:23,774 [shard 0] repair - Completed shutdown of repair
  INFO  2016-09-21 16:06:23,774 [shard 0] compaction_manager - Asked to stop
  INFO  2016-09-21 16:06:23,774 [shard 1] compaction_manager - Asked to stop
2016-09-26 06:29:40 +08:00
Asias He
7c873f0d1f repair: Reduce unnecessary streaming traffic
If the remote peers all have the same checksum, we need to fetch from only
one of the peer nodes instead of all of them, since they all have the same
data anyway.

In addition to the above optimization, if the local peer has no data, we can
skip sending the data back to the remote peers. Because all
the remote peers have the same checksum and the local peer has no data,
each and every remote peer already has all the data. There is no need to merge
the remote data with local data and send the merged data back to the
remote peers.
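The two optimizations can be sketched in Python as a small planning function. The names are hypothetical, and the fallback behavior when the remote checksums disagree (fetch from and send to every differing peer) is an assumption added to make the sketch complete.

```python
def plan_repair(local_checksum, remote_checksums, local_has_data):
    """Return (peers_to_fetch_from, peers_to_send_to) for one range.

    remote_checksums maps a peer name to that peer's checksum for the range.
    """
    peers = sorted(remote_checksums)
    differing = [p for p in peers if remote_checksums[p] != local_checksum]
    if not differing:
        return [], []  # everything already matches
    if len(set(remote_checksums.values())) == 1:
        # All remote peers hold identical data: one source is enough.
        fetch_from = [peers[0]]
        # With no local data there is nothing new to merge and send back.
        send_to = peers if local_has_data else []
        return fetch_from, send_to
    # Assumed fallback: exchange data with every peer that differs.
    return differing, differing
```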

Refs: #1617
2016-09-26 06:28:51 +08:00
Asias He
99e77e8ec2 repair: Do not abort the repair when one range is failed
failed_ranges is added to track the ranges that fail during repair.
2016-09-26 06:28:51 +08:00
Asias He
81c98ff3d9 repair: Reduce stream_plan usage
Right now, we are using one stream_plan for each range of a column
family. This generates tons of stream_plans and stream_sessions. Each
stream_plan can transfer multiple ranges and column families. We can
use a single stream_plan to stream data for multiple ranges and column
families, so that 1) the overhead of stream_plan/session negotiation is
reduced and 2) it is much easier to debug/monitor a few stream_sessions.

Fixes #1685
2016-09-26 06:28:50 +08:00
Asias He
a0020fdad2 stream_session: Allow adding ranges to a cf more than once
Append the ranges to a stream_transfer_task if the cf is already added to
_transfers in add_transfer_ranges.
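The append-instead-of-reject behavior can be sketched in Python with a dictionary keyed by column family. The function only borrows the name from the message; the data layout is an assumption and not the stream_session implementation.

```python
def add_transfer_ranges(transfers, cf_id, ranges):
    """Add ranges for a column family, appending if the family is already present."""
    # setdefault creates the task's range list on first use and reuses it later.
    transfers.setdefault(cf_id, []).extend(ranges)
    return transfers
```

Calling it twice for the same `cf_id` leaves a single entry holding the ranges from both calls.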
2016-09-26 06:28:50 +08:00