Commit Graph

270 Commits

Author SHA1 Message Date
Asias He
49a73aa2fc streaming: Move stream_mutation_fragments_cmd to a new file (#4812)
Avoid including the lengthy stream_session.hh in messaging_service.

More importantly, fix the build because currently messaging_service.cc
and messaging_service.hh does not include stream_mutation_fragments_cmd.
I am not sure why it builds on my machine. Spotted this when backporting
the "streaming: Send error code from the sender to receiver" to 3.0
branch.

Refs: #4789
2019-08-07 14:59:46 +02:00
Asias He
bac987e32a streaming: Send error code from the sender to receiver
In case of error on the sender side, the sender does not propagate the
error to the receiver. The sender will close the stream. As a result,
the receiver will get nullopt from the source in
get_next_mutation_fragment and pass mutation_fragment_opt with no value
to the generating_reader. In turn, the generating_reader generates end
of stream. However, the last element that the generating_reader has
generated can be any type of mutation_fragment. This makes the sstable
that consumes the generating_reader violates the mutation_fragment
stream rule.

To fix, we need to propagate the error. However RPC streaming does not
support propagate the error in the framework. User has to send an error
code explicitly.

Fixes: #4789
2019-08-06 16:54:56 +02:00
Asias He
5d3e4d7b73 messaging_service: Check if messaging_service is stopped before get_rpc_client
get_rpc_client assumes the messaging_service is not stopped. We should check
is_stopping() before we call get_rpc_client.

We do such check in existing code, e.g., send_message and friends. Do
the same check in the newly introduced
make_sink_and_source_for_stream_mutation_fragments() and friends for row
level repair.

Fixes: #4767
2019-07-31 11:44:57 +03:00
Calle Wilund
c540e36fe2 gms::inet_address: Make serialization ipv6 aware
Because inet_address was initially hardcoded to
ipv4, its wire format is not very forward compatible.
Since we potentially need to communicate with older version nodes, we
manually define the new serial format for inet_address to be:

ipv4: 4  bytes address
ipv6: 4  bytes marker 0xffffffff (invalid address)
      16 bytes data -> address
2019-07-08 14:13:09 +00:00
Calle Wilund
e9816efe06 Remove usage of inet_address::raw_addr() 2019-07-08 14:13:09 +00:00
Calle Wilund
4ef940169f Replace use of "ipv4_addr" with socket_address
Allows the various sockets to use ipv6 address binding if so configured.
2019-07-08 14:13:09 +00:00
Asias He
37b3de4ea0 messaging_service: Add REPAIR_GET_FULL_ROW_HASHES_WITH_RPC_STREAM support
It is used by row level repair.
2019-07-02 21:18:55 +08:00
Asias He
a7c7ba9765 messaging_service: Add REPAIR_PUT_ROW_DIFF_WITH_RPC_STREAM support
It is used by row level repair.
2019-07-02 21:18:55 +08:00
Asias He
dc92bda93b messaging_service: Add REPAIR_GET_ROW_DIFF_WITH_RPC_STREAM support 2019-07-02 21:18:55 +08:00
Asias He
f312c95b74 messaging_service: Add do_make_sink_source helper
It is used by the row level repair rpc stream verbs to make sink and
source object.
2019-07-02 21:18:55 +08:00
Asias He
bc295a00a6 messaging_service: Add rpc stream verb for row level repair
- REPAIR_GET_ROW_DIFF_WITH_RPC_STREAM

Get repair rows from follower nodes

- REPAIR_PUT_ROW_DIFF_WITH_RPC_STREAM

Put repair rows to follower nodes

- REPAIR_GET_FULL_ROW_HASHES_WITH_RPC_STREAM:

Get full hashes from follower nodes
2019-07-02 21:18:55 +08:00
Asias He
3db136f81e repair: Use the same schema version for repair master and followers
Before this patch, repair master and followers use their own schema
version at the point repair starts independently. The schemas can be
different due to schema change. Repair uses the schema to serialize
mutation_fragment and deserialize the mutation_fragment received from
peer nodes. Using different schema version to serialize and deserialize
cause undefined behaviour.

To fix, we use the schema the repair master decides for all the repair
nodes involved.

On top of this patch, we could do another step to make sure all nodes
has the latest schema. But let's do it in a separate patch.

Fixes: #4549
Backports: 3.1
2019-06-18 18:27:21 +08:00
Asias He
b463d7039c repair: Introduce get_combined_row_hash_response
Currently, REPAIR_GET_COMBINED_ROW_HASH RPC verb returns only the
repair_hash object. In the future, we will use set reconciliation
algorithm to decode the full row hashes in working row buf. It is useful
to return the number of rows inside working row buf in addition to the
combined row hashes to make sure the decode is successful.

It is also better to use a wrapper class for the verb response so we can
extend the return values later more easily with IDL.

Fixes #4526
Message-Id: <93be47920b523f07179ee17e418760015a142990.1559771344.git.asias@scylladb.com>
2019-06-12 13:51:29 +03:00
Gleb Natapov
1d851a3892 messaging: catch an error that sending of CLIENT_ID may return
Avoid a warning about unhandled exception.

Message-Id: <20190506122718.GL21208@scylladb.com>
2019-05-06 18:13:51 +03:00
Paweł Dziepak
d47ea66ec6 messaging_service: add lz4_fragmented RPC compressor
Seastar now supports two RPC compression algorithm: the original LZ4 one
and LZ4_FRAGMENTED. The latter uses lz4 stream interface which allows it
to process large messages without fully linearising them. Since, RPC
requests used by Scylla often contain user-provided data that
potentially could be very large, LZ4_FRAGMENTED is a better choice for
the default compression algorithm.

Message-Id: <20190417144318.27701-1-pdziepak@scylladb.com>
2019-04-18 19:07:14 +03:00
Gleb Natapov
1abc50ad8a messaging_service: make sure a client is unique for a destination
Function messaging_service::get_rpc_client() suppose to either return
existing client or create one and return it. The function is suppose to
be atomic, so after checking that requested client does not exist it is
safe to assume emplace() will succeed. But we saw bugs that made the
function to not be atomic. Lets add an assert that will help to catch
such bugs easier if they will happen in the future.

Message-Id: <20190326115741.GX26144@scylladb.com>
2019-03-26 14:19:08 +02:00
Gleb Natapov
bb93d990ad messaging_service: keep shared pointer to an rpc connection while opening mutation fragment stream
Current code captures a reference to rpc::client in a continuation, but
there is no guaranty that the reference will be valid when continuation runs.
Capture shared pointer to rpc::client instead.

Fixes #4350.

Message-Id: <20190314135538.GC21521@scylladb.com>
2019-03-21 12:46:01 -03:00
Gleb Natapov
a70374d982 messaging_service: do not forget to close stream when sending it to another side failed
Fixes #4124

Message-Id: <20190131091857.GC3172@scylladb.com>
2019-01-31 12:01:56 +02:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Avi Kivity
c96fc1d585 Merge "Introduce row level repair" from Asias
"
=== How the the partition level repair works

- The repair master decides which ranges to work on.
- The repair master splits the ranges to sub ranges which contains around 100
partitions.
- The repair master computes the checksum of the 100 partitions and asks the
related peers to compute the checksum of the 100 partitions.
- If the checksum matches, the data in this sub range is synced.
- If the checksum mismatches, repair master fetches the data from all the peers
and sends back the merged data to peers.

=== Major problems with partition level repair

- A mismatch of a single row in any of the 100 partitions causes 100
partitions to be transferred. A single partition can be very large. Not to
mention the size of 100 partitions.

- Checksum (find the mismatch) and streaming (fix the mismatch) will read the
same data twice

=== Row level repair

Row level checksum and synchronization: detect row level mismatch and transfer
only the mismatch

=== How the row level repair works

- To solve the problem of reading data twice

Read the data only once for both checksum and synchronization between nodes.

We work on a small range which contains only a few mega bytes of rows,
We read all the rows within the small range into memory. Find the
mismatch and send the mismatch rows between peers.

We need to find a sync boundary among the nodes which contains only N bytes of
rows.

- To solve the problem of sending unnecessary data.

We need to find the mismatched rows between nodes and only send the delta.
The problem is called set reconciliation problem which is a common problem in
distributed systems.

For example:
Node1 has set1 = {row1, row2, row3}
Node2 has set2 = {      row2, row3}
Node3 has set3 = {row1, row2, row4}

To repair:
Node1 fetches nothing from Node2 (set2 - set1), fetches row4 (set3 - set1) from Node3.
Node1 sends row1 and row4 (set1 + set2 + set3 - set2) to Node2
Node1 sends row3 (set1 + set2 + set3 - set3) to Node3.

=== How to implement repair with set reconciliation

- Step A: Negotiate sync boundary

class repair_sync_boundary {
    dht::decorated_key pk;
    position_in_partition position
}

Reads rows from disk into row buffers until the size is larger than N
bytes. Return the repair_sync_boundary of the last mutation_fragment we
read from disk. The smallest repair_sync_boundary of all nodes is
set as the current_sync_boundary.

- Step B: Get missing rows from peer nodes so that repair master contains all the rows

Request combined hashes from all nodes between last_sync_boundary and
current_sync_boundary. If the combined hashes from all nodes are identical,
data is synced, goto Step A. If not, request the full hashes from peers.

At this point, the repair master knows exactly what rows are missing. Request the
missing rows from peer nodes.

Now, local node contains all the rows.

- Step C: Send missing rows to the peer nodes

Since local node also knows what peer nodes own, it sends the missing rows to
the peer nodes.

=== How the RPC API looks like

- repair_range_start()

Step A:
- request_sync_boundary()

Step B:
- request_combined_row_hashes()
- reqeust_full_row_hashes()
- request_row_diff()

Step C:
- send_row_diff()

- repair_range_stop()

=== Performance evaluation

We created a cluster of 3 Scylla nodes on AWS using i3.xlarge instance. We
created a keyspace with a replication factor of 3 and inserted 1 billion
rows to each of the 3 nodes. Each node has 241 GiB of data.
We tested 3 cases below.

1) 0% synced: one of the node has zero data. The other two nodes have 1 billion identical rows.

Time to repair:
   old = 87 min
   new = 70 min (rebuild took 50 minutes)
   improvement = 19.54%

2) 100% synced: all of the 3 nodes have 1 billion identical rows.
Time to repair:
   old = 43 min
   new = 24 min
   improvement = 44.18%

3) 99.9% synced: each node has 1 billion identical rows and 1 billion * 0.1% distinct rows.

Time to repair:
   old: 211 min
   new: 44 min
   improvement: 79.15%

Bytes sent on wire for repair:
   old: tx= 162 GiB,  rx = 90 GiB
   new: tx= 1.15 GiB, tx = 0.57 GiB
   improvement: tx = 99.29%, rx = 99.36%

It is worth noting that row level repair sends and receives exactly the
number of rows needed in theory.

In this test case, repair master needs to receives 2 million rows and
sends 4 million rows. Here are the details: Each node has 1 billion *
0.1% distinct rows, that is 1 million rows. So repair master receives 1
million rows from repair slave 1 and 1 million rows from repair slave 2.
Repair master sends 1 million rows from repair master and 1 million rows
received from repair slave 1 to repair slave 2. Repair master sends
sends 1 million rows from repair master and 1 million rows received from
repair slave 2 to repair slave 1.

In the result, we saw the rows on wire were as expected.

tx_row_nr  = 1000505 + 999619 + 1001257 + 998619 (4 shards, the numbers are for each shard) = 4'000'000
rx_row_nr  =  500233 + 500235 +  499559 + 499973 (4 shards, the numbers are for each shard) = 2'000'000

Fixes: #3033

Tests: dtests/repair_additional_test.py
"

* 'asias/row_level_repair_v7' of github.com:cloudius-systems/seastar-dev: (51 commits)
  repair: Enable row level repair
  repair: Add row_level_repair
  repair: Add docs for row level repair
  repair: Add repair_init_messaging_service_handler
  repair: Add repair_meta
  repair: Add repair_writer
  repair: Add repair_reader
  repair: Add repair_row
  repair: Add fragment_hasher
  repair: Add decorated_key_with_hash
  repair: Add get_random_seed
  repair: Add get_common_diff_detect_algorithm
  repair: Add shard_config
  repair: Add suportted_diff_detect_algorithms
  repair: Add repair_stats to repair_info
  repair: Introduce repair_stats
  flat_mutation_reader:  Add make_generating_reader
  storage_service: Introduce ROW_LEVEL_REPAIR feature
  messaging_service: Add RPC verbs for row level repair
  repair: Export the repair logger
  ...
2018-12-25 13:13:00 +02:00
Duarte Nunes
ede5742f9b service/storage_proxy: Send view update backlog from replicas
Change the inter-node protocol so we can propagate the view update
backlog from a base replica to the coordinator through the
mutation_done and mutation_failed verbs.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-12-19 22:38:30 +00:00
Botond Dénes
1865e5da41 treewide: remove include database.hh from headers where possible
Many headers don't really need to include database.hh, the include can
be replaced by forward declarations and/or including the actually needed
headers directly. Some headers don't need this include at all.

Each header was verified to be compilable on its own after the change,
by including it into an empty `.cc` file and compiling it. `.cc` files
that used to get `database.hh` through headers that no longer include it
were changed to include it themselves.
2018-12-14 08:03:57 +02:00
Asias He
acc9ff8dce messaging_service: Add RPC verbs for row level repair
This patch adds the RPC verbs that are needed by the row level repair.
The usage of those verbs are in the following patches.

All the verbs for row level repair are sent by the repair master.
Repair master asks repair slaves to create repair meta objects, a.k.a,
repair_meta object, to store the repair meta data needed by row level
repair algorithm. The repair meta object is identified by the IP address
of the repair master and a uint32 number repair_meta_id chosen by repair
master. When repair master restarts or is out of the cluster, repair
slaves will detect it and remove all existing repair_meta for the repair
master. When repair slave restarts, the existing repair_meta on the
slave will be gone.

The sync boundary used in the verbs is the position_in_partition of the
last mutation_fragment. In each repair round, peers work on
(last_sync_boundary, current_sync_boundary]
2018-12-12 16:49:01 +08:00
Asias He
063dfcda26 messaging_service: Add constructor for msg_addr
Which takes the ip address and shard id.
2018-12-12 16:49:01 +08:00
Avi Kivity
775b7e41f4 Update seastar submodule
* seastar d59fcef...b924495 (2):
  > build: Fix protobuf generation rules
  > Merge "Restructure files" from Jesse

Includes fixup patch from Jesse:

"
Update Seastar `#include`s to reflect restructure

All Seastar header files are now prefixed with "seastar" and the
configure script reflects the new locations of files.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com>
"
2018-11-21 00:01:44 +02:00
Gleb Natapov
d144e6ceac messaging_service: enable port load balancing algorithm for RPC server
In a homogeneous cluster this will reduce number of internal cross-shard hops
per request since RPC calls will arrive to correct shard.

Message-Id: <20181118150817.GF2062@scylladb.com>
2018-11-20 16:15:12 +00:00
Asias He
7f826d3343 streaming: Expose reason for streaming
On receiving a mutation_fragment or a mutation triggered by a streaming
operation, we pass an enum stream_reason to notify the receiver what
the streaming is used for. So the receiver can decide further operation,
e.g., send view updates, beyond applying the streaming data on disk.

Fixes #3276
Message-Id: <f15ebcdee25e87a033dcdd066770114a499881c0.1539498866.git.asias@scylladb.com>
2018-10-15 22:03:28 +01:00
Calle Wilund
3cb50c861d messaging_service: Make rpc streaming sink respect tls connection
Fixes #3787

Message service streaming sink was created using direct call to
rpc::client::make_sink. This in turn needs a new socker, which it
creates completely ignoring what underlying transport is active for the
client in question.

Fix by retaining the tls credential pointer in the client wrapper, and
using this in a sink method to determine whether to create a new tls
socker, or just go ahead with a plain one.

Message-Id: <20181010003249.30526-1-calle@scylladb.com>
2018-10-10 12:55:28 +03:00
Avi Kivity
4553238653 messaging: fix unbounded allocation in TLS RPC server
The non-TLS RPC server has an rpc::resource_limits configuration that limits
its memory consumption, but the TLS server does not. That means a many-node
TLS configuration can OOM if all nodes gang up on a single replica.

Fix by passing the limits to the TLS server too.

Fixes #3757.
Message-Id: <20180907192607.19802-1-avi@scylladb.com>
2018-09-10 12:11:16 +01:00
Duarte Nunes
a025bf6a7d Merge seastar upstream
Seastar introduced a "compat" namespace, which conflicts with Scylla's
own "compat" namespaces. The merge thus includes changes to scope
uses of Scylla's "compat" namespaces.

* seastar 8ad870f...9bb1611  (5):
  > util/variant_utils: Ensure variant_cast behaves well with rvalues
  > util/std-compat: Fix infinite recursion
  > doc/tutorial: Undo namespace changes
  > util/variant_utils: Add cast_variant()
  > Add compatbility with C++17's library types

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-08-14 13:07:09 +01:00
Avi Kivity
c4013f6fe1 messaging: categorize more streaming/repair verbs as streaming
Since the messaging service will assign a scheduling group based
on the client index, it's more important now to get the verbs categorized
correctly.

Re-categorize REPLICATION_FINISHED, REPAIR_CHECKSUM_RANGE, and most
importantly STREAM_MUTATION_FRAGMENTS to the repair/streaming oriented
connections so we get the correct scheduling.
2018-07-15 15:44:10 +03:00
Avi Kivity
ff3d7839ab messaging: remove default when computing rpc client index
A default means that when adding new verbs, we may forget to
categorize a verb correctly.  Without the default, the compiler
will complain due to -Wswitch.
2018-07-15 15:40:29 +03:00
Avi Kivity
fe2db68be8 messaging: convert do_get_rpc_client_idx into a switch
A switch is more readable for multiple choice with no
clearly preferred choice.
2018-07-15 15:26:50 +03:00
Avi Kivity
3b1e04091c messaging: choose connection index via a look-up table
Looking up is faster than a bunch of if()s.
2018-07-15 15:21:06 +03:00
Avi Kivity
8ee807321f Merge "scylla streaming with rpc streaming" from Asias
"
This work is on top of Gleb's rpc streaming which is merged recently.

What this series does is to replace scylla streaming service's data plane to
use the new rpc streaming instead of the old rpc verb to send the mutations for
scylla streaming. Other parts of scylla streaming, the control plane, are not
changed.

In my test, to bootstrap a new node to the existing one node cluster, smp 2,
scylla stores data on ramdisk to minimize disk io impact.

I saw x2 improvment in streaming bandwidth.

Before:
   [shard 0] stream_session - [Stream #2ae92320-5fc8-11e8-911a-000000000000]
   Streaming plan for Bootstrap-ks3-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=1570312 KiB, 109521.02 KiB/s
   [shard 0] range_streamer - Bootstrap with 127.0.0.1 for keyspace=ks3 succeeded, took 14.338 seconds

After:
   [shard 0] stream_session - [Stream #e5589ac0-5fc7-11e8-b463-000000000000]
   Streaming plan for Bootstrap-ks3-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=1546875 KiB, 220415.36 KiB/s
   [shard 0] range_streamer - Bootstrap with 127.0.0.1 for keyspace=ks3 succeeded, took 7.018 seconds

Tests: dtest update_cluster_layout_tests.py

Fixes: #3591
"

* tag 'asias/scylla_streaming_with_rpc_streaming_v8' of github.com:scylladb/seastar-dev:
  streaming: Add rpc streaming support
  storage_service: Introduce STREAM_WITH_RPC_STREAM feature
  streaming: Add estimate_partitions to send_info
  messaging_service: Add streaming with rpc streaming support
  messaging_service: Add streaming_domain
  database: Add add_sstable_and_update_cache
  database: Add make_streaming_sstable_for_write
2018-07-15 12:36:52 +03:00
Avi Kivity
8c993e0728 messaging: tag RPC services with scheduling groups
Assign a scheduling_group for each RPC service. Assignement is
done by connection (get_rpc_client_idx()) - all verbs on the
same connection are assigned the same group. While this may seem
arbitrary, it avoids priority inversion; if two verbs on the same
connection have different scheduling groups, the verb with the low
shares may cause a backlog and stall the connection, including
following requests from verbs that ought to have higher shares.

The scheduling_group parameters are encapsulated in different
classes as they are passed around to avoid adding dependencies.
Message-Id: <20180708140433.6426-1-avi@scylladb.com>
2018-07-13 13:57:08 +02:00
Asias He
ddfb4590ce messaging_service: Add streaming with rpc streaming support
Preparation for adding rpc streaming in scylla streaming.

- register_stream_mutation_fragments is used to register the rpc
streaming verb

- make_sink_and_source_for_stream_mutation_fragments is used to get the
sink and source object for the sender

- make_sink_for_stream_mutation_fragments is used to get a sink object
for the receiver
2018-07-13 08:36:46 +08:00
Asias He
671e1b08fe messaging_service: Add streaming_domain
The rpc streaming needs a streaming_domain id for the same logical
server. Chose one for our messaging service.
2018-07-13 08:36:46 +08:00
Gleb Natapov
646e400918 Provide available memory size to messaging_service object during creation 2018-06-11 15:34:13 +03:00
Avi Kivity
dd12214628 messaging_service: move msg_addr into its own header file
Make it possible to use msg_addr without depending on messaging_service.hh.
2018-03-12 20:05:23 +02:00
Avi Kivity
cd668061fc storage_service: remove system_keyspace.hh include
Re-distribute include among the files that really need it.
2018-03-11 18:53:49 +02:00
Duarte Nunes
440ea56010 message/messaging_service: Specify algorithm when requesting digest
While not strictly needed, specify which algorithm to use when request
a digest from a remote node. This is more flexible than relying on a
cluster wide feature, although that's what we'll do in subsequent
patches. It also makes the verb more consistent with the data request.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 01:02:50 +00:00
Glauber Costa
08a0c3714c allow request-specific read timeouts in storage proxy reads
Timeouts are a global property. However, for tables in keyspaces like
the system keyspace, we don't want to uphold that timeout--in fact, we
wan't no timeout there at all.

We already apply such configuration for requests waiting in the queued
sstable queue: system keyspace requests won't be removed. However, the
storage proxy will insert its own timeouts in those requests, causing
them to fail.

This patch changes the storage proxy read layer so that the timeout is
applied based on the column family configuration, which is in turn
inherited from the keyspace configuration. This matches our usual
way of passing db parameters down.

In terms of implementation, we can either move the timeout inside the
abstract read executor or keep it external. The former is a bit cleaner,
the the latter has the nice property that all executors generated will
share the exact same timeout point. In this patch, we chose the latter.

We are also careful to propagate the timeout information to the replica.
So even if we are talking about the local replica, when we add the
request to the concurrency queue, we will do it in accordance with the
timeout specified by the storage proxy layer.

After this patch, Scylla is able to start just fine with very low
timeouts--since read timeouts in the system keyspace are now ignored.

Fixes #2462

Implementation notes, and general comments about open discussion in 2462:

* Because we are not bypassing the timeout, just setting it high enough,
  I consider the concerns about the batchlog moot: if we fail for any
  other reason that will be propagated. Last case, because the timeout
  is per-CF, we could do what we do for the dirty memory manager and
  move the batchlog alone to use a different timeout setting.

* Storage proxy likes specifying its timeouts as a time_point, whereas
  when we get low enough as to deal with the read_concurrency_config,
  we are talking about deltas. So at some point we need to convert time_points
  to durations. We do that in the database query functions.

v2:
- use per-request instead of per-table timeouts.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-12 07:43:21 -05:00
Vlad Zolotarov
be6f8be9cb messaging_service: fix a mutli-NIC support
Don't enforce the outgoing connections from the 'listen_address'
interface only.

If 'local_address' is given to connect() it will enforce it to use a
particular interface to connect from, even if the destination address
should be accessed from a different interface. If we don't specify the
'local_address' the source interface will be chosen according to the
routing configuration.

Fixes #3066

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1513372688-21595-1-git-send-email-vladz@scylladb.com>
2017-12-17 10:51:20 +02:00
Gleb Natapov
16964de1f3 storage_proxy: fail read/write requests early if it cannot be completed due to errors
If errors make reaching CL impossible a request can be aborted earlier
without waiting for timeout.
2017-12-05 16:46:25 +02:00
Duarte Nunes
1fbe9dc851 message/messaging_service: Close all server sockets
We were stopping the loop prematurely.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20171127181417.8167-1-duarte@scylladb.com>
2017-11-28 11:08:08 +02:00
Asias He
8fa35d6ddf messaging_service: Get rid of timeout and retry logic for streaming verb
With the "Use range_streamer everywhere" (7217b7ab36) seires, all
the user of streaming now do streaming with relative small ranges and
can retry streaming at higher level.

There are problems with timeout and retry at RPC verb level in streaming:
1) Timeout can be false negative.
2) We can not cancel the send operations which are already called. When
user aborts the streaming, the retry logic keeps running for a long
time.

This patch removes all the timeout and retry logic for streaming verbs.
After this, the timeout is the job of TCP, the retry is the job of the
upper layer.

Message-Id: <df20303c1fa728dcfdf06430417cf2bd7a843b00.1503994267.git.asias@scylladb.com>
2017-08-29 17:20:00 +03:00
Avi Kivity
3edec66903 Revert "repair: Make send_repair_checksum_range timeout"
This reverts commit 98757069a5. We have the
failure detector which will detect an unresponsive node and fail the RPC.
Adding a timeout can just introduce false positives.
2017-08-06 13:09:36 +03:00
Asias He
98757069a5 repair: Make send_repair_checksum_range timeout
If the verb never returns the repair will hangs forever. Make it use the
timeout version of the send_message.

Fixes #2662
2017-08-02 21:41:50 +08:00
Duarte Nunes
85e85ec72e Don't catch polymorphic exceptions by value
It makes gcc a very sad compiler.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170726172053.5639-2-duarte@scylladb.com>
2017-07-27 09:39:58 +03:00