Commit Graph

37 Commits

Author SHA1 Message Date
Asias He
bdd6a69af7 streaming: Drop unused parameters
- int connections_per_host

Scylla does not create connections per stream_session, instead it uses
rpc, thus connections_per_host is not relevant to scylla.

- bool keep_ss_table_level
- int repaired_at

Scylla does not stream sstable files. They are not relevant to scylla.
2016-01-25 11:38:13 +08:00
Asias He
88e99e89d6 streaming: Add more debug info
- Add debug for the peer address info
- Add debug in stream_transfer_task and stream_receive_task
- Add debug when cancel the keep_alive timer
- Add debug for has_active_sessions in stream_result_future::maybe_complete
2016-01-22 07:43:16 +08:00
Asias He
2345cda42f messaging_service: Rename shard_id to msg_addr
Use shard_id as the destination of the messaging_service is confusing,
since shard_id is used in the context of cpu id.
Message-Id: <8c9ef193dc000ef06f8879e6a01df65cf24635d8.1452155241.git.asias@scylladb.com>
2016-01-07 10:36:35 +02:00
Asias He
22d0525bc0 streaming: Get rid of the _from_ parameter
Get this from cinfo.retrieve_auxiliary inside the rpc handler.
2015-12-31 11:25:08 +01:00
Asias He
89b79d44de streaming: Get rid of the _connecting_ parameter
messaging_service will use private ip address automatically to connect a
peer node if possible. There is no need for the upper level like
streaming to worry about it. Drop it simplifies things a bit.
2015-12-31 11:25:08 +01:00
Avi Kivity
827a4d0010 Merge "streaming: Invalidate cache upon receiving of stream" from Asias
"When a node gain or regain responsibility for certain token ranges, streaming
will be performed, upon receiving of the stream data, the row cache
is invalidated for that range.

Refs #484."
2015-12-28 10:24:46 +02:00
Asias He
c971fad618 streaming: Introduce keep alive timer for each stream_session
If the session is idle for 10 minutes, close the session. This can
detect the following hangs:

1) if the sending node is gone, the receiving peer will wait forever
2) if the node which should send COMPLETE_MESSAGE to the peer node is
gone, the peer node will wait forever

Fixes simple_kill_streaming_node_while_bootstrapping_test.
2015-12-24 20:34:44 +08:00
Asias He
2d32195c32 streaming: Invalidate cache upon receiving of stream
When a node gain or regain responsibility for certain token ranges,
streaming will be performed, upon receiving of the stream data, the
row cache is invalidated for that range.

Refs #484.
2015-12-21 14:44:13 +08:00
Asias He
242e5ea291 streaming: Ignore remote no_such_column_family for stream_transfer_task
When we start to sending mutations for cf_id to remote node, remote node
might do not have the cf_id anymore due to dropping of the cf for
instance.

We should not fail the streaming if this happens, since the cf does not
exist anymore there is no point streaming it.

Fixes #566
2015-11-18 15:12:23 +02:00
Asias He
6ac54a27dc streaming: Skip non-exist cf for stream_transfer_task
Skip sending the mutation if the cf is dropped after we call
make_local_reader in stream_session::add_transfer_ranges().

Fix #550.
2015-11-16 16:48:35 +01:00
Asias He
860c7aff37 streaming: Print plan_id in logger 2015-11-10 15:39:34 +08:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Avi Kivity
b22a598efb mutation_reader: make noncopyable
Many mutation_reader implementations capture 'this', which, if copied,
becomes invalid.  Protect against this error my making mutation_reader
a non-copyable object.

Fix inadvertant copied around the code base.
2015-08-25 15:49:08 +03:00
Asias He
fd1c0e0bb3 streaming: Fix iterate and delete
The problem is that in start_streaming_files we iterate the _transfers
map, however in task.start() we can delete the task from _transfers:
stream_transfer_task::start() -> stream_transfer_task::complete ->
stream_session::task_completed -> _transfers.erase(completed_task.cf_id)

To fix, we advance the iterator before we start the task.

std::_Rb_tree_increment(std::_Rb_tree_node_base const*) () from
/lib64/libstdc++.so.6
/usr/include/c++/5.1.1/bits/stl_tree.h:205
(this=this@entry=0x6000000dc290) at streaming/stream_transfer_task.cc:55
streaming::stream_session::start_streaming_files
(this=this@entry=0x6000000ab500) at streaming/stream_session.cc:526
(this=0x6000000ab500, requests=std::vector of length 1, capacity 1 =
{...}, summaries=std::vector of length 1, capacity 1 = {...})
    at streaming/stream_session.cc:356
streaming/stream_session.cc:83
2015-08-17 11:00:30 +08:00
Asias He
d2e826d6e6 streaming: Log STREAM_MUTATION_DONE before sending it
It is useful for debug.
2015-08-17 11:00:30 +08:00
Asias He
0f1f710b27 streaming: Introduce transfer_task_completed 2015-08-17 11:00:30 +08:00
Asias He
651200c123 streaming: Log exception
It is easier to tell what is going wrong.
2015-08-17 10:52:30 +08:00
Asias He
aa012ba374 streaming: Send STREAM_MUTATION in parallel
At the moment, when local node send a mutation to remote node, it will
wait for remote node to apply the mutation and send back a response,
then it will send the next mutation. This means the sender are sending
mutations one by one. To optimize, we can make the sender send more
mutations in parallel without waiting for the response. In order to
apply back pressure from remote node, a per shard mutation send limiter
is introduced so that the sender will not overwhelm the receiver.
2015-08-17 10:52:30 +08:00
Asias He
e13d93b2ff streaming: Improve error handling in stream_transfer_task::complete 2015-08-10 14:49:34 +08:00
Asias He
c7c33a9f44 streaming: Add error handling for STREAM_MUTATION sending 2015-08-10 14:44:25 +08:00
Asias He
be4d9c63b1 streaming: Drop do_with in stream_transfer_task::start
We can copy id instead, it is cheap.
2015-08-10 14:13:15 +08:00
Asias He
f9109c33ba streaming: Implement stream_transfer_task completion logic 2015-07-21 16:12:54 +08:00
Asias He
f2960a7cb0 streaming: Send plan_id for STREAM_MUTATION
We need this to find session associated with this frozen_mutation.
2015-07-21 16:12:54 +08:00
Asias He
ccb32ceec5 streaming: Add stream_transfer_task::complete 2015-07-21 16:12:54 +08:00
Asias He
8561315cf2 streaming: de-thread_local-ize logger 2015-07-21 16:12:54 +08:00
Asias He
857fa5ccbb messaging_service: Add wrapper for STREAM_MUTATION verb 2015-07-16 17:19:51 +08:00
Asias He
d720dadf7b streaming: Switch to use logger class 2015-07-14 20:56:28 +08:00
Asias He
e82bdf2995 streaming: Swith to use shared_ptr from std::shared_ptr
Since our shared_ptr works with incomplete types now, switch to it.
2015-07-14 20:41:14 +08:00
Asias He
8fd8f39d63 streaming: Add more debug info for message exchange 2015-07-14 20:41:14 +08:00
Asias He
ca7f5ca5c9 streaming: Set proper dst_cpu_id in shard_id for PREPARE_MESSAGE and STREAM_MUTATION 2015-07-14 20:41:14 +08:00
Asias He
14ae9e66ae streaming: Use shared_ptr to track back to stream_session
I tried our lw_shared_ptr, the compiler complained endless usage of
incomplete type stream_session. I can not include stream_session.hh
everywhere due to circular dependency.

For now, I'm using std::shared_ptr which works fine.
2015-07-14 20:41:14 +08:00
Asias He
b7b0aa3318 streaming: Negotiate core to core connection.
In streaming code, we need core to core connection(the second connection
from B to A). That is when node A initiates a stream to node B, it is
possible that node A will transfer data to node B and vice verse, so we
need two connections. When node A creates a tcp connection (within the
messaging_service) to node B, we have a connection ip_a:core_a to
ip_b:core_b. When node B creates a connection to node B, we can not
guarantee it is ip_b:core_b to ip_a:core_a.

Current messaging_service does not support core to core connection yet,
although we use shard_id{ip, cpu_id} as the destination of the message.

We can solve the issue in upper layer. We can pass extra cpu_id as a
user msg.

Node A sends stream_init_message with my_cpu_id = current_cpu_id

Node B receives stream_init_message, it runs on whatever cpu this
connection goes to, then it sends response back with Node B's
current_cpu_id.

After this, each node knows which cpu_id to send to each other.

TODO: we need to handle the case when peer node reboots with different
number of cpus.
2015-07-09 15:52:28 +08:00
Asias He
3256a21556 streaming: Use frozen_mutation to send mutations
Each outgoing_file_message might contain multiple mutations. Send them
one mutation per RPC call (using frozen_mutation), instead of one big
outgoing_file_message per one RPC call.
2015-07-09 15:52:28 +08:00
Asias He
4718211d4a streaming: Wire up stream_transfer_task::add_transfer_file
Wire up with outgoing_file_message
2015-07-09 15:52:27 +08:00
Asias He
ad3692f666 streaming: Implement stream_session::add_transfer_ranges
Given keyspace names, ranges and column_families names, figure out
mutation_readers to transfer.
2015-07-09 15:52:27 +08:00
Asias He
4c9af76261 streaming: Move add_transfer_file to source file
Reduce dependency to stream_session
2015-06-24 16:13:30 +08:00
Asias He
334b1f81fc streaming: Convert StreamTransferTask.java to C++ 2015-06-18 14:55:07 +08:00