- int connections_per_host
Scylla does not create connections per stream_session, instead it uses
rpc, thus connections_per_host is not relevant to scylla.
- bool keep_ss_table_level
- int repaired_at
Scylla does not stream sstable files. They are not relevant to scylla.
- Add debug for the peer address info
- Add debug in stream_transfer_task and stream_receive_task
- Add debug when cancel the keep_alive timer
- Add debug for has_active_sessions in stream_result_future::maybe_complete
messaging_service will use private ip address automatically to connect a
peer node if possible. There is no need for the upper level like
streaming to worry about it. Drop it simplifies things a bit.
"When a node gain or regain responsibility for certain token ranges, streaming
will be performed, upon receiving of the stream data, the row cache
is invalidated for that range.
Refs #484."
If the session is idle for 10 minutes, close the session. This can
detect the following hangs:
1) if the sending node is gone, the receiving peer will wait forever
2) if the node which should send COMPLETE_MESSAGE to the peer node is
gone, the peer node will wait forever
Fixes simple_kill_streaming_node_while_bootstrapping_test.
When a node gain or regain responsibility for certain token ranges,
streaming will be performed, upon receiving of the stream data, the
row cache is invalidated for that range.
Refs #484.
When we start to sending mutations for cf_id to remote node, remote node
might do not have the cf_id anymore due to dropping of the cf for
instance.
We should not fail the streaming if this happens, since the cf does not
exist anymore there is no point streaming it.
Fixes#566
Many mutation_reader implementations capture 'this', which, if copied,
becomes invalid. Protect against this error my making mutation_reader
a non-copyable object.
Fix inadvertant copied around the code base.
The problem is that in start_streaming_files we iterate the _transfers
map, however in task.start() we can delete the task from _transfers:
stream_transfer_task::start() -> stream_transfer_task::complete ->
stream_session::task_completed -> _transfers.erase(completed_task.cf_id)
To fix, we advance the iterator before we start the task.
std::_Rb_tree_increment(std::_Rb_tree_node_base const*) () from
/lib64/libstdc++.so.6
/usr/include/c++/5.1.1/bits/stl_tree.h:205
(this=this@entry=0x6000000dc290) at streaming/stream_transfer_task.cc:55
streaming::stream_session::start_streaming_files
(this=this@entry=0x6000000ab500) at streaming/stream_session.cc:526
(this=0x6000000ab500, requests=std::vector of length 1, capacity 1 =
{...}, summaries=std::vector of length 1, capacity 1 = {...})
at streaming/stream_session.cc:356
streaming/stream_session.cc:83
At the moment, when local node send a mutation to remote node, it will
wait for remote node to apply the mutation and send back a response,
then it will send the next mutation. This means the sender are sending
mutations one by one. To optimize, we can make the sender send more
mutations in parallel without waiting for the response. In order to
apply back pressure from remote node, a per shard mutation send limiter
is introduced so that the sender will not overwhelm the receiver.
I tried our lw_shared_ptr, the compiler complained endless usage of
incomplete type stream_session. I can not include stream_session.hh
everywhere due to circular dependency.
For now, I'm using std::shared_ptr which works fine.
In streaming code, we need core to core connection(the second connection
from B to A). That is when node A initiates a stream to node B, it is
possible that node A will transfer data to node B and vice verse, so we
need two connections. When node A creates a tcp connection (within the
messaging_service) to node B, we have a connection ip_a:core_a to
ip_b:core_b. When node B creates a connection to node B, we can not
guarantee it is ip_b:core_b to ip_a:core_a.
Current messaging_service does not support core to core connection yet,
although we use shard_id{ip, cpu_id} as the destination of the message.
We can solve the issue in upper layer. We can pass extra cpu_id as a
user msg.
Node A sends stream_init_message with my_cpu_id = current_cpu_id
Node B receives stream_init_message, it runs on whatever cpu this
connection goes to, then it sends response back with Node B's
current_cpu_id.
After this, each node knows which cpu_id to send to each other.
TODO: we need to handle the case when peer node reboots with different
number of cpus.
Each outgoing_file_message might contain multiple mutations. Send them
one mutation per RPC call (using frozen_mutation), instead of one big
outgoing_file_message per one RPC call.