stream_result_future can now create a stream_coordinator if one is not
provided.
So:
- On the sending side, stream_coordinator is created by stream_plan
- On the receiving side, stream_coordinator is created by stream_result_future
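A minimal sketch of the "create if not provided" behaviour described above. The `_stub` types are hypothetical stand-ins, not the real streaming classes, and the real constructor signatures may differ.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for stream_coordinator.
struct stream_coordinator_stub {};

// Hypothetical stand-in for stream_result_future.
struct stream_result_future_stub {
    std::shared_ptr<stream_coordinator_stub> coordinator;

    // Sending side: stream_plan passes in its own coordinator.
    // Receiving side: nothing is passed, so one is created here.
    explicit stream_result_future_stub(
        std::shared_ptr<stream_coordinator_stub> c = nullptr)
        : coordinator(c ? std::move(c)
                        : std::make_shared<stream_coordinator_stub>()) {}
};
```

On the sending side the caller's coordinator is kept as is; on the receiving side the default argument triggers creation.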
stream_session::stream_session(inet_address peer_, inet_address connecting_,
int index_, bool keep_ss_table_level_)
: peer(peer_)
, connecting(connecting_)
, conn_handler(shared_from_this())
Calling shared_from_this() inside stream_session's constructor is
problematic: while the constructor runs, no shared_ptr owns the object
yet. I got
Exiting on unhandled exception of type 'std::bad_weak_ptr': bad_weak_ptr
exceptions when creating the session with
auto session = std::make_shared<stream_session>(peer, connecting, size, _keep_ss_table_level);
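The failure can be reproduced outside the streaming code. The sketch below uses hypothetical types (`bad_session`, `good_session`), not the real stream_session. It shows the throwing constructor and one common workaround: a two-phase init that calls shared_from_this() only after make_shared has returned.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that mirrors the problem: shared_from_this() is called
// while the object is still being constructed, before any shared_ptr owns it.
// Since C++17 this is specified to throw std::bad_weak_ptr; earlier standards
// left it undefined.
struct bad_session : std::enable_shared_from_this<bad_session> {
    std::shared_ptr<bad_session> self;
    bad_session() : self(shared_from_this()) {}
};

// Hypothetical workaround: construct first, then wire up the back-reference.
struct good_session : std::enable_shared_from_this<good_session> {
    // weak_ptr avoids a reference cycle between the session and its handler.
    std::weak_ptr<good_session> handler_owner;

    void init() { handler_owner = shared_from_this(); }

    static std::shared_ptr<good_session> create() {
        auto s = std::make_shared<good_session>();
        s->init();  // safe: a shared_ptr now owns *s
        return s;
    }
};

// Returns true if constructing bad_session throws std::bad_weak_ptr.
bool constructing_bad_session_throws() {
    try {
        auto s = std::make_shared<bad_session>();
    } catch (const std::bad_weak_ptr&) {
        return true;
    }
    return false;
}
```

Calling `good_session::create()` yields a session whose `handler_owner` locks back to the same object, whereas `std::make_shared<bad_session>()` throws.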
Also, the logic in connection_handler is not very useful for us. The
sending and receiving of messages are handled using messaging_service.
There is no need to add another layer.
I tried our lw_shared_ptr, but the compiler complained endlessly about
uses of the incomplete type stream_session. I cannot include
stream_session.hh everywhere because of circular dependencies.
For now, I'm using std::shared_ptr, which works fine.
In streaming code, we need core-to-core connections (including the second
connection, from B to A). When node A initiates a stream to node B, it is
possible that node A will transfer data to node B and vice versa, so we
need two connections. When node A creates a TCP connection (within the
messaging_service) to node B, we have a connection from ip_a:core_a to
ip_b:core_b. When node B creates a connection to node A, we cannot
guarantee that it is from ip_b:core_b to ip_a:core_a.
The current messaging_service does not support core-to-core connections
yet, although we use shard_id{ip, cpu_id} as the destination of the
message. We can solve the issue in the upper layer by passing an extra
cpu_id as a user msg.
- Node A sends stream_init_message with my_cpu_id = current_cpu_id.
- Node B receives stream_init_message on whatever cpu the connection
  lands on, then sends a response back with Node B's current_cpu_id.
After this, each node knows which cpu_id to use when sending to the other.
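The cpu_id exchange above can be sketched as follows. All types and field names here (`node`, `stream_init_message_stub`, `src_cpu_id`, `peer_cpu`) are hypothetical and only illustrate the handshake; the real messages travel through messaging_service.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical message carrying the initiator's cpu id.
struct stream_init_message_stub {
    std::string from_ip;
    uint32_t src_cpu_id;
};

// Hypothetical response carrying the follower's cpu id.
struct stream_init_response_stub {
    uint32_t dst_cpu_id;
};

struct node {
    std::string ip;
    uint32_t current_cpu_id;
    std::map<std::string, uint32_t> peer_cpu;  // peer ip -> cpu to send to

    // Initiator: advertise the cpu this stream runs on.
    stream_init_message_stub make_init() const {
        return {ip, current_cpu_id};
    }

    // Follower: remember the initiator's cpu, reply with our own.
    stream_init_response_stub handle_init(const stream_init_message_stub& m) {
        peer_cpu[m.from_ip] = m.src_cpu_id;
        return {current_cpu_id};
    }

    // Initiator: remember the follower's cpu from the response.
    void handle_response(const std::string& peer_ip,
                         const stream_init_response_stub& r) {
        peer_cpu[peer_ip] = r.dst_cpu_id;
    }
};
```

After one round trip, both `peer_cpu` maps hold the cpu id to target on the other node. The sketch does not cover the peer rebooting with a different cpu count.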
TODO: we need to handle the case where the peer node reboots with a
different number of cpus.
This is a bit different from Origin. We always send back a
prepare_message, even if the initiator requested no data from the
follower, to unify the handling.
Each outgoing_file_message might contain multiple mutations. Send them
one mutation per RPC call (using frozen_mutation), instead of one big
outgoing_file_message per RPC call.
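A rough sketch of the per-mutation sending loop, under the assumption that the RPC can be modelled as a callback. `frozen_mutation_stub`, `outgoing_file_message_stub`, and `send_rpc` are hypothetical names, not the real types or the messaging_service API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a serialized (frozen) mutation.
struct frozen_mutation_stub {
    std::string bytes;
};

// Hypothetical stand-in for outgoing_file_message.
struct outgoing_file_message_stub {
    std::vector<frozen_mutation_stub> mutations;
};

// Hypothetical RPC hook: one invocation models one RPC call.
using send_rpc = std::function<void(const frozen_mutation_stub&)>;

// Sends each mutation in its own RPC call and returns the number of calls.
std::size_t send_file_message(const outgoing_file_message_stub& msg,
                              const send_rpc& send) {
    std::size_t calls = 0;
    for (const auto& m : msg.mutations) {
        send(m);
        ++calls;
    }
    return calls;
}
```

A message holding three mutations therefore results in three RPC calls, each carrying a single mutation.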