Currently, only the shard where the stream_plan is created on will send
streaing mutations. To utilize all the available cores, we can make each
shard send mutations which it is responsbile for. On the receiver side,
we do not forward the mutations to the shard where the stream_session is
created, so that we can avoid unnecessary forwarding.
Note: the downside is that it is now harder to:
1) to track number of bytes sent and received
2) to update the keep alive timer upon receive of the STREAM_MUTATION
To fix, we now store the sent/recieved bytes info on all shards. When
the keep alive timer expires, we check if any progress has been made.
Hopefully, this patch will make the streaming much faster and in turn
make the repair/decommission/adding a node faster.
Refs: https://github.com/scylladb/scylla/issues/849
Tested with decommission/repair dtest.
Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>
The problem is that on the follower side, we set up _session_info too
late, after received PREPARE_DONE_MESSAGE message. The initiator can
send STREAM_MUTATION before sending PREPARE_DONE_MESSAGE message.
To fix, we set up _session_info after we received the prepare_message on
both initiator and follower.
Fixes#869
scylla: streaming/session_info.cc:44: void
streaming::session_info::update_progress(streaming::progress_info):
Assertion `peer == new_progress.peer' failed.
Message-Id: <6d945ba1e8c4fc0949c3f0a72800c9448ba27761.1454476876.git.asias@scylladb.com>
To simplify streaming verb handler.
- Use get_session instead of open coded logic to get get_coordinator and
stream_session in all the verb handlers
- Use throw instead of assert for error handling
- init_receiving_side now returns a shared_ptr<stream_result_future>
It is 1:1 mapping between session_info and stream_session. Putting
session_info inside stream_session, we can get rid of the
stream_coordinator::host_streaming_data class.
There are only two messages: prepare_message and outgoing_file_message.
Actually only the prepare_message is the message we send on wire.
Flatten the namespace.
After this patch, our I/O operations will be tagged into a specific priority class.
The available classes are 5, and were defined in the previous patch:
1) memtable flush
2) commitlog writes
3) streaming mutation
4) SSTable compaction
5) CQL query
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Unlike streaming in c*, scylla does not need to open tcp connections in
streaming service for both incoming and outgoing messages, seastar::rpc
does the work. There is no need for a standalone stream_init_message
message in the streaming negotiation stage, we can merge the
stream_init_message into stream_prepare_message.
- from
We can get it form the rpc::client_info
- session_index
There will always be one session in stream_coordinator::host_streaming_data with a peer.
- is_for_outgoing
In cassandra, it initiates two tcp connections, one for incoming stream and one for outgoing stream.
logger.debug("[Stream #{}] Sending stream init for incoming stream", session.planId());
logger.debug("[Stream #{}] Sending stream init for outgoing stream", session.planId());
In scylla, it only initiates one "connection" for sending, the peer initiates another "connection" for receiving.
So, is_for_outgoing will also be true in scylla, we can drop it.
- keep_ss_table_level
In scylla, again, we stream mutations instead of sstable file. It is
not relevant to us.
- int connections_per_host
Scylla does not create connections per stream_session, instead it uses
rpc, thus connections_per_host is not relevant to scylla.
- bool keep_ss_table_level
- int repaired_at
Scylla does not stream sstable files. They are not relevant to scylla.
- Add debug for the peer address info
- Add debug in stream_transfer_task and stream_receive_task
- Add debug when cancel the keep_alive timer
- Add debug for has_active_sessions in stream_result_future::maybe_complete
We cannot stop the stream manager because it's accessible via the API
server during shutdown, for example, which can cause a SIGSEGV.
Spotted by ASan.
Message-Id: <1453130811-22540-1-git-send-email-penberg@scylladb.com>
Schema is tracked in memtable and cache per-entry. Entries are
upgraded lazily on access. Incoming mutations are upgraded to table's
current schema on given shard.
Mutating nodes need to keep schema_ptr alive in case schema version is
requested by target node.
messaging_service will use private ip address automatically to connect a
peer node if possible. There is no need for the upper level like
streaming to worry about it. Drop it simplifies things a bit.
"When a node gain or regain responsibility for certain token ranges, streaming
will be performed, upon receiving of the stream data, the row cache
is invalidated for that range.
Refs #484."
The problem is that we set the session state to WAIT_COMPLETE in
send_complete_message's continuation, the peer node might send
COMPLETE_MESSAGE before we run the continuation, thus we set the wrong
status in COMPLETE_MESSAGE's handler and will not close the session.
Before:
GOT STREAM_MUTATION_DONE
receive task_completed
SEND COMPLETE_MESSAGE to 127.0.0.2:0
GOT COMPLETE_MESSAGE, from=127.0.0.2, connecting=127.0.0.3, dst_cpu_id=0
complete: PREPARING -> WAIT_COMPLETE
GOT COMPLETE_MESSAGE Reply
maybe_completed: WAIT_COMPLETE -> WAIT_COMPLETE
After:
GOT STREAM_MUTATION_DONE
receive task_completed
maybe_completed: PREPARING -> WAIT_COMPLETE
SEND COMPLETE_MESSAGE to 127.0.0.2:0
GOT COMPLETE_MESSAGE, from=127.0.0.2, connecting=127.0.0.3, dst_cpu_id=0
complete: WAIT_COMPLETE -> COMPLETE
Session with 127.0.0.2 is complete
If the session is idle for 10 minutes, close the session. This can
detect the following hangs:
1) if the sending node is gone, the receiving peer will wait forever
2) if the node which should send COMPLETE_MESSAGE to the peer node is
gone, the peer node will wait forever
Fixes simple_kill_streaming_node_while_bootstrapping_test.
Get from address from cinfo. It is needed to figure out which stream
session this mutation is belonged to, since we need to update the keep
alive timer for this stream session.
It is oneway message at the moment. If a COMPLETE_MESSAGE is lost, no
one will close the session. The first step to fix the issue is to try to
retransmit the message.
When a node gain or regain responsibility for certain token ranges,
streaming will be performed, upon receiving of the stream data, the
row cache is invalidated for that range.
Refs #484.
Currently, there are multiple places we can close a session, this makes
the close code path hard to follow. Remove the call to maybe_completed
in follower_start_sent to simplify closing a bit.
- stream_session::follower_start_sent -> maybe_completed()
- stream_session::receive_task_completed -> maybe_completed()
- stream_session::transfer_task_completed -> maybe_completed()
- on receive of the COMPLETE_MESSAGE -> complete()