scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 10:30:38 +00:00

Author	SHA1	Message	Date
Asias He	31d439213c	streaming: Send mutations on all shards Currently, only the shard where the stream_plan is created will send streaming mutations. To utilize all the available cores, we can make each shard send the mutations it is responsible for. On the receiver side, we do not forward the mutations to the shard where the stream_session is created, so that we can avoid unnecessary forwarding. Note: the downside is that it is now harder to: 1) track the number of bytes sent and received 2) update the keep alive timer upon receipt of the STREAM_MUTATION To fix, we now store the sent/received bytes info on all shards. When the keep alive timer expires, we check if any progress has been made. Hopefully, this patch will make the streaming much faster and in turn make the repair/decommission/adding a node faster. Refs: https://github.com/scylladb/scylla/issues/849 Tested with decommission/repair dtest. Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>	2016-02-07 10:57:51 +02:00
Asias He	c67538009c	streaming: Fix assert in update_progress The problem is that on the follower side, we set up _session_info too late, after received PREPARE_DONE_MESSAGE message. The initiator can send STREAM_MUTATION before sending PREPARE_DONE_MESSAGE message. To fix, we set up _session_info after we received the prepare_message on both initiator and follower. Fixes #869 scylla: streaming/session_info.cc:44: void streaming::session_info::update_progress(streaming::progress_info): Assertion `peer == new_progress.peer' failed. Message-Id: <6d945ba1e8c4fc0949c3f0a72800c9448ba27761.1454476876.git.asias@scylladb.com>	2016-02-03 10:15:45 +02:00
Asias He	cb92fe75e6	streaming: Introduce get_session helper To simplify streaming verb handler. - Use get_session instead of open coded logic to get get_coordinator and stream_session in all the verb handlers - Use throw instead of assert for error handling - init_receiving_side now returns a shared_ptr<stream_result_future>	2016-01-29 16:31:07 +08:00
Asias He	360df6089c	streaming: Remove unused stream_session::retry	2016-01-29 16:31:07 +08:00
Asias He	2f48d402e2	streaming: Remove unused commented code	2016-01-29 16:31:07 +08:00
Asias He	ed3da7b04c	streaming: Drop flush_tables option for add_transfer_ranges We do not stream sstable files. No need to flush it.	2016-01-29 16:31:07 +08:00
Asias He	aa69d5ffb2	streaming: Drop update_progress in stream_coordinator Since we have session_info inside stream_session now, we can call update_progress directly in stream_session.	2016-01-29 16:31:07 +08:00
Asias He	46bec5980b	streaming: Put session_info inside stream_session It is 1:1 mapping between session_info and stream_session. Putting session_info inside stream_session, we can get rid of the stream_coordinator::host_streaming_data class.	2016-01-29 16:31:07 +08:00
Asias He	c4bdb6f782	streaming: Wire up session progress The progress info is needed by JMX api.	2016-01-29 16:31:07 +08:00
Asias He	03aced39c4	streaming: Account number of bytes sent and received per session The API will consume it soon.	2016-01-27 18:16:58 +08:00
Avi Kivity	a53788d61d	Merge "More streaming cleanup and fix" from Asias "- Drop compression_info/stream_message - Cleanup outgoing_file_message/prepare_message - Fix stream manager API (more to come)"	2016-01-26 13:17:58 +02:00
Asias He	e8b8b454df	streaming: Flatten streaming messages class namespace There are only two messages: prepare_message and outgoing_file_message. Actually only the prepare_message is the message we send on wire. Flatten the namespace.	2016-01-26 13:04:29 +08:00
Glauber Costa	b63611e148	mark I/O operations with priority classes After this patch, our I/O operations will be tagged into a specific priority class. The available classes are 5, and were defined in the previous patch: 1) memtable flush 2) commitlog writes 3) streaming mutation 4) SSTable compaction 5) CQL query Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Asias He	eba9820b22	streaming: Remove stream_session::file_sent It is the callback after sending file_message_header. In scylla, we do not send the file_message_header. Drop it.	2016-01-25 17:25:34 +08:00
Asias He	2cc31ac977	streaming: Get rid of the stream_index It is always zero.	2016-01-25 16:58:57 +08:00
Asias He	ad4a096b80	streaming: Get rid of stream_init_message Unlike streaming in c*, scylla does not need to open tcp connections in streaming service for both incoming and outgoing messages, seastar::rpc does the work. There is no need for a standalone stream_init_message message in the streaming negotiation stage, we can merge the stream_init_message into stream_prepare_message.	2016-01-25 16:24:16 +08:00
Asias He	dc94c5e42e	streaming: Rename get_or_create_next_session to get_or_create_session There is only one session for each peer in stream_coordinator.	2016-01-25 11:38:13 +08:00
Asias He	9a346d56b9	streaming: Drop unnecessary parameters in stream_init_message - from We can get it from the rpc::client_info - session_index There will always be one session in stream_coordinator::host_streaming_data with a peer. - is_for_outgoing In cassandra, it initiates two tcp connections, one for incoming stream and one for outgoing stream. logger.debug("[Stream #{}] Sending stream init for incoming stream", session.planId()); logger.debug("[Stream #{}] Sending stream init for outgoing stream", session.planId()); In scylla, it only initiates one "connection" for sending, the peer initiates another "connection" for receiving. So, is_for_outgoing will also be true in scylla, we can drop it. - keep_ss_table_level In scylla, again, we stream mutations instead of sstable file. It is not relevant to us.	2016-01-25 11:38:13 +08:00
Asias He	1bc5cd1b22	streaming: Drop streaming/messages/session_failed_message It is not used.	2016-01-25 11:38:13 +08:00
Asias He	2a04e8d70e	streaming: Drop streaming/messages/incoming_file_message It is not used.	2016-01-25 11:38:13 +08:00
Asias He	26ba21949e	streaming: Drop streaming/messages/retry_message It is not used.	2016-01-25 11:38:13 +08:00
Asias He	4b4363b62d	streaming: Drop streaming/messages/received_message It is not used.	2016-01-25 11:38:13 +08:00
Asias He	5a0bf10a0b	streaming: Drop streaming/messages/complete_message It is not used.	2016-01-25 11:38:13 +08:00
Asias He	bdd6a69af7	streaming: Drop unused parameters - int connections_per_host Scylla does not create connections per stream_session, instead it uses rpc, thus connections_per_host is not relevant to scylla. - bool keep_ss_table_level - int repaired_at Scylla does not stream sstable files. They are not relevant to scylla.	2016-01-25 11:38:13 +08:00
Asias He	864c7f636c	streaming: Fail the session if fails to send COMPLETE_MESSAGE We will retry sending COMPLETE_MESSAGE, if it fails even with the retry, there must be something wrong. Abort the stream_session in this case.	2016-01-22 07:44:21 +08:00
Asias He	9be671e7f5	streaming: Simplify send_complete_message The send once logic is open coded. Moved it into send_complete_message(), so we can simplify the caller.	2016-01-22 07:43:39 +08:00
Asias He	88e99e89d6	streaming: Add more debug info - Add debug for the peer address info - Add debug in stream_transfer_task and stream_receive_task - Add debug when cancel the keep_alive timer - Add debug for has_active_sessions in stream_result_future::maybe_complete	2016-01-22 07:43:16 +08:00
Asias He	1c2d95f2b0	streaming: Remove unused verb handlers They are never used in scylla. Message-Id: <1453283955-23691-2-git-send-email-asias@scylladb.com>	2016-01-20 13:58:59 +02:00
Asias He	767e25a686	streaming: Remove the _handlers helper It is introduced to help to run the invoke_on_all, we can reuse the distributed<database> db for it. Message-Id: <1453283955-23691-1-git-send-email-asias@scylladb.com>	2016-01-20 13:58:44 +02:00
Pekka Enberg	2ca8606b4e	streaming/stream_session: Don't stop stream manager We cannot stop the stream manager because it's accessible via the API server during shutdown, for example, which can cause a SIGSEGV. Spotted by ASan. Message-Id: <1453130811-22540-1-git-send-email-penberg@scylladb.com>	2016-01-18 16:34:19 +01:00
Tomasz Grabiec	e1e8858ed1	service: Fetch and sync schema	2016-01-11 10:34:53 +01:00
Tomasz Grabiec	036974e19b	Make mutation interfaces support multiple versions Schema is tracked in memtable and cache per-entry. Entries are upgraded lazily on access. Incoming mutations are upgraded to table's current schema on given shard. Mutating nodes need to keep schema_ptr alive in case schema version is requested by target node.	2016-01-11 10:34:51 +01:00
Asias He	2345cda42f	messaging_service: Rename shard_id to msg_addr Use shard_id as the destination of the messaging_service is confusing, since shard_id is used in the context of cpu id. Message-Id: <8c9ef193dc000ef06f8879e6a01df65cf24635d8.1452155241.git.asias@scylladb.com>	2016-01-07 10:36:35 +02:00
Asias He	1b3d2dee8f	streaming: Drop src_cpu_id parameter Now that we can get the src_cpu_id from rpc::client_info. No need to pass it as verb parameter.	2015-12-31 11:25:09 +01:00
Asias He	22d0525bc0	streaming: Get rid of the _from_ parameter Get this from cinfo.retrieve_auxiliary inside the rpc handler.	2015-12-31 11:25:08 +01:00
Asias He	89b79d44de	streaming: Get rid of the _connecting_ parameter messaging_service will use private ip address automatically to connect a peer node if possible. There is no need for the upper level like streaming to worry about it. Drop it simplifies things a bit.	2015-12-31 11:25:08 +01:00
Avi Kivity	827a4d0010	Merge "streaming: Invalidate cache upon receiving of stream" from Asias "When a node gain or regain responsibility for certain token ranges, streaming will be performed, upon receiving of the stream data, the row cache is invalidated for that range. Refs #484."	2015-12-28 10:24:46 +02:00
Asias He	20c258f202	streaming: Fix session hang with maybe_completed: WAIT_COMPLETE -> WAIT_COMPLETE The problem is that we set the session state to WAIT_COMPLETE in send_complete_message's continuation, the peer node might send COMPLETE_MESSAGE before we run the continuation, thus we set the wrong status in COMPLETE_MESSAGE's handler and will not close the session. Before: GOT STREAM_MUTATION_DONE receive task_completed SEND COMPLETE_MESSAGE to 127.0.0.2:0 GOT COMPLETE_MESSAGE, from=127.0.0.2, connecting=127.0.0.3, dst_cpu_id=0 complete: PREPARING -> WAIT_COMPLETE GOT COMPLETE_MESSAGE Reply maybe_completed: WAIT_COMPLETE -> WAIT_COMPLETE After: GOT STREAM_MUTATION_DONE receive task_completed maybe_completed: PREPARING -> WAIT_COMPLETE SEND COMPLETE_MESSAGE to 127.0.0.2:0 GOT COMPLETE_MESSAGE, from=127.0.0.2, connecting=127.0.0.3, dst_cpu_id=0 complete: WAIT_COMPLETE -> COMPLETE Session with 127.0.0.2 is complete	2015-12-24 20:34:44 +08:00
Asias He	c971fad618	streaming: Introduce keep alive timer for each stream_session If the session is idle for 10 minutes, close the session. This can detect the following hangs: 1) if the sending node is gone, the receiving peer will wait forever 2) if the node which should send COMPLETE_MESSAGE to the peer node is gone, the peer node will wait forever Fixes simple_kill_streaming_node_while_bootstrapping_test.	2015-12-24 20:34:44 +08:00
Asias He	f527e07be6	streaming: Get stream_session in STREAM_MUTATION handler Get from address from cinfo. It is needed to figure out which stream session this mutation is belonged to, since we need to update the keep alive timer for this stream session.	2015-12-24 20:34:44 +08:00
Asias He	d7a8c655a6	streaming: Print All sessions completed after state change message close_session will print "All sessions completed" message, print the state change message before that.	2015-12-24 20:34:44 +08:00
Asias He	eaea09ee71	streaming: Retransmit COMPLETE_MESSAGE message It is oneway message at the moment. If a COMPLETE_MESSAGE is lost, no one will close the session. The first step to fix the issue is to try to retransmit the message.	2015-12-24 20:34:44 +08:00
Asias He	d1d6395978	streaming: Print old state before setting the new state	2015-12-24 20:34:44 +08:00
Asias He	2d32195c32	streaming: Invalidate cache upon receiving of stream When a node gain or regain responsibility for certain token ranges, streaming will be performed, upon receiving of the stream data, the row cache is invalidated for that range. Refs #484.	2015-12-21 14:44:13 +08:00
Asias He	b7d10b710e	streaming: Propagate fail to send PREPARE_DONE_MESSAGE exception Otherwise the stream_plan will not be marked as failed state.	2015-12-10 12:38:00 +02:00
Asias He	19a6dfcfd0	streaming: stream_session print stream_session_state properly	2015-11-10 15:39:34 +08:00
Asias He	860c7aff37	streaming: Print plan_id in logger	2015-11-10 15:39:34 +08:00
Asias He	d2e5d13e69	streaming: Set state to STREAMING only if we really have data to send	2015-11-10 15:39:34 +08:00
Asias He	fcf7486d4c	streaming: Improve state transition log for maybe_completed and complete	2015-11-10 15:39:34 +08:00
Asias He	72a7a6bd9b	streaming: session close Currently, there are multiple places we can close a session, this makes the close code path hard to follow. Remove the call to maybe_completed in follower_start_sent to simplify closing a bit. - stream_session::follower_start_sent -> maybe_completed() - stream_session::receive_task_completed -> maybe_completed() - stream_session::transfer_task_completed -> maybe_completed() - on receive of the COMPLETE_MESSAGE -> complete()	2015-11-10 15:39:34 +08:00


144 Commits