scylladb

Author	SHA1	Message	Date
Asias He	fad34801bf	streaming: Introduce streaming::abort() It will be used soon by stream_plan::abort() to abort a stream session.	2017-08-30 15:19:50 +08:00
Asias He	7fba7cca01	streaming: Make stream_manager and coordinator message debug level When we abort a session, it is possible that: node 1 aborts the session by user request; node 1 sends the complete_message to node 2; node 2 aborts the session upon receipt of the complete_message; node 1 sends one more stream message to node 2, and the stream_manager for the session cannot be found. It is fine for node 2 not to be able to find the stream_manager, so make the log on node 2 less verbose to confuse the user less.	2017-08-30 15:19:50 +08:00
Asias He	be573bcafb	streaming: Check if _stream_result is valid If on_error() was called before init() was executed, the _stream_result can be invalid.	2017-08-30 15:19:44 +08:00
Asias He	8a3f6acdd2	streaming: Log peer address in on_error	2017-08-30 15:18:27 +08:00
Asias He	eace5fc6e8	streaming: Introduce received_failed_complete_message It is the handler for the failed complete message. Add a flag to remember whether we received such a message from the peer; if so, do not send the failed complete message back to the peer when running close_session with failed status.	2017-08-30 15:18:27 +08:00
Duarte Nunes	85e85ec72e	Don't catch polymorphic exceptions by value It makes gcc a very sad compiler. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170726172053.5639-2-duarte@scylladb.com>	2017-07-27 09:39:58 +03:00
Asias He	aa87429e67	streaming: Send complete message with failed flag when session is failed To notify peer node the session is failed.	2017-07-19 10:11:05 +08:00
Asias He	03b838705c	streaming: Handle failed flag in complete message Fail the current session if the failed flag is on in the complete message handler.	2017-07-19 10:11:05 +08:00
Asias He	12d18cfab4	streaming: Do not fail the session when failed to send complete message Since the complete message is not mandatory, there is no point in failing the session if sending the complete message fails.	2017-07-19 10:11:04 +08:00
Asias He	ca5248cd58	streaming: Introduce send_failed_complete_message Currently, send_complete_message is not used. We will use it shortly in case the local session is failed. Send a complete message with failed flag to notify peer node that the session is failed so that peer can close the session. This can speed up the closing of failed session. Also rename it to send_failed_complete_message.	2017-07-19 10:11:04 +08:00
Asias He	f21cb75cdb	streaming: Do not send complete message when session is successful The complete_message is not needed and the handler of this rpc message does nothing but return a ready future. The patch to remove it did not make it into the Scylla 1.0 release so it was left there.	2017-07-18 15:29:42 +08:00
Asias He	0ba4e73068	streaming: Introduce the failed parameter for complete message Use this flag to notify the peer that the session is failed so that the peer can close the failed session more quickly. The flag is used as an rpc::optional so it is compatible with the old version of the verb.	2017-07-18 11:24:31 +08:00
Asias He	7599c1524d	streaming: Remove unused session_failed function It is never used. Get rid of it.	2017-07-18 11:22:09 +08:00
Asias He	caad7ced23	streaming: Less verbose in logging Now, we will have large number of small streaming. Make the not very important logging message debug level.	2017-07-18 11:17:09 +08:00
Avi Kivity	ebaeefa02b	Merge seastar upstream (seastar namespace) - introduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Asias He	937f28d2f1	Convert to use dht::partition_range_vector and dht::token_range_vector	2016-12-19 14:08:50 +08:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00
Asias He	d1178fa299	Convert to use dht::token_range	2016-12-19 08:04:29 +08:00
Tomasz Grabiec	c1a7e2090e	Revert "database: change find_column_families signature so it returns a lw_shared_ptr" This reverts commit `f3528ede65`.	2016-11-04 10:48:21 +01:00
Glauber Costa	f3528ede65	database: change find_column_families signature so it returns a lw_shared_ptr There are places in which we need to use the column family object many times, with deferring points in between. Because the column family may have been destroyed in the deferring point, we need to go and find it again. If we use lw_shared_ptr, however, we'll be able to at least guarantee that the object will be alive. Some users will still need to check, if they want to guarantee that the column family wasn't removed. But others that only need to make sure we don't access an invalid object will be able to avoid the cost of re-finding it just fine. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <722bf49e158da77ff509372c2034e5707706e5bf.1478111467.git.glauber@scylladb.com>	2016-11-03 13:27:31 +01:00
Avi Kivity	a35136533d	Convert ring_position and token ranges to be nonwrapping Wrapping ranges are a pain, so we are moving wrap handling to the edges. Since cql can't generate wrapping ranges, this means thrift and the ring maintenance code; also range->ring transformations need to merge the first and last ranges. Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>	2016-11-02 21:04:11 +02:00
Asias He	a0020fdad2	stream_session: Allow adding ranges to a cf more than once Append the ranges to a stream_transfer_task if the cf is already added to _transfers in add_transfer_ranges.	2016-09-26 06:28:50 +08:00
Duarte Nunes	aaa76d58ba	query: Move to_partition_range to dht namespace This patch moves to_partition_range, from the query namespace to the dht namespace, where it is a more natural fit. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1468498060-19251-1-git-send-email-duarte@scylladb.com>	2016-07-15 10:41:52 +02:00
Paweł Dziepak	3fe1aec29d	streaming: avoid word "ERROR" in non-error messages Some tools (e.g. ccm) get confused and consider messages containing the word "ERROR" as error level messages irrespective of their actual severity level. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1468399752-5228-1-git-send-email-pdziepak@scylladb.com>	2016-07-13 12:06:33 +03:00
Paweł Dziepak	32a5de7a1f	db: handle receiving fragmented mutations If mutations are fragmented during streaming, special care must be taken so that isolation guarantees are not broken. Mutations received with flag "fragmented" set are applied to a memtable that is used only by that particular streaming task and the sstables created by flushing such memtables are not made visible until the task is complete. Also, in case the streaming fails all data is dropped. This means that fragmented mutations cannot benefit from coalescing of writes from multiple streaming plans, hence separate way of handling them so that there is no loss of performance for small partitions. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	f2ae31711e	streaming: inform CF when streaming fails Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	4031c0ed8f	streaming: pass plan_id to column family for apply and flush plan_id is needed to keep track of the origin of mutations so that if they are fragmented all fragments are made visible at the same time, when that particular streaming plan_id completes. Basically, each streaming plan that sends big (fragmented) mutations is going to have its own memtables and a list of sstables which will get flushed and made visible when that plan completes (or dropped if it fails). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Asias He	94c9211b0e	streaming: Switch log level to warn instead of error dtest takes error level log as serious error. It is not a serious error for streaming to fail to send a verb and fail a streaming session, for example, the peer node is gone or stopped. Switch to use log level warn instead of level error. Fixes repair_additional_test.py:RepairAdditionalTest.repair_kill_3_test Fixes: #1335 Message-Id: <0149d30044e6e4d80732f1a20cd20593de489fc8.1465979288.git.asias@scylladb.com>	2016-06-15 13:01:22 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Asias He	62d443a07d	streaming: Fix log of plan_id and session address in stream_session They got swapped. Fix it up. Spotted by looking at the log. Message-Id: <d163d71e9a96d1a45c3a4c529519790eeff7c486.1459172778.git.asias@scylladb.com>	2016-03-29 09:01:06 +03:00
Asias He	6fd6e57e80	streaming: Harden keep alive timer - Do nothing in case the session is closed, to prevent firing up the timer again - Print log info when no progress has been made if the time expires, it is very useful to debug an idle session - Grab a reference when the keep alive timer is running Message-Id: <9f2cc3164696905a6a39c0d072a980765d598dfd.1458782956.git.asias@scylladb.com>	2016-03-24 11:58:54 +02:00
Asias He	fe263e5436	Revert "Revert "streaming: Start to send mutations after PREPARE_DONE_MESSAGE"" This reverts commit `1f29a698d5`.	2016-03-24 08:43:17 +08:00
Asias He	a6dd6e6d55	Revert "Revert "streaming: Simplify session completion logic"" This reverts commit `354fca9d56`.	2016-03-24 07:48:27 +08:00
Asias He	c2eff7e824	streaming: Complete receive task after the flush A STREAM_MUTATION_DONE message will signal the receiver that the sender has completed the sending of stream mutations. When the receiver finds it has zero tasks to send and zero tasks to receive, it will finish the stream_session, and in turn finish the stream_plan if all the stream_sessions are finished. We should call receive_task_completed only after the flush finishes so that when stream_plan is finished all the data is on disk. Fixes repair_disjoint_data_test issue with Glauber's "[PATCH v4 0/9] Make sure repairs do not cripple incoming load" series ====================================================================== FAIL: repair_disjoint_data_test (repair_additional_test.RepairAdditionalTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "scylla-dtest/repair_additional_test.py", line 102, in repair_disjoint_data_test self.check_rows_on_node(node1, 3000) File "scylla-dtest/repair_additional_test.py", line 33, in check_rows_on_node self.assertEqual(len(result), rows, len(result)) AssertionError: 2461	2016-03-23 09:40:49 -04:00
Glauber Costa	5fa866223d	streaming: add incoming streaming mutations to a different sstable Keeping the mutations coming from the streaming process as mutations like any other have a number of advantages - and that's why we do it. However, this makes it impossible for Seastar's I/O scheduler to differentiate between incoming requests from clients, and those who are arriving from peers in the streaming process. As a result, if the streaming mutations consume a significant fraction of the total mutations, and we happen to be using the disk at its limits, we are in no position to provide any guarantees - defeating the whole purpose of the scheduler. To implement that, we'll keep a separate set of memtables that will contain only streaming mutations. We don't have to do it this way, but doing so makes life a lot easier. In particular, to write an SSTable, our API requires (because the filter requires), that a good estimate on the number of partitions is informed in advance. The partitions also need to be sorted. We could write mutations directly to disk, but the above conditions couldn't be met without significant effort. In particular, because mutations can be arriving from multiple peer nodes, we can't really sort them without keeping a staging area anyway. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-23 09:13:00 -04:00
Pekka Enberg	354fca9d56	Revert "streaming: Simplify session completion logic" This reverts commit `208b7fa7ba`. It breaks Glauber's upcoming repair series.	2016-03-22 20:37:50 +02:00
Pekka Enberg	1f29a698d5	Revert "streaming: Start to send mutations after PREPARE_DONE_MESSAGE" This reverts commit `4c06221766`. It breaks Glauber's upcoming repair series.	2016-03-22 20:37:22 +02:00
Asias He	4c06221766	streaming: Start to send mutations after PREPARE_DONE_MESSAGE Below are 3 possible cases in a stream session, after commit `208b7fa7ba` (streaming: Simplify session completion logic) We might close the session before the exchange of the PREPARE_DONE_MESSAGE message in case 1). To fix, we defer the sending of mutations after PREPARE_DONE_MESSAGE is sent at the initiator node. 1) Initiator Follower tx rx tx rx 1 0 0 1 send prepare send back prepare recv prepare send mutations (close the session before prepare_done msg is sent) recv mutations (close session before prepare_done msg is received) send prepare_done recv prepare_done and send no mutations 2) Initiator Follower tx rx tx rx 0 1 1 0 send prepare send back prepare recv prepare nothing to send send prepare_done recv prepare_done and send mutations (close session) recv mutations (close session) 3) Initiator Follower tx rx tx rx 1 1 1 1 send prepare send back prepare recv prepare send mutations recv mutations, can not close session since we have mutations to send send prepare_done recv prepare_done and send mutations (close session) recv mutations (close session) Message-Id: <d6510b558565db23202164fa491b883ef3796e58.1458634037.git.asias@scylladb.com>	2016-03-22 15:05:57 +02:00
Asias He	208b7fa7ba	streaming: Simplify session completion logic Both the initiator and follower of a stream session know how many transfer tasks and receive tasks the stream session contains in the preparation phase. They use the _transfers and _receivers map to track the tasks, like below: std::map<UUID, stream_transfer_task> _transfers; std::map<UUID, stream_receive_task> _receivers; A stream_transfer_task will send the STREAM_MUTATION verb to transfer data with frozen_mutation; when all the STREAM_MUTATIONs are sent, it will send STREAM_MUTATION_DONE to tell the peer the stream_transfer_task is completed and remove the stream_transfer_task from the _transfers map. The peer will remove the corresponding stream_receive_task in _receivers. We do not really need the COMPLETE_MESSAGE verb to notify the peer we have completed sending. It makes the session completion logic much simpler and cleaner if we do not depend on the COMPLETE_MESSAGE verb. However, to be compatible with older versions, we always send a COMPLETE_MESSAGE message, do nothing in the COMPLETE_MESSAGE handler, and reply with a ready future even if the stream_session is closed already. This way, a node with an older version will get a COMPLETE_MESSAGE message and manage to send a COMPLETE_MESSAGE message to the new node as before. Message-Id: <1458540564-34277-2-git-send-email-asias@scylladb.com>	2016-03-21 16:58:03 +02:00
Glauber Costa	e52b869b25	fix small typo will sent -> will send Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20eaf0cea6fe14b03332547b7c4a3b85e9b619e7.1458325926.git.glauber@scylladb.com>	2016-03-18 20:34:22 +02:00
Glauber Costa	a3ebf640c6	stream_session: print debug message for STREAM_MUTATION For this verb(), we don't call get_session - and it doesn't look like we will. We currently have no debug message for this one, which makes it harder to debug the stream of messages. Print it. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-16 22:09:46 -04:00
Glauber Costa	0ab4275893	stream_session: remove duplicated debug message Whenever we call get_session, that will print a debug message about the arrival of this new verb. Because we also print that explicitly in PREPARE_DONE, that message gets duplicated. That confuses poor developers who are, for a while, left wondering why is it that the sender is sending the message twice. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-16 22:04:25 -04:00
Asias He	2d50c71ca3	streaming: Handle cf is deleted after the deletion check The cf can be deleted after the cf deletion check. Handle this case as well. Use "warn" level to log if cf is missing. Although we can handle the case, it is good to distinguish whether the receiver of streaming applied all the stream mutations or not. We believe that the cf is missing because it was dropped, but it could be missing because of a bug or something we didn't anticipate here. Related patch: "streaming: Handle cf is deleted when sending STREAM_MUTATION_DONE" Fixes simple_add_new_node_while_schema_changes_test failure. Message-Id: <c4497e0500f50e0a3422efb37e73130765c88c57.1458090598.git.asias@scylladb.com>	2016-03-16 09:46:41 +01:00
Asias He	7c4c99d7c7	streaming: Fix a log level in get_column_family_stores It is supposed to be debug level instead of info level.	2016-03-10 10:56:48 +08:00
Asias He	d9ead889f3	streaming: Handle cf is deleted when sending STREAM_MUTATION_DONE In the preparation phase of streaming, we check that remote node has all the cf_id which are needed for the entire streaming process, including the cf_id which local node will send to remote node and vice versa. So, at later time, if the cf_id is missing, it must be that the cf_id is deleted. It is fine to ignore no_such_column_family exception. In this patch, we change the code to ignore at server side to avoid sending the exception back, to avoid handling the exception in an IDL compatible way. One thing we can improve is that the sender might know the cf is deleted later than the receiver does. In this case, the sender will send some more mutations if we send back the no_such_column_family back to the sender. However, since we do not throw exceptions in the receiver stream mutation handler, it will not cause a lot of overhead, the receiver will just ignore the mutation received. Fixes #979	2016-03-09 16:50:38 +08:00
Asias He	dca9e594cc	streaming: Remove the unused test code It is introduced in the early development of streaming. We have dtest for streaming now, drop it. Message-Id: <1457499303-21163-1-git-send-email-asias@scylladb.com>	2016-03-09 10:31:42 +02:00
Asias He	1f3928c321	streaming: Hook streaming with gossip callback If the peer node of a stream_session is restarted or removed we should abort the streaming. It is better to hook gossip callback in the stream manager than in each stream_session.	2016-03-09 07:35:20 +08:00
Asias He	50bf65db8d	streaming: Fix keep alive timer progress checking The first time the keep alive timer fires, _last_stream_bytes will be zero since it is the first time we update it. The keep alive timer will be rearmed and fired again. The second time, we find there is no progress, we close the session. The total idle time will be 2 * keep alive timer. To make the idle time to close the session more precise, we reduce the interval to check the progress and close the session by checking the last time progress was made. Message-Id: <c959cffce0cc738a3d73caaf71d2adb709d46863.1456831616.git.asias@scylladb.com>	2016-03-01 16:46:08 +02:00
Asias He	fd5f3cff47	streaming: Fix stream_manager progress api For each stream_session, we pretend we are sending/receiving one file, to make it compatible with nodetool. For receiving_files, the file name is "rxnofile". For sending_files, the file name is "txnofile". stream_manager::update_all_progress_info is introduced to update the progress info of all the stream_sessions in the node. We need this because streaming mutations are received on all the cores, but the stream_session object is only on one of the cores. It adds overhead if we update progress info in stream_session object whenever we receive a streaming mutation. So, what we do now is when we really need the progress info, we update the progress info in stream_session object. With http://127.0.0.$i:10000/stream_manager/, it looks like below when decommission node 3 in a 3 nodes cluster. =========== GET NODE 1 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": [{"receiving_files": [{"value": {"direction": "IN", "file_name": "rxnofile", "session_index": 0, "total_bytes": 16876296, "peer": "127.0.0.3", "current_bytes": 16876296}, "key": "rxnofile"}], "receiving_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.3", "peer": "127.0.0.3"}]}] =========== GET NODE 2 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": [{"receiving_files": [{"value": {"direction": "IN", "file_name": "rxnofile", "session_index": 0, "total_bytes": 16755552, "peer": "127.0.0.3", "current_bytes": 16755552}, "key": "rxnofile"}], "receiving_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.3", "peer": "127.0.0.3"}]}] =========== GET NODE 3 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": [{"sending_files": [{"value": {"direction": "OUT", "file_name": "txnofile", "session_index": 0, "total_bytes": 16876296, "peer": "127.0.0.1", "current_bytes": 16876296}, "key": "txnofile"}], "sending_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.1", "peer": "127.0.0.1"},{"sending_files": [{"value": {"direction": "OUT", "file_name": "txnofile", "session_index": 0, "total_bytes": 16755552, "peer": "127.0.0.2", "current_bytes": 16755552}, "key": "txnofile"}], "sending_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.2", "peer": "127.0.0.2"}]}]	2016-02-26 17:38:37 +08:00
Asias He	37f52d632f	streaming: Remove unused progress() function	2016-02-26 17:38:37 +08:00

196 Commits