scylladb

Author	SHA1	Message	Date
Asias He	937f28d2f1	Convert to use dht::partition_range_vector and dht::token_range_vector	2016-12-19 14:08:50 +08:00
Asias He	d1178fa299	Convert to use dht::token_range	2016-12-19 08:04:29 +08:00
Tomasz Grabiec	c1a7e2090e	Revert "database: change find_column_families signature so it returns a lw_shared_ptr" This reverts commit `f3528ede65`.	2016-11-04 10:48:21 +01:00
Glauber Costa	f3528ede65	database: change find_column_families signature so it returns a lw_shared_ptr There are places in which we need to use the column family object many times, with deferring points in between. Because the column family may have been destroyed in the deferring point, we need to go and find it again. If we use lw_shared_ptr, however, we'll be able to at least guarantee that the object will be alive. Some users will still need to check, if they want to guarantee that the column family wasn't removed. But others that only need to make sure we don't access an invalid object will be able to avoid the cost of re-finding it just fine. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <722bf49e158da77ff509372c2034e5707706e5bf.1478111467.git.glauber@scylladb.com>	2016-11-03 13:27:31 +01:00
Avi Kivity	a35136533d	Convert ring_position and token ranges to be nonwrapping Wrapping ranges are a pain, so we are moving wrap handling to the edges. Since cql can't generate wrapping ranges, this means thrift and the ring maintenance code; also range->ring transformations need to merge the first and last ranges. Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>	2016-11-02 21:04:11 +02:00
Avi Kivity	c94fb1bf12	build: reduce inclusions of messaging_service.hh Remove inclusions from header files (primary offender is fb_utilities.hh) and introduce new messaging_service_fwd.hh to reduce rebuilds when the messaging service changes. Message-Id: <1475584615-22836-1-git-send-email-avi@scylladb.com>	2016-10-05 11:46:49 +03:00
Paweł Dziepak	f2ae31711e	streaming: inform CF when streaming fails Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Asias He	dca9e594cc	streaming: Remove the unused test code It is introduced in the early development of streaming. We have dtest for streaming now, drop it. Message-Id: <1457499303-21163-1-git-send-email-asias@scylladb.com>	2016-03-09 10:31:42 +02:00
Asias He	1f3928c321	streaming: Hook streaming with gossip callback If the peer node of a stream_session is restarted or removed we should abort the streaming. It is better to hook the gossip callback in the stream manager than in each stream_session.	2016-03-09 07:35:20 +08:00
Asias He	50bf65db8d	streaming: Fix keep alive timer progress checking The first time the keep alive timer fires, _last_stream_bytes will be zero, since this is the first time we update it. The keep alive timer will be rearmed and fire again. The second time, if we find there is no progress, we close the session. The total idle time will be 2 * keep alive timer. To make the idle time before closing the session more precise, we reduce the interval at which we check the progress, and close the session by checking the last time progress was made. Message-Id: <c959cffce0cc738a3d73caaf71d2adb709d46863.1456831616.git.asias@scylladb.com>	2016-03-01 16:46:08 +02:00
Asias He	fd5f3cff47	streaming: Fix stream_manager progress api For each stream_session, we pretend we are sending/receiving one file, to make it compatible with nodetool. For receiving_files, the file name is "rxnofile". For sending_files, the file name is "txnofile". stream_manager::update_all_progress_info is introduced to update the progress info of all the stream_sessions in the node. We need this because streaming mutations are received on all the cores, but the stream_session object is only on one of the cores. It adds overhead if we update progress info in stream_session object whenever we receive a streaming mutation. So, what we do now is when we really need the progress info, we update the progress info in stream_session object. With http://127.0.0.$i:10000/stream_manager/, it looks like below when decommission node 3 in a 3 nodes cluster. =========== GET NODE 1 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": [{"receiving_files": [{"value": {"direction": "IN", "file_name": "rxnofile", "session_index": 0, "total_bytes": 16876296, "peer": "127.0.0.3", "current_bytes": 16876296}, "key": "rxnofile"}], "receiving_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.3", "peer": "127.0.0.3"}]}] =========== GET NODE 2 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": [{"receiving_files": [{"value": {"direction": "IN", "file_name": "rxnofile", "session_index": 0, "total_bytes": 16755552, "peer": "127.0.0.3", "current_bytes": 16755552}, "key": "rxnofile"}], "receiving_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.3", "peer": "127.0.0.3"}]}] =========== GET NODE 3 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": 
[{"sending_files": [{"value": {"direction": "OUT", "file_name": "txnofile", "session_index": 0, "total_bytes": 16876296, "peer": "127.0.0.1", "current_bytes": 16876296}, "key": "txnofile"}], "sending_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.1", "peer": "127.0.0.1"},{"sending_files": [{"value": {"direction": "OUT", "file_name": "txnofile", "session_index": 0, "total_bytes": 16755552, "peer": "127.0.0.2", "current_bytes": 16755552}, "key": "txnofile"}], "sending_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.2", "peer": "127.0.0.2"}]}]	2016-02-26 17:38:37 +08:00
Asias He	37f52d632f	streaming: Remove unused progress() function	2016-02-26 17:38:37 +08:00
Asias He	d146045bc5	Revert "Revert "streaming: Send mutations on all shards"" This brings back streaming on all shards. The bug in locator/abstract_replication_strategy is now fixed. This reverts commit `9f3061ade8`. Message-Id: <a79ce9cdd6f4af1c6088b89e1911b4b2ed1c10ae.1455589460.git.asias@scylladb.com>	2016-02-16 11:16:51 +02:00
Avi Kivity	9f3061ade8	Revert "streaming: Send mutations on all shards" This reverts commit `31d439213c`. Fixes #894. Conflicts: streaming/stream_manager.cc (may have undone part of `63a5aa6122`)	2016-02-09 18:26:14 +02:00
Asias He	31d439213c	streaming: Send mutations on all shards Currently, only the shard where the stream_plan is created will send streaming mutations. To utilize all the available cores, we can make each shard send the mutations it is responsible for. On the receiver side, we do not forward the mutations to the shard where the stream_session is created, so that we can avoid unnecessary forwarding. Note: the downside is that it is now harder to: 1) track the number of bytes sent and received 2) update the keep alive timer upon receipt of the STREAM_MUTATION To fix, we now store the sent/received bytes info on all shards. When the keep alive timer expires, we check if any progress has been made. Hopefully, this patch will make the streaming much faster and in turn make the repair/decommission/adding a node faster. Refs: https://github.com/scylladb/scylla/issues/849 Tested with decommission/repair dtest. Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>	2016-02-07 10:57:51 +02:00
Asias He	360df6089c	streaming: Remove unused stream_session::retry	2016-01-29 16:31:07 +08:00
Asias He	2f48d402e2	streaming: Remove unused commented code	2016-01-29 16:31:07 +08:00
Asias He	ed3da7b04c	streaming: Drop flush_tables option for add_transfer_ranges We do not stream sstable files. No need to flush it.	2016-01-29 16:31:07 +08:00
Asias He	46bec5980b	streaming: Put session_info inside stream_session It is 1:1 mapping between session_info and stream_session. Putting session_info inside stream_session, we can get rid of the stream_coordinator::host_streaming_data class.	2016-01-29 16:31:07 +08:00
Asias He	c4bdb6f782	streaming: Wire up session progress The progress info is needed by JMX api.	2016-01-29 16:31:07 +08:00
Asias He	03aced39c4	streaming: Account number of bytes sent and received per session The API will consume it soon.	2016-01-27 18:16:58 +08:00
Asias He	e8b8b454df	streaming: Flatten streaming messages class namespace There are only two messages: prepare_message and outgoing_file_message. Actually only the prepare_message is the message we send on wire. Flatten the namespace.	2016-01-26 13:04:29 +08:00
Asias He	eba9820b22	streaming: Remove stream_session::file_sent It is the callback after sending file_message_header. In scylla, we do not send the file_message_header. Drop it.	2016-01-25 17:25:34 +08:00
Asias He	fa4e94aa27	streaming: Get rid of keep_ss_table_level We stream mutation instead of files, so keep_ss_table_level is not relevant for us.	2016-01-25 16:58:57 +08:00
Asias He	2cc31ac977	streaming: Get rid of the stream_index It is always zero.	2016-01-25 16:58:57 +08:00
Asias He	2a04e8d70e	streaming: Drop streaming/messages/incoming_file_message It is not used.	2016-01-25 11:38:13 +08:00
Asias He	bdd6a69af7	streaming: Drop unused parameters - int connections_per_host Scylla does not create connections per stream_session, instead it uses rpc, thus connections_per_host is not relevant to scylla. - bool keep_ss_table_level - int repaired_at Scylla does not stream sstable files. They are not relevant to scylla.	2016-01-25 11:38:13 +08:00
Asias He	767e25a686	streaming: Remove the _handlers helper It is introduced to help to run the invoke_on_all, we can reuse the distributed<database> db for it. Message-Id: <1453283955-23691-1-git-send-email-asias@scylladb.com>	2016-01-20 13:58:44 +02:00
Asias He	2345cda42f	messaging_service: Rename shard_id to msg_addr Use shard_id as the destination of the messaging_service is confusing, since shard_id is used in the context of cpu id. Message-Id: <8c9ef193dc000ef06f8879e6a01df65cf24635d8.1452155241.git.asias@scylladb.com>	2016-01-07 10:36:35 +02:00
Asias He	1b3d2dee8f	streaming: Drop src_cpu_id parameter Now that we can get the src_cpu_id from rpc::client_info, there is no need to pass it as a verb parameter.	2015-12-31 11:25:09 +01:00
Asias He	89b79d44de	streaming: Get rid of the _connecting_ parameter messaging_service will use the private ip address automatically to connect to a peer node if possible. There is no need for the upper level like streaming to worry about it. Dropping it simplifies things a bit.	2015-12-31 11:25:08 +01:00
Avi Kivity	827a4d0010	Merge "streaming: Invalidate cache upon receiving of stream" from Asias "When a node gains or regains responsibility for certain token ranges, streaming will be performed; upon receiving the stream data, the row cache is invalidated for that range. Refs #484."	2015-12-28 10:24:46 +02:00
Asias He	20c258f202	streaming: Fix session hang with maybe_completed: WAIT_COMPLETE -> WAIT_COMPLETE The problem is that we set the session state to WAIT_COMPLETE in send_complete_message's continuation, the peer node might send COMPLETE_MESSAGE before we run the continuation, thus we set the wrong status in COMPLETE_MESSAGE's handler and will not close the session. Before: GOT STREAM_MUTATION_DONE receive task_completed SEND COMPLETE_MESSAGE to 127.0.0.2:0 GOT COMPLETE_MESSAGE, from=127.0.0.2, connecting=127.0.0.3, dst_cpu_id=0 complete: PREPARING -> WAIT_COMPLETE GOT COMPLETE_MESSAGE Reply maybe_completed: WAIT_COMPLETE -> WAIT_COMPLETE After: GOT STREAM_MUTATION_DONE receive task_completed maybe_completed: PREPARING -> WAIT_COMPLETE SEND COMPLETE_MESSAGE to 127.0.0.2:0 GOT COMPLETE_MESSAGE, from=127.0.0.2, connecting=127.0.0.3, dst_cpu_id=0 complete: WAIT_COMPLETE -> COMPLETE Session with 127.0.0.2 is complete	2015-12-24 20:34:44 +08:00
Asias He	c971fad618	streaming: Introduce keep alive timer for each stream_session If the session is idle for 10 minutes, close the session. This can detect the following hangs: 1) if the sending node is gone, the receiving peer will wait forever 2) if the node which should send COMPLETE_MESSAGE to the peer node is gone, the peer node will wait forever Fixes simple_kill_streaming_node_while_bootstrapping_test.	2015-12-24 20:34:44 +08:00
Asias He	2d32195c32	streaming: Invalidate cache upon receiving of stream When a node gains or regains responsibility for certain token ranges, streaming will be performed; upon receiving the stream data, the row cache is invalidated for that range. Refs #484.	2015-12-21 14:44:13 +08:00
Asias He	517fd9edd4	streaming: Add helper to get distributed<database> db	2015-12-21 14:42:47 +08:00
Asias He	52a5e954f9	gossip: Pass const ref for versioned_value in on_change and before_change	2015-12-09 12:29:15 +08:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Asias He	e7c0db0160	streaming: Fix a race between initiator and follower 1) Node A sends prepare message (msg1) to Node B 2) Node B sends prepare message (msg2) back to Node A 3) Node A prepares what to receive according to msg2 The issue is that Node B might send before Node A is prepared to receive. To fix, we send a PREPARE_DONE_MESSAGE after step 3 to notify Node B to start sending.	2015-08-17 14:28:11 +08:00
Asias He	0f1f710b27	streaming: Introduce transfer_task_completed	2015-08-17 11:00:30 +08:00
Asias He	924ca5915e	stream_session: Make sure cf exists before streaming We use storage_proxy::mutate_locally() to apply the mutations when we receive them. mutate_locally() will ignore the mutation if the cf does not exist. We check in the prepare phase to make sure all the cf's exist.	2015-08-04 16:21:40 +08:00
Asias He	6712e9404e	streaming: Implement session completion logic	2015-07-21 16:12:54 +08:00
Asias He	f9109c33ba	streaming: Implement stream_transfer_task completion logic	2015-07-21 16:12:54 +08:00
Asias He	9794fa1f97	streaming: Improve the test Instead of streaming system.local table, create and stream user created table.	2015-07-21 16:12:54 +08:00
Asias He	d720dadf7b	streaming: Switch to use logger class	2015-07-14 20:56:28 +08:00
Asias He	e82bdf2995	streaming: Switch to use shared_ptr from std::shared_ptr Since our shared_ptr works with incomplete types now, switch to it.	2015-07-14 20:41:14 +08:00
Asias He	38ee079916	streaming: Add test helper function This is a very preliminary test to make sure negotiating between two nodes is ok.	2015-07-14 20:41:14 +08:00
Asias He	01aa42ddca	streaming: Add streaming_debug Add debug print for message exchange	2015-07-14 20:41:14 +08:00
Asias He	85d9204d0e	streaming: Drop connection_handler stream_session::stream_session(inet_address peer_, inet_address connecting_, int index_, bool keep_ss_table_level_) : peer(peer_) , connecting(connecting_) , conn_handler(shared_from_this()) Calling shared_from_this() inside stream_session's constructor is problematic. I got Exiting on unhandled exception of type 'std::bad_weak_ptr': bad_weak_ptr exceptions, with auto session = std::make_shared<stream_session>(peer, connecting, size, _keep_ss_table_level) Also, the logic in connection_handler is not very useful for us. The sending and receiving of messages are handled using messaging_service. There is no need to add another layer.	2015-07-14 20:41:14 +08:00
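
Commit 85d9204d0e above reports a std::bad_weak_ptr exception caused by calling shared_from_this() inside a constructor. The sketch below reproduces that failure mode with the standard library only and shows one common workaround, a two-phase factory. It is not the ScyllaDB code: the names session, handler, bad_session, attach_handler, create and constructor_call_throws are hypothetical, and the throwing behaviour is only guaranteed from C++17 onwards (earlier standards leave it undefined).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct session;  // hypothetical stand-in for a stream session

// Hypothetical handler that wants a non-owning back-pointer to its session.
struct handler {
    std::weak_ptr<session> owner;
};

struct session : std::enable_shared_from_this<session> {
    std::string peer;
    handler conn;

    explicit session(std::string p) : peer(std::move(p)) {}

    // Safe only after a shared_ptr already owns *this.
    void attach_handler() { conn.owner = shared_from_this(); }

    // Two-phase factory: construct first, then wire up the back-pointer.
    static std::shared_ptr<session> create(std::string p) {
        auto s = std::make_shared<session>(std::move(p));
        s->attach_handler();
        assert(!s->conn.owner.expired());
        return s;
    }
};

// Mirrors the failing pattern: no shared_ptr owns the object yet while
// its constructor runs, so shared_from_this() has nothing to share.
struct bad_session : std::enable_shared_from_this<bad_session> {
    std::shared_ptr<bad_session> self;
    bad_session() : self(shared_from_this()) {}  // throws std::bad_weak_ptr (C++17)
};

// Returns true if constructing bad_session throws std::bad_weak_ptr.
bool constructor_call_throws() {
    try {
        (void)std::make_shared<bad_session>();
        return false;
    } catch (const std::bad_weak_ptr&) {
        return true;
    }
}
```

Calling session::create("127.0.0.2") yields a session whose handler points back at it, whereas std::make_shared<bad_session>() fails during construction. The commit itself took a different route and removed the connection_handler layer altogether.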


89 Commits