scylladb

Author	SHA1	Message	Date
Gleb Natapov	1e6352e398	messaging: do not admit new requests during messaging service shutdown. Sending a message may open new client connection which will never be closed in case messaging service is shutting down already. Fixes #1059 Message-Id: <1458639452-29388-3-git-send-email-gleb@scylladb.com>	2016-03-22 13:00:18 +02:00
Gleb Natapov	357c91a076	messaging: do not delete client during messaging service shutdown Messaging service stop() method calls stop() on all clients. If remove_rpc_client_one() is called while those stops are running client::stop() will be called twice which not suppose to happen. Fix it by ignoring client remove request during messaging service shutdown. Fixes #1059 Message-Id: <1458639452-29388-2-git-send-email-gleb@scylladb.com>	2016-03-22 13:00:18 +02:00
Asias He	b8abd88841	messaging_service: Take reference of ms in send_message_timeout_and_retry Take a reference of messaging_service object inside send_message_timeout_and_retry to make sure it is not freed during the life time of send_message_timeout_and_retry operation.	2016-03-22 12:32:19 +02:00
Gleb Natapov	e228ef1bd9	messaging: enable keepalive tcp option for inter-node communication Some network equipment that does TCP session tracking tend to drop TCP sessions after a period of inactivity. Use keepalive mechanism to prevent this from happening for our inter-node communication. Message-Id: <20160314173344.GI31837@scylladb.com>	2016-03-14 19:39:39 +02:00
Pekka Enberg	16f947dcb3	message/messaging_service: Remove init_messaging_service() declaration The function no longer exists so drop the function declaration. Message-Id: <1457694134-25600-1-git-send-email-penberg@scylladb.com>	2016-03-14 13:54:53 +02:00
Asias He	bcdd3dbb3e	messaging_service: Add missed throw It is missed somehow. Message-Id: <1457684884-4776-1-git-send-email-asias@scylladb.com>	2016-03-11 11:01:24 +02:00
Asias He	bf3507d093	messaging_service: Stop retrying if node is removed from gossip - Start a node - Inject data - Start another node to bootstrap - Before the second node finishes streaming, kill the second node - After a while the node will be removed from the cluster becusue it does not manage to join the cluster. - At this time, messaging_service might keep retrying the stream_mutations unncessarily. To fix, check if the peer node is still a known node in the gossip.	2016-03-09 07:35:20 +08:00
Gleb Natapov	2d092bbd32	storage_proxy: send read requests with timeout No need to wait for replies long after request is timed out. Message-Id: <1457351304-28721-2-git-send-email-gleb@scylladb.com>	2016-03-07 14:00:11 +01:00
Paweł Dziepak	b92f8a6d2b	messaging_service: add SCHEMA_CHECK verb SCHEMA_CHECK is used to get node schema version. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-02 12:49:54 +00:00
Asias He	697b16414a	gossip: Make gossip message handling async In each gossip round, i.e., gossiper::run(), we do: 1) send syn message 2) peer node: receive syn message, send back ack message 3) process ack message in handle_ack_msg apply_state_locally mark_alive send_gossip_echo handle_major_state_change on_restart mark_alive send_gossip_echo mark_dead on_dead on_join apply_new_states do_on_change_notifications on_change 4) send back ack2 message 5) peer node: process ack2 message apply_state_locally At the moment, syn is "wait" message, it times out in 3 seconds. In step 3, all the registered gossip callbacks are called which might take significant amount of time to complete. In order to reduce the gossip round latency, we make syn "no-wait" and do not run the handle_ack_msg insdie the gossip::run(). As a result, we will not get a ack message as the return value of a syn message any more, so a GOSSIP_DIGEST_ACK message verb is introduced. With this patch, the gossip message exchange is now async. It is useful when some nodes are down in the cluster. We will not delay the gossip round, which is supposed to run every second, 3*n seconds (n = 1-3, since it talks to 1-3 peer nodes in each gossip round) or even longer (considering the time to run gossip callbacks). Later, we can make talking to the 1-3 peer nodes in parallel to reduce latency even more. Refs: #900	2016-02-24 19:33:39 +08:00
Asias He	63df54b368	messaging_service: Add GOSSIP_DIGEST_ACK We will soon switch to use no-wait message for gossip. GOSSIP_DIGEST_SYN will no longer return GOSSIP_DIGEST_ACK message. So we need a standalone verb for GOSSIP_DIGEST_ACK.	2016-02-24 19:31:14 +08:00
Paweł Dziepak	351c69b476	frozen_schema: use IDL-based serialization Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-02-19 23:12:00 +00:00
Gleb Natapov	2ae1ae2d18	Cleanup messaging_service.hh includes a bit. Forward declare some classes instead. Message-Id: <1454496142-14537-2-git-send-email-gleb@scylladb.com>	2016-02-04 13:22:24 +02:00
Asias He	46c392eb17	messaging_service: Stop retrying if messaging_service is being shutdown If we are shutting down the messaging_service, we should not retry the message again. Refs #862 Message-Id: <7c3afb646ba8254eca69096d80dd5ea007e416a7.1454418053.git.asias@scylladb.com>	2016-02-02 19:50:54 +02:00
Gleb Natapov	116ad5a603	Use net::messaging_service::current_version for serialization format versioning Message-Id: <1454421603-13080-2-git-send-email-gleb@scylladb.com>	2016-02-02 17:08:53 +01:00
Gleb Natapov	19067db642	remove old serializer	2016-02-02 12:15:50 +02:00
Gleb Natapov	10cd4d948c	Move result_digest to idl	2016-02-02 12:15:50 +02:00
Gleb Natapov	775cc93880	remove unused range and token serializers	2016-02-02 12:15:49 +02:00
Gleb Natapov	e6f7b12b51	Move partition_checksum to use idl	2016-02-02 12:15:49 +02:00
Gleb Natapov	60e3637efc	Move frozen_schema to idl	2016-02-02 12:15:49 +02:00
Asias He	fbf796b812	messaging_service: Use standalone connection for stream verbs In streaming, the amount of data needs to be streamed to peer nodes might be large. In order to avoid the streaming overwhelms the TCP connection used by user CQL verbs and starves the user CQL queries, we use a standalone TCP connection for streaming verbs.	2016-02-01 11:01:56 +08:00
Amnon Heiman	7b53b99968	idl-compiler: split the idl list Not all the idls are used by the messaging service, this patch removes the auto-generated single include file that holds all the files and replaes it with individual include of the generated fiels. The patch does the following: * It removes from the auto-generated inc file and clean the configure.py from it. * It places an explicit include for each generated file in messaging_serivce. * It add dependency of the generated code in the idl-compiler, so a change in the compiler will trigger recreation of the generated files. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1453900241-13053-1-git-send-email-amnon@scylladb.com>	2016-01-27 15:23:00 +02:00
Gleb Natapov	6a581bb8b6	messaging_service: replace rpc::type with boost::type RPC moved to boost::type to make serializers less rpc centric. Scylla should follow. Message-Id: <20160126164450.GA11706@scylladb.com>	2016-01-27 11:57:45 +02:00
Asias He	e8b8b454df	streaming: Flatten streaming messages class namespace There are only two messages: prepare_message and outgoing_file_message. Actually only the prepare_message is the message we send on wire. Flatten the namespace.	2016-01-26 13:04:29 +08:00
Gleb Natapov	c9bd069815	messaging_service: log rpc errors Message-Id: <20160125155005.GC23862@scylladb.com>	2016-01-25 17:59:26 +02:00
Asias He	ad80916905	messaging_service: Add streaming implementation for idl - stream_request - stream_summary - prepare_message Please enter the commit message for your changes. Lines starting	2016-01-25 22:36:58 +08:00
Avi Kivity	5c5207f122	Merge "Another round of streaming cleanup" from Asias "- Merge stream_init_message and stream_parepare_message - Drop session_index / keep_ss_table_level / file_message_header"	2016-01-25 12:54:30 +02:00
Asias He	77684a5d4c	messaging_service: Drop STREAM_INIT_MESSAGE The verb is not used anymore. Message-Id: <1453719054-29584-1-git-send-email-asias@scylladb.com>	2016-01-25 12:53:08 +02:00
Asias He	53c6cd7808	gossip: Rename echo verb to gossip_echo It is used by gossip only. I really could not allow myself to get along this inconsistence. Change before we still can. Message-Id: <1453719054-29584-2-git-send-email-asias@scylladb.com>	2016-01-25 12:53:07 +02:00
Asias He	ad4a096b80	streaming: Get rid of stream_init_message Unlike streaming in c*, scylla does not need to open tcp connections in streaming service for both incoming and outgoing messages, seastar::rpc does the work. There is no need for a standalone stream_init_message message in the streaming negotiation stage, we can merge the stream_init_message into stream_prepare_message.	2016-01-25 16:24:16 +08:00
Asias He	4ce08ff251	messaging_service: Add heart_beat_state implementation	2016-01-25 11:28:29 +08:00
Asias He	ecca969adf	messaging_service: Add gossip::endpoint_state implementation	2016-01-25 11:28:29 +08:00
Asias He	2a0b6589dd	messaging_service: Add versioned_value implementation	2016-01-25 11:28:29 +08:00
Asias He	15f2b353b9	messaging_service: Add gossip_digest implementation	2016-01-25 11:28:29 +08:00
Asias He	d81fc12af3	messaging_service: Add gossip_digest_ack2 implementation	2016-01-25 11:28:29 +08:00
Asias He	e67cecaee1	messaging_service: Add gossip_digest_syn implementation	2016-01-25 11:28:29 +08:00
Avi Kivity	65a140481c	Merge " streaming COMPLETE_MESSAGE failure and message retry logic fix" from Asias "This series: - Add more debug info to stream session - Fail session if we fail to send COMPLETE_MESSAGE - Handle message retry logic for verbs used by streaming See commit log for details."	2016-01-24 16:41:06 +02:00
Gleb Natapov	067bdb23cd	Move reconcilable_result and frozen_mutation to idl	2016-01-24 12:45:41 +02:00
Gleb Natapov	18dff5ebc8	Move smart pointer serialization helpers to .cc file. They are not used outside of the .cc file, so should not be in the header.	2016-01-24 12:45:41 +02:00
Gleb Natapov	93da9b2725	Remove redundant vector serialization code. IDL serializer has the code to serialize vectors, so use it instead.	2016-01-24 12:45:41 +02:00
Gleb Natapov	afc407c6e5	Move query::result to use idl.	2016-01-24 12:45:41 +02:00
Gleb Natapov	4ae906b204	Add serializer overload for query::partition_range. From now on query::partition_range will use generated code.	2016-01-24 12:45:41 +02:00
Gleb Natapov	2d1b2765e6	Add serializer overload for query::read_command. From now on query::read_command will use generated code.	2016-01-24 12:45:41 +02:00
Amnon Heiman	577ce0d231	Adding a sepcific template initialization in messaging_service to use the serializer This patch adds a specific template initialization so that the rpc would use the serializer and deserializer that are auto-generated. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2016-01-24 12:29:21 +02:00
Asias He	7ac3e835a6	messaging_service: Fix send_message_timeout_and_retry When a verb timeout, if we resend the message again, the peer could receive the message more than once. This would confuse the receiver. Currently, only the streaming code use the retry logic. - In case of rpc:timeout_error: Instead of doing timeout in a relatively short time and resending a few times, we make the timeout big enough and let the tcp to do the resend. Thus, we can avoid resending the message more than once, of course, the receiver would not receive the message more than once. - In case of rpc::closed_error: There are two cases: 1) Failing to establish a connection. For instance, the peer is down. It is safe to resend since we know for sure the receiver hasn't received the message yet. 2) The connection is established. We can not figure out if the remote peer have received the message already or not upon receiving the rpc::closed_error exception. Currently, we still sleep & resend the message again, so the receiver might receive the message more than once. We do not have better choice in this case, if we want the resend to recover the sending error due to temporary network issue, since failing the whole stream_session due to failing to send a single message is not wise. NOTE: If the duplicated message is received when the stream_session is done, it will be ignored since it can not find the stream_manager anymore. For message like, STREAM_MUTATION, it is ok to receive twice (we apply the mutation twice). TODO: For other messages which uses the retry logic, we need to make sure it is ok to receive more than once.	2016-01-22 08:20:48 +08:00
Avi Kivity	221ef4536c	messaging service: limit rpc server resources Otherwise, a slow node can be overwhelmed by other nodes and run out of memory. Fixes #596. Message-Id: <1452776394-13682-1-git-send-email-avi@scylladb.com>	2016-01-18 11:16:45 +02:00
Avi Kivity	d5050e4c6a	storage_proxy: make MUTATION and MUTATION_DONE verbs sychronous at the server side While MUTATION and MUTATION_DONE are asynchronous by nature (when a MUTATION completes, it sends a MUTATION_DONE message instead of responding synchronously), we still want them to be synchronous at the server side wrt. the RPC server itself. This is because RPC accounts for resources consumed by the handler only while the handler is executing; if we return immediately, and let the code execute asynchronously, RPC believes no resources are consumed and can instantiate more handlers than the shard has resources for. Fix by changing the return type of the handlers to future<no_wait_type> (from a plain no_wait_type), and making that future complete when local processing is over. Ref #596. Message-Id: <1453048967-5286-1-git-send-email-avi@scylladb.com>	2016-01-18 09:59:34 +02:00
Tomasz Grabiec	e88f41fb3f	messaging_service: Move REPAIR_CHECKSUM_RANGE verb out of the streaming verbs group Message-Id: <1452620321-17223-1-git-send-email-tgrabiec@scylladb.com>	2016-01-12 20:17:08 +02:00
Vlad Zolotarov	9232ad927f	messaging_service::get_rpc_client(): fix the encryption logic According to specification (here https://wiki.apache.org/cassandra/InternodeEncryption) when the internode encryption is set to `dc` the data passed between DCs should be encrypted and similarly, when it's set to `rack` the inter-rack traffic should encrypted. Currently Scylla would encrypt the traffic inside a local DC in the first case and inside the local RACK in the later one. This patch fixes the encryption logic to follow the specification above. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1452501794-23232-1-git-send-email-vladz@cloudius-systems.com>	2016-01-12 16:22:26 +02:00
Tomasz Grabiec	e1e8858ed1	service: Fetch and sync schema	2016-01-11 10:34:53 +01:00

1 2 3 4

184 Commits