Commit Graph

105 Commits

Author SHA1 Message Date
Tomasz Grabiec
d64db98943 query: Convert serialization of query::result to use db::serializer<>
That's what we're trying to standardize on.

This patch also fixes an issue with the current query::result::serialize()
not being const-qualified, because it modifies the
buffer. messaging_service used a const_cast to work around this, which
is not safe.
2015-12-03 09:19:11 +01:00
Gleb Natapov
8c02ad0e9e messaging: log connection dropping event 2015-11-30 19:42:04 +02:00
Gleb Natapov
33e5097090 messaging: do not kill live connection needlessly
The messaging service closes the connection in the rpc call continuation on
closed_error, but that code runs for each outstanding rpc call on the
connection. The first continuation may destroy a genuinely closed connection;
the connection is then reopened, and the next continuation handling the previous
error kills the now perfectly healthy connection. Fix this by closing the
connection only if it is in an error state.
2015-11-23 20:16:28 +02:00
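The race described in 33e5097090 can be illustrated with a minimal, single-threaded sketch. The names rpc_client, client_table, get_or_connect and on_closed_error are hypothetical stand-ins, not the actual messaging_service code; the only point is that the error handler checks the state of the client currently in the table before dropping it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an rpc client; only the error flag matters here.
struct rpc_client {
    bool in_error = false;
    bool error() const { return in_error; }
};

// Hypothetical stand-in for the table of per-peer clients.
struct client_table {
    std::map<std::string, std::shared_ptr<rpc_client>> clients;

    // Return the existing client for addr, or "reconnect" by creating one.
    std::shared_ptr<rpc_client> get_or_connect(const std::string& addr) {
        auto& slot = clients[addr];
        if (!slot) {
            slot = std::make_shared<rpc_client>();
        }
        return slot;
    }

    // Runs once per failed call. Drops the client only if the one currently
    // in the table is in an error state; returns true if it was dropped.
    bool on_closed_error(const std::string& addr) {
        auto it = clients.find(addr);
        if (it != clients.end() && it->second->error()) {
            clients.erase(it);
            return true;
        }
        return false;  // the current client is healthy (or already gone)
    }
};

inline void reconnect_example() {
    client_table t;
    auto broken = t.get_or_connect("10.0.0.2");  // illustrative address
    broken->in_error = true;

    bool dropped_first = t.on_closed_error("10.0.0.2");   // first continuation
    auto fresh = t.get_or_connect("10.0.0.2");            // reconnect
    bool dropped_second = t.on_closed_error("10.0.0.2");  // second continuation

    assert(dropped_first);
    assert(!dropped_second);
    assert(t.clients.at("10.0.0.2") == fresh);
}
```

Without the error() check, the second call would erase the freshly opened client, which is the behaviour the commit message describes.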
Gleb Natapov
eb220507ce storage_proxy: use correct endpoint address for mutation acks processing
The write handler keeps track of all endpoints that have not yet acked the mutation
verb. It uses the broadcast address as an endpoint id, but if the local address
is different from the broadcast address, then for local endpoints acknowledgements
will come from a different address, so the socket address cannot be used as
the acknowledgement source. Origin solves this by sending "from" in each
message, which looks like needless overhead. Solve this instead by providing the endpoint's
broadcast address in rpc client_info and using that.
2015-11-16 10:29:47 +01:00
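A minimal sketch of the idea in eb220507ce, assuming a simplified client_info that carries both addresses. The struct layout, the write_handler type and the addresses below are illustrative assumptions, not the real storage_proxy or rpc types.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical per-connection info. The commit says the peer's broadcast
// address is made available through rpc client_info; the fields are assumed.
struct client_info {
    std::string socket_addr;     // address the connection actually came from
    std::string broadcast_addr;  // address the peer advertises to the cluster
};

// Hypothetical write handler: endpoints that have not acked yet,
// keyed by broadcast address.
struct write_handler {
    std::set<std::string> pending;

    // Record an ack; returns true if we were waiting for this sender.
    bool got_ack(const client_info& from) {
        return pending.erase(from.broadcast_addr) > 0;
    }
    bool done() const { return pending.empty(); }
};

inline void ack_example() {
    write_handler h;
    h.pending.insert("54.10.0.1");  // illustrative broadcast address

    client_info peer{"172.30.0.99", "54.10.0.1"};
    // Matching on the socket address would never clear the pending entry.
    assert(h.pending.count(peer.socket_addr) == 0);

    bool matched = h.got_ack(peer);
    assert(matched);
    assert(h.done());
}
```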
Amnon Heiman
d5d0653210 messaging_service: Add a function that goes over all the server stats
The API needs to get the stats from the rpc server, that is hidden from the
messaging service API.

This patch adds a foreach function that goes over all the server stats
without exposing the server implementation.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2015-11-02 16:15:52 +02:00
Asias He
2c8867c348 config: Enable storage_port option 2015-10-29 08:58:41 +08:00
Vlad Zolotarov
d8de1099eb message::messaging_service: introduce _preferred_ip_cache
This map will contain the (internal) IPs corresponding to specific Nodes.
The mapping is also stored in the system.peers table.

So, instead of always connecting to the external IP, messaging_service::get_rpc_client()
will query _preferred_ip_cache and connect to the external IP only if there is
no entry for the given Node.

We will call init_local_preferred_ip_cache() at the end of system table init.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Improved the _preferred_ip_cache description.
   - Code styling issues.

New in v3:
   - Make get_internal_ip() public.
   - get_rpc_client(): restore the get_preferred_ip() usage that was dropped
     in v2 by mistake during a rebase.
2015-10-26 14:09:26 +02:00
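The lookup rule described in d8de1099eb can be sketched as follows. This is a simplified illustration: plain strings stand in for the real address type, the class name and cache_preferred_ip() are hypothetical, and only get_preferred_ip() is a name taken from the commit message.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical, simplified cache: external IP -> internal (preferred) IP.
class preferred_ip_cache {
    std::unordered_map<std::string, std::string> _cache;
public:
    // Hypothetical setter; the real cache is filled from system.peers.
    void cache_preferred_ip(const std::string& external, const std::string& internal) {
        _cache[external] = internal;
    }

    // Return the internal IP if one is cached, otherwise the external IP.
    std::string get_preferred_ip(const std::string& external) const {
        auto it = _cache.find(external);
        return it != _cache.end() ? it->second : external;
    }
};

inline void preferred_ip_example() {
    preferred_ip_cache cache;
    cache.cache_preferred_ip("54.10.0.1", "172.30.0.100");  // illustrative addresses

    assert(cache.get_preferred_ip("54.10.0.1") == "172.30.0.100");  // cached entry
    assert(cache.get_preferred_ip("54.10.0.2") == "54.10.0.2");     // no entry: fall back
}
```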
Vlad Zolotarov
f896f9a908 message::messaging_service: added remove_rpc_client(shard_id)
This function erases shard_info objects from all _clients maps.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Use remove_rpc_client_one() instead of direct map::erase().
2015-10-26 14:09:26 +02:00
Vlad Zolotarov
e9789dd68c message::messaging_service: fixes in rpc_protocol_client_wrapper shut down
   - Ensure messaging_service::stop() blocks until all rpc_protocol::client::stop()
     calls are over.
   - Remove the async code from the rpc_protocol_client_wrapper destructor and call
     stop() everywhere it's needed instead. Ensure that
     rpc_protocol_client_wrapper is always "stopped" when its destructor is called.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v3:
   - Code style fixes.
   - Killed rpc_protocol_client_wrapper::_stopped.
   - Killed rpc_protocol_client_wrapper::~rpc_protocol_client_wrapper().
   - Use std::move() for saving shared pointer before
     erasing the entry from _clients in
     remove_rpc_client_one() in
     order to avoid extra ref count bumping.
2015-10-26 14:09:26 +02:00
Vlad Zolotarov
842b13325d message::messaging_service: make _clients to be std::array
This makes the code cleaner. It would also require fewer changes
if we decide to increase the size of _clients in the future.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-26 14:09:26 +02:00
Asias He
1965e8751b messaging_service: Add REPLICATION_FINISHED verb
It is used by storage_service to send a replication finished message when
removing a node from a cluster.
2015-10-21 16:11:33 +08:00
Tomasz Grabiec
19d7d30e67 Replace references to 'urchin' with 'scylla' 2015-10-19 11:08:05 +03:00
Calle Wilund
37131fcc05 messaging_service: TRUNCATE verb methods 2015-09-30 09:09:42 +02:00
Gleb Natapov
140641689b messaging: do not use rpc client in error state
Using an rpc client in an error state will result in message loss. Try to
reconnect instead.
2015-09-24 17:50:51 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Asias He
eead846712 messaging_service: Make gossip use standalone tcp connection
For unknown reasons, I saw gossip syn messages get rpc timeout errors when
the cluster is under heavy cassandra-stress load.

Using a standalone tcp connection seems to fix the issue.
2015-09-19 10:17:42 +03:00
Asias He
0f5df4476c gossip: Make the timeout longer for gossip syn and echo message
When the cluster is under heavy load, exchanging a gossip
message might take longer than 1s. Let's make the timeout longer for now,
until we can solve the issue of large gossip message delays.
2015-09-17 11:35:31 +03:00
Asias He
1e7d883ae1 messaging_service: Fix shard_id
We should ignore the cpu id in the equality and less-than operators for shard_id as well.

In a 3-node cluster where each node has 4 cpus, on the first node:

Before:
[fedora@ip-172-30-0-99 ~]$ netstat -nt|grep 100\:7000
tcp        0      0 172.30.0.99:36998       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:36772       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:40125       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:60182       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:38013       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:51997       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:56532       172.30.0.100:7000 ESTABLISHED

After:
[fedora@ip-172-30-0-99 ~]$ netstat -nt|grep 100\:7000
tcp        0      0 172.30.0.99:45661       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:57395       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:37807       172.30.0.100:7000 ESTABLISHED
tcp        0     36 172.30.0.99:50567       172.30.0.100:7000 ESTABLISHED

Each shard of a node is supposed to have 1 connection to a peer node,
thus each node will have #cpu connections to a peer node.

With this patch, the cluster is much more stable than before on AWS. So
far, I see no timeout in the gossip syn message exchange.
2015-09-16 08:44:47 +02:00
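Why the comparison operators matter in 1e7d883ae1 can be shown with a small sketch. It assumes a simplified shard_id in which a string stands in for the real address type and an int stands in for shard_info; the layout is illustrative, not the actual definition. An unordered_map treats two keys as the same entry only if they hash alike and compare equal, so ignoring the cpu id in the hash alone still leaves one entry per cpu id.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Simplified shard_id (assumed layout).
struct shard_id {
    std::string addr;
    std::uint32_t cpu_id;

    // cpu_id is deliberately ignored in comparisons.
    friend bool operator==(const shard_id& a, const shard_id& b) {
        return a.addr == b.addr;
    }
    friend bool operator<(const shard_id& a, const shard_id& b) {
        return a.addr < b.addr;
    }

    // cpu_id is deliberately ignored in the hash as well.
    struct hash {
        std::size_t operator()(const shard_id& id) const {
            return std::hash<std::string>()(id.addr);
        }
    };
};

// Two lookups for the same peer with different cpu ids share one entry.
inline std::size_t clients_after_two_lookups() {
    std::unordered_map<shard_id, int, shard_id::hash> clients;  // int stands in for shard_info
    clients[shard_id{"172.30.0.100", 0}] = 1;
    clients[shard_id{"172.30.0.100", 3}] = 2;  // same peer, different cpu: same key
    assert(clients.size() == 1);
    return clients.size();
}
```

With one map per shard and one entry per peer in each map, a node ends up with #cpu connections to a peer, which matches the "After" output above.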
Asias He
247e9109d9 gossip: Introduce uninit_messaging_service_handler
It is useful in gossip shutdown process.
2015-09-08 12:19:06 +08:00
Gleb Natapov
f51f5c819e messaging: Add unregister function for verbs used by storage proxy 2015-09-07 14:46:13 +02:00
Asias He
8cff2318dc gossip: Add timeout support for send_echo 2015-09-06 16:35:11 +08:00
Asias He
2a06214306 gossip: Switch to use rpc timeout for send_gossip_digest_syn
Timeout support was added to gossip message by using semaphore's
timeout support, now that rpc has timeout support, switch to it.
2015-09-06 16:34:41 +08:00
Asias He
06fb6f6b30 messaging_service: Introduce send_message_timeout
It is used to send a message with timeout.
2015-09-06 15:01:21 +08:00
Gleb Natapov
cea529f490 messaging_service: protect messaging_service from being destroyed while in use
A pointer to the messaging_service object is stored in each request
continuation, so the object must not be destroyed while
any of these continuations is scheduled. Use a shared pointer instead.

Fixes #273.
2015-09-03 13:59:05 +03:00
Gleb Natapov
1d96dcdbbe Revert "protect messaging_service destruction by a gate object"
This reverts commit 8599e9d84f.
2015-09-03 13:58:46 +03:00
Gleb Natapov
8599e9d84f protect messaging_service destruction by a gate object
A pointer to the messaging_service object is stored in each request
continuation, so the object must not be destroyed while any of
these continuations is scheduled. Use a gate object to ensure that.
2015-08-25 15:28:58 +03:00
Gleb Natapov
788fc66e29 messaging: keep shared reference to rpc client while send is underway
There can be multiple sends underway when the first one detects an error and
destroys the rpc client, while the rpc client is still in use by the other sends.
Fix this by making the rpc client pointer shared and holding a reference to it
for each send operation.
2015-08-20 19:22:08 +03:00
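The lifetime rule from 788fc66e29 can be sketched with std::shared_ptr. The real code runs on seastar futures; here a vector of std::function objects stands in for pending continuations, and rpc_client, sender and their members are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical rpc client that records its own destruction (demo only).
struct rpc_client {
    bool* destroyed;
    explicit rpc_client(bool* d) : destroyed(d) {}
    ~rpc_client() { *destroyed = true; }
};

struct sender {
    std::unordered_map<std::string, std::shared_ptr<rpc_client>> clients;
    std::vector<std::function<void()>> in_flight;  // stand-in for continuations

    // Each send copies the shared pointer, so the client outlives the map entry.
    void send(const std::string& addr) {
        auto client = clients.at(addr);
        in_flight.push_back([client] {
            assert(!*client->destroyed);  // still alive when the continuation runs
        });
    }

    // An error path removes the client from the map.
    void drop_client(const std::string& addr) { clients.erase(addr); }

    // Run and release all pending continuations.
    void complete_all() {
        for (auto& f : in_flight) {
            f();
        }
        in_flight.clear();
    }
};
```

In this sketch, dropping the map entry while sends are pending does not destroy the client; it is destroyed when the last pending continuation releases its reference.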
Asias He
de504086b4 messaging_service: Add PREPARE_DONE_MESSAGE verb
It is used to notify the follower to start sending streaming data to the
initiator.
2015-08-17 14:28:11 +08:00
Avi Kivity
06c6432f1e messaging: fix bad return type in string deserializer
Found by gcc 6.
2015-08-13 17:51:29 +03:00
Asias He
74e2f0156a messaging_service: Ignore cpu id for shard_id hash
Since we do not support shard to shard connections at the moment, ip
address should fully decide if a connection to a remote node exists or
not. messaging_service maintains connections to remote node using

   std::unordered_map<shard_id, shard_info, shard_id::hash> _clients;

With this patch, we can possibly reduce number of tcp connections
between two nodes.
2015-08-12 10:25:06 +03:00
Avi Kivity
54af3e2d27 messaging_service: stop rpc client before removing it
Instead of just destroying the client, stop() it first and wait for that
to complete.  Otherwise any connect in progress, or any requests in progress,
will access a destroyed object (in practice it will fail earlier when
the _stopped promise is destroyed without having been fulfilled).

Depends on rpc client stop patch.
2015-07-30 11:13:30 +03:00
Asias He
a2b54fc757 main: Introduce init.cc to cleanup service startup code
This patch introduces the init.cc file, which hosts all the initialization
code. The benefits are: 1) we can share initialization code with test
code; 2) all the service startup dependency / ordering code is in one
single place instead of everywhere.
2015-07-28 18:20:45 +08:00
Asias He
5aedb7cfda messaging_service: Drop rpc client when rpc errors 2015-07-24 15:56:04 +08:00
Pekka Enberg
9e799cadd7 message/messaging_service: MIGRATION_REQUEST wrappers
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
49d9ac716d message/messaging_service.hh: Add missing do_with.hh include
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:24:37 +03:00
Asias He
b8be013934 messaging_service: Fix undefined reference to current_version
/usr/include/boost/format/feed_args.hpp:135: undefined reference to
`net::messaging_service::current_version'
2015-07-21 17:00:15 +08:00
Asias He
a010829f0c streaming: Add src_cpu_id parameter for PREPARE_MESSAGE verb
We need it to set up dst_cpu_id for the session of the follower.
2015-07-21 16:12:54 +08:00
Asias He
88aee5f51f streaming: Add register_complete_message and send_complete_message 2015-07-21 16:12:54 +08:00
Asias He
7f7c89951e streaming: Introduce STREAM_MUTATION_DONE verb
It is sent once a stream_transfer_task has sent all the mutations inside
the messages::outgoing_file_message objects it contains, to notify the remote
peer that this stream_transfer_task is done.
2015-07-21 16:12:54 +08:00
Asias He
f2960a7cb0 streaming: Send plan_id for STREAM_MUTATION
We need this to find the session associated with this frozen_mutation.
2015-07-21 16:12:54 +08:00
Asias He
fa2aee57ac utils: Move util/serialization.hh to utils/serialization.hh
Now we will not have the ugly utils and util directories, only utils.
2015-07-21 16:12:54 +08:00
Avi Kivity
8712796e0f Merge seastar upstream
Add messaging_service thunk from new-style serialization (free functions)
to old-style serialization (serializer methods).
2015-07-19 14:18:35 +03:00
Avi Kivity
0d68b4614d Merge seastar upstream
Rewrote messaging_service serialization for new non-deferring rpc
serialization.
2015-07-18 20:21:20 +03:00
Avi Kivity
c74e36c30e Merge branch 'master' of github.com:cloudius-systems/urchin into db
Conflicts:
	message/messaging_service.cc
	message/messaging_service.hh
2015-07-16 12:51:19 +03:00
Avi Kivity
1d4805236b messaging_service: don't include config.hh in .hh
config.hh changes rapidly, so don't force lots of recompiles by including it.

Need to place seed_provider_type in namespace scope, so we can forward
declare it for that.
2015-07-16 12:26:02 +03:00
Asias He
c75e89da88 messaging_service: Farewell rpc.hh 2015-07-16 17:23:26 +08:00
Asias He
48c895aa18 messaging_service: Move rpc_protocol_wrapper and friends to .cc file 2015-07-16 17:23:26 +08:00
Asias He
70236dbfa6 messaging_service: Add shard_info::get_stats helper
Will be used in api/messaging_service.cc.
2015-07-16 17:23:26 +08:00
Asias He
8c2cd037a2 messaging_service: Include rpc_types.hh
so that when we remove rpc.hh, types like rpc::closed_error will be
available.
2015-07-16 17:23:26 +08:00
Asias He
8bc59fb0b9 messaging_service: Move register_handler and send_message to .cc file 2015-07-16 17:23:26 +08:00