That's what we're trying to standardize on.
This patch also fixes an issue with the current query::result::serialize()
not being const-qualified, because it modifies the buffer.
messaging_service used a const_cast to work around this, which is not safe.
Messaging service closes the connection in the rpc call continuation on
closed_error, but the code runs for each outstanding rpc call on the
connection, so the first continuation may destroy a genuinely closed
connection; the connection is then reopened, and the next continuation,
which handles the previous error, kills the now perfectly healthy
connection. Fix this by closing the connection only when it is in an
error state.
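The race above can be modeled in a few lines. This is a minimal sketch with illustrative names (connection, fixed_on_error, run_fixed are not the real Scylla types): a stale continuation only closes the connection if it is still marked as being in the error state.

```cpp
#include <cassert>

// Hypothetical model of the fix: each outstanding rpc call runs a
// continuation on failure, but only a continuation that observes the
// connection still in the error state actually closes it.
struct connection {
    bool in_error = false;   // set when the transport reports closed_error
    bool closed = false;
    int generation = 0;      // bumped every time the connection is reopened
};

// Naive handler: every error continuation closes unconditionally, so a
// continuation left over from generation N can kill generation N+1.
void naive_on_error(connection& c) { c.closed = true; }

// Fixed handler: close only while the connection is still in error state.
void fixed_on_error(connection& c) {
    if (c.in_error) {
        c.closed = true;
    }
}

int run_fixed() {
    connection c;
    c.in_error = true;
    fixed_on_error(c);   // first continuation closes the bad connection
    c = connection{};    // connection is reopened, fresh and healthy
    c.generation = 1;
    fixed_on_error(c);   // stale continuation: no-op, connection survives
    return c.closed ? 1 : 0;
}
```

With the naive handler the second call would have closed the healthy generation-1 connection; the fixed handler leaves it alone.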
The write handler keeps track of all endpoints that have not yet acked the
mutation verb. It uses the broadcast address as an endpoint id, but if the
local address differs from the broadcast address for local endpoints,
acknowledgements will come from a different address, so the socket address
cannot be used as an acknowledgement source. Origin solves this by sending
"from" in each message, which looks like overhead; solve it instead by
providing the endpoint's broadcast address in the rpc client_info and
using that.
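The bookkeeping described above can be sketched as follows. Names are illustrative, not the real Scylla API: the handler keys pending acks by broadcast address, which the peer supplies once via its client_info, rather than by the socket's source address.

```cpp
#include <cassert>
#include <set>
#include <string>

// Illustrative stand-in for rpc client_info carrying the peer's
// broadcast address, provided once at connection setup.
struct client_info {
    std::string broadcast_address;
};

struct write_handler {
    std::set<std::string> pending;  // endpoints that have not acked yet

    void expect_ack_from(const std::string& broadcast_addr) {
        pending.insert(broadcast_addr);
    }

    // The ack is attributed via client_info, not the socket source
    // address, so a node whose local address differs from its broadcast
    // address is still matched correctly.
    void on_ack(const client_info& ci) {
        pending.erase(ci.broadcast_address);
    }
};
```

Even if the TCP-level source address of the ack differs from the broadcast address, the lookup still succeeds, which is exactly what the socket-address scheme could not guarantee.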
The API needs to get the stats from the rpc server, which is hidden
behind the messaging service API.
This patch adds a foreach function that goes over all the server stats
without exposing the server implementation.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
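The foreach idea above can be illustrated like this (a sketch with invented names; the real stats structure and member names differ): the messaging layer walks its internal per-connection stats and hands each one to a caller-supplied function, so callers never see the server's container type.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Illustrative per-connection stats record.
struct connection_stats {
    unsigned long bytes_sent;
    unsigned long bytes_received;
};

class messaging_service_like {
    // Hidden implementation detail the API must not expose.
    std::vector<connection_stats> _stats{{100, 50}, {200, 10}};
public:
    // Visit every stats record without leaking the container type.
    void foreach_server_connection_stats(
            std::function<void(const connection_stats&)> f) const {
        for (const auto& s : _stats) {
            f(s);
        }
    }
};

unsigned long total_bytes_sent(const messaging_service_like& ms) {
    unsigned long total = 0;
    ms.foreach_server_connection_stats(
        [&](const connection_stats& s) { total += s.bytes_sent; });
    return total;
}
```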
This map will contain the (internal) IPs corresponding to specific Nodes.
The mapping is also stored in the system.peers table.
So, instead of always connecting to the external IP,
messaging_service::get_rpc_client() will query _preferred_ip_cache and
connect to the external IP only if there is no entry for the given Node.
We will call init_local_preferred_ip_cache() at the end of the system
table initialization.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Improved the _preferred_ip_cache description.
- Code styling issues.
New in v3:
- Make get_internal_ip() public.
- get_rpc_client(): restore a get_preferred_ip() usage dropped
in v2 by mistake during rebase.
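The lookup order described above reduces to a cache-first resolution. A minimal sketch, with a std::string-keyed map standing in for the real address types:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Maps a node's external IP to its preferred (internal) IP, as loaded
// from system.peers. Types are simplified for illustration.
std::unordered_map<std::string, std::string> _preferred_ip_cache;

// get_rpc_client()-style resolution: prefer the cached internal IP,
// fall back to the external IP when the node has no cache entry.
std::string get_preferred_ip(const std::string& external_ip) {
    auto it = _preferred_ip_cache.find(external_ip);
    return it != _preferred_ip_cache.end() ? it->second : external_ip;
}
```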
This function erases shard_info objects from all _clients maps.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Use remove_rpc_client_one() instead of direct map::erase().
- Ensure messaging_service::stop() blocks until all rpc_protocol::client::stop()
are over.
- Remove the async code from the rpc_protocol_client_wrapper destructor;
call stop() everywhere it's needed instead. Ensure that
rpc_protocol_client_wrapper is always "stopped" when its destructor is called.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v3:
- Code style fixes.
- Killed rpc_protocol_client_wrapper::_stopped.
- Killed rpc_protocol_client_wrapper::~rpc_protocol_client_wrapper().
- Use std::move() to save the shared pointer before erasing the entry
from _clients in remove_rpc_client_one(), in order to avoid an extra
ref count bump. This makes the code cleaner; it would also require
fewer changes if we decide to increase the _clients size in the future.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
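The std::move() point above is easy to demonstrate in isolation (illustrative names; not the real remove_rpc_client_one() signature): moving the shared_ptr out of the map entry transfers the reference instead of copying it, so the refcount never temporarily bumps to two.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

struct client { int id; };

// Steal the map's reference and erase the entry; the returned pointer
// keeps the client alive so the caller can still stop() it safely.
std::shared_ptr<client> take_and_erase(
        std::unordered_map<int, std::shared_ptr<client>>& clients, int key) {
    auto it = clients.find(key);
    auto ptr = std::move(it->second);  // transfer, no refcount bump
    clients.erase(it);                 // entry now holds an empty pointer
    return ptr;
}
```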
For unknown reasons, I saw gossip SYN messages get rpc timeout errors
when the cluster is under heavy cassandra-stress load.
Using a standalone tcp connection seems to fix the issue.
When the cluster is under heavy load, exchanging a gossip message might
take longer than 1s. Let's make the timeout longer for now, until we can
solve the large gossip message delay issue.
We should ignore the cpu id in the equal and less-than operators for
shard_id as well.
Within a 3-node cluster, where each node has 4 cpus, on the first node:
Before:
[fedora@ip-172-30-0-99 ~]$ netstat -nt|grep 100\:7000
tcp 0 0 172.30.0.99:36998 172.30.0.100:7000 ESTABLISHED
tcp 0 0 172.30.0.99:36772 172.30.0.100:7000 ESTABLISHED
tcp 0 0 172.30.0.99:40125 172.30.0.100:7000 ESTABLISHED
tcp 0 0 172.30.0.99:60182 172.30.0.100:7000 ESTABLISHED
tcp 0 0 172.30.0.99:38013 172.30.0.100:7000 ESTABLISHED
tcp 0 0 172.30.0.99:51997 172.30.0.100:7000 ESTABLISHED
tcp 0 0 172.30.0.99:56532 172.30.0.100:7000 ESTABLISHED
After:
[fedora@ip-172-30-0-99 ~]$ netstat -nt|grep 100\:7000
tcp 0 0 172.30.0.99:45661 172.30.0.100:7000 ESTABLISHED
tcp 0 0 172.30.0.99:57395 172.30.0.100:7000 ESTABLISHED
tcp 0 0 172.30.0.99:37807 172.30.0.100:7000 ESTABLISHED
tcp 0 36 172.30.0.99:50567 172.30.0.100:7000 ESTABLISHED
Each shard of a node is supposed to have 1 connection to a peer node,
thus each node will have #cpu connections to a peer node.
With this patch, the cluster is much more stable than before on AWS. So
far, I see no timeout in the gossip syn message exchange.
A pointer to the messaging_service object is stored in each request
continuation, so the object must not be destroyed while any of these
continuations is scheduled. Use a shared pointer instead.
Fixes #273.
A pointer to the messaging_service object is stored in each request
continuation, so the object must not be destroyed while any of these
continuations is scheduled. Use a gate object to ensure that.
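The real code uses seastar::gate; the core idea can be sketched with a toy synchronous version (illustrative, not the Seastar API): every continuation enters the gate before running and leaves when done, and teardown may only proceed once the gate is closed and the count has dropped to zero.

```cpp
#include <cassert>
#include <stdexcept>

// Toy gate: tracks in-flight continuations and rejects new entrants
// once closed, so the owning object can be destroyed safely.
class gate {
    int _count = 0;
    bool _closed = false;
public:
    void enter() {
        if (_closed) {
            throw std::runtime_error("gate closed");
        }
        ++_count;
    }
    void leave() { --_count; }
    void close() { _closed = true; }
    // Destruction is safe only when closed and no continuation remains.
    bool may_destroy() const { return _closed && _count == 0; }
};
```

In the Seastar version close() returns a future that resolves when the count reaches zero; the synchronous flag-and-counter above captures the same invariant.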
There can be multiple sends underway when the first one detects an error
and destroys the rpc client, while the rpc client is still in use by the
other sends. Fix this by making the rpc client pointer shared and holding
on to it for each send operation.
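The lifetime fix above boils down to shared ownership per send. A minimal sketch with invented names: each in-flight send holds its own shared_ptr copy, so the error path dropping the cached pointer cannot free the object out from under the other sends.

```cpp
#include <cassert>
#include <memory>

struct rpc_client { bool usable = true; };

int demo() {
    auto cached = std::make_shared<rpc_client>();
    auto send1 = cached;  // each in-flight send holds its own reference
    auto send2 = cached;
    cached.reset();       // error path: drop the cached client pointer
    // send2 still owns a live object; without shared ownership this
    // would be a use-after-free.
    return send2->usable ? 1 : 0;
}
```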
Since we do not support shard-to-shard connections at the moment, the ip
address alone should decide whether a connection to a remote node exists.
messaging_service maintains connections to a remote node using:
std::unordered_map<shard_id, shard_info, shard_id::hash> _clients;
With this patch, we can reduce the number of tcp connections between
two nodes.
Instead of just destroying the client, stop() it first and wait for that
to complete. Otherwise any connect in progress, or any request in
progress, will access a destroyed object (in practice it will fail
earlier, when the _stopped promise is destroyed without having been
fulfilled).
Depends on rpc client stop patch.
This patch introduces an init.cc file which hosts all the initialization
code. The benefits are: 1) we can share initialization code with test
code; 2) all the service startup dependency/ordering code is in one
place instead of scattered everywhere.
It is used when a stream_transfer_task has sent all the mutations inside
the messages::outgoing_file_message's it contains, to notify the remote
peer that this stream_transfer_task is done.
config.hh changes rapidly, so don't force lots of recompiles by including
it. We need to place seed_provider_type at namespace scope so that we can
forward-declare it.
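The namespace-scope requirement above follows from a C++ rule: a type nested inside a class cannot be forward-declared, while a namespace-scope type can. A small sketch (the member layout of seed_provider_type here is invented for illustration):

```cpp
#include <cassert>
#include <string>

namespace db {
struct seed_provider_type;  // forward declaration only: no config.hh needed
}

// A header-style signature can mention the incomplete type by reference,
// so including this header never pulls in config.hh.
std::string describe(const db::seed_provider_type&);

// The full definition lives in a .cc file (or config.hh) that does see
// the complete type; this member is a placeholder, not the real layout.
namespace db {
struct seed_provider_type {
    std::string class_name;
};
}

std::string describe(const db::seed_provider_type& s) {
    return s.class_name;
}
```

Had seed_provider_type stayed nested inside db::config, every header using it would have had to include config.hh, triggering the recompiles this patch avoids.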