Commit Graph

137 Commits

Author SHA1 Message Date
Tomasz Grabiec
e88f41fb3f messaging_service: Move REPAIR_CHECKSUM_RANGE verb out of the streaming verbs group
Message-Id: <1452620321-17223-1-git-send-email-tgrabiec@scylladb.com>
2016-01-12 20:17:08 +02:00
Vlad Zolotarov
9232ad927f messaging_service::get_rpc_client(): fix the encryption logic
According to specification
(here https://wiki.apache.org/cassandra/InternodeEncryption)
when the internode encryption is set to `dc` the data passed between
DCs should be encrypted and similarly, when it's set to `rack`
the inter-rack traffic should be encrypted.

Currently Scylla would encrypt the traffic inside a local DC in the
first case and inside the local RACK in the latter one.

This patch fixes the encryption logic to follow the specification
above.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1452501794-23232-1-git-send-email-vladz@cloudius-systems.com>
2016-01-12 16:22:26 +02:00
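
The rule that 9232ad927f restores can be sketched as a small predicate. This is a minimal illustration, not Scylla's code: the `encrypt_what` enum, the `endpoint_location` struct and `must_encrypt()` are hypothetical names, and the `none`/`all` settings are assumed to behave as their names suggest.

```cpp
#include <cassert>
#include <string>

// Hypothetical names for illustration; not the messaging_service API.
enum class encrypt_what { none, rack, dc, all };

struct endpoint_location {
    std::string dc;
    std::string rack;
};

// Returns true when traffic from `local` to `remote` should be encrypted.
inline bool must_encrypt(encrypt_what mode,
                         const endpoint_location& local,
                         const endpoint_location& remote) {
    switch (mode) {
    case encrypt_what::none:
        return false;
    case encrypt_what::all:
        return true;
    case encrypt_what::dc:
        // Encrypt only traffic that crosses a DC boundary.
        return local.dc != remote.dc;
    case encrypt_what::rack:
        // Encrypt traffic that leaves the local rack. Assumption: a peer in
        // another DC is always treated as being in another rack.
        return local.dc != remote.dc || local.rack != remote.rack;
    }
    return false;
}
```

The bug described in the commit corresponds to the two comparisons being inverted (`==` instead of `!=`), which would encrypt only the local traffic.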
Tomasz Grabiec
e1e8858ed1 service: Fetch and sync schema 2016-01-11 10:34:53 +01:00
Tomasz Grabiec
cdca20775f messaging_service: Introduce get_source() 2016-01-11 10:34:53 +01:00
Tomasz Grabiec
da3a453003 service: Add GET_SCHEMA_VERSION remote call
The verb belongs to a separate client to avoid potential deadlocks
should the throttling on connection level be introduced in the
future. Another reason is to reduce latency for version requests as it
can potentially block many requests.
2016-01-11 10:34:52 +01:00
Asias He
2345cda42f messaging_service: Rename shard_id to msg_addr
Using shard_id as the destination of the messaging_service is confusing,
since shard_id is used in the context of cpu id.
Message-Id: <8c9ef193dc000ef06f8879e6a01df65cf24635d8.1452155241.git.asias@scylladb.com>
2016-01-07 10:36:35 +02:00
Nadav Har'El
f5b2135a80 repair: repair_checksum_range message
This patch adds a new type of message, "REPAIR_CHECKSUM_RANGE" to scylla's
"messaging_service" RPC mechanism, for the use of repair:

With this message the repair's master host tells a slave host to calculate
the checksum of a column-family's partitions in a given token range, and
return that checksum.

The implementation of this message uses the checksum_range() function
defined in the previous patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2016-01-05 15:38:40 +02:00
Gleb Natapov
fae98f5d67 Revert "messaging_service: wait for outstanding requests"
This reverts commit 9661d8936b.
Message-Id: <1450690729-22551-3-git-send-email-gleb@scylladb.com>
2016-01-03 16:06:39 +02:00
Gleb Natapov
de0771f1d1 Revert "messaging_service: restore indentation"
This reverts commit dcbba2303e.
Message-Id: <1450690729-22551-2-git-send-email-gleb@scylladb.com>
2016-01-03 16:06:38 +02:00
Asias He
1b3d2dee8f streaming: Drop src_cpu_id parameter
Now that we can get the src_cpu_id from rpc::client_info,
there is no need to pass it as a verb parameter.
2015-12-31 11:25:09 +01:00
Asias He
3ae21e06b5 messaging_service: Add src_cpu_id to CLIENT_ID verb
It is useful for figuring out which shard of the sender to send
messages back to.
2015-12-31 11:25:09 +01:00
Asias He
22d0525bc0 streaming: Get rid of the _from_ parameter
Get this from cinfo.retrieve_auxiliary inside the rpc handler.
2015-12-31 11:25:08 +01:00
Asias He
89b79d44de streaming: Get rid of the _connecting_ parameter
messaging_service will use the private ip address automatically to connect to a
peer node if possible. There is no need for the upper level like
streaming to worry about it. Dropping it simplifies things a bit.
2015-12-31 11:25:08 +01:00
Gleb Natapov
2bcfe02ee6 messaging: remove unused verbs 2015-12-30 15:06:35 +01:00
Gleb Natapov
f0e8b8805c messaging: constify some handlers 2015-12-30 15:06:35 +01:00
Calle Wilund
d1badfa108 messaging_service: Optionally create SSL endpoints
* Accept port + credentials + option for what to encrypt
* If set, enable a SSL listener at ssl_port
* Check outgoing connections by IP to determine if
  they should go to SSL/normal endpoint

Requires seastar RPC patch

Note: currently, the connections created by messaging service
do _not_ do certificate name verification. While DNS lookup
is probably not that expensive here, I am not 100% sure it is
the desired behaviour.
Normal trust is however verified.
2015-12-28 10:10:35 +00:00
Avi Kivity
827a4d0010 Merge "streaming: Invalidate cache upon receiving of stream" from Asias
"When a node gain or regain responsibility for certain token ranges, streaming
will be performed. Upon receiving the stream data, the row cache
is invalidated for that range.

Refs #484."
2015-12-28 10:24:46 +02:00
Avi Kivity
2b22772e3c Merge "Introduce keep alive timer for stream_session" from Asias
"Fixes stream_session hangs:

1) if the sending node is gone, the receiving peer will wait forever
2) if the node which should send COMPLETE_MESSAGE to the peer node is gone,
   the peer node will wait forever"
2015-12-27 16:56:32 +02:00
Avi Kivity
f3980f1fad Merge seastar upstream
* seastar 51154f7...8b2171e (9):
  > memcached: avoid a collision of an expiration with time_point(-1).
  > tutorial: minor spelling corrections etc.
  > tutorial: expand semaphores section
  > Merge "Use steady_clock where monotonic clock is required" from Vlad
  > Merge "TLS fixes + RPC adaption" from Calle
  > do_with() optimization
  > tutorial: explain limiting parallelism using semaphores
  > submit_io: change pending flushes criteria
  > apps: remove defunct apps/seastar

Adjust code to use steady_clock instead of high_resolution_clock.
2015-12-27 14:40:20 +02:00
Asias He
f527e07be6 streaming: Get stream_session in STREAM_MUTATION handler
Get the from address from cinfo. It is needed to figure out which stream
session this mutation belongs to, since we need to update the keep
alive timer for this stream session.
2015-12-24 20:34:44 +08:00
Asias He
bd276fd087 streaming: Increase retry timeout
Currently, if the node is actually down, although the streaming_timeout
is 10 seconds, the sending of the verb will return rpc_closed error
immediately, so we give up in 20 * 5 = 100 seconds. After this change,
we give up in 10 * 30 = 300 seconds at least, and 10 * (30 + 30) = 600
seconds at most.
2015-12-24 20:34:44 +08:00
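
The arithmetic in bd276fd087 can be checked with a few constants. The sketch below assumes one possible reading of the message: a fixed number of attempts, a fixed pause after each attempt, and each attempt itself taking anywhere between zero seconds (immediate rpc_closed error) and a per-attempt timeout. The `retry_policy` struct and both helper functions are hypothetical.

```cpp
#include <cassert>

// Assumed reading: `attempts` tries, a fixed `wait_s` pause after each one,
// and each try itself taking between 0 and `timeout_s` seconds.
struct retry_policy {
    int attempts;
    int wait_s;
    int timeout_s;
};

// Lower bound: every attempt fails immediately, so only the pauses count.
constexpr int min_give_up_s(retry_policy p) { return p.attempts * p.wait_s; }

// Upper bound: every attempt runs into its timeout before the pause starts.
constexpr int max_give_up_s(retry_policy p) { return p.attempts * (p.wait_s + p.timeout_s); }

static_assert(min_give_up_s({20, 5, 0}) == 100, "before the change");
static_assert(min_give_up_s({10, 30, 30}) == 300, "after the change, lower bound");
static_assert(max_give_up_s({10, 30, 30}) == 600, "after the change, upper bound");
```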
Asias He
eaea09ee71 streaming: Retransmit COMPLETE_MESSAGE message
It is a one-way message at the moment. If a COMPLETE_MESSAGE is lost, no
one will close the session. The first step to fix the issue is to try to
retransmit the message.
2015-12-24 20:34:44 +08:00
Asias He
2d32195c32 streaming: Invalidate cache upon receiving of stream
When a node gains or regains responsibility for certain token ranges,
streaming is performed. Upon receiving the stream data, the
row cache is invalidated for that range.

Refs #484.
2015-12-21 14:44:13 +08:00
Paweł Dziepak
dcbba2303e messaging_service: restore indentation
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-12-17 14:06:41 +01:00
Paweł Dziepak
9661d8936b messaging_service: wait for outstanding requests
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-12-17 14:06:41 +01:00
Avi Kivity
b34a1f6a84 Merge "Preliminary changes for handling of schema changes" from Tomasz
"I extracted some less controversial changes on which the schema changes series will depend
 to somewhat reduce the noise in the main series."
2015-12-16 19:08:22 +02:00
Tomasz Grabiec
872bfadb3d messaging_service: Remove unused parameters from send_migration_request() 2015-12-16 18:06:54 +01:00
Avi Kivity
e27a5d97f6 Merge "background mutation throttling" from Gleb
Fixes the case where background activity needed to complete CL=ONE writes
is queued up in the storage proxy, and the client adds new work faster
than it can be cleared.
2015-12-16 18:08:12 +02:00
Gleb Natapov
de63b3a824 storage_proxy: provide timeout for send_mutation verb
Providing timeout for send_mutation verb allows rpc to drop packets that
sit in the outgoing queue for too long.
2015-12-16 10:13:46 +02:00
Nadav Har'El
63c0906b16 messaging_service: drop unnecessary explicit templates
The previous patch added message_service read()/write() support for all
types which know how to serialize themselves through our "old" serialization
API (serialize()/deserialize()/serialized_size()).

So we no longer need the almost 200 lines of repetitive code in
messaging_service.{cc,hh} which defined these read/write templates
separately for a dozen different types using their *serialize() methods.
We also no longer need the helper functions read_gms()/write_gms(), which
are basically the same code as that in the template functions added in the
previous patch.

Compilation is not significantly slowed down by this patch, because it
merely replaces a dozen templates by one template that covers them all -
it does not add new template complexity, and these templates are anyway
instantiated only in messaging_service.cc (other code only calls specific
functions defined in messaging_service.cc, and does not use these templates).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2015-12-15 19:07:05 +02:00
Nadav Har'El
438f6b79f7 messaging_service: allow any self-serializing type
Currently, messaging_service only supports sending types for which a read/
write function has been explicitly implemented in messaging_service.hh/cc.

Some types already have serialization/deserialization methods inside them,
and those could have been used for the serialization without having to write
new functions for each of these types. Many of these types were already
supported explicitly in messaging_service.{cc,hh}, but some were forgotten -
for example, dht::token.

So this patch adds a default implementation of messaging_service write()/read()
which will work for any type which has these serialization methods.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2015-12-15 19:07:05 +02:00
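
The idea behind 438f6b79f7 (one generic read/write pair that works for every self-serializing type) can be sketched as below. This is an illustration under assumptions, not the messaging_service code: the exact signatures of `serialize()`, `deserialize()` and `serialized_size()` are guessed, `write_any()`/`read_any()` and the `bytes` alias are hypothetical names, and `token` is a toy stand-in for a type such as dht::token.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using bytes = std::vector<char>;

// A toy type that knows how to serialize itself.
struct token {
    std::int64_t value;

    std::size_t serialized_size() const { return sizeof(value); }
    void serialize(char* out) const { std::memcpy(out, &value, sizeof(value)); }
    static token deserialize(const char* in) {
        token t{};
        std::memcpy(&t.value, in, sizeof(t.value));
        return t;
    }
};

// Generic writer: takes part in overload resolution only for types that
// provide serialized_size() and serialize(char*).
template <typename T>
auto write_any(bytes& out, const T& v)
        -> decltype(v.serialized_size(), v.serialize(static_cast<char*>(nullptr)), void()) {
    const std::size_t old_size = out.size();
    out.resize(old_size + v.serialized_size());
    v.serialize(out.data() + old_size);
}

// Generic reader: works for any type with a static deserialize(const char*).
template <typename T>
auto read_any(const bytes& in, std::size_t offset = 0)
        -> decltype(T::deserialize(static_cast<const char*>(nullptr))) {
    return T::deserialize(in.data() + offset);
}
```

With this shape, supporting a new message type requires no change to the messaging layer as long as the type carries the three methods.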
Asias He
66938ac129 streaming: Add retransmit logic for streaming verbs
Retransmit streaming related verbs and give up in 5 minutes.

Tested with:

  lein test :only cassandra.batch-test/batch-halves-decommission

Fixes #568.
2015-12-09 15:12:36 +02:00
Tomasz Grabiec
d64db98943 query: Convert serialization of query::result to use db::serializer<>
That's what we're trying to standardize on.

This patch also fixes an issue with current query::result::serialize()
not being const-qualified, because it modifies the
buffer. messaging_service did a const cast to work this around, which
is not safe.
2015-12-03 09:19:11 +01:00
Gleb Natapov
8c02ad0e9e messaging: log connection dropping event 2015-11-30 19:42:04 +02:00
Gleb Natapov
33e5097090 messaging: do not kill live connection needlessly
Messaging service closes the connection in the rpc call continuation on
closed_error, but the code runs for each outstanding rpc call on the
connection, so the first continuation may destroy a genuinely closed connection,
then the connection is reopened and the next continuation that handles the previous
error kills the now perfectly healthy connection. Fix this by closing the
connection only if it is in the error state.
2015-11-23 20:16:28 +02:00
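
The guard described in 33e5097090 can be sketched as follows. The `rpc_client` and `client_registry` types and the `on_closed_error()` function are hypothetical stand-ins; the only point illustrated is that a client is dropped when it is itself in the error state, not merely because some earlier call on the same peer failed.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an RPC client; only the error flag matters here.
struct rpc_client {
    bool in_error = false;
    bool error() const { return in_error; }
};

struct client_registry {
    std::map<std::string, std::shared_ptr<rpc_client>> clients;

    // Called from the continuation of every failed call on a connection.
    // Removes the client only if the *current* client for that peer is
    // still in the error state; a freshly reopened client is left alone.
    bool on_closed_error(const std::string& peer) {
        auto it = clients.find(peer);
        if (it != clients.end() && it->second->error()) {
            clients.erase(it);
            return true;
        }
        return false;
    }
};
```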
Gleb Natapov
eb220507ce storage_proxy: use correct endpoint address for mutation acks processing
The write handler keeps track of all endpoints that have not yet acked the mutation
verb. It uses the broadcast address as an endpoint id, but if the local address
is different from the broadcast address, acknowledgements for local endpoints
will come from a different address, so the socket address cannot be used as
an acknowledgement source. Origin solves this by sending "from" in each
message, which looks like overhead. Solve this instead by providing the endpoint's
broadcast address in rpc client_info and using that.
2015-11-16 10:29:47 +01:00
Amnon Heiman
d5d0653210 messaging_service: Add a function that goes over all the server stats
The API needs to get the stats from the rpc server, that is hidden from the
messaging service API.

This patch adds a foreach function that goes over all the server stats
without exposing the server implementation.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2015-11-02 16:15:52 +02:00
Asias He
2c8867c348 config: Enable storage_port option 2015-10-29 08:58:41 +08:00
Vlad Zolotarov
d8de1099eb message::messaging_service: introduce _preferred_ip_cache
This map will contain the (internal) IPs corresponding to specific Nodes.
The mapping is also stored in the system.peers table.

So, instead of always connecting to external IP messaging_service::get_rpc_client()
will query _preferred_ip_cache and only if there is no entry for a given
Node will connect to the external IP.

We will call for init_local_preferred_ip_cache() at the end of system table init.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Improved the _preferred_ip_cache description.
   - Code styling issues.

New in v3:
   - Make get_internal_ip() public.
   - get_rpc_client(): restore the get_preferred_ip() usage dropped
     in v2 by mistake during rebase.
2015-10-26 14:09:26 +02:00
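
The lookup-with-fallback behaviour described in d8de1099eb can be sketched like this. The class below is a hypothetical simplification that uses strings for addresses; only the name `get_preferred_ip()` comes from the commit message, and its real signature is not shown there.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical sketch: maps a node's external address to its internal one.
class preferred_ip_cache {
    std::unordered_map<std::string, std::string> _cache;
public:
    void set(const std::string& external_ip, const std::string& internal_ip) {
        _cache[external_ip] = internal_ip;
    }

    // Returns the cached internal address, or the external address itself
    // when no entry exists for that node.
    std::string get_preferred_ip(const std::string& external_ip) const {
        auto it = _cache.find(external_ip);
        return it == _cache.end() ? external_ip : it->second;
    }
};
```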
Vlad Zolotarov
f896f9a908 message::messaging_service: added remove_rpc_client(shard_id)
This function erases shard_info objects from all _clients maps.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Use remove_rpc_client_one() instead of direct map::erase().
2015-10-26 14:09:26 +02:00
Vlad Zolotarov
e9789dd68c message::messaging_service: fixes in rpc_protocol_client_wrapper shut down
   - Ensure messaging_service::stop() blocks until all rpc_protocol::client::stop()
     are over.
   - Remove the async code from rpc_protocol_client_wrapper destructor - call
     for stop() everywhere it's needed instead. Ensure that
     rpc_protocol_client_wrapper is always "stopped" when its destructor is called.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v3:
   - Code style fixes.
   - Killed rpc_protocol_client_wrapper::_stopped.
   - Killed rpc_protocol_client_wrapper::~rpc_protocol_client_wrapper().
   - Use std::move() for saving shared pointer before
     erasing the entry from _clients in
     remove_rpc_client_one() in
     order to avoid extra ref count bumping.
2015-10-26 14:09:26 +02:00
Vlad Zolotarov
842b13325d message::messaging_service: make _clients to be std::array
This makes code cleaner. Also it would allow less changes
if we decide to increase _clients size in the future.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-26 14:09:26 +02:00
Asias He
1965e8751b messaging_service: Add REPLICATION_FINISHED verb
It is used to send replication finished message by storage_service when
removing a node from a cluster.
2015-10-21 16:11:33 +08:00
Tomasz Grabiec
19d7d30e67 Replace references to 'urchin' with 'scylla' 2015-10-19 11:08:05 +03:00
Calle Wilund
37131fcc05 messaging_service: TRUNCATE verb methods 2015-09-30 09:09:42 +02:00
Gleb Natapov
140641689b messaging: do not use rpc client in error state
Using rpc client in error state will result in a message loss. Try to
reconnect instead.
2015-09-24 17:50:51 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Asias He
eead846712 messaging_service: Make gossip use standalone tcp connection
For unknown reasons, I saw gossip syn messages get rpc timeout errors when
the cluster is under heavy cassandra-stress load.

Using a standalone tcp connection seems to fix the issue.
2015-09-19 10:17:42 +03:00
Asias He
0f5df4476c gossip: Make the timeout longer for gossip syn and echo message
When the cluster is under heavy load, the time to exchange a gossip
message might take longer than 1s. Let's make the timeout longer for now
before we can solve the large delay of gossip message issue.
2015-09-17 11:35:31 +03:00
Asias He
1e7d883ae1 messaging_service: Fix shard_id
We should ignore cpu_id in the equality and less-than operators for shard_id as well.

In a 3-node cluster where each node has 4 CPUs, on the first node:

Before:
[fedora@ip-172-30-0-99 ~]$ netstat -nt|grep 100\:7000
tcp        0      0 172.30.0.99:36998       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:36772       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:40125       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:60182       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:38013       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:51997       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:56532       172.30.0.100:7000 ESTABLISHED

After:
[fedora@ip-172-30-0-99 ~]$ netstat -nt|grep 100\:7000
tcp        0      0 172.30.0.99:45661       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:57395       172.30.0.100:7000 ESTABLISHED
tcp        0      0 172.30.0.99:37807       172.30.0.100:7000 ESTABLISHED
tcp        0     36 172.30.0.99:50567       172.30.0.100:7000 ESTABLISHED

Each shard of a node is supposed to have 1 connection to a peer node,
thus each node will have #cpu connections to a peer node.

With this patch, the cluster is much more stable than before on AWS. So
far, I see no timeout in the gossip syn message exchange.
2015-09-16 08:44:47 +02:00
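
One way to read 1e7d883ae1 is that the client table is keyed by shard_id, and that the comparison operators must look at the peer address only. The sketch below works under that assumption: the `shard_id` fields, the operators and `client_map` are simplified, hypothetical stand-ins (an `int` replaces the real client object).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

struct shard_id {
    std::string addr;
    std::uint32_t cpu_id;
};

// Comparisons look at the address only, so keys that differ just in cpu_id
// are treated as the same key.
inline bool operator==(const shard_id& a, const shard_id& b) { return a.addr == b.addr; }
inline bool operator<(const shard_id& a, const shard_id& b) { return a.addr < b.addr; }

// Per-shard table of open connections (the int stands in for a client).
using client_map = std::map<shard_id, int>;
```

Because each local shard then holds at most one entry per peer address, a 4-CPU node ends up with 4 connections to a given peer, which matches the "After" netstat output in the commit message.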