Because the cql types deal with a raw inet address and not the gms container, we need
a method to fetch it.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
For all replicated maps:
- Keep the shadow copy on CPU0. If at the end of a gossiper task execution
it differs from the current contents of the map, replicate the map on all shards
and update the shadow copy on CPU0.
- Ensure that gossiper task is restarted 1 second AFTER the current iteration
is over and not 1 second after it started.
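A minimal sketch of the two points above, in plain C++ without Seastar. The names replicated_map, per_shard, replicate_if_changed and next_run_ms are hypothetical and only illustrate the idea; the real patch replicates across shards asynchronously.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a replicated gossiper map (endpoint -> state).
using endpoint_map = std::map<std::string, int>;

// Models one map that exists on every shard, plus the shadow copy kept on CPU0.
struct replicated_map {
    std::vector<endpoint_map> per_shard;  // per_shard[0] is the live map on CPU0
    endpoint_map shadow;                  // contents last pushed to all shards

    explicit replicated_map(std::size_t shards) : per_shard(shards) {
        assert(shards > 0);
    }

    // Called at the end of a gossiper task iteration on CPU0.
    // Returns true if a replication round was needed.
    bool replicate_if_changed() {
        const endpoint_map& current = per_shard[0];
        if (current == shadow) {
            return false;  // nothing changed, skip the cross-shard copy
        }
        for (std::size_t i = 1; i < per_shard.size(); ++i) {
            per_shard[i] = current;
        }
        shadow = current;
        return true;
    }
};

// Next start time when the timer is re-armed after the iteration finishes,
// instead of on a fixed period measured from the iteration's start.
inline long next_run_ms(long iteration_end_ms, long interval_ms = 1000) {
    return iteration_end_ms + interval_ms;
}
```

Comparing against the shadow copy keeps the common case (no change) free of cross-shard traffic.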
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Rename: _live_endpoints_shadow -> _shadow_live_endpoints
- s/inly/only/
- Clean up the things that don't belong to this patch.
- Replicate _live_endpoints as well
- gossiper: copy _shadow_endpoint_state_map
Code calls failure_detector::is_alive on all cpus, so we start
failure_detector on all cpus. However, the internal data of failure_detector
is modified on cpu zero and it is not replicated to non-zero cpus.
This is fine since the user of failure_detector (the gossiper) accesses
it on cpu0 only.
We do not care about the order of the tokens.
Also, in token_metadata, we use unordered_set for tokens as well, e.g.
update_normal_tokens. Unify the usage.
The failure detector runs on CPU 0. For external users, this is an
implementation detail that is irrelevant.
This adds wrapper functions for the functions defined in
FailureDetectorMBean, which map each request to the correct CPU.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
The current gossiper implementation runs the gossiper on CPU 0. This is
irrelevant to users of the gossiper, who may want to query it.
This adds a globally available API for get_unreachable_members,
get_live_members, get_endpoint_downtime, get_current_generation_number,
unsafe_assassinate_endpoint and assassinate_endpoint. Each function returns a
future and runs on the correct CPU.
The target user is the API, which would use these functions to expose the
gossiper.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
In db:config, "localhost" is used as the default IP address for
listen_address, rpc_address. We do not have a name resolver at the
moment.
Add a minimal resolver for localhost for now.
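A sketch of what such a minimal resolver could look like. The function name resolve_minimal is hypothetical, and passing every other string through unchanged is an assumption; the patch itself may behave differently for non-localhost names.

```cpp
#include <string>

// Hypothetical helper: maps a configured host string to an IPv4 address string.
// Only "localhost" is resolved; any other input is assumed to already be an
// address and is returned unchanged.
inline std::string resolve_minimal(const std::string& host) {
    if (host == "localhost") {
        return "127.0.0.1";
    }
    return host;
}
```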
The msg parameter is missing.
Fix a bug where Node B does not recognize Node A.
Node A
$ ./gossip --seed 127.0.0.1 --listen-address 127.0.0.1
Node B
$ ./gossip --seed 127.0.0.1 --listen-address 127.0.0.2
The issue is that in gossiper::mark_alive(), the parameter for the ECHO
message is wrong. After commit 1a8c4b75f5 (message: do not erase client's
rpc call type), we deduce the rpc handler from the parameters supplied
to messaging_service::send_message(), so we use the wrong handler for
the ECHO message. The message never gets a reply, and thus we never mark the
peer node alive.
empty_msg was used as a placeholder when messaging_service did not handle the void
return type correctly. Since we support it now, drop it.
send_message() and send_message_oneway() are almost identical, so implement
the latter in terms of the former. The patch also fixes send_message() to
work properly with MsgIn = void.
Reviewed-by: Asias He <asias@cloudius-systems.com>
With the next patch, "gossip: Add storage_service_value_factory helper",
in this series, the build fails with:
[asias@hjpc urchin]$ ninja-build
[8/10] LINK build/release/seastar
build/release/gms/gossiper.o: In function
`gms::versioned_value::versioned_value_factory::removing_nonlocal(utils::UUID
const&)':
/home/asias/src/cloudius-systems/urchin/./gms/versioned_value.hh:201:
undefined reference to `gms::versioned_value::DELIMITER_STR'
build/release/gms/gossiper.o: In function
`gms::versioned_value::versioned_value_factory::removal_coordinator(utils::UUID
const&)':
/home/asias/src/cloudius-systems/urchin/./gms/versioned_value.hh:211:
undefined reference to `gms::versioned_value::DELIMITER_STR'
build/release/gms/gossiper.o: In function
`gms::versioned_value::versioned_value_factory::removed_nonlocal(utils::UUID
const&, long)':
/home/asias/src/cloudius-systems/urchin/./gms/versioned_value.hh:206:
undefined reference to `gms::versioned_value::DELIMITER_STR'
/home/asias/src/cloudius-systems/urchin/./gms/versioned_value.hh:206:
undefined reference to `gms::versioned_value::DELIMITER_STR'
collect2: error: ld returned 1 exit status
Fix by defining the symbol in gms/versioned_value.cc.
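The general pattern behind this kind of link error can be sketched as follows. The demo namespace, the join() helper, the array type and the ":" value are assumptions for illustration; the actual declaration in gms/versioned_value.hh may differ.

```cpp
#include <string>

namespace demo {

// Header part: the static member is only declared here.
struct versioned_value {
    static const char DELIMITER_STR[];  // declaration, no storage yet

    // Hypothetical helper that odr-uses DELIMITER_STR, so the linker needs
    // exactly one definition of the symbol somewhere in the program.
    static std::string join(const std::string& a, const std::string& b) {
        return a + DELIMITER_STR + b;
    }
};

// Source-file part: this line would live in the .cc file. Without it, every
// object file that uses join() reports an undefined reference at link time.
const char versioned_value::DELIMITER_STR[] = ":";

}  // namespace demo
```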
There is one gossiper instance per node and it runs on cpu0 only. We cannot
guarantee there will always be a core-to-core tcp connection within
the messaging service, so the messaging service needs to listen on all cpus.
When a remote node connects to the local node with a connection bound to a cpu
other than cpu0, we need to forward the message to cpu0.
Some subscribers are allocated statically, so it is a chore to make
shared pointers from them. And since registered subscribers have to be
unregistered before being destroyed anyway, there is no lifetime issue here
that requires the use of a smart pointer.
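A rough sketch of the non-owning registration this describes. The subscriber, counting_subscriber and registry types and their method names are hypothetical, not the gossiper's actual interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical subscriber interface.
struct subscriber {
    virtual void on_change(int value) = 0;
    virtual ~subscriber() = default;
};

// Example subscriber that counts notifications; it can be allocated statically.
struct counting_subscriber : subscriber {
    int calls = 0;
    void on_change(int) override { ++calls; }
};

// Registry holding raw, non-owning pointers. The caller must unregister a
// subscriber before destroying it, so no shared ownership is required.
class registry {
    std::vector<subscriber*> _subscribers;
public:
    void register_subscriber(subscriber* s) {
        assert(s != nullptr);
        _subscribers.push_back(s);
    }
    void unregister_subscriber(subscriber* s) {
        _subscribers.erase(
            std::remove(_subscribers.begin(), _subscribers.end(), s),
            _subscribers.end());
    }
    void notify(int value) {
        for (subscriber* s : _subscribers) {
            s->on_change(value);
        }
    }
    std::size_t size() const { return _subscribers.size(); }
};
```

A statically allocated counting_subscriber can be passed by address to register_subscriber() and removed with unregister_subscriber() before it goes out of scope.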