Commit Graph

58 Commits

Author SHA1 Message Date
Kefu Chai
ec5f0fccce gms: remove unused operator<<
since we've switched almost all callers of the operator<< to {fmt},
let's drop the unused operator<<:s.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-18 15:55:22 +08:00
Benny Halevy
86f1fcdcdd gms: endpoint_state: add getters for host_id, dc_rack, and tokens
Allow getting metadata from the endpoint_state based
on the respective application states instead of going
through the gossiper.

To be used by the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-04-14 15:16:58 +03:00
Kefu Chai
ff43628b44 gms: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18194
2024-04-05 08:48:17 +03:00
Benny Halevy
a9fb0cf3dc gms: endpoint_state: add get_host_id
A simpler getter to get the HOST_ID application state
from the endpoint_state.

Return a null host_id if the application state is not found.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-03-10 15:19:51 +02:00
Kefu Chai
f61f6c27e3 gms: add formatter for gms::endpoint_state
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define a formatter for gms::endpoint_state, and
change update the callers of `operator<<` to use `fmt::print()`.
but we cannot drop `operator<<` yet, as we are still using the
templated operator<< and templated fmt::formatter to print containers
in scylla and in seastar -- they are still using `operator<<`
under the hood.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16705
2024-01-10 09:16:23 +02:00
Kefu Chai
7e84e03f52 gms: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

because the removal of `#include "unimplemented.hh"`,
`service/migration_manager.cc` misses the definition of
`unimplemented::cause::VALIDATION`, so include the header where it is
used.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16654
2024-01-05 13:37:08 +02:00
Benny Halevy
cdd5605d81 gms: endpoint_state: change application_state_map to std::unordered_map
State changes are processed as a batch and
there is no reason to maintain them as an ordered map.
Instead, use a std::unordered_map that is more efficient.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:34 +02:00
Benny Halevy
5abf556399 gms: endpoint_state: define application_state_map
Have a central definition for the map held
in the endpoint_state (before changing it to
std::unordered_map).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:34 +02:00
Benny Halevy
d00e49a1bb gossiper: keep and serve shared endpoint_state_ptr in map
This commit changes the interface to
using endpoint_state_ptr = lw_shared_ptr<const endpoint_state>
so that users can get a snapshot of the endpoint_state
that they must not modify in-place anyhow.
While internally, gossiper still has the legacy helpers
to manage the endpoint_state.

Fixes scylladb/scylladb#14799

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-31 09:34:36 +03:00
Benny Halevy
75d1dd3a76 endpoint_state: get rid of _is_alive member and methods
Now that gossiper bases its is_alive status on _live_endpoints.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 12:06:45 +03:00
Benny Halevy
97061cc3b8 endpoint_state: use gossiper::is_alive externally
Before we remove endpoint_state:_is_alive to rely
solely on gossipper::_live_endpoints.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 09:06:09 +03:00
Benny Halevy
f74d154fe3 gossiper: use permit_id to serialize state changes while preventing deadlocks
Pass permit_id to subscribers when we acquire one
via lock_endpoint.  The subscribers then pass it back to
gossiper for paths that acquire lock_endpoint for
the same endpoint, to detect nested locks when the endpoint
is locked with the same permit_id.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-01 17:41:57 +03:00
Benny Halevy
4cdad8bc8b gms: heart_beat_state: use generation_type and version_type
Define default constructor as heart_beat_state(gms::generation_type(0))

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:48:01 +03:00
Benny Halevy
c5d819ce60 gms: versioned_value: make members private
and provide accessor functions to get them.

1. So they can't be modified by mistake, as the versioned value is
   immutable. A new value must have a higher version.
2. Before making the version a strong gms::version_type.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
fff7ef1fc2 treewide: reduce boost headers usage in scylla header files
`dev-headers` target is also ensured to build successfully.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-05-20 01:33:18 +03:00
Benny Halevy
126e486fde gms/endpoint_state: mark methods using get_status noexcept
Now that get_status returns string_view, just compare it with a const char*
rather than making a sstring out of it, and consequently, can be marked noexcept.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-11-01 16:46:18 +02:00
Benny Halevy
6b9191b6c2 gms/endpoint_state: get_status: return string_view and make noexcept
get_status doesn't need to allocate a sstring, it can just
return a std::string_view to the status string, if found.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-11-01 16:46:18 +02:00
Benny Halevy
232c665bab gms/endpoint_state: mark get_application_state_ptr and is_cql_ready noexcept
Although std::map::find is not guaranteed to be noexcept
it depends on the comperator used and in this case comparing application_state
is noexcept.  Therefore, we can safely mark get_application_state_ptr noexcept.

is_cql_ready depends on get_application_state_ptr and otherwise
handles an exceptions boost::lexical_cast so it can be marked
noexcept as well.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-11-01 16:46:18 +02:00
Benny Halevy
5d8e2c038b gms/endpoint_state: mark trivial methods noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-11-01 16:46:18 +02:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Asias He
2737654c75 gms: Add endpoint_state::is_cql_ready
Retrun if the endpoint_state has the RPC_READY application_state.
2018-12-10 19:16:44 +08:00
Tomasz Grabiec
c4ec81e126 gms/gossiper: Always override states from older generations
Application states of each node are versioned per-node with a pair of
generation number (more significant) and value version. Generation
number uniquely identifies the life time of a scylla
process. Generation number changes after restart. Value versions start
from 0 on each restart. When a node gets updates for application
states, it merges them with its view on given node. Value updates with
older versions are ignored.

Gossiper processes updates only on shard 0, and replicates value
updates to other shards. When it sees a value with a new generation,
it correclty forgets all previous values. However, non-zero shards
don't forget values from previous generations. As a result,
replication will fail to override the values on non-zero shards when
generation number changes until their value version exceeds the
version prior to the restart.

This will result in incorrect STATUS for non-seed nodes on non-zero
shards.  When restarting a non-seed node, it will do a shadow gossip
round before setting its STATUS to NORMAL. In the shadow round it will
learn from other nodes about itself, and set its STATUS to shutdown on
all shards with a high value version. Later, when it sets its status
to NORMAL, it will override it only on shard 0, because on other
shards the version of STATUS is higher.

This will cause CQL truncate to skip current node if the coordinator
runs on non-zero shards.

The fix is to override the entries on remote shards in the same way we
do on shard 0. All updates to endpoint states should be already
serialized on shard 0, and remote shards should see them in the same
order.

Introduced in 2d5fb9d

Fixes #3798
Fixes #3694
2018-10-04 12:47:27 +02:00
Asias He
059ec89ad1 gms: Add is_normal helper to endpoint_state
It is faster than gossiper::is_normal because it avoids to do search in
the std::map<application_state, versioned_value>. It is useful for the
code in the fast path which needs to query if a node is in NORMAL
status.

Fixes #3500

Message-Id: <42db91fa4108f9f4fcf94fed3ec403ccf35d15e9.1528354644.git.asias@scylladb.com>
2018-06-10 19:21:03 +03:00
Tomasz Grabiec
7323fe76db gossiper: Replicate endpoint_state::is_alive()
Broken in f570e41d18.

Not replicating this may cause coordinator to treat a node which is
down as alive, or vice verse.

Fixes regression in dtest:

  consistency_test.py:TestAvailability.test_simple_strategy

which was expected to get "unavailable" exception but it was getting a
timeout.

Message-Id: <1510666967-1288-1-git-send-email-tgrabiec@scylladb.com>
2017-11-14 15:58:00 +02:00
Tomasz Grabiec
2d5fb9d109 gms/gossiper: Replicate changes incrementally to other shards
storage_service depends on endpoint states to be replicated to all
shards before token metadata is replicated. Currently this is taken
care of by storage_service::replicate_to_all_cores(), invoked from
storage_service's change listener. It copies whole endpoint state map,
which is expensive in large clusters. It's more efficient to replicate
only incremental changes, and only once, rather than for each
application state.
2017-10-18 08:49:53 +02:00
Tomasz Grabiec
28c9609370 gms/gossiper: Document validity of endpoint_state properties 2017-10-18 08:49:53 +02:00
Tomasz Grabiec
f20a805eca gms/gossiper: Avoid copies in endpoint_state::add_application_state() 2017-10-18 08:49:52 +02:00
Duarte Nunes
f67a553b96 gms/endpoint_state: Remove get_application_state()
It is no longer used, as all callsites have moved to
get_application_state_ptr().

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-10-11 10:02:32 +01:00
Duarte Nunes
9d5c6e0c72 gms/endpoint_state: Avoid copies in is_shutdown()
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-10-11 10:02:32 +01:00
Tomasz Grabiec
66a15ccd18 gms/gossiper: Introduce copy-less endpoint_state::get_application_state_ptr()
Message-Id: <1507642411-28680-3-git-send-email-tgrabiec@scylladb.com>
2017-10-10 18:27:43 +01:00
Duarte Nunes
bc976b4773 endpoint_state: const-qualify functions
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-10-10 13:30:28 +01:00
Asias He
a36141843a gossip: Switch to seastar::lowres_system_clock
The newly added lowres_system_clock is good enough for gossip
resolution. Switch to use it.

Message-Id: <fe0e7a9ef1ea0caffaa8364afe5c78b6988613bf.1503971833.git.asias@scylladb.com>
2017-08-29 10:16:25 +03:00
Avi Kivity
171fe67a64 gms: remove unneeded #include "types.hh" 2017-08-27 15:18:57 +03:00
Asias He
ed7e6974d5 gms: Add is_shutdown helper for endpoint_state class
It will be used by streaming manager to check if a node is in shutdown
status.
2017-07-19 10:11:05 +08:00
Asias He
f0d3084c8b gossip: Switch to use system_clock
The expire time which is used to decide when to remove a node from
gossip membership is gossiped around the cluster. We switched to steady
clock in the past. In order to have a consistent time_point in all the
nodes in the cluster, we have to use wall clock. Switch to use
system_clock for gossip.

Fixes #1704
2016-09-27 16:42:13 +08:00
Duarte Nunes
b3011c9039 gossip: Rename set_heart_beat_state
...to set_heart_beat_state_and_update_timestamp in order to make it
explicit for callers the update_timestamp is also changed.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1464309023-3254-3-git-send-email-duarte@scylladb.com>
2016-05-27 09:11:39 +03:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Asias He
8098ba10b7 gossip: Drop unused serialization code
- endpoint_state
2016-01-25 11:28:29 +08:00
Amnon Heiman
ddc3fe1328 endpoint_state adds a constructor for all serialized parameters
An external deserialize function needs a constructor with all the
serialized parameters.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-01-24 12:13:01 +02:00
Avi Kivity
f3980f1fad Merge seastar upstream
* seastar 51154f7...8b2171e (9):
  > memcached: avoid a collision of an expiration with time_point(-1).
  > tutorial: minor spelling corrections etc.
  > tutorial: expand semaphores section
  > Merge "Use steady_clock where monotonic clock is required" from Vlad
  > Merge "TLS fixes + RPC adaption" from Calle
  > do_with() optimization
  > tutorial: explain limiting parallelism using semaphores
  > submit_io: change pending flushes criteria
  > apps: remove defunct apps/seastar

Adjust code to use steady_clock instead of high_resolution_clock.
2015-12-27 14:40:20 +02:00
Asias He
fa3c84db10 gossip: Kill default constructor for versioned_value
The only reason we needed it is to make
   _application_state[key] = value
work.

With the current default constructor, we increase the version number
needlessly. To fix and to be safe, remove the default constructor
completely.
2015-12-09 12:29:15 +08:00
Asias He
615e1362da gossip: Add const version for get_heart_beat_state and friends
get_endpoint_state_map
get_application_state_map
2015-11-09 13:01:36 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Asias He
6398bb4bdc gossip: Move code from gms/endpoint_state.hh to source file 2015-07-31 10:43:40 +08:00
Asias He
0ce3d89a85 gossip: Switch to use chrono for time operation
This is a long-awaited cleanup. Gossiper code runs every second, it is
not performance sensitive, so it does not make much sense to stick to
lowres db_clock, use high_resolution_clock instead.
2015-07-28 14:59:44 +03:00
Pekka Enberg
bc4c04a2ab gms: Make endpoint_state::get_application_state() const
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Asias He
fa2aee57ac utils: Move util/serialization.hh to utils/serialization.hh
Now we will not have the ugly utils and util directories, only utils.
2015-07-21 16:12:54 +08:00
Vlad Zolotarov
1e32bdf090 gms: added missing operator==() required for endpoint_state_map comparison.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-06-09 15:18:46 +03:00