25 Commits

Author SHA1 Message Date
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Asias He
022c7e50a1 failure_detector: Fix false alarm of "Not marking nodes down due to local pause of"
The problem is that _last_interpret is initialized when the
failure_detector object is constructed. When interpret() runs for the
first time, _last_interpret therefore holds the construction time rather
than the time of a previous interpret() run, so the gap between the two
is misreported as a local pause.

Fix by initializing _last_interpret inside interpret().

[Thu Feb 18 02:40:04 2016] INFO  [shard 0] storage_service - Node 127.0.0.1 state jump to normal
[Thu Feb 18 02:40:04 2016] INFO  [shard 0] storage_service - NORMAL: node is now in normal status
[Thu Feb 18 02:40:04 2016] INFO  [shard 0] gossip - Waiting for gossip to settle before accepting client requests...
[Thu Feb 18 02:40:12 2016] INFO  [shard 0] gossip - No gossip backlog; proceeding
Starting listening for CQL clients on 127.0.0.1:9042...
[Thu Feb 18 02:40:12 2016] INFO  [shard 0] gossip - Node 127.0.0.2 is now part of the cluster
[Thu Feb 18 02:40:12 2016] INFO  [shard 0] gossip - InetAddress 127.0.0.2 is now UP
[Thu Feb 18 02:40:13 2016] INFO  [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2
[Thu Feb 18 02:40:13 2016] WARN  [shard 0] failure_detector - Not marking nodes down due to local pause of 9091 > 5000 (milliseconds)
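
The fix described above can be illustrated with a minimal sketch. This is not the actual ScyllaDB code: the class name pause_detector, its constructor, and the boolean return value are hypothetical, and only _last_interpret, interpret(), and the 5000 ms limit come from the commit message and log. The point is that the timestamp stays unset until the first interpret() call, so the first call has no earlier run to compare against and cannot report a pause.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using clk = std::chrono::steady_clock;

// Hypothetical, simplified stand-in for the failure detector's pause check.
class pause_detector {
    // Left unset at construction; filled in by the first interpret() call.
    std::optional<clk::time_point> _last_interpret;
    std::chrono::milliseconds _max_local_pause;
public:
    explicit pause_detector(std::chrono::milliseconds max_local_pause)
        : _max_local_pause(max_local_pause) {
        assert(max_local_pause.count() > 0);
    }

    // Returns true when the gap since the previous interpret() run
    // exceeds the local pause limit.
    bool interpret(clk::time_point now) {
        if (!_last_interpret) {
            // First run: nothing to compare against, so just record the time.
            _last_interpret = now;
            return false;
        }
        auto diff = now - *_last_interpret;
        _last_interpret = now;
        return diff > _max_local_pause;
    }
};
```

With this shape, a first interpret() call that happens 9 seconds after construction is not treated as a pause, while a genuine 10 second gap between two later calls still is.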
2016-02-24 19:31:14 +08:00
Asias He
ad30cf0faf failure_detector: Use a standalone logger name
Do not share logger with gossip. Sometimes, it is useful to only see one
of them.
2015-12-02 14:21:26 +08:00
Asias He
59694a8e43 failure_detector: Print versions for gossip states in gossipinfo
Backport: CASSANDRA-10330

ae4cd69 Print versions for gossip states in gossipinfo

Print the version for each gossip state, which can be useful for
diagnosing the reason for any missing states. Also, instead of just
omitting the TOKENS state, indicate whether the state was actually
present or not.
2015-12-01 17:29:25 +08:00
Asias He
224db2ba37 failure_detector: Don't mark nodes down before the max local pause interval once paused
Backport: CASSANDRA-9446

7fba3d2 Don't mark nodes down before the max local pause interval once paused
2015-12-01 17:29:25 +08:00
Asias He
51fcc48700 failure_detector: Failure detector detects and ignores local pauses
Backport: CASSANDRA-9183

4012134 Failure detector detects and ignores local pauses
2015-12-01 17:29:25 +08:00
Asias He
2022117234 failure_detector: Enable phi_convict_threshold option
Adjusts the sensitivity of the failure detector on an exponential scale.

Use as:

$ scylla --phi-convict-threshold 9

Defaults to 8.
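
To see why the threshold acts on an exponential scale, here is a minimal sketch. It assumes heartbeat inter-arrival times follow an exponential distribution, which this commit message does not state. Under that assumption the chance of a gap longer than t is exp(-t / mean), so phi = -log10 of that chance = t / (mean * ln 10), and raising the threshold by 1 makes conviction require a gap that is ten times less likely. The function names phi and should_convict are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Suspicion level for a node whose last heartbeat arrived elapsed_ms ago,
// assuming exponentially distributed inter-arrival times (an assumption).
double phi(double elapsed_ms, double mean_interval_ms) {
    assert(mean_interval_ms > 0);
    return elapsed_ms / (mean_interval_ms * std::log(10.0));
}

// Convict the node once phi exceeds the configured threshold.
// The default of 8 matches the default named in the commit message.
bool should_convict(double elapsed_ms, double mean_interval_ms,
                    double threshold = 8.0) {
    return phi(elapsed_ms, mean_interval_ms) > threshold;
}
```

With a mean heartbeat interval of 1000 ms, this sketch convicts after roughly 18.4 seconds of silence at threshold 8 and roughly 20.7 seconds at threshold 9.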
2015-11-30 11:09:36 +02:00
Asias He
db70643fe3 failure_detector: Print application_state properly
2015-11-30 11:08:40 +02:00
Asias He
36b2de10ed failure_detector: Improve FD logging when the arrival time is ignored
Backport from:

eb9c5bb Improve FD logging when the arrival time is ignored.
2015-11-27 15:31:56 +08:00
Asias He
01ee5d002a failure_detector: Remove debug print in operator<<
2015-10-28 16:13:57 +08:00
Avi Kivity
d5cf0fb2b1 Add license notices
2015-09-20 10:43:39 +03:00
Asias He
5bec8cba82 gossip: Kill one async::thread for mark_dead
We have this call chain,

  gossiper::run -> do_status_check -> interpret -> convict -> mark_dead

Since gossiper::run is executed inside a seastar thread, all of the
functions above are guaranteed to run inside a seastar thread.
2015-09-06 11:04:41 +08:00
Asias He
4390b448a2 gossip: Move _the_failure_detector to failure_detector.cc
We will kill gms/gms.cc soon.
2015-07-31 10:43:39 +08:00
Asias He
1547fa05a5 failure_detector: Simplify get_initial_value and get_max_interval
2015-07-24 19:01:49 +08:00
Asias He
64f8c6e498 failure_detector: Switch to use std::chrono::steady_clock
Instead of naked integer based time point value.
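
As a rough illustration of what this switch buys, the sketch below keeps timestamps as std::chrono::steady_clock time points and intervals as typed durations, so unit conversions are explicit rather than implied by a variable name. The helper name since is hypothetical and not taken from the ScyllaDB source.

```cpp
#include <cassert>
#include <chrono>

using clk = std::chrono::steady_clock;

// Hypothetical helper: elapsed time between two time points as a typed
// duration. The unit is part of the type, unlike a naked integer.
inline std::chrono::milliseconds since(clk::time_point last,
                                       clk::time_point now) {
    assert(now >= last); // steady_clock never goes backwards
    return std::chrono::duration_cast<std::chrono::milliseconds>(now - last);
}
```

Because steady_clock is monotonic, the computed interval is unaffected by wall-clock adjustments, which matters when intervals feed a failure detector.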
2015-07-24 18:55:21 +08:00
Asias He
73bb690b40 failure_detector: Fix now unit in report
2015-07-24 15:56:05 +08:00
Asias He
9f1dc2877e failure_detector: Fix INITIAL_VALUE_NANOS
2015-07-24 15:56:05 +08:00
Asias He
1c2f5d5997 failure_detector: Add more log printout
2015-07-24 15:56:05 +08:00
Asias He
c3b77f499b failure_detector: Enable logger
2015-07-24 15:56:04 +08:00
Asias He
26cd039005 gossip: Add is_alive helper
failure_detector::is_alive asks gossiper if a node is up or down.
2015-06-04 17:16:58 +08:00
Shlomi Livne
0ad0a02d93 Change failure_detector registration of listeners to accept a ptr
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-05-14 17:01:18 +08:00
Asias He
a800fbfe64 gossip: Set get_phi_convict_threshold to 8
It is the default value.
2015-04-23 14:55:26 +08:00
Asias He
b38dae4a2b gossip: Dump failure detector info
2015-04-20 15:49:27 +08:00
Asias He
650e69da9e gossip: Reduce header inclusion for gms/failure_detector.hh
2015-04-15 15:03:29 +08:00
Asias He
fc72506f68 gossip: Add gms/failure_detector.cc
Move code from failure_detector.hh to failure_detector.cc
2015-04-15 15:03:29 +08:00