scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 17:40:34 +00:00

Files

Nadav Har'El 79629a80cd alternator: fix "/localnodes" to not return nodes still joining

Alternator's "/localnodes" HTTP request is supposed to return the list of
nodes in the local DC to which the user can send requests.

The existing implementation incorrectly used gossiper::is_alive() to check
for which nodes to return - but "alive" nodes include nodes which are still
joining the cluster and not really usable. These nodes can remain in the
JOINING state for a long time while they are copying data, and an attempt
to send requests to them will fail.

The fix for this bug is trivial: change the call to is_alive() to a call
to is_normal().

But the hard part of this test is the testing:

1. An existing multi-node test for "/localnodes" assummed that right after
   a new node was created, it appears on "/localnodes". But after this
   patch, it may take a bit more time for the bootstrapping to complete
   and the new node to appear in /localnodes - so I had to add a retry loop.

2. I added a test that reproduces the bug fixed here, and verifies its
   fix. The test is in the multi-node topology framework. It adds an
   injection which delays the bootstrap, which leaves a new node in JOINING
   state for a long time. The test then verifies that the new node is
   alive (as checked by the REST API), but is not returned by "/localnodes".

3. The new injection for delaying the bootstrap is unfortunately not
   very pretty - I had to do it in three places because we have several
   code paths of how bootstrap works without repair, with repair, without
   Raft and with Raft - and I wanted to delay all of them.

Fixes #19694.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19725

(cherry picked from commit bac7c33313)
(deleted test for cherry-pick)
(cherry picked from commit af39675c38)

2024-07-24 11:58:16 +03:00

auth.cc

cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt

2023-05-07 17:17:36 +03:00

auth.hh

alternator,util: Move aws4-hmac-sha256 signature generator to util

2023-04-04 18:24:48 +03:00

CMakeLists.txt

build: cmake: find and link against RapidJSON

2023-03-04 13:11:25 +08:00

conditions.cc

alternator: evaluate expressions as false for stored malformed binary