Files
scylladb/utils
Avi Kivity 85db7b1caf Merge 'address_map: Use more efficient and reliable replication method' from Tomasz Grabiec
Primary issue with the old method is that each update is a separate
cross-shard call, and all later updates queue behind it. If one of the
shards has high latency for such calls, the queue may accumulate and
system will appear unresponsive for mapping changes on non-zero shards.

This happened in the field when one of the shards was overloaded with
sstables and compaction work, which caused frequent stalls which
delayed polling for ~100ms. A queue of 3k address updates
accumulated, because we update mapping on each change of gossip
states. This made bootstrap impossible because nodes couldn't
learn about the IP mapping for the bootstrapping node and streaming
failed.

To protect against that, use a more efficient method of replication
which requires a single cross-shard call to replicate all prior
updates.

It is also more reliable, if replication fails transiently for some
reason, we don't give up and fail all later updates.

Fixes #26865

Closes scylladb/scylladb#26941

* github.com:scylladb/scylladb:
  address_map: Use barrier() to wait for replication
  address_map: Use more efficient and reliable replication method
  utils: Introduce helper for replicated data structures
2025-11-23 19:15:12 +02:00
..
2025-08-31 11:37:39 +03:00
2025-01-14 07:56:39 -05:00
2025-07-08 10:38:23 +03:00
2025-04-01 00:07:28 +02:00
2025-04-01 00:07:28 +02:00
2025-09-01 14:58:21 +03:00
2025-11-20 07:29:47 +03:00
2025-09-28 04:27:33 +02:00
2025-08-28 23:31:04 +02:00
2025-01-14 07:56:39 -05:00
2025-01-14 07:56:39 -05:00