Currently, when attempting to send a hint, we choose its recipients in
one of two ways:

- If the original destination is a natural endpoint of the hint, we
  send the hint only to that node,
- Otherwise, we send the hint to all current replicas of the mutation.

There is a problem when we decommission a node: while data is streamed
away from that node, it is still considered to be a natural endpoint of
the data that it used to own. Because of that, a hint may be sent
directly to the leaving node after streaming has already passed over the
affected range, so streaming misses it and the hint is effectively
discarded.

As sending the hint _only_ to the leaving replica is a rather bad idea,
send the hint to all replicas also in the case when the original
destination of the hint is leaving.

Note that this is a conservative fix written only with the
decommission + vnode-based keyspaces combo in mind. In general, such
"data loss" can occur in other situations where the replica set is
changing and we go through a streaming phase, i.e. other topology
operations in case of vnodes and tablet load balancing. However, the
consistency guarantees of hinted handoff in the face of topology changes
are not defined, and it is not clear what they should be, or whether
there should be any at all. The picture is further complicated by the
fact that hints are used by materialized views, and sending view updates
to more replicas than necessary can introduce inconsistencies in the
form of "ghost rows".

This fix was developed in response to a failing test which checks the
hint replay + decommission scenario, and it makes that test pass again.

Fixes scylladb/scylla-dtest#4582
Refs scylladb/scylladb#19835
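
For illustration only, the recipient selection described in this message can be sketched as below. This is a minimal, self-contained model and not the actual patch: `node_id`, `topology_view` and `choose_hint_recipients` are hypothetical names invented for this sketch, and "all current replicas" is modelled as a single list without pending endpoints.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not ScyllaDB's types.
using node_id = std::string;

struct topology_view {
    std::vector<node_id> natural_endpoints; // current replicas of the mutation
    std::vector<node_id> leaving;           // nodes being decommissioned
};

static bool contains(const std::vector<node_id>& nodes, const node_id& n) {
    return std::find(nodes.begin(), nodes.end(), n) != nodes.end();
}

// Returns the nodes a hint should be sent to, under the rules described
// in the message:
//  - the original destination is a natural endpoint and is not leaving:
//    send only to that node;
//  - otherwise (not a natural endpoint, or leaving): send to all replicas.
std::vector<node_id> choose_hint_recipients(const topology_view& topo,
                                            const node_id& original_destination) {
    const bool is_natural = contains(topo.natural_endpoints, original_destination);
    const bool is_leaving = contains(topo.leaving, original_destination);
    if (is_natural && !is_leaving) {
        return {original_destination};
    }
    return topo.natural_endpoints;
}
```

With replicas `a`, `b`, `c` and `c` leaving, a hint originally destined for `a` goes only to `a`, while a hint destined for `c` goes to all three replicas. Before the fix, the second case would have gone only to `c`.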