When a view replica becomes unavailable, updates to it are stored as
hints at the paired base replica. This on-disk queue of pending view
updates grows for as long as there are view updates and the view
replica remains unavailable. Currently, we take the relative size of
that queue into account when calculating the delay for new base
writes, as part of the backpressure algorithm for materialized views.

However, the way we calculate that on-disk backlog is wrong: we
calculate it per device and then feed it to all the hints managers
for that device. This means that normal hints show up as backlog for
the view hints manager, which in turn introduces delays. As a result,
the view backpressure mechanism can kick in even if the cluster uses
no materialized views.

There is another way in which considering the view hints backlog is
wrong: a view replica that is unavailable for some period of time can
cause the backlog to grow to the point where all base writes are
subject to the maximum delay of 1 second. This turns a single-node
failure into cluster unavailability.

The fix for both issues is simply to not take this on-disk backlog
into account in the backpressure algorithm.

Fixes #4351
Fixes #4352

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190321170418.25953-1-duarte@scylladb.com>
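
For illustration only, the behavior change described in this message can be sketched roughly as follows. This is not the code touched by the patch: the `backlog` struct, the function names, and the linear mapping from relative backlog to delay are all assumptions made for the example. Only the 1 second cap is taken from the description above.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical type for illustration; not ScyllaDB's actual API.
struct backlog {
    double current;  // units currently queued
    double max;      // capacity of the queue
    // Fraction of the capacity in use, clamped to [0, 1].
    double relative_size() const {
        return max > 0 ? std::min(current / max, 1.0) : 0.0;
    }
};

// Maximum delay applied to a base write (1 second, per the message).
constexpr std::chrono::milliseconds max_delay{1000};

// Assumed mapping: the delay grows linearly with the relative backlog.
// The real algorithm may use a different curve.
inline std::chrono::milliseconds delay_for(double relative) {
    relative = std::min(std::max(relative, 0.0), 1.0);
    return std::chrono::milliseconds(
        static_cast<long long>(relative * max_delay.count()));
}

// Before the fix (sketch): the per-device on-disk hints backlog, which
// also counts normal hints, takes part in the delay calculation.
inline std::chrono::milliseconds delay_before(const backlog& in_flight,
                                              const backlog& device_hints) {
    return delay_for(std::max(in_flight.relative_size(),
                              device_hints.relative_size()));
}

// After the fix (sketch): the on-disk backlog is ignored entirely.
inline std::chrono::milliseconds delay_after(const backlog& in_flight) {
    return delay_for(in_flight.relative_size());
}
```

Under these assumptions, a full hints queue on the device pushes `delay_before` to the 1 second cap even when no view updates are in flight, whereas `delay_after` depends only on the in-flight view update backlog.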