Add .set_skip_when_empty() to four metrics in the db module that are
only incremented on very rare error paths and are almost always zero:

- cache::pinned_dirty_memory_overload: described as 'should sit
  constantly at 0, nonzero is indicative of a bug'
- corrupt_data::entries_reported: only fires on actual data corruption
- hints::corrupted_files: only fires on on-disk hint file corruption
- rate_limiter::failed_allocations: only fires when the rate limiter
  hash table is completely full and gives up allocating, requiring
  extreme cardinality pressure

These metrics create unnecessary reporting overhead when they are
perpetually zero. set_skip_when_empty() suppresses them from metrics
output until they become non-zero.
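
The skip-when-empty behaviour described above can be illustrated with a small self-contained sketch. It does not use the real metrics library: the `metric` struct and the `report()` helper are hypothetical stand-ins, and only the name set_skip_when_empty() is taken from this commit.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a registered metric; not the real metrics API.
struct metric {
    std::string name;
    std::function<uint64_t()> read;  // samples the current counter value
    bool skip_when_empty = false;

    // Mirrors the call added by this commit: mark the metric as
    // "omit from output while its value is zero".
    metric& set_skip_when_empty(bool skip = true) {
        skip_when_empty = skip;
        return *this;
    }
};

// Render "name value" lines, omitting skip-when-empty metrics that are still zero.
inline std::vector<std::string> report(const std::vector<metric>& metrics) {
    std::vector<std::string> out;
    for (const auto& m : metrics) {
        const uint64_t v = m.read();
        if (m.skip_when_empty && v == 0) {
            continue;  // nothing to report yet
        }
        out.push_back(m.name + " " + std::to_string(v));
    }
    return out;
}
```

In this sketch a metric that is always zero costs nothing in the rendered output, while the first increment makes it appear on the next report.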
AI-Assisted: yes
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Closes scylladb/scylladb#29344
Introduces the rate_limiter, a replica-side data structure meant for
tracking the frequency with which each partition is being accessed
(separately for reads and writes) and deciding whether the request
should be accepted and processed further or rejected.
The limiter is implemented as a statically allocated hashmap which keeps
track of the frequency with which partitions are accessed. Its entries
are incremented when an operation is admitted and are decayed
exponentially over time.
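
One way to get an exponentially decaying counter without touching every entry on each timer tick is to decay lazily, on access. The sketch below is an assumption made for illustration and not necessarily how rate_limiter does it: the class name, the one-tick half-life and the saturating increment are all hypothetical.

```cpp
#include <cstdint>

// Hypothetical sketch: a counter that halves once per elapsed tick.
// Callers must pass a tick value that never goes backwards.
class decaying_counter {
    uint32_t _count = 0;
    uint64_t _last_tick = 0;
public:
    // Apply the halvings accumulated since the last access, then return the value.
    uint32_t value(uint64_t now_tick) {
        const uint64_t elapsed = now_tick - _last_tick;
        // Shifting a 32-bit value by 32 or more is undefined, so clamp to zero.
        _count = elapsed >= 32 ? 0 : _count >> elapsed;
        _last_tick = now_tick;
        return _count;
    }

    // Record one admitted operation; saturates instead of wrapping around.
    uint32_t increment(uint64_t now_tick) {
        value(now_tick);
        if (_count != UINT32_MAX) {
            ++_count;
        }
        return _count;
    }
};
```

With this layout the periodic timer only has to advance a shared tick, and each entry catches up the next time it is read or incremented.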
If a partition is detected to be accessed more than its limit allows,
requests are rejected with a probability calculated in such a way that,
on average, the number of accepted requests is kept at the limit.
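
A rejection probability with that property can be sketched as follows. If a partition is observed at a rate r above its limit L, rejecting with probability 1 - L/r leaves an expected r * (L/r) = L accepted requests. The function names and the use of std::bernoulli_distribution are assumptions for this example, not the code added by this commit.

```cpp
#include <cstdint>
#include <random>

// Probability of rejecting one request so that, on average,
// `limit` out of `observed` requests are admitted.
inline double rejection_probability(uint64_t observed, uint64_t limit) {
    if (observed <= limit) {
        return 0.0;  // within the limit: always accept
    }
    // observed > limit >= 0 here, so the division is well defined.
    return 1.0 - static_cast<double>(limit) / static_cast<double>(observed);
}

// Draw the accept/reject decision for a single request.
template <typename Rng>
bool should_reject(uint64_t observed, uint64_t limit, Rng& rng) {
    std::bernoulli_distribution reject(rejection_probability(observed, limit));
    return reject(rng);
}
```

For example, a partition seen at twice its limit would have half of its requests rejected on average.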
The structure currently weighs a bit above 1MB and each shard is meant
to keep a separate instance. All operations are O(1), including the
periodic timer.