The reader concurrency semaphore has no mechanism to limit the memory consumption of already admitted read. Once memory collective memory consumption of all the admitted reads is above the limit, all it can do is to not admit any more. Sometimes this is not enough and the memory consumption of the already admitted reads balloons to the point of OOMing the node. This pull-request offers a solution to this: it introduces two more layers of defense above this: a soft and a hard limit. Both are multipliers applied on the semaphores normal memory limit.
When the soft limit threshold is surpassed, all readers but one are blocked via a new blocking `request_memory()` call which is used by the `tracking_file_impl`. The reader to be allowed to proceed is chosen at random, it is the first reader which happens to request memory after the limit is surpassed. This is both very simple and should avoid situations where the algorithm choosing the reader to be allowed to proceed chooses a reader which will then always time out.
When the hard limit threshold is surpassed, `reader_concurrency_semaphore::consume()` starts throwing `std::bad_alloc`. This again will result in eliminating whichever reader was unlucky enough to request memory at the right moment.
With this, the semaphore is now effectively enforcing an upper bound for memory consumption, defined by the hard limit.
Refs: https://github.com/scylladb/scylladb/issues/11927Closes#11955
* github.com:scylladb/scylladb:
test: reader_concurrency_semaphore_test: add tests for semaphore memory limits
reader_permit: expose operator<<(reader_permit::state)
reader_permit: add id() accessor
reader_concurrency_semaphore: add foreach_permit()
reader_concurrency_semaphore: document the new memory limits
reader_concurrency_semaphore: add OOM killer
reader_concurrency_semaphore: make consume() and signal() private
test: stop using reader_concurrency_semaphore::{consume,signal}() directly
reader_concurrency_semaphore: move consume() out-of-line
reader_permit: consume(): make it exception-safe
reader_permit: resource_units::reset(): only call consume() if needed
reader_concurrency_semaphore: tracked_file_impl: use request_memory()
reader_concurrency_semaphore: add request_memory()
reader_concurrency_semaphore: wrap wait list
reader_concurrency_semaphore: add {serialize,kill}_limit_multiplier parameters
test/boost/reader_concurrency_semaphore_test: dummy_file_impl: don't use hardoced buffer size
reader_permit: add make_new_tracked_temporary_buffer()
reader_permit: add get_state() accessor
reader_permit: resource_units: add constructor for already consumed res
reader_permit: resource_units: remove noexcept qualifier from constructor
db/config: introduce reader_concurrency_semaphore_{serialize,kill}_limit_multiplier
scylla-gdb.py: scylla-memory: extract semaphore stats formatting code
scylla-gdb.py: fix spelling of "graphviz"