M. J. Fromberger 22ed610083 mempool: rework lock discipline to mitigate callback deadlocks (#9030)
The priority mempool has a stricter synchronization requirement than the legacy
mempool. Under sufficiently heavy load, exclusive access can lead to deadlocks
when processing a large batch of transaction rechecks through an out-of-process
application using the socket client.

By design, a socket client stalls when its send buffer fills, and for the
duration of that stall it holds a lock shared with the receive thread.  While
the sender is blocked in this state, any response read by the receive thread
must wait for the shared lock before its callback can be invoked.

If we're lucky, the server will then read the next request and make enough room
in the buffer for the sender to proceed. If not, however (e.g., if the next
request is bigger than the one just consumed), the receive thread remains
blocked: it is waiting on the lock and cannot read a response.  Once the
server's output buffer fills, the system deadlocks.
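
To make the interlock concrete, here is a minimal Go sketch of the shape of
the problem.  The names and structure (socketClient, queueRequest,
didRecvResponse) are illustrative assumptions, not the actual client code:

    package client

    import (
        "net"
        "sync"
    )

    // socketClient is a stripped-down stand-in for an ABCI socket client in
    // which a single mutex is shared by the send path and the receive thread.
    type socketClient struct {
        mtx  sync.Mutex
        conn net.Conn
    }

    // Send path: the request is written while holding mtx.  If the server is
    // not draining its input, conn.Write blocks here with mtx still held.
    func (c *socketClient) queueRequest(req []byte) error {
        c.mtx.Lock()
        defer c.mtx.Unlock()
        _, err := c.conn.Write(req) // may stall when the send buffer is full
        return err
    }

    // Receive thread: delivering a response needs the same mtx.  While the
    // sender above is blocked, this call blocks too, so no responses are
    // drained; once the server's output buffer also fills, neither side can
    // make progress and the system deadlocks.
    func (c *socketClient) didRecvResponse(resp []byte) {
        c.mtx.Lock()
        defer c.mtx.Unlock()
        _ = resp // match resp to its pending request and run its callback ...
    }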

This can happen with any sufficiently busy workload, but is more likely during
a large recheck in the v1 (priority) mempool, where the callbacks need
exclusive access to mempool state.  As a workaround, process rechecks for the
priority mempool in their own goroutines outside the mempool mutex.  Responses
are still subject to head-of-line blocking on the socket, but they no longer
get pushback from contention on the mempool itself.
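
A hedged sketch of that workaround, again with illustrative names (txMempool,
appClient, CheckTxRecheck) rather than the real API: snapshot the transactions
under the mutex, then issue the rechecks from a separate goroutine so the
response callbacks never contend with the mempool lock.

    package mempool

    import "sync"

    // appClient is an illustrative stand-in for the ABCI client used by the
    // mempool; responses arrive asynchronously via the client's callbacks.
    type appClient interface {
        CheckTxRecheck(tx []byte)
    }

    type txMempool struct {
        mtx sync.Mutex
        txs [][]byte
        app appClient
    }

    // recheckTransactions snapshots the pending transactions under the mutex,
    // then issues the rechecks from a separate goroutine.  The socket client's
    // response callbacks therefore never wait on the mempool mutex: responses
    // may still head-of-line block on the socket, but there is no pushback
    // from contention on the mempool itself.
    func (m *txMempool) recheckTransactions() {
        m.mtx.Lock()
        snapshot := make([][]byte, len(m.txs))
        copy(snapshot, m.txs)
        m.mtx.Unlock()

        go func() {
            for _, tx := range snapshot {
                m.app.CheckTxRecheck(tx)
            }
        }()
    }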
2022-07-19 13:28:46 -07:00