When a vector store node becomes unreachable, a client request sent before the keep-alive timer fires hangs until the CQL query timeout is reached. This happens because the HTTP request is written to the TCP send buffer and the client then waits for a response. While unacknowledged data remains in the buffer, TCP keeps retransmitting it, and the keep-alive timer cannot detect the dead connection. This patch resolves the issue by setting the `TCP_USER_TIMEOUT` socket option, which limits how long transmitted data may remain unacknowledged before the connection is closed, so the connection fails faster.

Closes scylladb/scylladb#27388
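
For illustration only, here is a minimal sketch of setting `TCP_USER_TIMEOUT` on a plain Linux TCP socket with POSIX `setsockopt`. It is not the code from this patch, which may go through the project's own socket abstractions instead. The helper names `set_tcp_user_timeout` and `get_tcp_user_timeout_ms` are hypothetical, and any timeout value passed to them is an arbitrary example, not the value the patch uses.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>   // TCP_USER_TIMEOUT (Linux-specific)
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <chrono>

// Hypothetical helper, not taken from the patch.
// Limits how long sent data may stay unacknowledged before the kernel
// drops the connection and pending I/O fails with ETIMEDOUT.
// The option takes the timeout as an unsigned int in milliseconds.
bool set_tcp_user_timeout(int fd, std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    unsigned int ms = static_cast<unsigned int>(timeout.count());
    return ::setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &ms, sizeof(ms)) == 0;
}

// Hypothetical helper: reads the option back, or returns -1 on failure.
long get_tcp_user_timeout_ms(int fd) {
    unsigned int ms = 0;
    socklen_t len = sizeof(ms);
    if (::getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &ms, &len) != 0) {
        return -1;
    }
    return static_cast<long>(ms);
}
```

A caller would apply `set_tcp_user_timeout(fd, std::chrono::milliseconds(5000))` to the socket before issuing the HTTP request. With the option set, a write that sits unacknowledged for longer than the timeout causes the connection to be torn down, so the waiting read returns an error instead of blocking until a higher-level timeout fires.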