This fixes the tcp_server rxrx test on DPDK. When we receive
out-of-order packets, we hold them in the ooo queue. We linearize the
incoming packet, which copies it and thus frees the original packet.
However, we missed one case where we need to linearize. As a result,
the original packet is held in the ooo queue. In DPDK, the rx pool has
a fixed number of buffers. When all the DPDK buffers are in the ooo
queue, we cannot receive further packets, so rx hangs and even ping
stops working.
This fixes the following:
Server side:
$ tcp_server
Client side:
$ go run client.go -host 192.168.66.123 -conn 10 -test txtx
$ control-c
At this point, the connection in tcp_server is in CLOSED state (reset
by the remote). tcp_server then calls tcp::tcb::close() and waits on
wait_for_all_data_acked(), but no one will ever signal it. Thus we end
up with tons of leaked connections in CLOSED state.
We call output_one() to make sure a packet with FIN is actually
generated and then sent out. If we only call output() and _packetq is
not empty, tcp::tcb::get_packet() will not generate a packet with FIN,
thus we will not send out a FIN.
This can happen when retransmitted packets have been queued into
_packetq, then an ACK arrives which acknowledges all of the unacked
data, then the application calls close() to close the connection.
We currently implement RFC5681, a.k.a. Reno TCP, for congestion
control: slow start, congestion avoidance, fast retransmit, and fast
recovery. RFC6582 describes a specific algorithm for responding to
partial acknowledgments, referred to as NewReno, which improves on Reno.
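As an illustration of the partial-acknowledgment handling described in
RFC6582 section 3.2, here is a minimal sketch. It is not the code from
this patch: the struct and member names (newreno_sketch, recover,
retransmit_needed) are hypothetical, sequence number wrap-around is
ignored, and "recover" is assumed to be one past the highest sequence
number sent when loss was detected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified sender state; not the real tcp::tcb.
struct newreno_sketch {
    uint32_t cwnd = 0;        // congestion window, in bytes
    uint32_t ssthresh = 0;    // slow start threshold, in bytes
    uint32_t smss = 0;        // sender maximum segment size
    uint32_t snd_una = 0;     // oldest unacknowledged sequence number
    uint32_t recover = 0;     // assumed: one past highest seq sent at loss
    bool in_fast_recovery = false;
    bool retransmit_needed = false;

    // Handle an ACK that acknowledges new data while in fast recovery.
    // Sequence wrap-around is deliberately ignored in this sketch.
    void on_new_ack(uint32_t ack) {
        if (!in_fast_recovery || ack <= snd_una) {
            return;
        }
        uint32_t acked = ack - snd_una;
        snd_una = ack;
        if (ack >= recover) {
            // Full acknowledgment: deflate the window and leave recovery.
            // RFC6582 also allows min(ssthresh, max(FlightSize, SMSS) + SMSS).
            cwnd = ssthresh;
            in_fast_recovery = false;
            retransmit_needed = false;
        } else {
            // Partial acknowledgment: retransmit the first unacknowledged
            // segment, deflate cwnd by the amount of new data acked, and
            // add back one SMSS if at least one SMSS was acked.
            retransmit_needed = true;
            cwnd -= std::min(acked, cwnd);
            if (acked >= smss) {
                cwnd += smss;
            }
        }
    }
};
```

With cwnd = 10000, smss = 1000 and recover = 8000, an ACK for 2000
bytes is a partial acknowledgment: the sketch stays in fast recovery,
requests a retransmit, and leaves cwnd at 9000.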
I found that wrk sometimes sends a RST instead of a FIN to close a
connection. In this case, we reset the connection and go to CLOSED
state. However, httpd will not delete the connection, so we end up
with leaked connections in CLOSED state.
Fix by handling the exception and sending an empty response as we do in
the EOF case. Here we do not pass the exception to the upper layer
again, otherwise httpd will be very noisy.
The way periodic timers are rearmed during timer completion causes
timer_settime() to be called twice for each periodic timer completion:
once during rearm and a second time by enable_fn(). Fix it by providing
another function that only re-adds the timer to the timers container
but does not call timer_settime().
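A minimal sketch of the idea, under stated assumptions: the type and
function names below (timer_manager_sketch, queue_timer, add_timer,
program_hardware) are hypothetical and only mirror the description
above, every timer is treated as periodic with the same period, and
program_hardware() is a counter standing in for timer_settime().

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the reactor's timer bookkeeping.
struct timer_manager_sketch {
    std::multiset<long> _timers;  // pending expiration times
    int settime_calls = 0;        // counts stand-in timer_settime() calls

    // Stand-in for programming the OS timer via timer_settime().
    void program_hardware() { ++settime_calls; }

    // Re-add only: put the timer back in the container, no OS call.
    void queue_timer(long expiry) { _timers.insert(expiry); }

    // Arming from user code: add the timer and program the OS timer.
    void add_timer(long expiry) {
        queue_timer(expiry);
        program_hardware();
    }

    // Completion path: requeue expired periodic timers without touching
    // the OS timer, then program it exactly once at the end.
    void complete_timers(long now, long period) {
        assert(period > 0);
        while (!_timers.empty() && *_timers.begin() <= now) {
            long expiry = *_timers.begin();
            _timers.erase(_timers.begin());
            queue_timer(expiry + period);
        }
        if (!_timers.empty()) {
            program_hardware();
        }
    }
};
```

If complete_timers() called add_timer() for the rearm instead of
queue_timer(), each completion would program the OS timer twice, which
is the behaviour this patch removes.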
Follow RFC793 section "SEGMENT ARRIVES".
There are 4 major cases:
1) If the state is CLOSED
2) If the state is LISTEN
3) If the state is SYN-SENT
4) If the state is any other state
Note:
- This change is significant (RFC793 spends more than 10 pages
describing this segment arrival handling).
- More testing is needed. The good news is that, so far, the
tcp_server (ping/txtx/rxrx) tests and httpd work fine.
So that the callback which is set on it and which is allocated on CPU
0 is destroyed on CPU 0 when the clock dies. Otherwise we can attempt
to delete it after the CPU thread is gone if CPU 0 != main thread.
When smp::configure() is called from a non-main thread, the global
state which it allocates will be destroyed after the reactor is
destroyed, because the global state is destroyed from the main thread
while the reactor is destroyed together with the thread which called
smp::configure(). This results in a SIGSEGV when the allocator tries
to free the _threads vector across CPU threads, because the target CPU
was already freed. See issue #10.
To fix this, I introduced the smp::cleanup() method, which should clean
up all global state and should be called in the same thread in which
smp::configure() was called.
I need to call smp::configure() from a non-main thread for integration
with the boost unit testing framework.
Instead of scheduling timer processing to happen in the future,
process timers in the context of the poller (signal poller for high
resolution timer, lowres time poller for low resolution timers).
This both reduces timer latency (not very important) and removes a
use of promise/future in a repetitive task, which is error prone.
Incoming item processing usually takes more work than completion
item processing. Prefetch more completion items to make sure they are
ready before access.
Turn a condition into an assert(), since an invalid mapping can only
mean that we have a bug.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
receive_signal() uses the unordered map _signal_handlers (signo mapped to
signal_handler) to either register a signal or find an existing one, and
from there get a future from the promise associated with that signal.
The problem is that _signal_handlers.emplace() is called
unconditionally, so the signal_handler constructor is always invoked
and needlessly re-registers the same handler, even when the signo is
already present in the map.
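The lookup-before-insert pattern can be sketched as follows. This is
an illustration only, not the patched receive_signal(): the names
signal_handler_sketch, signals_sketch, handler_for and
handlers_constructed are hypothetical, and the constructor just bumps
a counter where the real one would register the handler.

```cpp
#include <cassert>
#include <tuple>
#include <unordered_map>
#include <utility>

// Counts constructor runs; stands in for the cost of re-registering.
static int handlers_constructed = 0;

// Hypothetical stand-in for signal_handler.
struct signal_handler_sketch {
    explicit signal_handler_sketch(int signo) : signo(signo) {
        ++handlers_constructed;
    }
    int signo;
    int pending = 0;
};

class signals_sketch {
    std::unordered_map<int, signal_handler_sketch> _signal_handlers;
public:
    // Look the signo up first and construct a handler only when it is
    // missing, so an existing handler is never constructed again.
    signal_handler_sketch& handler_for(int signo) {
        auto it = _signal_handlers.find(signo);
        if (it == _signal_handlers.end()) {
            it = _signal_handlers.emplace(std::piecewise_construct,
                                          std::forward_as_tuple(signo),
                                          std::forward_as_tuple(signo)).first;
        }
        return it->second;
    }
};
```

Calling handler_for() twice with the same signo returns the same
entry and runs the constructor once.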
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Build a 128-entry redirection table to select which cpu services which
packet, when we have more cores than queues (and thus need to dispatch
internally).
Add a --hw-queue-weight option to control the relative weight of the
hardware queue.
With a weight of 0, the core that services the hardware queue will not
process any packets; with a weight of 1 (default) it will process an equal
share of packets, compared to proxy queues.
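One way such a weighted table could be filled is sketched below. This
is an assumption for illustration, not the algorithm used in the
patch: build_redirection_table() and cpu_for_hash() are hypothetical
names, every proxy queue is given weight 1, and slots are assigned in
contiguous blocks proportional to weight.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t redirection_table_size = 128;
using redirection_table = std::array<unsigned, redirection_table_size>;

// Fill the table so that the hardware-queue core gets a share of slots
// proportional to hw_queue_weight, and each proxy core a share of 1.
redirection_table build_redirection_table(unsigned hw_cpu,
                                          const std::vector<unsigned>& proxy_cpus,
                                          double hw_queue_weight) {
    double total = hw_queue_weight + static_cast<double>(proxy_cpus.size());
    assert(hw_queue_weight >= 0 && total > 0);
    redirection_table table{};
    for (std::size_t i = 0; i < redirection_table_size; ++i) {
        // Map the middle of slot i onto the range [0, total).
        double pos = (static_cast<double>(i) + 0.5) * total / redirection_table_size;
        if (pos < hw_queue_weight) {
            table[i] = hw_cpu;
        } else {
            auto idx = static_cast<std::size_t>(pos - hw_queue_weight);
            if (idx >= proxy_cpus.size()) {
                idx = proxy_cpus.size() - 1;  // guard against rounding
            }
            table[i] = proxy_cpus[idx];
        }
    }
    return table;
}

// Pick the servicing cpu for a packet from its flow hash.
unsigned cpu_for_hash(const redirection_table& table, uint32_t hash) {
    return table[hash % redirection_table_size];
}
```

With one hardware-queue core and three proxy cores, a weight of 1
gives the hardware-queue core 32 of the 128 slots, and a weight of 0
gives it none.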
Unlike tcp::tcb::send() and tcp::connection::send(), which send tcp
packets associated with a tcb, tcp::send() only sends packets that are
not associated with a tcb. We have a bunch of send() functions, so
rename this one to make the code more readable.