It is used to recover from a race in which the sender is waiting for a
window update and the receiver is waiting for the sender to send more
data, because the window update carried in the ACK packet was lost and
never seen by the sender.
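The deadlock above is what a persist timer breaks. A minimal sketch of the idea, with hypothetical names (this is not the stack's actual code): when the advertised window drops to zero, arm a timer that periodically sends a one-byte probe, forcing the receiver to re-announce its current window in the ACK.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the race the text describes: if the peer's
// advertised window is zero and the later window update is lost,
// neither side acts. Probing with one byte elicits a fresh ACK that
// carries the receiver's current window.
struct sender_state {
    uint32_t peer_window = 0;   // last advertised receive window
    bool persist_armed = false;
    unsigned probes_sent = 0;

    void on_window_update(uint32_t win) {
        peer_window = win;
        // Arm the probe timer while the window is closed; the update
        // that reopens the window might never arrive.
        persist_armed = (win == 0);
    }

    // Fired by the persist timer: send a 1-byte window probe.
    void on_persist_timeout() {
        if (persist_armed) {
            ++probes_sent;
        }
    }
};
```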
This fixes the tcp_server rxrx test on DPDK. The problem is that when
we receive out-of-order packets, we hold them in the ooo queue.
Linearizing an incoming packet copies it and thus frees the old packet;
however, we missed one case where linearization is needed. As a result,
the original packet was held in the ooo queue. With DPDK we have a
fixed number of buffers in the rx pool, so once all the DPDK buffers
end up in the ooo queue, we can no longer receive any packets. Rx
hangs; even ping stops working.
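A rough sketch of the invariant the fix restores, with illustrative types rather than the stack's real ones: any packet parked in the out-of-order queue must first be copied into memory we own, so the driver-owned rx buffers it was built from return to the pool.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: a packet backed by driver-owned fragments must
// be copied ("linearized") before being held in the ooo queue;
// otherwise the fixed-size DPDK rx buffers it pins never go back to
// the pool and reception eventually stalls.
struct packet {
    std::vector<std::string> fragments; // imagine these wrap DPDK mbufs
    bool linearized = false;

    // Copy all fragments into one owned buffer, freeing the originals.
    void linearize() {
        std::string all;
        for (auto& f : fragments) {
            all += f;
        }
        fragments.clear();
        fragments.push_back(std::move(all)); // now backed by our memory
        linearized = true;
    }
};

// Out-of-order queue keyed by starting sequence number.
std::map<uint32_t, packet> ooo_queue;

void insert_out_of_order(uint32_t seq, packet p) {
    p.linearize(); // the missed case: copy before holding the packet
    ooo_queue.emplace(seq, std::move(p));
}
```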
This fixes the following scenario:
Server side:
$ tcp_server
Client side:
$ go run client.go -host 192.168.66.123 -conn 10 -test txtx
$ control-c
At this point, the connection in tcp_server is in the CLOSED state
(reset by the remote); tcp_server then calls tcp::tcb::close() and
waits on wait_for_all_data_acked(), but nothing will ever signal it.
Thus we accumulate tons of leaked connections in the CLOSED state.
We call output_one() to make sure a packet with FIN is actually
generated and then sent out. If we only call output() and _packetq is
not empty, tcp::tcb::get_packet() will not generate a packet with FIN,
so we never send a FIN.
This can happen when retransmit packets have been queued into _packetq,
then an ACK arrives that acknowledges all of the unacked data, and then
the application calls close() to close the connection.
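The failure mode above can be sketched roughly as follows, with simplified stand-ins for the real tcb machinery: get_packet() prefers packets already queued in _packetq, and those were generated before close() was called, so they carry no FIN.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical sketch of why close() must force output_one() rather
// than rely on output(): queued retransmits predate the FIN request,
// so draining them never puts a FIN on the wire.
struct tcb {
    std::deque<std::string> _packetq; // previously generated packets
    bool fin_requested = false;
    bool fin_sent = false;

    std::string generate() {
        if (fin_requested) {
            fin_sent = true;
            return "FIN";
        }
        return "DATA";
    }

    // output() path: prefers whatever is already queued.
    std::string get_packet() {
        if (!_packetq.empty()) {
            auto p = _packetq.front();   // pre-FIN packet, no FIN flag
            _packetq.pop_front();
            return p;
        }
        return generate();
    }

    // output_one(): always generates a fresh packet, so a pending FIN
    // is actually materialized.
    std::string output_one() {
        return generate();
    }
};
```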
We currently implement RFC5681, a.k.a. Reno TCP, as the congestion
control algorithm: slow start, congestion avoidance, fast retransmit,
and fast recovery. RFC6582 describes a specific algorithm for
responding to partial acknowledgments, referred to as NewReno, which
improves on Reno.
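The RFC5681 window-growth rules mentioned above can be sketched as follows; the names and the MSS value are illustrative, not the stack's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Rough sketch of RFC 5681 congestion window management.
constexpr uint32_t mss = 1460;

struct cwnd_state {
    uint32_t cwnd = mss;          // congestion window
    uint32_t ssthresh = 65535;    // slow start threshold

    // Called once per new ACK.
    void on_ack() {
        if (cwnd < ssthresh) {
            // Slow start: one MSS per ACK (exponential per RTT).
            cwnd += mss;
        } else {
            // Congestion avoidance: roughly one MSS per RTT.
            cwnd += mss * mss / cwnd;
        }
    }

    // Fast retransmit trigger: three duplicate ACKs.
    void on_triple_dupack() {
        ssthresh = cwnd / 2;
        cwnd = ssthresh + 3 * mss;  // enter fast recovery
    }
};
```

NewReno (RFC6582) refines only the fast-recovery exit: a partial ACK keeps the sender in recovery and retransmits the next unacked segment instead of deflating the window.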
Follows RFC793 section "SEGMENT ARRIVES".
There are 4 major cases:
1) The state is CLOSED
2) The state is LISTEN
3) The state is SYN-SENT
4) The state is any other state
Note:
- This change is significant (RFC793 spends more than 10 pages
describing this segment-arrival handling).
- More testing is needed. The good news is that, so far, the
tcp_server (ping/txtx/rxrx) tests and httpd work fine.
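The four-way dispatch implied by the "SEGMENT ARRIVES" section can be sketched as below; the state names follow RFC793, but the dispatch function itself is a hypothetical simplification of the real handler.

```cpp
#include <cassert>
#include <string>

// Minimal sketch of the top-level dispatch for an incoming segment.
enum class tcp_state { CLOSED, LISTEN, SYN_SENT, ESTABLISHED, FIN_WAIT_1 };

std::string segment_arrives(tcp_state s) {
    switch (s) {
    case tcp_state::CLOSED:
        // No connection exists: answer with RST unless it is a RST.
        return "respond with RST";
    case tcp_state::LISTEN:
        // Passive open: a valid SYN creates a new connection.
        return "expect SYN";
    case tcp_state::SYN_SENT:
        // Active open: a SYN+ACK completes the handshake.
        return "expect SYN+ACK";
    default:
        // All remaining states share one code path: check the sequence
        // number, then process RST, SYN, ACK, URG, payload, and FIN,
        // in that order.
        return "common per-state processing";
    }
}
```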
Build a 128-entry redirection table to select which cpu services which
packet, when we have more cores than queues (and thus need to dispatch
internally).
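A minimal sketch of such a table, assuming round-robin filling and indexing by the low bits of the packet hash (function names are illustrative, not the actual implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: with more cores than hardware queues, a fixed
// 128-entry redirection table maps a packet's hash to the cpu that
// services it; entries are filled round-robin over all cpus.
std::vector<unsigned> build_redirection_table(unsigned cpu_count) {
    constexpr size_t table_size = 128;
    std::vector<unsigned> reta(table_size);
    for (size_t i = 0; i < table_size; ++i) {
        reta[i] = i % cpu_count;
    }
    return reta;
}

// Dispatch: index the table with the low 7 bits of the packet hash.
unsigned cpu_for_hash(const std::vector<unsigned>& reta, uint32_t hash) {
    return reta[hash & (reta.size() - 1)];
}
```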
Add a --hw-queue-weight option to control the relative weight of the
hardware queue. With a weight of 0, the core that services the
hardware queue will not process any packets; with a weight of 1 (the
default), it processes the same share of packets as the proxy queues.
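One way the weight could translate into per-cpu shares, shown as a hedged sketch (the function and its shape are assumptions, not the actual dispatch code): the cpu owning the hardware queue gets `weight` shares and every proxy cpu gets one share.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of weight-based packet distribution: index 0 is
// the cpu that services the hardware queue, the rest are proxy cpus.
std::vector<double> packet_shares(unsigned proxy_cpus, double hw_queue_weight) {
    double total = hw_queue_weight + proxy_cpus;
    std::vector<double> shares;
    shares.push_back(hw_queue_weight / total); // HW-queue cpu
    for (unsigned i = 0; i < proxy_cpus; ++i) {
        shares.push_back(1.0 / total);          // each proxy cpu
    }
    return shares;
}
```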
Unlike tcp::tcb::send() and tcp::connection::send(), which send tcp
packets associated with a tcb, tcp::send() only sends packets not
associated with any tcb. We have a bunch of send() functions; rename
them to make the code more readable.
Tested with tcp_server + client.go, using iptables on the client side
to drop <SYN,ACK> or <FIN,ACK>.
I verified that the SYN or FIN packet is retransmitted and the
connection is closed after N (currently 5) retries.
rss_bits should equal the number of bits the HW uses in its RSS
calculation. Use dev_info.max_rx_queues if dev_info.reta_size is not
available.
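A sketch of the derivation described above, assuming rss_bits is the base-2 logarithm of the redirection table size and that a zero reta_size means "not reported" (the helper names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Smallest number of bits b such that 2^b >= n.
unsigned log2ceil(uint32_t n) {
    unsigned bits = 0;
    while ((1u << bits) < n) {
        ++bits;
    }
    return bits;
}

// Hypothetical sketch: derive rss_bits from the device's redirection
// table size, falling back to max_rx_queues when reta_size is not
// available (reported as 0).
unsigned compute_rss_bits(uint16_t reta_size, uint16_t max_rx_queues) {
    uint32_t entries = reta_size ? reta_size : max_rx_queues;
    return log2ceil(entries);
}
```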
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
With SO_REUSEPORT we can bind() & accept() on each thread; the kernel
will dispatch incoming connections.
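The per-thread pattern this describes looks roughly like the sketch below: every thread creates its own listening socket with SO_REUSEPORT, binds the same address, and accepts independently; the kernel load-balances connections across the sockets. The function is an illustration, not the patch's code.

```cpp
#include <cassert>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Create a loopback listener that shares its port with other threads'
// listeners via SO_REUSEPORT. Returns -1 on any failure.
int make_reuseport_listener(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    int one = 1;
    // The key option: multiple sockets may bind the same addr:port.
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0) {
        close(fd);
        return -1;
    }
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(fd, 128) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```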
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
- Adjust the asserts.
- Add an assert at a place we should not reach if RSS info is not
provided.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Some packets generated by tcp do not belong to any connection.
Currently such packets are pushed to ipv4 directly. This patch adds a
packet queue for ipv4 to poll them from and limits the amount of memory
those packets can consume.
This patch changes tcp to register a poller so that l3 can poll tcp for
packets instead of tcp pushing packets down to ipv4. This moves the
networking tx-path inversion a little closer to the application.
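The two ideas (a bounded queue of non-connection packets, polled from below rather than pushed from above) can be sketched as follows; the class and its memory cap are illustrative assumptions, not the patch's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <string>

// Hypothetical sketch: tcp parks non-connection packets (e.g. RSTs for
// unknown tuples) in a bounded queue; the l3/ipv4 poller pulls from it
// instead of tcp pushing packets down.
class packet_queue {
    std::deque<std::string> _q;
    size_t _bytes = 0;
    static constexpr size_t max_bytes = 16 * 1024; // memory cap
public:
    // Drop the packet if the queue would exceed its memory budget.
    bool push(std::string pkt) {
        if (_bytes + pkt.size() > max_bytes) {
            return false;
        }
        _bytes += pkt.size();
        _q.push_back(std::move(pkt));
        return true;
    }
    // Called by the registered poller to fetch one packet for tx.
    std::optional<std::string> poll() {
        if (_q.empty()) {
            return std::nullopt;
        }
        auto pkt = std::move(_q.front());
        _q.pop_front();
        _bytes -= pkt.size();
        return pkt;
    }
};
```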