Commit Graph

49 Commits

Author SHA1 Message Date
Gleb Natapov
bef054f8c8 net: rename udp_v4 to ipv4_udp for consistency with other l4 protocols 2015-01-11 12:29:05 +02:00
Gleb Natapov
32b42af49f net: register l3 poller for tcp connections
This patch changes tcp to register a poller so that l3 can poll tcp for
a packet instead of tcp pushing packets to ipv4. This moves the
networking tx path inversion a little bit closer to the application.
2015-01-11 10:48:32 +02:00
Gleb Natapov
d5c309c74e net: provide poller registration API between l3 and l4
Both push and pull methods will be supported between l3 and l4 after
this patch.
2015-01-11 10:17:48 +02:00
Gleb Natapov
b824790798 net: move udp_v4 from network_stack into ipv4 class
The ipv4 class manages tcp and icmp, but for some reason udp is managed by
network_stack. Fix this and make all L4 protocol handling the same.
2015-01-08 11:33:19 +02:00
Takuya ASADA
2118fe5a09 Add l4connid::hash() 2015-01-08 01:26:36 +09:00
Gleb Natapov
0fd014fc35 net: add completion callback between l3 and l4
L4 will provide the callback to be called by L3 after the packet is
handed to lower layers for transmission. L4 will know that it can queue
more data from the user at this point. The patch also changes the send
function, which can no longer block, to return void instead of future<>.
2015-01-06 15:24:10 +02:00
Gleb Natapov
12bce3f4fc net: make interface get packets from l3
Instead of l3 (arp/ipv4) pushing packets into interface's queue, make
them register functions that interface can use to ask l3 for packets.
2015-01-06 15:24:10 +02:00
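The pull model described in this commit can be illustrated with a minimal sketch. Every name below (`l3_packet`, `interface_sketch`, `register_packet_provider`, `poll_one`) is hypothetical and chosen for illustration; this is not the project's actual API, only one way an interface might ask registered l3 protocols for packets instead of receiving pushes.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical packet type, for illustration only.
struct l3_packet {
    std::string payload;
};

// Sketch of an interface that pulls packets from registered l3 providers
// instead of having l3 push packets into an interface-owned queue.
class interface_sketch {
public:
    using provider = std::function<std::optional<l3_packet>()>;

    // An l3 protocol (e.g. arp or ipv4) registers a function to be polled.
    void register_packet_provider(provider p) {
        assert(p);  // an empty std::function cannot be polled
        _providers.push_back(std::move(p));
    }

    // Ask each provider in registration order; return the first packet found.
    std::optional<l3_packet> poll_one() {
        for (auto& p : _providers) {
            if (auto pkt = p()) {
                return pkt;
            }
        }
        return std::nullopt;
    }

private:
    std::vector<provider> _providers;
};
```

With this shape the interface decides when to take a packet, so a provider with nothing to send simply returns an empty optional.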
Avi Kivity
3e4c53300d Merge branch 'mq' of ssh://github.com/cloudius-systems/seastar-dev
Multiqueue support for #cpu != #q, from Gleb.
2014-12-16 11:11:22 +02:00
Gleb Natapov
7ac3ba901c net: rework packet forwarding logic
Instead of forward() deciding packet destination make it collect input
for RSS hash function depending on packet type. After data is collected
use toeplitz hash function to calculate packet's destination.
2014-12-16 10:53:41 +02:00
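As a rough illustration of the hashing step named in this commit, here is a minimal sketch of a Toeplitz hash over a byte string. The function names `toeplitz_hash` and `hash_to_cpu` are hypothetical, and mapping the hash to a cpu with a plain modulo is an assumption; the commit message does not say how the destination is derived from the hash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toeplitz hash: for every set bit of the input (most significant bit
// first), XOR in the 32-bit window of the key that starts at that bit
// position. The key must be at least data.size() + 4 bytes long.
inline uint32_t toeplitz_hash(const std::vector<uint8_t>& key,
                              const std::vector<uint8_t>& data) {
    assert(key.size() >= data.size() + 4);
    uint32_t hash = 0;
    // The window initially holds the first 32 bits of the key.
    uint32_t window = (uint32_t(key[0]) << 24) | (uint32_t(key[1]) << 16) |
                      (uint32_t(key[2]) << 8) | uint32_t(key[3]);
    for (size_t i = 0; i < data.size(); ++i) {
        for (int bit = 7; bit >= 0; --bit) {
            if (data[i] & (1u << bit)) {
                hash ^= window;
            }
            // Slide the window left by one bit, pulling in the next key bit.
            uint32_t next = (uint32_t(key[i + 4]) >> bit) & 1u;
            window = (window << 1) | next;
        }
    }
    return hash;
}

// Assumption: pick the destination cpu with a simple modulo.
inline unsigned hash_to_cpu(uint32_t hash, unsigned nr_cpus) {
    assert(nr_cpus > 0);
    return hash % nr_cpus;
}
```

The input bytes would be whatever forward() collects for the packet type, for example addresses and ports for a TCP packet.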
Gleb Natapov
055fbb9430 net: broadcast arp reply on arp protocol level
Instead of returning a special value from forward() to broadcast an arp reply,
call arp.learn() on all cpus at the arp protocol level. The ability of
forward() to return a special value will be removed by later patches.
2014-12-15 17:36:14 +02:00
Gleb Natapov
c13adb9c12 net: rework how dhcp handles dhcp packet.
Currently dhcp assumes that cpu 0 gets all the packets and redistributes
them by itself. With multiqueue this is not necessarily the case, so the
current trick of disabling forwarding by installing a special dhcp forward()
function will not work. Rework it by installing a packet filter on all
cpus before running dhcp and forwarding all dhcp packets to cpu 0.
2014-12-15 17:31:25 +02:00
Asias He
0790266be0 ip: Switch to use lowres_clock 2014-12-15 19:39:33 +08:00
Asias He
62fff15e54 timer: Make timer a template 2014-12-15 19:39:33 +08:00
Asias He
9a9297c89d ip: Implement fragment timeout and memory usage limit 2014-12-09 09:59:44 +02:00
Asias He
59aa280f0d ip: Add IPv4 reassembly support
If a TCP or UDP IP datagram is fragmented, only the first fragment will
contain the port information. When a fragment without port information
is received, we have no idea which "stream" this fragment belongs to,
thus we have no idea how to forward this packet.

To solve this problem, we use a "forward twice" method. When a fragment of
an IP datagram is received, we forward it using the
frag_id(src_ip, dst_ip, identification, protocol) hash. When all the
fragments have been received, we forward the reassembled datagram using the
connection_id(src_ip, src_port, dst_ip, dst_port) hash.
2014-12-03 21:40:49 +08:00
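The "forward twice" idea above can be sketched as two separate hashing steps. The struct fields follow the tuples named in the commit message, but the mixing function `mix` and the helpers `cpu_for_fragment` and `cpu_for_connection` are hypothetical stand-ins; the hash the project actually uses is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// Key for the first hop: available in every fragment's IP header.
struct frag_id {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t identification;
    uint8_t protocol;
};

// Key for the second hop: only known once the datagram is reassembled.
struct connection_id {
    uint32_t src_ip;
    uint16_t src_port;
    uint32_t dst_ip;
    uint16_t dst_port;
};

// Placeholder mixing step (an assumption, not the project's hash).
inline uint32_t mix(uint32_t seed, uint32_t v) {
    return seed ^ (v + 0x9e3779b9u + (seed << 6) + (seed >> 2));
}

// First forward: all fragments of one datagram land on the same cpu.
inline unsigned cpu_for_fragment(const frag_id& id, unsigned nr_cpus) {
    assert(nr_cpus > 0);
    uint32_t h = mix(0, id.src_ip);
    h = mix(h, id.dst_ip);
    h = mix(h, id.identification);
    h = mix(h, id.protocol);
    return h % nr_cpus;
}

// Second forward: the reassembled datagram goes to the connection's cpu.
inline unsigned cpu_for_connection(const connection_id& id, unsigned nr_cpus) {
    assert(nr_cpus > 0);
    uint32_t h = mix(0, id.src_ip);
    h = mix(h, id.src_port);
    h = mix(h, id.dst_ip);
    h = mix(h, id.dst_port);
    return h % nr_cpus;
}
```

The first hash never looks at ports, so it works for fragments that carry no L4 header.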
Asias He
7ca33fdd72 ip: Add helper for fragmentation 2014-12-03 17:47:29 +08:00
Asias He
88a1a37a88 ip: Support IP fragmentation in TX path
Tested with UDP sending large datagrams with ufo off.
2014-11-30 10:16:38 +02:00
Asias He
e2b1186cca net: Add more tcp and ip header const
net::tcp_hdr_len_min
net::ipv4_hdr_len_min
net::ipv6_hdr_len_min

InetTraits::ip_hdr_len_min is added to handle both ipv4 and ipv6.
2014-11-10 10:17:49 +02:00
Gleb Natapov
c64e1e27fb net: move connid out of tcp to be reused for udp 2014-11-09 18:17:44 +02:00
Gleb Natapov
d698811bdd fix smp broadcast packet handling
Some packets, like arp replies, are broadcast to all cpus for handling,
but only the packet structure is copied for each cpu; the actual packet data
is shared by all of them. Currently the networking stack mangles the
packet data during its travel up the stack while doing ntoh()
translations, which obviously cannot work for broadcast packets. This
patch fixes the code to not modify packet data while doing ntoh(), but
to do it in a stack-allocated copy of the data instead.
2014-11-06 10:30:30 +02:00
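The fix described here, converting byte order in a stack copy rather than in the shared buffer, can be sketched as follows. The header layout is a UDP header used purely as an example, and `udp_header_host`, `load_be16` and `read_udp_header` are hypothetical names, not the project's code.

```cpp
#include <cassert>
#include <cstdint>

// Host-order view of a UDP header, returned by value (lives on the stack).
struct udp_header_host {
    uint16_t src_port;
    uint16_t dst_port;
    uint16_t length;
    uint16_t checksum;
};

// Read a big-endian 16-bit value without touching the source bytes.
inline uint16_t load_be16(const uint8_t* p) {
    return static_cast<uint16_t>((uint16_t(p[0]) << 8) | p[1]);
}

// The shared packet data is taken as const, so it cannot be modified here;
// every cpu handling a broadcast packet sees the same unmodified wire bytes.
inline udp_header_host read_udp_header(const uint8_t* shared_data) {
    assert(shared_data != nullptr);
    udp_header_host h;
    h.src_port = load_be16(shared_data + 0);
    h.dst_port = load_be16(shared_data + 2);
    h.length = load_be16(shared_data + 4);
    h.checksum = load_be16(shared_data + 6);
    return h;
}
```

Because the conversion produces a new value, calling it on several cpus for the same buffer gives every caller the same result.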
Calle Wilund
bd263b3b4e net: Add "packet filter" functionality + accessors + "raw" packet send function
Perhaps not the best way to enable "hijacking" the ip stack (for DHCP
querying), but considering the options seems the least intrusive.

Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
2014-11-05 14:50:28 +02:00
Avi Kivity
7a1f84a556 reactor: replace references to reactor::_id by its accessor cpu_id() 2014-11-01 17:34:43 +02:00
Avi Kivity
332cd6424b ip: use indirection to access tcp
This reduces the number of files that include tcp.hh.
2014-10-24 22:18:46 +03:00
Avi Kivity
ec7b5eeed2 tcp: move ipv4_tcp implementation into tcp.cc
First step in isolating tcp from the rest of the stack.
2014-10-24 21:45:20 +03:00
Asias He
c5861476b9 net: Rename pseudo_header_checksum to tcp_pseudo_header_checksum
We already have udp_pseudo_header_checksum. Make the name more
consistent.
2014-10-13 11:37:56 +08:00
Asias He
2625dd5944 net: Introduce eth_protocol_num 2014-10-13 11:37:56 +08:00
Asias He
5cf3f200c5 net: Introduce ip_protocol_num
We use this in all the places where the ip protocol number is used.
2014-10-13 11:37:56 +08:00
Asias He
05c72b0808 net: UDP checksum offload and UDP fragmentation offload 2014-10-13 11:37:56 +08:00
Gleb Natapov
4e7d8a8506 Introduce packet classification mechanism
The classifier returns which cpu a packet should be processed on. It may
return a special broadcast identifier. The patch includes classifiers for
tcp, udp and arp. The arp classifier broadcasts arp replies to all cpus. The
default classifier does not forward the packet.
2014-10-07 11:03:57 +03:00
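A classifier of the kind described in this commit might look like the sketch below. The types `packet_kind`, `packet_info` and `classification`, and the use of a precomputed `flow_hash` with a modulo, are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

enum class packet_kind { arp_reply, tcp, udp, other };

// Hypothetical summary of a received packet.
struct packet_info {
    packet_kind kind;
    uint32_t flow_hash;  // assumed to be derived from the L4 connection id
};

struct classification {
    enum class action { keep_local, forward, broadcast };
    action act;
    unsigned cpu;  // meaningful only when act == action::forward
};

inline classification classify(const packet_info& p, unsigned nr_cpus) {
    assert(nr_cpus > 0);
    switch (p.kind) {
    case packet_kind::arp_reply:
        // Every cpu needs to learn the mapping, so broadcast.
        return {classification::action::broadcast, 0};
    case packet_kind::tcp:
    case packet_kind::udp:
        // Keep all packets of one flow on one cpu.
        return {classification::action::forward, p.flow_hash % nr_cpus};
    default:
        // Default classifier: do not forward.
        return {classification::action::keep_local, 0};
    }
}
```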
Tomasz Grabiec
3775dae6fb net: convert ipv4_addr.host from array to uint32_t
It will be easier to convert it to a format on which the native stack
works.
2014-10-06 18:34:28 +02:00
Tomasz Grabiec
05aece51dc virtio: remove intermediate queue
Currently the send path buffers packets in an (unbounded) _tx_queue and
in the virtio ring. One queue would suffice though.

This change also propagates the back pressure resulting from a queue-full
condition up the send path. This is needed because otherwise, if
senders are faster than the network, we will eventually run out of
memory. This would also cause a "buffer bloat" effect, which hurts
latency-sensitive workloads.
2014-10-04 11:27:23 +02:00
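The back-pressure argument in this commit can be illustrated with a single bounded ring that refuses new packets when full. This is a synchronous simplification with hypothetical names (`tx_ring_sketch`, `try_send`, `complete_one`); the real send path is asynchronous and its API is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// A single bounded queue: when it is full, the sender is told to wait
// instead of the packet being buffered in a second, unbounded queue.
class tx_ring_sketch {
public:
    explicit tx_ring_sketch(size_t capacity) : _capacity(capacity) {
        assert(capacity > 0);
    }

    // Returns false when the ring is full; the caller must retry later.
    bool try_send(int packet_id) {
        if (_ring.size() >= _capacity) {
            return false;
        }
        _ring.push_back(packet_id);
        return true;
    }

    // Called when the device has consumed the oldest packet.
    bool complete_one() {
        if (_ring.empty()) {
            return false;
        }
        _ring.pop_front();
        return true;
    }

    size_t in_flight() const { return _ring.size(); }

private:
    size_t _capacity;
    std::deque<int> _ring;
};
```

Memory use is bounded by the ring capacity, and a fast sender learns immediately that the network cannot keep up.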
Tomasz Grabiec
076d3b2682 ip: connect send() action with L3's send() action
So that back-pressure or failure from the lower layers is transferred.
2014-10-04 11:27:23 +02:00
Tomasz Grabiec
04b53b7498 ip: make send() composable
This allows the caller to compose it with other actions when send() is
done or when it fails.
2014-10-01 13:45:28 +02:00
Asias He
cff8cb353a net: Add netmask option 2014-09-28 10:06:08 +03:00
Asias He
7ab735d3c7 net: Gateway support 2014-09-28 10:05:58 +03:00
Asias He
c5d623265d net: Support ping 2014-09-25 17:49:38 +03:00
Asias He
236418d262 net: Support TCP checksum offload
It gives ~5% httpd improvements on monster.

csum-offload option is added, e.g., to disable:

./httpd --network-stack native --csum-offload off
2014-09-24 11:03:39 +03:00
Avi Kivity
907792fe26 net: ipv4: remove unused field to please clang 2014-09-22 17:19:15 +03:00
Avi Kivity
313768654a net: remove queuing from l2->l3 rx path
Use a subscription instead.  Queueing should be implemented at the highest
possible level (e.g. tcp), to avoid double-queueing.
2014-09-22 11:28:35 +03:00
Avi Kivity
bf59db0ef0 tcp: don't use pseudo headers for checksum
Calculate it directly instead.
2014-09-18 12:46:25 +03:00
Tomasz Grabiec
bafc6cb18e ip: add method to check if IP address is a wildcard 2014-09-16 18:48:16 +03:00
Tomasz Grabiec
53ce24c850 ip: add method to register L4 protocol handlers 2014-09-16 18:48:14 +03:00
Avi Kivity
5e2f1f0bc6 net: split out ip checksum routines into their own file 2014-09-16 10:26:44 +03:00
Asias He
647a0cacd3 net: Add ipv4_address(string) constructor
Set ipv4 address using dotted decimal form.

Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
2014-09-13 07:20:31 +03:00
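Parsing a dotted decimal string into a 32-bit address, as this constructor does, can be sketched as below. `parse_dotted_decimal` is a hypothetical free function, and returning an empty optional on bad input is an assumption; the commit does not say how the real constructor reports errors.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Parse "a.b.c.d" into a host-order 32-bit value, with a as the top byte.
// Returns std::nullopt for anything that is not four octets in 0..255.
inline std::optional<uint32_t> parse_dotted_decimal(const std::string& s) {
    uint32_t addr = 0;
    size_t pos = 0;
    for (int octet = 0; octet < 4; ++octet) {
        // Each octet must start with a digit.
        if (pos >= s.size() || s[pos] < '0' || s[pos] > '9') {
            return std::nullopt;
        }
        unsigned value = 0;
        size_t digits = 0;
        while (pos < s.size() && s[pos] >= '0' && s[pos] <= '9') {
            value = value * 10 + unsigned(s[pos] - '0');
            if (++digits > 3 || value > 255) {
                return std::nullopt;
            }
            ++pos;
        }
        addr = (addr << 8) | value;
        // Octets are separated by a single dot, with none after the last.
        if (octet < 3) {
            if (pos >= s.size() || s[pos] != '.') {
                return std::nullopt;
            }
            ++pos;
        }
    }
    if (pos != s.size()) {
        return std::nullopt;  // trailing characters
    }
    return addr;
}
```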
Avi Kivity
1396459085 net: integrate tcp into ipv4
Define the traits class used to communicate address types and pseudo header
to tcp, and a few glue classes.
2014-09-02 20:39:12 +03:00
Avi Kivity
577ac850c8 inet: move ip_checksum into a separate header
Thus it can be used by tcp.
2014-09-02 20:32:03 +03:00
Avi Kivity
0a45d4d73b net: implement IPv4 L3->l4 dispatching 2014-09-01 15:19:17 +03:00
Avi Kivity
b2b24031e9 net: generalize IP checksummer
Allow it to checksum packets and fragments.
2014-09-01 15:17:27 +03:00
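A checksummer that accepts data in several pieces, as this commit describes, has to remember whether the next byte falls on an odd offset. The sketch below shows one way to do that for the Internet checksum; the class name `checksummer_sketch` and its interface are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Incremental Internet checksum (one's complement sum of 16-bit words).
// Data may be added in arbitrary pieces, including odd-length ones.
class checksummer_sketch {
public:
    void add(const uint8_t* data, size_t len) {
        assert(data != nullptr || len == 0);
        for (size_t i = 0; i < len; ++i) {
            // Bytes at even offsets are the high half of a 16-bit word.
            _sum += _odd ? uint64_t(data[i]) : (uint64_t(data[i]) << 8);
            _odd = !_odd;
        }
    }

    // Fold carries back into 16 bits and return the one's complement.
    uint16_t get() const {
        uint64_t s = _sum;
        while (s >> 16) {
            s = (s & 0xffff) + (s >> 16);
        }
        return static_cast<uint16_t>(~s);
    }

private:
    uint64_t _sum = 0;
    bool _odd = false;
};
```

Feeding the same bytes in one call or split across several calls yields the same result, which is the property needed to checksum a packet made of fragments.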
Avi Kivity
c77f77ee3f build: organize files into a directory structure 2014-08-31 21:29:13 +03:00