Commit Graph

422 Commits

Author SHA1 Message Date
Avi Kivity
94ffb2c948 net: add missing includes to byteorder.hh 2015-01-12 14:11:56 +02:00
Gleb Natapov
bef054f8c8 net: rename udp_v4 to ipv4_udp for consistency with other l4 protocols 2015-01-11 12:29:05 +02:00
Gleb Natapov
32b42af49f net: register l3 poller for tcp connections
This patch change tcp to register a poller so that l3 can poll tcp for
a packet instead of pushing packets from tcp to ipv4. This pushes
networking tx path inversion a little bit closer to an application.
2015-01-11 10:48:32 +02:00
Gleb Natapov
d5c309c74e net: provide poller registration API between l3 and l4
Both push and pull methods will be supported between l3 and l4 after
this patch.
2015-01-11 10:17:48 +02:00
Gleb Natapov
2b340b80ce net: unfuturize packet fragmentation
Since sending of a single packet does not involve futures anymore we can
simplify this code.
2015-01-11 10:17:48 +02:00
Avi Kivity
4c59fe6436 xen: ensure _xenstore member is initialized early enough
Thanks clang.
2015-01-08 18:44:23 +02:00
Avi Kivity
062f621aa0 net: wrap toeplitz hash initializer in more braces
Nagging by clang.
2015-01-08 18:43:49 +02:00
Gleb Natapov
13c1324d45 net: provide some statistics via collectd
Provide batching and overall send/received packet stats.
2015-01-08 17:41:26 +02:00
Gleb Natapov
51fb18aba0 net: remove unused variable from virtio 2015-01-08 16:45:01 +02:00
Gleb Natapov
b824790798 net: move udp_v4 from network_stack into ipv4 class
ipv4 class manages tcp and icmp, but for some reason udp is managed by
network_stack. Fix this and make all L4 protocol handling to be the same.
2015-01-08 11:33:19 +02:00
Takuya ASADA
e9e1ed7977 Implement TCP client on native-stack 2015-01-08 01:26:36 +09:00
Takuya ASADA
7b3b9e5a46 Implement TCP client on posix-stack
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-01-08 01:26:36 +09:00
Takuya ASADA
b9a2541c7e Add reactor::connect(), client_socket definition and network stack stub code
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-01-08 01:26:36 +09:00
Takuya ASADA
2118fe5a09 Add l4connid::hash() 2015-01-08 01:26:36 +09:00
Takuya ASADA
9c2ca3d6b6 return correct option size when new SYN packet sending
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-01-08 01:26:36 +09:00
Takuya ASADA
ff39f8c93e Initialize _snd.initial/next on tcb constructor 2015-01-07 23:48:41 +09:00
Gleb Natapov
8d4e6b832a net: implement bulk sending interface for dpdk 2015-01-06 15:24:10 +02:00
Gleb Natapov
aae617f9f5 net: revert whatever is left from "virtio: batch transmitted packets" commit.
Revert remains of commit 503f1bf4 since there is no need to batch
packets inside virtio any more. Upper layer does it already.
2015-01-06 15:24:10 +02:00
Gleb Natapov
72324f02e2 net: implement bulk sending interface for virtio 2015-01-06 15:24:10 +02:00
Gleb Natapov
77bd21c387 net: implement bulk sending interface for proxy queue
Take advantage of the bulk interface to send several packets simultaneity
with one submit_to() to remote cpu.
2015-01-06 15:24:10 +02:00
Gleb Natapov
c98bdf5137 net: limit native udp send buffer size
Currently udp sender my send whenever it has data and if it does
this faster than packets can be transmitted we will run out of memory.
This patch limits how much outstanding data each native udp channel may
have.
2015-01-06 15:24:10 +02:00
Gleb Natapov
0fd014fc35 net: add add completion callback between l3 and l4
L4 will provide the callback to be called by L3 after the packet is
handled to lower layers for transmission. L4 will know that it can queue
more data from user at this point. The patch also change send function
that can no longer block to return void instead of future<>.
2015-01-06 15:24:10 +02:00
Gleb Natapov
e80fa4af7d net: drop top level 'remaining' from ipv4::send()
It is not needed.
2015-01-06 15:24:10 +02:00
Gleb Natapov
12bce3f4fc net: make interface get packets from l3
Instead of l3 (arp/ipv4) pushing packets into interface's queue, make
them register functions that interface can use to ask l3 for packets.
2015-01-06 15:24:10 +02:00
Gleb Natapov
e5d0adb339 net: make qp poll for tx packets from networking stack
Packets are accumulated in interface's packet queue. The queue is polled
by qp to see if there is something to send.
2015-01-06 15:24:10 +02:00
Gleb Natapov
865d95c0f1 net: provide bulk sending interface for qp
Implement it as calls to send() in a loop for now. Each device will
get proper implementation later.
2015-01-06 15:24:10 +02:00
Avi Kivity
87f63f7b90 shared_ptr: rename to lw_shared_ptr (for light-weight)
The current shared_ptr implementation is efficient, but does not support
polymorphic types.

Rename it in order to make room for a polymorphic shared_ptr.
2015-01-04 22:38:49 +02:00
Gleb Natapov
d72de7e959 net: signal space availability in tcp buffer only after
Currently space availability is signaled when packet is taken from
unsend queue, but packet can still be alive for a long time at his
point.
2015-01-04 16:37:17 +02:00
Avi Kivity
cef37f1546 tcp: fix unfulfilled read promise on RST
A user may be waiting for data, but we never we never notify them if we
receive an RST.  As a result the tcb, connection, and any user data structures
will hang around in memory.

Fix by notifying the user if they are waiting, and marking the connection as
nuked so they don't try to read again.

Reviewed-by: Asias He <asias@cloudius-systems.com>
2015-01-04 12:20:31 +02:00
Avi Kivity
05a02b93a0 Merge branch 'tcp'
Tcp protects tcbs using a shared_ptr, but in some cases captures an
unprotected [this] in lambdas, which can outlive the shared_ptr.

Introduce and use enable_shared_from_this to fix.

Reviewed-by: Asias He <asias@cloudius-systems.com>
2014-12-31 12:17:21 +02:00
Gleb Natapov
6ad9114c0b reactor: add at_destroy() function to the reactor and use it
Unfortunately at_exit() cannot be used to delete objects since when
it runs the reactor is still active and deleted object may still been used.
We need another API that runs its task after reactor is already stopped.
at_destroy() will be such api.
2014-12-30 15:21:10 +02:00
Gleb Natapov
f0cdc47a3a net: do not sleep while waiting for link in dpdk
Use promise and seastar timers instead.
2014-12-29 13:06:10 +02:00
Gleb Natapov
a445b8174e net: wait for link to be ready before creating network stack 2014-12-29 13:06:10 +02:00
Avi Kivity
073d6062a4 tcp: protect continuations launched from tcb against its destruction
If the tcb is destroyed (by, say, the connection being closed and an RST),
then any continuation launched from it would see it destroyed when it
executes.

Fix by protecting the tcb using a shared pointer reference.
2014-12-29 11:12:44 +02:00
Avi Kivity
b05f33cf63 tcp: drop two useless continuations
They do nothing.
2014-12-29 11:12:33 +02:00
Gleb Natapov
d329a0a614 net: remove non polling mode from virtio-net 2014-12-28 14:54:43 +02:00
Gleb Natapov
4d25571349 net: fix dhcp renewing
Current code forgets to install dhcp packet filter before renewing
dhcp. The patch fixes this.
2014-12-23 18:45:33 +02:00
Avi Kivity
133d39131c net: fix const correctness for byte-order functions
The byte-order functions were changed not to do in-place conversions,
but they still accept non-const inputs, although they do not modify them.
This can make them harder to use in some cases.

Fix by marking the inputs const.
2014-12-23 17:48:16 +02:00
Gleb Natapov
510171d083 net: add function to map packet's rss hash to a cpu
Provide a function that maps packet's rss hash to a cpu that should handle
it. This function is needed to find appropriate src port for outgoing
tcp/udp connection. Use this function to forward de-fragmented ip packet
to avoid one extra hop too.
2014-12-23 17:36:40 +02:00
Vlad Zolotarov
db50b480a3 dpdk: check_port_link_status(): Cosmetics fix of a printouts.
Add a space after the "Checking link status" to prevent it from
merging with "done" if the link is up immediatelly.
For instance this is going to be the case for a VF
of a PF with already established link (e.g. on AWS).

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:55:50 +02:00
Vlad Zolotarov
1a6474d6cc dpdk: added the asserts to check the assumptions regarding CSUM features
We assume that if Rx IPv4, TCP and UDP checksum offload features are suported then
they are supported or not supported all together. The same is about the Tx UDP and TCP
checksum offload.

Add the assert that check this assumption.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:55:44 +02:00
Vlad Zolotarov
38781639ef dpdk: Use all availiable parser options for RSS.
Don't limit ourselves to just IPV4, TCP and UDP even if it's all we currently
care about.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:55:38 +02:00
Vlad Zolotarov
02dd7a3e24 packet: Change the type of offload_info.vlan_tci to std::experimental::optional
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:51:05 +02:00
Vlad Zolotarov
c9e0e7aff8 dpdk: Set RSS mode: enable RSS if seastar is configured with more than 1 CPU.
Even if port has a single queue we still want the RSS feature to be
available in order to make HW calculate RSS hash for us.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:50:28 +02:00
Vlad Zolotarov
15e432715a dpdk: Use DPDK provided default configurations for Rx and Tx queues parameters.
DPDK 1.8 provides per-device default Tx and Rx queues configurations in the output
of rte_eth_dev_info_get(). Use them instead of ixgbe tuned hardcoded values.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:48:31 +02:00
Vlad Zolotarov
51bb90a397 dpdk: Don't print the MAC address from the hw_address() method.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:37:18 +02:00
Vlad Zolotarov
2b4f9f69f8 dpdk: Make the port initialization stages more pronounced
- Rename: init_port() -> init_port_start().
 - Added a function init_port_fini() that has a code originally found flat in
   init_local_queue().
 - Moved the link state check to init_port_fini() since the link state should
   be checked after the port has been started.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:37:13 +02:00
Vlad Zolotarov
59403f0774 dpdk: First version that supports both 1.7.x and 1.8.x (current git master) DPDK versions.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:37:05 +02:00
Vlad Zolotarov
11c54bd1d5 dpdk: Change the default behavior when --dpdk-pmd is set
1) Make --dpdk-pmd parameter to be a flag instead of a (key, value).
 2) Default to a default hugetlbfs DPDK settings when --hugepages is not
    given and --dpdk-pmd is set.

This will allow a more friendly user experience in general and when one doesn't
want to provide a --hugepages parameter in particular.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:36:58 +02:00
Vlad Zolotarov
ddf239a943 dpdk: Move the scattered DPDK EAL initialization into the dpdk::eal.
- Move the smp::dpdk_eal_init() code into the dpdk::eal::init() where it belongs.
 - Removed the unused "opts" parameter of dpdk::dpdk_device constructor - all its usage
   has been moved to dpdk::eal::init().
 - Cleanup in reactor.cc: #if HAVE_DPDK -> #ifdef HAVE_DPDK; since we give a -DHAVE_DPDK
   option to a compiler.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:36:49 +02:00