Commit Graph

390 Commits

Author SHA1 Message Date
Gleb Natapov
6ad9114c0b reactor: add at_destroy() function to the reactor and use it
Unfortunately at_exit() cannot be used to delete objects, since the reactor
is still active when it runs and a deleted object may still be in use.
We need another API that runs its tasks after the reactor has already stopped.
at_destroy() will be such an API.
2014-12-30 15:21:10 +02:00
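The ordering problem described in this commit can be sketched with a toy model. This is a minimal illustration, not Seastar's reactor: the class `toy_reactor`, its `running()` accessor, and the helper `demo_shutdown_order()` are hypothetical names, and running destroy tasks from the destructor in registration order is an assumption.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the reactor; not Seastar's actual class.
class toy_reactor {
    std::vector<std::function<void()>> _exit_tasks;
    std::vector<std::function<void()>> _destroy_tasks;
    bool _running = false;
public:
    bool running() const { return _running; }
    void at_exit(std::function<void()> f) { _exit_tasks.push_back(std::move(f)); }
    void at_destroy(std::function<void()> f) { _destroy_tasks.push_back(std::move(f)); }

    void run() {
        _running = true;
        // ... the main loop would go here ...
        // Exit tasks run while the reactor is still active, so objects they
        // delete could still be referenced by pending work.
        for (auto& f : _exit_tasks) {
            f();
        }
        _running = false;
    }

    ~toy_reactor() {
        // Destroy tasks run only after the reactor has stopped
        // (assumption: registration order).
        for (auto& f : _destroy_tasks) {
            f();
        }
    }
};

// Records what each kind of task observes about the reactor state.
std::vector<std::string> demo_shutdown_order() {
    std::vector<std::string> log;
    {
        toy_reactor r;
        r.at_exit([&] { log.push_back(r.running() ? "exit: running" : "exit: stopped"); });
        r.at_destroy([&] { log.push_back(r.running() ? "destroy: running" : "destroy: stopped"); });
        r.run();
    }
    return log;
}
```

In this model the exit task sees a running reactor, while the destroy task sees a stopped one, which is why object deletion belongs in the latter.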
Gleb Natapov
f0cdc47a3a net: do not sleep while waiting for link in dpdk
Use promise and seastar timers instead.
2014-12-29 13:06:10 +02:00
Gleb Natapov
a445b8174e net: wait for link to be ready before creating network stack
2014-12-29 13:06:10 +02:00
Gleb Natapov
d329a0a614 net: remove non polling mode from virtio-net
2014-12-28 14:54:43 +02:00
Gleb Natapov
4d25571349 net: fix dhcp renewing
The current code forgets to install the dhcp packet filter before renewing
dhcp. This patch fixes that.
2014-12-23 18:45:33 +02:00
Avi Kivity
133d39131c net: fix const correctness for byte-order functions
The byte-order functions were changed not to do in-place conversions,
but they still accept non-const inputs, although they do not modify them.
This can make them harder to use in some cases.

Fix by marking the inputs const.
2014-12-23 17:48:16 +02:00
Gleb Natapov
510171d083 net: add function to map packet's rss hash to a cpu
Provide a function that maps packet's rss hash to a cpu that should handle
it. This function is needed to find appropriate src port for outgoing
tcp/udp connection. Use this function to forward de-fragmented ip packet
to avoid one extra hop too.
2014-12-23 17:36:40 +02:00
Vlad Zolotarov
db50b480a3 dpdk: check_port_link_status(): Cosmetics fix of a printouts.
Add a space after "Checking link status" to prevent it from
merging with "done" if the link is up immediately.
For instance this is going to be the case for a VF
of a PF with already established link (e.g. on AWS).

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:55:50 +02:00
Vlad Zolotarov
1a6474d6cc dpdk: added the asserts to check the assumptions regarding CSUM features
We assume that the Rx IPv4, TCP and UDP checksum offload features are either
all supported or all unsupported. The same applies to the Tx UDP and TCP
checksum offloads.

Add asserts that check these assumptions.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:55:44 +02:00
Vlad Zolotarov
38781639ef dpdk: Use all available parser options for RSS.
Don't limit ourselves to just IPV4, TCP and UDP even if it's all we currently
care about.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:55:38 +02:00
Vlad Zolotarov
02dd7a3e24 packet: Change the type of offload_info.vlan_tci to std::experimental::optional
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:51:05 +02:00
Vlad Zolotarov
c9e0e7aff8 dpdk: Set RSS mode: enable RSS if seastar is configured with more than 1 CPU.
Even if port has a single queue we still want the RSS feature to be
available in order to make HW calculate RSS hash for us.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:50:28 +02:00
Vlad Zolotarov
15e432715a dpdk: Use DPDK provided default configurations for Rx and Tx queues parameters.
DPDK 1.8 provides per-device default Tx and Rx queues configurations in the output
of rte_eth_dev_info_get(). Use them instead of ixgbe tuned hardcoded values.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-23 16:48:31 +02:00
Vlad Zolotarov
51bb90a397 dpdk: Don't print the MAC address from the hw_address() method.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:37:18 +02:00
Vlad Zolotarov
2b4f9f69f8 dpdk: Make the port initialization stages more pronounced
- Rename: init_port() -> init_port_start().
 - Added a function init_port_fini() that holds the code originally found inline in
   init_local_queue().
 - Moved the link state check to init_port_fini() since the link state should
   be checked after the port has been started.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:37:13 +02:00
Vlad Zolotarov
59403f0774 dpdk: First version that supports both 1.7.x and 1.8.x (current git master) DPDK versions.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:37:05 +02:00
Vlad Zolotarov
11c54bd1d5 dpdk: Change the default behavior when --dpdk-pmd is set
1) Make the --dpdk-pmd parameter a flag instead of a (key, value) pair.
 2) Fall back to the default hugetlbfs DPDK settings when --hugepages is not
    given and --dpdk-pmd is set.

This will allow a more friendly user experience in general and when one doesn't
want to provide a --hugepages parameter in particular.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:36:58 +02:00
Vlad Zolotarov
ddf239a943 dpdk: Move the scattered DPDK EAL initialization into the dpdk::eal.
- Move the smp::dpdk_eal_init() code into the dpdk::eal::init() where it belongs.
 - Removed the unused "opts" parameter of dpdk::dpdk_device constructor - all its usage
   has been moved to dpdk::eal::init().
 - Cleanup in reactor.cc: #if HAVE_DPDK -> #ifdef HAVE_DPDK, since we pass the -DHAVE_DPDK
   option to the compiler.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:36:49 +02:00
Vlad Zolotarov
7ec062e222 dpdk: Move dpdk_eal class into a separate file
- Make its methods static.
 - Rename dpdk::dpdk_eal -> dpdk::eal

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:36:42 +02:00
Gleb Natapov
b958a44304 smp: create seastar threads using DPDK when compiled with DPDK support
DPDK initialization creates its own threads and assumes that application
uses them, otherwise things do not work correctly (rte_lcore_id()
returns incorrect value for instance). This patch uses DPDK threads to
run seastar main loop making DPDK APIs work as expected.
2014-12-18 14:43:37 +02:00
Avi Kivity
ebf89ac560 virtio: use make_object_deleter
2014-12-16 14:55:02 +02:00
Avi Kivity
3e4c53300d Merge branch 'mq' of ssh://github.com/cloudius-systems/seastar-dev
Multiqueue support for #cpu != #q, from Gleb.
2014-12-16 11:11:22 +02:00
Gleb Natapov
c8189157ed net: use RSS hash key calculated by HW if available
Some (all?) RSS capable HW provides us with the hash that was used to
select the rx queue the packet was delivered to. If such a hash is available,
it is better to use it to forward the packet instead of calculating the hash
ourselves and suffering cache misses.
2014-12-16 10:53:41 +02:00
Gleb Natapov
d796487976 net: use our RSS key instead of letting DPDK select one
2014-12-16 10:53:41 +02:00
Gleb Natapov
d8ddaeb104 net: forward reassembled ip packet to correct queue
To figure out which cpu should handle a reassembled TCP packet, the RSS
redirection table has to be consulted.
2014-12-16 10:53:41 +02:00
Gleb Natapov
64adef7def net: copy RSS redirection table from a device
We will need it in a later patch.
2014-12-16 10:53:41 +02:00
Gleb Natapov
fbef83beb0 net: support for num of cpus > num of queues
This patch introduces logic to divide cpus between the available hw queue
pairs. Each cpu with a hw qp gets a set of cpus to distribute traffic
to. The algorithm doesn't take any topology considerations into account yet.
2014-12-16 10:53:41 +02:00
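One way to picture such a division is a plain round-robin split. This is a sketch under stated assumptions, not the algorithm the patch implements: the function name `divide_cpus` is hypothetical, cpus `0..nqueues-1` are assumed to own the hardware queue pairs, and, as in the commit message, no topology is considered.

```cpp
#include <vector>

// Hypothetical helper: result[q] lists the cpus served by hardware queue q.
// Assumption: cpu q owns queue q, and the remaining cpus are spread round-robin.
std::vector<std::vector<unsigned>> divide_cpus(unsigned ncpus, unsigned nqueues) {
    std::vector<std::vector<unsigned>> groups(nqueues);
    if (nqueues == 0) {
        return groups;  // nothing to distribute to
    }
    for (unsigned cpu = 0; cpu < ncpus; ++cpu) {
        groups[cpu % nqueues].push_back(cpu);
    }
    return groups;
}
```

With 5 cpus and 2 queues, this assigns cpus 0, 2 and 4 to queue 0 and cpus 1 and 3 to queue 1; the first entry of each group is the cpu that owns the queue.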
Gleb Natapov
7ac3ba901c net: rework packet forwarding logic
Instead of forward() deciding packet destination make it collect input
for RSS hash function depending on packet type. After data is collected
use toeplitz hash function to calculate packet's destination.
2014-12-16 10:53:41 +02:00
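For reference, the Toeplitz hash used by RSS can be sketched as follows. This is a generic textbook formulation, not the code from this patch; the function name and signature are assumptions. The input would typically be the collected header fields (for example source address, destination address, source port, destination port) in network byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Generic Toeplitz hash: for every set bit of the input (most significant bit
// first), XOR in the 32-bit window of the key that starts at that bit position.
// The key must be at least 4 bytes; key bits past its end are treated as zero.
std::uint32_t toeplitz_hash(const std::uint8_t* key, std::size_t key_len,
                            const std::uint8_t* data, std::size_t data_len) {
    assert(key_len >= 4);
    std::uint32_t hash = 0;
    // Current 32-bit window of the key, starting at bit 0.
    std::uint32_t window = (std::uint32_t(key[0]) << 24) | (std::uint32_t(key[1]) << 16) |
                           (std::uint32_t(key[2]) << 8) | std::uint32_t(key[3]);
    for (std::size_t i = 0; i < data_len; ++i) {
        for (int bit = 7; bit >= 0; --bit) {
            if (data[i] & (1u << bit)) {
                hash ^= window;
            }
            // Slide the window one bit to the left, pulling in the next key bit.
            window <<= 1;
            if (i + 4 < key_len && (key[i + 4] & (1u << bit))) {
                window |= 1;
            }
        }
    }
    return hash;
}
```

The hash is linear over XOR, which is a convenient property to check: hashing `a ^ b` gives the same result as XOR-ing the hashes of `a` and `b`.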
Gleb Natapov
dd2f73401f net: add toeplitz hash function for rss
2014-12-16 10:53:41 +02:00
Gleb Natapov
bd9b0b8962 net: remove broadcast logic from forwarding path
No longer used.
2014-12-15 17:38:20 +02:00
Gleb Natapov
055fbb9430 net: broadcast arp reply on arp protocol level
Instead of returning a special value from forward() to broadcast an arp reply,
call arp.learn() on all cpus at the arp protocol level. The ability of
forward() to return a special value will be removed by later patches.
2014-12-15 17:36:14 +02:00
Gleb Natapov
c13adb9c12 net: rework how dhcp handles dhcp packet.
Currently dhcp assumes that cpu 0 gets all the packets and redistributes
them by itself. With multiqueue this is not necessarily the case, so the
current trick of disabling forwarding by installing a special dhcp forward()
function will not work. Rework it by installing a packet filter on all
cpus before running dhcp and forwarding all dhcp packets to cpu 0.
2014-12-15 17:31:25 +02:00
Asias He
c9553b7825 tcp: Switch to use lowres_clock
2014-12-15 19:39:33 +08:00
Asias He
0790266be0 ip: Switch to use lowres_clock
2014-12-15 19:39:33 +08:00
Asias He
62fff15e54 timer: Make timer a template
2014-12-15 19:39:33 +08:00
Avi Kivity
a3f08c32de virtio: rename misleading _deleters field
It's just a set of buffers (albeit maintained as unique_ptrs for their
destructors).  Not the 'deleter' type.
2014-12-15 11:42:33 +02:00
Avi Kivity
38b1398750 virtio: remove outdated TODO re single-fragment packet
We already special case single fragment packets on the receive path.
2014-12-15 11:39:00 +02:00
Avi Kivity
508322c7da virtio: de-futurize receive
Move completion handling (destroy packet, adjust descriptors count) to
a completion function rather than a future.  Reduces allocations and tasks
executed.
2014-12-14 18:49:01 +02:00
Avi Kivity
1ee959d3e2 virtio: de-futurize transmit
Move completion handling (destroy packet, adjust descriptors count) to
a completion function rather than a future.  Reduces allocations and tasks
executed.
2014-12-14 18:49:01 +02:00
Avi Kivity
c7c0aebf07 virtio: abstract vring request completions
Currently vring request completions are handled by fulfilling a promise
contained in the request.  While promises are very flexible, this comes
at a cost (allocating and executing a task), and this flexibility is unneeded
when request handling is very regular (such as in virtio-net rx and tx
completion handling).

Make vring more flexible by allowing the completion function to be specified
as a template parameter.  No changes to the actual users - they now specify
the completion function as fulfilling the same promise as vring previously
did.
2014-12-14 18:49:01 +02:00
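The idea of passing the completion function as a template parameter can be sketched like this. It is a heavily simplified illustration, not the actual vring code: `toy_vring`, `on_request_done()` and `count_bytes` are hypothetical names, and a real completion would receive more than a byte count.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical, heavily simplified ring. Completion is any callable that takes
// the number of bytes the device reported for a finished request.
template <typename Completion>
class toy_vring {
    Completion _complete;
public:
    explicit toy_vring(Completion c) : _complete(std::move(c)) {}

    // Called when the device reports a finished request of `len` bytes.
    // The call is resolved at compile time: no promise, no task allocation.
    void on_request_done(std::size_t len) { _complete(len); }
};

// Example completion for a very regular user: just account for the bytes.
struct count_bytes {
    std::size_t* total;
    void operator()(std::size_t len) const { *total += len; }
};
```

A user that still wants the old behaviour could supply a completion that fulfills a promise, while regular paths such as rx and tx completion handling can plug in a cheap functor like `count_bytes`.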
Avi Kivity
a86faf0209 virtio: de-virtualize virt_to_phys
It is not a device property, but a system property.
2014-12-14 18:49:01 +02:00
Avi Kivity
f3d2908757 virtio: move buffer and config out of vring class
Prior to templating it, best to get the common elements out.
2014-12-14 18:49:01 +02:00
Avi Kivity
fcbcc19231 virtio: remove buffer_chain class
It's a concept that is instantiated by its users, not a true class.
2014-12-14 18:49:01 +02:00
Avi Kivity
5c4ae7a726 virtio: minor code movement
2014-12-14 18:49:01 +02:00
Avi Kivity
d14da53171 virtio: move into 'namespace virtio'
2014-12-14 18:49:01 +02:00
Avi Kivity
ea2cfbbcd8 virtio: fix indentation
2014-12-14 10:28:48 +02:00
Avi Kivity
503f1bf4d0 virtio: batch transmitted packets
Instead of placing packets directly into the virtio ring, add them to
a temporary queue, and flush it when we are polled.  This reduces
cross-cpu writes and kicks.
2014-12-11 19:20:50 +02:00
Avi Kivity
97dff83461 virtio: don't try to complete after posting a buffer, if in poll mode
We will poll for it soon anyway, and completing too soon simply reduces
batching.
2014-12-11 19:15:46 +02:00
Avi Kivity
4e653081a4 virtio: poll mode support
With a new --virtio-poll-mode, poll queues instead of waiting for an
interrupt.

Increases httpd throughput by about 12%.
2014-12-11 19:15:46 +02:00
Gleb Natapov
da53dcff80 net: simplify calculation of number of queues
2014-12-11 13:06:38 +02:00