Currently a udp sender may send whenever it has data, and if it does
this faster than packets can be transmitted we will run out of memory.
This patch limits how much outstanding data each native udp channel may
have.
L4 will provide a callback to be called by L3 after the packet is
handed to lower layers for transmission. L4 will know that it can queue
more data from the user at this point. The patch also changes the send
function, which can no longer block, to return void instead of future<>.
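
A minimal sketch of the idea, not the code from this patch: fake_l3,
bounded_udp_channel, can_send() and budget() are hypothetical names, and
the byte budget is an assumed way of tracking outstanding data. L4 hands
L3 a callback with each packet and gets its budget back when L3 calls it.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for L3: it queues packets and, when it hands one
// to the (imaginary) lower layer, invokes the callback supplied by L4.
struct fake_l3 {
    struct entry {
        std::string payload;
        std::function<void()> on_handed_down;
    };
    std::deque<entry> pending;

    void enqueue(std::string payload, std::function<void()> done) {
        pending.push_back({std::move(payload), std::move(done)});
    }

    // Hand one packet down and notify L4. Returns false if nothing was queued.
    bool transmit_one() {
        if (pending.empty()) {
            return false;
        }
        entry e = std::move(pending.front());
        pending.pop_front();
        e.on_handed_down();
        return true;
    }
};

// Hypothetical L4 channel with a cap on outstanding bytes.
// Assumes the channel outlives every packet it has queued in L3.
class bounded_udp_channel {
    fake_l3& _l3;
    std::size_t _budget;  // bytes that may still be queued
public:
    bounded_udp_channel(fake_l3& l3, std::size_t max_outstanding)
        : _l3(l3), _budget(max_outstanding) {}

    bool can_send(std::size_t len) const { return len <= _budget; }
    std::size_t budget() const { return _budget; }

    // Never blocks, so it returns void. The caller must check can_send()
    // first; this sketch does not guard against exceeding the budget.
    void send(std::string payload) {
        const std::size_t len = payload.size();
        _budget -= len;
        _l3.enqueue(std::move(payload), [this, len] { _budget += len; });
    }
};
```

With a 10 byte limit, a 6 byte send leaves 4 bytes of budget, and the
full 10 bytes are available again once transmit_one() has run.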
The current shared_ptr implementation is efficient, but does not support
polymorphic types.
Rename it in order to make room for a polymorphic shared_ptr.
A user may be waiting for data, but we never notify them if we
receive an RST. As a result the tcb, connection, and any user data structures
will hang around in memory.
Fix by notifying the user if they are waiting, and marking the connection as
nuked so they don't try to read again.
Reviewed-by: Asias He <asias@cloudius-systems.com>
Tcp protects tcbs using a shared_ptr, but in some cases captures an
unprotected [this] in lambdas, which can outlive the shared_ptr.
Introduce and use enable_shared_from_this to fix.
Reviewed-by: Asias He <asias@cloudius-systems.com>
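
A sketch of the capture pattern, using std::enable_shared_from_this as a
stand-in for the project's own shared pointer types. The tcb struct and
make_ack_callback() here are hypothetical and only illustrate why the
lambda holds a shared pointer instead of a bare [this].

```cpp
#include <functional>
#include <memory>

struct tcb : std::enable_shared_from_this<tcb> {
    int acked = 0;

    // Returns a deferred callback, standing in for a continuation.
    // Must be called on a tcb that is already owned by a shared_ptr.
    std::function<void()> make_ack_callback() {
        // Capturing a shared_ptr (not a bare `this`) keeps the tcb alive
        // until the callback itself is destroyed.
        auto self = shared_from_this();
        return [self] { ++self->acked; };
    }
};
```

The callback can then run safely even after the original owner has
dropped its reference, because the lambda holds one of its own.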
Unfortunately at_exit() cannot be used to delete objects, since when
it runs the reactor is still active and a deleted object may still be in use.
We need another API that runs its tasks after the reactor has stopped.
at_destroy() will be that API.
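
The ordering difference between the two hooks can be sketched as follows.
mini_reactor is a hypothetical toy, not the real reactor, and only shows
that at_exit() tasks run while the reactor is still active and
at_destroy() tasks run after it has stopped.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical miniature reactor, only to show when each hook runs.
class mini_reactor {
    std::vector<std::function<void()>> _exit_tasks;
    std::vector<std::function<void()>> _destroy_tasks;
    bool _running = false;
public:
    bool running() const { return _running; }

    void at_exit(std::function<void()> f) { _exit_tasks.push_back(std::move(f)); }
    void at_destroy(std::function<void()> f) { _destroy_tasks.push_back(std::move(f)); }

    void run() {
        _running = true;
        // ... the event loop would run here ...
        for (auto& f : _exit_tasks) {
            f();  // reactor is still active: objects may still be in use
        }
        _running = false;
        for (auto& f : _destroy_tasks) {
            f();  // reactor has stopped: a safe point to delete objects
        }
    }
};
```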
If the tcb is destroyed (by, say, the connection being closed and an RST),
then any continuation launched from it would see it destroyed when it
executes.
Fix by protecting the tcb using a shared pointer reference.
The byte-order functions were changed not to do in-place conversions,
but they still accept non-const inputs, although they do not modify them.
This can make them harder to use in some cases.
Fix by marking the inputs const.
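
A small illustration of why const inputs help. swap16() and byteswapped()
are hypothetical helpers, not the project's byte-order functions: since
they return a converted copy, they can accept const references and so
work on objects the caller only holds as const.

```cpp
#include <cstdint>

// Hypothetical helper: returns a byte-swapped copy and leaves the
// argument untouched, so the parameter can be const.
inline uint16_t swap16(const uint16_t& v) {
    return static_cast<uint16_t>((v << 8) | (v >> 8));
}

struct header {
    uint16_t port;
};

// Accepting `const header&` lets callers convert headers they cannot modify.
inline header byteswapped(const header& h) {
    return header{swap16(h.port)};
}
```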
Provide a function that maps a packet's rss hash to the cpu that should
handle it. This function is needed to find an appropriate src port for an
outgoing tcp/udp connection. Also use this function to forward
de-fragmented ip packets, avoiding one extra hop.
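
A rough sketch of how such a mapping can drive source port selection.
Everything here is an assumption for illustration: hash_to_cpu() uses a
plain modulo (real hardware typically goes through a redirection table),
and pick_src_port() with its hash_for_port callback is a hypothetical
helper, not this patch's API.

```cpp
#include <cstdint>
#include <functional>

// Assumed mapping: plain modulo over the number of cpus (nr_cpus > 0).
inline unsigned hash_to_cpu(uint32_t rss_hash, unsigned nr_cpus) {
    return rss_hash % nr_cpus;
}

// Search the ephemeral range for a source port whose connection hash maps
// to `this_cpu`, so replies arrive on the cpu that owns the connection.
// `hash_for_port` is supplied by the caller and returns the hash the NIC
// would compute for a given source port. Returns -1 if no port fits.
inline int pick_src_port(const std::function<uint32_t(uint16_t)>& hash_for_port,
                         unsigned this_cpu, unsigned nr_cpus,
                         uint16_t first_port = 49152) {
    for (uint32_t p = first_port; p <= 65535; ++p) {
        const uint16_t port = static_cast<uint16_t>(p);
        if (hash_to_cpu(hash_for_port(port), nr_cpus) == this_cpu) {
            return port;
        }
    }
    return -1;
}
```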
Add a space after "Checking link status" to prevent it from
merging with "done" if the link is up immediately.
For instance this is going to be the case for a VF
of a PF with already established link (e.g. on AWS).
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
We assume that the Rx IPv4, TCP and UDP checksum offload features are either
all supported or all unsupported. The same holds for the Tx UDP and TCP
checksum offloads.
Add an assert that checks this assumption.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
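
The all-or-nothing check can be sketched as below. The capability bits
and function names are hypothetical placeholders; a real driver would
test the offload flags reported by the NIC library instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical capability bits, for illustration only.
constexpr uint32_t rx_ipv4_csum = 1u << 0;
constexpr uint32_t rx_udp_csum  = 1u << 1;
constexpr uint32_t rx_tcp_csum  = 1u << 2;
constexpr uint32_t rx_csum_all  = rx_ipv4_csum | rx_udp_csum | rx_tcp_csum;

// True when the bits selected by `group` are either all set or all clear.
constexpr bool all_or_none(uint32_t caps, uint32_t group) {
    return (caps & group) == 0 || (caps & group) == group;
}

// Asserts the assumption, then reports whether Rx checksum offload is usable.
inline bool rx_csum_offload_supported(uint32_t caps) {
    assert(all_or_none(caps, rx_csum_all));
    return (caps & rx_csum_all) == rx_csum_all;
}
```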
Even if a port has a single queue we still want the RSS feature to be
available in order to make the HW calculate the RSS hash for us.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
DPDK 1.8 provides per-device default Tx and Rx queue configurations in the output
of rte_eth_dev_info_get(). Use them instead of the hardcoded ixgbe-tuned values.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Rename: init_port() -> init_port_start().
- Added a function init_port_fini() that contains the code originally found
inline in init_local_queue().
- Moved the link state check to init_port_fini() since the link state should
be checked after the port has been started.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
1) Make the --dpdk-pmd parameter a flag instead of a (key, value) pair.
2) Fall back to the default DPDK hugetlbfs settings when --hugepages is not
given and --dpdk-pmd is set.
This gives a friendlier user experience in general, and in particular when
one doesn't want to provide a --hugepages parameter.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Move the smp::dpdk_eal_init() code into dpdk::eal::init() where it belongs.
- Remove the unused "opts" parameter of the dpdk::dpdk_device constructor - all
its uses have been moved to dpdk::eal::init().
- Cleanup in reactor.cc: #if HAVE_DPDK -> #ifdef HAVE_DPDK, since we pass the
-DHAVE_DPDK option to the compiler.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
DPDK initialization creates its own threads and assumes that the application
uses them; otherwise things do not work correctly (rte_lcore_id()
returns an incorrect value, for instance). This patch uses DPDK threads to
run the seastar main loop, making DPDK APIs work as expected.
Some (all?) RSS capable HW provides us with the hash that was used to
select the rx queue the packet was delivered to. If such a hash is available
it is better to use it to forward the packet instead of calculating the hash
ourselves and suffering cache misses.
This patch introduces logic to divide cpus between the available hw queue
pairs. Each cpu with a hw qp gets a set of cpus to distribute traffic
to. The algorithm doesn't take any topology considerations into account yet.
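
One topology-unaware way to split cpus can be sketched like this. It is
an assumed round-robin scheme, not necessarily the one in this patch, and
divide_cpus() is a hypothetical name. It assumes cpus 0..nr_queues-1 own
the hw queue pairs.

```cpp
#include <vector>

// For each cpu that owns a hw queue pair (cpus 0..nr_queues-1 by
// assumption), list the cpus it distributes traffic to, itself included.
// Cpus are dealt out round-robin with no regard for topology.
inline std::vector<std::vector<unsigned>>
divide_cpus(unsigned nr_cpus, unsigned nr_queues) {
    std::vector<std::vector<unsigned>> targets(nr_queues);
    if (nr_queues == 0) {
        return targets;  // no hw queues, nothing to divide
    }
    for (unsigned cpu = 0; cpu < nr_cpus; ++cpu) {
        targets[cpu % nr_queues].push_back(cpu);
    }
    return targets;
}
```

For 5 cpus and 2 queue pairs, cpu 0 distributes to cpus 0, 2 and 4, and
cpu 1 distributes to cpus 1 and 3.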
Instead of forward() deciding the packet destination, make it collect input
for the RSS hash function depending on packet type. After the data is
collected, use the Toeplitz hash function to calculate the packet's destination.
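
For reference, a bit-at-a-time sketch of a Toeplitz hash over collected
input bytes. This is an illustrative implementation, not the one added by
the patch, and toeplitz_hash() with this signature is an assumption. The
key must be at least 4 bytes longer than the input.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toeplitz hash: for every set bit of the input (most significant bit
// first), XOR in the 32 key bits that start at that bit position.
inline uint32_t toeplitz_hash(const std::vector<uint8_t>& key,
                              const std::vector<uint8_t>& data) {
    if (key.size() < data.size() + 4) {
        throw std::invalid_argument("key must be at least 4 bytes longer than data");
    }
    uint32_t hash = 0;
    // The window holds the 32 key bits aligned with the current input bit.
    uint32_t window = (uint32_t(key[0]) << 24) | (uint32_t(key[1]) << 16) |
                      (uint32_t(key[2]) << 8) | uint32_t(key[3]);
    for (std::size_t i = 0; i < data.size(); ++i) {
        for (int bit = 7; bit >= 0; --bit) {
            if (data[i] & (1u << bit)) {
                hash ^= window;
            }
            // Slide the window by one bit, pulling in the next key bit.
            window = (window << 1) | ((uint32_t(key[i + 4]) >> bit) & 1u);
        }
    }
    return hash;
}
```

The hash is linear over XOR, so hashing a ^ b gives the same result as
XOR-ing the hashes of a and b, and an all-zero input hashes to 0.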
Instead of returning a special value from forward() to broadcast an arp reply,
call arp.learn() on all cpus at the arp protocol level. The ability of
forward() to return a special value will be removed by later patches.
Currently dhcp assumes that cpu 0 gets all the packets and redistributes
them by itself. With multiqueue this is not necessarily the case, so the
current trick of disabling forwarding by installing a special dhcp forward()
function will not work. Rework it by installing a packet filter on all
cpus before running dhcp and forwarding all dhcp packets to cpu 0.