Commit Graph

41 Commits

Author SHA1 Message Date
Gleb Natapov
f77d3bbf52 net: extend connect() API to allow to bind to specific local address/port 2015-05-20 17:10:00 +03:00
Vlad Zolotarov
934c6ace46 DPDK: Add Ethernet HW Flow Control feature
Add an option to enable/disable sending and respecting PAUSE frames as defined in
802.3x and 802.3z specifications. We will configure the Link level PAUSEs
(as opposed to PFC).

In simple terms, Ethernet Flow Control relies on sending/receiving PAUSE (XOFF) MAC frames that
indicate to the sender that the receiver's buffer is almost full. The idea is to avoid receive buffer overflow. When
the receiver's buffer is being freed, it will send an XON frame to indicate to the sender that it may
transmit again.

   - Added DPDK-specific command option to toggle the feature.
   - Sending PAUSEs is enabled by default.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-04-16 16:58:16 +03:00
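
The XOFF/XON exchange described in this commit can be modelled in a few lines. This is only an illustrative sketch: on real hardware the NIC generates PAUSE frames itself, and the names `rx_buffer_model` and `pause_signal`, as well as the high/low watermark parameters, are assumptions rather than DPDK or Seastar APIs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical signal names; real PAUSE frames are generated by the NIC.
enum class pause_signal { none, xoff, xon };

// Toy model of a receive buffer with two watermarks (both are assumptions).
class rx_buffer_model {
    std::size_t _high;      // fill level at which XOFF is sent
    std::size_t _low;       // fill level at which XON is sent
    std::size_t _used = 0;  // frames currently buffered
    bool _paused = false;   // true while the sender has been told to stop
public:
    rx_buffer_model(std::size_t high, std::size_t low) : _high(high), _low(low) {
        assert(low < high);
    }
    // A frame arrived: ask the sender to pause once the buffer is almost full.
    pause_signal on_receive() {
        ++_used;
        if (!_paused && _used >= _high) {
            _paused = true;
            return pause_signal::xoff;
        }
        return pause_signal::none;
    }
    // A frame was consumed: let the sender resume once enough space is free.
    pause_signal on_consume() {
        if (_used > 0) {
            --_used;
        }
        if (_paused && _used <= _low) {
            _paused = false;
            return pause_signal::xon;
        }
        return pause_signal::none;
    }
};
```

Using two separate thresholds keeps the model from flapping between XOFF and XON when the fill level hovers around a single value.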
Vlad Zolotarov
553e1a5e75 DPDK: Add LRO support
- Added LRO ON/OFF native stack command line parameter.
   - Implemented handling the reception of a clustered packet:
      - Without hugetlbfs: allocate a single buffer to contain the whole
        packet's data and copy its contents into it. If the allocation fails, build
        the "packet" directly from the cluster.
      - With hugetlbfs: create a packet from cluster mbuf's data buffers.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v3:
   - Use RTE_ETHDEV_HAS_LRO_SUPPORT defined in rte_ethdev.h instead of
     RTE_ETHDEV_LRO_SUPPORT defined in config/common_linuxapp.

New in v2:
   - dpdk_qp<false>::from_mbuf_lro(): Free the cluster after copying
     to the allocated buffer.
   - Some style cleanups.
2015-04-14 19:44:05 +03:00
Avi Kivity
7f8d88371a Add LICENSE, NOTICE, and copyright headers to all source files.
The two files imported from the OSv project retain their original licenses.
2015-02-19 16:52:34 +02:00
Gleb Natapov
7a92efe8d1 core: add local engine accessor function
Do not use thread local engine variable directly, but use accessor
instead.
2015-01-27 14:46:49 +02:00
Avi Kivity
5678a0995e net: use a redirection table to forward packets to proxy queues
Build a 128-entry redirection table to select which cpu services which
packet, when we have more cores than queues (and thus need to dispatch
internally).

Add a --hw-queue-weight option to control the relative weight of the hardware queue.
With a weight of 0, the core that services the hardware queue will not
process any packets; with a weight of 1 (default) it will process an equal
share of packets, compared to proxy queues.
2015-01-22 09:36:04 +02:00
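
A weighted redirection table like the one this commit describes might be built roughly as follows. This is a sketch under stated assumptions, not the code from the commit: the helper names `build_redirection_table` and `select_cpu`, the rounding rule, and the round-robin fill are all hypothetical. Only the 128-entry size and the meaning of the weight (0 means no packets for the hardware-queue core, 1 means an equal share) come from the message.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t redirection_table_size = 128;  // size taken from the commit message

// Hypothetical helper: maps each hash bucket to the cpu that should process it.
// hw_cpu owns the hardware queue and counts as hw_weight shares;
// every proxy cpu counts as one share.
std::array<unsigned, redirection_table_size>
build_redirection_table(unsigned hw_cpu, const std::vector<unsigned>& proxy_cpus, double hw_weight) {
    assert(hw_weight >= 0);
    std::array<unsigned, redirection_table_size> table;
    table.fill(hw_cpu);
    if (proxy_cpus.empty()) {
        return table;  // nobody to forward to, so hw_cpu handles everything
    }
    double total = hw_weight + static_cast<double>(proxy_cpus.size());
    auto hw_slots = static_cast<std::size_t>(redirection_table_size * hw_weight / total + 0.5);
    hw_slots = std::min(hw_slots, redirection_table_size);
    // Slots [0, hw_slots) stay with hw_cpu; the rest go round-robin to the proxies.
    for (std::size_t i = hw_slots; i < redirection_table_size; ++i) {
        table[i] = proxy_cpus[(i - hw_slots) % proxy_cpus.size()];
    }
    return table;
}

// Picks the cpu for a packet from its flow hash (assumed to be computed elsewhere).
unsigned select_cpu(const std::array<unsigned, redirection_table_size>& table, std::size_t flow_hash) {
    return table[flow_hash % redirection_table_size];
}
```

With one hardware-queue core, three proxy cores, and a weight of 1, each core ends up with 32 of the 128 slots; with a weight of 0 the hardware-queue core owns none.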
Asias He
0c09a6bd7a tcp: Return a future for tcp::connect() 2015-01-21 16:20:39 +08:00
Gleb Natapov
19ced3da4c net: fix dhcp to use udp socket to send packets
No need for ad-hoc code to create udp packets.
2015-01-12 17:39:07 +02:00
Gleb Natapov
bef054f8c8 net: rename udp_v4 to ipv4_udp for consistency with other l4 protocols 2015-01-11 12:29:05 +02:00
Gleb Natapov
b824790798 net: move udp_v4 from network_stack into ipv4 class
ipv4 class manages tcp and icmp, but for some reason udp is managed by
network_stack. Fix this and make all L4 protocol handling the same.
2015-01-08 11:33:19 +02:00
Takuya ASADA
e9e1ed7977 Implement TCP client on native-stack 2015-01-08 01:26:36 +09:00
Takuya ASADA
b9a2541c7e Add reactor::connect(), client_socket definition and network stack stub code
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-01-08 01:26:36 +09:00
Avi Kivity
87f63f7b90 shared_ptr: rename to lw_shared_ptr (for light-weight)
The current shared_ptr implementation is efficient, but does not support
polymorphic types.

Rename it in order to make room for a polymorphic shared_ptr.
2015-01-04 22:38:49 +02:00
Gleb Natapov
a445b8174e net: wait for link to be ready before creating network stack 2014-12-29 13:06:10 +02:00
Gleb Natapov
4d25571349 net: fix dhcp renewing
Current code forgets to install dhcp packet filter before renewing
dhcp. The patch fixes this.
2014-12-23 18:45:33 +02:00
Vlad Zolotarov
11c54bd1d5 dpdk: Change the default behavior when --dpdk-pmd is set
1) Make the --dpdk-pmd parameter a flag instead of a (key, value) pair.
 2) Default to the default hugetlbfs DPDK settings when --hugepages is not
    given and --dpdk-pmd is set.

This will allow a more friendly user experience in general and when one doesn't
want to provide a --hugepages parameter in particular.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:36:58 +02:00
Vlad Zolotarov
ddf239a943 dpdk: Move the scattered DPDK EAL initialization into the dpdk::eal.
- Move the smp::dpdk_eal_init() code into the dpdk::eal::init() where it belongs.
 - Removed the unused "opts" parameter of dpdk::dpdk_device constructor - all its usage
   has been moved to dpdk::eal::init().
 - Cleanup in reactor.cc: #if HAVE_DPDK -> #ifdef HAVE_DPDK; since we give a -DHAVE_DPDK
   option to a compiler.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-22 17:36:49 +02:00
Avi Kivity
3e4c53300d Merge branch 'mq' of ssh://github.com/cloudius-systems/seastar-dev
Multiqueue support for #cpu != #q, from Gleb.
2014-12-16 11:11:22 +02:00
Gleb Natapov
fbef83beb0 net: support for num of cpus > num of queues
This patch introduces logic to divide cpus between the available hw queue
pairs. Each cpu with a hw qp gets a set of cpus to distribute traffic
to. The algorithm doesn't take any topology considerations into account yet.
2014-12-16 10:53:41 +02:00
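
One topology-unaware way to divide cpus between hardware queue pairs is a plain round-robin split, sketched below. The function name `assign_proxy_cpus` and the assumption that cpus 0..queues-1 are the ones owning a hardware queue pair are hypothetical; the commit message does not say which policy was actually used.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: result[q] lists the cpus that the owner of hw queue q
// distributes traffic to. Assumes cpus 0..queues-1 each own one hw queue pair.
std::vector<std::vector<unsigned>> assign_proxy_cpus(unsigned cpus, unsigned queues) {
    assert(queues > 0 && queues <= cpus);
    std::vector<std::vector<unsigned>> proxies(queues);
    // Every cpu without its own queue is attached to a queue owner in turn.
    for (unsigned cpu = queues; cpu < cpus; ++cpu) {
        proxies[cpu % queues].push_back(cpu);
    }
    return proxies;
}
```

For 8 cpus and 3 queues this attaches cpus 3 and 6 to queue 0, cpus 4 and 7 to queue 1, and cpu 5 to queue 2.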
Gleb Natapov
055fbb9430 net: broadcast arp reply on arp protocol level
Instead of returning a special value from forward() to broadcast an arp reply,
call arp.learn() on all cpus at the arp protocol level. The ability of
forward() to return a special value will be removed by later patches.
2014-12-15 17:36:14 +02:00
Gleb Natapov
c13adb9c12 net: rework how dhcp handles dhcp packet.
Currently dhcp assumes that cpu 0 gets all the packets and redistributes
them by itself. With multiqueue this is not necessarily the case, so the
current trick to disable forwarding by installing special dhcp forward()
function will not work. Rework it by installing a packet filter on all
cpus before running dhcp and forwarding all dhcp packets to cpu 0.
2014-12-15 17:31:25 +02:00
Asias He
62fff15e54 timer: Make timer a template 2014-12-15 19:39:33 +08:00
Gleb Natapov
649210b5b6 net: rename net::distributed_device to net::device 2014-12-11 13:06:32 +02:00
Gleb Natapov
73f6d943e1 net: separate device initialization from queues initialization
This patch adds a new class, distributed_device, which is responsible for
initializing the HW device and is shared between all cpus. The old device
class becomes responsible for managing an rx/tx queue pair and is local
to each cpu. Each cpu has to call distributed_device::init_local_queue() to
create its own device. The logic to distribute cpus between available
queues (in case there are not enough queues for each cpu) lives in
distributed_device and is not really implemented yet, so only the one-queue
and queues == cpus scenarios are supported currently, but this can
be fixed later.

The plan is to rename "distributed_device" to "device" and "device"
to "queue_pair" in later patches.
2014-12-09 18:55:14 +02:00
Gleb Natapov
2fb3dc03f6 net: remove unused opts parameter from proxy_net_device constructor 2014-12-09 18:55:05 +02:00
Vlad Zolotarov
2d10018870 dpdk: separate the EAL initialization from port initialization
- Create a new class dpdk_eal that initializes DPDK EAL.
 - Get rid of portmask crap and provide a port index to a dpdk::net_device
   constructor.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-12-07 17:31:12 +02:00
Gleb Natapov
bf46f9c948 net: Change how networking devices are created
Currently each cpu creates network device as part of native networking
stack creation and all cpus create native networking stack independently,
which makes it impossible to use data initialized by one cpu in another
cpu's networking device initialization. For multiqueue devices often some
parts of an initialization have to be handled by one cpu and all other
cpus should wait for the first one before creating their network devices.
Even without multiqueue, proxy devices should be created after the master
device is created so that a proxy device may get a pointer to the master
at creation time (existing code uses a global per-cpu device pointer and
assumes that the master device is created on cpu 0 to compensate for the lack
of ordering).

This patch makes it possible to delay native networking stack creation
until network device is created. It allows one cpu to be responsible
for creation of network devices on multiple cpus. A single-queue device
initializes the master device on one cpu and calls the other cpus with a pointer
to the master device and its cpu id, which are used in proxy device creation.
This removes the need for per cpu device pointer and "master on cpu 0"
assumption from the code since now master device and slave devices know
about each other and can communicate directly.
2014-11-30 18:10:08 +02:00
Vlad Zolotarov
12caa3afe4 net: add option to use a dpdk PMD networking backend
- Added "dpdk-pmd" option:
     - Defaulted to FALSE.
     - When TRUE - use DPDK PMD drivers.
 - Call for dpdk net_device creation function if dpdk-poll option is given
 - Added DPDK networking backend options to all options list

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-11-30 12:14:56 +02:00
Gleb Natapov
4f4731c37b net: delay network stack creation
Network device has to be available when network stack is created, but
sometimes network device creation should wait for device initialization
by another cpu. This patch makes it possible to delay network stack
creation until network device is available.
2014-11-26 16:46:04 +02:00
Gleb Natapov
8a754386c2 net: remove unused variable in native_network_stack 2014-11-25 09:54:44 +02:00
Calle Wilund
bfbdbdf29c dhcp: fix assert/crash in DHCP renew cycle.
Must not signal "_config" promise on renew. Also not needed.

Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
2014-11-11 14:04:00 +02:00
Tomasz Grabiec
95e09be799 net: add has_per_core_namespace() attribute to network stack
POSIX stack does not allow one to bind more than one socket to a given
port. Native stack on the other hand does. The way services are set up
depends on that. For instance, on native stack one might want to start
the service on all cores, but on POSIX stack only on one of them.
2014-11-11 13:52:23 +02:00
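
The kind of decision this attribute enables can be sketched as below. Only the name has_per_core_namespace() comes from the commit; the helper `cores_to_listen_on`, the choice of core 0 for the single-listener case, and passing the attribute in as a plain bool are assumptions made for illustration.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the cores on which a service should start listening.
// has_per_core_namespace mirrors the attribute added by the commit:
// true for the native stack, false for the POSIX stack.
std::vector<unsigned> cores_to_listen_on(bool has_per_core_namespace, unsigned core_count) {
    assert(core_count > 0);
    if (!has_per_core_namespace) {
        return {0u};  // the port can be bound only once, so pick a single core
    }
    std::vector<unsigned> cores(core_count);
    std::iota(cores.begin(), cores.end(), 0u);  // every core binds its own socket
    return cores;
}
```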
Avi Kivity
adc97c0162 dhcp: filter out DHCP failures
If we don't, we start the system before we have an IP address, and when
we actually do get the IP address, we fail an assert on the _config promise,
which was already fulfilled.
2014-11-09 15:03:07 +02:00
Avi Kivity
5bb13601fe xen: wrap in "xen" namespace
Names like "port" are too generic for the global namespace.
2014-11-09 14:41:01 +02:00
Pekka Enberg
86aa399482 net: Fix build when Xen support is disabled
Fixes the following link errors when Xen support is disabled:

build/release/net/native-stack.o: In function `net::add_native_net_options_description(boost::program_options::options_description&)':
/seastar/net/native-stack.cc:101: undefined reference to `get_xenfront_net_options_description()'
build/release/net/native-stack.o: In function `net::create_native_net_device(boost::program_options::variables_map)':
/seastar/net/native-stack.cc:93: undefined reference to `create_xenfront_net_device(boost::program_options::variables_map, bool)'
collect2: error: ld returned 1 exit status

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2014-11-06 10:24:03 +02:00
Avi Kivity
5052d34d23 Merge branch 'xen'
Partial Xen support.
2014-11-05 15:31:23 +02:00
Glauber Costa
6bb8d687d0 native stack: support more than virtio
Support xenfront as well, when we are in a Xen domain.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-05 15:09:03 +02:00
Calle Wilund
6d2095e12d net: DHCP
Simple discovery class + usage of it in the native stack init.

DHCP discovery will be default if no other ip options are set on command
line, and not explicitly turned off.

Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
2014-11-05 14:50:56 +02:00
Avi Kivity
9fbd13175b net: move mechanics of listening to a tcp connection to tcp.cc
Removes an include of tcp.hh.
2014-10-24 22:18:54 +03:00
Avi Kivity
04db837450 net: move native stack implementation classes to new header file
This will allow us to instantiate them for tcp in tcp.cc, reducing
compile times.
2014-10-24 22:18:54 +03:00
Asias He
4717d0bc48 net: Rename stack -> native-stack 2014-10-24 09:14:16 +08:00