Directory listing support, using subscription<sstring> to represent the
stream of file names produced by the directory lister running in parallel
with the directory consumer.
open_directory() is similar to open_file_dma() with just the O_ flags adjusted.
list_directory() returns a subscription(), so that both the producer and
the consumer can be asynchronous.
Unfortunately at_exit() cannot be used to delete objects, since when
it runs the reactor is still active and a deleted object may still be in use.
We need another API that runs its tasks after the reactor has already stopped;
at_destroy() will be that API.
- Move the smp::dpdk_eal_init() code into dpdk::eal::init(), where it belongs.
- Remove the unused "opts" parameter of the dpdk::dpdk_device constructor - all its usage
has been moved to dpdk::eal::init().
- Cleanup in reactor.cc: #if HAVE_DPDK -> #ifdef HAVE_DPDK, since we pass a -DHAVE_DPDK
option to the compiler.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
DPDK initialization creates its own threads and assumes that the application
uses them; otherwise things do not work correctly (rte_lcore_id()
returns an incorrect value, for instance). This patch uses DPDK threads to
run the seastar main loop, making DPDK APIs work as expected.
register_poller() (and unregister_poller()) adjust _pollers, but they may be
called while it is being iterated, and since std::vector<> mutations invalidate
iterators, corruption occurs.
Fix by deferring manipulation of _pollers to a task, which executes at
a time when _pollers is not being touched.
Currently, reactor::_pollers holds reactor::poller pointers; since these
are movable types, it's hard to maintain _pollers, as the pointers can keep
changing.
Refactor poller so that _pollers points at an internal type, which does not
move when a reactor::poller moves. This requires getting rid of
std::function, since it lacks a comparison operator.
We look at _poll mode in another cpu's cache accidentally, as part of
the peer->idle() call.
Fix by looking at our own _poll variable first; they should all be the same.
wait_and_process() expects an std::function<>, but we pass it a lambda,
forcing it to allocate.
Prepare the std::function<> in advance, so it can be passed by reference.
We're currently using boost::lockfree::consume_all() to consume
smp requests, but this has two problems:
1. consume_all() calls consume_one() internally, which means it accesses
the ring index once per message
2. we interleave calling the request function with accessing the ring, which
allows the other side to access the ring again, bouncing ring cache lines.
Fix by copying all available items in one go, using pop(array), and then
processing them afterwards.
We're currently using boost::lockfree::consume_all() to consume
smp completions, but this has two problems:
1. consume_all() calls consume_one() internally, which means it accesses
the ring index once per message
2. we interleave calling the request function with accessing the ring, which
allows the other side to access the ring again, bouncing ring cache lines.
Fix by copying all available items in one go, using pop(array), and then
processing them afterwards.
Instead of incurring the overhead of pushing a message down the queue (two
cache line misses), amortize it over 16 messages (3/4 cache line misses per
batch).
Batch size is limited by poll frequency, so we should adjust that
dynamically.
If it needs to be resized, it will cause a deallocation on the wrong cpu,
so initialize it on the sending cpu.
Does not break with circular_buffer<>, but it's not going to be a
circular_buffer<> for long.
This patch adds a new class, distributed_device, which is responsible for
initializing the HW device and is shared between all cpus. The old device
class's responsibility becomes managing an rx/tx queue pair, and it is local
per cpu. Each cpu has to call distributed_device::init_local_queue() to
create its own device. The logic that distributes cpus between the available
queues (in case there are not enough queues for all cpus) lives in
distributed_device and is not really implemented yet, so only the one-queue
and queues == cpus scenarios are currently supported, but this can
be fixed later.
The plan is to rename "distributed_device" to "device" and "device"
to "queue_pair" in later patches.
If the start promise on the initial cpu is signaled before the other cpus
have their networking stacks constructed, collectd initialization crashes,
since it tries to create a UDP socket on all available cpus as soon as the
initial one is ready.
Move idle state management out of the smp poller back into generic code. Each
poller returns whether it did any useful work, and the generic code decides
whether to go idle based on that. If a poller requires constant polling, it
should always return true.
Currently each cpu creates its network device as part of native networking
stack creation, and all cpus create the native networking stack independently,
which makes it impossible to use data initialized by one cpu in another
cpu's networking device initialization. For multiqueue devices, some
parts of the initialization often have to be handled by one cpu while all
other cpus wait for it before creating their network devices.
Even without multiqueue, proxy devices should be created after the master
device, so that a proxy device may get a pointer to the master
at creation time (the existing code uses a global per-cpu device pointer and
assumes that the master device is created on cpu 0 to compensate for the lack
of ordering).
This patch makes it possible to delay native networking stack creation
until the network device is created. It allows one cpu to be responsible
for the creation of network devices on multiple cpus. A single-queue device
initializes the master device on one cpu and calls the other cpus with a
pointer to the master device and its cpu id, which are used in proxy device
creation. This removes the need for the per-cpu device pointer and the
"master on cpu 0" assumption, since now the master device and slave devices
know about each other and can communicate directly.
Each "poller" registers a non-blocking callback which is then called in
every iteration of a reactor's main loop.
Each "poller"'s callback returns a boolean: if TRUE, the main loop is allowed to block
(e.g. in epoll()).
If any registered "poller" returns FALSE, the reactor's main loop is forbidden to block
in the current iteration.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Network device has to be available when network stack is created, but
sometimes network device creation should wait for device initialization
by another cpu. This patch makes it possible to delay network stack
creation until network device is available.
Configure all smp queues before calling engine.configure() so that
engine.configure() may use the submit_to() API. Note that messages will
still be processed only after engine.run() is executed.