This is a little tricky, since we only know we want hugetlbfs after memory
has been initialized, so we start up in anonymous memory, and later
switch to hugetlbfs by copying it to hugetlb-backed memory and mremap()ing
it back into place.
Since we switched temporary_buffer to malloc(), it now longer throws
an exception after running out of memory, which leads to a segfault
when referencing a null buffer.
While make_free_deleter(nullptr) will function correctly,
deleter::operator bool() on the result will not.
Fix by checking for null, and avoiding the free deleter optimization in
that case -- it doesn't help anyway.
This patch adds new class distributed_device which is responsible for
initializing HW device and it is shared between all cpus. Old device
class responsibility becomes managing rx/tx queue pair and it is local
per cpu. Each cpu have to call distributed_device::init_local_queue() to
create its own device. The logic to distribute cpus between available
queues (in case there is no enough queues for each cpu) is in the
distributed_device currently and not really implemented yet, so only one
queue or queues == cpus scenarios are supported currently, but this can
be fixed later.
The plan is to rename "distributed_device" to "device" and "device"
to "queue_pair" in later patches.
If start promise on initial cpu is signaled before other cpus have
networking stack constructed collected initialization crashes since it
tries to create a UDP socket on all available cpus when initial one is
ready.
A future that does not carry any data (future<>) and its sibling (promise<>)
are heavily used in the code. We can optimize them by overlaying the
future's payload, which in this case can only be an std::exception_ptr,
with the future state, as a pointer and an enum have disjoint values.
This of course depends on std::exception_ptr being implemented as a pointer,
but as it happens, it is.
With this, sizeof(future<>) is reduced from 24 bytes to 16 bytes.
Releases owenrship of the data and gives it away as
temporary_buffer. This way we can avoid allocation when putting rvalue
sstring if it's already using external storage. Except we need to
allocate a deleter which uses delete[], but this can be fixed later.
This function belongs to a group of functions for associating futures
with time points. Currently there's only now(), which servers as a shorthand
for make_ready_future<>().
Move idle state management out from smp poller back to generic code. Each
poller returns if it did any useful work and generic code decided if it
should go idle based on that. If a poller requires constant polling it
should always return true.
With -fvisibility=hidden, all executable symbols are hidden from shared
objects, allowing more optimizations (especially with -flto). However, hiding
the allocator symbols mean that memory allocated in the executable cannot
be freed in a library, since they will use different allocators.
Fix by exposing these symbols with default visibility.
Fixes crash loading some dpdk libraries.
Current code assumes that memory is at node level, but on non numa
machines there is no node level at all. Instead of assuming memory
location in a topology search for it dynamically.