Send packets without copying fragment data:
- Poll all the Tx descriptors and place them into a circular_buffer.
We will take them from there when we need to send new packets.
- The PMD will return the completed buffer descriptors to the Tx mempool.
This is how we know that a buffer may be released.
- "move" the packet object into the private data of the last segment's descriptor.
Completion of this fragment means that the whole packet has been sent
and its memory may be released, which we do by calling the packet's
destructor.
Exceptions:
- Copy if hugepages backend is not enabled.
- Copy when we failed to send in a zero-copy flow (e.g. when we failed
to translate a buffer virtual address).
- Copy if the first frag requires fragmentation below the 128-byte level. This is
done in order to avoid splitting headers.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v5:
- NULL -> nullptr across the board.
- Removed unused macros: MBUF_ZC_PRIVATE() and max_frags_zc.
- Improved the local variables localization according to Nadav's remarks.
- tx_buf class:
- Don't regress the whole packet to the copy-send if a single fragment failed to be sent
in a zero-copy manner (e.g. its data failed the virt2phys translation). Send only such a
fragment in a copy way and try to send the rest of the fragments in a zero-copy way.
- Make set_packet() receive packet&&.
- Fixed the comments in check_frag0(): starting from v2 we check the first 128 bytes
and not the first 2KB.
- Use assert() instead of rte_exit() in do_one_frag().
- Rename in set_one_data_buf() and in copy_one_data_buf(): l -> buf_len
- Improve the assert about the size of private data in the tx_buf class:
- Added two MARKER fields at the beginning and at the end of the private fields section
which are going to be allocated on the mbuf's private data section.
- Assert on the distance between these two markers.
- Replace the sanity_check() (which checks that the packet doesn't have a zero length) in the
copy flow with an assert() in a general function, since this check
is relevant for both the copy and the zero-copy flows.
- Rename the sanity check to the more explicit frag0_check.
- Make from_packet() receive packet&&.
- In case frag0_check() fails - copy only the first fragment and
not the whole packet.
- tx_buf_factory class:
- Change the interface to work with tx_buf* instead of tx_buf&.
- Better utilize for-loop facilities in gc().
- Kill the extra if() in the init_factory().
- Use std::deque instead of circular_buffer for storing elements in tx_buf_factory.
- Optimize the tx_buf_factory::get():
- First take the completed buffers from the mempool and only if there
aren't any - take from the factory's cache.
- Make Tx mempools use a cache: this significantly improves the performance even though it's
not the right mempool configuration for a single-producer+single-consumer mode.
- Remove empty() and size() methods.
- Add comments near the assert()s in the fast-path.
- Removed the not-needed "inline" qualifiers:
- There is no need to specify "inline" qualifier for in-class defined
methods INCLUDING static methods.
- Defining process_packets() and poll_rx_once() as inline degraded the
performance by about 1.5%.
- Added a _tx_gc_poller: it will call tx_buf_factory::gc().
- Don't check a pointer before calling free().
- alloc_mempool_xmem(): Use posix_memalign() instead of memalign().
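The last two items can be illustrated with a minimal sketch. alloc_aligned()
below is a hypothetical helper, not the actual alloc_mempool_xmem(); it only
shows the posix_memalign() calling convention (the result is returned through
an out parameter and the return value is an error code) and relies on the fact
that free() accepts a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>   // free
#include <stdlib.h>  // posix_memalign (POSIX)

// Hypothetical helper for illustration only.
// Returns a block of `size` bytes aligned to `alignment`, or nullptr on failure.
void* alloc_aligned(std::size_t alignment, std::size_t size) {
    void* p = nullptr;
    // alignment must be a power of two and a multiple of sizeof(void*).
    if (posix_memalign(&p, alignment, size) != 0) {
        return nullptr;
    }
    return p;
}

// free() is a no-op for a null pointer, so no check is needed before it.
void release(void* p) {
    free(p);
}
```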
New in v4:
- Improve the info messages.
- Simplified the mempool name creation code.
- configure.py: Opt-out the invalid-offsetof compilation warning.
New in v3:
- Add missing macros definitions dropped in v2 by mistake.
New in v2:
- Use Tx mbufs in a LIFO way for better cache utilization.
- Lower the frag0 non-split thresh to 128 bytes.
- Use new (iterators) semantics in circular_buffer.
- Use optional<packet> for storing the packet in the mbuf.
- Use rte_pktmbuf_alloc() instead of __rte_mbuf_raw_alloc().
- Introduce tx_buf class:
- Hide the private rte_mbuf area handling.
- Hide packet to rte_mbuf cluster translation handling.
- Introduce a "Tx buffers factory" class:
- Hide the rte_mbuf flow details:
mempool->circular_buffer->(PMD->)mempool
- Templatization:
- Make huge_pages_mem_backend a dpdk_qp class template parameter.
- Unite the from_packet_xxx() code into a single template function.
- Unite the translate_one_frag() and copy_one_frag() into a single
template function.
When we use hugetlbfs we will give mempools an external buffer for allocations,
but the mempool internals still need memory.
We will assume that each CPU core is going to have a HW QP ("worst" case) and
provide the DPDK with enough memory to be able to allocate them all.
The memory above is subtracted from the total amount of memory given to the application
(with -m seastar application parameter).
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
The size of std::vector<cpu> can be determined in advance, so reserve the memory ahead
of time so that the push_back() calls do not have to reallocate.
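A minimal sketch of the idea, assuming a simplified cpu struct and a
hypothetical make_cpus() helper (neither is the actual code):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real per-CPU descriptor type.
struct cpu {
    unsigned id;
};

// Build the list of CPUs; the count is known before the loop starts.
std::vector<cpu> make_cpus(std::size_t count) {
    std::vector<cpu> cpus;
    cpus.reserve(count);  // single allocation up front
    for (std::size_t i = 0; i < count; ++i) {
        // Capacity is already sufficient, so this never reallocates.
        cpus.push_back(cpu{static_cast<unsigned>(i)});
    }
    return cpus;
}
```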
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
If the difference between the sizes of the two strings is larger than can
be represented by an int, truncation will occur and the sign of the result
is undefined.
Fix by using explicit tests and return values.
Add string comparison functions to basic_sstring that are required for
C++ containers such as std::map and std::multimap.
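A minimal sketch of comparing with explicit tests, assuming a hypothetical
free function compare_bytes() rather than the actual basic_sstring member:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper for illustration only.
// Returns a negative value, zero, or a positive value.
int compare_bytes(const char* a, std::size_t a_len,
                  const char* b, std::size_t b_len) {
    // Compare the common prefix first.
    int r = std::memcmp(a, b, std::min(a_len, b_len));
    if (r != 0) {
        return r;
    }
    // Do not return int(a_len - b_len): the difference may not fit in an int.
    if (a_len < b_len) {
        return -1;
    }
    if (a_len > b_len) {
        return 1;
    }
    return 0;
}
```

An operator<() suitable for std::map could then be written as
compare_bytes(...) < 0.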
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
This patch adds a header file, "core/enum.hh". Code which includes
this header file will be able to use an enumerated type as the key in a
hash table.
The header file implements a hash function for *all* enumerated types,
by using the standard hash function of the underlying integer type.
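A minimal sketch of the technique. The enum_hash functor, the color enum and
make_color_names() are hypothetical names used for illustration; the actual
contents of core/enum.hh may differ:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <unordered_map>

// Hash any enumerated type by hashing its underlying integer value.
template <typename T>
struct enum_hash {
    static_assert(std::is_enum<T>::value, "enum_hash requires an enumerated type");
    std::size_t operator()(T e) const noexcept {
        using underlying = typename std::underlying_type<T>::type;
        return std::hash<underlying>()(static_cast<underlying>(e));
    }
};

// Example enum used as a hash table key.
enum class color { red, green, blue };

std::unordered_map<color, std::string, enum_hash<color>> make_color_names() {
    return {{color::red, "red"}, {color::green, "green"}, {color::blue, "blue"}};
}
```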
The way periodic timers are rearmed during timer completion causes
timer_settime() to be called twice for each periodic timer completion:
once during the rearm and a second time by enable_fn(). Fix it by providing
another function that only re-adds the timer into the timers container but does
not call timer_settime().
So that the callback which is set on it and which is allocated on CPU
0 is destroyed on CPU 0 when the clock dies. Otherwise we can attempt
to delete it after the CPU thread is gone if CPU 0 != main thread.
When smp::configure() is called from non-main thread, then the global
state which it allocates will be destroyed after reactor is destroyed,
because it will be destroyed from the main thread and the reactor will
be destroyed together with the thread which called
smp::configure(). This will result in SIGSEGV when allocator tries to
free _threads vector across CPU threads because the target CPU was
already freed. See issue #10.
To fix this, I introduced the smp::cleanup() method, which should clean up
all global state and should be called in the same thread in which
smp::configure() was called.
I need to call smp::configure() from non-main thread for integration
with boost unit testing framework.
Instead of scheduling timer processing to happen in the future,
process timers in the context of the poller (signal poller for high
resolution timer, lowres time poller for low resolution timers).
This both reduces timer latency (not very important) and removes a
use of promise/future in a repetitive task, which is error prone.
Incoming item processing usually takes more work than completion
item processing. Prefetch more completion items to make sure they are
ready before access.
Turn a condition into an assert() since if a mapping is invalid this may
only mean that we have a bug.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
receive_signal() uses the unordered map _signal_handlers (signo mapped to
signal_handler) to either register a signal or find an existing one, and
from there get a future from the promise associated with that signal.
The problem is _signal_handlers.emplace() being called unconditionally,
resulting in the constructor from signal_handler always being called to
needlessly re-register the same handler, even when the signo is already
inserted in the map.
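One possible way to avoid the redundant construction is to look the signo up
first and construct the handler only when it is missing. The sketch below
assumes a simplified signal_handler whose constructor merely counts
registrations, and a hypothetical get_or_register() helper; it is not the
actual reactor code.

```cpp
#include <cassert>
#include <tuple>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the reactor's signal_handler. In the real code
// the constructor is where the handler would be registered with the OS.
struct signal_handler {
    static int registrations;
    explicit signal_handler(int signo) : signo(signo) { ++registrations; }
    int signo;
};
int signal_handler::registrations = 0;

std::unordered_map<int, signal_handler> signal_handlers;

// Return the handler for signo, constructing it only on first use.
signal_handler& get_or_register(int signo) {
    auto it = signal_handlers.find(signo);
    if (it == signal_handlers.end()) {
        it = signal_handlers.emplace(std::piecewise_construct,
                                     std::forward_as_tuple(signo),
                                     std::forward_as_tuple(signo)).first;
    }
    return it->second;
}
```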
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Tell all waiters that something bad happened and they should all go away.
Can be used only once; waiters should clean up and there must not be any
new waiters.
When size > align, we simply call the small allocator with the provided size,
but that does not guarantee any alignment above the default.
Round the allocation size to the nearest power of two in that case, to
guarantee alignment.
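A minimal sketch of the rounding step, assuming the size is rounded up (never
down) and using a hypothetical round_up_pow2() helper rather than the
allocator's own code:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: smallest power of two that is >= n.
// Assumes 1 <= n <= SIZE_MAX / 2 + 1, so the shift below cannot overflow.
std::size_t round_up_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}
```

For example, a request of 24 bytes would be rounded up to 32 before being
passed to the small allocator.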