Add a compile-time option, DEFAULT_ALLOCATOR, to use the existing
memory allocator (malloc() and friends) instead of redefining it.
This option is a workaround needed to run Seastar on OSv.
Without this workaround, what seems to happen is that some code compiled
into the kernel (notably, libboost_program_options.a) uses the standard
malloc(), while inline code compiled into Seastar uses the seastar free()
to try and free that memory, resulting in a spectacular crash.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
This is the first step in a journey of many miles, to be able to run
Seastar applications (such as httpd or memcached) on top of OSv.
This patch adds a configure.py option "-so" to build the applications as
shared-objects instead of executables. It also adds the option "-pie" to
build position-independent executables instead.
These are needed to run a Seastar application on OSv, as OSv cannot run
a normal position-dependent executable. Note that currently, PIE won't
actually work (because of OSv bug #352 - unfortunately Seastar uses
thread_local in one place) - so only "-so" is useful to build an application
to run in OSv.
The resulting shared-object doesn't yet run on OSv (many bugs ahead, as
well as missing features), but it's a start.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
With N3778 (sized deallocation), the compiler can provide us with the
size of the object being freed, so we can avoid looking it up in the page
array. Unfortunately, this is only implemented in clang at the moment.
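To illustrate what a sized deallocation function receives, here is a minimal
sketch. N3778 concerns the global operator delete(void*, std::size_t); the
sketch uses the class-specific form instead, which standard C++ has long
supported and which does not depend on compiler flags. The type `tracked` and
the counter `last_freed_size` are hypothetical names for illustration, not
Seastar code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type used only for illustration.
struct tracked {
    static std::size_t last_freed_size;
    char payload[48];

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) {
            throw std::bad_alloc();
        }
        return p;
    }

    // Sized form: the compiler supplies the object size, so the allocator
    // does not have to derive it from the pointer.
    static void operator delete(void* p, std::size_t size) {
        last_freed_size = size;
        std::free(p);
    }
};

std::size_t tracked::last_freed_size = 0;

// Deletes one object and reports the size the deallocation function saw.
inline std::size_t size_seen_by_delete() {
    delete new tracked;
    assert(tracked::last_freed_size == sizeof(tracked));
    return tracked::last_freed_size;
}
```

An allocator that receives the size this way can go straight to the right
size class on free, which is the lookup the commit message says is avoided.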
This patchset adds a memory allocator, inspired by tcmalloc,
that partitions memory into per-cpu pools and only allows allocation
from local memory. As a result it is very efficient.
Instead of rounding up to a power-of-two, have four equally spaced
regions between powers of two. For example:
1024
1280 (+256)
1536 (+256)
1792 (+256)
2048 (+256)
2560 (+512)
3072 (+512)
3584 (+512)
4096 (+512)
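The rounding rule above can be sketched as follows. This is an illustrative
reconstruction from the table, not the allocator's actual code; the function
names `floor_pow2` and `round_up_to_size_class` are hypothetical, and sizes
below 4 are returned unchanged because they cannot be split into quarters.

```cpp
#include <cassert>
#include <cstddef>

// Largest power of two that is <= n. Requires n > 0.
inline std::size_t floor_pow2(std::size_t n) {
    assert(n > 0);
    std::size_t p = 1;
    while (p <= n / 2) {
        p *= 2;
    }
    return p;
}

// Rounds n up to the next size class, with four equally spaced classes
// between consecutive powers of two.
inline std::size_t round_up_to_size_class(std::size_t n) {
    if (n < 4) {
        return n; // assumption: tiny sizes are left as they are
    }
    std::size_t step = floor_pow2(n) / 4;
    return (n + step - 1) / step * step;
}
```

For example, a request for 1025 bytes lands in the 1280 class, where rounding
up to a power of two would have used 2048.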
Allocate small objects within spans, minimizing waste.
Each object size class has its own pool and its own freelist. On overflow,
free objects are pushed into the spans; if a span is completely free, it is
returned to the main free list.
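The overflow behaviour can be modelled with a toy sketch. Everything here is
hypothetical: `toy_span`, `toy_pool`, the `max_cached` threshold, and the
choice to drain the whole freelist on overflow are assumptions made for
illustration, and objects are tracked only by their owning span rather than
by address.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct toy_span {
    std::size_t capacity;   // objects carved out of this span
    std::size_t free_count; // objects currently held free inside the span
    bool fully_free() const { return free_count == capacity; }
};

struct toy_pool {
    std::size_t max_cached;                // assumed overflow threshold
    std::vector<toy_span*> freelist;       // one entry per cached free object
    std::vector<toy_span*> main_free_list; // stand-in for the main free list

    void free_object(toy_span* owner) {
        freelist.push_back(owner);
        if (freelist.size() > max_cached) {
            drain();
        }
    }

    // Pushes cached objects into their spans; a span whose objects are all
    // free is handed to the main free list.
    void drain() {
        while (!freelist.empty()) {
            toy_span* owner = freelist.back();
            freelist.pop_back();
            ++owner->free_count;
            assert(owner->free_count <= owner->capacity);
            if (owner->fully_free()) {
                main_free_list.push_back(owner);
            }
        }
    }
};

// Frees both objects of span a and one of span b; only a becomes fully free.
inline bool demo_span_return() {
    toy_span a{2, 0};
    toy_span b{2, 0};
    toy_pool pool{2, {}, {}};
    pool.free_object(&a);
    pool.free_object(&a);
    pool.free_object(&b); // third free overflows the freelist
    return pool.main_free_list.size() == 1 && pool.main_free_list[0] == &a
        && b.free_count == 1 && pool.freelist.empty();
}
```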
Add to the README the instruction to run "./configure.py" once.
If you don't, ninja-build will not work because the build.ninja
file will be missing.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Currently the send path buffers packets both in the (unbounded) _tx_queue
and in the virtio ring. One queue would suffice, though.
This change also propagates the back pressure resulting from the queue-full
condition up the send path. This is needed because otherwise, if
senders are faster than the network we will eventually run out of
memory. This would also cause a "buffer bloat" effect, which hurts
latency-sensitive workloads.
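The idea of a single bounded queue that pushes back on the sender can be
sketched as below. This is not the patch's code: `bounded_tx_ring` and
`accepted_before_push_back` are hypothetical names, and a boolean return
value stands in for whatever asynchronous mechanism the real send path uses
to make the sender wait.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical stand-in for a single bounded transmit ring.
class bounded_tx_ring {
    std::deque<std::string> _packets;
    std::size_t _capacity;
public:
    explicit bounded_tx_ring(std::size_t capacity) : _capacity(capacity) {}

    // Returns false when the ring is full; the caller has to hold on to the
    // packet and retry later instead of queueing it somewhere else.
    bool try_send(std::string packet) {
        if (_packets.size() >= _capacity) {
            return false;
        }
        _packets.push_back(std::move(packet));
        return true;
    }

    // Called when the device has consumed one packet, freeing a slot.
    void complete_one() {
        assert(!_packets.empty());
        _packets.pop_front();
    }

    std::size_t in_flight() const { return _packets.size(); }
};

// A sender that is faster than the network: counts how many of `attempts`
// packets the ring accepts before it pushes back.
inline std::size_t accepted_before_push_back(std::size_t capacity,
                                             std::size_t attempts) {
    bounded_tx_ring ring(capacity);
    std::size_t accepted = 0;
    for (std::size_t i = 0; i < attempts; ++i) {
        if (!ring.try_send("packet")) {
            break;
        }
        ++accepted;
    }
    return accepted;
}
```

Because memory use is capped by the ring capacity, a fast sender stalls
rather than growing an unbounded backlog.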
Currently completion processing starts during object creation, but since
all objects are created by the main thread, they all run on the same cpu,
which is incorrect. This patch starts completion processing on the correct cpu.
Since we have lots of queues, we need an efficient queue structure,
esp. for moveable types. libstdc++'s std::deque is quite hairy,
and boost's circular_buffer_space_optimized uses assignments instead of
constructors, which are both slower and less available than constructors.
This patch implements a growable circular buffer for these needs.
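A minimal sketch of such a buffer follows, assuming a power-of-two capacity
that doubles when full. It is not the patch's implementation:
`toy_circular_buffer` and `fifo_order_survives_growth` are hypothetical
names, exception safety is ignored, and over-aligned element types are not
handled. Elements are placed with constructors (placement new) and never
assigned, so move-only types work.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

template <typename T>
class toy_circular_buffer {
    T* _storage = nullptr;
    std::size_t _capacity = 0; // zero or a power of two
    std::size_t _begin = 0;    // running index of the front element
    std::size_t _end = 0;      // running index one past the back element

    std::size_t mask(std::size_t i) const { return i & (_capacity - 1); }

    // Doubles the capacity, move-constructing each element into the new
    // storage and destroying the old copy.
    void grow() {
        std::size_t new_capacity = _capacity ? _capacity * 2 : 4;
        T* fresh = static_cast<T*>(::operator new(new_capacity * sizeof(T)));
        std::size_t n = size();
        for (std::size_t i = 0; i < n; ++i) {
            T& old = _storage[mask(_begin + i)];
            new (&fresh[i]) T(std::move(old));
            old.~T();
        }
        ::operator delete(_storage);
        _storage = fresh;
        _capacity = new_capacity;
        _begin = 0;
        _end = n;
    }
public:
    toy_circular_buffer() = default;
    toy_circular_buffer(const toy_circular_buffer&) = delete;
    toy_circular_buffer& operator=(const toy_circular_buffer&) = delete;
    ~toy_circular_buffer() {
        while (!empty()) {
            pop_front();
        }
        ::operator delete(_storage);
    }

    std::size_t size() const { return _end - _begin; }
    bool empty() const { return _begin == _end; }

    T& front() {
        assert(!empty());
        return _storage[mask(_begin)];
    }
    void push_back(T value) {
        if (size() == _capacity) {
            grow();
        }
        new (&_storage[mask(_end)]) T(std::move(value));
        ++_end;
    }
    void pop_front() {
        assert(!empty());
        _storage[mask(_begin)].~T();
        ++_begin;
    }
};

// Interleaves pushes and pops of a move-only type so the buffer both wraps
// around and grows, then checks that FIFO order is preserved.
inline bool fifo_order_survives_growth() {
    toy_circular_buffer<std::unique_ptr<int>> q;
    int next_in = 0;
    int next_out = 0;
    for (int round = 0; round < 20; ++round) {
        q.push_back(std::unique_ptr<int>(new int(next_in++)));
        q.push_back(std::unique_ptr<int>(new int(next_in++)));
        if (*q.front() != next_out++) {
            return false;
        }
        q.pop_front();
    }
    while (!q.empty()) {
        if (*q.front() != next_out++) {
            return false;
        }
        q.pop_front();
    }
    return next_out == next_in;
}
```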
Here, transferring is defined as moving an object to a new location
(either via a move or copy constructor) and destroying the source. This
is useful when implementing containers.
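As a sketch of that definition: construct the object at the destination,
then destroy the source. The helper name `transfer_one` and the test type
`counted` are hypothetical, and using std::move_if_noexcept to choose between
the move and copy constructor is an assumption of this sketch, not
necessarily what the patch does.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Constructs a T in the raw storage at `to` from *from, then destroys the
// source. After the call only the destination object is alive.
template <typename T>
T* transfer_one(T* from, void* to) {
    assert(from != nullptr && to != nullptr);
    T* dst = new (to) T(std::move_if_noexcept(*from));
    from->~T();
    return dst;
}

// Hypothetical type that counts live instances.
struct counted {
    static int live;
    int value;
    explicit counted(int v) : value(v) { ++live; }
    counted(counted&& other) noexcept : value(other.value) { ++live; }
    ~counted() { --live; }
};

int counted::live = 0;

// Transfers one object between two raw buffers and checks that exactly one
// instance is alive afterwards.
inline bool demo_transfer() {
    alignas(counted) unsigned char a[sizeof(counted)];
    alignas(counted) unsigned char b[sizeof(counted)];
    counted* src = new (a) counted(42);
    counted* dst = transfer_one(src, b);
    bool ok = counted::live == 1 && dst->value == 42;
    dst->~counted();
    return ok && counted::live == 0;
}
```

A container that reallocates its storage can apply this per element, so the
old storage holds no live objects once the loop finishes.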