cpu_pages::initialize() establishes the one-past-the-end page as a sentinel
to avoid boundary-condition checks. cpu_pages::do_resize(), however, treats
the last page as the sentinel. Because of this discrepancy, do_resize()
considers the last page free, and that page promptly becomes a
use-after-free.
Fix by aligning do_resize() with initialize().
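The sentinel convention described above can be illustrated with a simplified
model. This is only a sketch, not the actual cpu_pages code: page_table, page,
resize() and next_is_free() are hypothetical names, and a plain vector stands
in for the real page array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified per-page descriptor.
struct page {
    bool free;
};

// Hypothetical page table: nr_pages real pages plus one sentinel
// stored at index nr_pages (one past the end).
struct page_table {
    std::vector<page> pages;
    std::size_t nr_pages = 0;

    void initialize(std::size_t n) {
        nr_pages = n;
        pages.assign(n + 1, page{true});  // all real pages start out free
        pages[n].free = false;            // the sentinel is never free
    }

    // Grow to new_n pages using the same convention as initialize():
    // the old sentinel slot becomes a real page and the sentinel moves
    // to index new_n.
    void resize(std::size_t new_n) {
        assert(new_n >= nr_pages);
        pages.resize(new_n + 1);
        for (std::size_t i = nr_pages; i < new_n; ++i) {
            pages[i].free = true;
        }
        pages[new_n].free = false;
        nr_pages = new_n;
    }

    // No bounds check is needed here: for idx == nr_pages - 1 this
    // reads the sentinel, which always reports "not free".
    bool next_is_free(std::size_t idx) const {
        assert(idx < nr_pages);
        return pages[idx + 1].free;
    }
};
```

If the two functions disagreed about which index holds the sentinel, one of
them would treat a real page as a marker (or the reverse), which is the kind
of mismatch this fix removes.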
Use ioctl and fallocate to discard (TRIM) ranges of block devices and regular
files, respectively.
This feature is useful for SSDs, where it can reduce write amplification
and increase drive life. Unlike an HDD, an SSD has to be explicitly notified
of blocks that are no longer used.
The caller passes the range to the function through the offset and length
parameters.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Intended to allow different file classes to implement their own
file operations; currently supporting regular and block device
files. This support would also be required for a native file
system, for example.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
It improves memaslap score of UDP memcache by 30%.
Note that for recvmsg we always speculate, because in the case of UDP we
don't know the exact size of the datagram and thus always specify
a larger size in msghdr.
For mmaps over a file descriptor, it makes sense to have the operation as a
method of file_desc.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
NOTE: This patch makes pollable_fd::get_file_desc public in order to
access it in posix_server_socket_impl::accept(). Maybe there is a better
solution, but for now this avoids a lot of juggling with includes.
The result is not used for anything and I am not sure what it could be
used for, as the result carries little (write) to none (flush)
information. So I went ahead and simplified it to be future<> so that
it is easier to return it in places which expect future<>.
Allow the memory manager to call us back requesting a reclaim. Push
the task to the front of the queue, so we don't run out of memory waiting
for it to fire.
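A small sketch of the queueing idea, assuming a deque-backed task queue.
task_queue, add_task(), add_urgent_task() and run_one() are hypothetical
names, not the engine's actual interface.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded task queue.
class task_queue {
    std::deque<std::function<void()>> _tasks;
public:
    // Normal tasks are appended and run in FIFO order.
    void add_task(std::function<void()> t) {
        assert(t != nullptr);
        _tasks.push_back(std::move(t));
    }
    // Urgent tasks (such as a reclaim request) jump the queue so they
    // run before any work that was already pending.
    void add_urgent_task(std::function<void()> t) {
        assert(t != nullptr);
        _tasks.push_front(std::move(t));
    }
    // Runs the task at the front, if any. Returns false when empty.
    bool run_one() {
        if (_tasks.empty()) {
            return false;
        }
        auto t = std::move(_tasks.front());
        _tasks.pop_front();
        t();
        return true;
    }
};
```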
Allow memory users to declare methods of reclaiming memory (reclaimers),
and allow the main loop to declare a safe point for calling these reclaimers.
The memory manager will then schedule calls to reclaimers when memory runs
low.
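The split between requesting a reclaim and running it at a safe point could
look roughly like this. It is a sketch under assumptions: reclaimer_registry
and its member names are hypothetical and do not reflect the real memory
manager interface.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical registry of reclaim callbacks.
class reclaimer_registry {
    std::vector<std::function<void()>> _reclaimers;
    bool _reclaim_requested = false;
public:
    // Memory users declare how their memory can be reclaimed.
    void register_reclaimer(std::function<void()> r) {
        assert(r != nullptr);
        _reclaimers.push_back(std::move(r));
    }
    // Called when memory runs low. Only records the request, because
    // the caller may be in the middle of an allocation.
    void request_reclaim() {
        _reclaim_requested = true;
    }
    // Called by the main loop at a point where running arbitrary
    // reclaim code is safe. Returns the number of reclaimers run.
    std::size_t run_at_safe_point() {
        if (!_reclaim_requested) {
            return 0;
        }
        _reclaim_requested = false;
        for (auto& r : _reclaimers) {
            r();
        }
        return _reclaimers.size();
    }
};
```

Deferring the callbacks keeps reclaimers from running inside the allocator
itself, where they could re-enter it.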
The idea is that only one thread opens the listening socket and runs accept().
Other threads emulate listen()/accept() by waiting for a connected
socket from the main thread. The main thread distributes connected sockets
in round-robin order. The patch introduces new specializations
of server_socket_impl and network_stack: posix_ap_server_socket_impl
and posix_ap_network_stack, respectively. _ap_ stands for auxiliary processor.
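The round-robin distribution can be modelled in a few lines. This is a
single-threaded sketch only: round_robin_distributor is a hypothetical name,
plain vectors stand in for the cross-thread hand-off, and integers stand in
for connected sockets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the main thread handing accepted connections
// to worker threads in round-robin order.
class round_robin_distributor {
    std::vector<std::vector<int>> _queues;  // per-thread connected fds
    std::size_t _next = 0;                  // thread that gets the next fd
public:
    explicit round_robin_distributor(std::size_t nr_threads)
        : _queues(nr_threads) {
        assert(nr_threads > 0);
    }
    // Hands connected_fd to the next thread and returns that thread's index.
    std::size_t distribute(int connected_fd) {
        std::size_t target = _next;
        _queues[target].push_back(connected_fd);
        _next = (_next + 1) % _queues.size();
        return target;
    }
    const std::vector<int>& queue_of(std::size_t thread) const {
        return _queues.at(thread);
    }
};
```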
Sometimes we need to know whether we are running on the main thread during
engine initialization, before engine._id is available. This function will be
used for that.
Run stat through _thread_pool, as it might block
waiting for metadata to be read from the underlying device.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Add a compile-time option, DEFAULT_ALLOCATOR, to use the existing
memory allocator (malloc() and friends) instead of redefining it.
This option is a workaround needed to run Seastar on OSv.
Without this workaround, what seems to happen is that some code compiled
into the kernel (notably, libboost_program_options.a) uses the standard
malloc(), while inline code compiled into Seastar uses the seastar free()
to try and free that memory, resulting in a spectacular crash.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
With N3778, the compiler can provide us with the size of the object,
so we can avoid looking it up in the page array. Unfortunately, this is
only implemented in clang at the moment.
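To illustrate what receiving the size at deallocation time looks like, here
is a sketch that uses a class-specific sized operator delete as a stand-in.
N3778 is about the global operator delete(void*, std::size_t); the class
tracked_object and the helper free_one_and_report_size() are hypothetical and
exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Records the size passed to the most recent sized deallocation.
static std::size_t last_freed_size = 0;

// Hypothetical type with class-specific allocation functions.
struct tracked_object {
    char payload[48];

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) {
            throw std::bad_alloc();
        }
        return p;
    }
    // Sized deallocation: the compiler passes the object size, so the
    // allocator does not need to look it up from the pointer.
    static void operator delete(void* p, std::size_t size) noexcept {
        assert(size == sizeof(tracked_object));
        last_freed_size = size;
        std::free(p);
    }
};

// Allocates and frees one object, returning the size seen by delete.
std::size_t free_one_and_report_size() {
    delete new tracked_object;
    return last_freed_size;
}
```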
Instead of rounding up to a power-of-two, have four equally spaced
regions between powers of two. For example:
1024
1280 (+256)
1536 (+256)
1792 (+256)
2048 (+256)
2560 (+512)
3072 (+512)
3584 (+512)
4096 (+512)
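The rounding rule in the list above can be sketched as follows. The function
name round_up_to_size_class() is hypothetical, and the treatment of sizes
below 4 (each size is its own class) is an assumption made so that the step
never becomes zero.

```cpp
#include <cassert>
#include <cstddef>

// Rounds n up to the next size class, where each power-of-two interval
// [p, 2p) is divided into four equally spaced steps of p / 4.
std::size_t round_up_to_size_class(std::size_t n) {
    assert(n > 0);
    // Find the largest power of two that is <= n.
    std::size_t pow2 = 1;
    while (pow2 <= n / 2) {
        pow2 *= 2;
    }
    if (pow2 < 4) {
        return n;  // assumption: sizes 1..3 are their own classes
    }
    std::size_t step = pow2 / 4;
    // Number of whole steps above pow2 needed to cover n.
    std::size_t steps = (n - pow2 + step - 1) / step;
    return pow2 + steps * step;
}
```

With this scheme the internal waste for a request just above a class boundary
is bounded by one step (a quarter of the lower power of two) rather than by
nearly the whole request, as with pure power-of-two rounding.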
Allocate small objects within spans, minimizing waste.
Each object size class has its own pool and its own freelist. On overflow,
free objects are pushed into the spans; if a span is completely free, it is
returned to the main free list.
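The freelist overflow behaviour can be modelled without touching real memory.
The following is a bookkeeping sketch only: small_pool, span_state,
object_ref and the max_free limit are hypothetical, and objects are
represented by the index of the span they belong to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping for one span of a given size class.
struct span_state {
    std::size_t free_in_span;  // objects currently parked in the span
    bool live;                 // false once returned to the main free list
};

// Hypothetical handle: an object is identified by its span only.
struct object_ref {
    std::size_t span_index;
};

// Hypothetical pool for a single object size class.
class small_pool {
    std::vector<span_state> _spans;
    std::vector<object_ref> _freelist;
    std::size_t _objects_per_span;
    std::size_t _max_free;
    std::size_t _spans_returned = 0;

    void refill() {
        // Prefer objects parked in a partially free span.
        for (std::size_t i = 0; i < _spans.size(); ++i) {
            span_state& s = _spans[i];
            if (s.live && s.free_in_span > 0) {
                for (; s.free_in_span > 0; --s.free_in_span) {
                    _freelist.push_back(object_ref{i});
                }
                return;
            }
        }
        // Otherwise carve a new span into objects.
        _spans.push_back(span_state{0, true});
        for (std::size_t k = 0; k < _objects_per_span; ++k) {
            _freelist.push_back(object_ref{_spans.size() - 1});
        }
    }
public:
    small_pool(std::size_t objects_per_span, std::size_t max_free)
        : _objects_per_span(objects_per_span), _max_free(max_free) {
        assert(objects_per_span > 0);
    }
    object_ref allocate() {
        if (_freelist.empty()) {
            refill();
        }
        object_ref o = _freelist.back();
        _freelist.pop_back();
        return o;
    }
    void deallocate(object_ref o) {
        _freelist.push_back(o);
        // On overflow, push free objects back into their spans.
        while (_freelist.size() > _max_free) {
            object_ref victim = _freelist.back();
            _freelist.pop_back();
            span_state& s = _spans[victim.span_index];
            if (++s.free_in_span == _objects_per_span) {
                // Completely free span: return it to the main free list.
                s.live = false;
                s.free_in_span = 0;
                ++_spans_returned;
            }
        }
    }
    std::size_t spans_returned() const { return _spans_returned; }
};
```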