362 Commits

Author SHA1 Message Date
Avi Kivity
fa5c61d4e4 temporary_buffer: fix wrong oom check
malloc(0) is allowed to return nullptr, so don't throw an exception in
that case.
2014-12-10 10:33:29 +02:00
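The check described in fa5c61d4e4 (together with the missing exception restored in 441331f158) can be sketched as follows. This is a minimal illustration, not the actual temporary_buffer code; the helper name `allocate_or_throw` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical helper (not Seastar's API): allocate `size` bytes with
// malloc() and report out-of-memory only when a non-empty request fails.
inline char* allocate_or_throw(std::size_t size) {
    char* p = static_cast<char*>(std::malloc(size));
    // malloc(0) may legitimately return nullptr, so a null result counts
    // as an error only when some memory was actually requested.
    if (size != 0 && p == nullptr) {
        throw std::bad_alloc();
    }
    return p;
}
```

The caller still releases the result with free(), which accepts a null pointer.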
Avi Kivity
441331f158 temporary_buffer: fix missing exception
Since we switched temporary_buffer to malloc(), it no longer throws
an exception after running out of memory, which leads to a segfault
when referencing a null buffer.
2014-12-10 09:53:37 +02:00
Avi Kivity
9ae2075d54 deleter: remove bad/unused interfaces 2014-12-09 20:37:44 +02:00
Avi Kivity
7708627144 deleter: improve make_free_deleter() with null input
While make_free_deleter(nullptr) will function correctly,
deleter::operator bool() on the result will not.

Fix by checking for null, and avoiding the free deleter optimization in
that case -- it doesn't help anyway.
2014-12-09 20:37:16 +02:00
Avi Kivity
15dd8ed1bb deleter: mark as final class
Prevent accidental inheritance.
2014-12-09 20:24:35 +02:00
Gleb Natapov
73f6d943e1 net: separate device initialization from queues initialization
This patch adds new class distributed_device which is responsible for
initializing HW device and it is shared between all cpus. Old device
class responsibility becomes managing rx/tx queue pair and it is local
per cpu. Each cpu has to call distributed_device::init_local_queue() to
create its own device. The logic to distribute cpus between available
queues (in case there are not enough queues for each cpu) is in the
distributed_device currently and not really implemented yet, so only one
queue or queues == cpus scenarios are supported currently, but this can
be fixed later.

The plan is to rename "distributed_device" to "device" and "device"
to "queue_pair" in later patches.
2014-12-09 18:55:14 +02:00
Gleb Natapov
34a8744fd3 smp: wait for all cpus before signaling start promise
If the start promise on the initial cpu is signaled before the other cpus have
their networking stack constructed, collectd initialization crashes, since it
tries to create a UDP socket on all available cpus when the initial one is
ready.
2014-12-09 18:54:56 +02:00
Avi Kivity
7dfd7de8cd future: optimize data-less future<>
A future that does not carry any data (future<>) and its sibling (promise<>)
are heavily used in the code.  We can optimize them by overlaying the
future's payload, which in this case can only be an std::exception_ptr,
with the future state, as a pointer and an enum have disjoint values.

This of course depends on std::exception_ptr being implemented as a pointer,
but as it happens, it is.

With this, sizeof(future<>) is reduced from 24 bytes to 16 bytes.
2014-12-09 10:08:48 +02:00
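The overlay idea in 7dfd7de8cd can be illustrated with a single word that holds either a small state value or a pointer. This is only a sketch under stated assumptions: the class `compact_state` and its members are hypothetical, a plain `const void*` stands in for std::exception_ptr, and it assumes no real object lives at an address below 3.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: one word stores either a state tag or a pointer.
class compact_state {
    // Small integers are reserved for states; any larger value is
    // interpreted as the address of the stored error object.
    enum : std::uintptr_t { invalid = 0, pending = 1, result = 2, first_pointer = 3 };
    std::uintptr_t _bits = pending;
public:
    void set_result() { _bits = result; }
    void set_exception(const void* ex) {
        auto bits = reinterpret_cast<std::uintptr_t>(ex);
        assert(bits >= first_pointer);  // assumption: real addresses are never 0..2
        _bits = bits;
    }
    bool available() const { return _bits >= result; }
    bool failed() const { return _bits >= first_pointer; }
    const void* exception() const {
        return failed() ? reinterpret_cast<const void*>(_bits) : nullptr;
    }
};

// The state and the payload share one word instead of occupying two.
static_assert(sizeof(compact_state) == sizeof(std::uintptr_t),
              "state and payload are overlaid");
```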
Asias He
20acb6db9c xen: Fix mismatched signature
Found with clang:

[46/68] CXX build/release/core/xen/evtchn.o
FAILED: clang -MMD -MT build/release/core/xen/evtchn.o -MF
build/release/core/xen/evtchn.o.d -std=gnu++1y   -Wall -Werror
-fvisibility=hidden -pthread -I.  -Wno-mismatched-tags -DHAVE_XEN
-DHAVE_HWLOC -DHAVE_NUMA -O2 -I build/release/gen -c -o
build/release/core/xen/evtchn.o core/xen/evtchn.cc
core/xen/evtchn.cc:83:18: error: 'xen::userspace_evtchn::umask' hides
overloaded virtual function [-Werror,-Woverloaded-virtual]
    virtual void umask(int *port, unsigned count);
                 ^
core/xen/evtchn.hh:38:18: note: hidden overloaded virtual function
'xen::evtchn::umask' declared here: type mismatch at 2nd parameter
('int' vs 'unsigned int')
    virtual void umask(int *port, int count) {};
                 ^
1 error generated.
2014-12-09 09:59:46 +02:00
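The class of error clang reported in 20acb6db9c can be avoided by matching the base signature exactly and marking the function `override`. The sketch below uses hypothetical class names and an `int` return value (the original functions return void) so that the dispatch can be observed.

```cpp
#include <cassert>

// Hypothetical stand-in for the base class with the virtual function.
struct channel_base {
    virtual ~channel_base() = default;
    // Default implementation: handles no ports.
    virtual int umask(int* port, int count) {
        (void)port;
        (void)count;
        return 0;
    }
};

// Hypothetical stand-in for the derived class.
struct userspace_channel : channel_base {
    // Same parameter types as the base. With `override`, a mismatch such as
    // `unsigned count` becomes a compile error instead of silently declaring
    // a new function that hides the base one.
    int umask(int* port, int count) override {
        (void)port;
        return count;
    }
};
```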
Avi Kivity
30143fe18d reactor: destroy network_stack after timer infrastructure
The network stack contains a timer, so it must be constructed after the
timer infrastructure and destroyed before it.

Fixes a segfault on shutdown.
2014-12-07 17:37:13 +02:00
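The ordering requirement in 30143fe18d follows from the C++ rule that members are constructed in declaration order and destroyed in reverse. A minimal sketch, with hypothetical type names and an event log added purely for illustration:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records construction and destruction events so the order can be inspected.
inline std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-ins for the two subsystems named in the commit.
struct timer_infrastructure {
    timer_infrastructure() { event_log().push_back("timers up"); }
    ~timer_infrastructure() { event_log().push_back("timers down"); }
};

struct network_stack {
    network_stack() { event_log().push_back("net up"); }
    ~network_stack() { event_log().push_back("net down"); }
};

struct reactor_sketch {
    // Declared first: constructed first and destroyed last, so anything
    // below may safely hold timers for its whole lifetime.
    timer_infrastructure timers;
    network_stack net;
};
```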
Avi Kivity
674076c7bd smp: fix indentation 2014-12-07 17:37:13 +02:00
Avi Kivity
f4d7bd7e00 reactor: register pollers using a RAII class
Avoids leaking a poller.
2014-12-07 17:36:44 +02:00
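The RAII registration in f4d7bd7e00 might look roughly like the sketch below, where the destructor of a registration object removes the poller again. The names `poll_registry` and `registration` are hypothetical and do not reflect the reactor's real interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical registry of polling callbacks.
class poll_registry {
    std::vector<const std::function<bool()>*> _pollers;
public:
    // Owns one callback; registers it on construction and unregisters it on
    // destruction, so a poller cannot be leaked by forgetting to remove it.
    class registration {
        poll_registry& _registry;
        std::function<bool()> _fn;
    public:
        registration(poll_registry& r, std::function<bool()> fn)
            : _registry(r), _fn(std::move(fn)) {
            _registry._pollers.push_back(&_fn);
        }
        ~registration() {
            auto& v = _registry._pollers;
            v.erase(std::remove(v.begin(), v.end(), &_fn), v.end());
        }
        // Non-copyable: the registry stores the address of _fn.
        registration(const registration&) = delete;
        registration& operator=(const registration&) = delete;
    };

    std::size_t count() const { return _pollers.size(); }

    // Runs every registered poller once; returns true if any reported work.
    bool poll_once() const {
        bool work = false;
        for (auto* p : _pollers) {
            work = (*p)() || work;
        }
        return work;
    }
};
```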
Gleb Natapov
4ade76a182 reactor: add missing std::forward in at_exit() 2014-12-07 16:45:53 +02:00
Avi Kivity
2ee0239a4a Merge branch 'tgrabiec/zero-copy-2' of github.com:cloudius-systems/seastar-dev
Zero-copy memcached get from Tomasz:

"I've measured memcached on muninn/huginn to be 7.5% better with this on vhost
stack."
2014-12-04 16:31:04 +02:00
Tomasz Grabiec
c4335c49f6 core: convert output APIs to work on packets
This way zero-copy supporting code can put data directly to packet
object and pass it through all layers efficiently.
2014-12-04 13:51:26 +01:00
Tomasz Grabiec
ba0ac1c2b8 core: simplify write_all()
The only case when write_all() does not write all the data is when the
fiber fails at some point, in which case the resulting future is
failed too.
2014-12-04 13:37:36 +01:00
Tomasz Grabiec
cd3ba33ead core: introduce scattered_message
It's a builder class for creating messages comprised of multiple
fragments.
2014-12-04 13:37:35 +01:00
Tomasz Grabiec
a2ca556836 sstring: introduce release()
Releases ownership of the data and gives it away as
temporary_buffer. This way we can avoid allocation when putting rvalue
sstring if it's already using external storage. Except we need to
allocate a deleter which uses delete[], but this can be fixed later.
2014-12-04 13:37:35 +01:00
Tomasz Grabiec
e720f53c22 deleter: add chaining make_free_deleter() overload 2014-12-04 13:37:35 +01:00
Tomasz Grabiec
8e38cb4159 deleter: introduce append()
It's to help chaining up deleters when appending a fragment with an
already created deleter.
2014-12-04 13:37:35 +01:00
Tomasz Grabiec
a88ddcec25 sstring: overload to_sstring() for temporary_buffer<> 2014-12-04 13:35:33 +01:00
Tomasz Grabiec
bcea3a67ca output_stream: support for output packet trimming
For UDP memcached we cannot generate arbitrarily large chunks, we need
to trim to datagram size. It's most efficient to split in the
output_stream.
2014-12-03 20:02:21 +01:00
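The trimming described in bcea3a67ca amounts to cutting the outgoing data into pieces no larger than a given limit. The sketch below shows the idea on a std::string; the function name `trim_to_size` is hypothetical, and the real output_stream works on buffers and packets rather than strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `data` into chunks of at most `max_size` bytes,
// for example so that each chunk fits into a single datagram.
inline std::vector<std::string> trim_to_size(const std::string& data,
                                             std::size_t max_size) {
    std::vector<std::string> chunks;
    if (max_size == 0) {
        return chunks;  // nothing can be emitted with a zero limit
    }
    for (std::size_t off = 0; off < data.size(); off += max_size) {
        // substr() clamps the length at the end of the string, so the last
        // chunk may be shorter than max_size.
        chunks.push_back(data.substr(off, max_size));
    }
    return chunks;
}
```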
Tomasz Grabiec
4b7c42a5c7 output_stream: fix bug in write()
When coalescing a large buffer with buffered data, _end was not updated
so flush() would yield shorter packet.
2014-12-03 20:02:21 +01:00
Tomasz Grabiec
6ae5177c2c output_stream: do not allocate on flush()
In UDP memcached flush() is always the last operation on
output_stream, so that allocation is wasted.
2014-12-03 20:02:21 +01:00
Tomasz Grabiec
584139decd future-util: make do_for_each() propagate failure 2014-12-03 20:02:21 +01:00
Tomasz Grabiec
8d48c91a35 future-util: introduce now()
This function belongs to a group of functions for associating futures
with time points. Currently there's only now(), which serves as a shorthand
for make_ready_future<>().
2014-12-03 19:57:43 +01:00
Gleb Natapov
4fd3313e3e reactor: add "--poll" command line switch
If the switch is used, the reactor never goes idle.
2014-12-03 14:37:49 +02:00
Gleb Natapov
d151763967 reactor: move memory barrier to idle() accessors 2014-12-03 14:37:41 +02:00
Gleb Natapov
4d3b6497ea reactor: rework poll infrastructure
Move idle state management out from smp poller back to generic code. Each
poller returns whether it did any useful work, and generic code decides whether it
should go idle based on that. If a poller requires constant polling it
should always return true.
2014-12-03 14:37:33 +02:00
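The convention introduced in 4d3b6497ea (each poller reports whether it did useful work, and generic code decides about going idle) can be sketched as follows. The function name `may_go_idle` is hypothetical, and the sketch ignores everything else a real main loop does.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical generic-code decision: run all pollers for one iteration and
// allow going idle only if none of them did any useful work.
inline bool may_go_idle(const std::vector<std::function<bool()>>& pollers) {
    bool did_work = false;
    for (const auto& poll : pollers) {
        // Every poller is called on every iteration; there is no early exit.
        if (poll()) {
            did_work = true;
        }
    }
    return !did_work;
}
```

A poller that needs constant polling simply returns true every time, which keeps the loop from ever going idle.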
Tomasz Grabiec
f556172619 temporary_buffer: avoid malloc() for an empty buffer 2014-12-03 13:15:09 +01:00
Tomasz Grabiec
1c49669f59 temporary_buffer: introduce operator bool()
It's used as a test for emptiness by convention.

Allows for things like:

  if (buf) {
     // not empty
  }
2014-12-03 13:15:09 +01:00
Avi Kivity
1e572a3248 app-template: add missing app-template.cc 2014-12-01 18:01:12 +02:00
Avi Kivity
6b5973af70 app-template: don't alias boost::program_options as bpo in a header file
We only have one global namespace, let's work together to keep it free
of pollution.
2014-12-01 17:56:34 +02:00
Avi Kivity
78691fc72f app-template: move to a .cc file
Reduce compile loads.
2014-12-01 17:48:18 +02:00
Avi Kivity
e1397038d4 future-util.hh: add missing include
'task_quota' needs reactor.hh
2014-12-01 17:47:28 +02:00
Avi Kivity
256d1823c6 app-template: warn on debug mode 2014-12-01 17:33:47 +02:00
Avi Kivity
be0ae4f5dc memory: Un-hide standard allocator functions
With -fvisibility=hidden, all executable symbols are hidden from shared
objects, allowing more optimizations (especially with -flto).  However, hiding
the allocator symbols means that memory allocated in the executable cannot
be freed in a library, since they will use different allocators.

Fix by exposing these symbols with default visibility.

Fixes crash loading some dpdk libraries.
2014-12-01 14:49:04 +02:00
Gleb Natapov
c90e56e4fb memory: dynamically search for memory level in a topology
Current code assumes that memory is at node level, but on non-NUMA
machines there is no node level at all. Instead of assuming the memory
location in a topology, search for it dynamically.
2014-12-01 14:09:36 +02:00
Gleb Natapov
bf46f9c948 net: Change how networking devices are created
Currently each cpu creates network device as part of native networking
stack creation and all cpus create native networking stack independently,
which makes it impossible to use data initialized by one cpu in another
cpu's networking device initialization. For multiqueue devices often some
parts of an initialization have to be handled by one cpu and all other
cpus should wait for the first one before creating their network devices.
Even without multiqueue proxy devices should be created after master
device is created so that proxy device may get a pointer to the master
at creation time (existing code uses global per cpu device pointer and
assumes that master device is created on cpu 0 to compensate for the lack
of ordering).

This patch makes it possible to delay native networking stack creation
until network device is created. It allows one cpu to be responsible
for creation of network devices on multiple cpus. A single queue device
initializes the master device on one cpu and calls other cpus with a pointer
to the master device and its cpu id, which are used in proxy device creation.
This removes the need for per cpu device pointer and "master on cpu 0"
assumption from the code since now master device and slave devices know
about each other and can communicate directly.
2014-11-30 18:10:08 +02:00
Gleb Natapov
a38f189f5a memory: handle hwloc cousin lists not circular
Use hwloc_get_next_obj_by_type() instead of directly following cousin
list and handle list wrap around. Also fixed use of uninitialized
variable (I wonder why compiler did not complain).
2014-11-30 16:20:53 +02:00
Avi Kivity
e9432e9254 reactor: move collectd initialization out of reactor::run()
It's complicated enough without it.
2014-11-30 14:24:19 +02:00
Gleb Natapov
cbc2f40680 memory: fix numa memory initialization
Current code crashes on an assert while dividing memory to cpus if number
of cpus seastar is configured to use is smaller than number of available
numa nodes. The reason is that seastar tries to use all available memory,
but considers only one numa node while dividing it. This patch makes
memory division a two phase process: first each cpu tries to grab as
much memory from its local node as it can, second all free memory that
was left is divided between all cpus. The algorithm works like that to
prevent one cpu from stealing local memory from another cpu.
2014-11-30 12:50:23 +02:00
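The two phase division described in cbc2f40680 can be sketched as below. This is an illustration under assumptions, not the actual allocator code: the names `cpu_share` and `divide_memory` are hypothetical, and capping each cpu at an equal share of the total is an assumption the commit message does not spell out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type: how much a cpu got from its own node and how
// much from other nodes.
struct cpu_share {
    std::size_t local = 0;
    std::size_t remote = 0;
};

// node_free[n] is the free memory on node n; cpu_node[c] is the node cpu c
// belongs to. node_free is taken by value because it is consumed here.
inline std::vector<cpu_share> divide_memory(std::vector<std::size_t> node_free,
                                            const std::vector<std::size_t>& cpu_node) {
    std::vector<cpu_share> shares(cpu_node.size());
    if (cpu_node.empty()) {
        return shares;
    }
    const std::size_t total =
        std::accumulate(node_free.begin(), node_free.end(), std::size_t{0});
    const std::size_t per_cpu = total / cpu_node.size();  // assumed equal target

    // Phase 1: each cpu takes as much as it can from its local node only,
    // so no cpu steals another cpu's local memory.
    for (std::size_t c = 0; c < cpu_node.size(); ++c) {
        std::size_t& avail = node_free.at(cpu_node[c]);
        shares[c].local = std::min(per_cpu, avail);
        avail -= shares[c].local;
    }

    // Phase 2: whatever is still free on any node tops up the cpus that
    // could not reach their target locally.
    for (std::size_t c = 0; c < cpu_node.size(); ++c) {
        std::size_t need = per_cpu - shares[c].local;
        for (std::size_t& avail : node_free) {
            const std::size_t take = std::min(need, avail);
            avail -= take;
            shares[c].remote += take;
            need -= take;
        }
    }
    return shares;
}
```

With one cpu and two nodes, the scenario from the commit message, the single cpu takes all of its local node in the first phase and the other node's memory in the second.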
Vlad Zolotarov
47b3721ccf reactor: added a "pollers" abstraction
Each "poller" registers a non-blocking callback which is then called in
every iteration of a reactor's main loop.

Each "poller"'s callback returns a boolean: if TRUE then a main loop is allowed to block
(e.g. in epoll()).

If any of the registered "pollers" returns FALSE then reactor's main loop is forbidden to block
in the current iteration.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2014-11-30 12:12:39 +02:00
Glauber Costa
2cf187590f xen: fix userspace interrupts
The local variable used to read the ports won't be valid after we return from
the function. Moving it to be an instance member is not ideal, but it works if
we don't unmask the ports until we're done signaling them all.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-28 14:23:14 +01:00
Glauber Costa
a4667c48e6 xen: fix gntalloc for userspace
It broke when we changed things to accommodate OSv's functions. The following
code works.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-27 18:00:35 +01:00
Glauber Costa
f06233695c xenstore: bail on error
If there is some error opening the xenstore (for instance, if we run
without privileges), we should bail out or we will segfault later.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-27 18:00:35 +01:00
Glauber Costa
bd8a18c178 xen: umask event channels when setup is ready
This is not required for OSv, but is required for userspace operation.
It won't work without it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-27 18:00:35 +01:00
Gleb Natapov
4f4731c37b net: delay network stack creation
Network device has to be available when network stack is created, but
sometimes network device creation should wait for device initialization
by another cpu. This patch makes it possible to delay network stack
creation until network device is available.
2014-11-26 16:46:04 +02:00
Avi Kivity
87fdf52205 Merge branch 'clang' 2014-11-26 15:01:14 +02:00
Avi Kivity
58487b55d4 smp: massage init captures to satisfy clang 2014-11-26 14:59:03 +02:00