53948 Commits

Author SHA1 Message Date
Glauber Costa
ee172e36c1 xen: enhance gntref
Enhance gntref with some useful operations. Also provide a default object that
represents an invalid grant.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-06 11:21:29 +01:00
Gleb Natapov
d698811bdd fix smp broadcast packet handling
Some packets, like ARP replies, are broadcast to all cpus for handling,
but only the packet structure is copied for each cpu; the actual packet
data is shared by all of them. Currently the networking stack mangles
the packet data during its travel up the stack while doing ntoh()
translations, which obviously cannot work for broadcast packets. This
patch fixes the code to not modify packet data while doing ntoh(), and
to do the translation in a stack-allocated copy of the data instead.
2014-11-06 10:30:30 +02:00
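The idea behind d698811bdd can be sketched as follows. This is only an illustration, not the code from the patch: the header layout `udp_like_hdr` and the function `read_header` are hypothetical names, and the only library calls used are the POSIX `ntohs()` and `std::memcpy()`. The point is that byte-order conversion happens in a stack-allocated copy, so the shared packet bytes stay in network order for the other cpus.

```cpp
#include <arpa/inet.h>  // ntohs (POSIX)
#include <cstdint>
#include <cstring>      // std::memcpy

// Hypothetical header layout, for illustration only.
struct udp_like_hdr {
    uint16_t src_port;
    uint16_t dst_port;
    uint16_t len;
    uint16_t csum;
};

// Returns a host-order copy of the header found at `wire`.
// The bytes behind `wire` are never written, so every cpu that shares
// the same packet data still sees them in network byte order.
inline udp_like_hdr read_header(const uint8_t* wire) {
    udp_like_hdr h;
    std::memcpy(&h, wire, sizeof(h));  // stack-allocated copy
    h.src_port = ntohs(h.src_port);
    h.dst_port = ntohs(h.dst_port);
    h.len = ntohs(h.len);
    h.csum = ntohs(h.csum);
    return h;
}
```

Because the function takes a `const` pointer, an accidental in-place conversion would fail to compile, which is one way to keep the rule from regressing.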
Pekka Enberg
86aa399482 net: Fix build when Xen support is disabled
Fixes the following link errors when Xen support is disabled:

build/release/net/native-stack.o: In function `net::add_native_net_options_description(boost::program_options::options_description&)':
/seastar/net/native-stack.cc:101: undefined reference to `get_xenfront_net_options_description()'
build/release/net/native-stack.o: In function `net::create_native_net_device(boost::program_options::variables_map)':
/seastar/net/native-stack.cc:93: undefined reference to `create_xenfront_net_device(boost::program_options::variables_map, bool)'
collect2: error: ld returned 1 exit status

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2014-11-06 10:24:03 +02:00
Glauber Costa
63c8db870f xen: remove debug printfs
As packet flow is working reasonably now, most of the prints can go.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-05 22:30:25 +01:00
Glauber Costa
73b8f98318 xen: use nr_ents instead of numeric constant in netfront header
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-05 21:41:52 +01:00
Avi Kivity
5052d34d23 Merge branch 'xen'
Partial Xen support.
2014-11-05 15:31:23 +02:00
Avi Kivity
369f31d4c5 xen: simplify front_ring constructor
2014-11-05 15:09:04 +02:00
Avi Kivity
2d14053e6e xen: make gntref more readable
Convert it from std::pair with meaningless .first and .second fields to
a proper struct.
2014-11-05 15:09:04 +02:00
Avi Kivity
0a0dc6eb90 xen: provide correct checksum offload flags to the host
Tell Xen when we've computed the checksum ourselves, and when we have a
partial checksum filled.
2014-11-05 15:09:04 +02:00
Avi Kivity
c52b4fdc47 xen: partial support for checksum offload
Checksum offload cannot be disabled in Xen (or at least, I haven't figured
out how).  Advertise it as enabled, so that tcp doesn't drop packets as
failing their checksum.

Still need to flesh out the transmit path.

With this, seastar sends SYN/ACK packets in response to connection requests.
2014-11-05 15:09:04 +02:00
Avi Kivity
6581de0fa7 xen: nack features we don't support yet
Pretending to support a feature we don't can lead to protocol failures.
2014-11-05 15:09:04 +02:00
Avi Kivity
2fdaac3132 xen: linearize packet before transmitting
Since we haven't negotiated the scatter/gather capability yet, and we
don't support the scatter protocol, linearize the packet before sending it.
2014-11-05 15:09:03 +02:00
Avi Kivity
a9a87c8dbd xen: fix low-level interrupt handling with osv
The Xen code registers a function that calls semaphore::signal as
an interrupt handler. However, that function is not smp safe and may crash,
and the events it generates are likely to be ignored, since they are just
appended to the reactor queue without any real wakeup to the reactor thread.

Switch to using an eventfd.  That's still unsafe, but a little better, since
its signalling is smp safe, and will cause the reactor thread to wake up
in case it was asleep.

With this, we are able to receive multiple packets.
2014-11-05 15:09:03 +02:00
Avi Kivity
6e193b2874 xen: fix memory barrier when writing rx buffer ring
The barrier must separate writing the ring data from the ring index,
otherwise the other side may see unwritten ring data.
2014-11-05 15:09:03 +02:00
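The ordering requirement in 6e193b2874 can be illustrated with a single-producer ring. This sketch swaps in `std::atomic` release/acquire ordering for the explicit write barrier a Xen shared ring would use, and the type `rx_ring_sketch` with its members is hypothetical; only the name `nr_ents` is borrowed from the log above.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

// Hypothetical single-producer / single-consumer ring.
struct rx_ring_sketch {
    static constexpr uint32_t nr_ents = 256;   // power of two
    std::array<uint32_t, nr_ents> slots{};
    std::atomic<uint32_t> req_prod{0};         // index the other side reads

    // Producer: write the slot first, then publish the index.
    void post(uint32_t value) {
        uint32_t prod = req_prod.load(std::memory_order_relaxed);
        slots[prod % nr_ents] = value;         // 1. ring data
        // 2. ring index. The release store keeps the data write above
        //    from being reordered after the index becomes visible.
        req_prod.store(prod + 1, std::memory_order_release);
    }

    // Consumer: read the index with acquire, then the slot.
    bool poll(uint32_t cons, uint32_t& out) const {
        if (cons == req_prod.load(std::memory_order_acquire)) {
            return false;                      // nothing new
        }
        out = slots[cons % nr_ents];
        return true;
    }
};
```

If the index were published before the slot was written, the consumer could observe the new index and read a stale slot, which is the failure the commit message describes.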
Avi Kivity
9f5a4e90d1 xen: fix misaccounting of prepared rx buffers
We prepared N buffers, but only told the host about one.  This meant the host
stopped forwarding received packets almost immediately.

Fix by writing the Xen-visible ring index correctly.
2014-11-05 15:09:03 +02:00
Avi Kivity
80c8337eef xen: don't receive packets before we've created a subscription
Or the code falls over on a null _sub.
2014-11-05 15:09:03 +02:00
Avi Kivity
a769737faa xen: fix another bad grant operation
We used gnttab_grant_foreign_access() instead of
gnttab_grant_foreign_access_ref().  While the two functions have similar
enough signatures, they do very different things.

With the change, we are able to receive packets from Xen, though we crash
immediately.
2014-11-05 15:09:03 +02:00
Avi Kivity
afbe788235 xen: fix bad grant operation
We used gnttab_grant_foreign_access() instead of
gnttab_grant_foreign_access_ref().  While the two functions have similar
enough signatures, they do very different things.

With the change, we are able to transmit packets through Xen.
2014-11-05 15:09:03 +02:00
Avi Kivity
6269fe2bdf xen: fix virt_to_mfn()
Need to shift by 12 to get to a frame number.  With this the host accepts
the guest interface.
2014-11-05 15:09:03 +02:00
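The shift in 6269fe2bdf follows from the 4 KiB page size: 2^12 = 4096, so dropping the low 12 bits of an address leaves the frame number. The sketch below shows only that arithmetic. `addr_to_frame` and `page_shift` are hypothetical names, and a real virt_to_mfn() must also translate the virtual address to a machine address, which is not modeled here.

```cpp
#include <cstdint>

constexpr unsigned page_shift = 12;  // 4 KiB pages: 1 << 12 == 4096

// Frame number containing the given address (address-to-frame step only).
constexpr uint64_t addr_to_frame(uint64_t addr) {
    return addr >> page_shift;
}

static_assert(addr_to_frame(0x1000) == 1, "first byte of frame 1");
static_assert(addr_to_frame(0x1fff) == 1, "last byte of frame 1");
```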
Glauber Costa
6bb8d687d0 native stack: support more than virtio
Support xenfront as well, when we are in a Xen domain.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-05 15:09:03 +02:00
Glauber Costa
72abe62c4e xenfront basic support
This is the basic support for xenfront. It can be used in domU, provided there
is a network interface to be hijacked.

The code that follows is just the mechanics of managing the grants, event
channels, etc.

However, it does not work yet: I can't see netback injecting any data into it.
I am still debugging the protocol, but I wanted to flush the current state.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-05 15:09:03 +02:00
Glauber Costa
9fa8124ade xen: evtchn support
This patch enables xen event channels. It creates the placeholder for the
kernel evtchns when we move to OSv as well.

The main problem with this patch is that evtchn::pending can return more than
one evtchn, so what I am doing here is technically wrong. We should probably
call keep_doing() in pending() itself, and have it store references to the
futures corresponding to the possible event channels, which would then be made ready.

I am, however, having a bit of a hard time coding this, since it's still
unclear how, once the future is consumed, we would generate the next.

Please note: All of this is moot if we disable "split event channels", which
can be done by masking that feature in case it is even available. In that case,
only one event channel will be notified, and when ready, we process both tx and
rx. This is yet another reason why I haven't insisted so much on fixing this properly.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-05 15:09:03 +02:00
Glauber Costa
891b40a2af xen: gntalloc device
This patch creates a seastar-enabled version of the xen gntalloc device.

TODO: grow the table dynamically, and fix the index selection algorithm. It
shouldn't just always bump the index by 1.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-05 15:09:03 +02:00
Glauber Costa
7963eb026c header functions for osv + xen
Should come from OSv, we should fix this soon.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-05 15:09:03 +02:00
Calle Wilund
6d2095e12d net: DHCP
Simple discovery class + usage of it in the native stack init.

DHCP discovery is the default if no other IP options are set on the command
line and it is not explicitly turned off.

Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
2014-11-05 14:50:56 +02:00
Calle Wilund
2417fb9612 reactor: Make "engine_started" dependent on network stack init
Also make queue init identical for main and secondary cpus to ensure
notification across threads is possible before "engine start".

Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
2014-11-05 14:50:42 +02:00
Calle Wilund
bd263b3b4e net: Add "packet filter" functionality + accessors + "raw" packet send function
Perhaps not the best way to enable "hijacking" the ip stack (for DHCP
querying), but considering the options seems the least intrusive.

Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
2014-11-05 14:50:28 +02:00
Asias He
534d4017a1 scripts: Add tap.sh
This is the script I used to set up a tap device for seastar.
2014-11-05 13:05:55 +02:00
Avi Kivity
6a2532fb00 print: add log() function
Like print(), but with time information prepended.
2014-11-05 11:35:50 +02:00
Gleb Natapov
d77ee625bd virtio: signal availability of a virtio buffer in a vring after sending packet
Currently there is an implicit unbounded queue between the virtio driver
and the networking stack, where packets may accumulate if they are received
faster than the networking stack can handle them. The queuing happens because
virtio buffer availability is signaled immediately after the received-buffer
promise is fulfilled, but promise fulfilment does not mean that the buffer is
processed, only that the task that will process it is placed on a task queue.

The patch fixes the problem by making a virtio buffer available only after
the previous buffer's completion task is executed. This makes the aforementioned
implicit queue between the virtio driver and the networking stack bounded by the
virtio ring size.
2014-11-04 15:19:27 +02:00
Gleb Natapov
99941f0c16 virtio: remove feedback from virtio_net_device::queue_rx_packet()
Instead of providing back pressure towards NIC, which will cause NIC to
slow down and drop packets, network stack should drop packets it cannot
handle by itself. Otherwise one slow receiver may cause drops for all
others.  Our native network stack correctly drops packets instead of
providing feedback, so it is safe to just remove feedback from an API.
2014-11-04 15:19:13 +02:00
Gleb Natapov
f0416f44b1 keep_doing: remove infinite loop
Prevent keep_doing() from monopolizing the cpu.
2014-11-04 15:19:01 +02:00
Gleb Natapov
e5dfd8e863 future: limit number of ready futures that are executed without scheduling a task
Otherwise stack may overflow if a very long chain of ready futures is
executed.
2014-11-04 15:18:32 +02:00
Gleb Natapov
f8575a1745 reactor: limit number of tasks the reactor runs between polling fds
If there is a task that always adds another task to the ready task list,
epoll_wait will never run.
2014-11-04 15:18:23 +02:00
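The starvation scenario in f8575a1745 and the fix can be sketched with a toy run loop. Everything here is hypothetical: `mini_reactor`, the `max_batch` value of 256 and `poll_fds()` (a stand-in for the epoll_wait call) are illustration names, not the reactor's real interface.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Toy reactor: runs at most max_batch tasks, then polls for I/O.
struct mini_reactor {
    static constexpr std::size_t max_batch = 256;  // assumed limit
    std::deque<std::function<void()>> tasks;
    std::size_t polls = 0;

    void poll_fds() { ++polls; }  // stand-in for epoll_wait()

    void run_once() {
        std::size_t ran = 0;
        // Without the `ran < max_batch` bound, a task that always
        // re-queues itself would keep this loop spinning forever.
        while (!tasks.empty() && ran < max_batch) {
            auto t = std::move(tasks.front());
            tasks.pop_front();
            t();
            ++ran;
        }
        poll_fds();
    }
};
```

With the bound in place, a self-requeuing task gets 256 turns per iteration and the poll still happens once per iteration.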
Avi Kivity
174cc6b876 packet: add linearize()
This is helpful for net devices that do not support scatter/gather.
2014-11-04 10:55:04 +02:00
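What linearizing means for a scattered packet can be shown with plain vectors. This is a sketch under stated assumptions: `fragment` and `linearize_sketch` are hypothetical names, and the real packet class certainly differs in how it stores fragments.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using fragment = std::vector<uint8_t>;  // hypothetical fragment type

// Copy all fragments, in order, into one contiguous buffer so that a
// device without scatter/gather support can send it in a single piece.
inline std::vector<uint8_t> linearize_sketch(const std::vector<fragment>& frags) {
    std::size_t total = 0;
    for (const auto& f : frags) {
        total += f.size();
    }
    std::vector<uint8_t> out;
    out.reserve(total);  // one allocation for the whole packet
    for (const auto& f : frags) {
        out.insert(out.end(), f.begin(), f.end());
    }
    return out;
}
```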
Avi Kivity
31078be7f7 net: initialize interface::_proto_map early
If the driver starts pushing packets early, we need this field to be
initialized so they can be properly ignored.
2014-11-04 10:54:44 +02:00
Asias He
c33270105b net: Handle extra bytes contained in Ethernet frame.
The Ethernet frame might contain extra bytes after the IP packet for
padding. Trim the extra bytes in order not to confuse TCP.

E.g., when establishing a TCP connection:

1) <SYN>
2) <SYN,ACK>
3) <ACK>

Packet 3) should be 14 + 20 + 20 = 54 bytes (Ethernet, IP and TCP headers),
but the sender might send a packet of 60 bytes, containing 6 extra bytes of
padding.

Fix httpd on ran/sif.
2014-11-04 10:41:41 +02:00
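The arithmetic in c33270105b can be sketched as a helper that compares the frame length with the length the IPv4 header claims. `padding_bytes` and `eth_hdr_len` are hypothetical names; the sketch assumes an untagged Ethernet header of 14 bytes and reads the IPv4 total-length field at bytes 2 and 3 of the IP header.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t eth_hdr_len = 14;      // untagged Ethernet header
constexpr std::size_t min_ipv4_hdr_len = 20;

// Number of trailing padding bytes in an Ethernet frame carrying IPv4.
inline std::size_t padding_bytes(const uint8_t* frame, std::size_t frame_len) {
    if (frame_len < eth_hdr_len + min_ipv4_hdr_len) {
        return 0;  // too short to hold an IPv4 header; nothing to trim
    }
    const uint8_t* ip = frame + eth_hdr_len;
    // IPv4 total length: 16 bits, network byte order, at offset 2.
    std::size_t ip_total = (static_cast<std::size_t>(ip[2]) << 8) | ip[3];
    std::size_t expected = eth_hdr_len + ip_total;
    return frame_len > expected ? frame_len - expected : 0;
}
```

For the bare ACK in the example, the IP total length is 40, so a 60-byte frame yields 6 bytes to trim from the back of the packet.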
Asias He
345d3a3628 net: Add trim_back to packet
2014-11-04 10:13:36 +02:00
Asias He
52f2a2b35b tests: Add tcp_sever
Currently, it is a ping/pong server. When the client sends a ping, the
server responds with a pong. We can add more tests to extend it.
This is useful for testing TCP_RR.
2014-11-03 09:53:32 +02:00
Asias He
dbbd9865ad core: Fix read_exactly
Return an empty tmp buf if eof.
2014-11-03 09:53:31 +02:00
Glauber Costa
9a86af9543 xen: xenstore communication
This patch enables interaction with xenstore. Since OSv now fakes the
presence of libxenstore, the code is the same for userspace and kernel.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-02 16:41:19 +02:00
Glauber Costa
5c5a5b8279 build: reorder configure.py
Reorder the file so as to allow the use of command line switches
in the construction of the compiled objects.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2014-11-02 16:41:19 +02:00
Avi Kivity
c845606a2d Merge branch 'join'
Rewrite http connection termination management to support the
various cases dictated by the HTTP spec: client-side connection
close, server-side connection close, and header specified connection
close.
2014-11-01 19:38:09 +02:00
Avi Kivity
55f3a03e1d reactor: make more internal variables private
2014-11-01 17:34:45 +02:00
Avi Kivity
7a1f84a556 reactor: replace references to reactor::_id by its accessor cpu_id()
2014-11-01 17:34:43 +02:00
Raphael S. Carvalho
d382a314b3 core: sstring: move initialized_later to be a public member
It's required for instantiating a sstring with the constructor
basic_sstring(initialized_later, size_t size).

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2014-11-01 15:36:28 +02:00
Raphael S. Carvalho
6188745a98 core: introduce size method into file_impl
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2014-11-01 15:36:28 +02:00
Asias He
b4544a3c76 tcp: Retransmission support
Manage the RTO timer using the algorithm in RFC 6298.
2014-10-31 12:11:20 +02:00
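The RTO computation from RFC 6298 can be sketched as below. This is not the code from b4544a3c76: `rto_estimator` and its members are hypothetical, times are plain integer milliseconds, and the clock granularity G is assumed to be 1 ms. The constants (alpha = 1/8, beta = 1/4, K = 4, 1 s initial and minimum RTO, doubling on timeout, a 60 s cap) follow the RFC.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical RTO estimator following RFC 6298; all times in milliseconds.
struct rto_estimator {
    static constexpr int64_t min_rto = 1000;    // RFC 6298 (2.4)
    static constexpr int64_t max_rto = 60000;   // allowed cap, (2.5)
    static constexpr int64_t granularity = 1;   // assumed clock granularity G

    int64_t srtt = 0;
    int64_t rttvar = 0;
    int64_t rto = 1000;                         // initial RTO, (2.1)
    bool have_sample = false;

    // Feed one round-trip time measurement R.
    void on_rtt_sample(int64_t r) {
        if (!have_sample) {
            srtt = r;                           // (2.2)
            rttvar = r / 2;
            have_sample = true;
        } else {
            int64_t diff = srtt > r ? srtt - r : r - srtt;
            rttvar = (3 * rttvar + diff) / 4;   // beta = 1/4, uses old SRTT
            srtt = (7 * srtt + r) / 8;          // alpha = 1/8
        }
        int64_t candidate = srtt + std::max<int64_t>(granularity, 4 * rttvar);
        rto = std::min<int64_t>(max_rto, std::max<int64_t>(min_rto, candidate));
    }

    // Retransmission timer expired: back off exponentially, (5.5).
    void on_timeout() {
        rto = std::min<int64_t>(max_rto, rto * 2);
    }
};
```

RTTVAR is updated before SRTT on purpose: the RFC specifies that the variance update uses the previous SRTT value.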
Tomasz Grabiec
3de4680e56 function_input_iterator: fix postfix operator++
2014-10-30 19:51:21 +02:00
Tomasz Grabiec
95fd885996 virito: fix typo
2014-10-30 19:50:58 +02:00