scylladb

Author	SHA1	Message	Date
Gleb Natapov	13c1324d45	net: provide some statistics via collectd Provide batching and overall send/received packet stats.	2015-01-08 17:41:26 +02:00
Gleb Natapov	51fb18aba0	net: remove unused variable from virtio	2015-01-08 16:45:01 +02:00
Gleb Natapov	aae617f9f5	net: revert whatever is left from "virtio: batch transmitted packets" commit. Revert remains of commit `503f1bf4` since there is no need to batch packets inside virtio any more. Upper layer does it already.	2015-01-06 15:24:10 +02:00
Gleb Natapov	72324f02e2	net: implement bulk sending interface for virtio	2015-01-06 15:24:10 +02:00
Gleb Natapov	d329a0a614	net: remove non polling mode from virtio-net	2014-12-28 14:54:43 +02:00
Avi Kivity	ebf89ac560	virtio: use make_object_deleter	2014-12-16 14:55:02 +02:00
Gleb Natapov	fbef83beb0	net: support for num of cpus > num of queues This patch introduces logic to divide cpus between available hw queue pairs. Each cpu with a hw qp gets a set of cpus to distribute traffic to. The algorithm doesn't take any topology considerations into account yet.	2014-12-16 10:53:41 +02:00
Avi Kivity	a3f08c32de	virtio: rename misleading _deleters field It's just a set of buffers (albeit maintained as unique_ptrs for their destructors). Not the 'deleter' type.	2014-12-15 11:42:33 +02:00
Avi Kivity	38b1398750	virtio: remove outdated TODO re single-fragment packet We already special case single fragment packets on the receive path.	2014-12-15 11:39:00 +02:00
Avi Kivity	508322c7da	virtio: de-futurize receive Move completion handling (destroy packet, adjust descriptors count) to a completion function rather than a future. Reduces allocations and tasks executed.	2014-12-14 18:49:01 +02:00
Avi Kivity	1ee959d3e2	virtio: de-futurize transmit Move completion handling (destroy packet, adjust descriptors count) to a completion function rather than a future. Reduces allocations and tasks executed.	2014-12-14 18:49:01 +02:00
Avi Kivity	c7c0aebf07	virtio: abstract vring request completions Currently vring request completions are handled by fulfilling a promise contained in the request. While promises are very flexible, this comes at a cost (allocating and executing a task), and this flexibility is unneeded when request handling is very regular (such as in virtio-net rx and tx completion handling). Make vring more flexible by allowing the completion function to be specified as a template parameter. No changes to the actual users - they now specify the completion function as fulfilling the same promise as vring previously did.	2014-12-14 18:49:01 +02:00
Avi Kivity	a86faf0209	virtio: de-virtualize virt_to_phys It is not a device property, but a system property.	2014-12-14 18:49:01 +02:00
Avi Kivity	f3d2908757	virtio: move buffer and config out of vring class Prior to templating it, best to get the common elements out.	2014-12-14 18:49:01 +02:00
Avi Kivity	fcbcc19231	virtio: remove buffer_chain class It's a concept that is instantiated by its users, not a true class.	2014-12-14 18:49:01 +02:00
Avi Kivity	5c4ae7a726	virtio: minor code movement	2014-12-14 18:49:01 +02:00
Avi Kivity	d14da53171	virtio: move into 'namespace virtio'	2014-12-14 18:49:01 +02:00
Avi Kivity	ea2cfbbcd8	virtio: fix indentation	2014-12-14 10:28:48 +02:00
Avi Kivity	503f1bf4d0	virtio: batch transmitted packets Instead of placing packets directly into the virtio ring, add them to a temporary queue, and flush it when we are polled. This reduces cross-cpu writes and kicks.	2014-12-11 19:20:50 +02:00
Avi Kivity	97dff83461	virtio: don't try to complete after posting a buffer, if in poll mode We will poll for it soon anyway, and completing too soon simply reduces batching.	2014-12-11 19:15:46 +02:00
Avi Kivity	4e653081a4	virtio: poll mode support With a new --virtio-poll-mode, poll queues instead of waiting for an interrupt. Increases httpd throughput by about 12%.	2014-12-11 19:15:46 +02:00
Gleb Natapov	649210b5b6	net: rename net::distributed_device to net::device	2014-12-11 13:06:32 +02:00
Gleb Natapov	0e70ba69cf	net: rename net::device to net::qp	2014-12-11 13:06:27 +02:00
Nadav Har'El	3d874892a7	dpdk: enable transmit-side checksumming offload This patch uses the NIC's capability to calculate in hardware the IP, TCP and UDP checksums on outgoing packets, instead of us doing this on the sending CPU. This can save us quite a bit of calculations (especially for the TCP/UDP checksum of full-sized packets), and avoid cache pollution on the CPU when sending cold data. On my setup this patch improves the performance of a single-cpu memcached by 6%. Together with the recent patch for receive-side checksum offloading, the total improvement is 10%. This patch is somewhat complicated by the fact we have so many different combinations of checksum-offloading capabilities; while virtio can only offload layer-4 checksumming (tcp/udp), dpdk lets us offload both ip and layer-4 checksum. Moreover, some packets are just IP but not TCP/UDP (e.g., ICMP), and some packets are not even IP (e.g., ARP), so this patch modifies a few of the hardware-features flags and the per-packet offload-information flags to fit our new needs. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-12-10 18:05:02 +02:00
Asias He	53f95abd96	virtio: Fix feature setup This fixes a big tcp_server rx regression. Before: ========== rxrx ============ Server: 192.168.66.123:10000 Connections: 100 Bytes Sent(MiB): 10000 Total Time(Secs): 85.074086675 --->> big regression!!! Bandwidth(MiB/Sec): 117.54460601148733 After: ========== rxrx ============ Server: 192.168.66.123:10000 Connections: 100 Bytes Sent(MiB): 10000 Total Time(Secs): 9.905637754 Bandwidth(MiB/Sec): 1009.5261151622362	2014-12-10 11:01:54 +02:00
Gleb Natapov	73f6d943e1	net: separate device initialization from queues initialization This patch adds a new class, distributed_device, which is responsible for initializing the HW device and is shared between all cpus. The old device class's responsibility becomes managing an rx/tx queue pair, and it is local per cpu. Each cpu has to call distributed_device::init_local_queue() to create its own device. The logic to distribute cpus between available queues (in case there are not enough queues for each cpu) is in the distributed_device currently and not really implemented yet, so only one queue or queues == cpus scenarios are supported currently, but this can be fixed later. The plan is to rename "distributed_device" to "device" and "device" to "queue_pair" in later patches.	2014-12-09 18:55:14 +02:00
Avi Kivity	2ee0239a4a	Merge branch 'tgrabiec/zero-copy-2' of github.com:cloudius-systems/seastar-dev Zero-copy memcached get from Tomasz: "I've measured memcached on muninn/huginn to be 7.5% better with this on vhost stack."	2014-12-04 16:31:04 +02:00
Tomasz Grabiec	76a8908b21	virtio: fix indentation	2014-12-03 13:15:09 +01:00
Gleb Natapov	7dbc333da6	core: Allow forwarding from/to any cpu	2014-12-03 17:47:29 +08:00
Gleb Natapov	bf46f9c948	net: Change how networking devices are created Currently each cpu creates a network device as part of native networking stack creation and all cpus create the native networking stack independently, which makes it impossible to use data initialized by one cpu in another cpu's networking device initialization. For multiqueue devices, often some parts of the initialization have to be handled by one cpu, and all other cpus should wait for the first one before creating their network devices. Even without multiqueue, proxy devices should be created after the master device is created so that a proxy device may get a pointer to the master at creation time (existing code uses a global per cpu device pointer and assumes that the master device is created on cpu 0 to compensate for the lack of ordering). This patch makes it possible to delay native networking stack creation until the network device is created. It allows one cpu to be responsible for creation of network devices on multiple cpus. A single queue device initializes the master device on one cpu and calls other cpus with a pointer to the master device and its cpu id, which are used in proxy device creation. This removes the need for the per cpu device pointer and the "master on cpu 0" assumption from the code, since now the master device and slave devices know about each other and can communicate directly.	2014-11-30 18:10:08 +02:00
Asias He	88a1a37a88	ip: Support IP fragmentation in TX path Tested with UDP sending large datagrams with ufo off.	2014-11-30 10:16:38 +02:00
Avi Kivity	88b38bfbdf	Revert "virtio: Lazy interrupts" This reverts commit 817023f91741e43731823e72d60800016cbf2633; causes hangs and throughput problems.	2014-11-24 09:28:41 +02:00
Vlad Zolotarov	1238807d98	net: implement a few proper constructors for ethernet_address Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2014-11-23 23:26:54 +02:00
Asias He	817023f917	virtio: Lazy interrupts Tell host to interrupt less. This is useful for tx queue completion since we do not care much when the tx is completed exactly. Passed test with memcached and tcp_server.	2014-11-18 10:17:38 +02:00
Nadav Har'El	5b24dd78e2	virtio: don't use file eventfd for OSv notifications Now that our reactor supports non-file-descriptor notification mechanisms, switch to using one instead of eventfd when notifying of virtio interrupts. This will allow us to change the OSv enable_interrupt() code to run the handler directly, not in a separate thread, because it no longer needs to do sleepable write() to an eventfd file descriptor. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-11-13 22:24:38 +02:00
Gleb Natapov	d77ee625bd	virtio: signal availability of a virtio buffer in a vring after sending packet Currently there is an implicit unbounded queue between the virtio driver and the networking stack where packets may accumulate if they are received faster than the networking stack can handle them. The queuing happens because virtio buffer availability is signaled immediately after the received buffer's promise is fulfilled, but promise fulfilment does not mean that the buffer is processed, only that the task that will process it is placed on a task queue. The patch fixes the problem by making a virtio buffer available only after the previous buffer's completion task is executed. It makes the aforementioned implicit queue between the virtio driver and the networking stack bounded by the virtio ring size.	2014-11-04 15:19:27 +02:00
Gleb Natapov	99941f0c16	virtio: remove feedback from virtio_net_device::queue_rx_packet() Instead of providing back pressure towards NIC, which will cause NIC to slow down and drop packets, network stack should drop packets it cannot handle by itself. Otherwise one slow receiver may cause drops for all others. Our native network stack correctly drops packets instead of providing feedback, so it is safe to just remove feedback from an API.	2014-11-04 15:19:13 +02:00
Tomasz Grabiec	95fd885996	virito: fix typo	2014-10-30 19:50:58 +02:00
Nadav Har'El	f497299f44	virtio: support virtio ring assigned from OSv As a second option beyond running on Linux with vhost, this patch allows Seastar to run in OSv with the virtio network device "assigned" to the application (i.e., we use the virtio rings directly, with no OSv involvement beyond the initial setup). To use this feature, one needs to compile Seastar with the "HAVE_OSV" flag, the osv::assigned_virtio::get() symbol needs to be available (which means we run under OSv), and it should return a non-null object (which means the OSv was run with --assign-net). Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-10-30 16:45:08 +02:00
Nadav Har'El	4b44968e86	virtio: expose notifier's wake_wait The wake_wait() method is only available for the notifier. Expose it from the vring holding this notifier, and from the rx or tx queue holding this vring. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-10-30 16:45:07 +02:00
Nadav Har'El	5db5f7622a	virtio: make virtio_net_device an abstract class Make virtio_net_device an abstract class, and move the vhost-specific code to a subclass, virtio_net_device_vhost. In a subsequent patch, we'll have a second subclass, for a virtio device assigned from OSv. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-10-30 16:45:07 +02:00
Nadav Har'El	8326f43ded	virtio: make virt_to_phys a virtual function In the existing code, virt_to_phys() was a fixed do-nothing function. This is good for vhost, but not good enough in OSv where, to convert virtual addresses to physical, we need an actual calculation. The solution in this patch, using a virtual function, is not optimal and should probably be replaced with a template later. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-10-30 16:45:06 +02:00
Nadav Har'El	db16e4f634	virtio: separate notification from vring Currently, the "vring" class is hardcoded to do guest-host notifications via eventfd. This patch switches to a general "notification object" with two virtual functions - host_notify(), which unconditionally notifies the host, and host_wait() which returns a future<> on which one can wait for the host to notify us. This patch provides one implementation of this notification object, using eventfd as before, as needed when using vhost. We'll later provide a different implementation for running under OSv. This patch uses pointers and virtual functions; this adds a bit of overhead to every notification, but it is small compared to the other costs of these notifications. Nevertheless, we can change it in the future to make the notification object a template parameter instead of an abstract class. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-10-30 16:45:06 +02:00
Tomasz Grabiec	95975151f6	virtio: change descriptor free list to FIFO instead of LIFO Based on the observation that with packets comprised of multiple fragments vhost_get_vq_desc() goes higher in the CPU profile. Avi suggested that the current LIFO handling of free descriptors causes contention on cache lines between seastar and vhost. Gives 6-10% boost depending on hardware.	2014-10-29 19:19:54 +02:00
Gleb Natapov	1c827805bc	virtio: Use correct eventfd for virtio rx queue It is nice to be able to actually kick rx queue from time to time. Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-10-28 16:47:26 +02:00
Avi Kivity	6dcf24f98d	Move contents of async-action.hh into future-util.hh	2014-10-27 19:28:10 +02:00
Avi Kivity	91782ac6a2	virtio: optimize single-buffer packet deleter Instead of allocating a vector to store the buffers to be destroyed, in the case of a single buffer, use an ordinary free deleter. This doesn't currently help much because the packet is share()d later on, but we may be able to eliminate the sharing one day.	2014-10-21 11:27:05 +03:00
Avi Kivity	e6834b9fb3	virtio: remove allocations from transmit path Instead of allocating a buffer vector, construct a "virtual vector" that transforms packet fragments as needed.	2014-10-15 17:17:01 +03:00
Avi Kivity	ba5447871b	virtio: switch to allocating virtio descriptors front-to-back Simplifies requirements on callers.	2014-10-15 17:17:01 +03:00
Avi Kivity	a331b5a129	virtio: move vring::buffer::completed to vring::buffer_chain We aren't interested in completion of a buffer, just a buffer_chain (aka request). Move it there to simplify things.	2014-10-15 17:17:01 +03:00
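
Commit fbef83beb0 above describes dividing cpus between the available hw queue pairs, with each cpu that owns a queue pair getting a set of cpus to distribute traffic to, and no topology awareness. The following is a minimal sketch of one way such a division could look, not the code from that commit. The function name `assign_cpus_to_queues`, the assumption that cpus `0..num_queues-1` own the queue pairs, and the round-robin policy are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not Seastar's actual implementation).
// Assumption: cpus [0, num_queues) each own one hw queue pair; every
// remaining cpu is attached round-robin to one of those owners.
// Returns, for each queue-owning cpu, the list of cpus it forwards to.
std::vector<std::vector<unsigned>> assign_cpus_to_queues(unsigned num_cpus,
                                                         unsigned num_queues) {
    // The commit covers the "cpus > queues" case; equality is also fine here.
    assert(num_queues > 0 && num_queues <= num_cpus);

    std::vector<std::vector<unsigned>> proxies(num_queues);
    for (unsigned cpu = num_queues; cpu < num_cpus; ++cpu) {
        // No topology considerations: a plain modulo spreads cpus evenly.
        proxies[cpu % num_queues].push_back(cpu);
    }
    return proxies;
}
```

With 5 cpus and 2 queue pairs, this sketch attaches cpus 2 and 4 to cpu 0 and cpu 3 to cpu 1. A topology-aware variant would presumably prefer cpus on the same socket or NUMA node as the queue owner, which the commit message explicitly leaves for later.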

83 Commits