scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 20:16:43 +00:00

Author	SHA1	Message	Date
Gleb Natapov	73f6d943e1	net: separate device initialization from queues initialization This patch adds a new class, distributed_device, which is responsible for initializing the HW device and is shared between all cpus. The old device class's responsibility becomes managing an rx/tx queue pair, and it is local per cpu. Each cpu has to call distributed_device::init_local_queue() to create its own device. The logic to distribute cpus between available queues (in case there are not enough queues for each cpu) is in the distributed_device currently and not really implemented yet, so only the one-queue and queues == cpus scenarios are supported currently, but this can be fixed later. The plan is to rename "distributed_device" to "device" and "device" to "queue_pair" in later patches.	2014-12-09 18:55:14 +02:00
Gleb Natapov	2fb3dc03f6	net: remove unused opts parameter from proxy_net_device constructor	2014-12-09 18:55:05 +02:00
Gleb Natapov	34a8744fd3	smp: wait for all cpus before signaling start promise If the start promise on the initial cpu is signaled before the other cpus have their networking stack constructed, collectd initialization crashes, since it tries to create a UDP socket on all available cpus when the initial one is ready.	2014-12-09 18:54:56 +02:00
Avi Kivity	7dfd7de8cd	future: optimize data-less future<> A future that does not carry any data (future<>) and its sibling (promise<>) are heavily used in the code. We can optimize them by overlaying the future's payload, which in this case can only be an std::exception_ptr, with the future state, as a pointer and an enum have disjoint values. This of course depends on std::exception_ptr being implemented as a pointer, but as it happens, it is. With this, sizeof(future<>) is reduced from 24 bytes to 16 bytes.	2014-12-09 10:08:48 +02:00
Asias He	20acb6db9c	xen: Fix mismatched signature Found with clang: [46/68] CXX build/release/core/xen/evtchn.o FAILED: clang -MMD -MT build/release/core/xen/evtchn.o -MF build/release/core/xen/evtchn.o.d -std=gnu++1y -Wall -Werror -fvisibility=hidden -pthread -I. -Wno-mismatched-tags -DHAVE_XEN -DHAVE_HWLOC -DHAVE_NUMA -O2 -I build/release/gen -c -o build/release/core/xen/evtchn.o core/xen/evtchn.cc core/xen/evtchn.cc:83:18: error: 'xen::userspace_evtchn::umask' hides overloaded virtual function [-Werror,-Woverloaded-virtual] virtual void umask(int port, unsigned count); ^ core/xen/evtchn.hh:38:18: note: hidden overloaded virtual function 'xen::evtchn::umask' declared here: type mismatch at 2nd parameter ('int' vs 'unsigned int') virtual void umask(int port, int count) {}; ^ 1 error generated.	2014-12-09 09:59:46 +02:00
Asias He	9a9297c89d	ip: Implement fragment timeout and memory usage limit	2014-12-09 09:59:44 +02:00
Asias He	89c8c6148f	net: Add packet::memory Add packet::memory() which estimates the memory load (by adding sizeof packet::impl). Note it will only be accurate after linearize/compact.	2014-12-09 09:59:44 +02:00
Asias He	c03e356873	net: Improve packet::linearize Free the original memory earlier if all of it was copied.	2014-12-09 09:59:43 +02:00
Nadav Har'El	3f2ea82e6d	dpdk: rx checksum offloading If the card supports this (and usually, it does), enable rx checksum offloading by the card, and avoid calculating the checksums ourselves. With rx checksum offloading, the card checks in incoming packets the IP header checksum and the L4 (TCP or UDP) checksum, and gives us a flag when one of them is wrong, meaning that we do not need to do these calculations ourselves. This patch improves memcached performance on my setup by almost 3%. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2014-12-08 20:41:31 +02:00
Shlomi Livne	2f5644db1b	Update README with additional instruction for running DPDK	2014-12-08 16:11:22 +02:00
Avi Kivity	30143fe18d	reactor: destroy network_stack after timer infrastructure The network stack contains a timer, so it must be constructed after the timer infrastructure and destroyed before it. Fixes a segfault on shutdown.	2014-12-07 17:37:13 +02:00
Avi Kivity	674076c7bd	smp: fix indentation	2014-12-07 17:37:13 +02:00
Avi Kivity	f4d7bd7e00	reactor: register pollers using a RAII class Avoids leaking a poller.	2014-12-07 17:36:44 +02:00
Avi Kivity	5b7ebc0f6f	build: disable string literal warnings when building with dpdk	2014-12-07 17:34:41 +02:00
Vlad Zolotarov	5bc89b974a	dpdk: First proper offload features initialization - Query the port for its caps. - Properly adjust the queue numbers according to the caps. - Enable RSS only if the final queues number is greater than 1. - Enable Rx VLAN stripping. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2014-12-07 17:32:36 +02:00
Vlad Zolotarov	5cc8785b96	packet: Added HW VLAN stripping option. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2014-12-07 17:32:36 +02:00
Vlad Zolotarov	2d10018870	dpdk: separate the EAL initialization from port initialization - Create a new class dpdk_eal that initializes DPDK EAL. - Get rid of portmask crap and provide a port index to a dpdk::net_device constructor. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2014-12-07 17:31:12 +02:00
Gleb Natapov	4ade76a182	reactor: add missing std::forward in at_exit()	2014-12-07 16:45:53 +02:00
Avi Kivity	a2016bc1dd	ip: fix smp fragment reassembly ipv4::handle_on_cpu() did not properly convert from network byte order, so it saw any packets with DF=1 as fragmented. Fix by applying the proper conversion.	2014-12-07 12:01:31 +02:00
Avi Kivity	2ee0239a4a	Merge branch 'tgrabiec/zero-copy-2' of github.com:cloudius-systems/seastar-dev Zero-copy memcached get from Tomasz: "I've measured memcached on muninn/huginn to be 7.5% better with this on vhost stack."	2014-12-04 16:31:04 +02:00
Tomasz Grabiec	8bfca6f740	memcached: convert 'get' to use zero-copy send.	2014-12-04 13:51:35 +01:00
Tomasz Grabiec	e831884c13	tests: add zero copy UDP test It listens for requests on port 10000 and sends responses comprised of three chunks of data in one packet. The chunk sizes are specified via the --chunk-size argument. The request can be anything, its content is ignored. You can switch to the equivalent copying version by passing the --copy argument.	2014-12-04 13:51:35 +01:00
Tomasz Grabiec	c4335c49f6	core: convert output APIs to work on packets This way zero-copy supporting code can put data directly to packet object and pass it through all layers efficiently.	2014-12-04 13:51:26 +01:00
Tomasz Grabiec	ba0ac1c2b8	core: simplify write_all() The only case when write_all() does not write all the data is when the fiber fails at some point, in which case the resulting future is failed too.	2014-12-04 13:37:36 +01:00
Tomasz Grabiec	cd3ba33ead	core: introduce scattered_message It's a builder class for creating messages comprised of multiple fragments.	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	a2ca556836	sstring: introduce release() Releases ownership of the data and gives it away as temporary_buffer. This way we can avoid allocation when putting an rvalue sstring if it's already using external storage. Except we need to allocate a deleter which uses delete[], but this can be fixed later.	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	72b0794759	packet: add constructor for appending temporary_buffers	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	3a2d74e3d3	packet: add reserve() method	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	f3dada6f1d	packet: add constructor for appending deleters Deleters do not always come with fragments. When multiple fragments share a deleter, the fragments are appended first and then one deleter for all of them.	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	8ffcdac455	packet: move lambdas rather than copy them Some lambdas are not copyable.	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	2650c68824	packet: add more constructor variants	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	e720f53c22	deleter: add chaining make_free_deleter() overload	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	8e38cb4159	deleter: introduce append() It's to help chaining up deleters when appending a fragment with an already created deleter.	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	7fd878b1ed	tests: memcached: fix test_mutliple_keys_in_get() Values may be reported in different order than the order of keys in the request.	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	411e6c1b02	tests: memcached: add test checking splitting of large responses	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	b6cb2b7477	memcached: add missing return	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	e4583699fd	memcached: make sure datagrams are below the size limit	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	71556de0e6	tests: add more test cases to output_stream_test	2014-12-04 13:37:35 +01:00
Tomasz Grabiec	11a501d884	tests: fix assertion failure at test program exit Seastar's allocator has an assertion which checks that the memory block is freed on the same CPU on which it was allocated. This is reasonable because all allocations in seastar need to be CPU-local. The problem is that boost libraries (program_options, unit_testing_framework) make heavy use of lazily-allocated static variables, for instance: const variable_value& variables_map::get(const std::string& name) const { static variable_value empty; // ... } Such a variable will be allocated in the current thread but freed in the main thread when exit handlers are executed. This results in: output_stream_test: core/memory.cc:415: void memory::cpu_pages::free(void*): Assertion `((reinterpret_cast<uintptr_t>(ptr) >> cpu_id_shift) & 0xff) == cpu_id' failed. This change works around the problem by forcing initialization in the main thread. See issue #10.	2014-12-04 13:37:02 +01:00
Tomasz Grabiec	a88ddcec25	sstring: overload to_sstring() for temporary_buffer<>	2014-12-04 13:35:33 +01:00
Tomasz Grabiec	bcea3a67ca	output_stream: support for output packet trimming For UDP memcached we cannot generate arbitrarily large chunks, we need to trim to datagram size. It's most efficient to split in the output_stream.	2014-12-03 20:02:21 +01:00
Tomasz Grabiec	4b7c42a5c7	output_stream: fix bug in write() When coalescing large buffer with buffered data _end was not updated so flush() would yield shorter packet.	2014-12-03 20:02:21 +01:00
Tomasz Grabiec	6ae5177c2c	output_stream: do not allocate on flush() In UDP memcached flush() is always the last operation on output_stream, so that allocation is wasted.	2014-12-03 20:02:21 +01:00
Tomasz Grabiec	584139decd	future-util: make do_for_each() propagate failure	2014-12-03 20:02:21 +01:00
Tomasz Grabiec	8d48c91a35	future-util: introduce now() This function belongs to a group of functions for associating futures with time points. Currently there's only now(), which serves as a shorthand for make_ready_future<>().	2014-12-03 19:57:43 +01:00
Avi Kivity	3e4842a2a1	Merge branch 'asias/ip' of github.com:cloudius-systems/seastar-dev IP fragment reassembly from Asias.	2014-12-03 16:03:18 +02:00
Asias He	59aa280f0d	ip: Add IPv4 reassembly support If a TCP or UDP IP datagram is fragmented, only the first fragment will contain the port information. When a fragment without port information is received, we have no idea which "stream" this fragment belongs to, thus we have no idea how to forward this packet. To solve this problem, we use the "forward twice" method. When an IP datagram which needs fragmentation is received, we forward it using the frag_id(src_ip, dst_ip, identification, protocol) hash. When all the fragments are received, we forward it using the connection_id(src_ip, src_port, dst_ip, dst_port) hash.	2014-12-03 21:40:49 +08:00
Gleb Natapov	4fd3313e3e	reactor: add "--poll" command line switch If the switch is used reactor never goes idle.	2014-12-03 14:37:49 +02:00
Gleb Natapov	d151763967	reactor: move memory barrier to idle() accessors	2014-12-03 14:37:41 +02:00
Gleb Natapov	4d3b6497ea	reactor: rework poll infrastructure Move idle state management out from smp poller back to generic code. Each poller returns whether it did any useful work, and generic code decides if it should go idle based on that. If a poller requires constant polling it should always return true.	2014-12-03 14:37:33 +02:00
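
The byte-order problem described in commit a2016bc1dd above ("ip: fix smp fragment reassembly") can be illustrated with a minimal sketch. This is not the Seastar code: the names `wire_u16`, `from_network`, `raw_little_endian_load`, `is_fragment`, and `df_example` are hypothetical, and the sketch assumes the unconverted read happened on a little-endian host. Only the flag and offset layout is taken from the IPv4 header definition.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Masks for the 16-bit "flags + fragment offset" field of the IPv4 header,
// expressed in host byte order. All names in this sketch are hypothetical.
constexpr std::uint16_t df_flag     = 0x4000; // Don't Fragment
constexpr std::uint16_t mf_flag     = 0x2000; // More Fragments
constexpr std::uint16_t offset_mask = 0x1fff; // fragment offset, 8-byte units

// The field as it appears on the wire: two bytes, most significant first.
using wire_u16 = std::array<std::uint8_t, 2>;

// Correct decoding: network byte order is big-endian.
inline std::uint16_t from_network(const wire_u16& w) {
    return static_cast<std::uint16_t>((w[0] << 8) | w[1]);
}

// What a little-endian host sees if it loads the field without conversion
// (assumed here to be the failure mode; simulated so the result does not
// depend on the machine running the sketch).
inline std::uint16_t raw_little_endian_load(const wire_u16& w) {
    return static_cast<std::uint16_t>((w[1] << 8) | w[0]);
}

// A packet is a fragment if MF is set or the fragment offset is non-zero.
inline bool is_fragment(std::uint16_t frag_field) {
    return (frag_field & (mf_flag | offset_mask)) != 0;
}

// Usage sketch: a datagram with only DF set and a zero fragment offset.
inline void df_example() {
    const wire_u16 df_only{0x40, 0x00};
    // Converted value is 0x4000: DF only, so not a fragment.
    assert(!is_fragment(from_network(df_only)));
    // Unconverted value is 0x0040: the DF bit lands in the offset bits.
    assert(is_fragment(raw_little_endian_load(df_only)));
}
```

With the conversion applied, DF stays in bit 14 and the fragment test ignores it. Without it, the DF bit is misread as a non-zero fragment offset, which matches the symptom in the commit message.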

53948 Commits