1280 Commits

Author SHA1 Message Date
Vlad Zolotarov
82e20564b0 DPDK: Initialize mempools to work with external memory
If seastar is configured to use hugetlbfs, initialize mempools
with an external memory buffer. This gives us better control over the overall
memory consumption.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Use char* instead of void* for pointer arithmetic.
2015-02-11 19:27:12 +02:00
Vlad Zolotarov
d4cddbc3d0 DPDK: Use separate pools for Rx and Tx queues and adjust their sizes
There is no reason for Rx and Tx pools to be of the same size:

Rx pool is 3 times the ring size to give the upper layers some time
to free the Rx buffers before the ring stalls with no buffers.

Tx has entirely different constraints: since it applies back pressure
to the upper layers if HW doesn't keep up, there is no need to allow more buffers
in flight than the amount we may send in a single rte_eth_tx_burst() call.
Therefore we need 2 times HW ring size buffers since HW may release the whole
ring of buffers in a single rte_eth_tx_burst() call and thus we may be able to
place another whole ring of buffers in the same call.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v4:
   - Fixed the info message.
2015-02-11 19:27:12 +02:00
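The sizing rule from this commit message can be written as a small helper. This is a minimal sketch, not the code from the commit: `pool_sizes` and `compute_pool_sizes` are hypothetical names, and only the 3x (Rx) and 2x (Tx) multipliers are taken from the message.

```cpp
#include <cstddef>

// Hypothetical type: number of buffers to allocate for each pool.
struct pool_sizes {
    std::size_t rx_mbufs;
    std::size_t tx_mbufs;
};

// Hypothetical helper. The multipliers follow the commit message:
//   Rx: 3 x ring size, so upper layers have time to free Rx buffers.
//   Tx: 2 x ring size, since one burst may release a full ring and
//       accept another full ring in the same call.
constexpr pool_sizes compute_pool_sizes(std::size_t rx_ring_size,
                                        std::size_t tx_ring_size) {
    return pool_sizes{3 * rx_ring_size, 2 * tx_ring_size};
}
```

For example, with both rings at 512 descriptors this would give 1536 Rx buffers and 1024 Tx buffers.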
Vlad Zolotarov
18f35236db memory: Move page_size, page_bits and huge page size definitions to header
They are going to be used in more places (not just in memory.cc).

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-02-11 19:27:12 +02:00
Avi Kivity
3f848c5714 Merge branch 'file'
Add an adapter from our block-based files to our character stream interface,
input_stream, and a test program demonstrating their use.
2015-02-11 17:45:13 +02:00
Avi Kivity
64930bc610 tests: add linecount tests
Demonstrates and tests file_input_stream.
2015-02-11 15:38:51 +02:00
Avi Kivity
d7eb4e96fb app-template: add support for positional options
Example:

    app_template app;
    namespace bpo = boost::program_options;
    app.add_positional_options({
        { "file", bpo::value<std::string>(), "File to process", 1 },
    });
2015-02-11 15:38:51 +02:00
Avi Kivity
af0bf06836 core: add file_data_source, file_input_stream
Implement a character stream backed by a file.
2015-02-11 15:38:51 +02:00
Avi Kivity
d31de31aac core: add input_stream::reset()
Useful for seekable streams, to drop existing buffered data.
2015-02-11 15:38:49 +02:00
Raphael S. Carvalho
c725014614 memcached: add option to listen on a different port
Useful when testing multiple memcached servers.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-02-10 19:27:43 +02:00
Avi Kivity
2dadcdc5e7 core: make some data_source internals available to derived classes
Useful for adding functionality such as seekable streams.
2015-02-10 19:00:45 +02:00
Avi Kivity
381814aeaf stream.hh: add missing include 2015-02-10 18:59:38 +02:00
Avi Kivity
951a93a534 file.hh: add missing include 2015-02-10 18:59:16 +02:00
Tomasz Grabiec
10e58e0cda tests: Make test runner catch and forward exceptions thrown directly from task 2015-02-10 14:47:42 +02:00
Tomasz Grabiec
85c67001dd tests: Add test for exceptions thrown from do_until() 2015-02-10 14:47:42 +02:00
Tomasz Grabiec
331d5e1569 core: Fail do_until() future when the callback throws
Otherwise we will abandon the result promise, which results in an abort.
2015-02-10 14:47:42 +02:00
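The idea behind this fix can be illustrated without seastar. The sketch below is an assumption-laden stand-in, not the real do_until(): `loop_result` and `run_until` are hypothetical names, and the loop runs synchronously. It only shows the principle that an exception thrown by the callback is captured and reported through the result instead of being lost.

```cpp
#include <exception>
#include <functional>

// Hypothetical result type: holds the captured exception, if any.
struct loop_result {
    std::exception_ptr error;  // null on success
    bool failed() const { return static_cast<bool>(error); }
};

// Hypothetical synchronous stand-in for a do_until()-style loop.
// Calls body() until stop() returns true. If body() or stop() throws,
// the exception is stored in the result rather than escaping.
inline loop_result run_until(const std::function<bool()>& stop,
                             const std::function<void()>& body) {
    try {
        while (!stop()) {
            body();
        }
    } catch (...) {
        return loop_result{std::current_exception()};
    }
    return loop_result{};
}
```

A caller can inspect `failed()` and, if needed, rethrow with `std::rethrow_exception(result.error)`.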
Avi Kivity
ee58c77008 httpd: fix unbounded memory use in error handling
httpd uses recursion for its read loop:

  future<> read() {
     _read_buf.consume().then([] {
        ...
        if more work:
           return read();
     });
  }

However, after error handling was added, it looks like this:

  future<> read() {
     _read_buf.consume().then([] {
        ...
        if more work:
           return read();
     }).rescue(...);
  }

The problem is that rescue() is called for every iteration of the loop,
instead of for the loop in its entirety.  This means that a rescue
continuation is allocated for every processed request, but they will only
be called after the entire loop terminates.  This results in tons of
allocated memory.

Fix by moving error handling to the end of the loop (and incidentally using
do_until() instead of recursion).
2015-02-10 12:00:32 +02:00
Avi Kivity
29366cb076 net: add byteorder (ntoh/hton) variants for signed types 2015-02-09 17:07:21 +02:00
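The commit itself is not shown here, so the following is only a sketch of one way signed variants could be layered on the unsigned conversions. `hton_i32` and `ntoh_i32` are hypothetical names; `htonl()` and `ntohl()` are the standard POSIX functions from `<arpa/inet.h>`.

```cpp
#include <arpa/inet.h>  // htonl(), ntohl() (POSIX)
#include <cstdint>

// Hypothetical signed wrappers: reinterpret the value as unsigned,
// swap with the standard unsigned routine, and convert back.
inline std::int32_t hton_i32(std::int32_t v) {
    return static_cast<std::int32_t>(htonl(static_cast<std::uint32_t>(v)));
}

inline std::int32_t ntoh_i32(std::int32_t v) {
    return static_cast<std::int32_t>(ntohl(static_cast<std::uint32_t>(v)));
}
```

A round trip such as `ntoh_i32(hton_i32(-2))` should give back the original value on common two's-complement platforms.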
Asias He
f0c1bcdb33 tcp: Switch to debug print for persist timer
It is a leftover from development.
2015-02-09 10:58:16 +02:00
Asias He
a192391ac6 tcp: Init timer callback using constructor 2015-02-09 10:58:15 +02:00
Asias He
0ac0e06d32 packet: Linearize after merge
The packet will be merged with the old packet anyway. Linearize after
the merge.
2015-02-09 10:58:15 +02:00
Raphael S. Carvalho
bf41da8974 core: small optimization when constructing std::vector<cpu>
The size of std::vector<cpu> is known in advance, so reserve memory ahead
of time to avoid reallocations during the push_back() calls.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-02-08 19:05:45 +02:00
Avi Kivity
7a704f7a40 sstring: fix truncation in compare()
If the difference between the sizes of the two strings is larger than can
be represented by an int, truncation will occur and the sign of the result
is undefined.

Fix by using explicit tests and return values.
2015-02-08 11:41:22 +02:00
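The "explicit tests and return values" approach might look like the sketch below. It is not the sstring code: `compare_bytes` is a hypothetical free function over raw byte ranges. The point is that the length tie-break never converts a size difference to int, which on typical platforms could truncate (for instance, a difference of exactly 2^32 becoming 0 with a 32-bit int).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical three-way comparison of two byte ranges.
// Returns <0, 0 or >0, like std::string::compare().
inline int compare_bytes(const char* a, std::size_t a_len,
                         const char* b, std::size_t b_len) {
    const std::size_t n = std::min(a_len, b_len);
    // Compare the common prefix; skip memcmp() for empty ranges.
    const int r = n ? std::memcmp(a, b, n) : 0;
    if (r != 0) {
        return r;
    }
    // Explicit tests instead of int(a_len - b_len), which can truncate.
    if (a_len < b_len) {
        return -1;
    }
    if (a_len > b_len) {
        return 1;
    }
    return 0;
}
```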
Pekka Enberg
9a55e9fd22 sstring: Add 'compare' and 'operator<'
Add string comparison functions to basic_sstring that are required for
C++ containers such as std::map and std::multimap.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-02-08 11:12:31 +02:00
Pekka Enberg
fc7cb5ab5e shared_ptr: Fix assignment of polymorphic types
Fix the assignment operator to work with polymorphic types.

Suggested by Nadav.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-02-08 10:24:21 +02:00
Tomasz Grabiec
f948ee79bd test.py: Add --name filtering option 2015-02-08 10:09:29 +02:00
Tomasz Grabiec
ead03f1b08 test.py: Add --mode parameter for filtering tests 2015-02-08 10:09:29 +02:00
Avi Kivity
4b28eb638f Merge branch 'asias/tcp_v1' of github.com:cloudius-systems/seastar-dev
tcp queue from Asias: "Contains both fixes and improvements".
2015-02-07 20:20:57 +02:00
Raphael S. Carvalho
2195f77879 memcached: stats: rename evicted to evictions
Change for compliance with stock memcached.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-02-07 19:57:19 +02:00
Nadav Har'El
a9ef189a54 core: add support for enum types as hash-table keys
This patch adds a header file, "core/enum.hh". Code which includes
this header file will be able to use an enumerated type as the key in a
hash table.

The header file implements a hash function for *all* enumerated types,
by using the standard hash function of the underlying integer type.
2015-02-07 12:33:48 +02:00
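A hash for any enumerated type, built on the hash of its underlying integer type, can be sketched as follows. The contents of "core/enum.hh" are not shown here and may differ; `enum_hash` and the `color` enum are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <type_traits>
#include <unordered_map>

// Hypothetical functor: hashes any enum via its underlying integer type.
template <typename Enum>
struct enum_hash {
    static_assert(std::is_enum<Enum>::value,
                  "enum_hash requires an enumerated type");
    std::size_t operator()(Enum e) const {
        using underlying = typename std::underlying_type<Enum>::type;
        return std::hash<underlying>()(static_cast<underlying>(e));
    }
};

// Hypothetical enum used as a hash-table key.
enum class color { red, green, blue };

using color_counts = std::unordered_map<color, int, enum_hash<color>>;
```

With this in place, `color_counts` can be used like any other `std::unordered_map`, for example `counts[color::green] = 2;`.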
Asias He
abd5a24354 tcp: Implement persist timer
It is used to recover from a race where the sender is waiting for a
window update and the receiver is waiting for the sender to send more,
because somehow the window update carried in the ACK packet is not seen
by the sender.
2015-02-05 17:52:32 +08:00
Asias He
6a468dfd3d packet: Linearize more in packet_merger::merge
This fixes the tcp_server rxrx test on DPDK. The problem is that when we
receive out-of-order packets, we hold the packet in the ooo queue.
We linearize the incoming packet, which copies the packet and
thus frees the old packet. However, we missed one case where we need to
linearize. As a result, the original packet is held in the ooo
queue. In DPDK, we have a fixed number of buffers in the rx pool. When all
the DPDK buffers are in the ooo queue, we are not able to receive further
packets. So rx hangs, and even ping does not work.
2015-02-05 17:52:32 +08:00
Asias He
4f21d500cb tcp: Do nothing if already in CLOSED state when close
This fixes the following:

Server side:
$ tcp_server

Client side:
$ go run client.go -host 192.168.66.123 -conn 10 -test txtx
$ control-c

At this time, the connection in tcp_server will be in CLOSED state (reset by
the remote), then tcp_server will call tcp::tcb::close() and wait for
wait_for_all_data_acked(), but no one will signal it. Thus we have tons
of leaked connections in CLOSED state.
2015-02-05 17:52:32 +08:00
Asias He
dd741d11b8 tcp: Fix FIN is not sent in some cases
We call output_one to make sure a packet with FIN is actually generated
and then sent out. If we only call output() and _packetq is not empty,
in tcp::tcb::get_packet(), packet with FIN will not be generated, thus
we will not send out a FIN.

This can happen when retransmitted packets have been queued into _packetq,
then an ACK arrives which acknowledges all of the unacked data, and then the
application calls close() to close the connection.
2015-02-05 17:52:32 +08:00
Asias He
f600e3c902 tcp: Add queued_len
Take the amount of queued data into account when checking if all
the data is sent.
2015-02-05 17:52:32 +08:00
Asias He
fca74f9563 tcp: Implement RFC6582 NewReno
We currently have RFC5681, a.k.a Reno TCP, as the congestion control
algorithms: slow start, congestion avoidance, fast retransmit, and fast
recovery. RFC6582 describes a specific algorithm for responding to
partial acknowledgments, referred to as NewReno, to improve Reno.
2015-02-05 17:45:48 +08:00
Asias He
426938f4ed tcp: Add Limited Transmit per RFC3042 and RFC5681
When RFC3042 is in use, additional data sent in limited transmit MUST
NOT be included in this calculation to update _snd.ssthresh.
2015-02-05 17:05:00 +08:00
Asias He
2289b03354 httpd: Fix RST handling
I found wrk sometimes sends RST instead of a FIN to close a connection. In
this case, we will reset the connection and go to CLOSED state. However,
httpd will not delete the connection, so we will have leaked connections in
CLOSED state.

Fix by handling the exception and sending an empty response as we do in
the EOF case. Here we do not pass the exception to the upper layer again,
otherwise httpd will be very noisy.
2015-02-05 16:57:58 +08:00
Gleb Natapov
89763c95c9 core: optimise timer completions vs periodic timers
The way periodic timers are rearmed during timer completion causes
timer_settime() to be called twice for each periodic timer completion:
once during rearm and a second time by enable_fn(). Fix it by providing
another function that only re-adds the timer into the timers container, but
does not call timer_settime().
2015-01-29 12:43:28 +02:00
Avi Kivity
94e01e6d0e tests: exit after timertest ends 2015-01-29 12:24:03 +02:00
Avi Kivity
070eb7d496 tests: serialize timer tests
Otherwise the output gets interspersed.
2015-01-29 12:20:39 +02:00
Avi Kivity
59c0d7e893 smp: fix work item deletion
Delete it after completion, not after responding.
2015-01-29 12:14:05 +02:00
Gleb Natapov
bcae5f2538 smp: fix memory leak in smp queue
Delete completed items. Fixes regression from ff4aca2ee0.
2015-01-29 11:49:24 +02:00
Avi Kivity
42bc73a25d dpdk: initialize _tx_burst_idx
Should fix random segfault.
2015-01-29 11:18:54 +02:00
Asias He
0ab01d06ac tcp: Rework segment arrival handling
Follow RFC793 section "SEGMENT ARRIVES".

There are 4 major cases:

1) If the state is CLOSED
2) If the state is LISTEN
3) If the state is SYN-SENT
4) If the state is other state

Note:

- This change is significant (more than 10 pages in RFC793 describing
  this segment arrival handling).
- More testing is needed. The good news is that, so far, the
  tcp_server (ping/txtx/rxrx) tests and httpd work fine.
2015-01-29 10:59:31 +02:00
Tomasz Grabiec
661bb3d478 tests: Use test_runner to run boost tests 2015-01-29 10:30:14 +02:00
Tomasz Grabiec
a1fecad8cb tests: Introduce test_runner class
It uses app_template to launch seastar framework and can be used from
outside threads to inject tasks.
2015-01-29 10:30:14 +02:00
Tomasz Grabiec
8ad50d6614 core: Add exchanger class 2015-01-29 10:30:13 +02:00
Avi Kivity
b3dd1c8285 Merge branch 'signal' of ../seastar
Simplify signal handling.
2015-01-29 10:08:27 +02:00
Takuya ASADA
9de86ed651 tests: Support tcp_server tests(ping,txtx,rxrx) on tcp_client
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-01-28 16:26:46 +02:00
Tomasz Grabiec
7a55f21b29 core: Move _timer to an instance field
So that the callback which is set on it and which is allocated on CPU
0 is destroyed on CPU 0 when the clock dies. Otherwise we can attempt
to delete it after the CPU thread is gone if CPU 0 != main thread.
2015-01-28 16:18:55 +02:00