Instead of having an std::vector<> manage the fragment array,
allocate it at the end of the impl struct and manage it manually.
The result isn't pretty but it does remove an allocation.
Move all data fields into an 'impl' struct (pimpl idiom) so that move()ing
a packet becomes very cheap. The downside is that we need an extra
allocation, but we can later recover that by placing the fragment array
in the same structure.
Even with the extra allocation, performance is up ~10%.
The imperative form suggests that, in addition to returning a future, it
performs some action, and is therefore needed regardless of whether we
want to add a callback. In fact it does nothing of the sort: it merely
hands out a future.
Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
"-Wl,--no-as-needed" needs to be included before the libraries,
otherwise it has no effect.
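For illustration (library name invented), the flag only affects libraries that appear after it on the link line:

```shell
# No effect: --no-as-needed comes after the library it should cover
g++ main.o -lfoo -Wl,--no-as-needed

# Correct: the flag precedes the library, so it is linked in even if
# no symbol from it is referenced at link time
g++ main.o -Wl,--no-as-needed -lfoo
```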
Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
Start a timer processing loop, and control it in tcp::input()/output():
input() toggles the arming state of the timer, while output() clears it
unconditionally.
This is worth 25-30% on the http benchmark, since we no longer ACK the
HTTP GET request, instead coalescing it with the response.
Instead of keeping the future state in heap storage, referenced from
the promise/future/task, keep the state in the promise or future and only
allocate it on the heap as a last resort (for a stack).
Improves httpd throughput by ~20%.
When reducing the checksum from a 32-bit or 64-bit intermediate,
we can get an overflow after the first overflow handling step:
0000_8000_8000_ffff
-> 10_ffff
-> 1_000f
-> 0010
Since we lacked the second step, we got an off-by-one in the checksum.
With the current listen() -> future<connection> interface, if a new connection
is established before the connection handler is able to re-register for the
port, we have to drop the connection.
Fix by adding a queue for accepted connections, and switching to the more
standard listen() -> accept() -> future<connection> model.
If an L3 packet receiver is not able to register itself as a packet receiver
after processing a packet, or if it is simply not dispatched quickly enough,
then we will drop packets.
Add a queue at the protocol layer to buffer those packets.