18 Commits

Author SHA1 Message Date
Avi Kivity
d170efbfcb fstream: optimize file_data_source_impl
file_data_source_impl always calls file::size(), which can be slow.  This
slows down applications that create many short-lived input streams on the
same file (for random-access processing of a subset of the data).

Fix by not calling size(), and letting the file code handle short reads
itself.
2015-07-09 18:40:05 +03:00
Avi Kivity
c05c7c09cf fstream: close file on stream close 2015-06-28 20:36:23 +03:00
Nadav Har'El
f687ec40a3 fstream: catch bug early
The file_data_sink_impl::put() code assumes it is always called on buffers
whose size is a multiple of the DMA alignment (4096), except the *last* one.
After writing one unaligned-size buffer, further writes cannot continue because
the offset into the file no longer has the right alignment. If a caller
does try to do that, there is a bug in the caller (it is a design bug, not a
run-time error), and it is better to discover it quickly with an assert, as I
do in this patch.

I had such a caller in an example application, and it took me a whole day
of debugging just to figure out that this is where the caller actually had
a bug.
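
The check described above can be illustrated with a minimal sketch. This is
not the actual file_data_sink_impl code: toy_file_sink, can_put() and the
local dma_alignment constant are hypothetical names used only for
illustration, and the "file" is reduced to a write offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the DMA alignment quoted in the message (4096).
constexpr std::uint64_t dma_alignment = 4096;

// Toy model of a file sink: it only tracks the current write offset.
struct toy_file_sink {
    std::uint64_t pos = 0;

    // True while the offset is still a multiple of dma_alignment, i.e.
    // another put() may legally follow.
    bool can_put() const { return pos % dma_alignment == 0; }

    // Accepts a buffer of `size` bytes. Only the last buffer may have an
    // unaligned size; any put() after that trips the assert.
    void put(std::size_t size) {
        assert(can_put() && "put() called after an unaligned-size buffer");
        pos += size;
    }
};
```

With this model, put(4096) followed by put(100) is fine, but a third put()
aborts immediately instead of failing later in a harder-to-debug place.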

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Reviewed-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-06-28 19:31:00 +03:00
Nadav Har'El
b306cee41e fstream: fix behavior of large writes to output stream
Our file-output code contains various layers making sometimes contradictory
assumptions, and it is a real art-form to make it all work together.
They usually do work together well, but there was one undetected bug for
large writes to a file output stream:

The problem is what happens when we try to write a large buffer (larger
than the output stream's buffer) in one output_stream::write() call.
By default, output_stream uses the efficient, zero-copy, implementation
which calls the underlying data sink's put function on the entire written
buffer, without copying it to the stream's buffer first.

Unfortunately, this solution does NOT work on *file* output streams.
Because of our use of AIO and O_DIRECT, we can only write from aligned
buffers, and at aligned (multiple of dma_alignment) sizes. Even a large
size cannot be fully written if not a multiple of dma_alignment, and
the need to align the buffers, and data already on the output_stream,
complicate things further.

Amazingly, we already had an option "_trim_to_size" in output_stream to
do the right thing, and we just need to enable it for file output stream.
In special cases (aligned position, aligned input buffer) it might be
possible to do something even more efficient - zero copy and just one
write request - but in the general case, _trim_to_size is exactly what
we needed.
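
As a rough illustration of the idea, assuming that _trim_to_size caps every
buffer handed to the sink at the stream's buffer size, a large write could be
split as below. split_write() is a hypothetical helper, not part of
output_stream, and it ignores data already buffered in the stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Splits a write of `total` bytes into pieces no larger than `buffer_size`.
// Every piece except possibly the last has exactly `buffer_size` bytes, so
// if buffer_size is a multiple of the DMA alignment, only the final piece
// can have an unaligned size.
std::vector<std::size_t> split_write(std::size_t total, std::size_t buffer_size) {
    assert(buffer_size > 0);
    std::vector<std::size_t> chunks;
    while (total > 0) {
        std::size_t n = std::min(total, buffer_size);
        chunks.push_back(n);
        total -= n;
    }
    return chunks;
}
```

For example, a 10000-byte write with a 4096-byte buffer becomes two full
4096-byte pieces and one 1808-byte tail.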

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-06-28 19:29:19 +03:00
Raphael S. Carvalho
534401c91f fstream: use dma_alignment constant instead of a hardcoded value
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-06-18 12:49:33 +03:00
Raphael S. Carvalho
02bdf380c4 fstream: abort instead of silently returning a ready future
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-06-18 12:49:33 +03:00
Avi Kivity
c4756c7622 fstream: fix dropped future in write path
Noticed by Raphael.
2015-06-10 11:48:20 +03:00
Avi Kivity
44e35ef545 fstream: preallocation support for file output stream
Preallocate disk blocks in advance of writing.

Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-09 08:44:59 +03:00
Raphael S. Carvalho
d864da71fc core: avoid fsyncing output stream twice
For some reason, I added an fsync call when the file underlying the
stream gets truncated. That happens when flushing a file whose
size isn't aligned to the requested DMA buffer size.
Instead, fsync should only be called when closing the stream, so this
patch changes the code to do that.
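
A sketch of why a truncation is needed at all, under the assumption that the
final write is padded up to the alignment and the file is then cut back to
its logical size. plan_flush() and flush_plan are hypothetical names, not
code from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Rounds v up to the next multiple of alignment.
constexpr std::uint64_t align_up(std::uint64_t v, std::uint64_t alignment) {
    return (v + alignment - 1) / alignment * alignment;
}

// Describes the final flush of a stream holding `logical_size` bytes.
struct flush_plan {
    std::uint64_t write_end;    // file offset reached by the padded write
    std::uint64_t truncate_to;  // size the file should end up with
    bool needs_truncate;        // false when the size was already aligned
};

// Assumption: the last write is padded to `alignment`, then the file is
// truncated to the logical size. No fsync is issued here; per the message
// above, that belongs to closing the stream.
flush_plan plan_flush(std::uint64_t logical_size, std::uint64_t alignment) {
    assert(alignment > 0);
    std::uint64_t padded = align_up(logical_size, alignment);
    return flush_plan{padded, logical_size, padded != logical_size};
}
```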

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-06-08 11:59:22 +03:00
Vlad Zolotarov
c547f9a718 fstream: prevent ANY exceptions to be thrown in good get() flow
There is a hidden exception that could be thrown inside file::dma_read_bulk()
if the file size is not aligned with fstream::_buffer_size. In this case
file::dma_read_bulk() will be given _buffer_size as the length for the last data
chunk in the file too. file::dma_read_bulk() will get a short read (till EOF)
and then will try to read beyond it (by calling file::read_maybe_eof())
in order to differentiate between an I/O error and EOF. file::read_maybe_eof()
will throw a file::eof_error exception to indicate the EOF; it will be caught
by file::dma_read_bulk(), and since we have read some "good" bytes by now this
exception won't be forwarded further. However, the damage of throwing the exception
has already been done and we want to avoid this in the fstream flow (unless there are
real errors).

In order to prevent the above we will always request file::dma_read_bulk() to
read the amount of data it should be able to deliver (not beyond EOF).
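
The clamping can be sketched as follows. bounded_read_length() is a
hypothetical helper written for illustration; the actual patch may compute
the length differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of bytes to request for a read starting at `pos`: at most
// `buffer_size`, and never past the end of a file of `file_size` bytes.
std::uint64_t bounded_read_length(std::uint64_t pos,
                                  std::uint64_t buffer_size,
                                  std::uint64_t file_size) {
    if (pos >= file_size) {
        return 0;  // already at or past EOF: nothing left to request
    }
    return std::min(buffer_size, file_size - pos);
}
```

For a 10000-byte file and an 8192-byte buffer, the first request is for 8192
bytes and the second for the remaining 1808, so the read path never has to
probe beyond EOF.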

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-05-28 13:25:40 +03:00
Vlad Zolotarov
7b1f433aed file: Rework read interface
Move the get() logic in fstream.cc into file::dma_read_bulk(),
fixing some issues:
   - Fix the funny "alignment" calculation.
   - Make sure the length is aligned too.
   - Added new functions:
      - dma_read(pos, len): returns a temporary_buffer with read data and
                            doesn't assume/require any alignment from either "pos"
                            or "len". Unlike dma_read_bulk() this function will
                            trim the resulting buffer to the requested size.
      - dma_read_exactly(pos, len): does exactly what dma_read(pos, len) does but it
                                    will also throw an exception if it fails to read
                                    the required number of bytes (e.g. EOF is reached).
   - Changed the names of parameters of dma_read(pos, buf, len) in order to emphasize
     that they have to be aligned.
   - Added a description to dma_read(pos, buf, len) to make it even more clear.
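
To illustrate how an unaligned (pos, len) request can be served by an aligned
read and then trimmed, here is a minimal sketch of the arithmetic.
aligned_window and make_window() are hypothetical names, not part of the
file API.

```cpp
#include <cassert>
#include <cstdint>

// An aligned region covering an arbitrary (pos, len) request.
struct aligned_window {
    std::uint64_t start;       // aligned file offset to read from
    std::uint64_t length;      // aligned number of bytes to read
    std::uint64_t front_skip;  // bytes to drop from the front of the result
};

// Rounds the start of the request down and its end up to `alignment`.
// The caller would read [start, start + length) and then trim the buffer
// to `len` bytes beginning at `front_skip`.
aligned_window make_window(std::uint64_t pos, std::uint64_t len,
                           std::uint64_t alignment) {
    assert(alignment > 0);
    std::uint64_t start = pos / alignment * alignment;
    std::uint64_t end = (pos + len + alignment - 1) / alignment * alignment;
    return aligned_window{start, end - start, pos - start};
}
```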

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-05-26 15:15:39 +03:00
Raphael S. Carvalho
f49888c649 fstream: sync file when file output stream is closed
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-03-31 10:26:03 +03:00
Raphael S. Carvalho
971a3bb0b9 fstream: add file output stream support 2015-03-10 18:44:22 -03:00
Nadav Har'El
8b4117cc66 fstream: use temporary_buffer<char>::aligned
In fstream.cc, use the convenient new temporary_buffer<char>::aligned()
function for creating an aligned temporary buffer - instead of repeating
its implementation.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-03-10 17:25:42 +02:00
Nadav Har'El
7a04d1f662 fstream: refactor file input stream interface
The file_input_stream interface was messy: it was not fiber safe (e.g., code
doing seek() in the middle of an ongoing read_exactly()), and went against
the PIMPL philosophy.

So this patch removes the file_input_stream class, and replaces it with a
completely different design:

We now have in fstream.hh a global function:

input_stream<char>
make_file_input_stream(
        lw_shared_ptr<file> file, uint64_t offset = 0,
        uint64_t buffer_size = 8192);

In other words, instead of "seeking" in an input stream, we just open a new
input stream object at a particular offset of the given file. Multiple input
streams might be concurrently active on the same file.

Note how make_file_input_stream now returns a regular "input_stream", not a
subtype, and it can be used just like any normal input_stream to read the stream
starting at the given position.

This patch makes "input_stream" a "final" type: we no longer subclass it in our
code, and we shouldn't in the future because it goes against the PIMPL design
(the subclass should be of the inner workings, like the data_source_impl, not
of input_stream).
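
The "open a new stream at an offset instead of seeking" design can be
illustrated with an in-memory toy model. This is not the Seastar API:
toy_input_stream and make_toy_input_stream() are hypothetical, synchronous
stand-ins, and a std::string plays the role of the file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Toy reader: shares the underlying "file" but owns its own position,
// so several readers can be active on the same file at once.
class toy_input_stream {
    std::shared_ptr<const std::string> _file;
    std::size_t _pos;
public:
    toy_input_stream(std::shared_ptr<const std::string> file, std::size_t offset)
        : _file(std::move(file)), _pos(std::min(offset, _file->size())) {}

    // Returns up to n bytes from the current position and advances it.
    std::string read(std::size_t n) {
        std::string out = _file->substr(_pos, n);
        _pos += out.size();
        return out;
    }
};

// Instead of seeking, callers create a fresh stream at the offset they need.
toy_input_stream make_toy_input_stream(std::shared_ptr<const std::string> file,
                                       std::size_t offset = 0) {
    return toy_input_stream(std::move(file), offset);
}
```

Two streams created at offsets 0 and 6 of the same file advance
independently, which is the property the old seek()-based interface lacked.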

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-03-10 15:39:17 +02:00
Avi Kivity
7f8d88371a Add LICENSE, NOTICE, and copyright headers to all source files.
The two files imported from the OSv project retain their original licenses.
2015-02-19 16:52:34 +02:00
Glauber Costa
861d2625b2 file_stream: proper seek support.
Our file_stream interface supports seek, but when we try to seek to arbitrary
locations that are not aligned to an AIO boundary (say, for instance, f->seek(4)),
we end up not being able to perform the read.

We need to guarantee that the reads are aligned, and then present the buffer
to the caller at the proper offset.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-02-18 22:56:07 +02:00
Avi Kivity
af0bf06836 core: add file_data_source, file_input_stream
Implement a character stream backed by a file.
2015-02-11 15:38:51 +02:00