scylladb

Author	SHA1	Message	Date
Avi Kivity	d170efbfcb	fstream: optimize file_data_source_impl file_data_source_impl always calls file::size(), which can be slow. This slows down applications that create many short-lived input streams on the same file (for random-access processing of a subset of the data). Fix by not calling size(), and letting the file code handle short reads itself.	2015-07-09 18:40:05 +03:00
Avi Kivity	c05c7c09cf	fstream: close file on stream close	2015-06-28 20:36:23 +03:00
Nadav Har'El	f687ec40a3	fstream: catch bug early the file_data_sink_impl::put() code assumes it is always called on buffers with size multiple of dma alignment (4096), except the last one. After writing one unaligned-size buffer, further writes cannot continue because the offset into the file no longer has the right alignment! If a caller does try to do that, there is a bug in the caller (it's not a run-time error, it's a design bug), and better discover it quickly with an assert, as I do in this patch. I had such a caller in an example application, and it took me a whole day of debugging just to figure out that this is where the caller actually had a bug. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com> Reviewed-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-28 19:31:00 +03:00
Nadav Har'El	b306cee41e	fstream: fix behavior of large writes to output stream Our file-output code contains various layers making sometimes contradictory assumptions, and it is a real art-form to make it all work together. They usually do work together well, but there was one undetected bug for large writes to a file output stream: The problem is what happens when we try to write a large buffer (larger than the output stream's buffer) in one output_stream::write() call. By default, output_stream uses the efficient, zero-copy, implementation which calls the underlying data sink's put function on the entire written buffer, without copying it to the stream's buffer first. Unfortunately, this solution does NOT work on file output streams. Because of our use of AIO and O_DIRECT, we can only write from aligned buffers, and at aligned (multiple of dma_alignment) sizes. Even a large size cannot be fully written if not a multiple of dma_alignment, and the need to align the buffers, and data already on the output_stream, complicate things further. Amazingly, we already had an option "_trim_to_size" in output_stream to do the right thing, and we just need to enable it for file output stream. In special cases (aligned position, aligned input buffer) it might be possible to do something even more efficient - zero copy and just one write request - but in the general case, _trim_to_size is exactly what we needed. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-06-28 19:29:19 +03:00
Raphael S. Carvalho	534401c91f	fstream: use dma_alignment constant instead of a hardcoded value Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-18 12:49:33 +03:00
Raphael S. Carvalho	02bdf380c4	fstream: abort instead of silently returning a ready future Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-18 12:49:33 +03:00
Avi Kivity	c4756c7622	fstream: fix dropped future in write path Noticed by Raphael.	2015-06-10 11:48:20 +03:00
Avi Kivity	44e35ef545	fstream: preallocation support for file output stream Preallocate disk blocks in advance of writing. Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-06-09 08:44:59 +03:00
Raphael S. Carvalho	d864da71fc	core: avoid fsyncing output stream twice For some reason, I added a fsync call when the file underlying the stream gets truncated. That happens when flushing a file whose size isn't aligned to the requested DMA buffer. Instead, fsync should only be called when closing the stream, so this patch changes the code to do that. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-06-08 11:59:22 +03:00
Vlad Zolotarov	c547f9a718	fstream: prevent ANY exceptions to be thrown in good get() flow There is a hidden exception that could be thrown inside file::dma_read_bulk() if file size is not aligned with fstream::_buffer_size. In this case file::dma_read_bulk() will be given a _buffer_size as a length for the last data chunk in the file too. file::dma_read_bulk() will get a short read (till EOF) and then will try to read beyond it (by calling file::read_maybe_eof()) in order to differentiate between I/O error and EOF. file::read_maybe_eof() will throw a file::eof_error exception to indicate the EOF, it will be caught by file::dma_read_bulk() and since we have read some "good" bytes by now this exception won't be forwarded further. However the damage by throwing the exception has already been done and we want to avoid this in fstream flow (unless there are real errors). In order to prevent the above we will always request file::dma_read_bulk() to read the amount of data it should be able to deliver (not beyond EOF). Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-05-28 13:25:40 +03:00
Vlad Zolotarov	7b1f433aed	file: Rework read interface Move the get() logic in fstream.cc into the file::dma_read_bulk() fixing some issues: - Fix the funny "alignment" calculation. - Make sure the length is aligned too. - Added new functions: - dma_read(pos, len): returns a temporary_buffer with read data and doesn't assume/require any alignment from either "pos" or "len". Unlike dma_read_bulk() this function will trim the resulting buffer to the requested size. - dma_read_exactly(pos, len): does exactly what dma_read(pos, len) does but it will also throw an exception if it failed to read the required number of bytes (e.g. EOF is reached). - Changed the names of parameters of dma_read(pos, buf, len) in order to emphasize that they have to be aligned. - Added a description to dma_read(pos, buf, len) to make it even more clear. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-05-26 15:15:39 +03:00
Raphael S. Carvalho	f49888c649	fstream: sync file when file output stream is closed Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-03-31 10:26:03 +03:00
Raphael S. Carvalho	971a3bb0b9	fstream: add file output stream support	2015-03-10 18:44:22 -03:00
Nadav Har'El	8b4117cc66	fstream: use temporary_buffer<char>::aligned In fstream.cc, use the convenient new temporary_buffer<char>::aligned() function for creating an aligned temporary buffer - instead of repeating its implementation. Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-03-10 17:25:42 +02:00
Nadav Har'El	7a04d1f662	fstream: refactor file input stream interface The file_input_stream interface was messy: it was not fiber safe (e.g., code doing seek() in the middle of an ongoing read_exactly()), and went against the PIMPL philosophy. So this patch removes the file_input_stream class, and replaces it with a completely different design: We now have in fstream.hh a global function: input_stream<char> make_file_input_stream( lw_shared_ptr<file> file, uint64_t offset = 0, uint64_t buffer_size = 8192); In other words, instead of "seeking" in an input stream, we just open a new input stream object at a particular offset of the given file. Multiple input streams might be concurrently active on the same file. Note how make_file_input_stream now returns a regular "input_stream", not a subtype, and it can be used just like any normal input_stream to read the stream starting at the given position. This patch makes "input_stream" a "final" type: we no longer subclass it in our code, and we shouldn't in the future because it goes against the PIMPL design (the subclass should be of the inner workings, like the data_source_impl, not of input_stream). Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-03-10 15:39:17 +02:00
Avi Kivity	7f8d88371a	Add LICENSE, NOTICE, and copyright headers to all source files. The two files imported from the OSv project retain their original licenses.	2015-02-19 16:52:34 +02:00
Glauber Costa	861d2625b2	file_stream: proper seek support. Our file_stream interface supports seek, but when we try to seek to arbitrary locations that are smaller than an aio-boundary (say, for instance, f->seek(4)), we will end up not being able to perform the read. We need to guarantee the reads are aligned, and will then present to the caller the buffer properly offset. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-02-18 22:56:07 +02:00
Avi Kivity	af0bf06836	core: add file_data_source, file_input_stream Implement a character stream backed by a file.	2015-02-11 15:38:51 +02:00

18 Commits
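
Several of the messages above (861d2625b2, 7b1f433aed, c547f9a718) describe the same arithmetic: an unaligned (pos, len) request has to be widened to DMA-aligned boundaries, capped at EOF, and then trimmed back before the data reaches the caller. The following is a minimal standalone sketch of that calculation only. It is not the code from these commits: `plan_read`, `read_window`, `align_down` and `align_up` are hypothetical names chosen for illustration, the file size is passed in as a plain parameter, and the 4096 alignment is the value quoted in the f687ec40a3 message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed alignment; 4096 is the figure quoted in commit f687ec40a3.
constexpr uint64_t dma_alignment = 4096;

// Round v down / up to a multiple of a (a must be a power of two).
constexpr uint64_t align_down(uint64_t v, uint64_t a) { return v & ~(a - 1); }
constexpr uint64_t align_up(uint64_t v, uint64_t a) { return (v + a - 1) & ~(a - 1); }

// Hypothetical description of one aligned read that covers an unaligned request.
struct read_window {
    uint64_t aligned_pos;  // file offset at which the aligned read starts
    uint64_t aligned_len;  // bytes to request; a multiple of dma_alignment
    uint64_t skip;         // bytes to drop from the front of the aligned buffer
    uint64_t usable;       // bytes handed back to the caller after trimming
};

// Plan a read of `len` bytes at `pos` from a file of `file_size` bytes.
// The request is capped at EOF first, so the caller never expects bytes
// that the file cannot deliver. Overflow of pos + len is not handled here.
inline read_window plan_read(uint64_t pos, uint64_t len, uint64_t file_size) {
    static_assert((dma_alignment & (dma_alignment - 1)) == 0,
                  "alignment must be a power of two");
    const uint64_t end = std::min(pos + len, file_size);
    const uint64_t start = align_down(pos, dma_alignment);
    if (pos >= end) {
        return {start, 0, 0, 0};  // nothing to read at or past EOF
    }
    const uint64_t aligned_end = align_up(end, dma_alignment);
    assert(aligned_end > start);
    return {start, aligned_end - start, pos - start, end - pos};
}
```

With these definitions, a request such as `plan_read(4, 10, 100000)` would read one 4096-byte block at offset 0, skip 4 bytes, and return 10 bytes, which is the `f->seek(4)` situation mentioned in 861d2625b2. Note that the aligned length may still extend past EOF when the file size itself is unaligned; how the real read path handles the resulting short read is outside the scope of this sketch.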