* seastar afe39231...99d28ff0 (16):
> file/util: Include seastar.hh
> http/exception: Use http::reply explicitly
> http/client: Include lost condition-variable.hh
> util: file: drop unnecessary include of reactor.hh
> tests: perf: add a markdown printer
> http/client: Introduce unexpected_status_error for client requests
> sharded: avoid #include <seastar/core/reactor.hh> for run_in_background()
> code: Use std::is_invocable_r_v instead of InvokeReturns
> http/client: Add ability to change pool size on the fly
> http/client: Add getters for active/idle connections counts
> http/client: Count and limit the number of connections
> http/client: Add connection->client RAII backref
> build: use the user-specified compiler when building DPDK
> build: use proper toolchain based on specified compiler
> build: only pass CMAKE_C_COMPILER when building ingredients
> build: use specified compiler when building liburing
Two changes are folded into the commit:
1. a missing seastar/core/coroutine.hh include in one .cc file that
used to get it indirectly, before seastar dropped reactor.hh from
file.hh
2. the http client now throws unexpected_status_error instead of
std::runtime_error, so the s3 test is updated accordingly
Closes #14168
Currently the test uses a sequence of 1024-byte buffers. This lets the
minio server actively de-duplicate those blocks along page boundaries
(it's a guess, but it seems true, because minio reports back identical
ETags for many of the uploaded parts). Make the buffer size not a power
of two, so that when the buffers are squashed together the resulting
2^X-sized blocks don't come out equal.
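A minimal sketch of the idea (the exact size and fill pattern below are
illustrative, not the test's actual values):

    #include <cstddef>
    #include <string>

    // Illustrative only: 1000 bytes instead of 1024, so the concatenated
    // buffers don't line up on page boundaries and the parts built from
    // them don't turn into identical blocks.
    static std::string make_part_payload(unsigned part_index, size_t nr_buffers) {
        constexpr size_t buf_size = 1000;   // deliberately not a power of two
        std::string payload;
        payload.reserve(buf_size * nr_buffers);
        for (size_t i = 0; i < nr_buffers; ++i) {
            // vary the contents per buffer too, so ETags differ between parts
            payload.append(buf_size, char('a' + (part_index + i) % 26));
        }
        return payload;
    }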
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It re-uses most of the existing upload sink test, but configures the
jumbo sink with at most 3 parts in each intermediate object, so the
test doesn't have to upload a 50Gb part just to switch to the next
object.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
get_object_stats() will be used for retrieving the content size and
also the last-modified time.
The latter is required for filling st_mtim, etc, in the
s3::client::readable_file::stat() method.
Refs #13649.
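A rough sketch of how the two values would feed stat() -- the structure
and field names here are illustrative, not the actual s3 client types:

    #include <sys/stat.h>
    #include <cstdint>
    #include <ctime>

    // Illustrative shape only: both values come from a single HEAD request.
    struct object_stats {
        uint64_t length;
        std::time_t last_modified;
    };

    static void fill_stat(const object_stats& os, struct stat& st) {
        st.st_size = os.length;
        st.st_mtim.tv_sec = os.last_modified;   // what readable_file::stat() needs
        st.st_mtim.tv_nsec = 0;
        st.st_atim = st.st_ctim = st.st_mtim;
    }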
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Support the AWS_S3_EXTRA environment variable: its value is split on
':' and the resulting substrings are set as the endpoint's AWS
configuration. This makes it possible to run the boost S3 test over
real S3.
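A sketch of the splitting (what each substring maps to in the
endpoint's AWS configuration is an assumption here):

    #include <cstdlib>
    #include <string>
    #include <vector>

    // Split the AWS_S3_EXTRA value on ':'; the caller then assigns the
    // substrings to the endpoint AWS configuration fields.
    static std::vector<std::string> parse_aws_s3_extra() {
        std::vector<std::string> parts;
        if (const char* extra = std::getenv("AWS_S3_EXTRA")) {
            std::string s(extra);
            size_t pos = 0, next;
            while ((next = s.find(':', pos)) != std::string::npos) {
                parts.push_back(s.substr(pos, next - pos));
                pos = next + 1;
            }
            parts.push_back(s.substr(pos));
        }
        return parts;
    }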
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently the code temporarily assumes that the endpoint port is 9000,
which is what the tests' local minio is started with. This patch keeps
the port number in the endpoint config and makes the test get the port
number from the minio startup code via the environment.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Similar to the previous patch -- extend the s3::client constructor to
take the endpoint config value next to the endpoint string. For now
the configs are likely empty, but they are also still unused.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently the client is constructed with a socket_address that's
prepared by the caller from the endpoint string. That's not flexible
enough, because the s3 client needs to know the original endpoint
string for two reasons.
First, it needs to look up the endpoint config for potential AWS creds.
Second, it needs this exact value as the Host: header in its http
requests.
So this patch just relaxes the client constructor to accept the
endpoint string and hard-codes port 9000. The latter is temporary --
this is how the local tests' minio is started -- and the next patch
will make it configurable.
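Roughly the shape of the constructor change (the names and fields below
are illustrative, not the exact ones in the patch):

    #include <cstdint>
    #include <optional>
    #include <string>

    struct endpoint_config { /* AWS creds, region, ... */ };

    class client {
        std::string _host;       // the exact string later used for the Host: header
        uint16_t _port = 9000;   // temporary hard-code, made configurable by the next patch
        std::optional<endpoint_config> _cfg;
    public:
        explicit client(std::string endpoint)
            : _host(std::move(endpoint))
            // _cfg would be looked up by the original endpoint string; left empty here
            , _cfg(std::nullopt)
        {}
    };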
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The pylib minio code uses it to export the minio address to tests.
This creates unneeded confusion when running the test over AWS S3, so
it's better to rename the variable so that it doesn't mention MINIO at
all.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Local test.py runs minio with the public 'testbucket' bucket and all
test cases know that. This series adds the ability to run tests over
real S3, so the bucket name should be configurable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
If the multipart upload fails for some reason, the output stream is
left unclosed and the resulting assertion masks the original failure.
Fix that by closing the stream in all cases.
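The usual Seastar shape of such a fix looks roughly like this (a
sketch, not the actual test code):

    #include <seastar/core/coroutine.hh>
    #include <seastar/core/iostream.hh>
    #include <seastar/core/temporary_buffer.hh>
    #include <exception>

    // Remember the original exception, close the stream unconditionally,
    // then rethrow -- so the close/assertion path can't mask the real error.
    seastar::future<> upload(seastar::output_stream<char>& out,
                             seastar::temporary_buffer<char> buf) {
        std::exception_ptr ex;
        try {
            co_await out.write(buf.get(), buf.size());
            co_await out.flush();
        } catch (...) {
            ex = std::current_exception();
        }
        co_await out.close();   // happens on both the success and the failure path
        if (ex) {
            std::rethrow_exception(ex);
        }
    }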
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Sometimes an sstable is used for random reads, and sometimes for
streamed reads via the input stream. The file API can serve both
cases, because the S3 API allows random reads of arbitrary lengths.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Putting a large object into S3 with a plain PUT is a bad choice -- one
needs to collect the whole object in memory and then send it as a
content-length request with a plain body. Multipart upload puts less
stress on memory, but it has its own limitation -- each part must be
at least 5Mb in size. For that reason the file API doesn't fit -- the
file IO API operates on external memory buffers, and the file impl
would only hold raw pointers to them. In order to collect a 5Mb chunk
in RAM the impl would have to copy the memory, which is not good.
Unlike the file API, the data_sink API is more flexible, as it has the
temporary buffers at hand and can cache them in a zero-copy manner.
Having said that, the S3 data_sink implementation works like this:
* put(buffer):
  move the buffer into the local cache; once the local cache grows
  above 5Mb, send out the part
* flush:
  send out whatever is in the cache, then send the upload completion
  request
* close:
  check that the upload finished (in flush), abort the upload otherwise
The user of the API may (actually should) wrap the sink into an
output_stream and use it like any other output_stream.
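In the real code this is a seastar data_sink implementation wrapped
into an output_stream; the sketch below only shows the state machine
described above, with the HTTP requests stubbed out and the helper
names being illustrative:

    #include <seastar/core/coroutine.hh>
    #include <seastar/core/future.hh>
    #include <seastar/core/temporary_buffer.hh>
    #include <cstddef>
    #include <vector>

    class s3_upload_sink_sketch {
        static constexpr size_t minimal_part_size = 5 << 20;   // S3 multipart lower bound
        std::vector<seastar::temporary_buffer<char>> _cache;
        size_t _cached_bytes = 0;
        bool _completed = false;

        // The helpers below stand for the multipart-upload HTTP requests
        // and are stubbed out in this sketch.
        seastar::future<> upload_part() {
            _cache.clear();
            _cached_bytes = 0;
            return seastar::make_ready_future<>();
        }
        seastar::future<> complete_upload() { return seastar::make_ready_future<>(); }
        seastar::future<> abort_upload() { return seastar::make_ready_future<>(); }

    public:
        // put(): zero-copy -- move the buffer into the local cache and send
        // out a part only once at least 5Mb has accumulated
        seastar::future<> put(seastar::temporary_buffer<char> buf) {
            _cached_bytes += buf.size();
            _cache.push_back(std::move(buf));
            if (_cached_bytes >= minimal_part_size) {
                co_await upload_part();
            }
        }

        // flush(): send out whatever is left in the cache, then complete the upload
        seastar::future<> flush() {
            if (_cached_bytes > 0) {
                co_await upload_part();
            }
            co_await complete_upload();
            _completed = true;
        }

        // close(): the upload must have been completed by flush(), abort it otherwise
        seastar::future<> close() {
            if (!_completed) {
                co_await abort_upload();
            }
        }
    };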
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Those include: HEAD to get the object size, PUT to upload an object in
one go, GET to read the object as a contiguous buffer, and DELETE to
drop one.
The client uses the http client from seastar and just implements the
S3 protocol on top of it.
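A usage sketch of the four operations (the Client parameter stands for
the s3 client added here, and the method names are illustrative):

    #include <seastar/core/coroutine.hh>
    #include <seastar/core/future.hh>
    #include <seastar/core/sstring.hh>
    #include <seastar/core/temporary_buffer.hh>
    #include <cassert>
    #include <cstdint>

    template <typename Client>
    seastar::future<> roundtrip(Client& cln, seastar::sstring name,
                                seastar::temporary_buffer<char> body) {
        size_t len = body.size();
        co_await cln.put_object(name, std::move(body));         // PUT: upload in one go
        uint64_t sz = co_await cln.get_object_size(name);       // HEAD: object size
        auto data = co_await cln.get_object_contiguous(name);   // GET: contiguous buffer
        assert(sz == len && data.size() == len);
        co_await cln.delete_object(name);                       // DELETE: drop the object
    }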
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>