scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 03:45:11 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	ce6a1ca13b	Update seastar submodule * seastar afe39231...99d28ff0 (16): > file/util: Include seastar.hh > http/exception: Use http::reply explicitly > http/client: Include lost condition-variable.hh > util: file: drop unnecessary include of reactor.hh > tests: perf: add a markdown printer > http/client: Introduce unexpected_status_error for client requests > sharded: avoid #include <seastar/core/reactor.hh> for run_in_background() > code: Use std::is_invocable_r_v instead of InvokeReturns > http/client: Add ability to change pool size on the fly > http/client: Add getters for active/idle connections counts > http/client: Count and limit the number of connections > http/client: Add connection->client RAII backref > build: use the user-specified compiler when building DPDK > build: use proper toolchain based on specified compiler > build: only pass CMAKE_C_COMPILER when building ingredients > build: use specified compiler when building liburing Two changes are folded into the commit: 1. missing seastar/core/coroutine.hh include in one .cc file that got it indirectly included before seastar reactor.hh drop from file.hh 2. http client now returns unexpected_status_error instead of std::runtime_error, so s3 test is updated respectively Closes #14168	2023-06-07 20:25:49 +03:00
Pavel Emelyanov	b3df2d0db0	s3/test: Tune-up multipart upload test alignment Currently the test uses a sequence of 1024-byte buffers. This lets the minio server actively de-duplicate those blocks by page boundary (it's a guess, but it seems to be true because minio reports back equivalent ETags for many of the uploaded parts). Make the buffer size not a power of two so that, when squashed together, the resulting 2^X buffers are not equal. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-16 12:23:18 +03:00
Pavel Emelyanov	fffa04fa67	s3/test: Add jumbo upload test It re-uses most of the existing upload sink test, but configures the jumbo sink with at most 3 parts in each intermediate object, so that the test does not have to upload a 50Gb part to switch to the next one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-16 12:23:18 +03:00
Raphael S. Carvalho	57661f0392	s3: Introduce get_object_stats() get_object_stats() will be used for retrieving content size and also last modified. The latter is required for filling st_mtim, etc, in the s3::client::readable_file::stat() method. Refs #13649. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-05-07 19:51:10 -03:00
Pavel Emelyanov	e00d3188ed	s3/test: Add ability to run boost test over real s3 Support the AWS_S3_EXTRA environment variable that's :-split and the respective substrings are set as the endpoint AWS configuration. This makes it possible to run the boost S3 test over real S3. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:23:38 +03:00
Pavel Emelyanov	3bec5ea2ce	s3/client: Keep server port on config Currently the code temporarily assumes that the endpoint port is 9000. This is what tests' local minio is started with. This patch keeps the port number on endpoint config and makes test get the port number from minio starting code via environment. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	85f06ca556	s3/client: Construct it with config Similar to previous patch -- extend the s3::client constructor to get the endpoint config value next to the endpoint string. For now the configs are likely empty, but they are yet unused too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	caf9e357c8	s3/client: Construct it with sstring endpoint Currently the client is constructed with a socket_address which is prepared by the caller from the endpoint string. That's not flexible enough, because the s3 client needs to know the original endpoint string for two reasons. First, it needs to look up the endpoint config for potential AWS creds. Second, it needs this exact value as the Host: header in its http requests. So this patch just relaxes the client constructor to accept the endpoint string and hard-codes the 9000 port. The latter is temporary, this is how local tests' minio is started, but the next patch will make it configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	a77ca69360	s3/test: Rename MINIO_SERVER_ADDRESS environment variable The pylib minio code uses it to export the minio address for tests. This creates unneeded WTFs when running the test over AWS S3, so it's better to rename the variable not to mention MINIO at all. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	12c4e7d605	s3/test: Keep public bucket name in environment Local test.py runs minio with the public 'testbucket' bucket and all test cases know that. This series adds an ability to run tests over real S3 so the bucket name should be configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	91674da982	s3/test: Fix upload stream closure If multipart upload fails for some reason the output stream remains not closed and the respective assertion masks the original failure. Fix that by closing the stream in all cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	033fa107f8	utils: Add S3 readable file impl for random reads Sometimes an sstable is used for random read, sometimes -- for streamed read using the input stream. For both cases the file API can be provided, because S3 API allows random reads of arbitrary lengths. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	a4a64149a6	utils: Add S3 data sink for multipart upload Putting a large object into S3 using plain PUT is a bad choice -- one needs to collect the whole object in memory, then send it as a content-length request with a plain body. Multipart upload puts less stress on memory, but it has its limitation -- each part should be at least 5Mb in size. For that reason using the file API doesn't work -- the file IO API operates with external memory buffers and the file impl would only have raw pointers to them. In order to collect a 5Mb chunk in RAM the impl would have to copy the memory, which is not good. Unlike the file API, the data_sink API is more flexible, as it has temporary buffers at hand and can cache them in a zero-copy manner. Having said that, the S3 data_sink implementation is like this: * put(buffer): move the buffer into the local cache, once the local cache grows above 5Mb send out the part * flush: send out whatever is in the cache, then send the upload completion request * close: check that the upload finished (in flush), abort the upload otherwise The user of the API may (actually should) wrap the sink with output_stream and use it as any other output_stream. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	3745b5c715	utils: Add S3 client with basic ops Those include -- HEAD to get size, PUT to upload an object in one go, GET to read the object as a contiguous buffer and DELETE to drop one. The client uses the http client from seastar and just implements the S3 protocol using it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00

14 Commits
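
The put / flush / close behaviour described in commit a4a64149a6 can be modelled roughly as follows. This is only a sketch and not the code from the repository: the class name multipart_sink_sketch, its accessors, the use of std::string as a buffer and the in-memory "sent parts" list are all hypothetical, there is no network I/O, and the threshold is a constructor parameter so that a small value can stand in for the 5Mb minimum. It also assumes a part is sent as soon as the cache reaches the threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a multipart-upload sink: buffers are moved into a
// local cache and sent out as one part once enough bytes have accumulated.
class multipart_sink_sketch {
public:
    explicit multipart_sink_sketch(std::size_t min_part_size)
        : _min_part_size(min_part_size) {}

    // put(buffer): take ownership of the buffer (no copy of its contents);
    // once the cache reaches the minimum part size, send the part out.
    void put(std::string buf) {
        _cached_bytes += buf.size();
        _cache.push_back(std::move(buf));
        if (_cached_bytes >= _min_part_size) {
            send_part();
        }
    }

    // flush: send whatever is left in the cache, then mark the upload as
    // completed (a stand-in for the upload completion request).
    void flush() {
        if (_cached_bytes > 0) {
            send_part();
        }
        _completed = true;
    }

    // close: if flush never completed the upload, abort it.
    void close() {
        if (!_completed) {
            _aborted = true;
        }
    }

    std::size_t parts_sent() const { return _part_sizes.size(); }
    bool completed() const { return _completed; }
    bool aborted() const { return _aborted; }

private:
    // Stand-in for uploading one part: record its size and drop the cache.
    void send_part() {
        _part_sizes.push_back(_cached_bytes);
        _cache.clear();
        _cached_bytes = 0;
    }

    std::size_t _min_part_size;
    std::size_t _cached_bytes = 0;
    std::vector<std::string> _cache;
    std::vector<std::size_t> _part_sizes;
    bool _completed = false;
    bool _aborted = false;
};
```

In this model only the last part, sent from flush, can be smaller than the threshold. A sink that is closed without a preceding flush ends up aborted rather than completed.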