scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00

Author	SHA1	Message	Date
Ernest Zaslavsky	a369dda049	s3_client: implement S3 copy object Add support for the CopyObject API to enable direct copying of S3 objects between locations. This approach eliminates networking overhead on the client side, as the operation is handled internally by S3.	2025-04-17 09:47:47 +03:00
Pavel Emelyanov	1f301b1c5d	s3/client: Introduce data_source_impl for object downloading The new data source implementation runs a single GET for the whole range specified and lends the body input_stream for the upper input_stream's get()-s. Eventually, getting the data from the body stream EOFs or fails. In either case, the existing body is closed and a new GET is spawned with the updated Range header so as not to include the bytes read so far. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:06 +03:00
Ernest Zaslavsky	367140a9c5	s3_client: Start using new retry strategy * Previously, token expiration was considered a fatal error. With this change, the `s3_client` uses a new retry strategy that tries to renew expired creds * Added related test to the `s3_proxy`	2025-03-17 16:38:14 +02:00
Ernest Zaslavsky	7c49ee4520	s3_client: enhance `retryable_http_client` functionality Enhanced `retryable_http_client` by allowing the injection of a custom error handler through its constructor.	2025-03-10 09:01:47 +02:00
Ernest Zaslavsky	b589a882bb	s3_client: isolate `retryable_http_client` Relocated `retryable_http_client` into its own dedicated file for improved clarity and maintainability.	2025-03-10 09:01:47 +02:00
Ernest Zaslavsky	5eff83af95	s3_client: Prepare for `retryable_http_client` relocation Expose `map_s3_client_exception` outside the S3 client class to facilitate moving `retryable_http_client` to a separate file.	2025-03-10 09:01:47 +02:00
Ernest Zaslavsky	5b7d4a4136	s3_client: Move retryable functionality out of s3 client This commit moves the retryable HTTP client functionality out of the S3 client implementation. Since this functionality is also required for other services, such as AWS STS, it has been separated to ensure broader applicability.	2025-03-10 09:01:47 +02:00
Ernest Zaslavsky	d534051bea	aws creds: add env. and file credentials providers This commit entirely removes credentials from the endpoint configuration. It also eliminates all instances of manually retrieving environment credentials. Instead, the construction of file and environment credentials has been moved to their respective providers. Additionally, a new aws_credentials_provider_chain class has been introduced to support chaining of multiple credential providers.	2025-02-05 14:57:19 +02:00
Kefu Chai	7215d4bfe9	utils: do not include unused headers these unused includes were identified by clang-include-cleaner. after auditing these source files, all of the reports have been confirmed. please note, because quite a few source files relied on `utils/to_string.hh` to pull in the specialization of `fmt::formatter<std::optional<T>>`, after removing `#include <fmt/std.h>` from `utils/to_string.hh`, we have to include `fmt/std.h` directly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-01-14 07:56:39 -05:00
Pavel Emelyanov	bb094cc099	Merge 'Make restore task abortable' from Calle Wilund Fixes #20717 Enables abortable interface and propagates abort_source to all s3 objects used for reading the restore data. Note: because restore is done on each shard, we have to maintain a per-shard abort source proxy for each, and do a background per-shard abort on abort call. This is synced at the end of "run()". Abort source is added as an optional parameter to s3 storage and the s3 path in distributed loader. There is no attempt to "clean up" an aborted restore. As we read on a mutation level from remote sstables, we should not cause incomplete sstables as such, even though we might end up of course with partial data restored. Closes scylladb/scylladb#21567 * github.com:scylladb/scylladb: test_backup: Add restore abort test case sstables_loader: Make restore task abortable distributed_loader: Add optional abort_source to get_sstables_from_object_store s3_storage: Add optional abort_source to params/object s3::client: Make "readable_file" abortable	2024-12-19 12:23:33 +03:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Calle Wilund	af4dd1f2cb	s3::client: Make "readable_file" abortable Adds optional abortable source to "readable_file" interface. Note: the abortable aspect is not preserved across a "dup()" call; however, since these objects are generally not used in a cross-shard fashion, it should be ok.	2024-12-02 12:30:24 +00:00
Ernest Zaslavsky	dc6e4c0d97	client: Add retries Add retries to the s3 client, all retries are coordinated by an instance of `retry_strategy`. In case of an error, also parse the response body in an attempt to retrieve additional and more focused error information as suggested by AWS. See https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html. Also move the expected http status check to the `make_s3_error_handler` since the http::client::make_request call is done with `nullopt` - we want to manage all the aws errors handling in s3 client to prevent the http client from validating it and failing before we have a chance to analyze the error properly	2024-11-07 21:01:25 +02:00
Calle Wilund	3321820c67	s3::client: Make operations (individually) abortable Refs #20716 Adds optional abort_source to all s3 client operations. If provided, will propagate to actual HTTP client and allow for aborting actual net op. Note: this uses an abort source per call, not a client-local one. This is for two reasons: 1.) The usage pattern of the client object is to create it outside the eventual owning object (task) that hosts the relevant abort source 2.) It is quite possible to want to have different/no abort source for some operation usage.	2024-11-05 14:23:24 +00:00
Pavel Emelyanov	51e03b1025	s3/client: Introduce upload_progress This is a structure with "total" and "uploaded" counters that's passed by user to client::upload_file() method so that client would update it with the progress. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-10-29 08:38:39 +03:00
Pavel Emelyanov	f9a5e02b53	s3: Extract client_fwd.hh This is to export some simple structures to users without the need to include client.hh itself (rather large already) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-10-29 08:38:39 +03:00
Pavel Emelyanov	14b741afc9	s3/client: Split upload_sink_base class into two This class implements two facilities -- multipart upload protocol itself plus some common parts of upload_sink_impl (in fact -- only close() and plugs put(packet)). This patch splits those two facilities into two classes. One of them will be re-used later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-09-12 18:00:19 +03:00
Pavel Emelyanov	86bc5b11fe	s3-client: Add support for lister::filter Directory lister comes with a filter function that tells lister which entries to skip by its .get() method. For uniformity, add the same to S3 bucket_lister. After this change the lister reports shorter name in the returned directory entry (with the prefix cut), so also need to tune up the unit test respectively. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-27 16:15:40 +03:00
Pavel Emelyanov	113d2449f8	utils: Introduce abstract (directory) lister This patch hides directory_lister and bucket_lister behind a common facade. The intention is to provide a uniform API for sstable_directory that it could use to list sstables' components wherever they are. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-27 16:15:40 +03:00
Pavel Emelyanov	a02e65c649	s3_client: Add bucket lister The lister resembles the directory_lister from util -- it returns entries upon its .get() invocation, and should be .close()d at the end. Internally the lister issues ListObjectsV2 request with provided prefix and limits the server with the amount of entries returned not to consume too much local memory (we don't have streaming XML parser for response). If the result is indeed truncated, the subsequent calls include the continuation token as per [1] [1] https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-13 21:15:43 +03:00
Kefu Chai	061def001d	s3/client: add client::upload_file() this member function prepares for the backup feature, where the object to be stored in the object storage is already persisted as a file on local filesystem. this brings us two benefits: - with the file, we don't need to accumulate the payloads in memory and send them in batch, as we do in upload_sink and in upload_jumbo_sink. this puts less pressure on the memory subsystem. - with the file, we can read multiple parts in parallel if multipart upload applies to it, this helps to improve the throughput. so, this new helper is introduced to help upload an sstable from local filesystem to the object storage. Fixes #16287 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-07-23 14:39:30 +08:00
Patryk Wrobel	a89e3d10af	code-cleanup: add missing header guards The following command had been executed to get the list of headers that did not contain '#pragma once': 'grep -rnw . -e "#pragma once" --include *.hh -L' This change adds missing include guard to headers that did not contain any guard. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#19626	2024-07-09 18:31:35 +03:00
Pavel Emelyanov	fc5306c5e8	s3::client: Track memory in client uploads When uploading an object part, client spawns a background fiber that keeps the buffers with data on the http request's write_body() lambda capture. This generates unbound usage of memory with uploaded buffers which is not nice. Even though s3 client is limited with http's client max-connections parallelism, waiting for the available connection still happens with buffers held in memory. This patch makes the client claim the background memory from the provided semaphore (which, in turn, sits on the shard-wide storage manager instance). Once body writing is complete, the claimed units are returned back to the semaphore allowing for more background writes. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:50:29 +03:00
Pavel Emelyanov	b299757884	s3::client: Construct client with shared semaphore The semaphore will be used to cap memory consumption by client. This patch makes sure the reference to a semaphore exists as an argument to client's constructor, not more than that. In scylla binary, the semaphore sits on storage_manager. In tests the semaphore is some local object. For now the semaphore is unused and is initialized locked as this patch just pushes the needed argument all the way around, next patches will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:50:07 +03:00
Pavel Emelyanov	308db51306	s3/client: Add IO stats metrics These metrics mimic the existing IO ones -- total number of read operations, total number of read bytes and total read delay. And the same for writing. This patch makes no difference between writing object with plain PUT vs putting it with multipart uploading. Instead, it "measures" individual IO writes. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	91235a84cd	s3/client: Add HTTP client metrics Currently an http client has several exported "numbers" regarding the number of transport connections the client uses. This patch exports those via S3 client's per-sched-group metrics and prepares the ground for more metrics in next patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	08a12cd4a6	s3/client: Split make_request() There will appear another make_request() helper that'll do mostly the same. This split will help to avoid code duplication Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	4b548dd240	s3/client: Wrap http client with struct group_client The http-client is per-sched-group. Next patch will need to keep metrics per-sched-group too and this sched-group -> http-client map is the good place to put them on. Wrapping struct will allow extending it with metrics Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	627c1932e4	s3/client: Move client::stats to namespace scope The stats is stats about object, not about client, so it's better if it lives in namespace scope. Also it will avoid conflicts with client stats that will be reported as metrics (later patch) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Kefu Chai	ef78b31b43	s3/client: add tagging ops with tagging ops, we will be able to attach kv pairs to an object. this will allow us to mark sstable components with taggings, and filter them based on them. * test/pylib/minio_server.py: enable anonymous user to perform more actions. because the tagging related ops are not enabled by "mc anonymous set public", we have to enable them using "set-json" subcommand. * utils/s3/client: add methods to manipulate taggings. * test/boost/s3_test: add a simple test accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14486	2023-07-11 09:30:46 +03:00
Pavel Emelyanov	81d1bfce2a	s3/client: Maintain several http clients on-board The intent is to isolate workloads from different sched groups from each other and not let one sched group consume all sockets from the http client thus affecting requests made by other sched groups. The contention happens in the maximum number of sockets an http client may have (see scylladb/seastar#1652). If requests take time and client is asked to make more and more it will eventually stop spawning new connections and would get blocked internally waiting for running requests to complete and put a socket back to pool. If a sched group workload (e.g. -- memtable flush) consumes all the available sockets then workload from another group (e.g. -- query) would be blocked thus spoiling its latency (which is poor on its own, but still) After this change S3 client maintains a sched_group:http_client map thus making sure different sched groups don't clash with each other so that e.g. query requests don't wait for flush/compaction to release a socket. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-06-08 18:28:55 +03:00
Pavel Emelyanov	b9ee0d385b	s3/client: Add make_request() method This helper call will serve several purposes. First, make necessary preparations to the request before making, in particular -- calling authorize() Second, there's the need to re-make requests that failed with "connection closed" error (see #13736) Third, one S3 client is shared between different scheduling groups. In order to isolate groups' workload from each other different http clients should be used, and this helper will be in charge of selecting one Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-06-08 18:19:19 +03:00
Pavel Emelyanov	f9686926c2	s3/client: Implement jumbo upload sink The sink is also in charge of uploading large objects in parts, but this time each part is put with the help of upload-part-copy API call, not the regular upload-part one. To make it work the new sink inherits from the uploading base class, but instead of keeping memory_data_sink_buffers with parts it keeps a sink to upload a temporary intermediate object with parts. When the object is "full", i.e. the number of parts in it hits the limit, the object is flushed, then copied into the target object with the S3 API call, then the intermediate object is deleted. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-16 12:23:18 +03:00
Pavel Emelyanov	a88629227f	s3/client: Rename upload_sink -> upload_sink_base There will appear another sink that would implement multipart upload with the help of copy-part functionality. Current uploading code is going to be partially re-used, so this patch moves all of it into the base class in advance. Next patches will pick needed parts. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-16 12:19:50 +03:00
Pavel Emelyanov	613acba5d0	s3: Pick client from manager via handle Add the global-factory onto the client that is - cross-shard copyable - generates a client from local storage_manager by given endpoint With that the s3 file handle is fixed and also picks up shared s3 clients from the storage manager instead of creating its own one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-11 19:39:01 +03:00
Pavel Emelyanov	8ed9716f59	s3: Generalize s3 file handle Currently the s3 file handle tries to carry client's info via explicit host name and endpoint config pointer. This is buggy, the latter pointer is shard-local and cannot be transferred across shards. This patch prepares the fix by abstracting the client handle part. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-11 19:39:01 +03:00
Pavel Emelyanov	63ff6744d8	s3: Live-update clients' configs Now when the client is accessible directly via the storage_manager, when the latter is requested to update its endpoint config, it can kick the client to do the same. The latter, in turn, can only update the AWS creds info for now. The endpoint port and https usage are immutable for now. Also, updating the endpoint address is not possible, but for another reason -- the endpoint itself is the part of keyspace configuration and updating one in the object_storage.yaml will have no effect on it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-11 19:39:01 +03:00
Raphael S. Carvalho	57661f0392	s3: Introduce get_object_stats() get_object_stats() will be used for retrieving content size and also last modified. The latter is required for filling st_mtim, etc, in the s3::client::readable_file::stat() method. Refs #13649. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-05-07 19:51:10 -03:00
Raphael S. Carvalho	da2ccc44a4	s3: introduce get_object_header() This allows other functions to reuse the code to retrieve the object header. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-05-07 19:49:52 -03:00
Pavel Emelyanov	98b9c205bb	s3/client: Sign requests if configured If the endpoint config specifies AWS key, secret and region, all the S3 requests get signed. Signature should have all the x-amz-... headers included and should contain at least three of them. This patch includes x-amz-date, x-amz-content-sha256 and host headers into the signing list. The content can be unsigned when sent over HTTPS, this is what this patch does. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:23:37 +03:00
Pavel Emelyanov	85f06ca556	s3/client: Construct it with config Similar to previous patch -- extend the s3::client constructor to get the endpoint config value next to the endpoint string. For now the configs are likely empty, but they are yet unused too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	caf9e357c8	s3/client: Construct it with sstring endpoint Currently the client is constructed with socket_address which is prepared by the caller from the endpoint string. That's not flexible enough, because s3 client needs to know the original endpoint string for two reasons. First, it needs to look up endpoint config for potential AWS creds. Second, it needs this exact value as Host: header in its http requests. So this patch just relaxes the client constructor to accept the endpoint string and hard-code the 9000 port. The latter is temporary, this is how local tests' minio is started, but next patch will make it configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	033fa107f8	utils: Add S3 readable file impl for random reads Sometimes an sstable is used for random read, sometimes -- for streamed read using the input stream. For both cases the file API can be provided, because S3 API allows random reads of arbitrary lengths. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	a4a64149a6	utils: Add S3 data sink for multipart upload Putting a large object into S3 using plain PUT is a bad choice -- one needs to collect the whole object in memory, then send it as a content-length request with plain body. Using multipart upload causes less memory stress, but multipart upload has its limitation -- each part should be at least 5Mb in size. For that reason using file API doesn't work -- file IO API operates with external memory buffers and the file impl would only have raw pointers to it. In order to collect 5Mb of chunk in RAM the impl would have to copy the memory which is not good. Unlike the file API, the data_sink API is more flexible, as it has temporary buffers at hand and can cache them in zero-copy manner. Having said that, the S3 data_sink implementation is like this: * put(buffer): move the buffer into local cache, once the local cache grows above 5Mb send out the part * flush: send out whatever is in cache, then send upload completion request * close: check that the upload finished (in flush), abort the upload otherwise User of the API may (actually should) wrap the sink with output_stream and use it as any other output_stream. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	3745b5c715	utils: Add S3 client with basic ops Those include -- HEAD to get size, PUT to upload object in one go, GET to read the object as contiguous buffer and DELETE to drop one. The client uses http client from seastar and just implements the S3 protocol using it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
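
The put/flush/close contract described in commit a4a64149a6 can be illustrated with a small sketch. This is not the ScyllaDB code (which is C++ on top of Seastar); it is a Python model under stated assumptions: `FakeMultipartBackend` and `UploadSink` are hypothetical names, the backend is an in-memory stand-in for the S3 multipart calls, and joining the cached buffers copies memory, whereas the commit describes a zero-copy cache.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # threshold named in the commit message ("5Mb")


class FakeMultipartBackend:
    """Hypothetical in-memory stand-in for the S3 multipart upload calls."""

    def __init__(self):
        self.parts = []
        self.completed = False
        self.aborted = False

    def upload_part(self, data):
        self.parts.append(data)

    def complete(self):
        self.completed = True

    def abort(self):
        self.aborted = True


class UploadSink:
    """Caches buffers and sends a part once the cache reaches part_size."""

    def __init__(self, backend, part_size=MIN_PART_SIZE):
        self._backend = backend
        self._part_size = part_size
        self._bufs = []
        self._cached = 0

    def put(self, buf):
        # Take ownership of the buffer; send a part when enough is cached.
        self._bufs.append(buf)
        self._cached += len(buf)
        if self._cached >= self._part_size:
            self._send_part()

    def _send_part(self):
        # Assumption: joining copies here; the real sink avoids the copy.
        self._backend.upload_part(b"".join(self._bufs))
        self._bufs = []
        self._cached = 0

    def flush(self):
        # Send whatever is left, then complete the upload.
        if self._cached:
            self._send_part()
        self._backend.complete()

    def close(self):
        # If flush() never completed the upload, abort it.
        if not self._backend.completed:
            self._backend.abort()
```

With a small `part_size`, two 5-byte `put()` calls against a 10-byte threshold produce exactly one part, and a `close()` without a preceding `flush()` aborts the upload.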

45 Commits
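
Commit 1f301b1c5d describes a download that re-issues the GET with an updated Range header whenever the body stream ends early. A minimal Python sketch of that idea follows. `FlakyObject` and `download_range` are hypothetical names, the object is simulated in memory, and the `max_attempts` bound is an assumption of this sketch rather than something the commit message states.

```python
class FlakyObject:
    """Hypothetical stand-in for an S3 object whose GET body may end early."""

    def __init__(self, data, fail_after):
        self._data = data
        # Number of bytes each successive GET serves before its body breaks.
        self._fail_after = list(fail_after)
        self.requests = []  # Range header values seen, for inspection

    def get(self, first, last):
        self.requests.append(f"bytes={first}-{last}")
        body = self._data[first:last + 1]
        if self._fail_after:
            body = body[:self._fail_after.pop(0)]
        return body


def download_range(obj, first, last, max_attempts=5):
    """Read bytes first..last inclusive, resuming after short bodies."""
    out = bytearray()
    pos = first
    for _ in range(max_attempts):
        if pos > last:
            break
        # Ask only for the bytes not read so far.
        out += obj.get(pos, last)
        pos = first + len(out)
    if pos <= last:
        raise IOError("range not fully downloaded")
    return bytes(out)
```

Each retry starts at the first byte that has not been received yet, so no data is fetched twice.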