scylladb

Author	SHA1	Message	Date
Ernest Zaslavsky	a0016bd0cc	s3_client: relocate `req` creation closer to usage Move the creation of the `req` object to the point where it is actually used, improving code clarity and reducing premature initialization.	2025-08-14 16:18:43 +03:00
Ernest Zaslavsky	6ef2b0b510	s3_client: reformat long logging lines for readability Break up excessively long logging statements to improve readability and maintain consistent formatting across the codebase.	2025-08-14 16:18:43 +03:00
Ernest Zaslavsky	dd51e50f60	s3_client: add memory fallback in `chunked_download_source` Introduce fallback logic in `chunked_download_source` to handle memory exhaustion. When memory is low, feed the `deque` with only one uncounted buffer at a time. This allows slow but steady progress without getting stuck on the memory semaphore. Fixes: https://github.com/scylladb/scylladb/issues/25453 Fixes: https://github.com/scylladb/scylladb/issues/25262 Closes scylladb/scylladb#25452	2025-08-14 09:52:10 +03:00
Ernest Zaslavsky	380c73ca03	s3_client: make memory semaphore acquisition abortable Add `abort_source` to the `get_units` call for the memory semaphore in the S3 client, allowing the acquisition process to be aborted. Fixes: https://github.com/scylladb/scylladb/issues/25454 Closes scylladb/scylladb#25469	2025-08-13 08:48:55 +03:00
Ernest Zaslavsky	fc2c9dd290	s3_client: Disable Seastar-level retries in HTTP client creation Prevent Seastar from retrying HTTP requests to avoid buffer double-feed issues when an entire request is retried. This could cause data corruption in `chunked_download_source`. The change is global for every instance of `s3_client`, but it is still safe because: * Seastar's `http_client` resets connections regardless of retry behavior * `s3_client` retry logic handles all error types—exceptions, HTTP errors, and AWS-specific errors—via `http_retryable_client`	2025-07-21 17:03:23 +03:00
Ernest Zaslavsky	ba910b29ce	s3_test: Validate handling of non-`aws_error` exceptions Inject exceptions not wrapped in `aws_error` from request callback lambda to verify they are properly caught and handled.	2025-07-21 16:52:43 +03:00
Ernest Zaslavsky	b7ae6507cd	s3_client: Improve error handling in chunked_download_source Create aws_error from raised exceptions when possible and respond appropriately. Previously, non-aws_exception types leaked from the request handler and were treated as non-retryable, causing potential data corruption during download.	2025-07-21 16:49:47 +03:00
Ernest Zaslavsky	342e94261f	s3_client: parse multipart response XML defensively Ensure robust handling of XML responses when initiating multipart uploads. Check for the existence of required nodes before access, and throw an exception if the XML is empty or malformed. Refs: https://github.com/scylladb/scylladb/issues/24676 Closes scylladb/scylladb#24990	2025-07-17 10:55:04 +03:00
Ernest Zaslavsky	acf15eba8e	s3_test: Add s3_client test for non-retryable error handling Introduce a test that injects a non-retryable error and verifies that the chunked download source throws an exception as expected.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	49e8c14a86	s3_client: Fix edge case when the range is exhausted Handle case where the download loop exits after consuming all data, but before receiving an empty buffer signaling EOF. Without this, the next request is sent with a non-zero offset and zero length, resulting in "Range request cannot be satisfied" errors. Now, an empty buffer is pushed to indicate completion and exit the fiber properly.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	e50f247bf1	s3_client: Fix indentation in try..catch block Correct indentation in the `try..catch` block to improve code readability and maintain consistent formatting.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	d2d69cbc8c	s3_client: Stop retries in chunked download source Disable retries for S3 requests in the chunked download source to prevent duplicate chunks from corrupting the buffer queue. The response handler now throws an exception to bypass the retry strategy, allowing the next range to be attempted cleanly. This exception is only triggered for retryable errors; unretryable ones immediately halt further requests.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	6d9cec558a	s3_client: Fix missing negation Restore a missing `not` in a conditional check that caused incorrect behavior during S3 client execution.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	e73b83e039	s3_client: Refine logging Fix typo in log message to improve clarity and accuracy during S3 operations.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	f1d0690194	s3_client: Improve logging placement for current_range output Relocated logging to occur after determining the `current_range`, ensuring more relevant output during S3 client operations.	2025-07-01 18:45:17 +03:00
Pavel Emelyanov	dc166be663	s3: Mark claimed_buffer constructor noexcept It just std::move-s a buffer and a semaphore_units objects, both moves are noexcept, so is the constructor itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#24552	2025-06-18 20:36:45 +03:00
Pavel Emelyanov	b0766d1e73	Merge 's3_client: Refactor `range` class for state validation' from Ernest Zaslavsky Revamped the `range` class to actively manage its state by enforcing validation on all modifications. This prevents overflow, invalid states, and ensures the object size does not exceed the 5TiB limit in S3. This should address and prevent future problems related to this issue https://github.com/minio/minio/issues/21333 No backport needed since this problem related only to this change https://github.com/scylladb/scylladb/pull/23880 Closes scylladb/scylladb#24312 * github.com:scylladb/scylladb: s3_client: headers cleanup s3_client: Refactor `range` class for state validation	2025-06-17 10:34:55 +03:00
Ernest Zaslavsky	e398576795	s3_client: Fix hang in get() on EOF by signaling condition variable * Ensure _get_cv.signal() is called when an empty buffer is received * Prevents `get()` from stalling indefinitely while waiting on EOF * Found when testing https://github.com/scylladb/scylladb/pull/23695 Closes scylladb/scylladb#24490	2025-06-17 10:33:19 +03:00
Ernest Zaslavsky	1b20e0be4a	s3_client: headers cleanup	2025-06-16 16:02:30 +03:00
Ernest Zaslavsky	9ad7a456fe	s3_client: Refactor `range` class for state validation Revamped the `range` class to actively manage its state by enforcing validation on all modifications. This prevents overflow, invalid states, and ensures the object size does not exceed the 5TiB limit in S3.	2025-06-16 16:02:24 +03:00
Ernest Zaslavsky	2b300c8eb9	s3_client: Improve reporting of S3 client statistics Revise how we report statistics for `chunked_download_source`. Ensure metrics for downloaded but unconsumed data are visible, as they do not contribute to read amplification, which is tracked separately. Closes scylladb/scylladb#24491	2025-06-16 09:33:57 +03:00
Ernest Zaslavsky	30199552ac	s3_client: Mitigate connection exhaustion in `download_source` The existing `download_source` implementation optimizes performance by keeping the connection to S3 open and draining data directly from the socket. While this eliminates the overhead (60-100ms) of repeatedly establishing new connections, it leads to rapid exhaustion of client-side connections. On a single shard, two `mx_readers` for load and stream are enough to trigger this issue. Since each client typically holds two connections, readers keeping index and data sources open can cause deadlocks where processes stall due to unavailable connections. Introduce `chunked_download_source`, a new S3 download method built on `download_source`, to dynamically manage connections: - Buffers data in 5MiB chunks using a producer-consumer model - Closes connections once buffers reach capacity, returning them to the pool for other clients - Uses a filling fiber that resumes fetching once buffers are consumed from the queue Performance remains comparable to `download_source`, achieving 95MiB/s for sequential 1GiB downloads from S3. However, preloading large chunks may cause read amplification. Fixes: https://github.com/scylladb/scylladb/issues/23785 Closes scylladb/scylladb#23880	2025-06-10 12:58:24 +03:00
Ernest Zaslavsky	a369dda049	s3_client: implement S3 copy object Add support for the CopyObject API to enable direct copying of S3 objects between locations. This approach eliminates networking overhead on the client side, as the operation is handled internally by S3.	2025-04-17 09:47:47 +03:00
Ernest Zaslavsky	8929cb324e	s3_client: improve exception message Clarify that the multipart upload was aborted due to a failure in parsing ETags.	2025-04-16 18:58:22 +03:00
Ernest Zaslavsky	993953016f	s3_client: reposition local function for future use The local function has been relocated higher in the code to prepare for its usage in upcoming implementations.	2025-04-16 18:46:31 +03:00
Benny Halevy	d3f498ae59	utils: s3::client::multipart_upload: use named gate Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-04-12 11:47:00 +03:00
Benny Halevy	eea83464c7	utils: s3::client: use named_gate Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-04-12 11:46:51 +03:00
Kefu Chai	55777812d4	s3/client: Optimize file streaming with zero-copy multipart uploads When streaming files using multipart upload, switch from using `output_stream::write(const char*, size_t)` to passing buffer objects directly to `output_stream::write()`. This eliminates unnecessary memory copying that occurred when the original implementation had to defensively copy data before sending. The buffer objects can now be safely reused by the output stream instead of creating deep copies, which should improve performance by reducing memory operations during S3 file uploads. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23567	2025-04-07 12:50:06 +03:00
Pavel Emelyanov	1f301b1c5d	s3/client: Introduce data_source_impl for object downloading The new data source implementation runs a single GET for the whole range specified and lends the body input_stream for the upper input_stream's get()-s. Eventually, getting the data from the body stream EOFs or fails. In either case, the existing body is closed and a new GET is spawned with the updated Range header so as not to include the bytes read so far. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:06 +03:00
Pavel Emelyanov	d47719f70e	s3/client: Detach format_range_header() helper The get_object_contiguous() formats the 'bytes=X-Y' one for its GET request. The very same code will be needed by next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:06 +03:00
Ernest Zaslavsky	2fb5c7402e	s3_client: Rearrange credentials providers chain As the IAM role is not configured to assume a role at this moment, it makes sense to move the instance metadata credentials provider up in the chain. This avoids unnecessary network calls and prevents log clutter caused by failure messages. Closes scylladb/scylladb#23360	2025-03-20 17:43:04 +03:00
Pavel Emelyanov	23089e1387	Merge 'Enhance S3 client robustness' from Ernest Zaslavsky This PR introduces several key improvements to bolster the reliability of our S3 client, particularly in handling intermittent authentication and TLS-related issues. The changes include: 1. Automatic Credential Renewal and Request Retry: When credentials expire, the new retry strategy now resets the credentials and sets the client to the retryable state, so the client will re-authenticate and automatically retry the request. This change prevents transient authentication failures from propagating as fatal errors. 2. Enhanced Exception Unwrapping: The client now extracts the embedded std::system_error from std::nested_exception instances that may be raised by the Seastar HTTP client when using TLS. This allows for more precise error reporting and handling. 3. Expanded TLS Error Handling: We've added support for retryable TLS error codes within the std::system_error handler. This modification enables the client to detect and recover from transient TLS issues by retrying the affected operations. Together, these enhancements improve overall client robustness by ensuring smoother recovery from both credential and TLS-related errors. No backport needed since it is an enhancement Closes scylladb/scylladb#22150 * github.com:scylladb/scylladb: aws_error: Add GNU TLS codes s3_client: Handle nested std::system_error exceptions s3_client: Start using new retry strategy retry_strategy: Add custom retry strategy for S3 client retry_strategy: Make `should_retry` awaitable	2025-03-20 16:52:20 +03:00
Ernest Zaslavsky	367140a9c5	s3_client: Start using new retry strategy * Previously, token expiration was considered a fatal error. With this change, the `s3_client` uses a new retry strategy that tries to renew expired creds * Added related test to the `s3_proxy`	2025-03-17 16:38:14 +02:00
Pavel Emelyanov	6217124d1d	s3/client: Make "expected" reply status truly optional Currently when a client::make_request() is called it can pass std::optional<status> argument indicating which status it expects from server. In case status doesn't match, the request body handler won't be called, the request will fail with unexpected status exception. However, disengaged expected implicitly means that the requestor expects the OK (200) status. This makes it impossible to make a query whose return status is not known in advance and it's up to the handler to check it. Lower level http client allows disengaged expected with the described semantics -- handler will check status on its own. This behavior for s3 client is needed for GET request. Server can respond with OK or partial content status depending on the Range header. If the header is absent or is large enough for the requested object to fit into it, the status would be OK, if the object is "trimmed" the status is partial content. At the end of the day, requestor cannot "guess" the returning status in advance and should check it upon response arrival. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23243	2025-03-17 15:34:58 +02:00
Ernest Zaslavsky	7c49ee4520	s3_client: enhance `retryable_http_client` functionality Enhanced `retryable_http_client` by allowing the injection of a custom error handler through its constructor.	2025-03-10 09:01:47 +02:00
Ernest Zaslavsky	b589a882bb	s3_client: isolate `retryable_http_client` Relocated `retryable_http_client` into its own dedicated file for improved clarity and maintainability.	2025-03-10 09:01:47 +02:00
Ernest Zaslavsky	5eff83af95	s3_client: Prepare for `retryable_http_client` relocation Expose `map_s3_client_exception` outside the S3 client class to facilitate moving `retryable_http_client` to a separate file.	2025-03-10 09:01:47 +02:00
Ernest Zaslavsky	2b3abba10a	s3_client: Remove `is_redirect_status` function Eliminate the `is_redirect_status` function in favor of the equivalent functionality provided by Seastar's HTTP client.	2025-03-10 09:01:47 +02:00
Ernest Zaslavsky	5b7d4a4136	s3_client: Move retryable functionality out of s3 client This commit moves the retryable HTTP client functionality out of the S3 client implementation. Since this functionality is also required for other services, such as AWS STS, it has been separated to ensure broader applicability.	2025-03-10 09:01:47 +02:00
Robert Bindar	27f2d64725	Remove object storage config credentials provider During development of #22428 we decided that we have no need for `object-storage.yaml`, and we'd rather store the endpoints in `scylla.yaml` and get a REST api to expose the endpoints for free. This patch removes the credentials provider used to read the aws keys from this yaml file. Followup work will remove the `object-storage.yaml` file altogether and move the endpoints to `scylla.yaml`. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#22951	2025-03-07 10:40:58 +03:00
Pavel Emelyanov	b52d1a3d99	s3/client: Make http client connections limit configurable It's now calculated based on sched group shares, but for tests explicit value is needed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-02-14 16:27:25 +03:00
Ernest Zaslavsky	5a266926e5	s3_client: Increase default part size for optimal performance Set the `upload_file` part size to 50MiB, as this value provides the best performance based on tests conducted using `perf_s3_client` on an i4i.4xlarge instance. ./perf_s3_client --smp 1 --upload --object_name ./1G-test-file --sockets 1 --part_size_mb 5 INFO 2025-02-06 10:34:08,007 [shard 0:main] perf - Uploaded 1024MB in 27.768863962s, speed 36.87583335786734MB/s ./perf_s3_client --smp 1 --upload --object_name ./1G-test-file --sockets 1 --part_size_mb 10 INFO 2025-02-06 10:35:07,161 [shard 0:main] perf - Uploaded 1024MB in 28.175412552s, speed 36.34374467845414MB/s ./perf_s3_client --smp 1 --upload --object_name ./1G-test-file --sockets 1 --part_size_mb 20 INFO 2025-02-06 10:35:55,530 [shard 0:main] perf - Uploaded 1024MB in 14.483539631s, speed 70.700949221575MB/s ./perf_s3_client --smp 1 --upload --object_name ./1G-test-file --sockets 1 --part_size_mb 30 INFO 2025-02-06 10:36:35,466 [shard 0:main] perf - Uploaded 1024MB in 11.486155799s, speed 89.15080188004683MB/s ./perf_s3_client --smp 1 --upload --object_name ./1G-test-file --sockets 1 --part_size_mb 40 INFO 2025-02-06 10:37:46,642 [shard 0:main] perf - Uploaded 1024MB in 10.236196424s, speed 100.03715809898961MB/s ./perf_s3_client --smp 1 --upload --object_name ./1G-test-file --sockets 1 --part_size_mb 50 INFO 2025-02-06 10:38:34,777 [shard 0:main] perf - Uploaded 1024MB in 9.490644522s, speed 107.895728011548MB/s ./perf_s3_client --smp 1 --upload --object_name ./1G-test-file --sockets 1 --part_size_mb 60 INFO 2025-02-06 10:39:08,832 [shard 0:main] perf - Uploaded 1024MB in 9.767783693s, speed 104.83442633295012MB/s ./perf_s3_client --smp 1 --upload --object_name ./1G-test-file --sockets 1 --part_size_mb 70 INFO 2025-02-06 10:39:47,916 [shard 0:main] perf - Uploaded 1024MB in 10.166116742s, speed 100.72675988162482MB/s Closes scylladb/scylladb#22732	2025-02-07 13:49:54 +03:00
Ernest Zaslavsky	97d789043a	s3_client: Fix buffer offset reset on request retry This patch addresses an issue where the buffer offset becomes incorrect when a request is retried. The new request uses an offset that has already been advanced, causing misalignment. This fix ensures the buffer offset is correctly reset, preventing such errors. Closes scylladb/scylladb#22729	2025-02-07 08:52:08 +03:00
Ernest Zaslavsky	dee4fc7150	aws creds: add STS and Instance Metadata service credentials providers This commit introduces two new credentials providers: STS and Instance Metadata Service. The S3 client's provider chain has been updated to incorporate these new providers. Additionally, unit tests have been added to ensure coverage of the new functionality.	2025-02-05 14:57:19 +02:00
Ernest Zaslavsky	d534051bea	aws creds: add env. and file credentials providers This commit entirely removes credentials from the endpoint configuration. It also eliminates all instances of manually retrieving environment credentials. Instead, the construction of file and environment credentials has been moved to their respective providers. Additionally, a new aws_credentials_provider_chain class has been introduced to support chaining of multiple credential providers.	2025-02-05 14:57:19 +02:00
Ernest Zaslavsky	c911fc4f34	s3 creds: move credentials out of endpoint config This commit refactors the way AWS credentials are managed in Scylla. Previously, credentials were included in the endpoint configuration. However, since credentials and endpoint configurations serve different purposes and may have different lifetimes, it’s more logical to manage them separately. Moving forward, credentials will be completely removed from the endpoint_config to ensure clear separation of concerns.	2025-02-04 16:45:23 +02:00
Kefu Chai	7215d4bfe9	utils: do not include unused headers these unused includes were identified by clang-include-cleaner. after auditing these source files, all of the reports have been confirmed. please note, because quite a few source files relied on `utils/to_string.hh` to pull in the specialization of `fmt::formatter<std::optional<T>>`, after removing `#include <fmt/std.h>` from `utils/to_string.hh`, we have to include `fmt/std.h` directly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-01-14 07:56:39 -05:00
Pavel Emelyanov	bb094cc099	Merge 'Make restore task abortable' from Calle Wilund Fixes #20717 Enables abortable interface and propagates abort_source to all s3 objects used for reading the restore data. Note: because restore is done on each shard, we have to maintain a per-shard abort source proxy for each, and do a background per-shard abort on abort call. This is synced at the end of "run()". Abort source is added as an optional parameter to s3 storage and the s3 path in distributed loader. There is no attempt to "clean up" an aborted restore. As we read on a mutation level from remote sstables, we should not cause incomplete sstables as such, even though we might end up of course with partial data restored. Closes scylladb/scylladb#21567 * github.com:scylladb/scylladb: test_backup: Add restore abort test case sstables_loader: Make restore task abortable distributed_loader: Add optional abort_source to get_sstables_from_object_store s3_storage: Add optional abort_source to params/object s3::client: Make "readable_file" abortable	2024-12-19 12:23:33 +03:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Calle Wilund	af4dd1f2cb	s3::client: Make "readable_file" abortable Adds optional abortable source to "readable_file" interface. Note: the abortable aspect is not preserved across a "dup()" call; however, since these objects are generally not used in a cross-shard fashion, it should be ok.	2024-12-02 12:30:24 +00:00


146 Commits
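
Commit d47719f70e refers to the `bytes=X-Y` value of the Range header, and commit 49e8c14a86 describes a request with a non-zero offset and zero length failing with "Range request cannot be satisfied". The following is a minimal sketch of that formatting, assuming standard HTTP inclusive byte ranges. The name `make_range_header` and its signature are hypothetical and are not the `format_range_header()` helper from the ScyllaDB tree.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string>

// Hypothetical helper (an illustration, not ScyllaDB code): build the value of
// an HTTP Range header covering `length` bytes starting at `offset`.
// HTTP byte ranges are inclusive on both ends, so the last byte is
// offset + length - 1.
std::optional<std::string> make_range_header(std::uint64_t offset, std::uint64_t length) {
    // A zero-length range has no valid "first-last" form; the caller should
    // treat it as "nothing left to download" instead of sending a request.
    if (length == 0) {
        return std::nullopt;
    }
    // Refuse ranges whose last byte position would overflow 64 bits.
    if (length - 1 > std::numeric_limits<std::uint64_t>::max() - offset) {
        return std::nullopt;
    }
    const std::uint64_t last = offset + (length - 1);
    return "bytes=" + std::to_string(offset) + "-" + std::to_string(last);
}
```

Returning an empty optional for a zero-length range makes the exhausted-range case explicit at the call site, which is one way to avoid issuing a request the server can only reject.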