The update_credentials_and_rearm() may get "empty" credentials from
_creds_provider_chain.get_aws_credentials() -- it doesn't throw, but
returns default-initialized value. In that case the expires_at will be
set to time_point::min, and it's probably not a good idea to arm the
refresh timer and, even worse idea, to subtract 1h from it.
Fixes#29056
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#29057
The format string had two {} placeholders but three arguments, the
_upload_id one is skipped from formatting
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#29053
Make the pattern static const so it is compiled once at first call rather
than on every Content-Range header parse.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#29054
- fix s3::range max value for object size which is 50TiB and not 5.
- refactor constants to make it accessible for all interested parties, also reuse these constants in tests
No need to backport, doubt we will encounter an object larger than 5TiB
Closesscylladb/scylladb#28601
* github.com:scylladb/scylladb:
s3_client: reorganize tests in part_size_calculation_test
s3_client: switch using s3 limits constants in tests
s3_client: fix the s3::range max object size
s3_client: remove "aws" prefix from object limits constants
s3_client: make s3 object limits accessible
in s3::Range class start using s3 global constant for two reasons:
1) uniformity, no need to introduce semantically same constant in each class
2) the value was wrong
Today S3 client has well established and well testes (hopefully) http request retry strategy, in the rest of clients it looks like we are trying to achieve the same writing the same code over and over again and of course missing corner cases that already been addressed in the S3 client.
This PR aims to extract the code that could assist other clients to detect the retryability of an error originating from the http client, reuse the built in seastar http client retryability and to minimize the boilerplate of http client exception handling
No backport needed since it is only refactoring of the existing code
Closesscylladb/scylladb#28250
* github.com:scylladb/scylladb:
exceptions: add helper to build a chain of error handlers
http: extract error classification code
aws_error: extract `retryable` from aws_error
- Correct `calc_part_size` function since it could return more than 10k parts
- Add tests
- Add more checks in `calc_part_size` to comply with S3 limits
Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-640
Must be ported back to 2025.3/4 and 2026.1 since we may encounter this bug in production clusters
Closesscylladb/scylladb#28592
* github.com:scylladb/scylladb:
s3_client: add more constrains to the calc_part_size
s3_client: add tests for calc_part_size
s3_client: correct multipart part-size logic to respect 10k limit
The previous calculation could produce more than 10,000 parts for large
uploads because we mixed values in bytes and MiB when determining the
part size. This could result in selecting a part size that still
exceeded the AWS multipart upload limit. The updated logic now ensures
the number of parts never exceeds the allowed maximum.
This change also aligns the implementation with the code comment: we
prefer a 50 MiB part size because it provides the best performance, and
we use it whenever it fits within the 10,000-part limit. If it does not,
we increase the part size (in bytes, aligned to MiB) to stay within the
limit.
Generalize error handling by creating exception dispatcher which allows to write error handlers by sequentially applying handlers the same way one would write `catch ()` blocks
Previously we only inspected std::system_error inside
std::nested_exception to support a specific TLS-related failure
mode. However, nested exceptions may contain any type, including
other restartable (retryable) errors. This change unwraps one
nested exception per iteration and re-applies all known handlers
until a match is found or the chain is exhausted.
Closesscylladb/scylladb#28240
The loop that unwraps nested exception, rethrows nested exception and saves pointer to the temporary std::exception& inner on stack, then continues. This pointer is, thus, pointing to a released temporary
Closesscylladb/scylladb#28143
To configure S3 storage, one needs to do
```
object_storage_endpoints:
- name: s3.us-east-1.amazonaws.com
port: 443
https: true
aws_region: us-east-1
```
and for GCS it's
```
object_storage_endpoints:
- name: https://storage.googleapis.com:433
type: gs
credentials_file: <gcp account credentials json file>
```
This PR updates the S3 part to look like
```
object_storage_endpoints:
- name: https://s3.us-east-1.amazonaws.com:443
aws_region: us-east-1
```
fixes: #26570
This is 2nd attempt, previous one (#27360) was reverted because it reported endpoint configs in new format via API and CQL always, even if the endpoint was configured in the old way. This "broke" scylla manager and some dtests. This version has this bug fixed, and endpoints are reported in the same format as they were configured with.
About correctness of the changes.
No modifications to existing tests are made here, so old format is respected correctly (as far as it's covered by tests). To prove the new format works the the test_get_object_store_endpoints is extended to validate both options. Some preparations to this test to make this happen come on their own with the PR #28111 to show that they are valid and pass before changing the core code.
Enhancing the way configuration is made, likely no need to backport.
Closesscylladb/scylladb#28112
* github.com:scylladb/scylladb:
test: Validate S3 endpoints new format works
docs: Update docs according to new endpoints config option format
object_storage: Create s3 client with "extended" endpoint name
s3/storage: Tune config updating
sstable: Shuffle args for s3_client_wrapper
test: Rename badconf variable into objconf
test: Split the object_store/test_get_object_store_endpoints test
A data_sink that stores buffers into an in-memory collection had
appeared in seastar recently. In Scylla there's similar thing that uses
memory_data_sink_buffer as a container, so it's possible to drop the
data_sink_impl iself in favor of seastar implementation.
For that to work there should be append_buffers() overload for the
aforementioned container. For its nice implementation the container, in
turn, needs to get push_back() method and value_type trait. The method
already exists, but is called put(), so just rename it. There's one more
user of it this method in S3 client, and it can enjoy the added
append_buffers() helper.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28124
For this, add the s3::client::make(endpoint, ...) overload that accepts
endpoint in proto://host:port format. Then it parses the provided url
and calls the legacy one, that accepts raw host string and config with
port, https bit, etc.
The generic object_storage_endpoint_param no longer needs to carry the
internal s3::endpoint_config, the config option parsing changes
respectively.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Don't prepare s3::endpoint_config from generic code, jut pass the region
and iam_role_arn (those that can potentially change) to the callback.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This reverts commit 1bb897c7ca, reversing
changes made to 954f2cbd2f. It makes
incompatible changes to the object storage configuration format, breaking
tests [1]. It's likely that it doesn't break any production configuration,
but we can't be sure.
Fixes#27966Closesscylladb/scylladb#27969
To configure S3 storage, one needs to do
```
object_storage_endpoints:
- name: s3.us-east-1.amazonaws.com
port: 443
https: true
aws_region: us-east-1
```
and for GCS it's
```
object_storage_endpoints:
- name: https://storage.googleapis.com:433
type: gs
credentials_file: <gcp account credentials json file>
```
This PR updates the S3 part to look like
```
object_storage_endpoints:
- name: https://s3.us-east-1.amazonaws.com:443
aws_region: us-east-1
```
fixes: #26570
Not-yet released feature, no need to backport. Old configs are not accepted any longer. If it's needed, then this decision needs to be revised.
Closesscylladb/scylladb#27360
* github.com:scylladb/scylladb:
object_storage: Temporarily handle pure endpoint addresses as endpoints
code: Remove dangling mentions of s3::endpoint_config
docs: Update docs according to new endpoints config option format
object_storage: Create s3 client with "extended" endpoint name
test: Add named constants for test_get_object_store_endpoints endpoint names
s3/storage: Tune config updating
sstable: Shuffle args for s3_client_wrapper
For this, add the s3::client::make(endpoint, ...) overload that accepts
endpoint in proto://host:port format. Then it parses the provided url
and calls the legacy one, that accepts raw host string and config with
port, https bit, etc.
The generic object_storage_endpoint_param no longer needs to carry the
internal s3::endpoint_config, the config option parsing changes
respectively.
Tests, that generate the config files, and docs are updated.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Don't prepare s3::endpoint_config from generic code, jut pass the region
and iam_role_arn (those that can potentially change) to the callback.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Add handling for a broader set of transient network-related `std::errc` values in `aws_error::from_system_error`. Treat these conditions as retryable when the client re-creates the socket for each request.
Fixes: https://github.com/scylladb/scylladb/issues/27349Closesscylladb/scylladb#27350
Refactor `chunked_download_source` to eliminate redundant exception
handling by leveraging the new `make_request` override with custom
retry strategy. This streamlines the download fiber logic, improving
readability and maintainability.
Reformats the `make_request` function declarations to improve readability
due to the large number of arguments. This aligns with our formatting
guidelines and makes the code easier to maintain.
handler
Introduce an override for `make_request` in `s3_client` to support
custom retry strategies and error handlers, enabling flexibility
beyond the default client behavior and improving control over request
handling
In the `copy_part` method, move the `input_stream<char>` argument
into a local variable before use. Failing to do so can lead to a
SIGSEGV or trigger an abort under address sanitizer.
Add a missing `case` clause to the `switch` statement to correctly
handle scenarios where `unexpected_status_error` is thrown. This
fixes overlooked error handling and improves robustness.
In AWS credentials providers, replace `retryable_http_client` with
Seastar's native HTTP client. Integrate the newly added
`default_aws_retry_strategy` to handle retries more efficiently and
reduce dependency on external retry logic.
Add a new class derived from Seastar's `default_retry_strategy`.
Relocate the `should_retry` implementation from Scylla's
`default_retry_strategy` into the new class to centralize and
standardize retry behavior.
Renames the `default_retry_strategy` class to `default_aws_retry_strategy`
to clarify its association with the S3 client implementation. This avoids
confusion with the unrelated `seastar::default_retry_strategy` class.