scylladb

Author	SHA1	Message	Date
Ernest Zaslavsky	29960b83b5	s3_test: extract file writing code to a function Reduce code doing the same over and over again by extracting file writing code to a function	2025-08-14 16:18:43 +03:00
Ernest Zaslavsky	dd51e50f60	s3_client: add memory fallback in `chunked_download_source` Introduce fallback logic in `chunked_download_source` to handle memory exhaustion. When memory is low, feed the `deque` with only one uncounted buffer at a time. This allows slow but steady progress without getting stuck on the memory semaphore. Fixes: https://github.com/scylladb/scylladb/issues/25453 Fixes: https://github.com/scylladb/scylladb/issues/25262 Closes scylladb/scylladb#25452	2025-08-14 09:52:10 +03:00
Ernest Zaslavsky	380c73ca03	s3_client: make memory semaphore acquisition abortable Add `abort_source` to the `get_units` call for the memory semaphore in the S3 client, allowing the acquisition process to be aborted. Fixes: https://github.com/scylladb/scylladb/issues/25454 Closes scylladb/scylladb#25469	2025-08-13 08:48:55 +03:00
Ernest Zaslavsky	e4ebe6a309	s3_creds: Make `reload` unconditional Assume that any caller invoking `reload` intends to refresh credentials. Remove conditional logic that checks for expiration before reloading.	2025-08-03 17:41:35 +03:00
Ernest Zaslavsky	68855c90ca	s3_creds: Add test exposing credentials renewal issue Add a test demonstrating that renewing credentials does not update their expiration. After requesting credentials again, the expiration remains unchanged, indicating no actual update occurred.	2025-08-03 17:41:25 +03:00
Ernest Zaslavsky	acf15eba8e	s3_test: Add s3_client test for non-retryable error handling Introduce a test that injects a non-retryable error and verifies that the chunked download source throws an exception as expected.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	d2d69cbc8c	s3_client: Stop retries in chunked download source Disable retries for S3 requests in the chunked download source to prevent duplicate chunks from corrupting the buffer queue. The response handler now throws an exception to bypass the retry strategy, allowing the next range to be attempted cleanly. This exception is only triggered for retryable errors; unretryable ones immediately halt further requests.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	ec59fcd5e4	s3_client: Add test for Content-Range fix Introduce a test that accurately verifies the Content-Range behavior, ensuring the previous fix is properly validated.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	9ad7a456fe	s3_client: Refactor `range` class for state validation Revamped the `range` class to actively manage its state by enforcing validation on all modifications. This prevents overflow, invalid states, and ensures the object size does not exceed the 5TiB limit in S3.	2025-06-16 16:02:24 +03:00
Ernest Zaslavsky	30199552ac	s3_client: Mitigate connection exhaustion in `download_source` The existing `download_source` implementation optimizes performance by keeping the connection to S3 open and draining data directly from the socket. While this eliminates the overhead (60-100ms) of repeatedly establishing new connections, it leads to rapid exhaustion of client-side connections. On a single shard, two `mx_readers` for load and stream are enough to trigger this issue. Since each client typically holds two connections, readers keeping index and data sources open can cause deadlocks where processes stall due to unavailable connections. Introduce `chunked_download_source`, a new S3 download method built on `download_source`, to dynamically manage connections: - Buffers data in 5MiB chunks using a producer-consumer model - Closes connections once buffers reach capacity, returning them to the pool for other clients - Uses a filling fiber that resumes fetching once buffers are consumed from the queue Performance remains comparable to `download_source`, achieving 95MiB/s for sequential 1GiB downloads from S3. However, preloading large chunks may cause read amplification. Fixes: https://github.com/scylladb/scylladb/issues/23785 Closes scylladb/scylladb#23880	2025-06-10 12:58:24 +03:00
Ernest Zaslavsky	edaa3f4bdd	s3_tests: Improve and extend copy object test coverage Refactored the copy object test to enhance readability and maintainability. The test was simplified and split into smaller, more focused parts. Additionally, a "proxied" variant of the test was introduced to expand coverage.	2025-04-21 20:54:14 +03:00
Ernest Zaslavsky	252a0a14af	s3_tests: Implement post-test cleanup for uploaded objects Ensure cleanup after tests by deleting objects uploaded to MinIO. This improves resource management and maintains a clean test environment.	2025-04-21 20:54:14 +03:00
Ernest Zaslavsky	a369dda049	s3_client: implement S3 copy object Add support for the CopyObject API to enable direct copying of S3 objects between locations. This approach eliminates networking overhead on the client side, as the operation is handled internally by S3.	2025-04-17 09:47:47 +03:00
Pavel Emelyanov	bd313c581f	test: Add unit test for newly introduced download source Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-21 12:01:06 +03:00
Ernest Zaslavsky	88c4fa6569	s3: Implement S3 Fully Qualified Name Manipulation Functions Added utility functions to handle S3 Fully Qualified Names (FQN). These functions enable parsing, splitting, and identification of S3 paths, enhancing our ability to work with S3 object storage more effectively.	2025-03-09 09:50:36 +02:00
Robert Bindar	27f2d64725	Remove object storage config credentials provider During development of #22428 we decided that we have no need for `object-storage.yaml`, and we'd rather store the endpoints in `scylla.yaml` and get a REST api to expose the endpoints for free. This patch removes the credentials provider used to read the aws keys from this yaml file. Followup work will remove the `object-storage.yaml` file altogether and move the endpoints to `scylla.yaml`. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#22951	2025-03-07 10:40:58 +03:00
Ernest Zaslavsky	dee4fc7150	aws creds: add STS and Instance Metadata service credentials providers This commit introduces two new credentials providers: STS and Instance Metadata Service. The S3 client's provider chain has been updated to incorporate these new providers. Additionally, unit tests have been added to ensure coverage of the new functionality.	2025-02-05 14:57:19 +02:00
Ernest Zaslavsky	d534051bea	aws creds: add env. and file credentials providers This commit entirely removes credentials from the endpoint configuration. It also eliminates all instances of manually retrieving environment credentials. Instead, the construction of file and environment credentials has been moved to their respective providers. Additionally, a new aws_credentials_provider_chain class has been introduced to support chaining of multiple credential providers.	2025-02-05 14:57:19 +02:00
Ernest Zaslavsky	c911fc4f34	s3 creds: move credentials out of endpoint config This commit refactors the way AWS credentials are managed in Scylla. Previously, credentials were included in the endpoint configuration. However, since credentials and endpoint configurations serve different purposes and may have different lifetimes, it’s more logical to manage them separately. Moving forward, credentials will be completely removed from the endpoint_config to ensure clear separation of concerns.	2025-02-04 16:45:23 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Ernest Zaslavsky	4035e0877d	s3_tests: Add s3 test to check object re-uploading Add s3 test to check existing object re-uploading succeeds Closes scylladb/scylladb#21544	2024-11-28 12:46:59 +03:00
Ernest Zaslavsky	029837a4a1	aws_errors: Make error messages more verbose. Add more information to the error messages to make the failure reason clearer. Also add tests to check exceptions propagated from s3 client failure.	2024-11-07 21:01:25 +02:00
Ernest Zaslavsky	0c62635f05	test/boost/s3_test: add error injection scenarios to existing test suite Add variants of existing S3 tests that route through a proxy instead of connecting directly to MinIO. The proxy allows injecting errors to validate error handling and recovery mechanisms under failure conditions.	2024-11-07 21:01:25 +02:00
Ernest Zaslavsky	8919e0abab	test: Switch `s3_test` to use proxy Switch `s3_test` to use the S3 proxy which is used to randomly inject retryable S3 errors to test the "retry" part of the S3 client. Fix `put_object` to make it retryable	2024-11-07 21:01:25 +02:00
Pavel Emelyanov	86bc5b11fe	s3-client: Add support for lister::filter Directory lister comes with a filter function that tells lister which entries to skip by its .get() method. For uniformity, add the same to S3 bucket_lister. After this change the lister reports shorter name in the returned directory entry (with the prefix cut), so also need to tune up the unit test respectively. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-27 16:15:40 +03:00
Pavel Emelyanov	05adee4c82	test: Add test for s3::client::bucket_lister Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-13 21:15:43 +03:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Kefu Chai	061def001d	s3/client: add client::upload_file() this member function prepares for the backup feature, where the object to be stored in the object storage is already persisted as a file on local filesystem. this brings us two benefits: - with the file, we don't need to accumulate the payloads in memory and send them in batch, as we do in upload_sink and in upload_jumbo_sink. this puts less pressure on the memory subsystem. - with the file, we can read multiple parts in parallel if multipart upload applies to it, this helps to improve the throughput. so, this new helper is introduced to help upload an sstable from local filesystem to the object storage. Fixes #16287 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-07-23 14:39:30 +08:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Pavel Emelyanov	76705b6ba2	test/s3: Avoid object range overflow There's a test case that validates the uploading sink by getting random portions of the uploaded object. The portions are generated as `len = random % chunk_size` and `off = random % file_size - len`. The latter may apparently yield a negative value, which will translate into a huuuuge 64-bit range offset which, in turn, would result in an invalid http range specifier, and getting the object part fails with status OK Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-07 10:54:54 +03:00
Pavel Emelyanov	c5d85bdf79	s3/client: Don't GET object contents on out-of-bound reads If S3 readable file is used inside file input stream, the latter may call its read methods with a position that is above the file size. In that case the server replies with a generic http error, and the fact that the range was invalid is encoded into the reply body's xml. Catching this via a wrong reply status exception and xml parsing is not great, all the more so since we can know in advance that the read is out-of-bound. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-29 12:09:52 +03:00
Pavel Emelyanov	855626f7de	s3/client: Map http exceptions into storage_io_error When an http request resolves with an exception, it makes sense to translate the network exception into a storage exception to make upper layers think that it was some sort of IO error, not suddenly an http one. The translation is, for now, pretty simple: - 404 and 3xx -> ENOENT - 403(forbidden) and 401(unauthorized) -> EACCES - anything else -> EIO Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-21 16:47:50 +03:00
Pavel Emelyanov	caa3e751f7	test: Add s3_client test for upload PUT fallback The test case creates a non-jumbo upload sink and puts some bytes into it, then flushes. In order to make sure the fallback took place, the multipart memory tracker semaphore is broken in advance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-24 15:03:53 +03:00
Kefu Chai	f3f31f0c65	main.cc: rename aws option - s/aws_key/aws_access_key_id/ - s/aws_secret/aws_secret_access_key/ - s/aws_token/aws_session_token/ rename them to more popular names, these names are also used by boto's API. this should improve the readability and consistency. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-23 14:31:32 +08:00
Kefu Chai	ac3406e537	utils/s3/creds: rename aws_config member variables - s/key/access_key_id/ - s/secret/secret_access_key/ - s/token/session_token/ so they are more aligned with the AWS document. for instance, in https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#ConstructingTheAuthenticationHeader AWSAccessKeyId is used in the "Authorization" header. this would help with the readability and maintainability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-23 14:28:07 +08:00
Avi Kivity	1da6a939fe	Merge 'Track memory usage of S3 object uploads' from Pavel Emelyanov The S3 uploading sink needs to collect buffers internally before sending them out, because the minimal upload-able part size is 5Mb. When the necessary amount of bytes is accumulated, the part uploading fiber starts in the background. On flush the sink waits for all the fibers to complete and handles failure of any. Uploading parallelism is nowadays limited by the means of the http client max-connections parameter. However, when a part uploading fiber waits for its connection it keeps the 5Mb+ buffers on the request's body, so even though the number of uploading parts is limited, the number of _waiting_ parts is effectively not. This PR adds a shard-wide limiter on the number of background buffers S3 clients (and their http clients) may use. Closes scylladb/scylladb#15497 * github.com:scylladb/scylladb: s3::client: Track memory in client uploads code: Configure s3 clients' memory usage s3::client: Construct client with shared semaphore sstables::storage_manager: Introduce config	2023-09-21 18:24:42 +03:00
Kefu Chai	c364efb998	utils/s3: auth using AWS_SESSION_TOKEN when accessing AWS resources, users are allowed to use long-term security credentials, they can also use the temporary credentials. but if the latter are used, we have to pass a session token along with the keys. see also https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html so, if we want to programmatically get authenticated, we need to set the "x-amz-security-token" header, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#UsingTemporarySecurityCredentials so, in this change, we 1. add another member named `token` in `s3::endpoint_config::aws_config` for storing "AWS_SESSION_TOKEN". 2. populate the setting from "object_storage.yaml" and "$AWS_SESSION_TOKEN" environment variable. 3. set "x-amz-security-token" header if `s3::endpoint_config::aws_config::token` is not empty. this should allow us to test s3 client and s3 object store backend with S3 bucket, with the temporary credentials. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15486	2023-09-21 13:26:11 +03:00
Pavel Emelyanov	182a5348d4	code: Configure s3 clients' memory usage This sets the real limits on the memory semaphore. - scylla sets it to 1% of total memory, 10Mb min, 100Mb max - tests set it to 16Mb - perf test sets it to all available memory Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:50:29 +03:00
Pavel Emelyanov	b299757884	s3::client: Construct client with shared semaphore The semaphore will be used to cap memory consumption by client. This patch makes sure the reference to a semaphore exists as an argument to client's constructor, not more than that. In scylla binary, the semaphore sits on storage_manager. In tests the semaphore is some local object. For now the semaphore is unused and is initialized locked as this patch just pushes the needed argument all the way around, next patches will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:50:07 +03:00
Botond Dénes	cc16502691	Merge 'Add metrics to S3 client' from Pavel Emelyanov The added metrics include: - http client metrics, which include the number of connections, the number of active connections and the number of new connections made so far - IO metrics that mimic those for traditional IO -- total number of object read/write ops, total number of get/put/uploaded bytes and individual IO request delay (round-trip, including body transfer time) fixes: #13369 Closes #14494 * github.com:scylladb/scylladb: s3/client: Add IO stats metrics s3/client: Add HTTP client metrics s3/client: Split make_request() s3/client: Wrap http client with struct group_client s3/client: Move client::stats to namespace scope s3/client: Keep part size local variable	2023-09-14 09:49:08 +03:00
Pavel Emelyanov	4dc4f65b18	test/s3: Remove AWS_S3_EXTRA usage Now that the keys and region can be configured with "standard" environment variables, the old custom one can be removed. No automation uses it; it was purely support for manual testing of a client against AWS's S3 server Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 11:16:13 +03:00
Pavel Emelyanov	1d00cc5baa	test/s3: Run tests over non-anonymous bucket Currently minio applies anonymous public policy for the test bucket and all tests just use unsigned S3 requests. This patch generates a policy for the temporary minio user and removes the anon public one. All tests are updated respectively to use the provided key:secret pair. The use-https bit is off by default as minio still starts with plain http. That's OK for now, all tests are local and have no secret data anyway Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 11:16:13 +03:00
Pavel Emelyanov	e8e8539c7c	code: Rename S3_PUBLIC_BUCKET_FOR_TEST The bucket is going to stop being public, rename the env variable in advance to make the essential patch smaller Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 10:25:53 +03:00
Pavel Emelyanov	627c1932e4	s3/client: Move client::stats to namespace scope The stats is stats about object, not about client, so it's better if it lives in namespace scope. Also it will avoid conflicts with client stats that will be reported as metrics (later patch) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Kefu Chai	77faec4f38	s3/test: use seastar::deferred() to perform cleanup let's use RAII to remove the object used as a fixture, so we don't leave some object in the bucket for testing. this might interfere with other tests which share the same minio server with the test which fails to do its clean up if an exception is thrown. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-07-20 10:04:54 +08:00
Kefu Chai	7a9c802fc3	s3/test: close using deferred_close() let's use RAII to tear down the client and the input file, so we can always perform the cleanups even if the test throws. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-07-20 10:04:54 +08:00
Raphael S. Carvalho	da18a9badf	Fix test.py with compaction groups test.py with --x-log2-compaction-groups option rotted a little bit. Some boost tests added later didn't use the correct header which parses the option or they didn't adjust suite.yaml. Perhaps it's time to set up a weekly (or bi-weekly) job to verify there are no regressions with it. It's important as it stresses the data plane for tablets reusing the existing tests available. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #14732	2023-07-18 16:57:11 +03:00
Kefu Chai	ef78b31b43	s3/client: add tagging ops with tagging ops, we will be able to attach kv pairs to an object. this will allow us to mark sstable components with taggings, and filter them based on them. * test/pylib/minio_server.py: enable anonymous user to perform more actions. because the tagging related ops are not enabled by "mc anonymous set public", we have to enable them using "set-json" subcommand. * utils/s3/client: add methods to manipulate taggings. * test/boost/s3_test: add a simple test accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14486	2023-07-11 09:30:46 +03:00
Pavel Emelyanov	ce6a1ca13b	Update seastar submodule * seastar afe39231...99d28ff0 (16): > file/util: Include seastar.hh > http/exception: Use http::reply explicitly > http/client: Include lost condition-variable.hh > util: file: drop unnecessary include of reactor.hh > tests: perf: add a markdown printer > http/client: Introduce unexpected_status_error for client requests > sharded: avoid #include <seastar/core/reactor.hh> for run_in_background() > code: Use std::is_invocable_r_v instead of InvokeReturns > http/client: Add ability to change pool size on the fly > http/client: Add getters for active/idle connections counts > http/client: Count and limit the number of connections > http/client: Add connection->client RAII backref > build: use the user-specified compiler when building DPDK > build: use proper toolchain based on specified compiler > build: only pass CMAKE_C_COMPILER when building ingredients > build: use specified compiler when building liburing Two changes are folded into the commit: 1. missing seastar/core/coroutine.hh include in one .cc file that got it indirectly included before seastar reactor.hh drop from file.hh 2. http client now returns unexpected_status_error instead of std::runtime_error, so s3 test is updated respectively Closes #14168	2023-06-07 20:25:49 +03:00
Pavel Emelyanov	b3df2d0db0	s3/test: Tune-up multipart upload test alignment Currently the test uses a sequence of 1024-bytes buffers. This lets minio server actively de-duplicate those blocks by page boundary (it's a guess, but it's truish because minio reports back equivalent ETags for lots of uploading parts). Make the buffer not be power of two so that when squashed together the resulting 2^X buffers don't get equal. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-16 12:23:18 +03:00
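
The status translation described in commit 855626f7de (404 and 3xx to ENOENT, 403 and 401 to the POSIX access error EACCES, anything else to EIO) can be sketched as a small standalone function. This is only an illustration of the rule as the commit message states it: the name `http_status_to_errno` is hypothetical, and the real client wraps the result in `storage_io_error` rather than returning a bare errno value.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper for illustration; not the ScyllaDB implementation.
// Maps an HTTP reply status to the errno value named in the commit message.
inline int http_status_to_errno(int status) {
    // "Not found" and any redirect are reported as a missing object.
    if (status == 404 || (status >= 300 && status < 400)) {
        return ENOENT;
    }
    // Unauthorized and forbidden are reported as an access error.
    if (status == 401 || status == 403) {
        return EACCES;
    }
    // Everything else is a generic IO error.
    return EIO;
}

// A few spot checks of the mapping.
inline void check_mapping_examples() {
    assert(http_status_to_errno(404) == ENOENT);
    assert(http_status_to_errno(403) == EACCES);
    assert(http_status_to_errno(500) == EIO);
}
```

Keeping the mapping in one pure function makes the rule easy to unit-test without a running S3 endpoint.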


62 Commits
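
The memory limit from commit 182a5348d4 ("1% of total memory, 10Mb min, 100Mb max") amounts to a clamp. A minimal sketch follows, assuming "Mb" in the commit message means MiB; the function name `s3_memory_limit` and the constant `MiB` are hypothetical and do not come from the ScyllaDB source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumption: "Mb" in the commit message is read as MiB.
constexpr std::uint64_t MiB = 1024ULL * 1024ULL;

// Hypothetical helper: 1% of total memory, clamped to [10 MiB, 100 MiB].
inline std::uint64_t s3_memory_limit(std::uint64_t total_memory) {
    const std::uint64_t one_percent = total_memory / 100;
    const std::uint64_t capped = std::min<std::uint64_t>(one_percent, 100 * MiB);
    return std::max<std::uint64_t>(capped, 10 * MiB);
}

// Spot checks: small, medium and large machines.
inline void check_limit_examples() {
    assert(s3_memory_limit(512 * MiB) == 10 * MiB);          // floor applies
    assert(s3_memory_limit(4096 * MiB) == 4096 * MiB / 100); // plain 1%
    assert(s3_memory_limit(65536 * MiB) == 100 * MiB);       // ceiling applies
}
```

The floor keeps small machines able to hold at least a couple of 5 MiB upload parts, while the ceiling bounds background buffer usage on large ones.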