Fixes #1559
Fixes #1330
This PR focuses on three main changes:
1. **Fix object lock error codes and descriptions**
When an object was WORM-protected and delete/overwrite was disallowed due to object lock configuration, the gateway returned the `s3.ErrObjectLocked` error with an incorrect code and description. These have now been corrected.
2. **Update `PutObjectRetention` behavior**
Previously, when an object already had a retention mode set, the gateway only allowed modifications if the mode was changed from `GOVERNANCE` to `COMPLIANCE`, and only when the user had the `s3:BypassGovernanceRetention` permission.
The logic has been updated: if the existing retention mode is the same as the one being applied, the operation is now allowed regardless of other factors (see the sketch after this list).
3. **Fix error checks in integration tests (AWS SDK regression)**
Due to an AWS SDK regression, integration tests were previously limited to checking partial error descriptions. The issue appears to be resolved for some actions (though the ticket is still open: https://github.com/aws/aws-sdk-go-v2/issues/2921). Error checks have been reverted to full description comparisons where possible.
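A minimal sketch of the updated check from item 2, assuming illustrative mode strings and a hypothetical `bypassGovernance` flag in place of the gateway's actual types and permission lookup:
```
package sketch

import "errors"

// checkRetentionChange decides whether PutObjectRetention may replace an
// existing retention mode (illustrative only, not the gateway's code).
func checkRetentionChange(existing, requested string, bypassGovernance bool) error {
	// Re-applying the same mode is now always allowed, regardless of
	// other factors.
	if existing == requested {
		return nil
	}
	// Changing GOVERNANCE to COMPLIANCE still requires the
	// s3:BypassGovernanceRetention permission.
	if existing == "GOVERNANCE" && requested == "COMPLIANCE" && bypassGovernance {
		return nil
	}
	return errors.New("retention change not permitted")
}
```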
Similar to commit 8e18b43116 (fix: lex sort order of listobjects backend.Walk), but now for the "Versions" walk.
The original backend.WalkVersions function used the native WalkDir and ReadDir, which did not guarantee lexicographic ordering of results in cases where including the directory slash changes the sort order. This caused incorrect paginated responses, because the S3 APIs require strict lexicographic ordering in which directories with trailing slashes sort correctly relative to files. For example, dir1/a.b/ must come before dir1/a/ in the results, but fs.WalkDir was returning them in filesystem sort order, which reversed the order by not taking the trailing "/" into account.
The original Walk function used the native WalkDir and ReadDir, which did not guarantee lexicographic ordering of results in cases where including the directory slash changes the sort order. This caused incorrect paginated responses, because the S3 APIs require strict lexicographic ordering in which directories with trailing slashes sort correctly relative to files. For example, dir1/a.b/ must come before dir1/a/ in the results, but fs.WalkDir was returning them in filesystem sort order, which reversed the order by not taking the trailing "/" into account.
This also led to continuous looping of paginated listobjects results when the marker was set out of order relative to the expected results.
To address this fundamental ordering issue, the entire directory traversal mechanism was replaced with a custom lexicographic sorting approach. The new implementation reads each directory's contents using ReadDir, then sorts the entries using custom sort keys that append a trailing slash to directory paths. This ensures that dir1/a.b/ correctly sorts before dir1/a/, along with other similar failing cases, according to ASCII character ordering rules.
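A minimal sketch of the sort-key idea, using a hypothetical entry type rather than the walker's actual fs.DirEntry handling:
```
package sketch

import "sort"

// entry is a stand-in for the walker's directory entries.
type entry struct {
	name  string
	isDir bool
}

// sortKey appends a trailing slash to directories so comparisons match the
// lexicographic order S3 expects: "dir1/a.b/" < "dir1/a/" because '.'
// (0x2E) sorts before '/' (0x2F).
func sortKey(e entry) string {
	if e.isDir {
		return e.name + "/"
	}
	return e.name
}

func sortEntries(entries []entry) {
	sort.Slice(entries, func(i, j int) bool {
		return sortKey(entries[i]) < sortKey(entries[j])
	})
}
```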
Fixes #1283
Fixes #1520
Removes the incorrect logic where HeadObject returned a successful response when querying an incomplete multipart upload.
Implements the logic to return a `NotImplemented` error if `GetObject`/`HeadObject` is attempted with `partNumber` in the Azure and POSIX backends. The front-end part is preserved for use in the S3 proxy backend.
Closes #821
**Implements conditional operations across object APIs:**
* **PutObject** and **CompleteMultipartUpload**:
Supports conditional writes with `If-Match` and `If-None-Match` headers (ETag comparisons).
Evaluation is based on an existing object with the same key in the bucket. The operation is allowed only if the preconditions are satisfied. If no object exists for the key, these headers are ignored.
* **CopyObject** and **UploadPartCopy**:
Adds conditional reads on the copy source object with the following headers:
* `x-amz-copy-source-if-match`
* `x-amz-copy-source-if-none-match`
* `x-amz-copy-source-if-modified-since`
* `x-amz-copy-source-if-unmodified-since`
The first two are ETag comparisons, while the latter two compare against the copy source’s `LastModified` timestamp.
* **AbortMultipartUpload**:
Supports the `x-amz-if-match-initiated-time` header; the abort succeeds only if the provided timestamp matches the multipart upload's initiation time.
* **DeleteObject**:
Adds support for:
* `If-Match` (ETag comparison)
* `x-amz-if-match-last-modified-time` (LastModified comparison)
* `x-amz-if-match-size` (object size comparison)
Additionally, this PR updates the precondition date parsing logic to support both **RFC1123** and **RFC3339** formats. Dates set in the future are ignored, matching AWS S3 behavior.
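A minimal sketch of the date parsing behavior described above; `parsePreconditionDate` is a hypothetical name, not the gateway's actual function:
```
package sketch

import "time"

// parsePreconditionDate accepts either RFC1123 or RFC3339 timestamps.
// Dates in the future are ignored rather than rejected, matching the AWS
// S3 behavior described above.
func parsePreconditionDate(value string, now time.Time) (time.Time, bool) {
	for _, layout := range []string{time.RFC1123, time.RFC3339} {
		t, err := time.Parse(layout, value)
		if err != nil {
			continue
		}
		if t.After(now) {
			// Future dates are ignored.
			return time.Time{}, false
		}
		return t, true
	}
	return time.Time{}, false
}
```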
Closes #1518
Adds the `x-amz-object-size` header to the `PutObject` response, indicating the size of the uploaded object. This change is applied to the POSIX, Azure, and S3 proxy backends.
The debuglogger should be a top-level module, since we expect all modules within the project to make use of it. If it's hidden in s3api, then contributors are less likely to use it outside of s3api.
Closes #882
Implements conditional reads for `GetObject` and `HeadObject` in the gateway for both the POSIX and Azure backends. The behavior is controlled by the `If-Match`, `If-None-Match`, `If-Modified-Since`, and `If-Unmodified-Since` request headers; the first two perform ETag comparisons, and the latter two compare against the object's `LastModified` date. No validation is performed for invalid ETags or malformed date formats: precondition date headers are expected to follow RFC1123 and are otherwise ignored.
The integration tests cover all possible combinations of conditional headers, ensuring the feature is 100% AWS S3-compatible.
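A minimal sketch of the evaluation order, with assumed names (`checkReadPreconditions`, the `etag`/`modTime` inputs) and simplified combination rules:
```
package sketch

import (
	"net/http"
	"time"
)

// checkReadPreconditions evaluates the conditional read headers described
// above against the stored object's ETag and LastModified date. Malformed
// dates are ignored, per the behavior above.
func checkReadPreconditions(h http.Header, etag string, modTime time.Time) int {
	if v := h.Get("If-Match"); v != "" && v != etag {
		return http.StatusPreconditionFailed
	}
	if v := h.Get("If-Unmodified-Since"); v != "" {
		if t, err := time.Parse(time.RFC1123, v); err == nil && modTime.After(t) {
			return http.StatusPreconditionFailed
		}
	}
	if v := h.Get("If-None-Match"); v != "" && v == etag {
		return http.StatusNotModified
	}
	if v := h.Get("If-Modified-Since"); v != "" {
		if t, err := time.Parse(time.RFC1123, v); err == nil && !modTime.After(t) {
			return http.StatusNotModified
		}
	}
	return http.StatusOK
}
```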
The CORS actions were directly proxied in the S3 proxy backend. The new implementation stores/retrieves/deletes the bucket CORS configuration in the `meta` bucket.
This changes the marker/continuation token from the object name to the marker from the Azure list objects pager. This is needed because passing the object name as the token to the Azure next call causes the Azure API to return 400 Bad Request with InvalidQueryParameterValue. So we have to use the Azure marker for compatibility with the Azure API pager.
To do this, we have to align the S3 list objects request with the Azure ListBlobsHierarchyPager. The v2 requests have an optional StartAfter, for which we have to page through the Azure blobs to find the correct starting point; after this, we only return the single paginated result from the Azure pager to maintain the correct markers all the way through to Azure.
ListObjects (non-V2) assumes that the marker must be an object name, so for this case we have to page through the Azure listings on each call to find the correct starting point. This makes the V2 method far more efficient, while maintaining correctness for ListObjects.
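A minimal sketch of the two marker strategies; `listPage` is a hypothetical stand-in for one call to the Azure ListBlobsHierarchyPager, not the real SDK wiring:
```
package sketch

// listPage stands in for one call to Azure's ListBlobsHierarchyPager: it
// returns a page of blob names and the opaque NextMarker Azure hands back.
func listPage(azureMarker string) (blobs []string, nextMarker string) {
	// ... Azure SDK call elided ...
	return nil, ""
}

// ListObjectsV2: the S3 continuation token is the Azure marker itself, so
// each S3 page maps to exactly one Azure page and the markers stay valid
// all the way through.
func listV2(continuationToken string) (blobs []string, nextToken string) {
	return listPage(continuationToken)
}

// ListObjects (non-V2): the S3 marker must be an object name, so every
// call pages through the Azure listings until it passes the marker, then
// returns the remainder of that page. The caller's NextMarker is the last
// key returned.
func listV1(marker string) []string {
	azureMarker := ""
	for {
		page, next := listPage(azureMarker)
		for i, name := range page {
			if name > marker {
				return page[i:]
			}
		}
		if next == "" {
			return nil
		}
		azureMarker = next
	}
}
```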
Also removes the continuation token string checks in the integration tests, since this is supposed to be an opaque token that the client should not care about. This will help maintain the tests across multiple backend types.
Fixes #1457
Closes #1003
**Changes Introduced:**
1. **S3 Bucket CORS Actions**
* Implemented the following S3 bucket CORS APIs:
* `PutBucketCors` – Configure CORS rules for a bucket.
* `GetBucketCors` – Retrieve the current CORS configuration for a bucket.
* `DeleteBucketCors` – Remove CORS configuration from a bucket.
2. **CORS Preflight Handling**
* Added an `OPTIONS` endpoint to handle browser preflight requests.
* The endpoint evaluates incoming requests against bucket CORS rules and returns the appropriate `Access-Control-*` headers.
3. **CORS Middleware**
* Implemented middleware that:
* Checks if a bucket has CORS configured.
* Detects the `Origin` header in the request.
* Adds the necessary `Access-Control-*` headers to the response when the request matches the bucket CORS configuration.
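A minimal sketch of the middleware logic using the standard library; the rule type and `lookupRules` helper are illustrative stand-ins for the gateway's stored per-bucket CORS configuration:
```
package sketch

import (
	"net/http"
	"strings"
)

// corsRule is a stand-in for one rule from a bucket's stored CORS
// configuration; the gateway's real types differ.
type corsRule struct {
	AllowedOrigins []string
	AllowedMethods []string
	ExposeHeaders  []string
}

func (r corsRule) matches(origin, method string) bool {
	for _, o := range r.AllowedOrigins {
		if o != "*" && o != origin {
			continue
		}
		for _, m := range r.AllowedMethods {
			if m == method {
				return true
			}
		}
	}
	return false
}

// corsMiddleware adds the Access-Control-* headers when the request's
// Origin matches the bucket's CORS rules; lookupRules stands in for
// fetching the bucket's stored configuration.
func corsMiddleware(lookupRules func(*http.Request) []corsRule, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		if origin := req.Header.Get("Origin"); origin != "" {
			for _, rule := range lookupRules(req) {
				if rule.matches(origin, req.Method) {
					w.Header().Set("Access-Control-Allow-Origin", origin)
					w.Header().Set("Access-Control-Allow-Methods", strings.Join(rule.AllowedMethods, ", "))
					w.Header().Set("Access-Control-Expose-Headers", strings.Join(rule.ExposeHeaders, ", "))
					break
				}
			}
		}
		next.ServeHTTP(w, req)
	})
}
```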
The C++ SDK (and maybe others?) assumes that S3 ETags without a "-" in the string are MD5 checksums. So an Azure ETag that does not contain a "-" but also is not an MD5 checksum will fail some of the SDK's internal validation checks.
Fix this by appending "-1" to the ETag to make it look like a multipart-format ETag, which skips the SDK verification check.
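A minimal sketch of the workaround; the function name is illustrative:
```
package sketch

import "strings"

// azureToS3ETag appends "-1" to ETags with no "-" so SDKs treat them as
// multipart-format ETags and skip their MD5 validation.
func azureToS3ETag(etag string) string {
	if !strings.Contains(etag, "-") {
		return etag + "-1"
	}
	return etag
}
```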
Fixes: #1380
Co-authored-by: Ben McClelland <ben.mcclelland@versity.com>
This adds a number of test cases for non-zero-length objects, zero-length objects, and directory objects to match verified AWS responses for the various byte-range cases.
It also fixes the posix head/get range responses for these test cases.
Fixes #1342
This PR includes two main changes:
1. It fixes the case where `x-amz-checksum-x` (precalculated checksum headers) are not provided for `UploadPart`, and the checksum type for the multipart upload is `FULL_OBJECT`. In this scenario, the server no longer returns an error.
2. When no `x-amz-checksum-x` is provided for `UploadPart`, and `x-amz-sdk-checksum-algorithm` is also missing, the gateway now calculates the part checksum based on the multipart upload's checksum algorithm and stores it accordingly.
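A minimal sketch of the fallback in case 2, shown for CRC32 only; the function name and algorithm dispatch are illustrative:
```
package sketch

import (
	"errors"
	"hash"
	"hash/crc32"
	"io"
)

// partChecksum computes a part's checksum with the multipart upload's own
// algorithm when the client sends neither x-amz-checksum-* nor
// x-amz-sdk-checksum-algorithm. Only CRC32 is shown here; the gateway
// supports the other algorithms analogously.
func partChecksum(uploadAlgorithm string, part io.Reader) ([]byte, error) {
	var h hash.Hash
	switch uploadAlgorithm {
	case "CRC32":
		h = crc32.NewIEEE()
	default:
		return nil, errors.New("other algorithms elided in this sketch")
	}
	if _, err := io.Copy(h, part); err != nil {
		return nil, err
	}
	return h.Sum(nil), nil
}
```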
Additionally, the PR adds integration tests for:
* The two cases above
* The case where only `x-amz-sdk-checksum-algorithm` is provided
Fixes #1388
Fixes #1389
Fixes #1390
Fixes #1401
Adds `x-amz-copy-source` header validation for `CopyObject` and `UploadPartCopy` in the front-end.
The error:
```
ErrInvalidCopySource: {
Code: "InvalidArgument",
Description: "Copy Source must mention the source bucket and key: sourcebucket/sourcekey.",
HTTPStatusCode: http.StatusBadRequest,
},
```
is now deprecated.
The conditional read/write header validation in `CopyObject` will come with #821 and #822.
Fixes #1036
Fixes the issue where calling a non-existent root endpoint (POST /) returned `NoSuchBucket`. The gateway now returns the correct `MethodNotAllowed` error.
Add helper util auth.UpdateBucketACLOwner() that sets a new default ACL based on the new owner and removes the old bucket policy.
ChangeBucketOwner() remains in the backend.Backend interface in case there is ever a backend that needs to manage ownership in some way other than with bucket ACLs. The arguments are changing to clarify the updated owner. This will break any plugins implementing the old interface; they should use the new auth.UpdateBucketACLOwner() or implement the corresponding change specific to the backend.
This fixes a panic seen when there were a lot of multipart uploads in the same bucket, requiring multiple paginated responses. For example:
panic: runtime error: index out of range [11455] with length 1000
goroutine 418 [running]:
github.com/versity/versitygw/backend/posix.(*Posix).ListMultipartUploads(0xc0004300
/Users/ben/repo/versitygw/backend/posix/posix.go:2122 +0xd25
github.com/versity/versitygw/s3api/controllers.S3ApiController.ListActions({{0x183c
...
This change updates the ListMultipartUploads implementation to properly advance
past the (KeyMarker, UploadIDMarker) tuple when paginating, ensuring that each
response starts after the marker and does not include duplicate uploads.
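A minimal sketch of the marker advance, with an illustrative `upload` type in place of the gateway's actual structures:
```
package sketch

// upload is a stand-in for one in-progress multipart upload.
type upload struct {
	Key      string
	UploadID string
}

// afterMarker reports whether u sorts strictly after the
// (keyMarker, uploadIDMarker) tuple, so paginated responses resume past
// the marker instead of repeating or overrunning it.
func afterMarker(u upload, keyMarker, uploadIDMarker string) bool {
	if u.Key != keyMarker {
		return u.Key > keyMarker
	}
	return u.UploadID > uploadIDMarker
}

// nextPage returns up to max uploads from the sorted list, starting
// strictly after the marker tuple.
func nextPage(sorted []upload, keyMarker, uploadIDMarker string, max int) []upload {
	var out []upload
	for _, u := range sorted {
		if !afterMarker(u, keyMarker, uploadIDMarker) {
			continue
		}
		out = append(out, u)
		if len(out) == max {
			break
		}
	}
	return out
}
```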
We were incorrectly trying to pass the admin request actions through to the backend S3 service in s3proxy. This was resulting in internal server errors, since not all S3 backends would understand these requests. Instead, the gateway needs to handle these requests directly.
Fixes #1381
The CRC32, CRC32c, and CRC64NVME data integrity checksums support calculating
the composite full object values for multi-part uploads using the checksum
and length of the individual parts.
Previously, we were reading all of the part data to recalculate the full
object checksum values during the complete multipart upload call. This
disabled the optimized copy_file_range() for certain filesystems such as
XFS because the part data was being read. If the data is not read, and
the file handle is passed directly to io.Copy(), then the filesystem is
allowed to optimize the copying of the data from the source to destination
files.
This now enables both the copy_file_range() optimization and the data integrity features for the supported composite checksum types.
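A minimal sketch of the composite calculation; `crcCombine` is a placeholder for a real CRC combination routine (in the spirit of zlib's crc32_combine), elided here:
```
package sketch

// crcCombine stands in for a CRC combination routine: given the CRCs of
// two byte sequences and the length of the second, it returns the CRC of
// their concatenation without reading any data.
func crcCombine(crc1, crc2 uint32, len2 int64) uint32 {
	// ... GF(2) matrix math elided ...
	return 0
}

// fullObjectCRC derives the full-object checksum of a completed multipart
// upload from per-part CRCs and lengths alone, so the part data is never
// re-read and copy_file_range() remains available for the byte movement.
// Assumes at least one part.
func fullObjectCRC(partCRCs []uint32, partLens []int64) uint32 {
	full := partCRCs[0]
	for i := 1; i < len(partCRCs); i++ {
		full = crcCombine(full, partCRCs[i], partLens[i])
	}
	return full
}
```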
Implements a middleware that validates incoming bucket and object names before authentication. This helps prevent malicious attacks that attempt to access restricted or unreachable data in `POSIX`.
Adds test cases to cover such attack scenarios, including false negatives where encoded paths are used to try accessing resources outside the intended bucket.
Removes bucket validation from all other layers, including `controllers` and both the `POSIX` and `ScoutFS` backends, by moving the logic entirely into the middleware layer.
The HTTP body stream is not a seekable stream, so most operation retry attempts will fail with an internal server error. This change tells the S3 client within the gateway not to retry any requests, and instead lets the client of the gateway handle the error retry.
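A minimal sketch of disabling retries with aws-sdk-go-v2's `aws.NopRetryer`; the exact wiring in the gateway may differ:
```
package sketch

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// newNoRetryClient builds an S3 client that never retries; with a
// non-seekable request body a retry could not rewind the stream anyway.
func newNoRetryClient(ctx context.Context) (*s3.Client, error) {
	cfg, err := config.LoadDefaultConfig(ctx,
		config.WithRetryer(func() aws.Retryer { return aws.NopRetryer{} }),
	)
	if err != nil {
		return nil, err
	}
	return s3.NewFromConfig(cfg), nil
}
```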
Fixes #1353
We were using the metadata retrieval to check for existing
buckets during create, and then return either BucketAlreadyExists
or ErrBucketAlreadyOwnedByYou accordingly.
However, the metadata retrieval was returning success with a default ACL when the bucket metadata did not already exist, causing the gateway to always think the bucket existed.
The fix here is to let the metadata retrieval know that we do not want the default ACL in this case.
This fixes the listing of buckets when multi-tenant mode is enabled with a metadata bucket. The following behaviors are fixed:
* prevent listing of metadata bucket by all accounts
* prevent listing of non-owned buckets by user/userplus
* return correct BucketAlreadyExists/BucketAlreadyOwnedByYou
for attempts to create existing bucket
Fixes #1326
There were a couple of cases that would return an error for a non-existing bucket ACL instead of treating it as the default ACL.
This also cleans up the backends that were doing their own ACL parsing instead of using the auth.ParseACL() function.
Fixes #1304
Fixes #1288
If the checksum algorithm/type is not specified during multipart upload initialization, it is considered `null`, and the `ListParts` result should also set it to `null`.
Fixes #1276
Creates the custom `s3response.CopyObjectOutput` type to handle the `LastModified` date property formatting correctly. It uses `time.RFC3339` to format the date to match the format that S3 uses.
Common prefixes were originally stored in a `map[string]struct{}`, which was then converted to a slice and sorted. The new implementation stores the common prefixes in a `map[string]int`, where the map value is the index of the common prefix in the result slice. There's no need to sort the common prefixes array, as `fs.WalkDir` yields directories and files in sorted order.
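A minimal sketch of the index-map idea with illustrative names:
```
package sketch

// commonPrefixes preserves first-seen order without a final sort: the map
// value records each prefix's index in the ordered slice. Because the walk
// yields paths in sorted order, the slice is already sorted.
type commonPrefixes struct {
	index map[string]int
	list  []string
}

func newCommonPrefixes() *commonPrefixes {
	return &commonPrefixes{index: map[string]int{}}
}

func (c *commonPrefixes) add(prefix string) {
	if _, ok := c.index[prefix]; ok {
		return
	}
	c.index[prefix] = len(c.list)
	c.list = append(c.list, prefix)
}
```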
Fixes #1258
Fixes #1257
Closes #1244
Adds range queries support for `HeadObject`.
Fixes the range parsing logic for `GetObject`, which is used for `HeadObject` as well. Both actions follow the same rules for range parsing.
Fixes the error message returned by `GetObject`.
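A minimal sketch of the shared single-range parsing rules; names and error strings are illustrative, and the gateway's real parser maps failures to the proper S3 errors:
```
package sketch

import (
	"errors"
	"strconv"
	"strings"
)

// parseRange handles the single-range forms shared by GetObject and
// HeadObject: "bytes=A-B", "bytes=A-", and the suffix form "bytes=-N".
// Returned offsets are inclusive.
func parseRange(header string, size int64) (start, end int64, err error) {
	spec, ok := strings.CutPrefix(header, "bytes=")
	if !ok {
		return 0, 0, errors.New("invalid range unit")
	}
	first, last, ok := strings.Cut(spec, "-")
	if !ok {
		return 0, 0, errors.New("invalid range spec")
	}
	if first == "" {
		// Suffix form: the last N bytes of the object.
		n, err := strconv.ParseInt(last, 10, 64)
		if err != nil || n <= 0 {
			return 0, 0, errors.New("invalid suffix length")
		}
		if n > size {
			n = size
		}
		return size - n, size - 1, nil
	}
	start, err = strconv.ParseInt(first, 10, 64)
	if err != nil || start >= size {
		return 0, 0, errors.New("range not satisfiable")
	}
	end = size - 1
	if last != "" {
		end, err = strconv.ParseInt(last, 10, 64)
		if err != nil || end < start {
			return 0, 0, errors.New("invalid range spec")
		}
		// Clamp an over-long end to the last byte, per HTTP semantics.
		if end > size-1 {
			end = size - 1
		}
	}
	return start, end, nil
}
```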