Commit Graph

393 Commits

Author SHA1 Message Date
niksis02
b3ed7639f0 feat: implements conditional reads for GetObject and HeadObject
Closes #882

Implements conditional reads for `GetObject` and `HeadObject` in the gateway for both POSIX and Azure backends. The behavior is controlled by the `If-Match`, `If-None-Match`, `If-Modified-Since`, and `If-Unmodified-Since` request headers, where the first two perform ETag comparisons and the latter two compare against the object’s `LastModified` date. No validation is performed for invalid ETags or malformed date formats, and precondition date headers are expected to follow RFC1123; otherwise, they are ignored.

The Integration tests cover all possible combinations of conditional headers, ensuring the feature is 100% AWS S3–compatible.
2025-09-01 18:33:01 -07:00
niksis02
4c41b8be3b feat: changes cors implementation in s3 to store/retreive in meta bucket
The CORS actions were directly proxied in s3 proxy backend. The new implementation stores/retreives/deletes bucket cors configuration in `meta` bucket.
2025-08-28 01:43:11 +04:00
Ben McClelland
1eeb7de0b6 feat: add versioning dir option to scoutfs backend
This adds the same versioning dir option that is found in the
posix backend to scoutfs backend. Functionality is the same.
2025-08-26 11:20:35 -07:00
nitin
630651254f fix: update marker/continuation token to be the azure next marker
This changes the marker/continuation token from the object name
to the marker from the azure list objects pager. This is needed
because passing the object name as the token to the azure next
call causes the Azure API to throw 400 Bad Request with
InvalidQueryParameterValue. So we have to use the azure marker
for compatibility with the azure API pager.

To do this we have to align the s3 list objects request to the
Azure ListBlobsHierarchyPager. The v2 requests have an optional
startafter where we will have to page through the azure blobs
to find the correct starting point, but after this we will
only return with the single paginated results form the Azure
pager to maintain the correct markers all the way through to
Azure.

The ListObjects (non V2) assumes that the marker must be an object
name, so for this case we have to page through the azure listings
for each call to find the correct starting point. This makes the
V2 method far more efficient, but maintains correctness for the
ListObjects.

Also remove continuation token string checks in the integration
tests since this is supposed to be an opaque token that the
client should not care about. This will help to maintain the
tests for mutliple backend types.

Fixes #1457
2025-08-25 11:28:42 -07:00
niksis02
09031a30e5 feat: bucket cors implementation
Closes #1003

**Changes Introduced:**

1. **S3 Bucket CORS Actions**

   * Implemented the following S3 bucket CORS APIs:

     * `PutBucketCors` – Configure CORS rules for a bucket.
     * `GetBucketCors` – Retrieve the current CORS configuration for a bucket.
     * `DeleteBucketCors` – Remove CORS configuration from a bucket.

2. **CORS Preflight Handling**

   * Added an `OPTIONS` endpoint to handle browser preflight requests.
   * The endpoint evaluates incoming requests against bucket CORS rules and returns the appropriate `Access-Control-*` headers.

3. **CORS Middleware**

   * Implemented middleware that:

     * Checks if a bucket has CORS configured.
     * Detects the `Origin` header in the request.
     * Adds the necessary `Access-Control-*` headers to the response when the request matches the bucket CORS configuration.
2025-08-20 20:45:09 +04:00
nitin
0eadc3871e fix: add -1 to azure etag to avoid client sdk verfications
The C++ SDK (and maybe others?) assume that the S3 ETags
without a "-" in the string are MD5 checksums. So the Azure
ETag that does not have a "-" but also is not an MD5 checksum
will fail some of the sdk internal validation checks.

Fix this by appending "-1" to the ETag to make it look like
the multipart format ETag that will skip the sdk verfication
check.

Fixes: #1380

Co-authored-by: Ben McClelland <ben.mcclelland@versity.com>
2025-08-14 14:14:12 -07:00
Ben McClelland
e134f63ebc fix: add test cases and fix behavior for head/get range requests
This adds a bunch of test cases for non-0 len object, 0 len
object, and directory objects to match verified AWS responses
for the various range bytes cases.

This fixes the posix head/get range responses for these test
cases as well.
2025-08-12 14:46:58 -07:00
Ben McClelland
b0054fc415 Merge pull request #1435 from ondrap/pr2
Fix O_TMPFILE Linkat race, cleanup of scoutfs integration, fix MoveData non-aligned problem
2025-08-08 08:18:02 -07:00
Ondrej Palkovsky
f0858a47d5 Small cleanups. 2025-08-08 08:56:44 +02:00
Ondrej Palkovsky
298d4ec6b4 Merged scoutfs and posix ListObjects and ListObjectsV2 2025-08-08 08:37:16 +02:00
Ondrej Palkovsky
3934beae2f Lowercase err message. 2025-08-08 07:36:13 +02:00
Ondrej Palkovsky
936239b619 DRY of scoutfs integration, alignment testing for scoutfs.MoveData 2025-08-07 18:28:38 +02:00
Ondrej Palkovsky
e62337f055 Fix O_TMPFILE Linkat race. 2025-08-07 18:28:32 +02:00
Ondrej Palkovsky
8e6dd45ce5 Fix race in GetObject 2025-08-04 15:50:46 +02:00
niksis02
69ba00a25f fix: fixes the UploadPart failure with no precalculated checksum header for FULL_OBJECT checksum type
Fixes #1342

This PR includes two main changes:

1. It fixes the case where `x-amz-checksum-x` (precalculated checksum headers) are not provided for `UploadPart`, and the checksum type for the multipart upload is `FULL_OBJECT`. In this scenario, the server no longer returns an error.

2. When no `x-amz-checksum-x` is provided for `UploadPart`, and `x-amz-sdk-checksum-algorithm` is also missing, the gateway now calculates the part checksum based on the multipart upload's checksum algorithm and stores it accordingly.

Additionally, the PR adds integration tests for:

* The two cases above
* The case where only `x-amz-sdk-checksum-algorithm` is provided
2025-07-28 23:01:35 +04:00
niksis02
e5850ff11f feat: adds copy source validation for x-amz-copy-source header.
Fixes #1388
Fixes #1389
Fixes #1390
Fixes #1401

Adds the `x-amz-copy-source` header validation for `CopyObject` and `UploadPartCopy` in front-end.
The error:
```
	ErrInvalidCopySource: {
		Code:           "InvalidArgument",
		Description:    "Copy Source must mention the source bucket and key: sourcebucket/sourcekey.",
		HTTPStatusCode: http.StatusBadRequest,
	},
```
is now deprecated.

The conditional read/write headers validation in `CopyObject` should come with #821 and #822.
2025-07-22 14:40:11 -07:00
niksis02
dc16c0448f feat: implements integration tests for the new advanced router 2025-07-22 21:00:24 +04:00
niksis02
b7c758b065 feat: implements advanced routing for bucket POST and object PUT operations.
Fixes #1036

Fixes the issue when calling a non-existing root endpoint(POST /) the gateway returns `NoSuchBucket`. Now it returns the correct `MethodNotAllowed` error.
2025-07-22 20:55:22 +04:00
Ben McClelland
9cc29af073 Merge pull request #1382 from versity/ben/s3proxy-change-bucket-owner
fix: admin bucket actions for s3proxy
2025-07-09 16:37:37 -07:00
Ben McClelland
f295df2217 fix: add new auth method to update ownership within acl
Add helper util auth.UpdateBucketACLOwner() that sets new
default ACL based on new owner and removes old bucket policy.

The ChangeBucketOwner() remains in the backend.Backend
interface in case there is ever a backend that needs to manage
ownership in some other way than with bucket ACLs. The arguments
are changing to clarify the updated owner. This will break any
plugins implementing the old interface. They should use the new
auth.UpdateBucketACLOwner() or implement the corresponding
change specific for the backend.
2025-07-09 16:16:34 -07:00
Ben McClelland
cbd3eb1cd2 fix: ListMultipartUploads pagination panic and duplicate results
This fixes a panic seen when there were a lot of multipart uploads in the
same bucket requiring multiple paginated responses. for example:
panic: runtime error: index out of range [11455] with length 1000
goroutine 418 [running]:
github.com/versity/versitygw/backend/posix.(*Posix).ListMultipartUploads(0xc0004300
/Users/ben/repo/versitygw/backend/posix/posix.go:2122 +0xd25
github.com/versity/versitygw/s3api/controllers.S3ApiController.ListActions({{0x183c
...

This change updates the ListMultipartUploads implementation to properly advance
past the (KeyMarker, UploadIDMarker) tuple when paginating, ensuring that each
response starts after the marker and does not include duplicate uploads.
2025-07-09 15:36:16 -07:00
Ben McClelland
c196b5f999 fix: admin bucket actions for s3proxy
We were incorrctly trying to pass through the admin request
actions through to the backend s3 service in s3proxy. This
was resulting in internal server errors since not all s3
backends would understand these requests. Instead the
gateway needs to handle these requests directly.

Fixes #1381
2025-07-09 09:13:14 -07:00
Ben McClelland
36509daec7 chore: use time.Equal for s3proxy time equality checks
Fixes lint warnings related to using time.Equal instead of == for
time equality checks.
2025-07-07 14:20:36 -07:00
Ben McClelland
7b8b483dfc feat: calculate full object crc for multi-part uploads for compatible checksums
The CRC32, CRC32c, and CRC64NVME data integrity checksums support calculating
the composite full object values for multi-part uploads using the checksum
and length of the individual parts.

Previously, we were reading all of the part data to recalculate the full
object checksum values during the complete multipart upload call. This
disabled the optimized copy_file_range() for certain filesystems such as
XFS because the part data was being read. If the data is not read, and
the file handle is passed directly to io.Copy(), then the filesystem is
allowed to optimize the copying of the data from the source to destination
files.

This now allows both the optimized copy_file_range() optimizations as well
as the data integrity features enabled for support composite checksum types.
2025-07-03 19:58:53 -07:00
niksis02
98a7b7f402 feat: adds a middleware to validate bucket/object names
Implements a middleware that validates incoming bucket and object names before authentication. This helps prevent malicious attacks that attempt to access restricted or unreachable data in `POSIX`.

Adds test cases to cover such attack scenarios, including false negatives where encoded paths are used to try accessing resources outside the intended bucket.

Removes bucket validation from all other layers—including `controllers` and both `POSIX` and `ScoutFS` backends — by moving the logic entirely into the middleware layer.
2025-07-04 00:55:03 +04:00
Ben McClelland
b09efa532c Merge pull request #1370 from versity/ben/s3-client-retry
fix: prevent internal request retry to s3proxy backend
2025-07-03 11:39:06 -07:00
Ben McClelland
0d73e3ebe2 fix: prevent internal request retry to s3proxy backend
The http body stream is not a seekable stream, so most operation
retry attempts will fail with an internal server error. This
change tells the s3 client within the gateway to not retry any
requests, and instead let the client of the gateway handle the
error retry.

Fixes #1353
2025-07-03 10:20:44 -07:00
Ben McClelland
5ba5327ba6 fix: s3proxy create bucket always returning BucketAlreadyExists
We were using the metadata retrieval to check for existing
buckets during create, and then return either BucketAlreadyExists
or ErrBucketAlreadyOwnedByYou accordingly.

Howver, the metadata retrieval was returning success with a
default ACL when the bucket metadata did not already exist
causing the gateway to always think this bucket existed.

Fix here is to let the metadata retrieval know that we do not
want the default ACL for this case.
2025-07-02 16:29:28 -07:00
Ben McClelland
6541232a2d fix: s3 backend user bucket listing
This fixes the listing of buckets when multi tenant mode is
enabled with a metadata bucket. The following behavior changes
are fixed:
* prevent listing of metadata bucket by all accounts
* prevent listing of non-owned buckets by user/userplus
* return correct BucketAlreadyExists/BucketAlreadyOwnedByYou
for attempts to create existing bucket

Fixes #1326
2025-06-19 10:19:29 -07:00
Ben McClelland
32c6f2e463 fix: non existing bucket acl parsing
There were a couple of cases that would return an error for the
non existing bucket acl instead of treating that as the default
acl.

This also cleans up the backends that were doing their own
acl parsing instead of using the auth.ParseACL() function.

Fixes #1304
2025-05-20 13:46:20 -07:00
niksis02
3e50e29306 fix: overrides empty checksum type and algorithm with 'null' for ListParts
Fixes #1288

If the checksum algorithm/type is not specified during multipart upload initialization, it is considered `null`, and the `ListParts` result should also set it to `null`.
2025-05-14 08:22:45 -07:00
niksis02
afbcbcac13 fix: fixes all the available actions date xml marshalling for response body.
Fixes the response body parsing for all available actions to correctly parse date fields (e.g., `LastModified`) into the correct format.
2025-05-13 23:59:59 +04:00
Ben McClelland
d3bcd8ffc5 Merge pull request #1289 from versity/sis/copy-object-date
fix: fixes the LastModified date formatting in CopyObject result.
2025-05-12 13:15:47 -07:00
niksis02
323717bcf1 fix: fixes the LastModified date formatting in CopyObject result.
Fixes #1276

Creates the custom `s3response.CopyObjectOutput` type to handle the `LastModified` date property formatting correctly. It uses `time.RFC3339` to format the date to match the format that s3 uses.
2025-05-12 23:30:47 +04:00
niksis02
d3585e6c1c feat: optimizes backend.Walk and backend.WalkVersions to avoid sorting the common prefixes.
Common prefixes were originally stored in a `map[string]struct{}`, which was then converted to a slice and sorted. The new implementation stores the common prefixes in a `map[string]int`, where the map value represents the index of the common prefix. There's no need to sort the common prefixes array, as `fs.WalkDir` comes with sorted directories and files.
2025-05-10 01:59:39 +04:00
niksis02
3740d79173 fix: adds the surrounding quotes on ETag in PutObject for dir objects and in UploadPartCopy.
Fixes #1277
Fixes #1235

Adds surrounding quotes on `ETag` when creating a directory object. Adds the quotes in `UploadPartCopy` as well.
2025-05-09 00:29:23 +04:00
sebastian-heinz
42013d365b use path style 2025-05-06 10:28:16 +08:00
Ben McClelland
4b34ef1a5f Merge pull request #1263 from versity/sis/headobject-range
fix: fixes the range parsing for GetObject. Adds range query support for HeadObject.
2025-05-05 12:23:15 -07:00
niksis02
dfa1ed2358 fix: fixes the range parsing for GetObject. Adds range query support for HeadObject.
Fixes #1258
Fixes #1257
Closes #1244

Adds range queries support for `HeadObject`.
Fixes the range parsing logic for `GetObject`, which is used for `HeadObject` as well. Both actions follow the same rules for range parsing.

Fixes the error message returned by `GetObject`.
2025-05-05 22:41:12 +04:00
Ben McClelland
a60d6a7faa fix: scoutfs racing mutlipart uploads internal error
When multiple uploads with the same object key are racing, we can
end up with an EEXIST when trying to link the final object into
the namespace. When this happens, we should just remove the
existing file and try again since the semantics are that the
last upload should win.
2025-05-03 09:30:45 -07:00
Ben McClelland
a29f7b1839 fix: scoutfs missing ListObjectsV2() start after
This brings ListObjectsV2 for scoutfs in sync with posix to handle
the start after and continuation token ases.
2025-05-03 09:15:01 -07:00
Ben McClelland
6321406008 fix: scoutfs missing ListObjects() response fields
This fixes some tests that were fialing due to missing response
fields in ListObjects().
2025-05-03 09:07:56 -07:00
Ben McClelland
a9fcf63063 feat: cleanup calling of debuglogger with managed debug setting 2025-05-02 17:05:59 -07:00
Ben McClelland
9f13b544f7 fix: scoutfs etag check for multipart uploads
The Etag can be quoted or not, so the check to verify the part
Etag must remove the quotes before checking for equality. This
check is the same now as posix.
2025-05-02 10:07:47 -07:00
Ben McClelland
9244e9100d fix: xml response field names for complete multipart upload
The xml encoding for the s3.CompleteMultipartUploadOutput response
type was not producing exactly the right field names for the
expected complete multipart upload result.

This change follows the pattern we have had to do for other xml
responses to create our own type that will encode better to the
expected response.

This will change the backend.Backend interface, so plugins and
other backends will have to make the corresponding changes.
2025-04-30 14:36:48 -07:00
Ben McClelland
4eba4e031c Merge pull request #1251 from versity/sis/uploadpart-etag-quotes
fix: adds quotes to part Etag in UploadPart
2025-04-30 14:35:34 -07:00
niksis02
32faf9a4c3 fix: adds quotes to part Etag in UploadPart
Fixes #1233

Add double quotes to the `ETag` in `UploadPart`.
2025-04-30 23:26:18 +04:00
Timothy Tschampel
dea4b6382f add additional constructor with s3.Client instance 2025-04-29 09:10:54 -07:00
niksis02
5e6056467e fix: fixes tagging string parsing for PutObject, CopyObject and CreateMultipartUpload
Fixes #1215
Fixes #1216

`PutObject`, `CopyObject` and `CreateMultipartUpload` accept tag string as an http request header which should be url-encoded. The tag string should be a valid url-encoded string and each key/value pair should be valid, otherwise they should fail with `APIError`.

If the provided tag set contains duplicate `keys` the calls should fail with the same `InvalidURLEncodedTagging` error.

Not all url-encoded characters are supported by `S3`. The tagging string should contain only `letters`, `digits` and the following special chars:
- `-`
- `.`
- `/`
- `_`
- `+`
- ` `(space)

And their url-encoded versions: e.g. `%2F`(/), `%2E`(.) ... .

If the provided tagging string contains invalid `key`/`value`, the calls should fail with the following errors respectively:
`invalid key` - `(InvalidTag) The TagKey you have provided is invalid`
`invalid value` - `(InvalidTag) The TagValue you have provided is invalid`
2025-04-28 20:28:20 +04:00
Ben McClelland
b4486b095d Merge pull request #1234 from versity/sis/tagging-parse-errs
fix: handles tag parsing error cases for PutBucketTagging and PutObjectTagging
2025-04-23 14:51:37 -07:00