Commit Graph

68 Commits

Author SHA1 Message Date
niksis02
d861dc8e30 fix: fixes unsigned streaming upload parsing and checksum calculation
Fixes #1600
Fixes #1603
Fixes #1607
Fixes #1626
Fixes #1632
Fixes #1652
Fixes #1653
Fixes #1656
Fixes #1657
Fixes #1659

This PR focuses mainly on unsigned streaming payload **trailer request payload parsing** and **checksum calculation**. For streaming uploads, there are essentially two ways to specify checksums:

1. via `x-amz-checksum-*` headers,
2. via `x-amz-trailer`,
   or none — in which case the checksum should default to **crc64nvme**.

Previously, the implementation calculated the checksum only from `x-amz-checksum-*` headers. Now, `x-amz-trailer` is also treated as a checksum-related header and indicates the checksum algorithm for streaming requests. If `x-amz-trailer` is present, the payload must include a trailing checksum; otherwise, an error is returned.

`x-amz-trailer` and any `x-amz-checksum-*` header **cannot** be used together — doing so results in an error.

If `x-amz-sdk-checksum-algorithm` is specified, then either `x-amz-trailer` or one of the `x-amz-checksum-*` headers must also be present, and the algorithms must match. If they don’t, an error is returned.

The old implementation used to return an internal error when no `x-amz-trailer` was received in streaming requests or when the payload didn’t contain a trailer. This is now fixed.

Checksum calculation used to happen twice in the gateway (once in the chunk reader and once in the backend). A new `ChecksumReader` is introduced to prevent double computation, and the trailing checksum is now read by the backend from the chunk reader. The logic for stacking `io.Reader`s in the Fiber context is preserved, but extended: once a `ChecksumReader` is stacked, all following `io.Reader`s are wrapped with `MockChecksumReader`, which simply delegates to the underlying checksum reader. In the backend, a simple type assertion on `io.Reader` provides the necessary checksum metadata (algorithm, value, etc.).
2025-12-03 01:32:18 +04:00
niksis02
eae11b44c5 fix: adds versionId validation for object level actions
Fixes #1630

S3 returns `InvalidArgument: Invalid version id specified` for invalid version IDs in object-level actions that accept `versionId` as a query parameter. The `versionId` in S3 follows a specific structure, and if the input string doesn’t match this structure, the error is returned. In the gateway, the `versionId` is generated using the `ulid` package, which also has a defined structure. This PR adds validation for object-level operations that work with object versions by using the ULID parser.

These actions include: `HeadObject`, `GetObject`, `PutObjectTagging`, `GetObjectTagging`, `DeleteObjectTagging`, `PutObjectLegalHold`, `GetObjectLegalHold`, `PutObjectRetention`, `GetObjectRetention`, `DeleteObject`, `CopyObject`, `UploadPartCopy`, and `GetObjectAttributes`.
2025-11-11 22:23:50 +04:00
Ben McClelland
8a733b8cbf Merge pull request #1605 from versity/sis/mp-metadata
fix: makes object metadata keys lowercase in object creation actions
2025-10-28 22:01:47 -07:00
niksis02
8c3e49d0bb fix: fixes checksum header and algorithm mismatch error
Fixes #1598

`PutObject` and `UploadPart` accept x-amz-checksum-* calculated checksum headers and `x-amz-sdk-checksum-algorithm`. If the checksum algorithm specified in sdk algorithm doesn't match the one in x-amz-checksum-*, it now returns the correct error message: `Value for x-amz-sdk-checksum-algorithm header is invalid.`.
2025-10-28 14:40:28 -07:00
niksis02
045bdec60c fix: makes object metadata keys lowercase in object creation actions
Fixes #1482

The metadata keys should always be converted to lowercase in `PutObject`, `CreateMultipartUpload`, and `CopyObject`. This implementation converts the metadata keys to lowercase in the front end, ensuring they are stored in lowercase in the backend.
2025-10-29 01:09:24 +04:00
niksis02
12f4920c8d feat: implements checksum calculation for all actions
Closes #1549
Fixes #1593
Fixes #1521
Fixes #1427
Fixes #1311
Fixes #1301
Fixes #1040

This PR primarily focuses on checksum calculation within the gateway, but it also includes several related fixes and improvements.

It introduces a middleware responsible for handling and calculating checksums for the `x-amz-checksum-*` headers and `Content-MD5`. The middleware is applied only to actions that expect a request body or checksum headers. It also enforces validation for actions that require a non-empty request body, returning an error if the body is missing. Similarly, it returns an error for actions where at least one checksum header (`Content-MD5` or `x-amz-checksum-*`) is required but none is provided.
The implementation is based on [https://gist.github.com/niksis02/eec3198f03e561a0998d67af75c648d7](the reference table), tested directly against S3:

It also fixes the error case where the `x-amz-sdk-checksum-algorithm` header is present but no corresponding `x-amz-checksum-*` or `x-amz-trailer` header is included.

Additionally, the PR improves validation for the `x-amz-content-sha256` header. For actions that require this header, an error is now returned when it’s missing. For actions that don’t require it, the middleware no longer enforces its presence. Following the common S3 pattern, the header remains mandatory for admin routes.

Finally, the `x-amz-content-sha256` header is now optional for anonymous requests, as it is not required in that case.
2025-10-25 01:58:03 +04:00
Ben McClelland
69a3483269 Merge pull request #1592 from versity/sis/bucket-object-tag-validation
fix: fixes the bucket/object tagging key/value name validation
2025-10-20 12:21:01 -07:00
niksis02
ebf7a030cc fix: fixes the bucket/object tagging key/value name validation
Fixes #1579

S3 enforces a specific rule for validating bucket and object tag key/value names. This PR integrates the regexp pattern used by S3 for tag validation.
Official S3 documentation for tag validation rules: [AWS S3 Tag](https://docs.aws.amazon.com/AmazonS3/latest/API/API_control_Tag.html)

There are two types of tagging inputs for buckets and objects:

1. **On existing buckets/objects** — used in the `PutObjectTagging` and `PutBucketTagging` actions, where tags are provided in the request body.
2. **On object creation** — used in the `PutObject`, `CreateMultipartUpload`, and `CopyObject` actions, where tags are provided in the request headers and must be URL-encoded.

This implementation ensures correct validation for both types of tag inputs.
2025-10-20 15:19:38 +04:00
niksis02
24679a82ac fix: fixes the composite checksums in CompleteMultipartUpload
Fixes #1359

The composite checksums in **CompleteMultipartUpload** generally follow the format `checksum-<number_of_parts>`. Previously, the gateway treated composite checksums as regular checksums without distinguishing between the two formats.

In S3, the `x-amz-checksum-*` headers accept both plain checksum values and the `checksum-<number_of_parts>` format. However, after a successful `CompleteMultipartUpload` request, the final checksum is always stored with the part number included.

This implementation adds support for parsing both formats—checksums with and without the part number. From now on, composite checksums are consistently stored with the part number included.

Additionally, two integration tests are added:

* One verifies the final composite checksum with part numbers.
* Another ensures invalid composite checksums are correctly rejected.
2025-10-17 16:45:07 +04:00
niksis02
d15d348226 fix: fixes the response header names normalizing
Fixes #1484

Removes response header name normalization to prevent Fiber from converting them to camel case. Also fixes the `HeadBucket` response headers by changing their capital letters to lowercase and corrects the `x-amz-meta` headers to use lowercase instead of camel case.
2025-10-15 01:27:53 +04:00
Ben McClelland
4c3965d87e feat: add option to disable strict bucket name checks
Some systems may choose to allow non-aws compliant bucket names
and/or handle the bucket naem validation in the backend instead.
This adds the option to turn off the strict bucket name validation
checks in the frontend API handlers.

When frontend bucket name validation is disabled, we need to do
sanity checks for posix compliant names in the posix/scoutfs
backends. This is automatically enabled when strict bucket
name validation is disabled.

Fixes #1564
2025-10-08 14:34:52 -07:00
niksis02
ebdda06633 fix: adds BadDigest error for incorrect Content-Md5 s
Closes #1525

* Adds validation for the `Content-MD5` header.
  * If the header value is invalid, the gateway now returns an `InvalidDigest` error.
  * If the value is valid but does not match the payload, it returns a `BadDigest` error.
* Adds integration test cases for `PutBucketCors` with `Content-MD5`.
2025-09-19 19:51:23 +04:00
niksis02
7a098b925f feat: implement conditional writes
Closes #821

**Implements conditional operations across object APIs:**

* **PutObject** and **CompleteMultipartUpload**:
  Supports conditional writes with `If-Match` and `If-None-Match` headers (ETag comparisons).
  Evaluation is based on an existing object with the same key in the bucket. The operation is allowed only if the preconditions are satisfied. If no object exists for the key, these headers are ignored.

* **CopyObject** and **UploadPartCopy**:
  Adds conditional reads on the copy source object with the following headers:

  * `x-amz-copy-source-if-match`
  * `x-amz-copy-source-if-none-match`
  * `x-amz-copy-source-if-modified-since`
  * `x-amz-copy-source-if-unmodified-since`
    The first two are ETag comparisons, while the latter two compare against the copy source’s `LastModified` timestamp.

* **AbortMultipartUpload**:
  Supports the `x-amz-if-match-initiated-time` header, which is true only if the multipart upload’s initialization time matches.

* **DeleteObject**:
  Adds support for:

  * `If-Match` (ETag comparison)
  * `x-amz-if-match-last-modified-time` (LastModified comparison)
  * `x-amz-if-match-size` (object size comparison)

Additionally, this PR updates precondition date parsing logic to support both **RFC1123** and **RFC3339** formats. Dates set in the future are ignored, matching AWS S3 behavior.
2025-09-09 01:55:38 +04:00
Ben McClelland
24b1c45db3 cleanup: move debuglogger to top level for full project access
The debuglogger should be a top level module since we expect
all modules within the project to make use of this. If its
hidden in s3api, then contributors are less likely to make
use of this outside of s3api.
2025-09-01 20:02:02 -07:00
niksis02
b3ed7639f0 feat: implements conditional reads for GetObject and HeadObject
Closes #882

Implements conditional reads for `GetObject` and `HeadObject` in the gateway for both POSIX and Azure backends. The behavior is controlled by the `If-Match`, `If-None-Match`, `If-Modified-Since`, and `If-Unmodified-Since` request headers, where the first two perform ETag comparisons and the latter two compare against the object’s `LastModified` date. No validation is performed for invalid ETags or malformed date formats, and precondition date headers are expected to follow RFC1123; otherwise, they are ignored.

The Integration tests cover all possible combinations of conditional headers, ensuring the feature is 100% AWS S3–compatible.
2025-09-01 18:33:01 -07:00
Ben McClelland
8cad7fd6d9 feat: add response header overrides for GetObject
GetObject allows overriding response headers with the following
paramters:
response-cache-control
response-content-disposition
response-content-encoding
response-content-language
response-content-type
response-expires

This is only valid for signed (and pre-singed) requests. An error
is returned for anonymous requests if these are set.

More info on the GetObject overrides can be found in the GetObject
API reference.

This also clarifies the naming of the AccessOptions IsPublicBucket
to IsPublicRequest to indicate this is a public access request
and not just accessing a bucket that allows public access.

Fixes #1501
2025-08-30 14:13:20 -07:00
niksis02
e18c4f4080 fix: ignores special checksum headers when parsing x-amz-checksum-x headers
Fixes #1345

The previous implementation incorrectly parsed the `x-amz-sdk-checksum-algorithm` header for the `CompleteMultipartUpload` operation, even though this header is not expected and should be ignored. It also mistakenly treated the `x-amz-checksum-algorithm` header as an invalid value for `x-amz-checksum-x`.

The updated implementation only parses the `x-amz-sdk-checksum-algorithm` header for `PutObject` and `UploadPart` operations. Additionally, `x-amz-checksum-algorithm` and `x-amz-checksum-type` headers are now correctly ignored when parsing the precalculated checksum headers (`x-amz-checksum-x`).
2025-07-26 01:33:00 +04:00
niksis02
3363988206 fix: makes checksum type and algorithm case insensitive in CreateMultipartUpload
Fixes #1339

`x-amz-checksum-type` and `x-amz-checksum-algorithm` request headers should be case insensitive in `CreateMultipartUpload`.

The changes include parsing the header values to upper case before validating and passing to back-end. `x-amz-checksum-type` response header was added in`CreateMultipartUpload`, which was missing before.
2025-07-25 20:35:26 +04:00
niksis02
e5850ff11f feat: adds copy source validation for x-amz-copy-source header.
Fixes #1388
Fixes #1389
Fixes #1390
Fixes #1401

Adds the `x-amz-copy-source` header validation for `CopyObject` and `UploadPartCopy` in front-end.
The error:
```
	ErrInvalidCopySource: {
		Code:           "InvalidArgument",
		Description:    "Copy Source must mention the source bucket and key: sourcebucket/sourcekey.",
		HTTPStatusCode: http.StatusBadRequest,
	},
```
is now deprecated.

The conditional read/write headers validation in `CopyObject` should come with #821 and #822.
2025-07-22 14:40:11 -07:00
niksis02
394675a5a8 feat: implements unit tests for controller utilities 2025-07-22 20:55:23 +04:00
niksis02
ba76aea17a feat: adds unit tests for the object HEAD and GET controllers. 2025-07-22 20:55:22 +04:00
niksis02
d2038ca973 feat: implements advanced routing for HeadObject and bucket PUT operations. 2025-07-22 20:55:22 +04:00
Ben McClelland
003bf5db0b fix: convert deprecated fasthttp VisitAll() to All()
An update to fasthttp has deprecated the VisitAll() method
for an iterator function All() that can be used to range over
all headers.
This should fix the staticcheck warnings for calling the
deprecated function.
2025-07-07 22:34:01 -07:00
niksis02
dbc710da2d feat: implements host-style bucket addressing in the gateway.
Closes #803

Implements host-style bucket addressing in the gateway. This feature can be enabled by running the gateway with the `--virtual-domain` flag and specifying a virtual domain name.
Example:

```bash
    ./versitygw -a user -s secret --virtual-domain localhost:7070 posix /tmp/vgw
```

The implementation follows this approach: it introduces a middleware (`HostStyleParser`) that parses the bucket name from the `Host` header and appends it to the URL path. This effectively transforms the request into a path-style bucket addressing format, which the gateway already supports. With this design, the gateway can handle both path-style and host-style requests when running in host-style mode.

For local testing, one can either set up a local DNS server to wildcard-match all subdomains of a specified domain and resolve them to the local IP address, or manually add entries to `/etc/hosts` to resolve bucket-prefixed hosts to the server IP (e.g., `127.0.0.1`).
2025-05-22 00:36:45 +04:00
Ben McClelland
a9fcf63063 feat: cleanup calling of debuglogger with managed debug setting 2025-05-02 17:05:59 -07:00
niksis02
f831578d51 fix: handles tag parsing error cases for PutBucketTagging and PutObjectTagging
Fixes #1214
Fixes #1231
Fixes #1232

Implements `utils.ParseTagging` which is a generic implementation of parsing tags for both `PutObjectTagging` and `PutBucketTagging`.

- The actions now return `MalformedXML` if the provided request body is invalid.
- Adds validation to return `InvalidTag` if duplicate keys are present in tagging.
- For invalid tag keys, it creates a new error: `ErrInvalidTagKey`.
2025-04-23 20:35:19 +04:00
niksis02
bbb5a22c89 feat: makes debug loggin prettier. Adds missing logs in FE and utility functions
Added missing debug logs in the `front-end` and `utility` functions.
Enhanced debug logging with the following improvements:

- Each debug message is now prefixed with [DEBUG] and appears in color.
- The full request URL is printed at the beginning of each debug log block.
- Request/response details are wrapped in framed sections for better readability.
- Headers are displayed in a colored box.
- XML request/response bodies are pretty-printed with indentation and color.
2025-04-17 22:46:05 +04:00
niksis02
ed44fe1969 fix: Handles the error cases for empty checksum headers for PutObject and UploadPart
Fixes #1186
Fixes #1188
Fixes #1189

If multiple checksum headers are provided, no matter if they are empty or not, the gateway should return `(InvalidRequest): Expecting a single x-amz-checksum- header. Multiple checksum Types are not allowed.`
An empty checksum header is considered as invalid, because it's not valid crc32, crc32c ...
2025-04-04 23:17:22 +04:00
niksis02
832371afb1 fix: Fixes the case for GetObjectAttributes to return InvalidArgument if a single invalid object attribute is provided.
Fixes #1000

`GetObjectAttributes` returned `InvalidRequest` instead of `InvalidArgument` with description `Invalid attribute name specified.`.
Fixes the logic in `ParseObjectAttributes` to ignore empty values for `X-Amz-Object-Attributes` headers to return `InvalidArgument` if all the specified object attributes are invalid.
2025-03-28 07:27:35 +04:00
niksis02
9e0f56f807 fix: Fixes the returned error type for object legal hold status and object lock mode in PutObject, CopyObject and CreateMultipartUpload.
Fixes #1141
Fixes #1142

Changes the error type to `InvalidArgument` for `x-amz-object-lock-legal-hold` and `x-amz-object-lock-mode` headers invalid values.
2025-03-18 13:58:49 +04:00
niksis02
7d6505ec06 fix: Adds validation for x-amz-checksum- headers. Makes x-amz-sdk-checksum-algorithm header case insensitive 2025-03-05 22:06:20 +04:00
Ben McClelland
85ba390ebd fix: utils StreamResponseBody() memory use for large get requests
The StreamResponseBody() called ctx.Write() in a loop with a small
buffer in an attempt to stream data back to client. But the
ctx.Write() was just calling append buffer to the response instead
of streaming the data back to the client.

The correct way to stream the response back is to use
(ctx *fasthttp.RequestCtx).SetBodyStream() to set the body stream
reader, and the response will automatically get streamed back
using the reader. This will also call Close() on our body
since we are providing an io.ReadCloser.

Testing this should be done with single large get requests such as
aws s3api get-object --bucket bucket --key file /tmp/data
for very large objects. The testing shows significantly reduced
memory usage for large objects once the streaming is enabled.

Fixes #1082
2025-02-26 11:20:41 -08:00
niksis02
e5811e4ce7 fix: Fixes the entity limiter validation for ListObjects(V2), ListParts, ListMultipartUploads, ListBuckets actions 2025-02-20 15:45:42 +04:00
niksis02
132d0ae631 feat: Adds the CRC64NVME checksum support in the gateway. Adds checksum-type support for the checksum implementation 2025-02-16 17:10:06 +04:00
niksis02
6956757557 feat: Integrates object integrity checksums(CRC32, CRC32C, SHA1, SHA256) into the gateway 2025-02-14 14:14:00 +04:00
niksis02
c094086d83 fix: Fixes the response body streaming for GetObject, implementing a chunk streamer 2025-01-15 23:11:04 +04:00
jonaustin09
66c13ef982 fix: Adds a check to ensure the x-amz-object-attributes header is set and non-empty. 2024-10-31 17:05:54 -04:00
jonaustin09
06e2f2183d fix: Changes GetObjectAttributes action xml encoding root element to GetObjectAttributesResponse. Adds input validation for x-amz-object-attributes header. Adds x-amz-delete-marker and x-maz-version-id headers for GetObjectAttributes action. Adds VersionId in HeadObject response, if it's not specified in the request 2024-10-30 15:42:15 -04:00
jonaustin09
3b903f6044 fix: Fixes max-parts, max-keys, max-uploads validation defaulting to 1000 2024-10-22 14:28:50 -04:00
jonaustin09
600aca8bdc fix: Fixed the request uri path escape to support object key special characters 2024-09-17 13:28:30 -04:00
Jon Austin
d79f978df9 feat: Added the standard storage class to all the available get/list actions responses in posix. (#765) 2024-08-27 15:28:40 -07:00
jonaustin09
7545e6236c feat: Implement bucket ownership controls
Bucket ACLs are now disabled by default the same as AWS.
By default the object ownership is BucketOwnerEnforced
which means that bucket ACLs are disabled. If one attempts
to set bucket ACL the following error is returned both in
the gateway and on AWS:
	ErrAclNotSupported: {
		Code:           "AccessControlListNotSupported",
		Description:    "The bucket does not allow ACLs",
		HTTPStatusCode: http.StatusBadRequest,
	},

ACls can be enabled with PutBucketOwnershipControls

Changed bucket canned ACL translation

New backend interface methods:
PutBucketOwnershipControls
GetBucketOwnershipControls
DeleteBucketOwnershipControls

Added these to metrics
2024-06-28 21:03:09 -07:00
jonaustin09
43f509d971 fix: Added missing properties support for CreateMultipartUpload action: ContentType, ObjectLock, Tagging, Metadata 2024-05-22 12:16:55 -07:00
Ben McClelland
3fc8956baf fix: increase valid timestampe window from 1 to 15 minutes
According to:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#RESTAuthenticationTimeStamp
The valid time wondow for authenticated requests is 15 minutes,
and when outside of that window should return RequestTimeTooSkewed.
2024-05-01 13:56:34 -07:00
jonaustin09
0c3771ae2d feat: Added GetObjectAttributes actions implementation in posix, azure and s3 backends. Added integration tests for GetObjectAttributes action 2024-04-29 15:31:53 -04:00
jonaustin09
fbaba0b944 feat: Added object WORM protection by object-lock feature from AWS with the following actions support: PutObjectLockConfiguration, GetObjectLockConfiguration, PutObjectRetention, GetObjectRetention, PutObjectLegalHold, GetObjectLegalHold 2024-04-22 13:13:40 -07:00
Ben McClelland
dac69caac3 fix: escape path and query for presign signature validation
fixes #462
2024-03-18 15:16:17 -07:00
Ben McClelland
b555c92940 fix: include all request signed headers in signature canonical string
Fixes #457. There are some buggy clients that include headers not
actually set on the request in the signed headers list. For these
we need to include them in the signature canoncal string with
empty values.
2024-03-14 09:56:36 -07:00
jonaustin09
e21e514997 feat: Added 20 integration tests for v4 authentication with query params. Fixed few bugs in v4 query params authentication 2024-02-12 16:31:01 -05:00
jonaustin09
be17b3fd33 feat: Closes #355. Added support for presigned URLs, particularly v4 authentication with query params 2024-02-07 09:17:35 -05:00