Commit Graph

8 Commits

Author SHA1 Message Date
Ben McClelland e137e8d375 fix: connection early termination resulting in internal error
When the connection terminates before all bytes read, we were
getting an io.ErrUnexpectedEOF that was not being handled as
a standard io.EOF resulting in an internal error being raised.
Translate io.ErrUnexpectedEOF to io.EOF so that we return the
normal errors for unexpected content. Add a log message so
that its clear the error is due to the connection being
terminated before all data sent and not the fault of the
gateway.
2026-06-01 10:12:41 -07:00
niksis02 9f786b3c2c feat: global error refactoring
Fixes #2123
Fixes #2120
Fixes #2116
Fixes #2111
Fixes #2108
Fixes #2086
Fixes #2085
Fixes #2083
Fixes #2081
Fixes #2080
Fixes #2073
Fixes #2072
Fixes #2071
Fixes #2069
Fixes #2044
Fixes #2043
Fixes #2042
Fixes #2041
Fixes #2040
Fixes #2039
Fixes #2036
Fixes #2035
Fixes #2034
Fixes #2028
Fixes #2020
Fixes #1842
Fixes #1810
Fixes #1780
Fixes #1775
Fixes #1736
Fixes #1705
Fixes #1663
Fixes #1645
Fixes #1583
Fixes #1526
Fixes #1514
Fixes #1493
Fixes #1487
Fixes #959
Fixes #779
Closes #823
Closes #85

Refactor global S3 error handling around structured error types and centralized XML response generation.

All S3 errors now share the common APIError base for the fields every error has: Code, HTTP status code, and Message. Non-traditional errors that need AWS-compatible XML fields now have dedicated typed errors in the s3err package. Each typed error implements the shared S3Error behavior so controllers and middleware can handle errors consistently while still emitting error-specific XML fields.

Add a dedicated InvalidArgumentError type because InvalidArgument is used widely across request validation, auth, copy source handling, object lock validation, multipart validation, and header parsing. The new InvalidArgument path uses explicit InvalidArgErrorCode constants with predefined descriptions and ArgumentName values, keeping call sites readable while preserving the correct InvalidArgument XML shape and optional ArgumentValue.

New structured errors added in s3err:
- `AccessForbiddenError`: Method, ResourceType
- `BadDigestError`: CalculatedDigest, ExpectedDigest
- `BucketError`: BucketName
- `ContentSHA256MismatchError`: ClientComputedContentSHA256, S3ComputedContentSHA256
- `EntityTooLargeError`: ProposedSize, MaxSizeAllowed
- `EntityTooSmallError`: ProposedSize, MinSizeAllowed
- `ExpiredPresignedURLError`: ServerTime, XAmzExpires, Expires
- `InvalidAccessKeyIdError`: AWSAccessKeyId
- `InvalidArgumentError`: Description, ArgumentName, ArgumentValue
- `InvalidChunkSizeError`: Chunk, BadChunkSize
- `InvalidDigestError`: ContentMD5
- `InvalidLocationConstraintError`: LocationConstraint
- `InvalidPartError`: UploadId, PartNumber, ETag
- `InvalidRangeError`: RangeRequested, ActualObjectSize
- `InvalidTagError`: TagKey, TagValue
- `KeyTooLongError`: Size, MaxSizeAllowed
- `MetadataTooLargeError`: Size, MaxSizeAllowed
- `MethodNotAllowedError`: Method, ResourceType, AllowedMethods
- `NoSuchUploadError`: UploadId
- `NoSuchVersionError`: Key, VersionId
- `NotImplementedError`: Header, AdditionalMessage
- `PreconditionFailedError`: Condition
- `RequestTimeTooSkewedError`: RequestTime, ServerTime, MaxAllowedSkewMilliseconds
- `SignatureDoesNotMatchError`: AWSAccessKeyId, StringToSign, SignatureProvided, StringToSignBytes, CanonicalRequest, CanonicalRequestBytes

Fix CompleteMultipartUpload validation in the Azure backend so missing or empty `ETag` values return the appropriate S3 error instead of allowing a gateway panic.

Fix presigned authentication expiration validation to compare server time in `UTC`, matching the `UTC` timestamp used by presigned URL signing.

Add request ID and host ID support across S3 requests. Each request now receives AWS S3-like identifiers, returned in response headers as `x-amz-request-id` and `x-amz-id-2` and included in all XML error responses as RequestId and HostId. The generated ID structure is designed to resemble AWS S3 request IDs and host IDs.

The request signature calculation/validation for streaming uploads was previously delayed until the request body was fully read, both for Authorization header authentication and presigned URLs.
Now, the signature is validated immediately in the authorization middlewares without reading the request body, since the signature calculation itself does not depend on the request body. Instead, only the `x-amz-content-sha256` SHA-256 hash calculation is delayed.
2026-05-21 23:49:34 +04:00
niksis02 d2fa265fb8 feat: support sha512, md5, xxhash3, xxhash64, xxhash128 data integrity checksums
Integrate the new S3 checksum types in the gateway, including `SHA512`, `MD5`, `XXHASH64`, `XXHASH3`, and `XXHASH128`. This adds checksum calculation, validation, schema handling, and test coverage for the expanded checksum support.

These external packages have been used:
- `github.com/zeebo/xxh3` for `XXHASH3` and `XXHASH128`
- `github.com/cespare/xxhash/v2` for `XXHASH64`

Adjust integration tests because `aws-sdk-go-v2/service/s3` does not support automatic checksum calculation for the new checksum algorithms and returns an SDK-level error when only the checksum algorithm is provided. Only precalculated checksum values are acceptable for these checksum types.

References:
- `https://github.com/aws/aws-sdk-go-v2/issues/3404`
- `https://github.com/aws/aws-sdk-go-v2/issues/3403`
2026-05-04 08:50:39 -07:00
niksis02 ebdda06633 fix: adds BadDigest error for incorrect Content-Md5 s
Closes #1525

* Adds validation for the `Content-MD5` header.
  * If the header value is invalid, the gateway now returns an `InvalidDigest` error.
  * If the value is valid but does not match the payload, it returns a `BadDigest` error.
* Adds integration test cases for `PutBucketCors` with `Content-MD5`.
2025-09-19 19:51:23 +04:00
Ben McClelland 7b8b483dfc feat: calculate full object crc for multi-part uploads for compatible checksums
The CRC32, CRC32c, and CRC64NVME data integrity checksums support calculating
the composite full object values for multi-part uploads using the checksum
and length of the individual parts.

Previously, we were reading all of the part data to recalculate the full
object checksum values during the complete multipart upload call. This
disabled the optimized copy_file_range() for certain filesystems such as
XFS because the part data was being read. If the data is not read, and
the file handle is passed directly to io.Copy(), then the filesystem is
allowed to optimize the copying of the data from the source to destination
files.

This now allows both the optimized copy_file_range() optimizations as well
as the data integrity features enabled for support composite checksum types.
2025-07-03 19:58:53 -07:00
niksis02 132d0ae631 feat: Adds the CRC64NVME checksum support in the gateway. Adds checksum-type support for the checksum implementation 2025-02-16 17:10:06 +04:00
niksis02 6956757557 feat: Integrates object integrity checksums(CRC32, CRC32C, SHA1, SHA256) into the gateway 2025-02-14 14:14:00 +04:00
Ben McClelland ba501e482d feat: steaming requests for put object and put part
This builds on the previous work that sets up the body streaming
for the put object and put part requests. This adds the auth and
checksum readers to postpone the v4auth checks and the content
checksum until the end of the body stream.

This means that the backend with start reading the data from the
body stream before the request is fully validated and signatures
checked. So the backend must check the error returned from the
body reader for the final auth and content checks. The backend
is expected to discard the data upon error.

This should increase performance and reduce memory utilization
to no longer require caching the entire request body in memory
for put object and put part.
2023-12-14 19:19:46 -08:00