Fixes#2123Fixes#2120Fixes#2116Fixes#2111Fixes#2108Fixes#2086Fixes#2085Fixes#2083Fixes#2081Fixes#2080Fixes#2073Fixes#2072Fixes#2071Fixes#2069Fixes#2044Fixes#2043Fixes#2042Fixes#2041Fixes#2040Fixes#2039Fixes#2036Fixes#2035Fixes#2034Fixes#2028Fixes#2020Fixes#1842Fixes#1810Fixes#1780Fixes#1775Fixes#1736Fixes#1705Fixes#1663Fixes#1645Fixes#1583Fixes#1526Fixes#1514Fixes#1493Fixes#1487Fixes#959Fixes#779Closes#823Closes#85
Refactor global S3 error handling around structured error types and centralized XML response generation.
All S3 errors now share the common APIError base for the fields every error has: Code, HTTP status code, and Message. Non-traditional errors that need AWS-compatible XML fields now have dedicated typed errors in the s3err package. Each typed error implements the shared S3Error behavior so controllers and middleware can handle errors consistently while still emitting error-specific XML fields.
Add a dedicated InvalidArgumentError type because InvalidArgument is used widely across request validation, auth, copy source handling, object lock validation, multipart validation, and header parsing. The new InvalidArgument path uses explicit InvalidArgErrorCode constants with predefined descriptions and ArgumentName values, keeping call sites readable while preserving the correct InvalidArgument XML shape and optional ArgumentValue.
New structured errors added in s3err:
- `AccessForbiddenError`: Method, ResourceType
- `BadDigestError`: CalculatedDigest, ExpectedDigest
- `BucketError`: BucketName
- `ContentSHA256MismatchError`: ClientComputedContentSHA256, S3ComputedContentSHA256
- `EntityTooLargeError`: ProposedSize, MaxSizeAllowed
- `EntityTooSmallError`: ProposedSize, MinSizeAllowed
- `ExpiredPresignedURLError`: ServerTime, XAmzExpires, Expires
- `InvalidAccessKeyIdError`: AWSAccessKeyId
- `InvalidArgumentError`: Description, ArgumentName, ArgumentValue
- `InvalidChunkSizeError`: Chunk, BadChunkSize
- `InvalidDigestError`: ContentMD5
- `InvalidLocationConstraintError`: LocationConstraint
- `InvalidPartError`: UploadId, PartNumber, ETag
- `InvalidRangeError`: RangeRequested, ActualObjectSize
- `InvalidTagError`: TagKey, TagValue
- `KeyTooLongError`: Size, MaxSizeAllowed
- `MetadataTooLargeError`: Size, MaxSizeAllowed
- `MethodNotAllowedError`: Method, ResourceType, AllowedMethods
- `NoSuchUploadError`: UploadId
- `NoSuchVersionError`: Key, VersionId
- `NotImplementedError`: Header, AdditionalMessage
- `PreconditionFailedError`: Condition
- `RequestTimeTooSkewedError`: RequestTime, ServerTime, MaxAllowedSkewMilliseconds
- `SignatureDoesNotMatchError`: AWSAccessKeyId, StringToSign, SignatureProvided, StringToSignBytes, CanonicalRequest, CanonicalRequestBytes
Fix CompleteMultipartUpload validation in the Azure backend so missing or empty `ETag` values return the appropriate S3 error instead of allowing a gateway panic.
Fix presigned authentication expiration validation to compare server time in `UTC`, matching the `UTC` timestamp used by presigned URL signing.
Add request ID and host ID support across S3 requests. Each request now receives AWS S3-like identifiers, returned in response headers as `x-amz-request-id` and `x-amz-id-2` and included in all XML error responses as RequestId and HostId. The generated ID structure is designed to resemble AWS S3 request IDs and host IDs.
The request signature calculation/validation for streaming uploads was previously delayed until the request body was fully read, both for Authorization header authentication and presigned URLs.
Now, the signature is validated immediately in the authorization middlewares without reading the request body, since the signature calculation itself does not depend on the request body. Instead, only the `x-amz-content-sha256` SHA-256 hash calculation is delayed.
The MoveData() requires that all but the last part be 4k aligned.
We accidentally were including the alignment check for the last
part causing large uploads where the total object was not a
multiple of 4k to fallback to copying the last part. For very large
part sizes this was triggering timeouts in some clients.
Fixes#2030
When an object has offline blocks, the restore status was incorrectly set to `ongoing-request="false"` instead of omitting the header entirely, which causes s3 clients fail on parsing the x-amz-restore header.
Remove the incorrect `stageNotInProgress` constant and simplify the `requestOngoing` initialization to reflect the correct default.
Fixes#1896
Enforces the S3 `5 GiB` copy source size limit across the posix and azure
backends for `CopyObject` and `UploadPartCopy`, returning `InvalidRequest` when
the source object exceeds the threshold.
The limit is now configurable via `--copy-object-threshold`
(`VGW_COPY_OBJECT_THRESHOLD`, default 5 GiB).
A new `--mp-max-parts flag` (`VGW_MP_MAX_PARTS`, default `10000`) has been added to make multipart upload parts number limit configurable.
No integration test has been added, as GitHub Actions cannot reliably
handle large objects.
The scoutfs backend uses the move blocks ioctl when combining parts into the
final multipart upload object. Once a move blocks from any part is successful,
the original data is no longer in the part file. If the multipart upload fails
and retries, future complete multipart upload calls will not have the correct
data within the part files anymore.
To prevent this case, once a move blocks call is successful for an upload, any
future failure for the complete upload is set to auto-abort the upload to force
clients to re-upload the part data again.
This brings scoutfs in-line with the posix concurrency limiter.
This fixes a hang with scoutfs due to not correctly initializing
the concurrency in posix leading to a concurrency of 0 allowed.
This also adds a sane default to the posix concurrency when not
initialized.
The scoutfs filesystem allows setting project IDs on files and
directories for project level accounting tracking. This adds the
option to set the project id for the following:
create bucket
put object
put part
complete multipart upload
The project id will only be set if all of the following is true:
- set project id option enabled
- filesystem format version supports projects (version >1)
- account project id > 0
Some systems may choose to allow non-aws compliant bucket names
and/or handle the bucket naem validation in the backend instead.
This adds the option to turn off the strict bucket name validation
checks in the frontend API handlers.
When frontend bucket name validation is disabled, we need to do
sanity checks for posix compliant names in the posix/scoutfs
backends. This is automatically enabled when strict bucket
name validation is disabled.
Fixes#1564
The debuglogger should be a top level module since we expect
all modules within the project to make use of this. If its
hidden in s3api, then contributors are less likely to make
use of this outside of s3api.
Implements a middleware that validates incoming bucket and object names before authentication. This helps prevent malicious attacks that attempt to access restricted or unreachable data in `POSIX`.
Adds test cases to cover such attack scenarios, including false negatives where encoded paths are used to try accessing resources outside the intended bucket.
Removes bucket validation from all other layers—including `controllers` and both `POSIX` and `ScoutFS` backends — by moving the logic entirely into the middleware layer.
When multiple uploads with the same object key are racing, we can
end up with an EEXIST when trying to link the final object into
the namespace. When this happens, we should just remove the
existing file and try again since the semantics are that the
last upload should win.
The Etag can be quoted or not, so the check to verify the part
Etag must remove the quotes before checking for equality. This
check is the same now as posix.
The xml encoding for the s3.CompleteMultipartUploadOutput response
type was not producing exactly the right field names for the
expected complete multipart upload result.
This change follows the pattern we have had to do for other xml
responses to create our own type that will encode better to the
expected response.
This will change the backend.Backend interface, so plugins and
other backends will have to make the corresponding changes.
Added missing debug logs in the `front-end` and `utility` functions.
Enhanced debug logging with the following improvements:
- Each debug message is now prefixed with [DEBUG] and appears in color.
- The full request URL is printed at the beginning of each debug log block.
- Request/response details are wrapped in framed sections for better readability.
- Headers are displayed in a colored box.
- XML request/response bodies are pretty-printed with indentation and color.
We had put the error handling in for the read only filesystems
when O_TMPFILE is supported, but missed the CreateTemp() fallback
case. This fixes this case to also return the method not allowed
error.
This also adds the error handling for the scoutfs case as well.
Fixes#1195
This syncs recent updates to posix for scoutfs backend including
the extra metadata such as Content-Disposition,
Content-Language, Cache-Control and Expires. This also fixes the
directory object listings that have a double trailing slash due
to the change in the backend.Walk().
This also simplifies head-object to call the posix on and then
post process for glacier changes. This allows keeping in closer
sync with posix head-object over time.
The part files for multipart uploads are considered temporary
files and should not be archived by default. This adds the
noarchive attribute to the part files to prevent scoutam from
trying to archive these.
There is a new parameter, disablenoarchive, that will prevent
adding the noarchive attribute to these files for the case
where there is a desire to archive these temp files.
This was previously not including the bucket directory for the
mutlipart temp file cleanup. This fixes leftovers in the tmp
directories after uploading multipart uploads.
Fixes#1004Fixes#1122Fixes#1120
Separates `GetObject` and `UploadPartCopy` range parsing/validation.
`GetObject` returns a successful response if acceptRange is invalid.
Adjusts the range upper limit, if it exceeds the actual objects size for `GetObject`.
Corrects the `ContentRange` in the `GetObject` response.
Fixes the `UploadPartCopy` action copy source range parsing/validation.
`UploadPartCopy` returns `InvalidArgument` if the copy source range is not valid.
The fileystem will return ENOTDIR if we try to access a file path
where a parent directory within the path already exists as a file.
In this case we need to return a standard 404 no such key since
the request object does not exist within the filesytem.
Fixes#942
We had 0755 hard coded for newly created directories before. This
adds a user option to configure what the default mode permissions
should be for newly created directories.
Fixes#878
This fixes the cases for racing uploads with the same object names.
Before we were making some bad assumptions about what would cause
an error when trying to link/rename the final object name into
the namespace, but missed the case that another upload for the
same name could be racing with this upload and causing an incorrect
error.
This also changes the order of setting metadata to prevent
accidental setting of metadata for the current upload to another
racing upload.
This also fix auth.CheckObjectAccess() when objects are removed
while this runs.
Fixes#854
It is better if we let the s3response module handle the xml
formatting spec specifics, and let the backends not worry
about how to format the time fields. This should help to
prevent any future backend modifications or additions from
accidental incorrect time formatting.
Changed ListObjectsV2 and ListObjects actions return types from
*s3.ListObjects(V2)Output to s3response.ListObjects(V2)Result.
Changed the listing objects timestamp to RFC3339 to match AWS
S3 objects timestamp.
Fixes#752
The posix limits wont exactly match up with the AWS key length
limits because posix has component length limits as well as path
length limits.
This reponds with the aws compatible KeyTooLongError under these
conditions now.
Note that delete object returns success even in the error cases.
Fixes#755
The API hanlders and backend were stripping trailing "/" in object
paths. So if an object exists and a request came in for head/get/delete/copy
for that same name but with a trailing "/" indicating request should
be for directory object, the "/" would be stripped and the request
would be handlied for the incorrect file object.
This fix adds in checks to handle the case with the training "/"
in the request.
Fixes#709
For large directories, the treewalk can take longer than the
client request timeout. If the client times out the request
then we need to stop walking the filesystem and just return
the context error.
This should prevent the gateway from consuming system resources
uneccessarily after an incoming request is terminated.
This adds the ability to treat symlinks to directories at the top
level gateway directory as buckets the same as normal directories.
This could be a potential security issue allowing traversal into
other filesystems within the system, so is defaulted to off. This
can be enabled when specifically needed for both posix and scoutfs
backend systems.
Fixes#644
The restore object api request handler was incorrectly trying to
unmarshal the request body, but for the stadnard (all?) case the
request body is emtpy. We only need the bucket and opbject params
for now.
This also adds a fix to actually honor the enable glacier mode
in scoutfs.
Add meta.MetadataStorer compatibility to scoutfs so that scoutfs
is using the same interface as posix. This fixes the metadata
retrieval and adds the recently supported object lock compatibility
as well.