Commit Graph

69 Commits

Author SHA1 Message Date
niksis02 9f786b3c2c feat: global error refactoring
Fixes #2123
Fixes #2120
Fixes #2116
Fixes #2111
Fixes #2108
Fixes #2086
Fixes #2085
Fixes #2083
Fixes #2081
Fixes #2080
Fixes #2073
Fixes #2072
Fixes #2071
Fixes #2069
Fixes #2044
Fixes #2043
Fixes #2042
Fixes #2041
Fixes #2040
Fixes #2039
Fixes #2036
Fixes #2035
Fixes #2034
Fixes #2028
Fixes #2020
Fixes #1842
Fixes #1810
Fixes #1780
Fixes #1775
Fixes #1736
Fixes #1705
Fixes #1663
Fixes #1645
Fixes #1583
Fixes #1526
Fixes #1514
Fixes #1493
Fixes #1487
Fixes #959
Fixes #779
Closes #823
Closes #85

Refactor global S3 error handling around structured error types and centralized XML response generation.

All S3 errors now share the common APIError base for the fields every error has: Code, HTTP status code, and Message. Non-traditional errors that need AWS-compatible XML fields now have dedicated typed errors in the s3err package. Each typed error implements the shared S3Error behavior so controllers and middleware can handle errors consistently while still emitting error-specific XML fields.

Add a dedicated InvalidArgumentError type because InvalidArgument is used widely across request validation, auth, copy source handling, object lock validation, multipart validation, and header parsing. The new InvalidArgument path uses explicit InvalidArgErrorCode constants with predefined descriptions and ArgumentName values, keeping call sites readable while preserving the correct InvalidArgument XML shape and optional ArgumentValue.

New structured errors added in s3err:
- `AccessForbiddenError`: Method, ResourceType
- `BadDigestError`: CalculatedDigest, ExpectedDigest
- `BucketError`: BucketName
- `ContentSHA256MismatchError`: ClientComputedContentSHA256, S3ComputedContentSHA256
- `EntityTooLargeError`: ProposedSize, MaxSizeAllowed
- `EntityTooSmallError`: ProposedSize, MinSizeAllowed
- `ExpiredPresignedURLError`: ServerTime, XAmzExpires, Expires
- `InvalidAccessKeyIdError`: AWSAccessKeyId
- `InvalidArgumentError`: Description, ArgumentName, ArgumentValue
- `InvalidChunkSizeError`: Chunk, BadChunkSize
- `InvalidDigestError`: ContentMD5
- `InvalidLocationConstraintError`: LocationConstraint
- `InvalidPartError`: UploadId, PartNumber, ETag
- `InvalidRangeError`: RangeRequested, ActualObjectSize
- `InvalidTagError`: TagKey, TagValue
- `KeyTooLongError`: Size, MaxSizeAllowed
- `MetadataTooLargeError`: Size, MaxSizeAllowed
- `MethodNotAllowedError`: Method, ResourceType, AllowedMethods
- `NoSuchUploadError`: UploadId
- `NoSuchVersionError`: Key, VersionId
- `NotImplementedError`: Header, AdditionalMessage
- `PreconditionFailedError`: Condition
- `RequestTimeTooSkewedError`: RequestTime, ServerTime, MaxAllowedSkewMilliseconds
- `SignatureDoesNotMatchError`: AWSAccessKeyId, StringToSign, SignatureProvided, StringToSignBytes, CanonicalRequest, CanonicalRequestBytes

Fix CompleteMultipartUpload validation in the Azure backend so missing or empty `ETag` values return the appropriate S3 error instead of allowing a gateway panic.

Fix presigned authentication expiration validation to compare server time in `UTC`, matching the `UTC` timestamp used by presigned URL signing.

Add request ID and host ID support across S3 requests. Each request now receives AWS S3-like identifiers, returned in response headers as `x-amz-request-id` and `x-amz-id-2` and included in all XML error responses as RequestId and HostId. The generated ID structure is designed to resemble AWS S3 request IDs and host IDs.

The request signature calculation/validation for streaming uploads was previously delayed until the request body was fully read, both for Authorization header authentication and presigned URLs.
Now, the signature is validated immediately in the authorization middlewares without reading the request body, since the signature calculation itself does not depend on the request body. Instead, only the `x-amz-content-sha256` SHA-256 hash calculation is delayed.
2026-05-21 23:49:34 +04:00
Ben McClelland dd27c6cd27 fix: scoutfs multipart alignment check for last part
The MoveData() requires that all but the last part be 4k aligned.
We accidentally were including the alignment check for the last
part causing large uploads where the total object was not a
multiple of 4k to fallback to copying the last part. For very large
part sizes this was triggering timeouts in some clients.
2026-04-30 08:19:14 -07:00
niksis02 48bfa9f4cf fix: correct HeadObject restore status for offline objects in scoutfs
Fixes #2030

When an object has offline blocks, the restore status was incorrectly set to `ongoing-request="false"` instead of omitting the header entirely, which causes s3 clients fail on parsing the x-amz-restore header.
Remove the incorrect `stageNotInProgress` constant and simplify the `requestOngoing` initialization to reflect the correct default.
2026-04-09 19:15:31 +04:00
niksis02 bbe246e8ec fix: enforce 5gb copy source object size threshold.
Fixes #1896

Enforces the S3 `5 GiB` copy source size limit across the posix and azure
backends for `CopyObject` and `UploadPartCopy`, returning `InvalidRequest` when
the source object exceeds the threshold.

The limit is now configurable via `--copy-object-threshold`
(`VGW_COPY_OBJECT_THRESHOLD`, default 5 GiB).
A new `--mp-max-parts flag` (`VGW_MP_MAX_PARTS`, default `10000`) has been added to make multipart upload parts number limit configurable.

No integration test has been added, as GitHub Actions cannot reliably
handle large objects.
2026-03-31 22:44:03 +04:00
Ben McClelland 495b38a899 fix: abort scoutfs multipart uploads on error after successful moveblocks
The scoutfs backend uses the move blocks ioctl when combining parts into the
final multipart upload object. Once a move blocks from any part is successful,
the original data is no longer in the part file. If the multipart upload fails
and retries, future complete multipart upload calls will not have the correct
data within the part files anymore.

To prevent this case, once a move blocks call is successful for an upload, any
future failure for the complete upload is set to auto-abort the upload to force
clients to re-upload the part data again.
2026-03-26 16:28:42 -07:00
57_Wolve c32ddfff1a Update stage constants for ongoing requests
Fix issue with incompatible S3 response for offline and staging status.

Resolves issue with Restic Glacier support.
2026-03-18 09:46:55 -05:00
Ben McClelland b3eac9781f feat: add concurrency limiter to scoutfs
This brings scoutfs in-line with the posix concurrency limiter.
This fixes a hang with scoutfs due to not correctly initializing
the concurrency in posix leading to a concurrency of 0 allowed.

This also adds a sane default to the posix concurrency when not
initialized.
2026-02-26 17:34:29 -08:00
Ben McClelland 01fc142c1e fix: correct spelling for debuglogger.InternalError() (#1784) 2026-01-24 06:44:54 -08:00
Ben McClelland 3c3e9dd8b1 feat: add project id support for scoutfs backend
The scoutfs filesystem allows setting project IDs on files and
directories for project level accounting tracking. This adds the
option to set the project id for the following:
create bucket
put object
put part
complete multipart upload

The project id will only be set if all of the following is true:
- set project id option enabled
- filesystem format version supports projects (version >1)
- account project id > 0
2025-11-14 15:36:10 -08:00
Ben McClelland 40da4a31d3 chore: cleanup unused constants
We have some leftover constants from some previous changes. This
just cleans up all that are no longer needed.
2025-10-09 12:19:00 -07:00
Ben McClelland 4c3965d87e feat: add option to disable strict bucket name checks
Some systems may choose to allow non-aws compliant bucket names
and/or handle the bucket naem validation in the backend instead.
This adds the option to turn off the strict bucket name validation
checks in the frontend API handlers.

When frontend bucket name validation is disabled, we need to do
sanity checks for posix compliant names in the posix/scoutfs
backends. This is automatically enabled when strict bucket
name validation is disabled.

Fixes #1564
2025-10-08 14:34:52 -07:00
Ben McClelland 24b1c45db3 cleanup: move debuglogger to top level for full project access
The debuglogger should be a top level module since we expect
all modules within the project to make use of this. If its
hidden in s3api, then contributors are less likely to make
use of this outside of s3api.
2025-09-01 20:02:02 -07:00
Ben McClelland 1eeb7de0b6 feat: add versioning dir option to scoutfs backend
This adds the same versioning dir option that is found in the
posix backend to scoutfs backend. Functionality is the same.
2025-08-26 11:20:35 -07:00
Ondrej Palkovsky f0858a47d5 Small cleanups. 2025-08-08 08:56:44 +02:00
Ondrej Palkovsky 298d4ec6b4 Merged scoutfs and posix ListObjects and ListObjectsV2 2025-08-08 08:37:16 +02:00
Ondrej Palkovsky 936239b619 DRY of scoutfs integration, alignment testing for scoutfs.MoveData 2025-08-07 18:28:38 +02:00
niksis02 98a7b7f402 feat: adds a middleware to validate bucket/object names
Implements a middleware that validates incoming bucket and object names before authentication. This helps prevent malicious attacks that attempt to access restricted or unreachable data in `POSIX`.

Adds test cases to cover such attack scenarios, including false negatives where encoded paths are used to try accessing resources outside the intended bucket.

Removes bucket validation from all other layers—including `controllers` and both `POSIX` and `ScoutFS` backends — by moving the logic entirely into the middleware layer.
2025-07-04 00:55:03 +04:00
niksis02 3740d79173 fix: adds the surrounding quotes on ETag in PutObject for dir objects and in UploadPartCopy.
Fixes #1277
Fixes #1235

Adds surrounding quotes on `ETag` when creating a directory object. Adds the quotes in `UploadPartCopy` as well.
2025-05-09 00:29:23 +04:00
Ben McClelland a60d6a7faa fix: scoutfs racing mutlipart uploads internal error
When multiple uploads with the same object key are racing, we can
end up with an EEXIST when trying to link the final object into
the namespace. When this happens, we should just remove the
existing file and try again since the semantics are that the
last upload should win.
2025-05-03 09:30:45 -07:00
Ben McClelland a29f7b1839 fix: scoutfs missing ListObjectsV2() start after
This brings ListObjectsV2 for scoutfs in sync with posix to handle
the start after and continuation token ases.
2025-05-03 09:15:01 -07:00
Ben McClelland 6321406008 fix: scoutfs missing ListObjects() response fields
This fixes some tests that were fialing due to missing response
fields in ListObjects().
2025-05-03 09:07:56 -07:00
Ben McClelland a9fcf63063 feat: cleanup calling of debuglogger with managed debug setting 2025-05-02 17:05:59 -07:00
Ben McClelland 9f13b544f7 fix: scoutfs etag check for multipart uploads
The Etag can be quoted or not, so the check to verify the part
Etag must remove the quotes before checking for equality. This
check is the same now as posix.
2025-05-02 10:07:47 -07:00
Ben McClelland 9244e9100d fix: xml response field names for complete multipart upload
The xml encoding for the s3.CompleteMultipartUploadOutput response
type was not producing exactly the right field names for the
expected complete multipart upload result.

This change follows the pattern we have had to do for other xml
responses to create our own type that will encode better to the
expected response.

This will change the backend.Backend interface, so plugins and
other backends will have to make the corresponding changes.
2025-04-30 14:36:48 -07:00
Ben McClelland bd986e97f3 Merge pull request #1220 from versity/sis/missing-debug-logs-fe
feat: makes debug loggin prettier. Adds missing logs in FE and utily functions
2025-04-18 08:28:58 -07:00
niksis02 bbb5a22c89 feat: makes debug loggin prettier. Adds missing logs in FE and utility functions
Added missing debug logs in the `front-end` and `utility` functions.
Enhanced debug logging with the following improvements:

- Each debug message is now prefixed with [DEBUG] and appears in color.
- The full request URL is printed at the beginning of each debug log block.
- Request/response details are wrapped in framed sections for better readability.
- Headers are displayed in a colored box.
- XML request/response bodies are pretty-printed with indentation and color.
2025-04-17 22:46:05 +04:00
Ben McClelland df6dcff429 fix: return method not allowed for read only fs for fallback tempfile
We had put the error handling in for the read only filesystems
when O_TMPFILE is supported, but missed the CreateTemp() fallback
case. This fixes this case to also return the method not allowed
error.

This also adds the error handling for the scoutfs case as well.

Fixes #1195
2025-04-12 07:27:43 -07:00
Ben McClelland 092d3b0384 feat: sync recent posix changes to scoutfs
This syncs recent updates to posix for scoutfs backend including
the extra metadata such as Content-Disposition,
Content-Language, Cache-Control and Expires. This also fixes the
directory object listings that have a double trailing slash due
to the change in the backend.Walk().

This also simplifies head-object to call the posix on and then
post process for glacier changes. This allows keeping in closer
sync with posix head-object over time.
2025-03-14 15:58:31 -07:00
Ben McClelland d034f87f60 feat: add noarchive to scoutfs part files
The part files for multipart uploads are considered temporary
files and should not be archived by default. This adds the
noarchive attribute to the part files to prevent scoutam from
trying to archive these.

There is a new parameter, disablenoarchive, that will prevent
adding the noarchive attribute to these files for the case
where there is a desire to archive these temp files.
2025-03-10 14:52:20 -07:00
Ben McClelland f77058b817 fix: scoutfs multipart cleanup in complete/abort mp
This was previously not including the bucket directory for the
mutlipart temp file cleanup. This fixes leftovers in the tmp
directories after uploading multipart uploads.
2025-03-10 13:44:24 -07:00
niksis02 96af2b6471 fix: Fixes GetObject and UploadPartCopy actions data range parsing.
Fixes #1004
Fixes #1122
Fixes #1120

Separates `GetObject` and `UploadPartCopy` range parsing/validation.

`GetObject` returns a successful response if acceptRange is invalid.
Adjusts the range upper limit, if it exceeds the actual objects size for `GetObject`.
Corrects the `ContentRange` in the `GetObject` response.

Fixes the `UploadPartCopy` action copy source range parsing/validation.
`UploadPartCopy` returns `InvalidArgument` if the copy source range is not valid.
2025-03-08 01:39:21 +04:00
Ben McClelland 0312a1e3dc fix: internal server error when object parent dir is a file
The fileystem will return ENOTDIR if we try to access a file path
where a parent directory within the path already exists as a file.
In this case we need to return a standard 404 no such key since
the request object does not exist within the filesytem.

Fixes #942
2024-11-08 08:21:14 -08:00
Ben McClelland 2c713c58f9 feat: add option to configure mode permissions on new directories
We had 0755 hard coded for newly created directories before. This
adds a user option to configure what the default mode permissions
should be for newly created directories.

Fixes #878
2024-10-16 14:31:03 -07:00
Ben McClelland b7a2e8a2c3 fix: unexpected errors during upload races
This fixes the cases for racing uploads with the same object names.
Before we were making some bad assumptions about what would cause
an error when trying to link/rename the final object name into
the namespace, but missed the case that another upload for the
same name could be racing with this upload and causing an incorrect
error.

This also changes the order of setting metadata to prevent
accidental setting of metadata for the current upload to another
racing upload.

This also fix auth.CheckObjectAccess() when objects are removed
while this runs.

Fixes #854
2024-10-07 17:24:44 -07:00
jonaustin09 7d368be82e feat: Implemented object locking for object versions 2024-09-30 17:26:49 -04:00
Ben McClelland ee202b76f3 fix: move RFC 3339 time formatting to s3response
It is better if we let the s3response module handle the xml
formatting spec specifics, and let the backends not worry
about how to format the time fields. This should help to
prevent any future backend modifications or additions from
accidental incorrect time formatting.
2024-08-26 21:08:24 -07:00
jonaustin09 684ab2371b fix: Changed ListObjects and ListObjectsV2 actions return types
Changed ListObjectsV2 and ListObjects actions return types from
*s3.ListObjects(V2)Output to s3response.ListObjects(V2)Result.

Changed the listing objects timestamp to RFC3339 to match AWS
S3 objects timestamp.

Fixes #752
2024-08-26 15:46:45 -07:00
Ben McClelland 453136bd5a fix: return KeyTooLongError when filenames exceed allowed length
The posix limits wont exactly match up with the AWS key length
limits because posix has component length limits as well as path
length limits.

This reponds with the aws compatible KeyTooLongError under these
conditions now.

Note that delete object returns success even in the error cases.

Fixes #755
2024-08-24 14:53:42 -07:00
Ben McClelland 797376a235 fix: head/get/delete/copy directory object should fail when corresponding file object exists
The API hanlders and backend were stripping trailing "/" in object
paths. So if an object exists and a request came in for head/get/delete/copy
for that same name but with a trailing "/" indicating request should
be for directory object, the "/" would be stripped and the request
would be handlied for the incorrect file object.

This fix adds in checks to handle the case with the training "/"
in the request.

Fixes #709
2024-08-05 11:55:32 -07:00
Ben McClelland b421598647 fix: allow objecting listing in scoutfs for files created without etag attrs
Files created outside of versitygw can be missing etag attributes.
Allow the empty etag to not cause errors with listing the files.

Fixes #694
2024-07-31 10:01:00 -07:00
Ben McClelland 55cf7674b8 Merge pull request #673 from versity/ben/symlinks
feat: add option to allow symlinked directories as buckets
2024-07-16 08:40:41 -07:00
Ben McClelland a8adb471fe fix: cancel filesystem traversal when listing request cancelled
For large directories, the treewalk can take longer than the
client request timeout. If the client times out the request
then we need to stop walking the filesystem and just return
the context error.

This should prevent the gateway from consuming system resources
uneccessarily after an incoming request is terminated.
2024-07-15 13:47:21 -07:00
Ben McClelland f6dd2f947c feat: add option to allow symlinked directories as buckets
This adds the ability to treat symlinks to directories at the top
level gateway directory as buckets the same as normal directories.

This could be a potential security issue allowing traversal into
other filesystems within the system, so is defaulted to off. This
can be enabled when specifically needed for both posix and scoutfs
backend systems.

Fixes #644
2024-07-13 10:21:15 -07:00
jonaustin09 e773872c48 feat: Implemented response body streaming for GetObject action 2024-07-08 15:56:24 -04:00
Ben McClelland d4d064de19 fix: remove unnecessary no xattr definitions 2024-06-12 16:40:59 -07:00
Ben McClelland d98ca9b034 Merge pull request #622 from versity/ben/glacier_mode_fix
fix: restore object request handler and scoutfs glacier enable
2024-06-11 13:56:48 -07:00
jonaustin09 7ea386aec9 fix: Bug fixing for azure backend. Added a new integration test case for ListParts 2024-06-11 16:14:35 -04:00
Ben McClelland f0005a0047 fix: restore object request handler and scoutfs glacier enable
The restore object api request handler was incorrectly trying to
unmarshal the request body, but for the stadnard (all?) case the
request body is emtpy. We only need the bucket and opbject params
for now.

This also adds a fix to actually honor the enable glacier mode
in scoutfs.
2024-06-11 12:57:46 -07:00
Ben McClelland 576dfc5884 fix: correct metadata, tags, and lock info for scoutfs multipart objects
Add meta.MetadataStorer compatibility to scoutfs so that scoutfs
is using the same interface as posix. This fixes the metadata
retrieval and adds the recently supported object lock compatibility
as well.
2024-06-10 17:57:07 -07:00
Ben McClelland c81403fe90 feat: add metadata storage abstraction layer
Closes #511. This adds an abstraction layer to the metadata
storage to allow for future non-xattr metadata storage
implementations.
2024-04-15 13:57:31 -07:00