The CRC32, CRC32c, and CRC64NVME data integrity checksums support calculating
the composite full object values for multi-part uploads using the checksum
and length of the individual parts.
Previously, we were reading all of the part data to recalculate the full
object checksum values during the complete multipart upload call. This
disabled the optimized copy_file_range() for certain filesystems such as
XFS because the part data was being read. If the data is not read, and
the file handle is passed directly to io.Copy(), then the filesystem is
allowed to optimize the copying of the data from the source to destination
files.
This now allows both the optimized copy_file_range() optimizations as well
as the data integrity features enabled for support composite checksum types.
Implements a middleware that validates incoming bucket and object names before authentication. This helps prevent malicious attacks that attempt to access restricted or unreachable data in `POSIX`.
Adds test cases to cover such attack scenarios, including false negatives where encoded paths are used to try accessing resources outside the intended bucket.
Removes bucket validation from all other layers—including `controllers` and both `POSIX` and `ScoutFS` backends — by moving the logic entirely into the middleware layer.
The http body stream is not a seekable stream, so most operation
retry attempts will fail with an internal server error. This
change tells the s3 client within the gateway to not retry any
requests, and instead let the client of the gateway handle the
error retry.
Fixes#1353
We were using the metadata retrieval to check for existing
buckets during create, and then return either BucketAlreadyExists
or ErrBucketAlreadyOwnedByYou accordingly.
Howver, the metadata retrieval was returning success with a
default ACL when the bucket metadata did not already exist
causing the gateway to always think this bucket existed.
Fix here is to let the metadata retrieval know that we do not
want the default ACL for this case.
This fixes the listing of buckets when multi tenant mode is
enabled with a metadata bucket. The following behavior changes
are fixed:
* prevent listing of metadata bucket by all accounts
* prevent listing of non-owned buckets by user/userplus
* return correct BucketAlreadyExists/BucketAlreadyOwnedByYou
for attempts to create existing bucket
Fixes#1326
There were a couple of cases that would return an error for the
non existing bucket acl instead of treating that as the default
acl.
This also cleans up the backends that were doing their own
acl parsing instead of using the auth.ParseACL() function.
Fixes#1304
Fixes#1288
If the checksum algorithm/type is not specified during multipart upload initialization, it is considered `null`, and the `ListParts` result should also set it to `null`.
Fixes#1276
Creates the custom `s3response.CopyObjectOutput` type to handle the `LastModified` date property formatting correctly. It uses `time.RFC3339` to format the date to match the format that s3 uses.
Common prefixes were originally stored in a `map[string]struct{}`, which was then converted to a slice and sorted. The new implementation stores the common prefixes in a `map[string]int`, where the map value represents the index of the common prefix. There's no need to sort the common prefixes array, as `fs.WalkDir` comes with sorted directories and files.
Fixes#1258Fixes#1257Closes#1244
Adds range queries support for `HeadObject`.
Fixes the range parsing logic for `GetObject`, which is used for `HeadObject` as well. Both actions follow the same rules for range parsing.
Fixes the error message returned by `GetObject`.
When multiple uploads with the same object key are racing, we can
end up with an EEXIST when trying to link the final object into
the namespace. When this happens, we should just remove the
existing file and try again since the semantics are that the
last upload should win.
The Etag can be quoted or not, so the check to verify the part
Etag must remove the quotes before checking for equality. This
check is the same now as posix.
The xml encoding for the s3.CompleteMultipartUploadOutput response
type was not producing exactly the right field names for the
expected complete multipart upload result.
This change follows the pattern we have had to do for other xml
responses to create our own type that will encode better to the
expected response.
This will change the backend.Backend interface, so plugins and
other backends will have to make the corresponding changes.
Fixes#1215Fixes#1216
`PutObject`, `CopyObject` and `CreateMultipartUpload` accept tag string as an http request header which should be url-encoded. The tag string should be a valid url-encoded string and each key/value pair should be valid, otherwise they should fail with `APIError`.
If the provided tag set contains duplicate `keys` the calls should fail with the same `InvalidURLEncodedTagging` error.
Not all url-encoded characters are supported by `S3`. The tagging string should contain only `letters`, `digits` and the following special chars:
- `-`
- `.`
- `/`
- `_`
- `+`
- ` `(space)
And their url-encoded versions: e.g. `%2F`(/), `%2E`(.) ... .
If the provided tagging string contains invalid `key`/`value`, the calls should fail with the following errors respectively:
`invalid key` - `(InvalidTag) The TagKey you have provided is invalid`
`invalid value` - `(InvalidTag) The TagValue you have provided is invalid`
Fixes#1214Fixes#1231Fixes#1232
Implements `utils.ParseTagging` which is a generic implementation of parsing tags for both `PutObjectTagging` and `PutBucketTagging`.
- The actions now return `MalformedXML` if the provided request body is invalid.
- Adds validation to return `InvalidTag` if duplicate keys are present in tagging.
- For invalid tag keys, it creates a new error: `ErrInvalidTagKey`.
Closes#1111
Bucket ACLs and policies are now stored in the meta bucket as objects with the following prefixes:
- `vgw-meta-acl-<bucket-name>`
- `vgw-meta-policy-<bucket-name>`
The name of the meta bucket is provided during S3 proxy initialization. The gateway verifies whether the specified bucket exists; if it does not, an error is returned.
If no meta bucket is provided, the S3 proxy returns default values for ACL and policy actions.
Added missing debug logs in the `front-end` and `utility` functions.
Enhanced debug logging with the following improvements:
- Each debug message is now prefixed with [DEBUG] and appears in color.
- The full request URL is printed at the beginning of each debug log block.
- Request/response details are wrapped in framed sections for better readability.
- Headers are displayed in a colored box.
- XML request/response bodies are pretty-printed with indentation and color.
O_TMPFILE can fail if the location we need to link the final
file is not within the same filesystem. This can happen if
there are different filesystem mounts within a bucket or if
using zfs nested datasets within a bucket.
Fixes#1194Fixes#1035
We had put the error handling in for the read only filesystems
when O_TMPFILE is supported, but missed the CreateTemp() fallback
case. This fixes this case to also return the method not allowed
error.
This also adds the error handling for the scoutfs case as well.
Fixes#1195
Fixes#816
`ListObjects(V2)` used to return truncated response, if delimiter is provided and the result is limited by max-keys and the number of common prefixes is the same as `max-keys`.
e.g
PUT -> `foo/bar`
PUT -> `foo/quxx`
LIST: `max-keys=1;delim=/` -> foo/
`ListObjects(V2)` should return `foo/` as common prefix and `truncated` should be `false`.
The PR makes this fix to return `non-truncated` response for the above described case.
Fixes#1182
S3 calculates the `CRC64NVME` checksum of an object on object upload(`PutObject`), when no checksum algorithm or precalculated checksum header is provided.
Makes the `CRC64NVME` checksum as default for `PutObject`, when no checksum is provided.
Fixes#1181
`DeleteObjects` should remove non-empty directory objects, which has been uploaded as a separate object.
e.g
Upload -> `foo/bar`
Upload -> `foo/`
Delete -> `foo/`
The last action call should succeed.
The PR introduces changes which removes `ETag` from the directory object attempting to `delete`, which has been uploaded as a separate object.
Closes#819
ListObjects returns object owner data in each object entity in the result, while ListObjectsV2 has fetch-owner query param, which indicates if the objects owner data should be fetched.
Adds these changes in the gateway to add `Owner` data in `ListObjects` and `ListObjectsV2` result. In aws the objects can be owned by different users in the same bucket. In the gateway all the objects are owned by the bucket owner.
Fixes#1021
`foo` and `foo/` object paths were considered as the same in `CopyObject` source and destination object paths in posix.
Implements the `joinPathWithTrailer` function, which calls `filepath.Join` and adds trailing `/` if it existed in the original path. This way the implementation puts separation
between directory and file objects with the same name.
Implements the bucket cors s3 actions in FE to return `NotImplemented` error.
Actions implemented:
- `PutBucketCors`
- `GetBucketCors`
- `DeleteBucketCors`
`Note`: no logic is implemented for the actions in any backend and no input or output data validation is added.
Fixes#998Closes#1125Closes#1126Closes#1127
Implements objects meta properties(Content-Disposition, Content-Language, Content-Encoding, Cache-Control, Expires) and tagging besed on the directives(metadata, tagging) in CopyObject in posix and azure backends. The properties/tagging should be coppied from the source object if "COPY" directive is provided and it should be replaced otherwise.
Changes the object copy principle in azure: instead of using the `CopyFromURL` method from azure sdk, it first loads the object then creates one, to be able to compare and store the meta properties.
This syncs recent updates to posix for scoutfs backend including
the extra metadata such as Content-Disposition,
Content-Language, Cache-Control and Expires. This also fixes the
directory object listings that have a double trailing slash due
to the change in the backend.Walk().
This also simplifies head-object to call the posix on and then
post process for glacier changes. This allows keeping in closer
sync with posix head-object over time.
Closes#1128
Adds `Content-Disposition`, `Content-Language`, `Cache-Control` and `Expires` object meta properties support in posix and azure backends.
Changes the `PutObject` and `CreateMultipartUpload` actions backend input type to custom `s3response` types to be able to store `Expires` as any string.
The part files for multipart uploads are considered temporary
files and should not be archived by default. This adds the
noarchive attribute to the part files to prevent scoutam from
trying to archive these.
There is a new parameter, disablenoarchive, that will prevent
adding the noarchive attribute to these files for the case
where there is a desire to archive these temp files.
This was previously not including the bucket directory for the
mutlipart temp file cleanup. This fixes leftovers in the tmp
directories after uploading multipart uploads.
Fixes#1004Fixes#1122Fixes#1120
Separates `GetObject` and `UploadPartCopy` range parsing/validation.
`GetObject` returns a successful response if acceptRange is invalid.
Adjusts the range upper limit, if it exceeds the actual objects size for `GetObject`.
Corrects the `ContentRange` in the `GetObject` response.
Fixes the `UploadPartCopy` action copy source range parsing/validation.
`UploadPartCopy` returns `InvalidArgument` if the copy source range is not valid.