versitygw

mirror of https://github.com/versity/versitygw.git synced 2026-04-26 15:35:05 +00:00

Author	SHA1	Message	Date
Ben McClelland	efd1885d21	Merge pull request #2023 from versity/sis/move-versionid-validation-backend fix: move versionId validation to backend	2026-04-10 11:23:20 -07:00
Ben McClelland	aa3c223adb	Merge pull request #2027 from anaelorlinski/fix-multipart-upload fix for multipart upload when using sidecar meta	2026-04-10 11:21:59 -07:00
niksis02	48bfa9f4cf	fix: correct HeadObject restore status for offline objects in scoutfs Fixes #2030 When an object has offline blocks, the restore status was incorrectly set to `ongoing-request="false"` instead of omitting the header entirely, which causes S3 clients to fail when parsing the x-amz-restore header. Remove the incorrect `stageNotInProgress` constant and simplify the `requestOngoing` initialization to reflect the correct default.	2026-04-09 19:15:31 +04:00
Anael ORLINSKI	a673900b51	fix for multipart upload when using sidecar meta	2026-04-07 23:37:47 +02:00
niksis02	b473aa0545	fix: move versionId validation to backend Closes #1813 We use a specific `versionId` format (the `ulid` package) to generate versionIds in posix, which is not compatible with S3. The versionId validation was performed in the frontend, which is a potential source of failure for an s3 proxy configured on an S3 service that doesn't use ulid for versionId generation (e.g. AWS S3). These changes move the `ulid`-specific versionId validation to posix so that the gateway does not force any specific versionId format.	2026-04-07 01:56:51 +04:00
Ben McClelland	1fca33e738	Merge pull request #2006 from versity/ben/racing-put-delete fix: retry link on ENOENT caused by racing DeleteObject	2026-04-02 09:57:48 -07:00
Ben McClelland	c26012905c	fix: retry link on ENOENT caused by racing DeleteObject A concurrent PutObject and DeleteObject on the same prefix directory can race: (1) PutObject opens an O_TMPFILE in MetaTmpDir (not yet visible in the fs); (2) DeleteObject removes the last visible object in the prefix directory and calls removeParents(), which rmdir's the now-empty prefix directory; (3) PutObject's link() tries to link the fd into a parent directory that no longer exists. Fix by detecting ENOENT in the final link step (Linkat, Rename, and MoveFile) and retrying after recreating the parent directory. Also extract linkatOTmpfile() to consolidate the Linkat+EEXIST→Renameat logic that was previously inline in link(). Fixes #1988	2026-04-02 08:14:44 -07:00
niksis02	052f2364cc	feat: implement x-amz-source-expected-bucket-owner for CopyObject and UploadPartCopy Closes #1897 Extract the `X-Amz-Source-Expected-Bucket-Owner` header for CopyObject and UploadPartCopy. Verify the source bucket owner in the backend and if the provided access key id doesn't match, return an `AccessDenied` error.	2026-04-01 21:44:33 +04:00
Ben McClelland	e0209ebab4	Merge pull request #1997 from versity/sis/copyobject-threshold fix: enforce 5gb copy source object size threshold.	2026-03-31 12:27:11 -07:00
niksis02	59002b2650	feat: implement integration tests for browser-based POST object	2026-03-31 22:47:04 +04:00
niksis02	bbe246e8ec	fix: enforce 5gb copy source object size threshold. Fixes #1896 Enforces the S3 `5 GiB` copy source size limit across the posix and azure backends for `CopyObject` and `UploadPartCopy`, returning `InvalidRequest` when the source object exceeds the threshold. The limit is now configurable via `--copy-object-threshold` (`VGW_COPY_OBJECT_THRESHOLD`, default 5 GiB). A new `--mp-max-parts` flag (`VGW_MP_MAX_PARTS`, default `10000`) has been added to make the limit on the number of multipart upload parts configurable. No integration test has been added, as GitHub Actions cannot reliably handle large objects.	2026-03-31 22:44:03 +04:00
Ben McClelland	13b3dc5267	fix: serialize concurrent CompleteMultipartUpload calls via rename When two requests raced to complete the same multipart upload, the first caller to finish would remove the part files and upload directory. The second caller, already past the initial existence check, would then fail mid-flight with confusing errors such as ErrInvalidPart or an I/O error when trying to open a part that no longer exists. Fix this by atomically renaming the upload directory from <uploadID> to <uploadID>.inprogress at the very start of CompleteMultipartUploadWithCopy, before any part data is read. A concurrent caller will now find the original directory absent and receive a clean NoSuchUpload error. A deferred rename restores the original name if the complete does not succeed, allowing the client to retry. ListMultipartUploads is updated to skip any directories whose name ends in .inprogress so in-flight completes do not appear as pending uploads.	2026-03-27 16:14:20 -07:00
Ben McClelland	0234f3ecc7	Merge pull request #1992 from versity/ben/cleanup-fd fix: cleanup file descriptor leaks with chown fails	2026-03-27 11:43:31 -07:00
Ben McClelland	495b38a899	fix: abort scoutfs multipart uploads on error after successful moveblocks The scoutfs backend uses the move blocks ioctl when combining parts into the final multipart upload object. Once a move blocks from any part is successful, the original data is no longer in the part file. If the multipart upload fails and retries, future complete multipart upload calls will not have the correct data within the part files anymore. To prevent this case, once a move blocks call is successful for an upload, any future failure for the complete upload is set to auto-abort the upload to force clients to re-upload the part data again.	2026-03-26 16:28:42 -07:00
Ben McClelland	927d1d668a	fix: cleanup file descriptor leaks with chown fails We were missing a few cases of cleaning up temp files and file descriptors in the openTmpFile Chown() error cases.	2026-03-26 15:45:20 -07:00
Jakob van Santen	22f04312a7	fix: cleanup tempfiles when error prevents calling link When a client provides an invalid or incomplete body for a single-part upload, the handler returns before the link stage. Factor tempfile removal into cleanup() to catch all cases.	2026-03-20 08:25:44 -07:00
Jakob van Santen	56cb36d45a	feat: add option to disable copy-file-range for multipart uploads The standard io.Copy() will attempt to use copy_file_range when available. However, this can cause problems with certain filesystems. Add an option that will prevent io.Copy() from being able to use copy_file_range to force a standard data copy between file descriptors when needed for completing multipart uploads.	2026-03-20 08:23:01 -07:00
Ben McClelland	ae411fa3c1	Merge pull request #1974 from jvansanten/no-truncate Always write complete files in sidecar meta provider	2026-03-19 16:01:50 -07:00
Ben McClelland	88e6396950	Merge pull request #1889 from versity/ben/fast-walk feat: implement quicker backend/posix walk algorithm	2026-03-19 16:00:54 -07:00
tonyipm	33917ad6f3	feat: implement quicker backend/posix walk algorithm Rewrite the posix Walk implementation to avoid the extra ReadDir per directory that was noted as a TODO in the old code. The new algorithm holds all traversal state in a walkState struct and uses processDir to interleave sibling entries in correct S3 lexicographic order without a second syscall. Key changes: (1) prefix optimisation: jump directly into the deepest matching directory rather than scanning from the root on every call; (2) marker short-circuit: skip entire subtrees that are lexically before the marker, making paginated listing faster. Co-authored-by: Ben McClelland <ben.mcclelland@versity.com>	2026-03-18 17:39:16 -07:00
Ben McClelland	02f925a84b	Merge pull request #1978 from 57-Wolve/57-Wolve-patch-s3-glacier-response Update scoutfs glacier stage constants for ongoing requests	2026-03-18 10:25:06 -07:00
57_Wolve	c32ddfff1a	Update stage constants for ongoing requests Fix issue with incompatible S3 response for offline and staging status. Resolves issue with Restic Glacier support.	2026-03-18 09:46:55 -05:00
Jakob van Santen	d8f927519a	Do not expose directories that can't be buckets	2026-03-18 15:33:10 +01:00
Jakob van Santen	0559807783	Always write complete files in sidecar meta provider Some filesystems like dCache don't allow file truncation	2026-03-18 15:29:29 +01:00
Ben McClelland	97cc6bf23b	chore: run go modernize tool This is a fixup of the codebase using: go run golang.org/x/tools/go/analysis/passes/modernize/cmd/modernize@latest -fix ./... This has no behavior changes, and only applies safe updates for modern go features.	2026-03-10 09:47:37 -07:00
Ben McClelland	8795c15621	feat: s3proxy default to credential chain with optional anonymous access When access/secret are not provided, let AWS SDK v2 resolve credentials from the default provider chain (env vars, IRSA, ECS/EC2 roles, etc.) instead of forcing anonymous credentials. Add an explicit anonymous credentials option for s3 proxy to force backend anonymous access. Fixes #1955	2026-03-09 17:41:50 -07:00
Ben McClelland	0f92d0ef4f	Merge pull request #1916 from versity/ben/azure-test-falures azure test failure fixes	2026-03-05 17:07:41 -08:00
niksis02	97bb70509f	fix: change the way object metadata is stored in posix Fixes #1909 Previously, the mapping between object metadata and posix object was as follows: for each metadata key, we stored a separate xattr with the `user.X-Amz-Meta.<key>` prefix. This resulted in syscall overhead when storing and deleting large numbers of metadata keys. In addition, very long metadata keys caused failures because most posix filesystems limit xattr key lengths to 127–255 bytes, while S3 does not enforce such a per-key limit. The logic has now been changed so that all object metadata is stored in a single xattr, `user.metadata`, as a JSON key/value object. For backward compatibility, metadata GET operations still fall back to the old mechanism (`metadata key -> xattr key`) when `user.metadata` is not present. A new CLI utility has been added to convert all legacy object metadata to the new metadata format within the provided directory. Example usage: ``` versitygw utils convert-xattr-metadata path/to/bucket ``` or ``` versitygw utils cxm path/to/bucket ``` It is recommended to run this command on bucket directories to convert all legacy metadata for every object in the bucket.	2026-03-06 01:44:14 +04:00
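The single-xattr layout and its legacy fallback from 97bb70509f can be sketched like this. To stay self-contained, an in-memory map (`xattrs`) stands in for the filesystem's extended attributes, and `storeMetadata` / `loadMetadata` are hypothetical names; only the `user.metadata` key and the `user.X-Amz-Meta.` prefix come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// xattrs is an in-memory stand-in for a file's extended attributes.
type xattrs map[string][]byte

const (
	metaKey      = "user.metadata"    // new: one JSON object for all keys
	legacyPrefix = "user.X-Amz-Meta." // old: one xattr per metadata key
)

// storeMetadata writes all object metadata as a single JSON xattr.
func storeMetadata(x xattrs, meta map[string]string) error {
	b, err := json.Marshal(meta)
	if err != nil {
		return err
	}
	x[metaKey] = b
	return nil
}

// loadMetadata prefers the JSON xattr and falls back to the legacy
// per-key xattrs only when user.metadata is not present.
func loadMetadata(x xattrs) (map[string]string, error) {
	meta := map[string]string{}
	if b, ok := x[metaKey]; ok {
		if err := json.Unmarshal(b, &meta); err != nil {
			return nil, err
		}
		return meta, nil
	}
	for k, v := range x {
		if strings.HasPrefix(k, legacyPrefix) {
			meta[strings.TrimPrefix(k, legacyPrefix)] = string(v)
		}
	}
	return meta, nil
}

func main() {
	legacy := xattrs{legacyPrefix + "owner": []byte("alice")}
	m, _ := loadMetadata(legacy)
	fmt.Println(m["owner"]) // read through the legacy fallback

	converted := xattrs{}
	_ = storeMetadata(converted, m)
	m2, _ := loadMetadata(converted)
	fmt.Println(m2["owner"]) // read from the single JSON xattr
}
```

Storing one JSON value means a metadata key is no longer limited by the filesystem's xattr name length, which is the failure the commit describes.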
Ben McClelland	afbeb7cb6e	fix: azure modernize part number loop check The part number loop check can be simplified with slices.Contains.	2026-03-03 08:58:31 -08:00
Ben McClelland	271313b036	fix: azure close download body in CopyObject to prevent resource leak When copying between two different Azure blobs, the source download stream body was only consumed by PutObject but never explicitly closed. If PutObject or any subsequent step returned an error, the underlying HTTP connection held by the Azure SDK was never released, leaking both the connection and any internal SDK retry goroutines attached to it. Added a deferred close on downloadResp.Body immediately after the successful DownloadStream call to ensure the body is always drained and released regardless of the outcome.	2026-03-03 08:58:31 -08:00
Ben McClelland	dc31696e53	fix: azure ListBuckets pagination to use client-side continuation tokens Azure's ListContainers Marker parameter requires an opaque internal token (e.g. /accountname/containername) rather than a plain container name, so passing MaxResults and our ContinuationToken directly to the Azure API caused 400 OutOfRangeInput errors. Rework ListBuckets to iterate all Azure pages client-side, skip entries at or before the ContinuationToken (matching the posix backend's "start after" semantics), and stop once MaxBuckets items have been collected, setting ContinuationToken to the last returned bucket name. This avoids using Azure's NextMarker entirely and correctly handles both unpaginated and paginated requests.	2026-03-03 08:58:31 -08:00
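The client-side pagination described in dc31696e53 can be approximated with the sketch below. It makes several assumptions that are not in the commit: the backend pages are modelled as a slice of name slices, names arrive in sorted order, and `listPage` is a hypothetical helper. It also looks one entry ahead so that the token is only set when more results remain, which is a design choice of this sketch.

```go
package main

import "fmt"

// listPage walks every backend page, skips names at or before token
// ("start after" semantics), and stops once max names are collected.
// next is the last returned name when more entries remain, else "".
func listPage(pages [][]string, token string, max int) (names []string, next string) {
	if max <= 0 {
		return nil, ""
	}
	for _, page := range pages {
		for _, name := range page {
			if token != "" && name <= token {
				continue // already returned in an earlier request
			}
			if len(names) == max {
				// One more entry exists: the result is truncated.
				return names, names[len(names)-1]
			}
			names = append(names, name)
		}
	}
	return names, ""
}

func main() {
	pages := [][]string{{"alpha", "bravo"}, {"charlie", "delta"}}
	first, tok := listPage(pages, "", 3)
	fmt.Println(first, tok)
	rest, tok := listPage(pages, tok, 3)
	fmt.Println(rest, tok == "")
}
```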
Ben McClelland	929048cbee	fix: azure PresignedAuth_UploadPart test failure Azure Storage's StageBlock REST API rejects Content-Length: 0 with InvalidHeaderValue. The tests (PresignedAuth_UploadPart, UploadPart_success) upload a nil/empty body, which causes the Azure SDK to send Content-Length: 0. Azurite is lenient and accepts it; real Azure Storage does not. Use a new metadata key ("Zerobytesparts") set on the .sgwtmp/multipart/<uploadId>/<object-hash> blob to track any 0 length parts.	2026-03-03 08:58:22 -08:00
Ben McClelland	fc52052e33	Merge pull request #1910 from versity/sis/posix-dir-obj-metadata-tagging feat: add tagging support for directory objects in posix	2026-03-02 13:05:51 -08:00
Ben McClelland	75f5db445f	Merge pull request #1905 from versity/ben/test-walk-versions fix: improve WalkVersions() ancestor-directory guard for prefix filtering	2026-03-02 12:55:33 -08:00
niksis02	8550dba36f	feat: add tagging support for directory objects in posix Closes #1857 Adds object Tagging support for directory objects in `PutObject` posix. Updates the integration tests to test object metadata and tagging both for file and directory objects.	2026-03-02 18:56:50 +04:00
Ben McClelland	62da7be9cd	Merge pull request #1907 from versity/ben/posix-cleanup chore: fix typos and error return wrapping types in posix	2026-03-02 05:44:07 -08:00
Ben McClelland	fc5c0a36a6	chore: fix typos and error return wrapping types in posix This is just a cleanup of typos, error messages and error types to correctly wrap the returned errors in the posix backend. No major logic changes.	2026-02-28 10:38:29 -08:00
Ben McClelland	7695be56b0	fix: store part checksums at destination path in UploadPartCopy In sidecar mode, the three StoreAttribute/storeChecksums calls after the copy loop were using objPath (the source object path) instead of the destination part's bucket and partPath. This caused checksums and the internal part-crc64nvme to be written under the source object's sidecar directory, making them unresolvable when CompleteMultipartUploadWithCopy tried to retrieve them. All three stores now use *upi.Bucket and partPath, consistent with the etag store directly below them.	2026-02-28 10:24:15 -08:00
Ben McClelland	760f252936	fix: improve WalkVersions() ancestor-directory guard for prefix filtering Replace the length-comparison condition with an explicit predicate that is both more readable and correctly scoped: skip only when the visited directory is a strict ancestor of the specified prefix (not a descendant and not when prefix is empty). Adds tests from the original bug report (#1864) to verify the fix and guard against future regressions.	2026-02-28 10:10:17 -08:00
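One possible reading of the "strict ancestor" predicate from 760f252936 is sketched below. The function name `isStrictAncestor` and the convention that directory keys are compared with a trailing `/` are assumptions made for illustration; the actual guard in WalkVersions() may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// isStrictAncestor reports whether dir is a strict ancestor of prefix:
// prefix lies below dir and is not dir itself. An empty prefix never
// matches, so nothing is skipped when no prefix filter is given.
func isStrictAncestor(dir, prefix string) bool {
	if prefix == "" {
		return false
	}
	if !strings.HasSuffix(dir, "/") {
		dir += "/"
	}
	return len(dir) < len(prefix) && strings.HasPrefix(prefix, dir)
}

func main() {
	fmt.Println(isStrictAncestor("a", "a/b/obj")) // ancestor of the prefix
	fmt.Println(isStrictAncestor("a/b/c", "a/b")) // descendant, not ancestor
	fmt.Println(isStrictAncestor("a", ""))        // empty prefix
}
```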
Ben McClelland	ca3c76b0f9	Merge pull request #1902 from s3-on-win/fix-1864-ListObjectVersions fix: not return parent keys for ListObjectVersions	2026-02-28 10:00:43 -08:00
Ben McClelland	809ba9e580	Merge pull request #1899 from versity/sis/upload-part-crc64nvme feat: optimize multipart upload checksum calculation.	2026-02-28 09:55:34 -08:00
Ben McClelland	98acad9c99	Merge pull request #1887 from versity/sis/complete-mp-checksum fix: store final checksum on CompleteMultipartUpload	2026-02-28 09:54:55 -08:00
Ben McClelland	0ad928a4d8	Merge pull request #1894 from versity/sis/getobject-directory-object-checksum feat: adds checksums for directory objects in posix	2026-02-28 09:39:51 -08:00
Kai Hambrecht	8234c317b8	Fix versitygw#1864 to not return parent keys for ListObjectVersions with prefix	2026-02-27 15:55:42 +01:00
niksis02	d03a33110d	feat: optimize multipart upload checksum calculation. This PR optimizes multipart upload checksum handling. When a checksum algorithm/type is specified at multipart-upload initiation, each `UploadPart` request computes, validates, and stores the corresponding part checksum. During `CompleteMultipartUpload`, the final checksum is derived either via composite checksum calculation or by composing the CRC-family checksums. When no checksum algorithm is specified during multipart-upload initiation, each `UploadPart` may supply a different checksum algorithm for data-integrity verification. To support this scenario, a new mechanism has been implemented: for every `UploadPart`, a crc64nvme checksum is always computed. * If the client uses crc64nvme for the part upload, a single hash reader is used. * Otherwise, two hash readers are used—one for crc64nvme and one for the user-provided checksum. The crc64nvme value is stored in part xattrs under `user.part-crc64nvme` and later used during `CompleteMultipartUpload` as a composable checksum source. In `CompleteMultipartUpload`, the hash reader is entirely removed; the gateway no longer re-reads part data to compute the final checksum. The logic now follows two distinct paths: 1. Checksum algorithm/type specified at MP initiation * All required per-part checksums have already been stored. * If the checksum type is `FULL_OBJECT`, the gateway uses the composable path. * If the type is `COMPOSITE`, the gateway follows the checksum-combining path. 2. No checksum algorithm specified at MP initiation * The gateway loads the stored per-part `crc64nvme` values and composes them to compute the final checksum. The previous `composableCRC` check has been removed because all `FULL_OBJECT` algorithms are inherently composable (`crc32`, `crc32c`, `crc64nvme`). Validation now relies solely on `checksum.Type`.	2026-02-27 15:13:21 +04:00
niksis02	4ebe40829e	fix: store final checksum on CompleteMultipartUpload Previously, if no object checksum type/algorithm was specified when initiating a multipart upload, the CompleteMultipartUpload request would compute the final object’s CRC64NVME checksum but not persist it. This logic has now been fixed, and in the scenario described above the checksum is stored on the final object. There should no longer be any case where a CompleteMultipartUpload request finishes without persisting the final object checksum.	2026-02-27 15:12:57 +04:00
Ben McClelland	b3eac9781f	feat: add concurrency limiter to scoutfs This brings scoutfs in-line with the posix concurrency limiter. This fixes a hang with scoutfs due to not correctly initializing the concurrency in posix leading to a concurrency of 0 allowed. This also adds a sane default to the posix concurrency when not initialized.	2026-02-26 17:34:29 -08:00
niksis02	24364754fd	feat: adds checksums for directory objects in posix Add data-integrity checksum support in `PutObject` in the POSIX backend for directory objects. Since the only way to upload a directory object is via `PutObject`, this logic validates and stores the checksum of the empty payload. Support for `GetObject` has also been added to retrieve and return directory-object checksums.	2026-02-26 22:36:56 +04:00
Ben McClelland	e2821fc855	feat: add option to disable s3proxy client data integrity checks AWS introduced a relatively new option for data integrity checks that not all non-AWS servers support yet. See this for more info: https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html This change adds a new option: disable-data-integrity-check to disable the data integrity checks in the client sdk for the servers that may not yet support this. Use this only when the s3 service for the proxy does not support the data integrity features. Fixes #1867	2026-02-21 11:49:20 -08:00
niksis02	46bcc8af35	fix: fixes object default Content-Type Fixes #1849 If no `Content-Type` is provided during object upload, S3 defaults it to `application/octet-stream`. This behavior was missing in the gateway, causing backends to persist an empty `Content-Type`, which Fiber then overrides with its default `text/plain`. The behavior has now been corrected for the object upload operations: `PutObject`, `CreateMultipartUpload`, and `CopyObject`.	2026-02-18 01:44:52 +04:00
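The defaulting rule from 46bcc8af35 amounts to a small helper like the one below. `contentTypeOrDefault` is a hypothetical name used only for illustration; the default value `application/octet-stream` is the one named in the commit message.

```go
package main

import "fmt"

// defaultContentType is the value S3 applies when an upload carries
// no Content-Type.
const defaultContentType = "application/octet-stream"

// contentTypeOrDefault returns ct unless it is empty, in which case the
// S3 default is used so the backend never persists an empty value.
func contentTypeOrDefault(ct string) string {
	if ct == "" {
		return defaultContentType
	}
	return ct
}

func main() {
	fmt.Println(contentTypeOrDefault(""))
	fmt.Println(contentTypeOrDefault("image/png"))
}
```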

502 Commits