mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-05-17 15:21:31 +00:00
* fix(s3api): self-heal stale .versions latest-version pointer on read When the `.versions` directory metadata points at a version file that has gone missing (e.g. a crash between deleting the latest version and rewriting the pointer, or a concurrent delete racing with a read), `getLatestObjectVersion` bailed with a hard error that required manual repair. Tagging, ACL, retention, copy-source, and HEAD/GET all surfaced NoSuchKey even though other versions remained on disk. On `ErrNotFound`/`codes.NotFound` from the pointed-at version lookup, rescan the `.versions` directory, pick the newest remaining non-delete- marker entry, persist the repaired pointer (best-effort), and return that entry. If only delete markers (or nothing) remain, the caller still sees an error and the object correctly appears absent. Extracted the selection logic into a pure `selectLatestContentVersion` helper so `updateLatestVersionAfterDeletion` and the new self-heal path share a single implementation. A warning is logged whenever the heal kicks in so stale-pointer incidents remain visible in operator logs. * fix(s3api): self-heal must promote newest version even if delete marker Review feedback (gemini-code-assist): the self-heal path used `selectLatestContentVersion`, which skips delete markers. That had two bugs: 1. If the chronologically newest entry was a delete marker, an older content version would be promoted, effectively "undeleting" an object that was actually deleted. 2. If only delete markers remained, heal returned an error and the caller surfaced a hard 500 instead of the correct 404-with- x-amz-delete-marker response. Add `selectLatestVersion` that picks the newest entry regardless of type (content or delete marker) and use it in `healStaleLatestVersionPointer`. The promoted entry flows back through `doGetLatestObjectVersion` unchanged; downstream handlers already detect `ExtDeleteMarkerKey` on the returned entry and render NoSuchKey + `x-amz-delete-marker: true` (see `s3api_object_handlers.go:722-728`). Kept `selectLatestContentVersion` in place for `updateLatestVersionAfterDeletion`, which deliberately limits the pointer to a live content version in the post-deletion flow — changing that is out of scope for this fix. Added four tests for the new selector: - promotes newest delete marker over older content (the reviewer case) - picks content when content is newest - promotes newest delete marker when only delete markers remain - returns nil latestEntry on empty/untagged input * fix(s3api): paginate .versions scan in self-heal path Review feedback (gemini-code-assist, coderabbitai): the self-heal rescan did a single-shot list(..., 1000). For objects whose version ids use the old raw-timestamp format, filer ordering is lexicographic-ascending = oldest-first. If the .versions directory held more than one page of entries, the first page contained only the oldest, and the heal would promote an older version as the new "latest" — silently surfacing stale data on subsequent reads. Paginate through the whole directory with `filer.PaginationSize`, running `selectLatestVersion` per page and keeping a single running best candidate across pages (via `compareVersionIds`). This mirrors the pagination pattern already used by `getObjectVersionList` in the same file and closes the window for old-format stacks larger than one page. New-format (inverted-timestamp) ids were not affected because their lexicographic order matches newest-first, but paginating is still the right fix. Also updated the function doc to reflect that self-heal now promotes the newest entry regardless of type (content version or delete marker). * fix(s3api): don't resurrect deleted objects; wrap ErrNotFound sentinel Two review findings addressed: 1. `updateLatestVersionAfterDeletion` was using `selectLatestContentVersion` which skipped delete markers. Scenario: PUT v1, DELETE (dm1 written, pointer->dm1), PUT v3 (pointer->v3), user explicitly deletes v3 by versionId. Remaining files: v1, dm1. S3 semantics say the current version is the chronologically newest = dm1 (object appears deleted). Old code would promote v1, "undeleting" the object silently. Switched to `selectLatestVersion`, which picks the newest entry regardless of type. Also paginated the scan the same way the self-heal path does — otherwise a single-page `list(1000)` still mis-selects the latest for old-format version id stacks that exceed one page. Removed the `hasDeleteMarkers || !isLast` branch: with `selectLatestVersion` any delete marker participates in the comparison and shows up as the latest when it is newest, so the "keep the directory on ambiguity" guard becomes unreachable. Full pagination also makes the `!isLast` guard unnecessary — we either see every entry or surface a list error. 2. `healStaleLatestVersionPointer`'s "no remaining version" error was a plain `fmt.Errorf`, so callers could not distinguish genuine object-absence from a scan failure via `errors.Is`. Wrap it with `filer_pb.ErrNotFound` so the sentinel flows through the chain (the outer wrap already uses `%w`). No test additions needed — `TestSelectLatestVersion_PromotesNewestDeleteMarker` already asserts the "newer delete marker beats older content" invariant that drives both fixes. * refactor(s3api): drop unused selectLatestContentVersion after review Review feedback flagged a stale comment claiming selectLatestContentVersion "mirrors" the post-deletion semantics of updateLatestVersionAfterDeletion. That claim became false when the post-deletion path switched to selectLatestVersion in the previous commit. Verified no production callers remain — only the helper's own tests referenced it — so the cleaner fix is to delete the dead code rather than rewrite the comment to explain "this exists but isn't used." - Removed selectLatestContentVersion. - Removed the four TestSelectLatestContentVersion_* tests. - Preserved a renamed TestSelectLatestVersion_MixedFormats so the mixed-format comparator coverage still runs against the active selector.
see https://blog.aqwari.net/xml-schema-go/ 1. go get aqwari.net/xml/cmd/xsdgen 2. Add EncodingType element for ListBucketResult in AmazonS3.xsd 3. xsdgen -o s3api_xsd_generated.go -pkg s3api AmazonS3.xsd 4. Remove empty Grantee struct in s3api_xsd_generated.go 5. Remove xmlns: sed s'/http:\/\/s3.amazonaws.com\/doc\/2006-03-01\/\ //' s3api_xsd_generated.go