Chris Lu d0a09ea178 fix(s3): honor ChecksumAlgorithm on presigned URL uploads (#9076)
* fix(s3): honor ChecksumAlgorithm on presigned URL uploads

AWS SDK presigners hoist x-amz-sdk-checksum-algorithm (and related
checksum headers) into the signed URL's query string, so servers must
read either location. detectRequestedChecksumAlgorithm only looked at
request headers, so presigned PUTs with ChecksumAlgorithm set neither
validated nor stored an additional checksum, and HEAD/GET never
returned the x-amz-checksum-* header.

Read these parameters from headers first, then fall back to a
case-insensitive query-string lookup. Apply the same fallback when
comparing an object-level checksum value against the computed one.

Fixes #9075

* test(s3): presigned URL checksum integration tests (#9075)

Adds test/s3/checksum with end-to-end coverage for flexible-checksum
behavior on presigned URL uploads. Tests generate a presigned PUT URL
with ChecksumAlgorithm set, upload the body with a plain http.Client
(bypassing AWS SDK middleware so the server must honor the query-string
hoisted x-amz-sdk-checksum-algorithm), then HEAD/GET with
ChecksumMode=ENABLED and assert the stored x-amz-checksum-* header.

Covers SHA256, SHA1, and a negative control with no checksum requested.
Wires the new directory into s3-go-tests.yml as its own CI job.

* perf(s3): parse presigned query once in detectRequestedChecksumAlgorithm

Previously, each header fallback called getHeaderOrQuery, which re-parsed
r.URL.Query() and allocated a new map on every invocation — up to eight
times per PutObject request. Parse the raw query at most once per request
(only when non-empty) and pass the pre-parsed url.Values into a new
lookupHeaderOrQuery helper.

Also drops a redundant strings.ToLower allocation in the case-insensitive
query key scan (strings.EqualFold already handles ASCII case folding).

Addresses review feedback from gemini-code-assist on PR #9076.

* test(s3): honor credential env vars and add presigned upload timeout

- init() now reads S3_ACCESS_KEY/S3_SECRET_KEY (and AWS_ACCESS_KEY_ID /
  AWS_SECRET_ACCESS_KEY / AWS_REGION fallbacks) so that
  `make test-with-server ACCESS_KEY=... SECRET_KEY=...` no longer
  authenticates with hardcoded defaults while the server has been
  started with different credentials.
- uploadViaPresignedURL uses a dedicated http.Client with a 30s timeout
  instead of http.DefaultClient, so a stalled server fails fast in CI
  instead of blocking until the suite's global timeout fires.

Addresses review feedback from coderabbitai on PR #9076.

* test(s3): pass S3_PORT and credentials through to checksum tests

- 'make test' now exports S3_ENDPOINT, S3_ACCESS_KEY, and S3_SECRET_KEY
  derived from the Makefile variables so the Go test process talks to
  the same endpoint/credentials that start-server was launched with.
- start-server cleans up the background SeaweedFS process and PID file
  when the readiness poll times out, preventing stale port conflicts on
  subsequent runs.

Addresses review feedback from coderabbitai on PR #9076.

* ci(s3): raise checksum tests step timeout

make test-with-server builds weed_binary, waits up to 90s for readiness,
then runs go test -timeout=10m. The previous 12-minute step timeout only
had ~2 minutes of headroom over the Go timeout, risking the Actions
runner killing the step before tests reported a real failure.

Bumps the job timeout from 15 to 20 minutes and the step timeout from
12 to 16 minutes, matching other S3 integration jobs.

Addresses review feedback from coderabbitai on PR #9076.

* perf(s3): thread pre-parsed query through putToFiler hot path

Parse the request's query string once at the top of putToFiler and
reuse the resulting url.Values for both the checksum-algorithm detection
and the expected-checksum verification. Previously, the verification
path called getHeaderOrQuery, which re-parsed r.URL.Query() on every
PutObject, defeating the previous commit's single-parse goal.

- Add parseRequestQuery + detectRequestedChecksumAlgorithmQ (the
  pre-parsed-query variant). detectRequestedChecksumAlgorithm is now a
  thin wrapper used by callers that do a single lookup.
- putToFiler parses once and threads the result through both call sites.
- Remove getHeaderOrQuery and update the unit test to use
  lookupHeaderOrQuery directly.

Addresses follow-up review from gemini-code-assist on PR #9076.

* test(s3): check io.ReadAll error in uploadViaPresignedURL helper

* test(s3): drop SHA1 presigned test case

The AWS SDK v2 presigner signs a Content-MD5 header at presign time
for SHA1 PutObject requests even when no body is attached (the MD5 of
the empty payload gets baked into the signed headers). Uploading the
real body via a plain http.Client then trips SeaweedFS's MD5 validation
and returns BadDigest — an SDK/presigner quirk, not a SeaweedFS bug.

The SHA256 positive case already exercises the server-side
query-hoisted algorithm path that issue #9075 is about, and the unit
tests in weed/s3api cover each algorithm's header mapping. Drop the
SHA1 integration case rather than chase SDK-specific workarounds.

* test(s3): provide real Content-MD5 to presigned checksum test

AWS SDK v2's flexible-checksum middleware signs a Content-MD5 header at
presign time. There is no body to hash at that point, so it seeds the
header with MD5 of the empty payload. When the real body is then PUT
with a plain http.Client, SeaweedFS's server-side Content-MD5
verification correctly rejects the upload with BadDigest.

Pre-compute the MD5 of the test body and thread it into
PutObjectInput.ContentMD5 so the signed Content-MD5 matches the body
that will actually be uploaded. The test still exercises the
server-side path that reads X-Amz-Sdk-Checksum-Algorithm from the
query string (the fix that PR #9076 is validating).
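
For reference, a Content-MD5 value is the base64 encoding of the raw
16-byte MD5 digest of the body. The sketch below uses a hypothetical
contentMD5 helper to show why a digest computed over an empty payload
cannot match a non-empty upload.

```go
package main

import (
	"crypto/md5"
	"encoding/base64"
	"fmt"
)

// contentMD5 (hypothetical helper) returns the base64-encoded MD5 digest
// of body, the format expected in a Content-MD5 header.
func contentMD5(body []byte) string {
	sum := md5.Sum(body)
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	body := []byte("presigned checksum test body") // illustrative payload

	// MD5 of the empty payload: 1B2M2Y8AsgTpgAmY7PhCfg==
	fmt.Println(contentMD5(nil))

	// MD5 of the real body, which is what the signed header must carry.
	fmt.Println(contentMD5(body))
}
```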

* test(s3): send the signed Content-MD5 header on presigned upload

uploadViaPresignedURL now accepts an extraHeaders map so callers can
thread through headers that the presigner signed but the raw http
request would otherwise omit. The SHA256 test passes the Content-MD5
it computed, matching what the presigner baked into the signature.

Fixes SignatureDoesNotMatch seen in CI after the previous commit set
ContentMD5 on the presign input without sending the corresponding
header on the actual upload.

* test(s3): build presigned URL with the raw v4 signer

The AWS SDK v2 s3.PresignClient runs the flexible-checksum middleware
for any PutObject input that carries ChecksumAlgorithm. That middleware
injects a Content-MD5 header at presign time, and with no body present
it seeds MD5-of-empty. Any subsequent upload of a non-empty body
through a plain http.Client then trips SeaweedFS's Content-MD5
verification and returns BadDigest — not the code path that issue
#9075 is about.

Replace the PresignClient usage in the integration test with a direct
call to v4.Signer.PresignHTTP, building a canonical URL whose query
string already contains x-amz-sdk-checksum-algorithm=SHA256. This is
exactly the shape of URL a browser/curl client would receive from any
presigner that hoists the algorithm header, and it exercises the
server-side fix from PR #9076 without dragging in SDK-specific
middleware quirks.

* test(s3): set X-Amz-Expires on presigned URL before signing

v4.Signer.PresignHTTP does not add X-Amz-Expires on its own — the
caller has to seed it into the request's query string so the signer
includes it in the canonical query and the server accepts the
presigned URL. Without it, SeaweedFS correctly returns
AuthorizationQueryParametersError.

Also adds a .gitignore for the make-managed test volume data, log
file, and PID file so local `make test-with-server` runs do not leave
artifacts tracked by git.

Verified by running the integration tests locally:
  make test-with-server → both presigned checksum tests PASS.
2026-04-14 21:52:49 -07:00