mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-05-25 11:10:20 +00:00
fix(s3): stop S3 Tables routes from swallowing buckets named "buckets" or "get-table" (#9566)
* fix(s3): stop S3 Tables routes from swallowing buckets named "buckets" or "get-table"
The S3 Tables REST endpoints share top-level paths with the regular S3
API (/buckets for ListTableBuckets/CreateTableBucket, /get-table for
GetTable). They are registered first on the same router as the bucket
subrouter, so a path-style request such as GET /buckets?list-type=2 on
a bucket actually named "buckets" matched ListTableBuckets and returned
JSON. AWS SDK V2 (and Hadoop s3a / Spark) then failed XML parsing with
"Unexpected character '{' (code 123) in prolog".
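The failure mode described above can be reproduced without any SDK. The following is a minimal sketch using Go's standard `encoding/xml` package: an XML decoder that expects a `ListBucketResult` document returns an error when handed a JSON body. The type `listBucketResult`, the helper `parseListing`, and the JSON payload are illustrative assumptions, not code or output from this commit.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// listBucketResult models only the root element and one field of a
// ListObjectsV2 response. Hypothetical type for illustration; real SDK
// types carry many more fields.
type listBucketResult struct {
	XMLName xml.Name `xml:"ListBucketResult"`
	Name    string   `xml:"Name"`
}

// parseListing decodes body as an XML listing and returns the bucket name.
func parseListing(body []byte) (string, error) {
	var out listBucketResult
	if err := xml.Unmarshal(body, &out); err != nil {
		return "", err
	}
	return out.Name, nil
}

func main() {
	xmlBody := []byte(`<ListBucketResult><Name>buckets</Name></ListBucketResult>`)
	// Illustrative JSON shape only; not the exact body the server returned.
	jsonBody := []byte(`{"tableBuckets":[]}`)

	name, err := parseListing(xmlBody)
	fmt.Println("xml body:", name, err)

	_, err = parseListing(jsonBody)
	fmt.Println("json body rejected:", err != nil)
}
```

The exact error text differs between XML parsers; the message quoted in the commit comes from the Java-side parser, while Go's decoder reports its own error for the same input.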
Disambiguate by requiring the AWS V4 credential scope to name the
s3tables service on the colliding routes. Regular S3 SDKs sign with
service=s3, S3 Tables SDKs sign with service=s3tables, and the scope is
present in both the Authorization header and the X-Amz-Credential query
parameter for presigned URLs, so the matcher works for both flavors.
ARN-bearing S3 Tables routes (/buckets/<arn>, /namespaces/<arn>, etc.)
already cannot collide because colons are not valid in bucket names, so
they are left untouched.
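The credential-scope check described above might look roughly like the sketch below. It is not the code from this commit: a later message names the real function `isS3TablesSignedRequest`, but its body is not shown here, so `credentialOf`, `signedForS3Tables`, and `requestWith` are hypothetical names, and the header parsing is a simplification that uses only the standard library.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

const sigV4Prefix = "AWS4-HMAC-SHA256 "

// credentialOf returns the SigV4 credential string
// (AK/DATE/REGION/SERVICE/aws4_request) from the Authorization header or,
// for presigned URLs, from the X-Amz-Credential query parameter.
func credentialOf(r *http.Request) string {
	auth := r.Header.Get("Authorization")
	if strings.HasPrefix(auth, sigV4Prefix) {
		for _, part := range strings.Split(strings.TrimPrefix(auth, sigV4Prefix), ",") {
			part = strings.TrimSpace(part)
			if strings.HasPrefix(part, "Credential=") {
				return strings.TrimPrefix(part, "Credential=")
			}
		}
	}
	// Query() percent-decodes the value, so %2F separators become slashes.
	return r.URL.Query().Get("X-Amz-Credential")
}

// signedForS3Tables reports whether the SERVICE component is s3tables.
func signedForS3Tables(r *http.Request) bool {
	fields := strings.Split(credentialOf(r), "/")
	return len(fields) == 5 && fields[3] == "s3tables"
}

// requestWith builds a GET request for the demo; authorization may be empty.
func requestWith(rawURL, authorization string) *http.Request {
	r, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		panic(err)
	}
	if authorization != "" {
		r.Header.Set("Authorization", authorization)
	}
	return r
}

func main() {
	const tail = "/aws4_request, SignedHeaders=host, Signature=abc"
	base := "http://127.0.0.1:8333/buckets"
	cred := "Credential=some_access_key1/20260525/us-east-1/"

	fmt.Println(signedForS3Tables(requestWith(base+"?list-type=2", sigV4Prefix+cred+"s3"+tail)))
	fmt.Println(signedForS3Tables(requestWith(base, sigV4Prefix+cred+"s3tables"+tail)))
	fmt.Println(signedForS3Tables(requestWith(
		base+"?X-Amz-Credential=some_access_key1%2F20260525%2Fus-east-1%2Fs3tables%2Faws4_request", "")))
}
```

A predicate of this shape can be attached to the colliding routes so that a request signed for `s3` falls through to the regular bucket handlers.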
* fix(s3): accept AWS JSON RPC content type as S3 Tables intent signal
The Iceberg catalog integration tests send unsigned PUT /buckets with
Content-Type: application/x-amz-json-1.1 to create table buckets. With
only the credential-scope check, those requests fell through to the
regular S3 CreateBucket handler and the suite went red on this branch.
Extend the matcher so a request is recognized as S3 Tables when either:
- its AWS V4 credential scope names SERVICE=s3tables; or
- it carries the canonical AWS JSON RPC 1.1 content type and is
unsigned (a request explicitly signed for SERVICE=s3 still wins).
The regular S3 SDKs do not send application/x-amz-json-1.1, so the
signal is safe for the colliding paths (/buckets, /get-table).
Also add an AWS SDK V2 for Go integration test under
test/s3/sdk_v2_routing/ that drives the SDK's own XML deserializer
against a bucket literally named "buckets" and "get-table" — the SDK
errors before the test asserts if the server returns the wrong body
shape. Wired up via .github/workflows/s3-sdk-v2-routing-tests.yml,
mirroring the etag/acl workflow.
* s3api: extend service matcher to all S3 Tables routes; simplify scope check
- Apply serviceMatcher to every S3 Tables route, not just the bare-path
ones. ARN-bearing paths could otherwise be hit by an S3 object key
that starts with arn:aws:s3tables:..., inside a bucket named
"buckets", "namespaces", "tables", or "tag". One matcher everywhere
closes both collision classes.
- Replace strings.Split + index lookup with strings.Contains for the
credential-scope check. The scope shape is fixed at
AK/DATE/REGION/SERVICE/aws4_request, slashes only delimit components,
and access keys are alphanumeric — so /s3tables/ matches iff SERVICE
is exactly s3tables. Existing unit cases (including the
access-key-substring case) still pass.
- Read the GetObject body in the SDK v2 routing test with io.ReadAll;
the single Read could return short and make the equality check flaky.
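To illustrate the second bullet, here is a small sketch comparing a split-and-index check with a substring check on a few credential strings. The function names `serviceBySplit` and `serviceByContains` and the sample credentials are assumptions made for this example; the commit's own unit cases are not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// serviceBySplit checks the fourth component of AK/DATE/REGION/SERVICE/aws4_request.
func serviceBySplit(credential string) bool {
	fields := strings.Split(credential, "/")
	return len(fields) == 5 && fields[3] == "s3tables"
}

// serviceByContains relies on slashes delimiting whole components:
// "/s3tables/" can only match a component that is exactly "s3tables".
func serviceByContains(credential string) bool {
	return strings.Contains(credential, "/s3tables/")
}

func main() {
	cases := []string{
		"AKIAEXAMPLE/20260525/us-east-1/s3tables/aws4_request",
		"AKIAEXAMPLE/20260525/us-east-1/s3/aws4_request",
		// Access key that contains the substring but is not the service.
		"s3tablesKEY/20260525/us-east-1/s3/aws4_request",
		"",
	}
	for _, c := range cases {
		fmt.Printf("%q split=%v contains=%v\n", c, serviceBySplit(c), serviceByContains(c))
	}
}
```

The two checks agree on every well-formed scope. They would differ only on a malformed credential that carries `s3tables` in the DATE or REGION position, which the substring check would accept.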
* s3api: drop content-type fallback; sign s3 tables harness traffic instead
The content-type fallback in isS3TablesSignedRequest let an anonymous
regular-S3 request whose body type is application/x-amz-json-1.1 hit
an S3 Tables route when the path-style object key happened to be
shaped like an S3 Tables ARN (e.g. PutObject on bucket "buckets"
with key arn:aws:s3tables:.../bucket/foo/policy). Narrow the matcher
back to the AWS V4 credential scope so only requests signed for
SERVICE=s3tables match the S3 Tables routes.
Update the Iceberg catalog test harness — the only caller still
sending unsigned PUT /buckets — to sign with SERVICE=s3tables. The
mini instance runs in default-allow mode, so the signature itself is
not verified; only the credential scope matters for the route match.
Drop the stale unit cases for the JSON-RPC content-type signal and
the routing test that exercised unsigned harness traffic.
110
.github/workflows/s3-sdk-v2-routing-tests.yml
vendored
Normal file
@@ -0,0 +1,110 @@
name: "S3 SDK V2 Route Disambiguation Tests"

on:
  push:
    branches: [ master ]
    paths:
      - 'weed/s3api/**'
      - 'test/s3/sdk_v2_routing/**'
      - '.github/workflows/s3-sdk-v2-routing-tests.yml'
  pull_request:
    branches: [ master ]
    paths:
      - 'weed/s3api/**'
      - 'test/s3/sdk_v2_routing/**'
      - '.github/workflows/s3-sdk-v2-routing-tests.yml'

concurrency:
  group: ${{ github.head_ref || github.ref }}/s3-sdk-v2-routing-tests
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  s3-sdk-v2-routing-tests:
    name: S3 SDK V2 Routing Tests
    runs-on: ubuntu-22.04
    timeout-minutes: 10
    steps:
      - name: Check out code
        uses: actions/checkout@v6

      - name: Set up Go
        uses: actions/setup-go@v6
        with:
          go-version-file: 'go.mod'

      - name: Install SeaweedFS
        run: |
          cd weed && go install -buildvcs=false

      - name: Start weed mini (S3 on :8333)
        # Pins the regression for issue #9559: AWS SDK V2 / Hadoop s3a
        # listing a bucket literally named "buckets" must get an XML
        # ListObjectsV2 response, not the JSON ListTableBuckets body
        # served by the S3 Tables REST endpoint on the same path.
        run: |
          mkdir -p /tmp/seaweedfs-sdk-v2-routing
          cat > /tmp/seaweedfs-sdk-v2-routing-s3.json <<'JSON'
          {
            "identities": [
              {
                "name": "admin",
                "credentials": [
                  {"accessKey": "some_access_key1", "secretKey": "some_secret_key1"}
                ],
                "actions": ["Admin", "Read", "Write"]
              }
            ]
          }
          JSON
          AWS_ACCESS_KEY_ID=some_access_key1 \
          AWS_SECRET_ACCESS_KEY=some_secret_key1 \
          weed mini \
            -dir=/tmp/seaweedfs-sdk-v2-routing \
            -s3.port=8333 \
            -s3.config=/tmp/seaweedfs-sdk-v2-routing-s3.json \
            -ip=127.0.0.1 \
            > /tmp/weed-mini.log 2>&1 &
          echo $! > /tmp/weed-mini.pid

          for i in $(seq 1 30); do
            if curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:8333/ | grep -qE "^(200|403)$"; then
              echo "weed mini is ready"
              exit 0
            fi
            sleep 1
          done
          echo "weed mini failed to start within 30s"
          tail -50 /tmp/weed-mini.log
          exit 1

      - name: Run SDK V2 routing tests
        env:
          S3_ENDPOINT: http://127.0.0.1:8333
          AWS_ACCESS_KEY_ID: some_access_key1
          AWS_SECRET_ACCESS_KEY: some_secret_key1
          AWS_REGION: us-east-1
        run: go test -v -timeout=5m ./test/s3/sdk_v2_routing/...

      - name: Stop weed mini
        if: always()
        run: |
          if [ -f /tmp/weed-mini.pid ]; then
            kill "$(cat /tmp/weed-mini.pid)" 2>/dev/null || true
          fi

      - name: Show server log on failure
        if: failure()
        run: |
          echo "=== weed mini log (last 200 lines) ==="
          tail -n 200 /tmp/weed-mini.log 2>/dev/null || echo "no log available"

      - name: Archive log
        if: failure()
        uses: actions/upload-artifact@v7
        with:
          name: s3-sdk-v2-routing-server-log
          path: /tmp/weed-mini.log
          retention-days: 3