Chris Lu a769c938ec test(s3tables): Unity Catalog OSS integration tests against SeaweedFS (#9308)
* test(s3tables): add Unity Catalog OSS integration test against SeaweedFS

Mirrors the configuration used by the upstream playground at
data-engineering-helpers/mds-in-a-box/unitycatalog-playground.

Three test variants under test/s3tables/unity_catalog:

- TestUnityCatalogDeltaIntegration: aws.masterRoleArn empty / static
  keys; catalog/schema/EXTERNAL Delta CRUD + temporary-table-credentials
  S3 round-trip (the playground's working configuration).
- TestUnityCatalogMasterRoleIntegration: aws.masterRoleArn set to a
  SeaweedFS-side role with a permissive trust policy; UC's StsClient
  is pinned at SeaweedFS via AWS_ENDPOINT_URL_STS, and the test asserts
  the vended creds carry a session_token and a non-static access key,
  proving the role-vended path the playground notes as not-yet-working
  actually does work today.
- TestUnityCatalogDeltaRsRoundTrip: writes/reads a real Delta table at
  the registered storage_location using delta-rs in a slim Python
  container, with temporary credentials fetched from UC.

All three self-skip without Docker or a weed binary, matching the
sibling lakekeeper / polaris tests.

* test(s3tables): tighten Unity Catalog tests against actual UC OSS behavior

After running the suite locally, ground the assertions in what the
upstream UC OSS Docker image actually does against SeaweedFS today.

- Static-key playground configuration
  (TestUnityCatalogDeltaIntegration): catalog/schema/EXTERNAL Delta CRUD
  pass against the SeaweedFS-backed warehouse. The temporary-table-
  credentials subtest is renamed and inverted to assert the failure mode
  the playground reports -- UC's AwsCredentialVendor falls through to an
  internal StsClient.assumeRole when masterRoleArn and sessionToken are
  both empty, which has no real STS to talk to. Bucket path is also
  fixed to match UC's getStorageBase() lookup (s3://lakehouse vs the
  playground's s3://lakehouse/warehouse, which the upstream code never
  matches).

- Master-role variant (TestUnityCatalogMasterRoleIntegration): split
  into two passing slices. Slice 1 proves SeaweedFS' STS endpoint
  vending UnityCatalogVendedRole works via the Go AWS SDK and the vended
  creds round-trip on S3. Slice 2 boots UC with aws.masterRoleArn set
  and verifies catalog/schema/Delta CRUD. The third hop -- UC's Java
  StsClient actually reaching SeaweedFS' STS handler during
  /temporary-table-credentials -- is logged but not asserted, since the
  AWS Java SDK's STS request currently lands on a SeaweedFS S3 path
  rather than the STS handler.

- Delta-RS round-trip (TestUnityCatalogDeltaRsRoundTrip): gated on
  UC_DELTA_RS_RUN=1 since it depends on the master-role STS handoff
  above. The Dockerfile / writer script stay in tree so the test runs
  end-to-end the moment that hop is fixed.

README rewritten to be explicit about what each test validates today
and what is still pending.

Result: `go test -run TestUnityCatalog ./test/s3tables/unity_catalog/...`
passes cleanly with weed + Docker available, and self-skips otherwise.

* test(s3tables): exercise unity catalog integrations

* ci: run Unity Catalog integration tests on PRs

Adds a unity-catalog-integration-tests job to s3-tables-tests.yml,
modeled on the existing lakekeeper / dremio jobs. Pre-pulls the UC
image and python:3.11-slim (used by the delta-rs writer container) and
runs `go test ./test/s3tables/unity_catalog`.

Format-check and go-vet jobs already recurse into ./test/s3tables/...
so the new package is covered there too.

* test/ci: address PR review

Tighten the UC readiness probe to require 200, not <500, so a
401/403/404 during startup surfaces immediately instead of being
treated as ready (CodeRabbit).

Pin the UC image to v0.4.0 in both the workflow and the test default,
matching the pinned-tag convention the rest of s3-tables-tests.yml
uses (CodeRabbit). Use UC_IMAGE=unitycatalog/unitycatalog:main to
re-test against current upstream.

* docs: separate UC static-key vs master-role failure modes

The README mixed the two together. Static-key empty-sessionToken
short-circuits with "S3 bucket configuration not found." before UC
even fires an STS call; the AccessDenied I described is what happens
in the master-role variant where UC's Java StsClient actually reaches
SeaweedFS. Cross-link the playground PR that fixes the static-key
vending side.

Also drop the "what most playground users actually run" hand-wave
under MANAGED tables.

* docs: trim README

Drop the playground cross-reference and the "two layers fail
independently" framing.

* docs: pin down what's actually pending

Investigated the master-role STS handoff with a sniffer in front of
SeaweedFS' STS port. UC's StsClient is constructed without an
endpointOverride and never reads aws.endpoint or AWS_ENDPOINT_URL_STS;
verified by pointing AWS_ENDPOINT_URL_STS at port 1 and seeing the
same real-AWS InvalidClientTokenId 403 with zero traffic to SeaweedFS.

The fix is upstream in UC. Updated the README and the master-role
test's t.Logf to say so precisely, and dropped the stale "Spark client"
bullet (delta-rs covers that path).

* test(s3tables): use BaseEndpoint instead of deprecated resolver

EndpointResolverWithOptions is deprecated in aws-sdk-go-v2; the
supported way to override a service endpoint is via the per-service
Options.BaseEndpoint. Switch the assume-role helper to that pattern so
the test stops compiling against deprecated API and the resolver
boilerplate disappears.

Addresses gemini review on PR #9308.

* test(s3tables): drop unused splitS3URI helper

Helper had no callers; gemini caught it on PR #9308. Easy to bring
back from git history if needed.

* test(s3tables): extract last token of docker run output as container ID

docker run -d may prefix the container ID with image-pull progress
when the image isn't cached locally. strings.TrimSpace on the whole
output then gave a multi-line string, not the ID. Take the last
whitespace-separated token so the ID survives a fresh CI runner.

Addresses gemini review on PR #9308.

* test(s3tables): cap Unity Catalog response body reads at 10 MiB

io.ReadAll without a limit could OOM the test runner if the UC
container hands back an unexpectedly large body. 10 MiB is well
above any well-formed catalog response and turns a misbehaving
server into a test failure instead of a runner crash.

Addresses gemini review on PR #9308.

* docs: link UC fix PR and call out UC's mocked-Sts test pattern

UC's own credential-vending tests substitute StsClient with an in-process
EchoAwsStsClient (BaseCRUDTestWithMockCredentials) or Mockito.mockStatic
(CloudCredentialVendorTest), so the wire path between UC's Java SDK and
a real STS server is untested -- which is why the missing endpointOverride
slipped through upstream. Linked the upstream fix at
unitycatalog/unitycatalog#1532.
2026-05-04 21:14:22 -07:00

Unity Catalog OSS integration tests

These tests run Unity Catalog OSS in Docker against an embedded SeaweedFS S3 endpoint. The server.properties mirrors the upstream playground at mds-in-a-box/unitycatalog-playground.

| Test | Variant | Status |
| --- | --- | --- |
| TestUnityCatalogDeltaIntegration | static keys, aws.masterRoleArn= empty | passes; covers catalog/schema/EXTERNAL Delta CRUD against SeaweedFS-backed warehouse and asserts that UC's /temporary-table-credentials cannot vend usable creds with this configuration -- exactly the gap the playground reports. |
| TestUnityCatalogMasterRoleIntegration | aws.masterRoleArn=arn:aws:iam::000000000000:role/UnityCatalogVendedRole | passes; proves SeaweedFS' STS endpoint accepts sts:AssumeRole for the role UC would use (Go SDK round-trip), and that UC starts and accepts CRUD when wired with the master-role config. UC's own StsClient still talks to real AWS regardless of aws.endpoint / AWS_ENDPOINT_URL_STS (UC bug, see below); that hop is logged via t.Logf rather than asserted. |
| TestUnityCatalogDeltaRsRoundTrip | static keys + delta-rs Python client | passes; resolves table metadata through UC and writes/reads a real Delta table at the registered storage_location using python:3.11-slim + deltalake with the SeaweedFS test credentials. |

Prerequisites

  • Docker available locally (the tests call docker run / docker build directly).
  • A weed binary at the repo root (weed/weed) or on $PATH.

Run

go test -timeout 15m \
    -run 'TestUnityCatalog' \
    ./test/s3tables/unity_catalog/...

Pin a specific Unity Catalog image (defaults to unitycatalog/unitycatalog:v0.4.0):

UC_IMAGE=unitycatalog/unitycatalog:main \
    go test -timeout 15m -run TestUnityCatalogDeltaIntegration \
    ./test/s3tables/unity_catalog/...

The tests self-skip when Docker is unavailable or no weed binary is found (neither at weed/weed in the repo root nor on $PATH); running under -short also skips them.

Why the static-key path can't vend usable creds

UC OSS' AwsCredentialVendor.createPerBucketCredentialGenerator:

if (config.getSessionToken() != null && !config.getSessionToken().isEmpty()) {
    return new AwsCredentialGenerator.StaticAwsCredentialGenerator(config);
}
return createStsCredentialGenerator(config);

With aws.masterRoleArn= empty and s3.sessionToken.0= empty (this test's configuration), /temporary-table-credentials short-circuits with "S3 bucket configuration not found." before UC fires any STS call. Setting a stub s3.sessionToken.0 switches UC to StaticAwsCredentialGenerator and the endpoint returns the static keys, but the response carries that stub session token -- SeaweedFS won't recognize it on the next S3 call, so the vended creds aren't usable for table I/O. Clients have to fall back to the static keys directly.

With aws.masterRoleArn set, UC's AwsCredentialGenerator.StsAwsCredentialGenerator builds the StsClient with only .region(...) and .credentialsProvider(...) -- no .endpointOverride(). The SDK's generic env-var resolution doesn't kick in for that builder shape, so even with AWS_ENDPOINT_URL_STS=... (or the matching aws.endpointUrlSts Java property, or the catch-all AWS_ENDPOINT_URL=...) the StsClient still targets real AWS and gets back InvalidClientTokenId. Verified by pointing the env var at port 1: UC reports the same AWS-issued 403 as it does when the variable points at SeaweedFS, and a sniffer in front of SeaweedFS' STS port records zero traffic. SeaweedFS' STS handler itself works -- the Go SDK round-trip in assumeRoleViaSeaweedFS proves that against the same SeaweedFS instance.

UC's own AWS credential-vending tests don't catch this because they mock StsClient away entirely -- BaseCRUDTestWithMockCredentials injects a custom stsClientBuilderSupplier returning an EchoAwsStsClient that synthesizes credentials in-process, and CloudCredentialVendorTest uses Mockito.mockStatic(StsClient.class). No test ever exercises the wire path between UC's Java SDK and a real STS endpoint, so the missing endpointOverride slipped through.

The fix is upstream in unitycatalog/unitycatalog#1532, which adds an aws.endpoint property and applies it to both the StsClient and the S3Client builders. Until that lands, the master-role test logs the failure but does not assert it.

What the tests actually validate today

  • Unity Catalog accepts a SeaweedFS-backed server.properties and starts.
  • Catalog / schema / EXTERNAL Delta table CRUD all work against the SeaweedFS warehouse via the UC REST API.
  • SeaweedFS' STS endpoint correctly issues sts:AssumeRole credentials for the UnityCatalogVendedRole and those credentials are accepted on S3 round-trips (Go AWS SDK).
  • Delta-RS resolves a UC table's storage_location and can write/read Delta data through the SeaweedFS S3 endpoint with the test credentials.

What is still pending

Nothing on the SeaweedFS side. The remaining gap (UC's StsClient ignoring endpoint config) needs a UC OSS patch upstream.

MANAGED tables

Not exercised. UC OSS gates them behind server.managed-table.enabled=true and a two-step staging flow (POST /staging-tables then POST /tables); EXTERNAL Delta is the simpler path and what these tests cover.