Chris Lu a769c938ec test(s3tables): Unity Catalog OSS integration tests against SeaweedFS (#9308)
* test(s3tables): add Unity Catalog OSS integration test against SeaweedFS

Mirrors the configuration used by the upstream playground at
data-engineering-helpers/mds-in-a-box/unitycatalog-playground.

Three test variants under test/s3tables/unity_catalog:

- TestUnityCatalogDeltaIntegration: aws.masterRoleArn empty / static
  keys; catalog/schema/EXTERNAL Delta CRUD + temporary-table-credentials
  S3 round-trip (the playground's working configuration).
- TestUnityCatalogMasterRoleIntegration: aws.masterRoleArn set to a
  SeaweedFS-side role with a permissive trust policy; UC's StsClient
  is pinned at SeaweedFS via AWS_ENDPOINT_URL_STS, and the test asserts
  the vended creds carry a session_token and a non-static access key,
  proving the role-vended path the playground notes as not-yet-working
  actually does work today.
- TestUnityCatalogDeltaRsRoundTrip: writes/reads a real Delta table at
  the registered storage_location using delta-rs in a slim Python
  container, with temporary credentials fetched from UC.

All three self-skip without Docker or a weed binary, matching the
sibling lakekeeper / polaris tests.
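
The self-skip check can be sketched roughly as follows. This is an illustration only, not the helper the tests actually use: `missingTools` is a hypothetical name, and the sketch assumes both tools are looked up on PATH under the names `docker` and `weed`.

```go
package main

import (
	"os/exec"
)

// missingTools returns the names from tools that cannot be found on PATH.
// Hypothetical helper: the real tests may locate the weed binary differently.
func missingTools(tools ...string) []string {
	var missing []string
	for _, name := range tools {
		// exec.LookPath searches the directories listed in PATH.
		if _, err := exec.LookPath(name); err != nil {
			missing = append(missing, name)
		}
	}
	return missing
}
```

A test would call `missingTools("docker", "weed")` first and pass any non-empty result to `t.Skipf`. Finding `docker` on PATH does not prove the daemon is running, so a real check may need to go further.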

* test(s3tables): tighten Unity Catalog tests against actual UC OSS behavior

After running the suite locally, ground the assertions in what the
upstream UC OSS Docker image actually does against SeaweedFS today.

- Static-key playground configuration
  (TestUnityCatalogDeltaIntegration): catalog/schema/EXTERNAL Delta CRUD
  pass against the SeaweedFS-backed warehouse. The temporary-table-
  credentials subtest is renamed and inverted to assert the failure mode
  the playground reports -- UC's AwsCredentialVendor falls through to an
  internal StsClient.assumeRole when masterRoleArn and sessionToken are
  both empty, which has no real STS to talk to. Bucket path is also
  fixed to match UC's getStorageBase() lookup (s3://lakehouse vs the
  playground's s3://lakehouse/warehouse, which the upstream code never
  matches).

- Master-role variant (TestUnityCatalogMasterRoleIntegration): split
  into two passing slices. Slice 1 proves SeaweedFS' STS endpoint
  vending UnityCatalogVendedRole works via the Go AWS SDK and the vended
  creds round-trip on S3. Slice 2 boots UC with aws.masterRoleArn set
  and verifies catalog/schema/Delta CRUD. The third hop -- UC's Java
  StsClient actually reaching SeaweedFS' STS handler during
  /temporary-table-credentials -- is logged but not asserted, since the
  AWS Java SDK's STS request currently lands on a SeaweedFS S3 path
  rather than the STS handler.

- Delta-RS round-trip (TestUnityCatalogDeltaRsRoundTrip): gated on
  UC_DELTA_RS_RUN=1 since it depends on the master-role STS handoff
  above. The Dockerfile / writer script stay in tree so the test runs
  end-to-end the moment that hop is fixed.

README rewritten to be explicit about what each test validates today
and what is still pending.

Result: `go test -run TestUnityCatalog ./test/s3tables/unity_catalog/...`
passes cleanly with weed + Docker available, and self-skips otherwise.

* test(s3tables): exercise unity catalog integrations

* ci: run Unity Catalog integration tests on PRs

Adds a unity-catalog-integration-tests job to s3-tables-tests.yml,
modeled on the existing lakekeeper / dremio jobs. Pre-pulls the UC
image and python:3.11-slim (used by the delta-rs writer container) and
runs `go test ./test/s3tables/unity_catalog`.

Format-check and go-vet jobs already recurse into ./test/s3tables/...
so the new package is covered there too.

* test/ci: address PR review

Tighten the UC readiness probe to require 200, not <500, so a
401/403/404 during startup surfaces immediately instead of being
treated as ready (CodeRabbit).
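
A minimal sketch of a strict readiness probe, assuming the probe polls one HTTP URL until it answers 200. The names `isReady` and `waitForReady` and the polling interval are hypothetical, and the real helper may fail fast on a 4xx rather than keep polling.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// isReady treats only 200 as ready. With a "< 500" check, a 401, 403, or 404
// returned during startup would have counted as ready.
func isReady(status int) bool {
	return status == http.StatusOK
}

// waitForReady polls url until it answers 200 or ctx is done.
func waitForReady(ctx context.Context, client *http.Client, url string, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return err
		}
		resp, err := client.Do(req)
		if err == nil {
			status := resp.StatusCode
			resp.Body.Close()
			if isReady(status) {
				return nil
			}
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("service at %s not ready: %w", url, ctx.Err())
		case <-ticker.C:
			// Interval elapsed; probe again.
		}
	}
}
```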

Pin the UC image to v0.4.0 in both the workflow and the test default,
matching the pinned-tag convention the rest of s3-tables-tests.yml
uses (CodeRabbit). Use UC_IMAGE=unitycatalog/unitycatalog:main to
re-test against current upstream.

* docs: separate UC static-key vs master-role failure modes

The README mixed the two together. Static-key empty-sessionToken
short-circuits with "S3 bucket configuration not found." before UC
even fires an STS call; the AccessDenied I described is what happens
in the master-role variant where UC's Java StsClient actually reaches
SeaweedFS. Cross-link the playground PR that fixes the static-key
vending side.

Also drop the "what most playground users actually run" hand-wave
under MANAGED tables.

* docs: trim README

Drop the playground cross-reference and the "two layers fail
independently" framing.

* docs: pin down what's actually pending

Investigated the master-role STS handoff with a sniffer in front of
SeaweedFS' STS port. UC's StsClient is constructed without an
endpointOverride and never reads aws.endpoint or AWS_ENDPOINT_URL_STS;
verified by pointing AWS_ENDPOINT_URL_STS at port 1 and seeing the
same real-AWS InvalidClientTokenId 403 with zero traffic to SeaweedFS.

The fix is upstream in UC. Updated the README and the master-role
test's t.Logf to say so precisely, and dropped the stale "Spark client"
bullet (delta-rs covers that path).

* test(s3tables): use BaseEndpoint instead of deprecated resolver

EndpointResolverWithOptions is deprecated in aws-sdk-go-v2; the
supported way to override a service endpoint is via the per-service
Options.BaseEndpoint. Switch the assume-role helper to that pattern so
the test stops compiling against deprecated API and the resolver
boilerplate disappears.

Addresses gemini review on PR #9308.

* test(s3tables): drop unused splitS3URI helper

Helper had no callers; gemini caught it on PR #9308. Easy to bring
back from git history if needed.

* test(s3tables): extract last token of docker run output as container ID

docker run -d may prefix the container ID with image-pull progress
when the image isn't cached locally. strings.TrimSpace on the whole
output then gave a multi-line string, not the ID. Take the last
whitespace-separated token so the ID survives a fresh CI runner.
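
The extraction amounts to something like the sketch below. `containerIDFromRunOutput` is a hypothetical name, and the sketch assumes the captured output ends with the container ID.

```go
package main

import "strings"

// containerIDFromRunOutput returns the last whitespace-separated token of
// the output captured from `docker run -d`, or "" if the output is blank.
func containerIDFromRunOutput(out string) string {
	fields := strings.Fields(out)
	if len(fields) == 0 {
		return ""
	}
	return fields[len(fields)-1]
}
```

`strings.Fields` splits on any run of whitespace, including newlines, so pull-progress lines ahead of the ID are discarded.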

Addresses gemini review on PR #9308.

* test(s3tables): cap Unity Catalog response body reads at 10 MiB

io.ReadAll without a limit could OOM the test runner if the UC
container hands back an unexpectedly large body. 10 MiB is well
above any well-formed catalog response and turns a misbehaving
server into a test failure instead of a runner crash.

Addresses gemini review on PR #9308.

* docs: link UC fix PR and call out UC's mocked-Sts test pattern

UC's own credential-vending tests substitute StsClient with an in-process
EchoAwsStsClient (BaseCRUDTestWithMockCredentials) or Mockito.mockStatic
(CloudCredentialVendorTest), so the wire path between UC's Java SDK and
a real STS server is untested -- which is why the missing endpointOverride
slipped through upstream. Linked the upstream fix at
unitycatalog/unitycatalog#1532.
2026-05-04 21:14:22 -07:00

To populate data, run:

  - make -C test/s3tables help
  - make -C test/s3tables populate-trino
  - make -C test/s3tables populate-spark

  Run:

  - make -C test/s3tables populate
  - If your account ID differs, override it: make -C test/s3tables populate
    TABLE_ACCOUNT_ID=000000000000