Files
seaweedfs/weed
qzhello 69c84801e4 fix(s3tables/iceberg): make metadata spec-compliant and accept real-world manifest names (#9703)
* fix(s3tables/iceberg): make metadata spec-compliant and accept real-world manifest names

Two related issues prevent SeaweedFS S3 Tables from interoperating with
strict Iceberg clients (Java/Spark/Flink/Trino):

1. iceberg-go v0.5.0 serializes empty TableMetadata state by dropping
   keys via `omitempty` on optional pointer/slice fields. The Iceberg
   table spec, however, requires `current-snapshot-id`, `snapshots`,
   `snapshot-log`, `metadata-log`, and `refs` to be present even when
   empty (`current-snapshot-id` must be -1 for a table with no
   snapshots). Java's TableMetadataParser uses JsonUtil.getLong on
   `current-snapshot-id` and throws "Cannot parse missing long
   current-snapshot-id" against responses produced by this server.

2. The Iceberg layout validator only accepts manifest filenames that
   match Iceberg's internal naming (`{uuid}-m{n}.avro`,
   `snap-{n}-{n}-{uuid}.avro`). Real writers — notably Flink's sink —
   emit manifests like
   `{flink-job-id}-{checkpoint}-{operator-id}-{n}.avro`, which the
   validator rejects with 403, breaking INSERT commits.

Fixes:

* Add ensureMetadataSpecCompliance helper that backfills the five
  spec-required empty-state fields when iceberg-go omits them or emits
  explicit JSON null. Apply it on every code path that writes
  v*.metadata.json to S3 or returns metadata to clients
  (handlers_table create-table, handlers_commit, commit_helpers
  create-on-commit, plus MarshalJSON on LoadTableResult and
  CommitTableResponse). Real values from non-empty tables are never
  overwritten.

* Add catch-all regex entries to metadataFilePatterns accepting any
  *.avro / *.metadata.json filename composed of [A-Za-z0-9._-]. The
  Iceberg spec does not mandate filename format; the strict patterns
  remain for documentation. Metadata-directory subdirectory rejection
  and the data-file path validation are unchanged.

No upstream dependencies are forked: iceberg-go stays at v0.5.0 and
go.mod is untouched. The compliance layer can be removed once upstream
emits spec-compliant output.

Tests (all pass under `go test -race`):
- metadata_compliance_test.go: 5 cases covering missing fields,
  preserved real values, explicit null, invalid JSON, empty input.
- iceberg_layout_test.go: 3 groups (16 subtests) covering real-world
  manifest names from Flink/Spark/Iceberg, security boundary
  (subdirectories, bad extensions), and data-file regression.

* fix(s3tables/iceberg): preserve metadata key order and keep config field stable

Two small follow-ups on the spec-compliance fix:

* ensureMetadataSpecCompliance now splices missing keys in at the byte
  level just before the closing brace, so iceberg-go's struct-declared
  key order survives the backfill. The previous unmarshal/remarshal
  through map[string]json.RawMessage silently alphabetized every key in
  the document, which is spec-legal but breaks byte-equality fixtures
  and any downstream hashing of the persisted metadata. The slower
  remarshal path is kept for the rare explicit-null replacement case.

* LoadTableResult.MarshalJSON now serializes Config without omitempty,
  matching the struct field tag. The custom marshaler had silently
  flipped the tag to ,omitempty, which made the "config" key disappear
  from the response whenever s3Endpoint was unset (since
  buildFileIOConfig returned an empty but non-nil Properties map).

Tests:
- PreservesOriginalKeyOrder pins the byte-level output against
  iceberg-go's emitted shape; would have caught the alphabetization
  regression.
- EmptyObjectBackfilled covers the {} -> sentinels-only case (no
  leading comma).
- AllPresentReturnsSameBytes confirms the no-op path returns input
  bytes unchanged, with whitespace intact.
- iceberg_layout_test pins the catch-all $ anchor: metadata/file.avro.txt
  must still be rejected.

* fix(s3tables/iceberg): guard ensureMetadataSpecCompliance against top-level null

json.Unmarshal of a JSON `null` literal succeeds but leaves the map nil.
The current byte-append path no-ops gracefully on this input, but the
slow remarshal path would panic with "assignment to entry in nil map"
if the input ever combined `null` with the explicit-null detection. Add
an explicit nil-map short-circuit so the safety property is obvious
from the source, and a test that pins the contract.

* test(s3tables/iceberg): assert full byte equality in AllPresentReturnsSameBytes

The prefix check only caught a missing "{\n  " opener, so the test
would have passed even if the function silently reordered keys or
collapsed whitespace later in the document. Switch to a full string
comparison so any future regression in the no-op path is loud.

---------

Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-05-27 13:05:41 -07:00
..
2026-04-10 17:31:14 -07:00
2026-04-14 20:48:24 -07:00
2026-04-23 10:05:51 -07:00