mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-05-14 05:41:29 +00:00
* fix(iceberg): advertise default namespace location for REST clients
Trino's REST catalog has two code paths for CREATE TABLE depending on
whether a table location can be resolved before the catalog call:
    // TrinoRestCatalog.newCreateTableTransaction (Trino 479)
    if (location.isEmpty()) {
        return tableBuilder.create().newTransaction();   // EAGER: REST POST now
    }
    return tableBuilder.withLocation(...).createTransaction(); // DEFERRED
`tableLocation` resolves to null when the user does not pass
`location = '...'` AND `defaultTableLocation` returns null. The latter
happens whenever the namespace's `loadNamespaceMetadata` response has no
`location` property — and our handler returned exactly that.
In the eager branch Trino calls REST POST /v1/.../tables immediately, so
our handleCreateTable persists `<location>/metadata/v1.metadata.json` to
the filer. Trino's IcebergMetadata.beginCreateTable then runs
`fileSystem.listFiles(location).hasNext()` on the same path, finds the
metadata file we just wrote, and throws "Cannot create a table on a
non-empty location" — even on a brand-new schema and table name.
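The ordering problem described above can be reproduced with a toy model. This is only an illustrative sketch in Go, not Trino or SeaweedFS code: `store`, `hasFiles`, and `createTable` are hypothetical names, the filer is modeled as an in-memory set of paths, and the unique `<table>-<UUID>` suffix of the deferred branch is left out for brevity.

```go
package main

import (
	"errors"
	"strings"
)

// store is a toy stand-in for the filer: a set of object paths.
type store map[string]bool

// hasFiles reports whether any object lives under prefix, mimicking
// the listFiles(location).hasNext() emptiness check.
func (s store) hasFiles(prefix string) bool {
	p := strings.TrimSuffix(prefix, "/") + "/"
	for path := range s {
		if strings.HasPrefix(path, p) {
			return true
		}
	}
	return false
}

// createTable models the two branches. With eager=true the catalog writes
// v1.metadata.json before the emptiness check runs; with eager=false the
// check runs first and the metadata write happens afterwards (at commit).
func createTable(s store, location string, eager bool) error {
	metadata := location + "/metadata/v1.metadata.json"
	if eager {
		s[metadata] = true // REST POST persists metadata immediately
	}
	if s.hasFiles(location) {
		return errors.New("Cannot create a table on a non-empty location: " + location)
	}
	s[metadata] = true // deferred: metadata lands at commit time
	return nil
}
```

In this model, calling `createTable` with `eager` set to true on an empty store fails, because the check sees the metadata file written a moment earlier, while the same call with `eager` set to false succeeds.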
Synthesize a default `location` of `s3://<bucket>/<flattened-namespace>`
on Get/Create namespace responses when one is not stored. With a non-
null `defaultTableLocation`, Trino takes the deferred branch, picks a
unique `<namespace-location>/<table>-<UUID>` path (UUID added by the
standard `iceberg.unique-table-location=true` setting), and the empty-
location check passes. The actual REST POST is deferred to commit time,
so our metadata write lands alongside the data files Trino has already
produced.
Existing namespaces with an explicit `location` property are untouched —
the synthesis only kicks in when the property is absent.
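A minimal sketch of the synthesis rule, assuming the namespace properties are a plain string map. The function name `withDefaultLocation` is hypothetical, and joining namespace levels with "." is an assumption about how the namespace is flattened; the actual handler may differ on both points.

```go
package main

import "strings"

// withDefaultLocation returns a copy of props that always carries a
// "location" entry. An explicit location is kept untouched; otherwise
// s3://<bucket>/<flattened-namespace> is synthesized.
// ASSUMPTION: namespace levels are flattened with "."; the real
// separator used by the handler may be different.
func withDefaultLocation(bucket string, namespace []string, props map[string]string) map[string]string {
	out := make(map[string]string, len(props)+1)
	for k, v := range props {
		out[k] = v
	}
	if out["location"] == "" {
		out["location"] = "s3://" + bucket + "/" + strings.Join(namespace, ".")
	}
	return out
}
```

Returning a copy rather than mutating the stored properties keeps the synthesized value out of persisted namespace metadata, so it only appears in responses.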
Refs #9074
* test(trino): regression for fresh CREATE TABLE without explicit location
Exercises the follow-up scenario reported on issue #9074: a CREATE TABLE
on a brand-new schema and brand-new table name with NO `location = '...'`
clause, followed by a CTAS on top — exactly the SQL pattern from the
report. Before advertising a default namespace location, the first
CREATE TABLE failed with
    Cannot create a table on a non-empty location:
    s3://iceberg-tables/<schema>/<table>, set
    'iceberg.unique-table-location=true' in your Iceberg catalog
    properties to use unique table locations for every table.
even though the bucket was genuinely empty. The bug surfaced because our
handleCreateTable persists `<location>/metadata/v1.metadata.json` during
Trino's eager-create branch, and Trino's post-create listFiles
emptiness check then trips on the metadata file we just wrote.
The test asserts that the CTAS succeeds AND that the resulting table
contains the source rows, because the reporter saw the table created but
left empty when the query failed.
Refs #9074