Mirror of https://github.com/seaweedfs/seaweedfs.git, synced 2026-05-14 13:51:33 +00:00.
* test(s3): force-drop collection after deleteBucket across tagging/versioning/cors/copying

Each test creates a unique bucket (= a new SeaweedFS collection), and the master's warm-create issues a 7-volume grow batch. The S3 DeleteBucket-driven collection sweep snapshots the layout once, but in-flight `volume_grow` requests keep registering volumes after the snapshot, leaking 1-3 volumes per bucket. On a single `weed mini` data node with the auto-derived volume cap, those leaks pile up fast, and every subsequent PutObject fails with a 500 "Not enough data nodes found".

Mirror the retention-suite fix (commits ac3a756d, 363d5caa) into the four other suites that share the same shape: defer a master-side /col/delete after each bucket teardown, and sweep stale prefix-matching buckets before every new createBucket. Each suite gets its own MASTER_ENDPOINT default plus the allTestBucketPrefixes / cleanupAllTestBuckets / cleanupLeftoverTestBuckets / forceDeleteCollection helpers.

Skipped: iam (separate framework + v1 SDK; already passes -master.volumeSizeLimitMB=100), sse (different cleanup signature; docker-compose already passes -volumeSizeLimitMB=50), and policy (its TestCluster already uses -master.volumeSizeLimitMB=32 and tears the whole cluster down). Suites with fewer than 5 tests cannot exhaust the cap and were left untouched.

* test(s3): default master endpoint to 127.0.0.1 to avoid IPv6 resolution

Mirror the explicit IPv4 default already used by test/s3/copying. On hosts where `localhost` resolves to ::1 first, the master HTTP listener (bound on 0.0.0.0) is unreachable, so the new force-delete-collection helper would silently skip cleanup. Pin the default to 127.0.0.1 in tagging/cors/versioning; the MASTER_ENDPOINT env var still wins when set.

* test(s3): skip buckets newer than this process in leftover-bucket sweep

cleanupAllTestBuckets/cleanupTestBuckets used to delete every bucket whose name matched the suite's prefix.
With shared S3/master endpoints, a second concurrent `go test` run could have its live buckets torn down mid-test by the first run's sweep. Capture testRunStart at package init (with a 1-minute backdate for clock skew) and skip any bucket whose CreationDate is newer than that. Stale buckets from panicked or interrupted prior runs (the original target of the sweep) still get collected, because they were created before this process started.

* test(s3-copying): sweep stale prefix buckets from createBucket, not just from a few callers

The s3-copying suite already had a cleanupTestBuckets helper that walks every bucket and drops the test-copying-* prefix matches, but only four tests in the file invoke it; the rest go straight to createBucket. A panicked or interrupted prior run could therefore leak buckets that survive into the next run and exhaust the data node's volume slots before any of the prefix-sweeping tests get a chance to run. Hoist the sweep into createBucket so every test that creates a bucket starts on a clean slate. The per-process CreationDate filter from the prior commit keeps this safe under concurrent runs.

* test(s3): scope leftover-bucket sweep to this run via runID marker

The CreationDate filter only protected against runs that started later than this process. A different `go test` run against the same S3/master endpoints that started *earlier* and is still active has buckets older than testRunStart, so the sweep would still tear them down mid-test. Replace the time window with a per-process runID baked into bucket names: every bucket this run creates gets `r{runID}-` after the suite's BucketPrefix, and the sweep only matches that owned subset. Concurrent runs each carry their own runID and never see each other's buckets.

Trade-off: buckets left behind by a crashed prior run carry a different runID and won't be cleaned up by this sweep. That recovery path now belongs to `make clean` / a data-dir wipe, which is what CI already does between jobs.
Fixed-name versioning buckets (e.g. test-versioning-directories) bypass getNewBucketName and so also bypass this sweep; they are short-lived and handled by their own deferred deleteBucket.

* test(s3): drop cleanupLeftoverTestBuckets wrapper, call cleanupAllTestBuckets directly

cleanupLeftoverTestBuckets was a one-line forwarder kept only as a semantic name at the createBucket call site. Inline the call and update the doc comments to point at the actual implementation. tagging/cors/versioning all had the wrapper; copying never did.