5 Commits

Author SHA1 Message Date
Chris Lu
e648c76bcf go fmt 2026-04-10 17:31:14 -07:00
Chris Lu
fbe758efa8 test: consolidate port allocation into shared test/testutil package (#8982)
* test: consolidate port allocation into shared test/testutil package

Move duplicated port allocation logic from 15+ test files into a single
shared package at test/testutil/. This fixes a port collision bug where
independently allocated ports could overlap via the gRPC offset
(port+10000), causing weed mini to reject the configuration.

The shared package provides:
- AllocatePorts: atomic allocation of N unique ports
- AllocateMiniPorts/MustFreeMiniPorts: gRPC-offset-aware allocation
  that prevents port A+10000 == port B collisions
- WaitForPort, WaitForService, FindBindIP, WriteIAMConfig, HasDocker

* test: address review feedback and fix FUSE build

- Revert fuse_integration change: it has its own go.mod and cannot
  import the shared testutil package
- AllocateMiniPorts: hold all listeners open until the entire batch is
  allocated, preventing race conditions where other processes steal ports
- HasDocker: add 5s context timeout to avoid hanging on stalled Docker
- WaitForService: only treat 2xx HTTP status codes as ready

* test: use global rand in AllocateMiniPorts for better seeding

Go 1.20+ auto-seeds the global rand generator. Using it avoids
identical sequences when multiple tests call at the same nanosecond.

* test: revert WaitForService status code check

S3 endpoints return non-2xx (e.g. 403) on bare GET requests, so
requiring 2xx caused the S3 integration test to time out. Any HTTP
response is sufficient proof that the service is running.

* test: fix gofmt formatting in s3tables test files
2026-04-08 11:30:02 -07:00
Chris Lu
98714b9f70 fix(test): address flaky S3 distributed lock integration test (#8888)
* fix(test): address flaky S3 distributed lock integration test

Two root causes:

1. Lock ring convergence race: After waitForFilerCount(2) confirms the
   master sees both filers, there's a window where filer0's lock ring
   still only contains itself (master's LockRingUpdate broadcast is
   delayed by the 1s stabilization timer). During this window filer0
   considers itself primary for ALL keys, so both filers can
   independently grant the same lock.

   Fix: Add waitForLockRingConverged() that acquires the same lock
   through both filers and verifies mutual exclusion before proceeding.

2. Hash function mismatch: ownerForObjectLock used util.HashStringToLong
   (MD5 + modulo) to predict lock owners, but the production DLM uses
   CRC32 consistent hashing via HashRing. This meant the test could
   pick keys that route to the same filer, not exercising the
   cross-filer coordination it intended to test.

   Fix: Use lock_manager.NewHashRing + GetPrimary() to match production
   routing exactly.

* fix(test): verify lock denial reason in convergence check

Ensure the convergence check only returns true when the second lock
attempt is denied specifically because the lock is already owned,
avoiding false positives from transient errors.

* fix(test): check one key per primary filer in convergence wait

A single arbitrary key can false-pass: if its real primary is the filer
with the stale ring, mutual exclusion holds trivially because that filer
IS the correct primary. Generate one test key per distinct primary using
the same consistent-hash ring as production, so a stale ring on any
filer is caught deterministically.
2026-04-02 13:11:04 -07:00
Chris Lu
0884acd70c ci: add S3 mutation regression coverage (#8804)
* test/s3: stabilize distributed lock regression

* ci: add S3 mutation regression workflow

* test: fix delete regression readiness probe

* test: address mutation regression review comments
2026-03-28 20:17:20 -07:00
Chris Lu
0adb78bc6b s3api: make conditional mutations atomic and AWS-compatible (#8802)
* s3api: serialize conditional write finalization

* s3api: add conditional delete mutation checks

* s3api: enforce destination conditions for copy

* s3api: revalidate multipart completion under lock

* s3api: rollback failed put finalization hooks

* s3api: report delete-marker version deletions

* s3api: fix copy destination versioning edge cases

* s3api: make versioned multipart completion idempotent

* test/s3: cover conditional mutation regressions

* s3api: rollback failed copy version finalization

* s3api: resolve suspended delete conditions via latest entry

* s3api: remove copy test null-version injection

* s3api: reject out-of-order multipart completions

* s3api: preserve multipart replay version metadata

* s3api: surface copy destination existence errors

* s3api: simplify delete condition target resolution

* test/s3: make conditional delete assertions order independent

* test/s3: add distributed lock gateway integration

* s3api: fail closed multipart versioned completion

* s3api: harden copy metadata and overwrite paths

* s3api: create delete markers for suspended deletes

* s3api: allow duplicate multipart completion parts
2026-03-27 19:22:26 -07:00