seaweedfs

mirror of https://github.com/seaweedfs/seaweedfs.git synced 2026-05-22 17:51:30 +00:00

Author	SHA1	Message	Date
Chris Lu	5004b4e542	feat(s3/lifecycle): delete streaming algorithm path (Phase 5b) (#9466 ) * feat(s3/lifecycle): delete streaming algorithm path (Phase 5b) Phase 5a (PR #9465) retired the algorithm flag and made daily_replay the only execution path. The streaming-side code (scheduler.Scheduler, scheduler.BucketBootstrapper, dispatcher.Pipeline, dispatcher.Dispatcher, dispatcher.FilerPersister, and their tests) has had no in-tree caller since then. This PR deletes it. Net change: ~4800 lines removed, ~130 added (the scheduler/configload tests' helper file the deleted bootstrap_test.go used to host). Removed: - weed/s3api/s3lifecycle/scheduler/{bootstrap,bootstrap_test, scheduler,scheduler_test,pipeline_fanout_test, refresh_default,refresh_s3tests}.go - weed/s3api/s3lifecycle/dispatcher/{dispatcher,dispatcher_test, dispatcher_helpers_test,edge_cases_test,multi_shard_test, pipeline,pipeline_test,pipeline_helpers_test,toproto_test, dispatch_ticks_default,dispatch_ticks_s3tests}.go - weed/s3api/s3lifecycle/dispatcher/filer_persister_test.go (FilerPersister deleted; FilerStore tests don't need their own file) - weed/shell/command_s3_lifecycle_run_shard{,_test}.go (debug-only shell command that only ever wrapped the streaming pipeline; the production worker now exercises the same path every daily run) Trimmed: - dispatcher/filer_persister.go down to FilerStore + NewFilerStoreClient — the small interface daily_replay's cursor persister (dailyrun.FilerCursorPersister) plugs into. Kept (still consumed by daily_replay): - scheduler/configload.{go,_test.go} (LoadCompileInputs, AllActivePriorStates) - dispatcher/sibling_lister.{go,_test.go} (NewFilerSiblingLister, FilerSiblingLister) - dispatcher/filer_persister.go (FilerStore, NewFilerStoreClient) scheduler/testhelpers_test.go restores fakeFilerClient, fakeListStream, dirEntry, fileEntry — helpers the configload tests used to share with the deleted bootstrap_test.go. 
Updates the handler-package doc strings and one reader-package comment that still named the streaming pipeline. * fix(s3/lifecycle): hold lock through tree read in test filer client gemini caught an inconsistency in scheduler/testhelpers_test.go: LookupDirectoryEntry reads c.tree under c.mu, but ListEntries was releasing the lock before reading c.tree. The map is effectively static during tests so there's no actual race today, but matching the convention keeps the helper safe if a future test mutates the tree mid-run.	2026-05-12 12:54:52 -07:00
Chris Lu	ad77362be3	test(s3/lifecycle): bundle reader + scheduler helper coverage (#9412 ) * test(s3/lifecycle): bundle reader + scheduler helper coverage Bundles direct tests for previously-uncovered helpers in two packages. Bumps reader 73.2% → 79.2% and scheduler 71.6% → 73.6%. Reader Event predicates (4): - IsCreate: NewEntry-only event classifies as create - IsDelete: OldEntry-only event classifies as delete - both entries (update): neither IsCreate nor IsDelete (strict exclusivity so router routes updates through their own path) - no entries (degenerate): neither (so a metadata-only filer event with no payload doesn't trigger spurious dispatches) Reader LogStartup (4): exercises both shape branches (single-shard ShardID vs ShardPredicate), the explicit-StartTsNs override path, and the Cursor.MinTsNs fallback when StartTsNs=0. Side-effect-only function; tests pin compile-time shape and visit each code path. Scheduler pipelineFanout.InjectEvent (5): - nil event silently absorbed (no follow-up panic in receiving pipeline) - unknown shard returns nil (forward-compat for future shard-mapping gaps) - known shard succeeds - ctx cancellation propagates when underlying pipeline's buffer fills - routes to the correct pipeline among multiple, with cross-pipeline isolation proven via per-pipeline buffer state * test(s3/lifecycle): rename canceled to canceledCtx in fanout test Per gemini review on #9412: a bare 'canceled' identifier reads like a bool. Rename to canceledCtx so the type is obvious at the call site.	2026-05-09 22:02:09 -07:00
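The four Event predicate cases listed in this commit can be sketched as below. This is a minimal illustration, not the reader package's code: the `entry` and `event` types are hypothetical stand-ins, and only the `IsCreate`/`IsDelete` names and the NewEntry/OldEntry classification come from the commit message.

```go
package main

import "fmt"

// entry and event are hypothetical stand-ins for the reader package's types.
type entry struct{ Name string }

type event struct {
	OldEntry *entry
	NewEntry *entry
}

// IsCreate is true only for a NewEntry-only event.
func (e event) IsCreate() bool { return e.NewEntry != nil && e.OldEntry == nil }

// IsDelete is true only for an OldEntry-only event.
func (e event) IsDelete() bool { return e.OldEntry != nil && e.NewEntry == nil }

func main() {
	x := &entry{Name: "obj"}
	cases := map[string]event{
		"create":     {NewEntry: x},
		"delete":     {OldEntry: x},
		"update":     {OldEntry: x, NewEntry: x},
		"degenerate": {},
	}
	for name, ev := range cases {
		fmt.Printf("%s: IsCreate=%v IsDelete=%v\n", name, ev.IsCreate(), ev.IsDelete())
	}
}
```

With strict exclusivity, an update (both entries set) and a degenerate event (neither set) satisfy neither predicate, so a caller has to route them explicitly.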
Chris Lu	284d37c3b6	test(s3/lifecycle): cover InMemoryPersister deep-copy contract (#9397 ) * test(s3/lifecycle): cover InMemoryPersister deep-copy contract 8 tests pin the persister contract other lifecycle tests rely on for cursor checkpointing: Load on an unknown shard returns an empty map (not an error); Save then Load roundtrips; Save copies the input so caller-side mutation doesn't bleed into stored state; Load returns a copy so caller-side mutation of the snapshot doesn't bleed back; Save replaces (not merges) prior state so stale resume points don't survive restart; different shards stay isolated; saving an empty map clears state; concurrent Save+Load is race-free under -race. A regression on any of these silently corrupts downstream tests. * test(s3/lifecycle): assert.NotContains for InMemoryPersister key absence assert.Empty on a map[K]V index returns true when the value is the zero value, which would mask a key that leaked through with int64(0). Use assert.NotContains so the assertion fails on key presence regardless of the stored value.	2026-05-09 19:47:16 -07:00
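The deep-copy contract described in this commit can be illustrated with a small in-memory persister. This is a sketch under assumptions: `memPersister`, its string keys, and the method signatures are hypothetical (the real cursor is keyed by ActionKey and the real API may return errors); only the copy-on-Save, copy-on-Load, replace-not-merge, and empty-map-for-unknown-shard behaviours are taken from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// memPersister is a hypothetical in-memory persister keyed by shard.
// String keys stand in for the real per-action cursor key.
type memPersister struct {
	mu    sync.Mutex
	state map[int]map[string]int64
}

func newMemPersister() *memPersister {
	return &memPersister{state: map[int]map[string]int64{}}
}

// cloneMap returns an independent copy; a nil input yields an empty map.
func cloneMap(m map[string]int64) map[string]int64 {
	out := make(map[string]int64, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

// Save replaces (does not merge) the shard's state with a copy of positions.
func (p *memPersister) Save(shard int, positions map[string]int64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.state[shard] = cloneMap(positions)
}

// Load returns a copy, so caller-side mutation cannot reach stored state.
func (p *memPersister) Load(shard int) map[string]int64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	return cloneMap(p.state[shard])
}

func main() {
	p := newMemPersister()
	in := map[string]int64{"rule-a": 100}
	p.Save(3, in)
	in["rule-a"] = 999 // mutating the caller's map after Save
	fmt.Println("stored:", p.Load(3)["rule-a"])
	fmt.Println("unknown shard entries:", len(p.Load(9)))
}
```

Copying on both sides costs an allocation per call, which is usually acceptable for a test double whose job is to make aliasing bugs visible.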
Chris Lu	255e9cd0f7	test(s3/lifecycle): cover reader cursor + Run validation contracts (#9389 ) * test(s3/lifecycle): cover reader cursor + Run validation contracts Layer 2 tests pinning four reader-package contracts the dispatcher pipeline depends on: MinTsNs anchors at frozen positions, Snapshot returns a deep copy in both directions, Restore replaces (not merges), and Run validates ShardID/Events/BucketsPath before subscribing. * test(s3/lifecycle): tighten cursor composition assertions Snapshot deep-copy: also assert cursor doesn't see keys added to the returned map. Restore replace: freeze before second Restore and assert IsFrozen returns false after, pinning the contract that Restore wipes frozen state alongside the value map. Run validation: bound the call with a 5s context timeout so a regression that lets Run reach the nil client surfaces as a failure instead of a hang.	2026-05-09 14:32:11 -07:00
Chris Lu	2f7ac1d664	feat(s3/lifecycle): NoncurrentVersionExpiration via bootstrap (Phase 5b/3) (#9383 ) * feat(s3/lifecycle): NoncurrentVersionExpiration via bootstrap (Phase 5b/3) Bootstrap now expands every <key>.versions/ directory into one event per version with sibling state pre-computed. The router fires NoncurrentDays / NewerNoncurrent off these events using SuccessorModTime as the noncurrent clock; previously these rules never ran on a versioned bucket because buildObjectInfo couldn't classify version-folder events without the latest pointer. Mechanics walkBucketDir treats a directory ending in .versions and carrying ExtLatestVersionIdKey as a SeaweedFS .versions container — emit it once and skip the recursion. Coincidentally-named directories without the latest pointer recurse normally. BucketBootstrapper.expandVersionsDir lists the children, sorts newest-first by mtime, resolves the latest position from the pointer, and injects a synthesized reader.Event per version with BootstrapVersion populated. NoncurrentIndex is 0-based among noncurrents in newest-first order; SuccessorModTime is the immediate newer sibling's mtime (zero for the latest). Pointer naming a missing or absent version falls back to the newest-by-mtime sibling so a race window can't flag every entry as noncurrent. routeBootstrapVersion uses BootstrapVersion to build ObjectInfo directly (bypassing the version-folder skip in buildObjectInfo) and runs the standard match loop. ABORT_MPU is excluded by kind-shape gate. The schedule clock uses SuccessorModTime for noncurrents and ModTime for the latest, so the dispatcher fires when the rule's days threshold is met. Match.ObjectKey is the LOGICAL key, Match.VersionID is the marker's stored version_id — the dispatcher reaches deleteSpecificObjectVersion or createDeleteMarker correctly. Layer 2 tests cover both sides. 
Router: latest fires ExpirationDays; noncurrent fires NoncurrentDays; NewerNoncurrentVersions retains the N newest noncurrents; ABORT_MPU never matches. Bootstrap: .versions dir emitted once and not recursed; missing latest pointer falls back to newest; backdated PUT (latest pointer is older by mtime) keeps the right noncurrent index; delete-marker flag propagates. * fix(s3/lifecycle): no VersionID for latest expirations, child-based dir disambig Two correctness gaps in Phase 5b/3. Bootstrap was pinning the version_id on every Match. For EXPIRATION_DAYS / EXPIRATION_DATE on the latest version this is unsafe: between schedule and dispatch a fresh PUT can land, the dispatcher would still identity-match against the original version's bytes (it still exists at that path) and the resulting delete marker would hide the new latest. Drop VersionID for those kinds; an empty VersionID makes the dispatcher fetch the current latest, where identity-CAS resolves to STALE_IDENTITY and bootstrap re-schedules with the new latest's identity. NoncurrentDays / NewerNoncurrent / EXPIRED_DELETE_MARKER still pin the version_id since those are version-targeted. isVersionsDir gating on ExtLatestVersionIdKey lost a race window: createDeleteMarker writes the version file before updating the parent's Extended pointer, so a walk between those two steps would see a .versions/ dir without the pointer, recurse into it, and emit raw version files that the router drops. Match the suffix only and let expandVersionsDir disambiguate by child inspection: if any child carries ExtVersionIdKey it's a real .versions container and we expand; otherwise it's a coincidentally-named user folder and we recurse via the bucket-walk's own callback so nested entries still flow through. Tests: latest-expiration assertion flipped to expect empty VersionID; new tests cover the coincidentally-named-folder recursion and the race-window expansion (children present, pointer absent). 
* fix(s3/lifecycle): filter directory + missing-version-id children at listing expandVersionsDir's listing callback collected every child with attributes; subdirectories or entries without ExtVersionIdKey would make it past the empty-id skip in the inner loop but still inflate NumVersions and skew NoncurrentIndex (the rank derives from the filtered slice's position, which was wrong when the unfiltered slice was sorted). Drop directories at listing time and partition the file children into a versions slice that's the actual rank source. Test cleanups: out-of-order-mtime test now sets v1 older than v2 so latestPos > 0 actually exercises the rank-skip branch in expandVersionsDir; bootstrapVersionEntry preserves nanosecond precision via MtimeNs to match markerLoneEntry's pattern; drop a leftover unused idx variable. * fix(s3/lifecycle): null version + canonical version-id tiebreak Two correctness gaps in Phase 5b/3 bootstrap. Null versions live at the bare logical path, not under .versions/. Bootstrap previously expanded only .versions/<key>/ children, so: - pre-versioning objects with newer .versions/ history never had their null version expired by NoncurrentDays - suspended-bucket writes (which clear the .versions/ latest pointer so null becomes current) had every .versions/ child wrongly classified as latest by the buildObjectInfo fallback expandVersionsDir now looks up the bare key via NewFullPath + LookupEntry, accepts a regular file or an explicit S3 directory-key marker (Mime set), and folds it into the sibling set with VersionID="null". Latest resolution: pointer present + names a real id wins; pointer absent + null exists makes null latest; otherwise falls back to newest sibling. 
The walker's regular emission for the bare entry would otherwise duplicate, so walkBucketDir now does a two-pass walk per directory level — .versions/ first, then everything else with a per-walk skipBare set keyed by bucket-relative path that expandVersionsDir populates when it claims a bare null sibling. Sort tiebreak: PUTs only set second-level Mtime, so two versions written in the same second tied. The unstable secondary order let old-format version filenames sort oldest-first and corrupt NoncurrentIndex under NewerNoncurrentVersions retention. Add CompareVersionIds to s3lifecycle/version_time.go (mirrors the canonical comparator in s3api/s3api_version_id.go to avoid the import cycle) and use it as a secondary key after mtime equality. Tests: pre-versioning null-as-noncurrent, suspended null-as-current, directory-key marker as null version, end-to-end claim through walkBucketDir's two-pass ordering, and same-second tiebreak via canonical version-id ordering. fakeFilerClient grows a LookupDirectoryEntry implementation backed by the same in-memory tree. * fix(s3/lifecycle): only treat explicit-null bare entries as current The pointer-missing branch in expandVersionsDir made null latest as soon as a bare object was found. That's correct for suspended-bucket writes (s3api_object_handlers_put.go writes the bare entry with ExtVersionIdKey="null") but wrong for the pre-versioning race window: a brand-new version under .versions/<file> exists before the parent's ExtLatestVersionIdKey update lands, and a pre-versioning bare object has no version-id marker. Marking that older bare object latest hides the real new version and skips noncurrent expiration of the null until the next process restart/bootstrap. Distinguish the two: lookupNullVersion now returns whether the bare entry's Extended map carries ExtVersionIdKey="null" (the suspended write marker). 
expandVersionsDir's pointer-missing branch only promotes null to latest when explicit; otherwise it falls back to newest-sibling, which is safe for the race window since the new version's mtime is fresher than the bare object's. The existing suspended-null test now uses a new helper that adds the explicit marker. New regression test covers the race window: bare entry without the marker + a fresh .versions/<v1> file + missing parent pointer must keep v1 as latest and the null as noncurrent. * fix(s3/lifecycle): only the newest item can be the explicit-null latest The pointer-missing branch in expandVersionsDir scanned every item for an explicit null and promoted it to latest. After a suspended->enabled transition that's the wrong call: createVersion writes the version file before updating ExtLatestVersionIdKey, so a bootstrap that lands in the race window sees an older bare null with ExtVersionIdKey="null" plus a newer .versions/<v-new> child and no parent pointer. Promoting the null misclassifies v-new as noncurrent and skips both the new version's current-version expiration and the null's noncurrent scheduling until the next bootstrap. Constrain the explicit-null branch to items[0]: if the suspended-null write is genuinely current it'll be the newest by mtime AND tagged. Anything else falls through to the newest-sibling default. Adds a regression test for the suspended->re-enabled race. * fix(s3/lifecycle): paginate bootstrap directory listings SeaweedList(..., limit=0) is a single-page request: the filer caps limit=0 at DirListingLimit (1000 by default) and returns whatever fits in one round trip. expandVersionsDir and walkBucketDir both relied on that, so any directory bigger than the cap silently truncated. 
For noncurrent retention this is correctness, not just scale — a hot key with more versions than the cap had its rank/sort math computed off the first page only, NumVersions, NoncurrentIndex, SuccessorModTime, and the latest-fallback all wrong, with the older versions never scheduled until a future bootstrap. Add a listAll helper that drives pagination via StartFromFileName + inclusive=false, looping until a page returns fewer entries than the configured page size. Use it in both call sites. Page size is a var (listPageSize, default 1024) so tests can shrink it without generating thousands of entries. The fake filer client now mirrors the real semantics: sort children by name, honor StartFromFileName/InclusiveStartFrom, cap at Limit. New regression tests force a small page size and assert the full result set is processed and the call count matches what pagination should drive. * perf(s3/lifecycle): stream bucket walk in two passes instead of buffering walkBucketDir was paginating into a children slice and then iterating twice (pass 1: .versions/, pass 2: everything else). For flat buckets with millions of entries the buffer is a real memory spike. Drop the materialization: each pass now drives its own listAll over the same directory and acts on entries as they stream in. The skipBare ordering contract is preserved — pass 2 still runs after pass 1 finishes — and the per-pass paging keeps memory bounded by listPageSize. Tradeoff: each directory level is listed twice. For workloads where that matters more than the memory headroom, we can revisit; the correctness/scale dial here is what the noncurrent rules need. Updated three tests for the new call count: each walk now records 2 listings per directory (pass 1 + pass 2). The KickOffNew dedup tests expect 2 calls per bucket; the pagination test expects 6 instead of 3.	2026-05-09 10:48:32 -07:00
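The pagination loop described in the listAll fix can be sketched as follows. This is an illustration under assumptions, not the SeaweedFS helper: `pageFunc` and `fakeDir` are hypothetical, and the listing is modelled as a function returning up to `limit` names strictly after `startFrom`, mirroring the StartFromFileName with inclusive=false semantics named in the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// pageFunc is a hypothetical single-page listing: it returns up to limit
// names that sort strictly after startFrom (exclusive start).
type pageFunc func(startFrom string, limit int) []string

// listAll drives pagination until a page comes back shorter than pageSize,
// visiting every entry as it streams in. It returns the number of list calls.
func listAll(list pageFunc, pageSize int, visit func(name string)) int {
	if pageSize <= 0 {
		pageSize = 1024 // default named in the commit message
	}
	calls := 0
	startFrom := ""
	for {
		page := list(startFrom, pageSize)
		calls++
		for _, name := range page {
			visit(name)
		}
		if len(page) < pageSize {
			return calls
		}
		startFrom = page[len(page)-1]
	}
}

// fakeDir builds a pageFunc over a sorted in-memory directory.
func fakeDir(names []string) pageFunc {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	return func(startFrom string, limit int) []string {
		i := sort.SearchStrings(sorted, startFrom)
		if i < len(sorted) && sorted[i] == startFrom {
			i++ // exclusive start: skip the resume marker itself
		}
		end := i + limit
		if end > len(sorted) {
			end = len(sorted)
		}
		return sorted[i:end]
	}
}

func main() {
	dir := fakeDir([]string{"v5", "v1", "v3", "v2", "v4"})
	var seen []string
	calls := listAll(dir, 2, func(n string) { seen = append(seen, n) })
	fmt.Println(seen, "calls:", calls)
}
```

Because termination relies on a short page, a directory whose size is an exact multiple of the page size costs one extra, empty listing call.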
Chris Lu	85abf3ca88	feat(shell): s3.lifecycle.run-shard + integration test (#9361 ) * feat(shell): s3.lifecycle.run-shard for manual Phase 3 dispatch Subscribes to the filer meta-log filtered to one (bucket, key-prefix-hash) shard, routes events through the compiled lifecycle engine, and dispatches due actions to the S3 server's LifecycleDelete RPC. Persists the per-shard cursor to /etc/s3/lifecycle/cursors/shard-NN.json so subsequent runs resume. Operator-runnable harness for end-to-end Phase 3 validation while the plugin-worker auto-scheduler is still pending. EventBudget bounds a single invocation; flags expose dispatch + checkpoint cadence. Discovers buckets by walking the configured DirBuckets path and reading each bucket entry's Extended[s3-bucket-lifecycle-configuration-xml] through lifecycle_xml.ParseCanonical. All compiled actions are seeded BootstrapComplete=true so the run dispatches whatever fires immediately; production bootstrap walks set this incrementally per bucket. * test(s3/lifecycle): integration test driving the run-shard shell command Spins up 'weed mini', creates a bucket with a 1-day expiration on a prefix, PUTs the target object, then rewrites the entry's Mtime via filer UpdateEntry to 30 days ago. Runs 's3.lifecycle.run-shard' for every shard via 'weed shell' subprocess and asserts the backdated object is deleted within 30s, and the in-prefix-but-recent object remains. The S3 API rejects Expiration.Days < 1, so 'wait a day' is unworkable. Backdating via the filer's gRPC sidesteps that constraint while still exercising the real Reader -> Router -> Schedule -> Dispatcher -> LifecycleDelete RPC path end-to-end. Wires a new s3-lifecycle-tests job into s3-go-tests.yml. The test runs all 16 shards because ShardID(bucket, key) is hash-based and the test shouldn't couple to that detail; running every shard keeps the test independent of the hash function. 
* fix(shell/s3.lifecycle.run-shard): address review findings - Reject negative -events explicitly. Help text already defines 0 as unbounded; negative budgets created ambiguous behavior in pipeline.Run. - Bound the gRPC dial with a 30s timeout instead of context.Background() so an unreachable S3 endpoint doesn't hang the shell. - Paginate the bucket listing in loadLifecycleCompileInputs. SeaweedList takes a single-RPC limit; the prior 4096 silently dropped buckets past that page on large clusters. Loop with startFrom until a page comes back short. - Surface parse errors instead of swallowing them. Buckets with malformed lifecycle XML now print the first three errors verbatim and a count for the rest, so an operator running this command for diagnostics can find what's wrong. * feat(shell/s3.lifecycle.run-shard): -shards range/set with one subscription Adds -shards "lo-hi" or "a,b,c" to the manual run command and threads the same model through Reader and Pipeline. - reader.Reader gains ShardPredicate (func(int) bool) and StartTsNs; ShardID stays for the single-shard short form. Event carries the computed ShardID so consumers can route per-shard without rehashing. - dispatcher.Pipeline gains Shards []int. When set, Run holds one Cursor + Schedule + Dispatcher per shard, opens one filer SubscribeMetadata stream with a predicate covering the whole set, and routes events into the matching shard's schedule from a single dispatch goroutine — no per-shard goroutine fan-out. - shell command parses -shard or -shards (mutually exclusive), formats progress messages with a contiguous-range label when applicable, and validates against ShardCount. Integration test now uses -shards 0-15 (one subprocess invocation) instead of a 16-iteration loop. 
* fix(s3/lifecycle): allow Reader with StartTsNs=0 + Cursor=nil The reader rejected the legitimate 'fresh subscription from epoch' state when called from a fresh Pipeline.Run on a multi-shard worker (no cursor file yet, all shards' MinTsNs=0). The downstream SubscribeMetadata call handles SinceNs=0 fine; the up-front check was over-defensive and broke the auto-scheduler completely (CI showed 5-second-cadence retries with this exact error). * fix(s3/lifecycle): schedule from ModTime not eventTime A backdated or out-of-band entry update has eventTime ≈ now while ModTime is far in the past; eventTime+Delay would push the dispatch into the future even though the rule already fires. ModTime+Delay is the correct fire moment. The dispatcher's identity-CAS still catches drift between schedule and dispatch. * fix(s3/lifecycle): -runtime cap on run-shard so it exits on quiet shards The CI integration test sets -events 200 expecting the subprocess to return after 200 in-shard events. But -events counts only events that pass the shard filter; the test produces ~5 such events (bucket create, lifecycle PUT, two object PUTs, mtime backdate), so the reader stays in stream.Recv forever and runShellCommand hangs the test deadline. - weed/shell/command_s3_lifecycle_run_shard.go: add -runtime D flag. When > 0, Pipeline.Run runs under context.WithTimeout(D); on expiry the reader/dispatcher drain cleanly and the cursor saves. - weed/s3api/s3lifecycle/dispatcher/pipeline.go: treat context.DeadlineExceeded the same as context.Canceled at exit (both are graceful shutdown signals). * test(s3/lifecycle): pass -runtime 10s to run-shard Pair with the new -runtime flag so the subprocess exits cleanly after 10s instead of waiting for an event budget that never lands on quiet shards. 
* refactor(s3/lifecycle): extract HashExtended to s3lifecycle pkg The worker's router needs the same length-prefixed sha256 of the entry's Extended map; pulling it out of the s3api private file lets both sides import it. * fix(s3/lifecycle): worker captures ExtendedHash for identity-CAS Without this, the dispatcher sends ExpectedIdentity.ExtendedHash = nil while the live entry on the server has a non-nil hash, so every dispatch returns NOOP_RESOLVED:STALE_IDENTITY and nothing is ever deleted. * fix(s3/lifecycle): identity HeadFid via GetFileIdString Meta-log events go through BeforeEntrySerialization, which clears FileChunk.FileId and writes the Fid struct instead. Reading .FileId directly returns "" on the worker side while the server's freshly fetched entry still has a populated string, so the identity-CAS would mismatch and every expiration ended in NOOP_RESOLVED:STALE_IDENTITY. * fix(s3/lifecycle): treat gRPC Canceled/DeadlineExceeded as graceful exit errors.Is doesn't unwrap a gRPC status error back to the stdlib ctx errors, so a subscription that ends because runCtx was canceled was being logged as a fatal reader error. Check status.Code as well so the shell's -runtime cap exits cleanly. * fix(test/s3/lifecycle): pass the gRPC port (not HTTP) to run-shard run-shard's -s3 flag dials the LifecycleDelete gRPC service, which listens on s3.port + 10000. The integration test was passing the HTTP port instead, so the dispatcher's RPC just timed out and the shell command exited under -runtime with no work done. * chore(test/s3/lifecycle): drop emoji from Makefile output * docs(test/s3/lifecycle): correct '-shards 0-15' wording * fix(s3/lifecycle): reject out-of-range shard IDs in Pipeline.Run The shell's parseShardsSpec already validates, but a programmatic caller (scheduler, future worker config) shouldn't be able to silently produce no-op states by passing -1 or 99. 
* fix(s3/lifecycle): bound drain + final-save with their own timeouts Shutdown was using context.Background, so a stuck dispatcher RPC or filer save could keep Pipeline.Run from ever returning. * fix(test/s3/lifecycle): drop self-killing pkill in stop-server The pkill pattern \"weed mini -dir=...\" is also in the running shell's argv (it's the recipe body), so pkill -f matches its own bash and the recipe exits with Terminated. CI test job passed but the cleanup step failed with exit 2. The PID file is sufficient on its own. * docs(test/s3/lifecycle): document S3_GRPC_ENDPOINT env var	2026-05-08 09:59:10 -07:00
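The -shards "lo-hi" or "a,b,c" syntax and its validation against ShardCount could be parsed roughly as below. This is a hypothetical sketch: `parseShards` is not the shell's parseShardsSpec, and details such as whitespace trimming, de-duplication, and sorted output are assumptions rather than documented behaviour.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// shardCount mirrors the ShardCount=16 stated in the commit log.
const shardCount = 16

// parseShards accepts either a "lo-hi" range or an "a,b,c" set and returns
// the sorted, de-duplicated shard IDs, rejecting anything outside [0, shardCount).
func parseShards(spec string) ([]int, error) {
	spec = strings.TrimSpace(spec)
	if spec == "" {
		return nil, fmt.Errorf("empty shard spec")
	}
	seen := map[int]bool{}
	add := func(n int) error {
		if n < 0 || n >= shardCount {
			return fmt.Errorf("shard %d out of range [0,%d)", n, shardCount)
		}
		seen[n] = true
		return nil
	}
	if lo, hi, isRange := strings.Cut(spec, "-"); isRange {
		l, err := strconv.Atoi(strings.TrimSpace(lo))
		if err != nil {
			return nil, fmt.Errorf("bad range start %q: %w", lo, err)
		}
		h, err := strconv.Atoi(strings.TrimSpace(hi))
		if err != nil {
			return nil, fmt.Errorf("bad range end %q: %w", hi, err)
		}
		if l > h {
			return nil, fmt.Errorf("range start %d exceeds end %d", l, h)
		}
		for n := l; n <= h; n++ {
			if err := add(n); err != nil {
				return nil, err
			}
		}
	} else {
		for _, part := range strings.Split(spec, ",") {
			n, err := strconv.Atoi(strings.TrimSpace(part))
			if err != nil {
				return nil, fmt.Errorf("bad shard %q: %w", part, err)
			}
			if err := add(n); err != nil {
				return nil, err
			}
		}
	}
	out := make([]int, 0, len(seen))
	for n := range seen {
		out = append(out, n)
	}
	sort.Ints(out)
	return out, nil
}

func main() {
	for _, spec := range []string{"0-15", "3,1,2", "99", "-1"} {
		ids, err := parseShards(spec)
		fmt.Printf("%q -> %v (err: %v)\n", spec, ids, err)
	}
}
```

Rejecting out-of-range IDs at parse time matches the later fix in this commit, which adds the same check inside Pipeline.Run for programmatic callers.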
Chris Lu	3a192c6c57	fix(s3/lifecycle): address Phase 3 post-merge review (#9354 #9355 #9356) (#9357) * fix(s3/lifecycle): reader handles bare /buckets parent and pre-normalizes prefix extractBucketKey accepted /buckets/ but rejected /buckets (no trailing slash); some delete events emit the bare form, so bucket-root events were silently dropped. Pre-normalize BucketsPath once on Run instead of recomputing per event. * perf(s3/lifecycle): pool sha256 hashers in ShardID ShardID runs on every meta-log event before the shard filter; a fresh sha256.New per call produces measurable allocator pressure under load. sync.Pool reuses hashers across calls. * fix(s3/lifecycle): router skips hard deletes and missing-attribute events A hard delete carries no schedule-relevant state — Expiration would hit NOOP_RESOLVED at dispatch and ExpiredObjectDeleteMarker fires from a Create on the latest version. Skip rather than burn a schedule slot. Missing Attributes leaves ModTime at year 0001, which makes ExpirationDays fire immediately at dispatch. Skip the event instead. Drop the unused 'versioned' parameter from buildObjectInfo; the dispatcher's identity-CAS handles version drift in Phase 5. * fix(s3/lifecycle): EntryIdentity.MtimeNs holds true nanoseconds Both computeEntryIdentity (server) and buildIdentity (router) wrote entry.Attributes.Mtime (seconds) into a field named MtimeNs. The CAS worked because both sides agreed, but the encoding contradicted the field name and would break if either side later started using true nanoseconds. Combine Mtime*1e9 + the FuseAttributes.MtimeNs nanosecond component on both sides; the test was updated to match. * fix(s3/lifecycle): dispatcher distinguishes ctx cancel from transport errors A canceled or deadline-exceeded RPC is shutdown, not a transport failure: re-queue the Match at its original DueTime with no retry-budget burn so a quick restart can't escalate it to BLOCKED. 
* fix(s3/lifecycle): reader fallback prefix normalization mirrors Run The fallback path that builds prefix from r.BucketsPath when bucketsPathSlash is empty (test-only entry into extractBucketKey) was appending an unconditional '/', producing '//' if BucketsPath already ended with one. Use the same normalization Run does. * fix(s3/lifecycle): ObjectInfo.ModTime carries the nanosecond component ModTime dropped FuseAttributes.MtimeNs, leaving ExpirationDays one nanosecond off relative to EntryIdentity.MtimeNs. Pass both to time.Unix so the precision matches the CAS witness.	2026-05-07 16:54:24 -07:00
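The two nanosecond fixes in this commit amount to the arithmetic below: the identity value is whole seconds scaled to nanoseconds plus the sub-second component, and ModTime is built from the same two fields so both carry the same precision. The `attrs` struct and function names are hypothetical stand-ins; only the Mtime/MtimeNs fields and the use of time.Unix come from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// attrs is a hypothetical stand-in for the entry attributes:
// Mtime holds whole seconds, MtimeNs the sub-second nanosecond component.
type attrs struct {
	Mtime   int64
	MtimeNs int32
}

// identityMtimeNs combines both fields into true nanoseconds since the epoch.
func identityMtimeNs(a attrs) int64 {
	return a.Mtime*1e9 + int64(a.MtimeNs)
}

// modTime builds the time.Time used as the rule clock, passing both the
// seconds and the nanosecond component to time.Unix.
func modTime(a attrs) time.Time {
	return time.Unix(a.Mtime, int64(a.MtimeNs))
}

func main() {
	a := attrs{Mtime: 1778000000, MtimeNs: 123}
	fmt.Println("identity ns:", identityMtimeNs(a))
	fmt.Println("ModTime ns: ", modTime(a).UnixNano())
	fmt.Println("agree:", identityMtimeNs(a) == modTime(a).UnixNano())
}
```

Deriving both values from the same pair of fields keeps the schedule clock and the CAS witness from drifting apart by the dropped sub-second component.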
Chris Lu	0f6c6b0524	feat(s3/lifecycle): shard-aware meta-log reader (Phase 3 PR-B) (#9354) feat(s3/lifecycle): shard-aware meta-log reader - ShardCount=16; ShardID(bucket,key)=top-4-bits of sha256(bucket||/||key) - Reader subscribes via SubscribeMetadata starting at Cursor.MinTsNs(), filters events by shard, emits to caller-owned Events channel - Cursor: per-(shard, ActionKey) position with monotonic Advance, Freeze for blocked actions, MinTsNs for subscription resume - Persister interface with InMemoryPersister for tests; filer-backed impl lands with the worker integration	2026-05-07 15:42:37 -07:00
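The shard mapping in this commit can be sketched as below. This is an illustrative reimplementation, not the repository's ShardID: it assumes "top-4-bits" means the high nibble of the first SHA-256 digest byte and that the hashed input is the bucket, a literal "/", and the key concatenated in that order.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// ShardCount matches the value named in the commit message.
const ShardCount = 16

// shardID is a hypothetical sketch: hash bucket + "/" + key with SHA-256
// and keep the top 4 bits of the first digest byte (assumed interpretation).
func shardID(bucket, key string) int {
	h := sha256.New()
	h.Write([]byte(bucket))
	h.Write([]byte("/"))
	h.Write([]byte(key))
	sum := h.Sum(nil)
	return int(sum[0] >> 4) // high nibble: a value in [0, 16)
}

func main() {
	for _, key := range []string{"a.txt", "logs/2026/05/07.gz", "img/cat.png"} {
		fmt.Printf("%-20s -> shard %d\n", key, shardID("photos", key))
	}
}
```

Four bits give exactly 16 values, so every result is already a valid shard index and no modulo step is needed.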

8 Commits