Files
seaweedfs/weed/command
Chris Lu c2f5db3a02 perf(filer.sync): don't serialize descendants behind dir attribute updates (#9079)
* perf(filer.sync): don't serialize descendants behind dir attribute updates

The MetadataProcessor treated every in-flight directory job as a subtree
barrier: any active dir job at /foo forced all file events under /foo to
wait, and because the admit loop runs on the single stream.Recv()
goroutine, a stalled descendant also stalled the whole gRPC stream. For
large directories this turned every attribute-only dir event (mtime /
xattr / chmod bumps) into a full-subtree pinch point.

Classify dir jobs as barrier (create / delete / rename) vs non-barrier
(filer_pb.IsUpdate on a directory — same parent and same name, i.e. an
in-place attribute update). Only barrier dirs block descendants and get
blocked by ancestor barrier dirs. Non-barrier dir updates still bump the
ancestor descendantCount, so an incoming barrier dir on an ancestor
still waits for them — preserving the "delete /a waits for in-flight
/a/b update" safety.

Tests cover the loosened cases and the preserved barriers:
non-barrier update doesn't block a file descendant, barrier create
still does, barrier delete still waits for in-flight descendants, and
a barrier ancestor still waits for a non-barrier descendant update.

* fix(filer.sync): serialize same-path barrier dir jobs against concurrent ops

Review (Gemini) flagged that pathConflicts had latent same-path gaps
that predated this PR but deserve fixing alongside the dir-conflict
loosening: two barrier dir jobs at the same path could run concurrently
(e.g. create /a and delete /a), and a file job at the same path as an
in-flight barrier dir wasn't blocked either.

Tighten pathConflicts so that:
- an active barrier dir at p blocks every incoming job at p (file,
  barrier dir, or non-barrier attribute update) — same-path promotions,
  renames, and delete/create collisions must serialize;
- an active file at p blocks incoming files and barrier dirs at p;
- non-barrier dir updates at the same path still overlap with each
  other (attribute bumps are last-writer-wins, intentional).

TestDirVsDirConflict and TestFileUnderActiveDirConflict flip their
"same path does not conflict" assertions to match. New
TestSamePathBarrierSerialization covers all five same-path cases
explicitly.

* fix(filer.sync): serialize incoming barrier dir against same-path non-barrier update

Bug introduced by the previous same-path tightening commit and caught
in review (CodeRabbit, critical): a kindNonBarrierDir at /dir1 was not
indexed at its own path, so a later kindBarrierDir at /dir1 saw neither
activeBarrierDirPaths["/dir1"] nor descendantCount["/dir1"] (the latter
only counts strict descendants) and was admitted concurrently with the
in-flight attribute update. That violated the "barrier at p serializes
all work at p" rule.

Track non-barrier dir jobs in a new activeNonBarrierDirPaths map and
check it only from the incoming-barrier-dir branch of pathConflicts.
The map is deliberately invisible to the ancestor check, so non-barrier
updates still don't serialize file descendants — the loosening this PR
is about stays intact.

Regression test added in TestSamePathBarrierSerialization covers both
the admission conflict and the index cleanup on job completion.
2026-04-14 18:34:05 -07:00
..
2026-02-20 18:42:00 -08:00
2026-02-20 18:42:00 -08:00
2026-02-20 18:42:00 -08:00
2026-04-10 17:31:14 -07:00
2025-10-13 18:05:17 -07:00
2026-02-20 18:42:00 -08:00
2025-12-14 16:02:06 -08:00
2026-03-15 09:44:14 -07:00
2026-02-20 18:42:00 -08:00
2026-02-20 18:42:00 -08:00
2025-10-13 18:05:17 -07:00
2019-11-28 18:44:27 -08:00
2025-07-02 18:03:17 -07:00
2026-04-10 17:31:14 -07:00
2026-02-20 18:42:00 -08:00