mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-05-24 02:31:28 +00:00
Admin computes a per-worker share of cluster_deletes_per_second at
ExecuteJob time and ships it to the worker via
ClusterContext.Metadata. The worker reads the share, constructs a
golang.org/x/time/rate.Limiter, and passes it to dailyrun.Run via
cfg.Limiter (Phase 2 already plumbed the field). Phase 5 deletes the
streaming path; until then streaming ignores the cap.
Why allocate at admin: the cluster cap is a single knob operators
care about. Dividing it locally per worker would either need
out-of-band coordination or accept N× the configured budget. Admin
is the only party that knows how many execute-capable workers there
are, so it owns the math.
Admin side (weed/admin/plugin):
- Registry.CountCapableExecutors(jobType) returns the number of
non-stale workers with CanExecute=true.
- New file cluster_rate_limit.go: decorateClusterContextForJob clones
the input ClusterContext and injects two metadata keys for
s3_lifecycle. cloneClusterContext duplicates Metadata so per-job
decoration doesn't race shared base state.
- executeJobWithExecutor calls the decorator after loading the admin
config; other job types pass through unchanged.
Worker side (weed/worker/tasks/s3_lifecycle):
- New cluster_rate_limit.go declares the constants both sides agree
on (admin-config field names, metadata keys). Plain strings on the
admin side keep weed/admin/plugin free of a dependency on the
s3_lifecycle worker package; the two sets of constants are pinned
to identical values and a mismatch would silently disable rate
limiting.
- handler.go executeDailyReplay reads ClusterContext.Metadata,
builds a rate.Limiter, and passes it into dailyrun.Config{Limiter}.
Missing/empty/non-positive values → no limiter (legacy unlimited
behavior). burst defaults to 2 × rate, clamped to ≥1 to avoid a
bucket that never refills.
- Admin form gains two fields under "Scope": cluster_deletes_per_second
(rate, 0 = unlimited) and cluster_deletes_burst (0 = 2 × rate).
Metric:
- New S3LifecycleDispatchLimiterWaitSeconds histogram observes how
long each Limiter.Wait blocks before a LifecycleDelete RPC.
Operators tune the cap by reading p95 — near-zero means the cap
isn't binding, a long tail at 1/rate means it is.
Tests:
- weed/admin/plugin/cluster_rate_limit_test.go: 9 cases covering
pass-through for non-allocator job types, rps=0 / no-executors
skip, even sharing, burst sharing, burst=0 omit (worker default
kicks in), burst floor of 1, no mutation of input metadata, nil
input.
- weed/worker/tasks/s3_lifecycle/cluster_rate_limit_test.go: 7 cases
covering nil/empty/missing metadata, non-positive/invalid rate,
positive rate builds correctly, burst missing defaults to 2× rate,
tiny rate clamps burst to ≥1.
Build clean. Phase 2 (#9446) and Phase 4 engine (#9447) are the
parents; this branch stacks on Phase 2 since it consumes
dailyrun.Config{Limiter} which lands there.
see https://blog.aqwari.net/xml-schema-go/ 1. go get aqwari.net/xml/cmd/xsdgen 2. Add EncodingType element for ListBucketResult in AmazonS3.xsd 3. xsdgen -o s3api_xsd_generated.go -pkg s3api AmazonS3.xsd 4. Remove empty Grantee struct in s3api_xsd_generated.go 5. Remove xmlns: sed s'/http:\/\/s3.amazonaws.com\/doc\/2006-03-01\/\ //' s3api_xsd_generated.go