Mirror of https://github.com/seaweedfs/seaweedfs.git (synced 2026-05-17 07:11:30 +00:00)
Branch master, 4 commits
Commit d221a64262
fix(ec): skip re-encode when EC shards already exist for the volume (#9448) (#9458)
* fix(ec): skip re-encode when EC shards already exist for the volume (#9448)

When an earlier EC encoding succeeded but the post-encode source-delete left a regular replica behind on one of the servers, the next detection cycle proposes the same volume again. The new encode tries to redistribute shards to targets that already have them mounted, the volume server returns `ec volume %d is mounted; refusing overwrite`, the task fails, and detection re-queues the volume. The cycle repeats forever — issue #9448.

The existing `metric.IsECVolume` skip catches the case where the canonical metric is reported on the EC-shard side of the heartbeat, but when the master sees BOTH a regular replica AND its EC shards in the same volume list, the canonical metric we pick is the regular replica and IsECVolume is false.

Add a second guard that checks the topology directly via `findExistingECShards` (already present and indexed) and skip the volume when any shards exist, logging a warning that points the admin at the stuck source. This breaks the loop. Auto-cleanup of the orphaned replica is left as follow-up work — deleting a source replica from inside the detector is only safe with a re-verification step right before the delete, plus a config opt-in, and is best done in its own change.

* fix(ec): #9448 guard only fires when EC shard set is complete

The first version of the #9448 guard tripped on `len(existingShards) > 0`, which is broader than necessary. The existing recovery branch in the encode arm (around the `existingECShards` block, ~line 216) is designed to fold partial leftover shards from a previously failed encode into the new task as cleanup sources. Skipping unconditionally on any existing shards made that branch dead code, regressing the recovery behavior Gemini flagged in the review of
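For illustration, a minimal sketch of a "skip only when the shard set is complete" guard as described above; the names (shouldSkipEncode, totalShardCount) are hypothetical stand-ins, not the actual symbols around findExistingECShards:

```go
// Hypothetical sketch (not the actual seaweedfs code): skip re-encoding only
// when a complete EC shard set already exists for the volume; partial
// leftovers fall through so the recovery branch can reuse them as cleanup sources.
package main

import "fmt"

const totalShardCount = 14 // 10 data + 4 parity in the default layout

func shouldSkipEncode(existingShardIDs []int) bool {
	seen := make(map[int]bool, len(existingShardIDs))
	for _, id := range existingShardIDs {
		seen[id] = true
	}
	for id := 0; id < totalShardCount; id++ {
		if !seen[id] {
			return false // incomplete set: let the recovery branch handle it
		}
	}
	return true // complete set: re-encoding would only trip "refusing overwrite"
}

func main() {
	fmt.Println(shouldSkipEncode([]int{0, 1, 2}))                                      // false
	fmt.Println(shouldSkipEncode([]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13})) // true
}
```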
Commit 532b088262
fix(ec): preserve source disk type across EC encoding (#9423) (#9449)
* fix(ec): carry source disk type on VolumeEcShardsMount (#9423)

When EC shards land on a target whose disk type differs from the source volume's, master heartbeats wrongly reported under the target disk's type. Add source_disk_type to VolumeEcShardsMountRequest; the target server applies it to the in-memory EcVolume via SetDiskType so the mount notification and steady-state heartbeat both carry the source's disk type. Empty value falls back to the location's disk type (used by disk-scan reload paths). The override is not persisted with the volume — disk type stays an environmental property and .vif remains portable.

* fix(ec): plumb source disk type through plugin worker (#9423)

Add source_disk_type to ErasureCodingTaskParams (field 8; 7 reserved), populate it from the metric the detector already collects, thread it through ec_task into the MountEcShards helper, and forward it on the VolumeEcShardsMount RPC.

* fix(ec): mirror source disk type plumbing in rust volume server (#9423)

The volume_ec_shards_mount handler now forwards source_disk_type into mount_ec_shard → DiskLocation::mount_ec_shards. When non-empty it overrides ec_vol.disk_type (and each mounted shard's disk_type) via the new set_disk_type method; empty value keeps the location's disk type, so disk-scan reload and reconcile paths are unchanged. Also picks up two pre-existing proto drifts that 'make gen' synced from weed/pb (LockRingUpdate in master.proto, listing_cache_ttl_seconds in remote.proto).

* feat(ec): bias placement toward preferred disk type (#9423)

Add DiskCandidate.DiskType and PlacementRequest.PreferredDiskType. When PreferredDiskType is non-empty, SelectDestinations partitions suitable disks into matching/fallback tiers and runs the rack/server/disk-diversity passes on the matching tier first; the fallback tier is only consulted if the matching pool can't satisfy ShardsNeeded. PlacementResult.SpilledToOtherDiskType lets callers warn on spillover. Empty PreferredDiskType keeps the existing single-pool behavior.

* fix(ec): plumb source disk type into placement planner (#9423)

diskInfosToCandidates now copies DiskInfo.DiskType into the placement candidate, and ecPlacementPlanner.selectDestinations forwards metric.DiskType as PreferredDiskType so EC shards land on disks matching the source volume's disk type when possible. A glog warning fires when placement had to spill to other disk types.

* test(ec): integration coverage for source-disk-type plumbing (#9423)

store_ec_disk_type_test exercises Store.MountEcShards end-to-end: a shard physically lives on an HDD location, MountEcShards is called with sourceDiskType="ssd", and the test asserts that the in-memory EcVolume, the mounted shard, the NewEcShardsChan notification, and the steady-state heartbeat all report under the source's disk type. A companion test pins the empty-source path so disk-scan reload keeps the location's disk type. detection_disk_type_test exercises the worker plumbing: with a cluster of nodes carrying both HDD and SSD disks, planECDestinations must place every shard on SSD when metric.DiskType="ssd"; with only one SSD node and 13 HDD nodes it must still satisfy a 10+4 layout via spillover (and log a warning).

* revert(ec): drop unrelated proto drift in seaweed-volume/proto (#9423)

make gen pulled two pre-existing OSS changes into the rust proto tree (LockRingUpdate / by_plugin in master.proto, listing_cache_ttl_seconds in remote.proto). Reviewers flagged it as scope creep — none of the rust EC fix references those fields. Restore both files to origin/master so this branch only touches EC-related symbols.

* fix(ec placement): treat empty disk type as hdd and skip used racks on spill (#9423)

partitionByDiskType used raw string comparison, so a PreferredDiskType of "hdd" never matched candidates whose DiskType is "" (the HardDriveType sentinel that weed/storage/types uses). EC encoding of an HDD source would spill onto any HDD reporting "" even when the cluster has plenty of matching capacity. Normalize both sides through normalizeDiskType, which lowercases and folds "" → "hdd", mirroring types.ToDiskType without taking a dependency on it. selectFromTier's rack-diversity pass also kept revisiting racks the preferred tier had already used when running on the fallback tier, which negated PreferDifferentRacks on spillover. Skip racks already in usedRacks so fallback placements still spread onto new racks.

* fix(ec): empty-source remount must not clobber existing disk type (#9423)

mount_ec_shards_with_idx_dir runs more than once per vid (RPC mount, disk-scan reload, orphan-shard reconcile). After an RPC sets the source-derived disk type, any later call passing source_disk_type="" was resetting ec_vol.disk_type back to the location's value, which reintroduces the heartbeat drift this PR is meant to fix. Only default to the location's disk type when the EC volume is fresh (no shards mounted yet); otherwise leave the recorded type alone so empty-source reloads preserve whatever the original mount RPC set.
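A compact sketch of the normalization and two-tier split described above, assuming a hypothetical DiskCandidate shape; the real planner also runs the rack/server/disk-diversity passes per tier, which are omitted here:

```go
// Illustrative sketch only: normalizeDiskType and the matching/fallback split.
package main

import (
	"fmt"
	"strings"
)

type DiskCandidate struct {
	Node     string
	DiskType string // "" is the HardDriveType sentinel
}

// normalizeDiskType lowercases and folds "" to "hdd", mirroring
// types.ToDiskType without importing weed/storage/types.
func normalizeDiskType(t string) string {
	t = strings.ToLower(strings.TrimSpace(t))
	if t == "" {
		return "hdd"
	}
	return t
}

// partitionByDiskType splits candidates into the tier matching the preferred
// disk type and a fallback tier that is only consulted when the matching
// tier cannot satisfy the shard count.
func partitionByDiskType(candidates []DiskCandidate, preferred string) (matching, fallback []DiskCandidate) {
	want := normalizeDiskType(preferred)
	for _, c := range candidates {
		if normalizeDiskType(c.DiskType) == want {
			matching = append(matching, c)
		} else {
			fallback = append(fallback, c)
		}
	}
	return matching, fallback
}

func main() {
	disks := []DiskCandidate{{"n1", ""}, {"n2", "hdd"}, {"n3", "ssd"}}
	m, f := partitionByDiskType(disks, "hdd")
	fmt.Println(len(m), len(f)) // 2 1: the "" disk now counts as hdd
}
```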
Commit 5d43f84df7
refactor(plugin): rename detection_interval_seconds → detection_interval_minutes (#9366)
Minutes is the natural granularity for detection cadence — every production handler already set the seconds field to a 60-multiple (17*60, 30*60, 3600, 24*60*60). Switching to minutes drops the *60 arithmetic and matches the unit conventions used elsewhere in the plugin worker forms.

- Proto: AdminRuntimeDefaults + AdminRuntimeConfig.detection_interval_* field renamed.
- Helpers: durationFromMinutes / minutesFromDuration alongside the existing seconds variants in plugin_scheduler.go.
- Handlers: vacuum, ec_balance, balance, erasure_coding, iceberg, admin_script, s3_lifecycle now declare DetectionIntervalMinutes.
- Admin: scheduler_status + types + UI templ + plugin_api.go pass through the new field; UI label and table cells switch to "min".
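A minimal sketch of the minutes-based helpers: the commit names durationFromMinutes / minutesFromDuration in plugin_scheduler.go, but the exact signatures here are assumptions:

```go
// Sketch only; the real helpers sit alongside the existing seconds variants.
package main

import (
	"fmt"
	"time"
)

func durationFromMinutes(minutes int32) time.Duration {
	return time.Duration(minutes) * time.Minute
}

func minutesFromDuration(d time.Duration) int32 {
	return int32(d / time.Minute)
}

func main() {
	// A handler that previously set DetectionIntervalSeconds: 30 * 60
	// would now declare DetectionIntervalMinutes: 30.
	fmt.Println(durationFromMinutes(30))             // 30m0s
	fmt.Println(minutesFromDuration(24 * time.Hour)) // 1440
}
```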
Commit 1f6f473995
refactor(worker): co-locate plugin handlers with their task packages (#9301)
* refactor(worker): co-locate plugin handlers with their task packages
Move every per-task plugin handler from weed/plugin/worker/ into the
matching weed/worker/tasks/<name>/ package, so each task owns its
detection, scheduling, execution, and plugin handler in one place.
Step 0 (within pluginworker, no behavior change): extract shared helpers
that previously lived inside individual handler files into dedicated
files and export the ones now consumed across packages.
- activity.go: BuildExecutorActivity, BuildDetectorActivity
- config.go: ReadStringConfig/Double/Int64/Bytes/StringList, MapTaskPriority
- interval.go: ShouldSkipDetectionByInterval
- volume_state.go: VolumeState + consts, FilterMetricsByVolumeState/Location
- collection_filter.go: CollectionFilterMode + consts
- volume_metrics.go: export CollectVolumeMetricsFromMasters,
MasterAddressCandidates, FetchVolumeList
- testing_senders_test.go: shared test stubs
Phase 1: move the per-task plugin handlers (and the iceberg subpackage)
into their task packages.
weed/plugin/worker/vacuum_handler.go -> weed/worker/tasks/vacuum/plugin_handler.go
weed/plugin/worker/ec_balance_handler.go -> weed/worker/tasks/ec_balance/plugin_handler.go
weed/plugin/worker/erasure_coding_handler.go -> weed/worker/tasks/erasure_coding/plugin_handler.go
weed/plugin/worker/volume_balance_handler.go -> weed/worker/tasks/balance/plugin_handler.go
weed/plugin/worker/iceberg/ -> weed/worker/tasks/iceberg/
weed/plugin/worker/handlers/handlers.go now blank-imports all five
task subpackages so their init() registrations fire (as sketched below).
weed/command/mini.go and the worker tests construct the handler with
vacuum.DefaultMaxExecutionConcurrency (the constant moved with the
vacuum handler).
admin_script remains in weed/plugin/worker/ because there is no
underlying weed/worker/tasks/admin_script/ package to merge with.
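The blank-import wiring mentioned above is the standard Go pattern of importing packages solely for their init() side effects. A plausible shape of handlers.go after the move, with import paths following the mappings listed above (a sketch, not the verified file contents):

```go
// Each task package registers itself in init(); the blank imports exist only
// to make those registrations run.
package handlers

import (
	_ "github.com/seaweedfs/seaweedfs/weed/worker/tasks/balance"
	_ "github.com/seaweedfs/seaweedfs/weed/worker/tasks/ec_balance"
	_ "github.com/seaweedfs/seaweedfs/weed/worker/tasks/erasure_coding"
	_ "github.com/seaweedfs/seaweedfs/weed/worker/tasks/iceberg"
	_ "github.com/seaweedfs/seaweedfs/weed/worker/tasks/vacuum"
)
```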
* refactor(worker): update test/plugin_workers imports for moved handlers
Three handler constructors moved out of pluginworker into their task
packages — update the integration test files in test/plugin_workers/
to import from the new locations:
pluginworker.NewVacuumHandler -> vacuum.NewVacuumHandler
pluginworker.NewVolumeBalanceHandler -> balance.NewVolumeBalanceHandler
pluginworker.NewErasureCodingHandler -> erasure_coding.NewErasureCodingHandler
The pluginworker import is kept where the file still uses
pluginworker.WorkerOptions / pluginworker.JobHandler.
* refactor(worker): update test/s3tables iceberg import path
The iceberg subpackage moved from weed/plugin/worker/iceberg/ to
weed/worker/tasks/iceberg/. test/s3tables/maintenance/maintenance_integration_test.go
still imported the old path, breaking S3 Tables / RisingWave / Trino /
Spark / Iceberg-catalog / STS integration test builds.
Mirrors the OSS-side fix needed by every job in the run that
transitively imports test/s3tables/maintenance.
* chore: gofmt PR-touched files
The S3 Tables Format Check job runs `gofmt -l` over weed/s3api/s3tables
and test/s3tables, then fails if anything is unformatted. Files this
PR moved or modified had import-grouping and trailing-spacing issues
introduced by perl-based renames; reformat them with gofmt -w.
Touched files:
test/plugin_workers/erasure_coding/{detection,execution}_test.go
test/s3tables/maintenance/maintenance_integration_test.go
weed/plugin/worker/handlers/handlers.go
weed/worker/tasks/{balance,ec_balance,erasure_coding,vacuum}/plugin_handler*.go
* refactor(worker): bounds-checked int conversions for plugin config values
CodeQL flagged 18 go/incorrect-integer-conversion warnings on the moved
plugin handler files: results of pluginworker.ReadInt64Config (which
ultimately calls strconv.ParseInt with bit size 64) were being narrowed
to int32/uint32/int without an upper-bound check, so a malicious or
malformed admin/worker config value could overflow the target type.
Add three helpers in weed/plugin/worker/config.go that wrap
ReadInt64Config and clamp out-of-range values back to the caller's
fallback (a sketch follows this sub-commit):
ReadInt32Config (math.MinInt32 .. math.MaxInt32)
ReadUint32Config (0 .. math.MaxUint32)
ReadIntConfig (math.MinInt32 .. math.MaxInt32, platform-portable)
Update each flagged call site in the four moved task packages to use
the bounds-checked helper. For protobuf uint32 fields (volume IDs)
the variable type also becomes uint32, removing the trailing
uint32(volumeID) casts and changing the "missing volume_id" check
from `<= 0` to `== 0`.
Touched files:
weed/plugin/worker/config.go
weed/worker/tasks/balance/plugin_handler.go
weed/worker/tasks/erasure_coding/plugin_handler.go
weed/worker/tasks/vacuum/plugin_handler.go
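A hedged sketch of the bounds-checked wrappers: readInt64Config stands in for the existing pluginworker.ReadInt64Config (which ultimately parses with strconv.ParseInt, bit size 64); the map-based config and lower-cased names are assumptions for illustration.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// readInt64Config: stand-in for the existing 64-bit reader.
func readInt64Config(cfg map[string]string, key string, fallback int64) int64 {
	raw, ok := cfg[key]
	if !ok {
		return fallback
	}
	v, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return fallback
	}
	return v
}

// readInt32Config clamps out-of-range values back to the caller's fallback.
func readInt32Config(cfg map[string]string, key string, fallback int32) int32 {
	v := readInt64Config(cfg, key, int64(fallback))
	if v < math.MinInt32 || v > math.MaxInt32 {
		return fallback
	}
	return int32(v)
}

// readUint32Config suits protobuf uint32 fields such as volume IDs.
func readUint32Config(cfg map[string]string, key string, fallback uint32) uint32 {
	v := readInt64Config(cfg, key, int64(fallback))
	if v < 0 || v > math.MaxUint32 {
		return fallback
	}
	return uint32(v)
}

// readIntConfig stays within int32 bounds so results are portable to 32-bit
// platforms even where Go's int is 64-bit.
func readIntConfig(cfg map[string]string, key string, fallback int) int {
	v := readInt64Config(cfg, key, int64(fallback))
	if v < math.MinInt32 || v > math.MaxInt32 {
		return fallback
	}
	return int(v)
}

func main() {
	cfg := map[string]string{"volume_id": "4294967296"} // one past MaxUint32
	fmt.Println(readUint32Config(cfg, "volume_id", 0))  // 0: out of range, fallback
	fmt.Println(readIntConfig(cfg, "batch_size", 25))   // 25: key missing, fallback
}
```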
* refactor(worker): use ReadIntConfig for clamped derive-worker-config helpers
CodeQL still flagged three call sites where ReadInt64Config was being
narrowed to int after a value-range clamp (max_concurrent_moves <= 50,
batch_size <= 100, min_server_count >= 2). The clamp is correct but
CodeQL's flow analysis didn't recognize the bound, so it flagged them
as unbounded narrowing.
Switch to ReadIntConfig (already int32-bounded by the helper) for
those three sites, drop the now-redundant int64 intermediate variables.
Also drops the now-unused `> math.MaxInt32` clamp in
ec_balance.deriveECBalanceWorkerConfig (the helper covers it).
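For illustration, a call site in the shape this sub-commit describes: the value-range clamp stays, while the int64 intermediate and the explicit > math.MaxInt32 check are no longer needed because the helper is already int32-bounded. The key name and fallback are assumptions patterned on the commit text.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// readIntConfig repeats the earlier sketch: parse as int64, fall back when
// the value does not fit int32.
func readIntConfig(cfg map[string]string, key string, fallback int) int {
	raw, ok := cfg[key]
	if !ok {
		return fallback
	}
	v, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || v < math.MinInt32 || v > math.MaxInt32 {
		return fallback
	}
	return int(v)
}

func deriveMaxConcurrentMoves(cfg map[string]string) int {
	moves := readIntConfig(cfg, "max_concurrent_moves", 10)
	if moves < 1 {
		moves = 1
	}
	if moves > 50 { // value-range clamp from the commit text
		moves = 50
	}
	return moves
}

func main() {
	fmt.Println(deriveMaxConcurrentMoves(map[string]string{"max_concurrent_moves": "75"}))         // 50: clamped
	fmt.Println(deriveMaxConcurrentMoves(map[string]string{"max_concurrent_moves": "9999999999"})) // 10: overflow falls back
}
```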