96 Commits

Author SHA1 Message Date
Chris Lu
1c0e24f06a fix(balance): don't move remote-tiered volumes; don't fatal on missing .idx (#9335)
* fix(volume): don't fatal on missing .idx for remote-tiered volume

A .vif left behind without its .idx (orphaned by a crashed move, partial
copy, or hand-edit) would trip glog.Fatalf in checkIdxFile and take the
whole volume server down on boot, killing every healthy volume on it
too. For remote-tiered volumes treat it as a per-volume load error so
the server can come up and the operator can clean up the stray .vif.

Refs #9331.

* fix(balance): skip remote-tiered volumes in admin balance detection

The admin/worker balance detector had no equivalent of the shell-side
guard ("does not move volume in remote storage" in
command_volume_balance.go), so it scheduled moves on remote-tiered
volumes. The "move" copies .idx/.vif to the destination and then calls
Volume.Destroy on the source, which calls backendStorage.DeleteFile —
deleting the remote object the destination's new .vif now points at.

Populate HasRemoteCopy on the metrics emitted by both the admin
maintenance scanner and the worker's master poll, then drop those
volumes at the top of Detection.

Fixes #9331.

* Apply suggestion from @gemini-code-assist[bot]

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix(volume): keep remote data on volume-move-driven delete

The on-source delete after a volume move (admin/worker balance and
shell volume.move) ran Volume.Destroy with no way to opt out of the
remote-object cleanup. Volume.Destroy unconditionally calls
backendStorage.DeleteFile for remote-tiered volumes, so a successful
move would copy .idx/.vif to the destination and then nuke the cloud
object the destination's new .vif was already pointing at.

Add VolumeDeleteRequest.keep_remote_data and plumb it through
Store.DeleteVolume / DiskLocation.DeleteVolume / Volume.Destroy. The
balance task and shell volume.move set it to true; the post-tier-upload
cleanup of other replicas and the over-replication trim in
volume.fix.replication also set it to true since the remote object is
still referenced. Other real-delete callers keep the default. The
delete-before-receive path in VolumeCopy also sets it: the inbound copy
carries a .vif that may reference the same cloud object as the
existing volume.

Refs #9331.

* test(storage): in-process remote-tier integration tests

Cover the four operations the user is most likely to run against a
cloud-tiered volume — balance/move, vacuum, EC encode, EC decode — by
registering a local-disk-backed BackendStorage as the "remote" tier and
exercising the real Volume / DiskLocation / EC encoder code paths.

Locks in:
- Destroy(keepRemoteData=true) preserves the remote object (move case)
- Destroy(keepRemoteData=false) deletes it (real-delete case)
- Vacuum/compact on a remote-tier volume never deletes the remote object
- EC encode requires the local .dat (callers must download first)
- EC encode + rebuild round-trips after a tier-down

Tests run in-process and finish in under a second total — no cluster,
binary, or external storage required.

* fix(rust-volume): keep remote data on volume-move-driven delete

Mirror the Go fix in seaweed-volume: plumb keep_remote_data through
grpc volume_delete → Store.delete_volume → DiskLocation.delete_volume
→ Volume.destroy, and skip the s3-tier delete_file call when the flag
is set. The pre-receive cleanup in volume_copy passes true for the
same reason as the Go side: the inbound copy carries a .vif that may
reference the same cloud object as the existing volume.

The Rust loader already warns rather than fataling on a stray .vif
without an .idx (volume.rs load_index_inmemory / load_index_redb), so
no counterpart to the Go fatal-on-missing-idx fix is needed.

Refs #9331.

* fix(volume): preserve remote tier on IO-error eviction; fix EC test target

Two review nits:

- Store.MaybeAddVolumes' periodic cleanup pass deleted IO-errored
  volumes with keepRemoteData=false, so a transient local fault on a
  remote-tiered volume would also nuke the cloud object. Track the
  delete reason via a parallel slice and pass keepRemoteData=v.HasRemoteFile()
  for IO-error evictions; TTL-expired evictions still pass false.

- TestRemoteTier_ECEncodeDecode_AfterDownload deleted shards 0..3 but
  called them "parity" — by the klauspost/reedsolomon convention shards
  0..DataShardsCount-1 are data and DataShardsCount..TotalShardsCount-1
  are parity. Switch the loop to delete the parity range so the
  intent matches the indices.

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-05-06 15:19:43 -07:00
Chris Lu
0fed72d95a volume.tier.move: fulfill target replication before deleting old replicas (#8950)
* volume.tier.move: fulfill target replication before deleting old replicas

When -toReplication is specified, volume.tier.move now creates all
required replicas on the destination tier before deleting old replicas.
This closes the data-loss window where only one copy existed on the
target tier while awaiting volume.fix.replication.

If replication fulfillment fails, old replicas are preserved and marked
writable so the volume remains accessible.

Also extracts replicateVolumeToServer and configureVolumeReplication
helpers to reduce duplication across volume.tier.move and
volume.fix.replication.

Fixes #8937

* volume.tier.move: always fulfill replication before deleting old replicas

When -toReplication is specified, use that replication setting.
Otherwise, read the volume's existing replication from the super block.
In both cases, all required replicas are created on the destination
tier before old replicas are deleted.

If replication fulfillment fails (e.g. not enough destination nodes),
old replicas are preserved and marked writable so no data is lost.

* volume.tier.move: address review feedback on ensureReplicationFulfilled

- Add 5s delay before re-collecting topology to allow master heartbeat
  propagation after the move
- Add nil guard for targetTierReplicas to prevent panic if the moved
  replica is not yet visible in the topology
- Treat configureVolumeReplication failure as a hard error instead of a
  warning, so the rollback logic preserves old replicas

* volume.tier.move: harden replication config error handling

- Make configureVolumeReplication failure on the primary moved replica a
  hard error that aborts the move, instead of logging and continuing
- Configure replication metadata on all existing target-tier replicas
  (not just newly created ones) when -toReplication is specified
- Deletion of old replicas cannot affect new replicas since the
  locations list only contains pre-move servers (verified, no change)

* volume.tier.move: fix cleanup deleting fulfilled replicas and broken recovery

Fix 1: The cleanup loop now preserves pre-existing target-tier replicas
that ensureReplicationFulfilled counted toward the replication target.
Previously, a mixed-tier volume with an existing replica on the target
tier could have that replica deleted right after being counted as
fulfilled, leaving the volume under-replicated.

ensureReplicationFulfilled now returns a preserveServers set that the
deletion loop checks before removing any old replica.

Fix 2: Failure paths after LiveMoveVolume (which deletes the source
replica) now use restoreSurvivingReplicasWritable instead of
markVolumeReplicasWritable. The old helper stopped on first error, so
attempting to mark the already-deleted source writable would prevent
all surviving replicas from being restored. The new helper skips the
deleted source and continues through all remaining locations, logging
per-replica errors instead of aborting.

* volume.tier.move: mark preserved replicas writable, skip nodes with existing volume

Fix 1: Preserved pre-existing target-tier replicas were left read-only
after the move completed. They were marked read-only at the start
(along with all other replicas) but never restored since the old code
deleted them. Now they are explicitly marked writable before cleanup.

Fix 2: The fulfillment loop could pick a candidate node that already
hosts this volume on a different disk type, causing a VolumeCopy
conflict. Added a guard that skips any node already hosting the volume
(on any disk) before attempting replication.
2026-04-06 14:55:37 -07:00
qzh
4c72512ea2 fix(shell): avoid marking skipped or unplaced volumes as fixed (#8866)
* fix(s3api): fix AWS Signature V2 format and validation

* fix(s3api): Skip space after "AWS" prefix (+1 offset)

* test(s3api): add unit tests for Signature V2 authentication fix

* fix(s3api): simply comparing signatures

* validation for the colon extraction in expectedAuth

* fix(shell): avoid marking skipped or unplaced volumes as fixed

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2026-04-01 01:20:25 -07:00
Chris Lu
0a46577700 Fix #8040: Support '_default' keyword in collectionPattern to match default collection (#8046)
* Fix #8040: Support 'default' keyword in collectionPattern to match default collection

The default collection in SeaweedFS is represented as an empty string internally.
Previously, it was impossible to specifically target only the default collection
because:
- Empty collectionPattern matched ALL collections (filter was skipped)
- Using collectionPattern="default" tried to match the literal string "default"

This commit adds special handling for the keyword "default" in collectionPattern
across multiple shell commands:
- volume.tier.move
- volume.list
- volume.fix.replication
- volume.configure.replication

Now users can use -collectionPattern="default" to specifically target volumes
in the default collection (empty collection name), while maintaining backward
compatibility where empty pattern matches all collections.

Updated help text to document this feature.

* Update compileCollectionPattern to support 'default' keyword

This extends the fix to all commands that use regex-based collection
pattern matching:
- ec.encode
- ec.decode
- volume.tier.download
- volume.balance

The compileCollectionPattern function now treats "default" as a special
keyword that compiles to the regex "^$" (matching empty strings), making
it consistent with the other commands that use filepath.Match.

* Use CollectionDefault constant instead of hardcoded "default" string

Refactored the collection pattern matching logic to use a central constant
CollectionDefault defined in weed/shell/common.go. This improves maintainability
and ensures consistency across all shell commands.

* Address PR review feedback: simplify logic and use '_default' keyword

Changes:
1. Changed CollectionDefault from "default" to "_default" to avoid collision
   with literal collection names
2. Simplified pattern matching logic to reduce code duplication across all
   affected commands
3. Fixed error handling in command_volume_tier_move.go to properly propagate
   filepath.Match errors instead of swallowing them
4. Updated documentation to clarify how to match a literal "default"
   collection using regex patterns like "^default$"

This addresses all feedback from PR review comments.

* Remove unnecessary documentation about matching literal 'default'

Since we changed the keyword to '_default', users can now simply use
'default' to match a literal collection named "default". The previous
documentation about using regex patterns was confusing and no longer needed.

* Fix error propagation and empty pattern handling

1. command_volume_tier_move.go: Added early termination check after
   eachDataNode callback to stop processing remaining nodes if a pattern
   matching error occurred, improving efficiency

2. command_volume_configure_replication.go: Fixed empty pattern handling
   to match all collections (collectionMatched = true when pattern is empty),
   mirroring the behavior in other commands

These changes address the remaining PR review feedback.
2026-01-16 12:31:48 -08:00
Chris Lu
2b3ff3cd05 verbose mode 2025-12-26 12:42:00 -08:00
steve.wei
5c25df20f2 feat(volume.fix): show all replica locations for misplaced volumes (#7560) 2025-11-27 00:04:45 -08:00
Lisandro Pin
9744382a18 Rework parameters passing for functions within volume.check.disk. (#7448)
* Rework parameters passing for functions within `volume.check.disk`.

We'll need to rework this logic to account for read-only volumes, and there're already way too many parameters shuffled around.

Grouping these into a single struct simplifies the overall codebase.

* similar fix

* Improved Error Handling in Tests

* propagate the errors

* edge cases

* edge case on modified time

* clean up

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
2025-11-10 16:03:38 -08:00
Lisandro Pin
76e4a51964 Unify the parameter to disable dry-run on weed shell commands to -apply (instead of -force). (#7450)
* Unify the parameter to disable dry-run on weed shell commands to --apply (instead of --force).

* lint

* refactor

* Execution Order Corrected

* handle deprecated force flag

* fix help messages

* Refactoring]: Using flag.FlagSet.Visit()

* consistent with other commands

* Checks for both flags

* fix toml files

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
2025-11-09 19:58:38 -08:00
nightcoffee
aed91baa2e [weed] update volume.fix.replication description (#7340)
* [weed] update volume.fix.replication description

* Update master-cloud.toml

* Update master.toml
2025-10-21 12:38:40 -07:00
Roman Shishkin
b6f5fb4b45 Human-readable processed bytes in volume.fix.replication (#7253) 2025-09-18 06:44:40 -07:00
Lisandro Pin
64198dad83 Paralleize operations for weed shell's volume.fix.replication. (#6789)
Paralleize operations for `weed shell`s `volume.fix.replication`.
2025-07-30 10:58:30 -07:00
dsd
da2a234b00 [weed] change -n to -force (#6421) 2025-01-08 09:57:18 -08:00
dsd
20cbc9e4eb skip error while executing volume.fix.replication (#6382) 2024-12-20 07:36:13 -08:00
chrislu
ec155022e7 "golang.org/x/exp/slices" => "slices" and go fmt 2024-12-19 19:25:06 -08:00
Trim21
d43fa07f06 use readable bytes size string in shell output (#6288) 2024-11-25 17:25:17 -08:00
Konstantin Lebedev
a143c888e5 [shell] don't require lock when there are no changes for volume.fix.replication (#6266)
* don't require lock when there are no changes

* revert takeAction
2024-11-21 08:17:25 -08:00
chrislu
07cf8cf22d minor 2024-11-19 08:31:33 -08:00
Lisandro Pin
0d5393641e Unify usage of shell.EcNode.dc as DataCenterId. (#6258) 2024-11-19 06:33:18 -08:00
chrislu
20929f2a57 adjust resource heavy for volume.fix.replication 2024-09-29 11:32:18 -07:00
chrislu
6564ceda91 skip resource heavy commands from running on master nodes 2024-09-29 10:51:17 -07:00
chrislu
ec30a504ba refactor 2024-09-29 10:38:22 -07:00
chrislu
701abbb9df add IsResourceHeavy() to command interface 2024-09-28 20:23:01 -07:00
Max Denushev
f1e700ce2f Fix/copy before delete replication (#6064)
* fix(shell): volume.fix.replication misplaced volumes unsatisfying replication factor

* fix(shell): simplify replication check

* fix(shell): add test for satisfyReplicaCurrentLocation
2024-09-26 08:34:13 -07:00
skycope
6e4b9181f5 fix "volume.fix.replication" move many replications only to one volumeServer (#5522) 2024-04-23 06:33:50 -07:00
steve.wei
67ead9b18f fix(volume.fix.replication): adjust volume count, not free volume count (#5479) 2024-04-08 07:30:04 -07:00
zehweh
2b9dda7d2e fix isMisplaced() in command_volume_fix_replication.go (#4988) 2023-11-07 07:58:19 -08:00
Konstantin Lebedev
dffe00a822 fix: logger place msg (#4880) 2023-10-02 08:29:09 -07:00
Konstantin Lebedev
dd580190b4 fix: avoid deleting one replica without sync (#4875)
* fix: avoid deleting one replica without sync
https://github.com/seaweedfs/seaweedfs/issues/4647

* Update weed/shell/command_volume_fix_replication.go

Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>

* fix: revert this existing do option to positive

---------

Co-authored-by: Konstantin Lebedev <9497591+kmlebedev@users.noreply.github.co>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2023-09-27 23:12:10 -07:00
Konstantin Lebedev
df4ded758e fix: avoid deleting more than one replica (#4873)
https://github.com/seaweedfs/seaweedfs/issues/4647

Co-authored-by: Konstantin Lebedev <9497591+kmlebedev@users.noreply.github.co>
2023-09-26 00:20:48 -07:00
chrislu
645ae8c57b Revert "Revert "Merge branch 'master' of https://github.com/seaweedfs/seaweedfs""
This reverts commit 8cb42c39
2023-09-25 09:35:16 -07:00
chrislu
8cb42c39ad Revert "Merge branch 'master' of https://github.com/seaweedfs/seaweedfs"
This reverts commit 2e5aa06026, reversing
changes made to 4d414f54a2.
2023-09-18 16:12:50 -07:00
dependabot[bot]
a04bd4d26f Bump github.com/rclone/rclone from 1.63.1 to 1.64.0 (#4850)
* Bump github.com/rclone/rclone from 1.63.1 to 1.64.0

Bumps [github.com/rclone/rclone](https://github.com/rclone/rclone) from 1.63.1 to 1.64.0.
- [Release notes](https://github.com/rclone/rclone/releases)
- [Changelog](https://github.com/rclone/rclone/blob/master/RELEASE.md)
- [Commits](https://github.com/rclone/rclone/compare/v1.63.1...v1.64.0)

---
updated-dependencies:
- dependency-name: github.com/rclone/rclone
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* API changes

* go mod

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
2023-09-18 14:43:05 -07:00
chrislu
3365468d0d added an error message 2023-08-08 20:35:21 -07:00
Konstantin Lebedev
25535e9c36 Delete volume is empty (#4561)
* use onlyEmpty for deleteVolume
https://github.com/seaweedfs/seaweedfs/issues/4559

* fix IsEmpty

* fix test

---------

Co-authored-by: Konstantin Lebedev <9497591+kmlebedev@users.noreply.github.co>
2023-06-12 10:42:44 -07:00
chrislu
214b7cd286 volume.fix.replication: adjust the retry checking times 2023-02-22 10:47:52 -08:00
chrislu
21c0587900 go fmt 2022-09-14 23:06:44 -07:00
Brian
4e3e2b1b82 Add option in volume.fix.replication to only fix under-replication and not delete volumes (#3640) 2022-09-10 08:05:28 -07:00
chrislu
676e27c589 shell: stop long running jobs if lock is lost 2022-08-22 14:12:23 -07:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
chrislu
f97acdd489 volume.fix.replication fix retry logic
fix https://github.com/chrislusf/seaweedfs/issues/3136
2022-06-03 08:45:29 -07:00
Konstantin Lebedev
44f53ceda6 fix collectionIsMismatch charset 2022-05-16 13:23:23 +05:00
Konstantin Lebedev
10d435f2c2 fix skip loop 2022-05-16 13:16:27 +05:00
Konstantin Lebedev
279053572c avoid delete volume replica if collection mismatch 2022-05-16 13:07:05 +05:00
justin
3551ca2fcf enhancement: replace sort.Slice with slices.SortFunc to reduce reflection 2022-04-18 10:35:43 +08:00
chrislu
f18803424a volume.balance: add delay during tight loop
fix https://github.com/chrislusf/seaweedfs/issues/2637
2022-02-08 00:53:55 -08:00
chrislu
9f9ef1340c use streaming mode for long poll grpc calls
streaming mode would create separate grpc connections for each call.
this is to ensure the long poll connections are properly closed.
2021-12-26 00:15:03 -08:00
chrislu
a2d3f89c7b add lock messages 2021-12-10 13:24:38 -08:00
chrislu
e6c026db65 volume.fix.replication: fix misplaced volumes
fix https://github.com/chrislusf/seaweedfs/issues/2416
2021-12-05 16:56:25 -08:00
Chris Lu
5435027ff0 volume copy: stream out copying progress and avoid grpc request timeout
fix https://github.com/chrislusf/seaweedfs/issues/2386
2021-10-24 02:52:56 -07:00
Chris Lu
e862b2529a refactor 2021-10-01 12:10:11 -07:00