46 Commits

Author SHA1 Message Date
Chris Lu
1c0e24f06a fix(balance): don't move remote-tiered volumes; don't fatal on missing .idx (#9335)
* fix(volume): don't fatal on missing .idx for remote-tiered volume

A .vif left behind without its .idx (orphaned by a crashed move, partial
copy, or hand-edit) would trip glog.Fatalf in checkIdxFile and take the
whole volume server down on boot, killing every healthy volume on it
too. For remote-tiered volumes treat it as a per-volume load error so
the server can come up and the operator can clean up the stray .vif.

Refs #9331.

* fix(balance): skip remote-tiered volumes in admin balance detection

The admin/worker balance detector had no equivalent of the shell-side
guard ("does not move volume in remote storage" in
command_volume_balance.go), so it scheduled moves on remote-tiered
volumes. The "move" copies .idx/.vif to the destination and then calls
Volume.Destroy on the source, which calls backendStorage.DeleteFile —
deleting the remote object the destination's new .vif now points at.

Populate HasRemoteCopy on the metrics emitted by both the admin
maintenance scanner and the worker's master poll, then drop those
volumes at the top of Detection.

Fixes #9331.

* Apply suggestion from @gemini-code-assist[bot]

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix(volume): keep remote data on volume-move-driven delete

The on-source delete after a volume move (admin/worker balance and
shell volume.move) ran Volume.Destroy with no way to opt out of the
remote-object cleanup. Volume.Destroy unconditionally calls
backendStorage.DeleteFile for remote-tiered volumes, so a successful
move would copy .idx/.vif to the destination and then nuke the cloud
object the destination's new .vif was already pointing at.

Add VolumeDeleteRequest.keep_remote_data and plumb it through
Store.DeleteVolume / DiskLocation.DeleteVolume / Volume.Destroy. The
balance task and shell volume.move set it to true; the post-tier-upload
cleanup of other replicas and the over-replication trim in
volume.fix.replication also set it to true since the remote object is
still referenced. Other real-delete callers keep the default. The
delete-before-receive path in VolumeCopy also sets it: the inbound copy
carries a .vif that may reference the same cloud object as the
existing volume.

Refs #9331.

* test(storage): in-process remote-tier integration tests

Cover the four operations the user is most likely to run against a
cloud-tiered volume — balance/move, vacuum, EC encode, EC decode — by
registering a local-disk-backed BackendStorage as the "remote" tier and
exercising the real Volume / DiskLocation / EC encoder code paths.

Locks in:
- Destroy(keepRemoteData=true) preserves the remote object (move case)
- Destroy(keepRemoteData=false) deletes it (real-delete case)
- Vacuum/compact on a remote-tier volume never deletes the remote object
- EC encode requires the local .dat (callers must download first)
- EC encode + rebuild round-trips after a tier-down

Tests run in-process and finish in under a second total — no cluster,
binary, or external storage required.

* fix(rust-volume): keep remote data on volume-move-driven delete

Mirror the Go fix in seaweed-volume: plumb keep_remote_data through
grpc volume_delete → Store.delete_volume → DiskLocation.delete_volume
→ Volume.destroy, and skip the s3-tier delete_file call when the flag
is set. The pre-receive cleanup in volume_copy passes true for the
same reason as the Go side: the inbound copy carries a .vif that may
reference the same cloud object as the existing volume.

The Rust loader already warns rather than fataling on a stray .vif
without an .idx (volume.rs load_index_inmemory / load_index_redb), so
no counterpart to the Go fatal-on-missing-idx fix is needed.

Refs #9331.

* fix(volume): preserve remote tier on IO-error eviction; fix EC test target

Two review nits:

- Store.MaybeAddVolumes' periodic cleanup pass deleted IO-errored
  volumes with keepRemoteData=false, so a transient local fault on a
  remote-tiered volume would also nuke the cloud object. Track the
  delete reason via a parallel slice and pass keepRemoteData=v.HasRemoteFile()
  for IO-error evictions; TTL-expired evictions still pass false.

- TestRemoteTier_ECEncodeDecode_AfterDownload deleted shards 0..3 but
  called them "parity" — by the klauspost/reedsolomon convention shards
  0..DataShardsCount-1 are data and DataShardsCount..TotalShardsCount-1
  are parity. Switch the loop to delete the parity range so the
  intent matches the indices.

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-05-06 15:19:43 -07:00
Chris Lu
d605feb403 refactor(command): expand "~" in all path-style CLI flags (#9306)
* refactor(command): expand "~" in all path-style CLI flags

Many of weed's path-bearing flags (-s3.config, -s3.iam.config,
-admin.dataDir, -webdav.cacheDir, -volume.dir.idx, TLS cert/key
files, profile output paths, mount cache dirs, sftp key files, ...)
were never run through util.ResolvePath, so a value like "~/iam.json"
was used literally. Tilde only worked when the shell expanded it,
which silently fails for the common -flag=~/path form (bash leaves
the tilde literal in --opt=~/path).

- Extend util.ResolvePath to also handle "~user" / "~user/rest",
  matching shell tilde expansion. Add unit tests.
- Apply util.ResolvePath at the top of each shared start* function
  (s3, webdav, sftp) so mini/server/filer/standalone callers all
  inherit it; resolve at the few one-off use sites (mount cache
  dirs, volume idx folder, mini admin.dataDir, profile paths).
- Drop the duplicate expandHomeDir helper from admin.go in favor of
  the now-equivalent util.ResolvePath.

* fixup: handle comma-separated -dir flags for tilde expansion

`weed mini -dir`, `weed server -dir`, and `weed volume -dir` accept
comma-separated paths (`dir[,dir]...`). Calling util.ResolvePath on
the whole string mishandled multi-folder values with tilde, e.g.
"~/d1,~/d2" would resolve as if "d1,~/d2" were a single subpath.

- Add util.ResolveCommaSeparatedPaths: split on ",", run each entry
  through ResolvePath, rejoin. Short-circuits when no "~" present.
- Use it for *miniDataFolders (mini.go), *volumeDataFolders (server.go),
  and resolve each entry of v.folders in-place (volume.go) so all
  downstream consumers see resolved paths.
- Add 7-case TestResolveCommaSeparatedPaths covering empty, single,
  multiple, and mixed inputs.

* address PR review: metaFolder + Windows backslash

- master.go: resolve *m.metaFolder at the top of runMaster so
  util.FullPath(*m.metaFolder) on the next line sees an expanded
  path. Drop the now-redundant ResolvePath in TestFolderWritable.
- server.go: same treatment for *masterOptions.metaFolder, paired
  with the existing cpu/mem profile resolves. Drop the redundant
  inner ResolvePath at TestFolderWritable.
- file_util.go: ResolvePath now accepts filepath.Separator as a
  separator after the tilde, so "~\\data" works on Windows. Other
  platforms keep current behaviour (backslash stays literal because
  it is a valid filename character in usernames and paths).
- file_util_test.go: add two cases using filepath.Separator that
  exercise the new code path on Windows and remain a no-op on Unix.

* address PR review: resolve "~" in remaining command path flags

Comprehensive sweep of path-bearing flags across every weed
subcommand, applying util.ResolvePath in-place at the top of each
run* function so all downstream consumers see expanded paths.

- webdav.go: resolve *wo.cacheDir at the top of startWebDav so
  mini/server/filer/standalone callers all inherit it.
- mount_std.go: cpu/mem profile paths.
- filer_sync.go: cpu/mem profile paths.
- mq_broker.go: cpu/mem profile paths.
- benchmark.go: cpuprofile output path.
- backup.go: -dir resolved once at runBackup; drop the duplicated
  inline ResolvePath in NewVolume calls.
- compact.go: -dir resolved at runCompact; drop inline ResolvePath.
- export.go: -dir and -o resolved at runExport; drop inline
  ResolvePath in LoadFromIdx and ScanVolumeFile.
- download.go: -dir resolved at runDownload; drop inline.
- update.go: -dir resolved at runUpdate so filepath.Join uses the
  expanded path; drop inline ResolvePath in TestFolderWritable.
- scaffold.go: -output expanded before filepath.Join.
- worker.go: -workingDir expanded before being passed to runtime.

* address PR review: resolve option-struct paths at run* entry points

server.go:381 propagates s3Options.config to filerOptions.s3ConfigFile
*before* startS3Server runs, which meant the filer-side code saw the
unresolved tilde-prefixed pointer. Same pattern for webdavOptions and
sftpOptions (and equivalent in mini.go / filer.go).

The fix: hoist resolution from the shared start* functions up to the
run* entry points, where every shared pointer is set up before any
propagation happens.

- s3.go, webdav.go, sftp.go: extract a resolvePaths() method on each
  Options struct that runs every path field through util.ResolvePath
  in-place. Idempotent.
- runS3, runWebDav, runSftp: call the standalone struct's resolvePaths
  before starting metrics / loading security config.
- runServer, runMini, runFiler: call resolvePaths on every embedded
  options struct, plus resolve loose flags (serverIamConfig,
  miniS3Config, miniIamConfig, miniMasterOptions.metaFolder, and
  filer's defaultLevelDbDirectory) so they're expanded before any
  pointer copy or use.
- Drop the now-redundant inline ResolvePath at filer's
  defaultLevelDbDirectory composition.

* address PR review: re-resolve mini -dir post-config, cover misc paths

- mini.go: applyConfigFileOptions can overwrite -dir with a literal
  ~/data from mini.options. Re-resolve *miniDataFolders after the
  config-file apply, alongside the other path resolves, so the mini
  filer no longer ends up with a literal ~/data/filerldb2.
- benchmark.go: resolve *b.idListFile (-list).
- filer_sync.go: resolve *syncOptions.aSecurity / .bSecurity
  (-a.security / -b.security) before LoadClientTLSFromFile.
- filer_cat.go: resolve *filerCat.output (-o) before os.OpenFile.
- admin.go: drop trailing blank line at EOF (git diff --check).

* address PR review: resolve -a.security/-b.security/-config before use

Three follow-up fixes:

- filer_sync.go: the -a.security / -b.security resolves were placed
  *after* LoadClientTLSFromFile / LoadHTTPClientFromFile were called,
  so weed filer.sync -a.security=~/a.toml still passed the literal
  tilde path. Hoist the resolves above the security-loading block so
  TLS clients see expanded paths.
- filer_sync_verify.go: same flag pair was never resolved at all in
  the verify command; resolve at the top of runFilerSyncVerify.
- filer_meta_backup.go: -config (the backup_filer.toml path) was
  passed directly to viper. Resolve at the top of runFilerMetaBackup.
- mini.go: master.dir defaulted to the entire comma-joined
  miniDataFolders. With weed mini -dir=~/d1,~/d2 (or any multi-dir
  setup), TestFolderWritable then stat'd the joined string instead
  of a single directory. Default to the first entry via StringSplit
  to mirror the disk-space calculation a few lines below, and drop
  the now-redundant ResolvePath in TestFolderWritable.
2026-05-03 21:46:21 -07:00
Lisandro Pin
0721e3c1e9 Rework volume compaction (a.k.a vacuuming) logic to cleanly support new parameters. (#8337)
We'll leverage on this to support a "ignore broken needles" option, necessary
to properly recover damaged volumes, as described in
https://github.com/seaweedfs/seaweedfs/issues/7442#issuecomment-3897784283 .
2026-02-16 02:15:14 -08:00
Chris Lu
50f067bcfd backup: handle volume not found when backing up (#7465)
* handle volume not found when backing up

* error handling on reading volume ttl and replication

* fix Inconsistent error handling: should continue to next location.

* adjust messages

* close volume

* refactor

* refactor

* proper v.Close()
2025-11-11 08:52:23 -08:00
Chris Lu
5ab49e2971 Adjust cli option (#7418)
* adjust "weed benchmark" CLI to use readOnly/writeOnly

* consistently use "-master" CLI option

* If both -readOnly and -writeOnly are specified, the current logic silently allows it with -writeOnly taking precedence. This is confusing and could lead to unexpected behavior.
2025-10-31 17:08:00 -07:00
chrislu
2f1b3d68d7 pass volume version when creating a volume 2025-06-19 01:15:25 -07:00
qinguoyi
b74b506e52 add command backup destory volume error log (#5846) 2024-08-02 00:04:57 -07:00
vadimartynov
b796c21fa9 Added loadSecurityConfigOnce (#5792) 2024-07-16 09:15:55 -07:00
vadimartynov
8aae82dd71 Added context for the MasterClient's methods to avoid endless loops (#5628)
* Added context for the MasterClient's methods to avoid endless loops

* Returned WithClient function. Added WithClientCustomGetMaster function

* Hid unused ctx arguments

* Using a common context for the KeepConnectedToMaster and WaitUntilConnected functions

* Changed the context termination check in the tryConnectToMaster function

* Added a child context to the tryConnectToMaster function

* Added a common context for KeepConnectedToMaster and WaitUntilConnected functions in benchmark
2024-06-14 11:40:34 -07:00
柏杰
0b0fb9b9e4 avoid data race read volume.IsEmpty (#4574)
* avoid data race read volume.IsEmpty

-   avoid phantom read isEmpty for onlyEmpty
-   use `v.DataBackend.GetStat()` in v.dataFileAccessLock scope

* add Destroy(onlyEmpty: true) test

* add Destroy(onlyEmpty: false) test

* remove unused `IsEmpty()`

* change literal `8` to `SuperBlockSize`
2023-06-14 14:39:58 -07:00
Guo Lei
5b905fb2b7 Lazy loading (#3958)
* types packages is imported more than onece

* lazy-loading

* fix bugs

* fix bugs

* fix unit tests

* fix test error

* rename function

* unload ldb after initial startup

* Don't load ldb when starting volume server if ldbtimeout is set.

* remove uncessary unloadldb

* Update weed/command/server.go

Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>

* Update weed/command/volume.go

Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>

Co-authored-by: guol-fnst <goul-fnst@fujitsu.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2022-11-14 00:19:27 -08:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
chrislu
735038b2c1 backup do not need to use preallocation
fix https://github.com/chrislusf/seaweedfs/issues/3044
2022-05-13 13:46:52 -07:00
Chris Lu
3be3c17f59 volume vacuum: avoid timeout with streaming progress report
fix https://github.com/chrislusf/seaweedfs/issues/2396
2021-10-24 01:55:34 -07:00
Chris Lu
e5fc35ed0c change server address from string to a type 2021-09-12 22:47:52 -07:00
Chris Lu
d1d1fc772c move some volume lookup operations to grpc
jwt related lookup will come in next commit
2021-08-12 20:33:00 -07:00
Chris Lu
3575d41009 go fmt 2021-02-17 20:57:08 -08:00
Chris Lu
6daa932f5c refactoring to get master function, instead of passing master values directly
this will enable retrying later
2021-02-17 20:55:55 -08:00
Chris Lu
6d30b21b10 volume: add "-dir.idx" option for separate index storage
fix https://github.com/chrislusf/seaweedfs/issues/1265
2020-11-27 03:17:10 -08:00
Chris Lu
f43146b237 resolve directories if containing home directory 2020-07-16 22:50:14 -07:00
Chris Lu
d439d83772 volume: follow compactionBytePerSecond
related to https://github.com/chrislusf/seaweedfs/issues/1108
2020-03-11 10:32:17 -07:00
Chris Lu
d335f04de6 support env variables to overwrite toml file 2020-01-29 09:09:55 -08:00
Chris Lu
efd2f50ede compaction changed to .idx based deletion 2019-12-24 14:55:50 -08:00
Chris Lu
09ca936c78 shell: add ec.decode command 2019-12-23 12:48:20 -08:00
Chris Lu
19b6a16003 changed from os.file to backend.DataStorageBackend 2019-10-29 00:35:16 -07:00
j.laycock
6fc6322c90 Change joeslay paths to chrislusf paths 2019-09-12 14:18:21 +01:00
Tom Maxwell
4a878c0006 Changed the InMemory bool to a uint32 so that it can be used to alter how much space to reserve 2019-09-04 15:27:14 +01:00
j.laycock
1f01eb78e8 Rename mem_map to mMap, remove some in_memory variables being passed around, added MemoryMapped member to volume struct 2019-09-03 17:00:59 +01:00
Tom Maxwell
d637d86d22 Changes to try and pass the URL parameters through - in memory flag not working still 2019-09-03 15:41:28 +01:00
j.laycock
595a1beff0 Swap imports to use joeslay 2019-09-02 11:28:40 +01:00
xushuxun
5904d78bd4 weed backup: add ttl and replication parameter 2019-08-16 11:05:22 +08:00
Chris Lu
ede876cfdb periodic scripts exeuction from leader master 2019-06-05 01:30:24 -07:00
Chris Lu
b335f81a4f volume: add option to limit compaction speed 2019-05-03 17:22:39 -07:00
Chris Lu
3b3651dea3 volume: atomic copying file, adds version and stopOffset 2019-04-19 12:29:49 -07:00
Chris Lu
ac2727853f fix needle map entry size 2019-04-19 00:39:34 -07:00
Chris Lu
e5506152c0 refactoring 2019-04-18 21:43:36 -07:00
Chris Lu
13ad5c1966 refactoring 2019-04-17 22:04:49 -07:00
Wine93
361912224d typo: remove blank 2019-04-11 09:18:53 +00:00
Chris Lu
7a14cdc90c refactoring, go fmt 2019-03-25 23:18:40 -07:00
Chris Lu
70815e9124 WIP 2019-03-25 09:16:12 -07:00
Chris Lu
77b9af531d adding grpc mutual tls 2019-02-18 12:11:52 -08:00
Sergey
aa5ccff6d2 fixing of typos 2019-02-06 18:59:15 +05:00
bingoohuang
ab6be025d7 go fmt and fix some typo 2019-01-17 09:17:19 +08:00
Chris Lu
fda771c83f migrate volume sync status to grpc API on volume server 2018-10-15 01:19:15 -07:00
Chris Lu
ed44f12f6d support Fallocate on linux 2017-01-08 11:01:46 -08:00
Chris Lu
5ce6bbf076 directory structure change to work with glide
glide has its own requirements. My previous workaround caused me some
code checkin errors. Need to fix this.
2016-06-02 18:09:14 -07:00