* refactor(command): expand "~" in all path-style CLI flags
Many of weed's path-bearing flags (-s3.config, -s3.iam.config,
-admin.dataDir, -webdav.cacheDir, -volume.dir.idx, TLS cert/key
files, profile output paths, mount cache dirs, sftp key files, ...)
were never run through util.ResolvePath, so a value like "~/iam.json"
was used literally. Tilde expansion only happened when the shell
performed it, and that silently fails for the common -flag=~/path form
(bash leaves the tilde literal in --opt=~/path).
- Extend util.ResolvePath to also handle "~user" / "~user/rest",
matching shell tilde expansion. Add unit tests.
- Apply util.ResolvePath at the top of each shared start* function
(s3, webdav, sftp) so mini/server/filer/standalone callers all
inherit it; resolve at the few one-off use sites (mount cache
dirs, volume idx folder, mini admin.dataDir, profile paths).
- Drop the duplicate expandHomeDir helper from admin.go in favor of
the now-equivalent util.ResolvePath.
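A minimal sketch of the extended expansion, assuming the helper keeps its current name and shape (util.ResolvePath(path string) string); error handling details and the Windows separator support added later are left out:
```go
package util

import (
	"os"
	"os/user"
	"strings"
)

// ResolvePath expands a leading "~" or "~user" the way a shell would:
//   "~"          -> current user's home directory
//   "~/sub"      -> current user's home + "/sub"
//   "~alice"     -> alice's home directory
//   "~alice/sub" -> alice's home + "/sub"
// Anything else is returned unchanged.
func ResolvePath(path string) string {
	if !strings.HasPrefix(path, "~") {
		return path
	}
	name, rest := path[1:], ""
	if i := strings.Index(name, "/"); i >= 0 {
		name, rest = name[:i], name[i:]
	}
	var home string
	if name == "" {
		h, err := os.UserHomeDir()
		if err != nil {
			return path // cannot resolve, keep the literal value
		}
		home = h
	} else {
		u, err := user.Lookup(name)
		if err != nil {
			return path // unknown user, keep the literal value
		}
		home = u.HomeDir
	}
	return home + rest
}
```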
* fixup: handle comma-separated -dir flags for tilde expansion
`weed mini -dir`, `weed server -dir`, and `weed volume -dir` accept
comma-separated paths (`dir[,dir]...`). Calling util.ResolvePath on
the whole string mishandled multi-folder values with tilde, e.g.
"~/d1,~/d2" would resolve as if "d1,~/d2" were a single subpath.
- Add util.ResolveCommaSeparatedPaths: split on ",", run each entry
through ResolvePath, rejoin. Short-circuits when no "~" present.
- Use it for *miniDataFolders (mini.go), *volumeDataFolders (server.go),
and resolve each entry of v.folders in-place (volume.go) so all
downstream consumers see resolved paths.
- Add 7-case TestResolveCommaSeparatedPaths covering empty, single,
multiple, and mixed inputs.
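The wrapper is a thin layer over the single-path helper; a sketch in the same hypothetical package as the ResolvePath sketch above:
```go
// ResolveCommaSeparatedPaths expands "~" in each comma-separated entry,
// so "~/d1,~/d2" becomes "/home/u/d1,/home/u/d2" instead of being
// treated as a single path.
func ResolveCommaSeparatedPaths(paths string) string {
	if !strings.Contains(paths, "~") {
		return paths // short-circuit: nothing to expand
	}
	parts := strings.Split(paths, ",")
	for i, p := range parts {
		parts[i] = ResolvePath(p)
	}
	return strings.Join(parts, ",")
}
```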
* address PR review: metaFolder + Windows backslash
- master.go: resolve *m.metaFolder at the top of runMaster so
util.FullPath(*m.metaFolder) on the next line sees an expanded
path. Drop the now-redundant ResolvePath in TestFolderWritable.
- server.go: same treatment for *masterOptions.metaFolder, paired
with the existing cpu/mem profile resolves. Drop the redundant
inner ResolvePath at TestFolderWritable.
- file_util.go: ResolvePath now accepts filepath.Separator as a
separator after the tilde, so "~\\data" works on Windows. Other
platforms keep current behaviour (backslash stays literal because
it is a valid filename character in usernames and paths).
- file_util_test.go: add two cases using filepath.Separator that
exercise the new code path on Windows and remain a no-op on Unix.
* address PR review: resolve "~" in remaining command path flags
Comprehensive sweep of path-bearing flags across every weed
subcommand, applying util.ResolvePath in-place at the top of each
run* function so all downstream consumers see expanded paths.
- webdav.go: resolve *wo.cacheDir at the top of startWebDav so
mini/server/filer/standalone callers all inherit it.
- mount_std.go: cpu/mem profile paths.
- filer_sync.go: cpu/mem profile paths.
- mq_broker.go: cpu/mem profile paths.
- benchmark.go: cpuprofile output path.
- backup.go: -dir resolved once at runBackup; drop the duplicated
inline ResolvePath in NewVolume calls.
- compact.go: -dir resolved at runCompact; drop inline ResolvePath.
- export.go: -dir and -o resolved at runExport; drop inline
ResolvePath in LoadFromIdx and ScanVolumeFile.
- download.go: -dir resolved at runDownload; drop inline.
- update.go: -dir resolved at runUpdate so filepath.Join uses the
expanded path; drop inline ResolvePath in TestFolderWritable.
- scaffold.go: -output expanded before filepath.Join.
- worker.go: -workingDir expanded before being passed to runtime.
* address PR review: resolve option-struct paths at run* entry points
server.go:381 propagates s3Options.config to filerOptions.s3ConfigFile
*before* startS3Server runs, which meant the filer-side code saw the
unresolved tilde-prefixed pointer. Same pattern for webdavOptions and
sftpOptions (and equivalent in mini.go / filer.go).
The fix: hoist resolution from the shared start* functions up to the
run* entry points, where every shared pointer is set up before any
propagation happens.
- s3.go, webdav.go, sftp.go: extract a resolvePaths() method on each
Options struct that runs every path field through util.ResolvePath
in-place. Idempotent.
- runS3, runWebDav, runSftp: call the standalone struct's resolvePaths
before starting metrics / loading security config.
- runServer, runMini, runFiler: call resolvePaths on every embedded
options struct, plus resolve loose flags (serverIamConfig,
miniS3Config, miniIamConfig, miniMasterOptions.metaFolder, and
filer's defaultLevelDbDirectory) so they're expanded before any
pointer copy or use.
- Drop the now-redundant inline ResolvePath at filer's
defaultLevelDbDirectory composition.
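The per-struct resolver looks roughly like this (field names are stand-ins, not the real option structs); it is idempotent because util.ResolvePath leaves already-expanded paths untouched:
```go
// s3Options is a stand-in for the real S3 option struct; only the
// path-bearing fields matter for this sketch.
type s3Options struct {
	config         *string // -s3.config
	iamConfig      *string // -s3.iam.config
	tlsCertificate *string
	tlsPrivateKey  *string
}

// resolvePaths expands "~" in every path field in place, so
// runServer/runMini/runFiler can call it on embedded structs before
// any pointer is copied or propagated.
func (o *s3Options) resolvePaths() {
	for _, p := range []*string{o.config, o.iamConfig, o.tlsCertificate, o.tlsPrivateKey} {
		if p != nil {
			*p = util.ResolvePath(*p)
		}
	}
}
```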
* address PR review: re-resolve mini -dir post-config, cover misc paths
- mini.go: applyConfigFileOptions can overwrite -dir with a literal
~/data from mini.options. Re-resolve *miniDataFolders after the
config-file apply, alongside the other path resolves, so the mini
filer no longer ends up with a literal ~/data/filerldb2.
- benchmark.go: resolve *b.idListFile (-list).
- filer_sync.go: resolve *syncOptions.aSecurity / .bSecurity
(-a.security / -b.security) before LoadClientTLSFromFile.
- filer_cat.go: resolve *filerCat.output (-o) before os.OpenFile.
- admin.go: drop trailing blank line at EOF (git diff --check).
* address PR review: resolve -a.security/-b.security/-config before use
Three follow-up fixes:
- filer_sync.go: the -a.security / -b.security resolves were placed
*after* LoadClientTLSFromFile / LoadHTTPClientFromFile were called,
so weed filer.sync -a.security=~/a.toml still passed the literal
tilde path. Hoist the resolves above the security-loading block so
TLS clients see expanded paths.
- filer_sync_verify.go: same flag pair was never resolved at all in
the verify command; resolve at the top of runFilerSyncVerify.
- filer_meta_backup.go: -config (the backup_filer.toml path) was
passed directly to viper. Resolve at the top of runFilerMetaBackup.
- mini.go: master.dir defaulted to the entire comma-joined
miniDataFolders. With weed mini -dir=~/d1,~/d2 (or any multi-dir
setup), TestFolderWritable then stat'd the joined string instead
of a single directory. Default to the first entry via StringSplit
to mirror the disk-space calculation a few lines below, and drop
the now-redundant ResolvePath in TestFolderWritable.
* feat(filer.backup): -initialSnapshot seeds destination from live tree
Replaying the metadata event log on a fresh sync only leaves files that
still exist on the source at replay time: any entry that was created and
later deleted is replayed as a create/delete pair and never materializes
on the destination. Users who wipe the destination and re-run
filer.backup therefore see "only new files" instead of a full backup,
even when -timeAgo=876000h is passed and the subscription genuinely
starts from epoch (ref discussion #8672).
Add a -initialSnapshot opt-in flag: when set on a fresh sync (no prior
checkpoint, -timeAgo unset), walk the live filer tree under -filerPath
via TraverseBfs and seed the destination through sink.CreateEntry, then
persist the walk-start timestamp as the checkpoint and subscribe from
there. Capturing the timestamp before the walk lets the subscription
catch any create/update/delete racing with the walk — sink CreateEntry
is idempotent across the builtin sinks so replay is safe.
Honors existing -filerExcludePaths / -filerExcludeFileNames /
-filerExcludePathPatterns filters and skips /topics/.system/log the
same way the subscription path does.
Also log "starting from <t> (no prior checkpoint)" instead of a
misleading "resuming from 1970-01-01" when the KV has no stored offset.
* fix(filer.backup): guard initialSnapshot counters under TraverseBfs workers
TraverseBfs fans the callback out across 5 worker goroutines, so the
entryCount / byteCount updates and the 5-second progress-log gate in
runInitialSnapshot were racing. Switch the counters to atomic.Int64 and
protect the lastLog check/update with a short-scoped mutex so the heavy
sink.CreateEntry call stays outside the critical section.
Flagged by gemini-code-assist on #9126; verified with go test -race.
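The shape of the fix inside runInitialSnapshot, roughly (names are illustrative; only the log gate sits inside the lock):
```go
var (
	entryCount atomic.Int64
	byteCount  atomic.Int64

	lastLogMu sync.Mutex
	lastLog   time.Time
)

// runs on each of TraverseBfs's worker goroutines
seed := func(key string, entry *filer_pb.Entry) error {
	entryCount.Add(1)
	byteCount.Add(int64(entry.GetAttributes().GetFileSize()))

	// only the check/update of lastLog sits inside the critical section
	lastLogMu.Lock()
	shouldLog := time.Since(lastLog) > 5*time.Second
	if shouldLog {
		lastLog = time.Now()
	}
	lastLogMu.Unlock()

	if shouldLog {
		glog.V(0).Infof("initial snapshot: %d entries, %d bytes", entryCount.Load(), byteCount.Load())
	}
	return createEntry(key, entry) // hypothetical wrapper around sink.CreateEntry, outside the mutex
}
```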
* fix(filer.backup): harden initialSnapshot against transient errors and path edge cases
Three review items from CodeRabbit on #9126:
1. getOffset errors no longer leave isFreshSync=true. Before, a transient
KV read failure would cause runFilerBackup's retry loop to redo the
full -initialSnapshot walk on every retry. Treat any offset-read
error as "not fresh" so the snapshot only runs when we've verified
there really is no prior checkpoint.
2. initialSnapshotTargetKey now normalizes sourcePath to a trailing-
slash base before stripping the prefix, so edge cases where
sourceKey equals sourcePath (trailing-slash mismatch or root-entry
emission) no longer index past the end. Unit tests cover both
forms.
3. Documented the TraverseBfs-enumerates-excluded-subtrees performance
characteristic on runInitialSnapshot, since pruning requires a
separate change to TraverseBfs itself.
* fix(filer.backup): retry setOffset after initialSnapshot to avoid full re-walks
If the snapshot walk finishes but the subsequent setOffset fails, the
retry loop in runFilerBackup will re-enter doFilerBackup with an empty
checkpoint and run the full BFS again — on a multi-million-entry tree
that's hours of wasted work over a 100-byte KV write. Retry the write a
handful of times with exponential backoff before giving up, and log
loudly at the final failure (with snapshotTsNs + sinkId) so operators
recognize the symptom instead of guessing at mysterious repeated walks.
Nitpick raised by CodeRabbit on #9126.
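Roughly (a sketch; the attempt count, backoff values, and the saveCheckpoint stand-in are illustrative):
```go
// persistCheckpointWithRetry keeps a transient failure of the tiny KV
// write from forcing the full BFS walk to run again on the next attempt.
func persistCheckpointWithRetry(sourceFiler string, sinkId int32, snapshotTsNs int64) {
	backoff := time.Second
	var lastErr error
	for attempt := 1; attempt <= 5; attempt++ {
		if lastErr = saveCheckpoint(sourceFiler, sinkId, snapshotTsNs); lastErr == nil {
			return
		}
		glog.Warningf("save snapshot checkpoint (attempt %d): %v", attempt, lastErr)
		time.Sleep(backoff)
		backoff *= 2 // exponential backoff
	}
	// loud final failure so operators recognize the symptom
	glog.Errorf("FAILED to persist snapshot checkpoint (snapshotTsNs=%d, sinkId=%d): %v; "+
		"the next run will redo the full initial snapshot walk", snapshotTsNs, sinkId, lastErr)
}
```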
* fix(filer.backup): initialSnapshot ignore404, skew margin, exclude dir-entry itself
Three review items from CodeRabbit on #9126:
1. ignore404Error now threads into runInitialSnapshot. If a file is listed
by TraverseBfs and then deleted before CreateEntry reads its chunks,
the follow path already ignores 404s — the snapshot path was aborting
and triggering a full re-walk. Treat an ignorable 404 as "skip this
entry, continue."
2. snapshotTsNs now uses `time.Now() - 1min` instead of `time.Now()`.
Metadata events are stamped server-side, so a fast backup-host clock
could skip events that fire during or right after the walk. Matches
the 1-minute margin meta_aggregator.go applies on initial peer
traversal; duplicate replay is harmless because CreateEntry is
idempotent.
3. Exclude checks now run against the entry's own full path, not just
its parent. A walked directory whose full path matches SystemLogDir
or -filerExcludePaths was being seeded to the destination; only its
descendants were being skipped. Verified with a manual repro where
-filerExcludePaths=/data/skipdir now keeps the skipdir entry itself
off the destination.
* refactor(filer): share destKey helper between buildKey and initialSnapshot
Extract destKey(dataSink, targetPath, sourcePath, sourceKey, mTime) from
buildKey in filer_sync.go. Both the event-log path (buildKey) and the
initialSnapshot walk (initialSnapshotTargetKey) now go through the same
helper, so a walk-seeded file and an event-replayed file always resolve
to the same destination key.
As a bonus, buildKey picks up the defensive trailing-slash normalization
that initialSnapshotTargetKey introduced — no more index-past-end risk
when sourceKey happens to equal sourcePath. Also tightens the mTime
lookup to guard against nil Attributes (caught by an existing test
against buildKey when I first moved the lookup out of the incremental
branch).
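The shared helper, in sketch form; the dataSink parameter and the incremental-sink mTime suffix from the real buildKey are elided here:
```go
// destKey maps a source key onto the destination tree. Both the
// event-log replay (buildKey) and the initial snapshot walk go through
// it, so seeded and replayed files resolve to the same key.
func destKey(targetPath, sourcePath, sourceKey string) string {
	base := sourcePath
	if !strings.HasSuffix(base, "/") {
		base += "/" // defensive trailing-slash normalization
	}
	if !strings.HasPrefix(sourceKey, base) {
		// sourceKey equals sourcePath (root entry or slash mismatch):
		// nothing to strip, map onto targetPath itself
		return targetPath
	}
	return path.Join(targetPath, sourceKey[len(base):])
}
```
The nil-Attributes guard on the mTime lookup would rely on the nil-safe protobuf getters (e.g. entry.GetAttributes().GetMtime()).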
* fix(sync): use per-cluster TLS for HTTP volume connections in filer.sync (#8965)
When filer.sync runs with -a.security and -b.security flags, only gRPC
connections received per-cluster TLS configuration. HTTP clients for
volume server reads and uploads used a global singleton with the default
security.toml, causing TLS verification failures when clusters use
different self-signed certificates.
Load per-cluster HTTPS client config from the security files and pass
dedicated HTTP clients to FilerSource (for downloads) and FilerSink
(for uploads) so each direction uses the correct cluster's certificates.
* fix(sync): address review feedback for per-cluster HTTP TLS
- Add insecure_skip_verify support to NewHttpClientWithTLS and read it
from per-cluster security config via https.client.insecure_skip_verify
- Error on partial mTLS config (cert without key or vice versa)
- Add nil-check for client parameter in DownloadFileWithClient
- Document SetUploader as init-only (same pattern as SetChunkConcurrency)
* filer.sync: show active chunk transfers when sync progress stalls
When the sync watermark is not advancing, print each in-progress chunk
transfer with its file path, bytes received so far, and current status
(downloading, uploading, or waiting with backoff duration). This helps
diagnose which files are blocking progress during replication.
Closes #8542
* filer.sync: include last error in stall diagnostics
* filer.sync: fix data races in ChunkTransferStatus
Add sync.RWMutex to ChunkTransferStatus and lock around all field
mutations in fetchAndWrite. ActiveTransfers now returns value copies
under RLock so callers get immutable snapshots.
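Approximately (a sketch; the real struct carries more fields, and ActiveTransfers assembles its result from such read-locked snapshots):
```go
type ChunkTransferStatus struct {
	mu            sync.RWMutex
	Path          string
	BytesReceived int64
	Status        string // downloading / uploading / waiting with backoff
}

// update is called from fetchAndWrite whenever progress changes.
func (c *ChunkTransferStatus) update(bytes int64, status string) {
	c.mu.Lock()
	c.BytesReceived = bytes
	c.Status = status
	c.mu.Unlock()
}

// snapshot returns an immutable copy for the stall-diagnostics printer.
func (c *ChunkTransferStatus) snapshot() (path string, bytes int64, status string) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.Path, c.BytesReceived, c.Status
}
```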
* filer.sync: support per-cluster mTLS with -a.security and -b.security flags
When syncing between two clusters that use different certificate authorities,
a single security.toml cannot authenticate to both. Add -a.security and
-b.security flags so each filer can use its own security.toml for TLS.
Closes #8481
* security: fatal on failure to read explicitly provided security config
When -a.security or -b.security is specified, falling back to insecure
credentials on read error would silently bypass mTLS. Fatal instead.
* fix(filer.sync): use source filer's fromTsMs flag in initOffsetFromTsMs
A→B was using bFromTsMs and B→A was using aFromTsMs — these were
swapped. Each path should seed the target's offset with the source
filer's starting timestamp.
* security: return error from LoadClientTLSFromFile, resolve relative PEM paths
Change LoadClientTLSFromFile to return (grpc.DialOption, error) so
callers can handle failures explicitly instead of a silent insecure
fallback. Resolve relative PEM paths (grpc.ca, grpc.client.cert,
grpc.client.key) against the config file's directory.
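The call-site shape this implies, as a sketch; the exact parameter list of LoadClientTLSFromFile and the security/glog package paths are assumptions:
```go
// resolvePEM anchors a relative PEM path (grpc.ca, grpc.client.cert,
// grpc.client.key) to the security config file's directory.
func resolvePEM(configFile, pemPath string) string {
	if pemPath == "" || filepath.IsAbs(pemPath) {
		return pemPath
	}
	return filepath.Join(filepath.Dir(configFile), pemPath)
}

// Callers now fail loudly instead of silently falling back to insecure
// credentials when an explicitly provided security config cannot be read.
func mustLoadClientTLS(securityConfigFile string) grpc.DialOption {
	dialOption, err := security.LoadClientTLSFromFile(securityConfigFile)
	if err != nil {
		glog.Fatalf("load client TLS from %s: %v", securityConfigFile, err)
	}
	return dialOption
}
```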
* rename metadata events
* fix subscription filter to use NewEntry.Name for rename path matching
The server-side subscription filter constructed the new path using
OldEntry.Name instead of NewEntry.Name when checking if a rename
event's destination matches the subscriber's path prefix. This could
cause events to be incorrectly filtered when a rename changes the
file name.
* fix bucket events to handle rename of bucket directories
onBucketEvents only checked IsCreate and IsDelete. A bucket directory
rename via AtomicRenameEntry now emits a single rename event (both
OldEntry and NewEntry non-nil), which matched neither check. Handle
IsRename by deleting the old bucket and creating the new one.
* fix replicator to handle rename events across directory boundaries
Two issues fixed:
1. The replicator filtered events by checking if the key (old path)
was under the source directory. Rename events now use the old path
as key, so renames from outside into the watched directory were
silently dropped. Now both old and new paths are checked, and
cross-boundary renames are converted to create or delete.
2. NewParentPath was passed to the sink without remapping to the
sink's target directory structure, causing the sink to write
entries at the wrong location. Now NewParentPath is remapped
alongside the key.
* fix filer sync to handle rename events crossing directory boundaries
The early directory-prefix filter only checked resp.Directory (old
parent). Rename events now carry the old parent as Directory, so
renames from outside the source path into it were dropped before
reaching the existing cross-boundary handling logic. Check both old
and new directories against sourcePath and excludePaths so the
downstream old-key/new-key logic can properly convert these to
create or delete operations.
* fix metadata event path matching
* fix metadata event consumers for rename targets
* Fix replication rename target keys
Logical rename events now reach replication sinks with distinct source and target paths.
Handle non-filer sinks as delete-plus-create on the translated target key, and make the rename fallback path create at the translated target key too.
Add focused tests covering non-filer renames, filer rename updates, and the fallback path.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix filer sync rename path scoping
Use directory-boundary matching instead of raw prefix checks when classifying source and target paths during filer sync.
Also apply excludePaths per side so renames across excluded boundaries downgrade cleanly to create/delete instead of being misclassified as in-scope updates.
Add focused tests for boundary matching and rename classification.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix replicator directory boundary checks
Use directory-boundary matching instead of raw prefix checks when deciding whether a source or target path is inside the watched tree or an excluded subtree.
This prevents sibling paths such as /foo and /foobar from being misclassified during rename handling, and preserves the earlier rename-target-key fix.
Add focused tests for boundary matching and rename classification across sibling/excluded directories.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix etc-remote rename-out handling
Use boundary-safe source/target directory membership when classifying metadata events under DirectoryEtcRemote.
This prevents rename-out events from being processed as config updates, while still treating them as removals where appropriate for the remote sync and remote gateway command paths.
Add focused tests for update/removal classification and sibling-prefix handling.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Defer rename events until commit
Queue logical rename metadata events during atomic and streaming renames and publish them only after the transaction commits successfully.
This prevents subscribers from seeing delete or logical rename events for operations that later fail during delete or commit.
Also serialize notification.Queue swaps in rename tests and add failure-path coverage.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Skip descendant rename target lookups
Avoid redundant target lookups during recursive directory renames once the destination subtree is known absent.
The recursive move path now inserts known-absent descendants directly, and the test harness exercises prefixed directory listing so the optimization is covered by a directory rename regression test.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Tighten rename review tests
Return filer_pb.ErrNotFound from the bucket tracking store test stub so it follows the FilerStore contract, and add a webhook filter case for same-name renames across parent directories.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix HardLinkId format verb in InsertEntryKnownAbsent error
HardLinkId is a byte slice. %d prints each byte as a decimal number
which is not useful for an identifier. Use %x to match the log line
two lines above.
* only skip descendant target lookup when source and dest use same store
moveFolderSubEntries unconditionally passed skipTargetLookup=true for
every descendant. This is safe when all paths resolve to the same
underlying store, but with path-specific store configuration a child's
destination may map to a different backend that already holds an entry
at that path. Use FilerStoreWrapper.SameActualStore to check per-child
and fall back to the full CreateEntry path when stores differ.
* add nil and create edge-case tests for metadata event scope helpers
* extract pathIsEqualOrUnder into util.IsEqualOrUnder
Identical implementations existed in both replication/replicator.go and
command/filer_sync.go. Move to util.IsEqualOrUnder (alongside the
existing FullPath.IsUnder) and remove the duplicates.
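The extracted helper, roughly (parameter order is illustrative); unlike a raw prefix check it treats /foobar as outside /foo:
```go
// IsEqualOrUnder reports whether path is dir itself or lies beneath it,
// respecting directory boundaries ("/foobar" is not under "/foo").
func IsEqualOrUnder(path, dir string) bool {
	if dir == "/" {
		return strings.HasPrefix(path, "/")
	}
	dir = strings.TrimSuffix(dir, "/")
	return path == dir || strings.HasPrefix(path, dir+"/")
}
```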
* use MetadataEventTargetDirectory for new-side directory in filer sync
The new-side directory checks and sourceNewKey computation used
message.NewParentPath directly. If NewParentPath were empty (legacy
events, older filer versions during rolling upgrades), sourceNewKey
would be wrong (/filename instead of /dir/filename) and the
UpdateEntry parent path rewrite would panic on slice bounds.
Derive targetDir once from MetadataEventTargetDirectory, which falls
back to resp.Directory when NewParentPath is empty, and use it
consistently for all new-side checks and the sink parent path.
* Fix filerExcludeFileName to support directory names and path components
The original implementation only matched excludeFileName against
message.NewEntry.Name, which caused two issues:
1. Nil pointer panic on delete events (NewEntry is nil)
2. Files inside excluded directories were still backed up because
the parent directory name was not checked
This patch:
- Checks all path components in resp.Directory against the regexp
- Adds nil guard for message.NewEntry before accessing .Name
- Also checks message.OldEntry.Name for rename/delete events
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add -filerExcludePathPattern flag and fix nil panic in filerExcludeFileName
Separate concerns between two exclude mechanisms:
- filerExcludeFileName: matches entry name only (leaf node)
- filerExcludePathPattern (NEW): matches any path component via regexp,
so files inside matched directories are also excluded
Also fixes nil pointer panic when filerExcludeFileName encounters
delete events where NewEntry is nil.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Refactor exclude logic: per-side exclusion for rename events, reduce duplication
- Extract isEntryExcluded() to compute exclusion per old/new side,
so rename events crossing an exclude boundary are handled as
delete + create instead of being entirely skipped
- Extract compileExcludePattern() to deduplicate regexp compilation
- Replace strings.Split with allocation-free pathContainsMatch()
- Check message.NewParentPath (not just resp.Directory) for new side
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
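The allocation-free component check, roughly (a sketch; the real helper may differ in edge-case handling):
```go
// pathContainsMatch reports whether any "/"-separated component of dir
// matches the compiled exclude pattern, without the []string allocation
// that strings.Split would make on every event.
func pathContainsMatch(pattern *regexp.Regexp, dir string) bool {
	for len(dir) > 0 {
		dir = strings.TrimLeft(dir, "/")
		if dir == "" {
			return false
		}
		end := strings.IndexByte(dir, '/')
		if end < 0 {
			end = len(dir)
		}
		if pattern.MatchString(dir[:end]) {
			return true
		}
		dir = dir[end:]
	}
	return false
}
```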
* Move regexp compilation out of retry loop to fail fast on config errors
compileExcludePattern for -filerExcludeFileName and -filerExcludePathPattern
are configuration-time validations that will never succeed on retry.
Move them to runFilerBackup before the reconnect loop and use glog.Fatalf
on failure, so invalid patterns are caught immediately at startup instead
of being retried every 1.7 seconds indefinitely.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add wildcard matching helpers for path and filename exclusion
* Replace regexp exclude patterns with wildcard-based flags, deprecate -filerExcludeFileName
Add -filerExcludeFileNames and -filerExcludePathPatterns flags that accept
comma-separated wildcard patterns (*, ?) using the existing wildcard library.
Mark -filerExcludeFileName as deprecated but keep its regexp behavior.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
* improve large file sync throughput for remote.cache and filer.sync
Three main throughput improvements:
1. Adaptive chunk sizing for remote.cache: targets ~32 chunks per file
instead of always starting at 5MB. A 500MB file now uses ~16MB chunks
(32 chunks) instead of 5MB chunks (100 chunks), reducing per-chunk
overhead (volume assign, gRPC call, needle write) by 3x.
2. Configurable concurrency at every layer:
- remote.cache chunk concurrency: -chunkConcurrency flag (default 8)
- remote.cache S3 download concurrency: -downloadConcurrency flag
(default raised from 1 to 5 per chunk)
- filer.sync chunk concurrency: -chunkConcurrency flag (default 32)
3. S3 multipart download concurrency raised from 1 to 5: the S3 manager
downloader was using Concurrency=1, serializing all part downloads
within each chunk. This alone can 5x per-chunk download speed.
The concurrency values flow through the gRPC request chain:
shell command → CacheRemoteObjectToLocalClusterRequest →
FetchAndWriteNeedleRequest → S3 downloader
Zero values in the request mean "use server defaults", maintaining
full backward compatibility with existing callers.
Ref #8481
* fix: use full maxMB for chunk size cap and remove loop guard
Address review feedback:
- Use full maxMB instead of maxMB/2 for maxChunkSize to avoid
unnecessarily limiting chunk size for very large files.
- Remove chunkSize < maxChunkSize guard from the safety loop so it
can always grow past maxChunkSize when needed to stay under 1000
chunks (e.g., extremely large files with small maxMB).
* address review feedback: help text, validation, naming, docs
- Fix help text for -chunkConcurrency and -downloadConcurrency flags
to say "0 = server default" instead of advertising specific numeric
defaults that could drift from the server implementation.
- Validate chunkConcurrency and downloadConcurrency are within int32
range before narrowing, returning a user-facing error if out of range.
- Rename ReadRemoteErr to readRemoteErr to follow Go naming conventions.
- Add doc comment to SetChunkConcurrency noting it must be called
during initialization before replication goroutines start.
- Replace doubling loop in chunk size safety check with direct
ceil(remoteSize/1000) computation to guarantee the 1000-chunk cap.
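Pulling the pieces together, the chunk-size computation ends up roughly like this (a sketch; the 32-chunk target and 5MB floor are taken from the descriptions above, not from the code):
```go
// chooseChunkSize targets ~32 chunks per file, stays within
// [5MB, maxMB], and guarantees at most 1000 chunks overall.
func chooseChunkSize(remoteSize int64, maxMB int) int64 {
	const targetChunks = 32
	const minChunkSize = 5 * 1024 * 1024
	maxChunkSize := int64(maxMB) * 1024 * 1024

	chunkSize := (remoteSize + targetChunks - 1) / targetChunks // ceil(size / 32)
	if chunkSize < minChunkSize {
		chunkSize = minChunkSize
	}
	if chunkSize > maxChunkSize {
		chunkSize = maxChunkSize
	}

	// 1000-chunk cap as a direct ceiling computation (no doubling loop);
	// this is allowed to exceed maxChunkSize when it has to.
	if floor := (remoteSize + 999) / 1000; chunkSize < floor {
		chunkSize = floor
	}
	return chunkSize
}
```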
* address Copilot review: clamp concurrency, fix chunk count, clarify proto docs
- Use ceiling division for chunk count check to avoid overcounting
when file size is an exact multiple of chunk size.
- Clamp chunkConcurrency (max 1024) and downloadConcurrency (max 1024
at filer, max 64 at volume server) to prevent excessive goroutines.
- Always use ReadFileWithConcurrency when the client supports it,
falling back to the implementation's default when value is 0.
- Clarify proto comments that download_concurrency only applies when
the remote storage client supports it (currently S3).
- Include specific server defaults in help text (e.g., "0 = server
default 8") so users see the actual values in -h output.
* fix data race on executionErr and use %w for error wrapping
- Protect concurrent writes to executionErr in remote.cache worker
goroutines with a sync.Mutex to eliminate the data race.
- Use %w instead of %v in volume_grpc_remote.go error formatting
to preserve the error chain for errors.Is/errors.As callers.
* fix: use keyed fields in struct literals
- Replace unsafe reflect.StringHeader/SliceHeader with safe unsafe.String/Slice (weed/query/sqltypes/unsafe.go)
- Add field names to Type_ScalarType struct literals (weed/mq/schema/schema_builder.go)
- Add Duration field name to FlexibleDuration struct literals across test files
- Add field names to bson.D struct literals (weed/filer/mongodb/mongodb_store_kv.go)
Fixes go vet warnings about unkeyed struct literals.
* fix: remove unreachable code
- Remove unreachable return statements after infinite for loops
- Remove unreachable code after if/else blocks where all paths return
- Simplify recursive logic by removing unnecessary for loop (inode_to_path.go)
- Fix Type_ScalarType literal to use enum value directly (schema_builder.go)
- Call onCompletionFn on stream error (subscribe_session.go)
Files fixed:
- weed/query/sqltypes/unsafe.go
- weed/mq/schema/schema_builder.go
- weed/mq/client/sub_client/connect_to_sub_coordinator.go
- weed/filer/redis3/ItemList.go
- weed/mq/client/agent_client/subscribe_session.go
- weed/mq/broker/broker_grpc_pub_balancer.go
- weed/mount/inode_to_path.go
- weed/util/skiplist/name_list.go
* fix: avoid copying lock values in protobuf messages
- Use proto.Merge() instead of direct assignment to avoid copying sync.Mutex in S3ApiConfiguration (iamapi_server.go)
- Add explicit comments noting that channel-received values are already copies before taking addresses (volume_grpc_client_to_master.go)
The protobuf messages contain sync.Mutex fields from the message state, which should not be copied.
Using proto.Merge() properly merges messages without copying the embedded mutex.
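For example, instead of assigning one message value over another, the config is merged (a sketch; the iam_pb package alias is an assumption, proto is google.golang.org/protobuf/proto):
```go
// mergeConfig copies src into dst without copying the protobuf
// MessageState mutex that a plain *dst = *src assignment would copy.
func mergeConfig(dst, src *iam_pb.S3ApiConfiguration) {
	proto.Merge(dst, src)
}
```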
* fix: correct byte array size for uint32 bit shift operations
The generateAccountId() function only needs 4 bytes to create a uint32 value.
Changed from allocating 8 bytes to 4 bytes to match the actual usage.
This fixes go vet warning about shifting 8-bit values (bytes) by more than 8 bits.
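In other words, the corrected shape is roughly (a sketch; the randomness source and error handling are assumptions):
```go
// generateAccountId derives a uint32 from exactly 4 random bytes; the
// previous 8-byte buffer is what tripped go vet's shift-size check.
func generateAccountId() uint32 {
	var b [4]byte
	if _, err := rand.Read(b[:]); err != nil { // crypto/rand
		return 0 // sketch: real code would handle the error properly
	}
	return binary.BigEndian.Uint32(b[:])
}
```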
* fix: ensure context cancellation on all error paths
In broker_client_subscribe.go, ensure subscriberCancel() is called on all error return paths:
- When stream creation fails
- When partition assignment fails
- When sending initialization message fails
This prevents context leaks when an error occurs during subscriber creation.
* fix: ensure subscriberCancel called for CreateFreshSubscriber stream.Send error
Ensure subscriberCancel() is called when stream.Send fails in CreateFreshSubscriber.
* ci: add go vet step to prevent future lint regressions
- Add go vet step to GitHub Actions workflow
- Filter known protobuf lock warnings (MessageState sync.Mutex)
These are expected in generated protobuf code and are safe
- Prevents accumulation of go vet errors in future PRs
- Step runs before build to catch issues early
* fix: resolve remaining syntax and logic errors in vet fixes
- Fixed syntax errors in filer_sync.go caused by missing closing braces
- Added missing closing brace for if block and function
- Synchronized fixes to match previous commits on branch
* fix: add missing return statements to daemon functions
- Add 'return false' after infinite loops in filer_backup.go and filer_meta_backup.go
- Satisfies declared bool return type signatures
- Maintains consistency with other daemon functions (runMaster, runFilerSynchronize, runWorker)
- Although unreachable, the explicit return documents that the function satisfies its declared signature
* fix: add nil check for onCompletionFn in SubscribeMessageRecord
- Check if onCompletionFn is not nil before calling it
- Prevents potential panic if nil function is passed
- Matches pattern used in other callback functions
* docs: clarify unreachable return statements in daemon functions
- Add comments documenting that return statements satisfy function signature
- Explains that these returns follow infinite loops and are unreachable
- Improves code clarity for future maintainers
* Add consistent -debug and -debug.port flags to commands
Add -debug and -debug.port flags to weed master, weed volume, weed s3,
weed mq.broker, and weed filer.sync commands for consistency with
weed filer.
When -debug is enabled, an HTTP server starts on the specified port
(default 6060) serving runtime profiling data at /debug/pprof/.
For mq.broker, replaced the older -port.pprof flag with the new
-debug and -debug.port pattern for consistency.
* Update weed/util/grace/pprof.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* filer.sync: fix race condition on first checkpoint save
Initialize lastWriteTime to time.Now() instead of zero time to prevent
the first checkpoint save from being triggered immediately when the
first event arrives. This gives async jobs time to complete and update
the watermark before the checkpoint is saved.
Previously, the zero time caused lastWriteTime.Add(3s).Before(now) to
be true on the first event, triggering an immediate checkpoint save
attempt. But since jobs are processed asynchronously, the watermark
was still 0 (initial value), causing the save to be skipped due to
the 'if offsetTsNs == 0 { return nil }' check.
Fixes #7717
* filer.sync: save checkpoint on graceful shutdown
Add graceful shutdown handling to save the final checkpoint when
filer.sync is terminated. Previously, any sync progress within the
last 3-second checkpoint interval would be lost on shutdown.
Changes:
- Add syncState struct to track current processor and offset save info
- Add atomic pointers syncStateA2B and syncStateB2A for both directions
- Register grace.OnInterrupt hook to save checkpoints on shutdown
- Modify doSubscribeFilerMetaChanges to update sync state atomically
This ensures that when filer.sync is restarted, it resumes from the
correct position instead of potentially replaying old events.
Fixes #7717
* fix(filer.sync): make offset initialization path-specific
* fix(filer.sync): the offset may be set to 0.
Co-authored-by: zhihao.qu <zhihao.qu@ly.com>