Files
seaweedfs/weed/mount/weedfs_write.go
Chris Lu da2e90aefd fix(mount): sanitize non-UTF-8 filenames; keep marshal errors per-request (#9207)
* fix(mount): sanitize non-UTF-8 filenames; keep marshal errors per-request (#9139)

A single file with invalid-UTF-8 bytes in its name (e.g. a GNOME Trash
"partial" like \x10\x98=\\\x8a\x7f.trashinfo.9a51454f.partial) made every
FUSE-initiated filer RPC fail with:

  rpc error: code = Internal desc = grpc: error while marshaling:
  string field contains invalid UTF-8

and then produced an avalanche of "connection is closing" errors on
unrelated LookupEntry / ReadDirAll / UpdateEntry calls, causing the
volume-server QPS dips reported in #9139.

Root cause is twofold:

1. Proto3 `string` fields require valid UTF-8, but the FUSE kernel passes
   raw name bytes. Create/Mknod/Mkdir/Unlink/Rmdir/Rename/Lookup/Link/
   Symlink all forwarded those bytes directly into CreateEntryRequest.Name,
   DeleteEntryRequest.Name, StreamRenameEntryRequest.{Old,New}Name and
   Entry.Name. saveDataAsChunk also copied the FullPath into
   AssignVolumeRequest.Path unchecked.

2. When the marshal failed, shouldInvalidateConnection treated the
   resulting codes.Internal as a connection problem and dropped the
   shared cached ClientConn — canceling every other in-flight RPC on it.

Fix:

- Add sanitizeFuseName (strings.ToValidUTF8 with '?' replacement, matching
  util.FullPath.DirAndName) and make checkName return the sanitized name.
  Apply at every FUSE entry point that passes a name to the filer RPC,
  including Unlink/Rmdir (which did not previously call checkName) and
  both oldName/newName in Rename. Add a backstop scrub for
  AssignVolumeRequest.Path so async flush paths cannot reintroduce
  invalid bytes from a pre-sanitization cached FullPath.

- In weed/pb.shouldInvalidateConnection, detect client-side marshal
  errors via the gRPC library's "error while marshaling" prefix and
  return false: the connection is healthy, only the request is bad.

Refs: https://github.com/seaweedfs/seaweedfs/issues/9139#issuecomment-4301184231

* fix(mount,util): use '_' for invalid-UTF-8 replacement (URL-safe)

Sanitized filenames flow downstream into HTTP URLs (volume-server uploads,
filer HTTP API, S3/WebDAV gateways). '?' is the URL query-string
delimiter and would split the path the first time the name lands in one,
so swap every invalid-UTF-8 replacement to '_'. This covers the two
pre-existing sites in weed/util/fullpath.go as well, keeping all paths
sanitized the same way.

* refactor(pb): detect client-side marshal errors via errors.As, not substring

Replace the raw `strings.Contains(err.Error(), ...)` check with a
type-based carve-out: use errors.As against the `GRPCStatus() *Status`
interface to pull the original Status out of any fmt.Errorf("...: %w")
wrapping, then match the library-owned "grpc:" prefix on that Status's
Message.

Why not errors.Is against a proto-level sentinel: gRPC's encode()
collapses the inner proto error with "%v" (stringification) before
wrapping it in a Status, so the original error type does not survive
into the caller. The Status itself is the structural signal that does
survive.

Why not status.FromError: when the caller wraps the Status error with
fmt.Errorf("...: %w", ...), status.FromError rewrites Status.Message
with the full err.Error() of the outermost wrapper, which defeats a
prefix check on the library-owned message. errors.As gives us the
original Status whose Message is still verbatim from the gRPC library.

A new test asserts that a plain errors.New("grpc: error while marshaling: …")
— i.e. the same text attached to something that is NOT a gRPC status —
does not short-circuit invalidation, so we never silently keep a cached
connection alive based on a coincidental substring match.

* refactor(util): centralize UTF-8 sanitization; add FullPath.Sanitized

Addresses review feedback on PR #9207.

Nitpick: every invalid-UTF-8 replacement across the codebase (DirAndName,
Name, mount.sanitizeFuseName, the weedfs_write.go backstop) now goes
through a single util.SanitizeUTF8Name helper, so the replacement char
('_' — URL-safe) is chosen in one place.

Outside-diff: three proto fields took raw FullPath strings that could
break marshaling if an entry ever carried invalid UTF-8
(CreateEntryRequest.Directory in Mkdir, DeleteEntryRequest.Directory in
Unlink, AssignVolumeRequest.Path in command_fs_merge_volumes). The
reviewer's suggested fix — using DirAndName() — would have silently
changed Directory from parent to grandparent, because DirAndName
sanitizes only the trailing component. Added FullPath.Sanitized(), which
scrubs every component, and applied it at the three sites. Exposure is
narrow in practice (FUSE-boundary sanitization and the gRPC-side
isClientSideMarshalError carve-out already cover the #9139 cascade),
but the defense-in-depth is cheap and consistent with the existing
AssignVolume backstop.

New tests in weed/util/fullpath_test.go document:
- SanitizeUTF8Name: valid UTF-8 passes through unchanged; invalid bytes
  become '_' (not '?', which is URL-special).
- FullPath.Sanitized: scrubs bytes in any component, not just the last.
- FullPath.DirAndName: dir remains raw on purpose — callers needing a
  clean full path must use Sanitized(). The test pins this behavior so
  it is not accidentally "fixed" in a way that changes the (dir, name)
  semantics callers depend on.
2026-04-23 19:17:35 -07:00

95 lines
3.4 KiB
Go

package mount
import (
"fmt"
"io"
"github.com/seaweedfs/seaweedfs/weed/filer"
"github.com/seaweedfs/seaweedfs/weed/glog"
"github.com/seaweedfs/seaweedfs/weed/operation"
"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
"github.com/seaweedfs/seaweedfs/weed/util"
)
func (wfs *WFS) saveDataAsChunk(fullPath util.FullPath) filer.SaveDataAsChunkFunctionType {
// Backstop: FUSE entry points sanitize names before they reach
// inodeToPath, but async flush paths (e.g. writebackCache, handles whose
// RememberPath was set from an older code path) may still carry bytes
// that predate sanitization. Proto3 string fields require valid UTF-8,
// so scrub the full path once here before every AssignVolume call.
assignPath := fullPath.Sanitized()
return func(reader io.Reader, filename string, offset int64, tsNs int64, _ uint64) (chunk *filer_pb.FileChunk, err error) {
uploader, err := operation.NewUploader()
if err != nil {
return
}
uploadOption := &operation.UploadOption{
Filename: filename,
Cipher: wfs.option.Cipher,
IsInputCompressed: false,
MimeType: "",
PairMap: nil,
}
genFileUrlFn := func(host, fileId string) string {
fileUrl := fmt.Sprintf("http://%s/%s", host, fileId)
if wfs.option.VolumeServerAccess == "filerProxy" {
fileUrl = fmt.Sprintf("http://%s/?proxyChunkId=%s", wfs.getCurrentFiler(), fileId)
}
return fileUrl
}
fileId, uploadResult, err, data := uploader.UploadWithRetry(
wfs,
&filer_pb.AssignVolumeRequest{
Count: 1,
Replication: wfs.option.Replication,
Collection: wfs.option.Collection,
TtlSec: wfs.option.TtlSec,
DiskType: string(wfs.option.DiskType),
DataCenter: wfs.option.DataCenter,
Path: assignPath,
},
uploadOption, genFileUrlFn, reader,
)
if err != nil {
glog.V(0).Infof("upload data %v: %v", filename, err)
return nil, fmt.Errorf("upload data: %w", err)
}
if uploadResult.Error != "" {
glog.V(0).Infof("upload failure %v: %v", filename, err)
return nil, fmt.Errorf("upload result: %v", uploadResult.Error)
}
// When peer sharing is enabled we need EVERY chunk in the
// local cache so we can actually serve it back to peers on
// FetchChunk — otherwise the directory would advertise us as
// a holder and the fetcher would get NOT_FOUND from our
// chunk cache. When peer sharing is off we preserve the
// original behavior of caching only the first chunk (small
// files) to avoid blowing the cache on large uploads. Both
// paths gate on chunkCache != nil: -cacheCapacityMB=0 disables
// the cache entirely, in which case SetChunk would panic.
shouldCache := wfs.chunkCache != nil && (offset == 0 || wfs.peerAnnouncer != nil)
if shouldCache {
wfs.chunkCache.SetChunk(fileId, data)
}
// Announce every uploaded chunk so the tier-2 directory fills
// in as the file is written. Without this, the per-fetch
// announce path only bootstraps after someone else has already
// pulled a chunk via peer — which can't happen if nobody has
// told the directory who holds the chunk. Skip the announce
// when we couldn't cache (no point advertising bytes we can't
// actually serve back).
if wfs.peerAnnouncer != nil && shouldCache {
wfs.peerAnnouncer.EnqueueAnnounce(fileId)
}
chunk = uploadResult.ToPbFileChunk(fileId, offset, tsNs)
return chunk, nil
}
}