seaweedfs/weed/filer/entry.go
Chris Lu bf31f404bc test: add pjdfstest POSIX compliance suite (#9013)
* test: add pjdfstest POSIX compliance suite

Adds a script and CI workflow that runs the upstream pjdfstest POSIX
compliance test suite against a SeaweedFS FUSE mount. The script starts
a self-contained `weed mini` server, mounts the filesystem with
`weed mount`, builds pjdfstest from source, and runs it under prove(1).

* fix: address review feedback on pjdfstest setup

- Use github.ref instead of github.head_ref in concurrency group so
  push events get a stable group key
- Add explicit timeout check after filer readiness polling loop
- Refresh pjdfstest checkout when PJDFSTEST_REPO or PJDFSTEST_REF are
  overridden instead of silently reusing stale sources

* test: add Docker-based pjdfstest for faster iteration

Adds a docker-compose setup that reuses the existing e2e image pattern:
- master, volume, filer services from chrislusf/seaweedfs:e2e
- mount service extended with pjdfstest baked in (Dockerfile extends e2e)
- Tests run via `docker compose exec mount /run.sh`
- CI workflow gains a parallel `pjdfstest (docker)` job

This avoids building Go from scratch on each iteration — just rebuild the
e2e image once and iterate on the compose stack.

* fix: address second round of review feedback

- Use mktemp for WORK_DIR so each run starts with a clean filer state
- Pin PJDFSTEST_REF to immutable commit (03eb257) instead of master
- Use cp -r instead of cp -a to avoid preserving ownership during setup

* fix: address CI failure and third round of review feedback

- Fix docker job: fall back to plain docker build when buildx cache
  export is not supported (default docker driver in some CI runners)
- Use /healthz endpoint for filer healthcheck in docker-compose
- Copy logs to a fixed path (/tmp/seaweedfs-pjdfstest-logs/) for
  reliable CI artifact upload when WORK_DIR is a mktemp path

* fix(mount): improve POSIX compliance for FUSE mount

Address several POSIX compliance gaps surfaced by the pjdfstest suite:

1. Filename length limit: reduce from 4096 to 255 bytes (NAME_MAX),
   returning ENAMETOOLONG for longer names.

2. SUID/SGID clearing on write: clear setuid/setgid bits when a
   non-root user writes to a file (POSIX requirement).

3. SUID/SGID clearing on chown: clear setuid/setgid bits when file
   ownership changes by a non-root user.

4. Sticky bit enforcement: add checkStickyBit helper and enforce it
   in Unlink, Rmdir, and Rename — only file owner, directory owner,
   or root may delete entries in sticky directories.

5. ctime (inode change time) tracking: add ctime field to the
   FuseAttributes protobuf message and filer.Attr struct. Update
   ctime on all metadata-modifying operations (SetAttr, Write/flush,
   Link, Create, Mkdir, Mknod, Symlink, Truncate). Fall back to
   mtime for backward compatibility when ctime is 0.

* fix: add -T flag to docker compose exec for CI

Disable TTY allocation in the pjdfstest docker job since GitHub
Actions runners have no interactive TTY.

* fix(mount): update parent directory mtime/ctime on entry changes

POSIX requires that a directory's st_mtime and st_ctime be updated
whenever entries are created or removed within it. Add
touchDirMtimeCtime() helper and call it after:
- mkdir, rmdir
- create (including deferred creates), mknod, unlink
- symlink, link
- rename (both source and destination directories)

This fixes pjdfstest failures in mkdir/00, mkfifo/00, mknod/00,
mknod/11, open/00, symlink/00, link/00, and rmdir/00.

* fix(mount): enforce sticky bit on destination directory during rename

POSIX requires sticky-bit enforcement on both source and destination
directories during rename. When the destination directory has the
sticky bit set and a target entry already exists, only the file owner,
directory owner, or root may replace it.

* fix(mount): add in-memory atime tracking for POSIX compliance

Track atime separately from mtime using a bounded in-memory map
(capped at 8192 entries with random eviction). atime is not persisted
to the filer — it's only kept in mount memory to satisfy POSIX stat
requirements for utimensat and related syscalls.

This fixes utimensat/00, utimensat/02, utimensat/04, utimensat/05,
and utimensat/09 pjdfstest failures where atime was incorrectly
aliased to mtime.

* fix(mount): restore long filename support, fix permission checks

- Restore 4096-byte filename limit (was incorrectly reduced to 255).
  SeaweedFS stores names as protobuf strings with no ext4-style
  constraint — the 255 limit is not applicable.

- Fix AcquireHandle permission check to map filer uid/gid to local
  space before calling hasAccess, matching the pattern used in Access().

- Fix hasAccess fallback when supplementary group lookup fails: fall
  through to "other" permissions instead of requiring both group AND
  other to match, which was overly restrictive for non-existent UIDs.

* fix(mount): fix permission checks and enforce NAME_MAX=255

- Fix AcquireHandle to map uid/gid from filer-space to local-space
  before calling hasAccess, consistent with the Access handler.

- Fix hasAccess fallback when supplementary group lookup fails: use
  "other" permissions only instead of requiring both group AND other.

- Enforce NAME_MAX=255 with a comment explaining the Linux FUSE kernel
  module's VFS-layer limit. Names longer than 255 bytes can be created via
  direct FUSE protocol calls but can't be stat'd/chmod'd via normal syscalls.

- Don't call touchDirMtimeCtime for deferred creates to avoid
  invalidating the just-cached entry via filer metadata events.

* ci: mark pjdfstest steps as continue-on-error

The pjdfstest suite has known failures (Linux FUSE NAME_MAX=255
limitation, hard link nlink/ctime tracking, nanosecond precision)
that cannot be fixed in the mount layer. Mark the test steps as
continue-on-error so the CI job reports results without blocking.

* ci: increase pjdfstest bare metal timeout to 90 minutes

* fix: use full commit hash for PJDFSTEST_REF in run.sh

Short hashes cannot be resolved by git fetch --depth 1 on shallow
clones. Use the full 40-char SHA.

* test: add pjdfstest known failures skip list

Add known_failures.txt listing 33 test files that cannot pass due to:
- Linux FUSE kernel NAME_MAX=255 (26 files)
- Hard link nlink/ctime tracking requiring filer changes (3 files)
- Parent dir mtime on deferred create (1 file)
- Directory rename permission edge case (1 file)
- rmdir after hard link unlink (1 file)
- Nanosecond timestamp precision (1 file)

Both run.sh and run_inside_container.sh now skip these tests when
running the full suite. Any failure in a non-skipped test will cause
CI to fail, catching regressions immediately.

Remove continue-on-error from CI steps since the skip list handles
known failures.

Result: 204 test files, 8380 tests, all passing.

* ci: remove bare metal pjdfstest job, keep Docker only

The bare metal job consistently gets stuck past its timeout due to
weed processes not exiting cleanly. The Docker job covers the same
tests reliably and runs faster.
2026-04-10 09:52:16 -07:00

176 lines
4.3 KiB
Go

package filer

import (
	"os"
	"time"

	"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
	"github.com/seaweedfs/seaweedfs/weed/s3api/s3_constants"
	"github.com/seaweedfs/seaweedfs/weed/util"
)

type Attr struct {
	Mtime         time.Time   // time of last modification
	Crtime        time.Time   // time of creation (OS X only)
	Ctime         time.Time   // time of last inode change
	Mode          os.FileMode // file mode
	Uid           uint32      // owner uid
	Gid           uint32      // group gid
	Mime          string      // mime type
	TtlSec        int32       // ttl in seconds
	UserName      string
	GroupNames    []string
	SymlinkTarget string
	Md5           []byte
	FileSize      uint64
	Rdev          uint32
	Inode         uint64
}

func (attr Attr) IsDirectory() bool {
	return attr.Mode&os.ModeDir > 0
}

type Entry struct {
	util.FullPath
	Attr
	Extended map[string][]byte

	// the following is for files
	Chunks []*filer_pb.FileChunk `json:"chunks,omitempty"`

	HardLinkId         HardLinkId
	HardLinkCounter    int32
	Content            []byte
	Remote             *filer_pb.RemoteEntry
	Quota              int64
	WORMEnforcedAtTsNs int64
}

func (entry *Entry) Size() uint64 {
	return maxUint64(maxUint64(TotalSize(entry.GetChunks()), entry.FileSize), uint64(len(entry.Content)))
}

func (entry *Entry) Timestamp() time.Time {
	if entry.IsDirectory() {
		return entry.Crtime
	} else {
		return entry.Mtime
	}
}

func (entry *Entry) ShallowClone() *Entry {
	if entry == nil {
		return nil
	}
	newEntry := &Entry{}
	newEntry.FullPath = entry.FullPath
	newEntry.Attr = entry.Attr
	newEntry.Chunks = entry.Chunks
	newEntry.Extended = entry.Extended
	newEntry.HardLinkId = entry.HardLinkId
	newEntry.HardLinkCounter = entry.HardLinkCounter
	newEntry.Content = entry.Content
	newEntry.Remote = entry.Remote
	newEntry.Quota = entry.Quota
	return newEntry
}

func (entry *Entry) ToProtoEntry() *filer_pb.Entry {
	if entry == nil {
		return nil
	}
	message := &filer_pb.Entry{}
	message.Name = entry.FullPath.Name()
	entry.ToExistingProtoEntry(message)
	return message
}

func (entry *Entry) ToExistingProtoEntry(message *filer_pb.Entry) {
	if entry == nil {
		return
	}
	message.IsDirectory = entry.IsDirectory()
	// Reuse pre-allocated attributes if available, otherwise allocate
	if message.Attributes != nil {
		EntryAttributeToExistingPb(entry, message.Attributes)
	} else {
		message.Attributes = EntryAttributeToPb(entry)
	}
	message.Chunks = entry.GetChunks()
	message.Extended = entry.Extended
	message.HardLinkId = entry.HardLinkId
	message.HardLinkCounter = entry.HardLinkCounter
	message.Content = entry.Content
	message.RemoteEntry = entry.Remote
	message.Quota = entry.Quota
	message.WormEnforcedAtTsNs = entry.WORMEnforcedAtTsNs
}

func FromPbEntryToExistingEntry(message *filer_pb.Entry, fsEntry *Entry) {
	fsEntry.Attr = PbToEntryAttribute(message.Attributes)
	fsEntry.Chunks = message.Chunks
	fsEntry.Extended = message.Extended
	fsEntry.HardLinkId = HardLinkId(message.HardLinkId)
	fsEntry.HardLinkCounter = message.HardLinkCounter
	fsEntry.Content = message.Content
	fsEntry.Remote = message.RemoteEntry
	fsEntry.Quota = message.Quota
	fsEntry.FileSize = FileSize(message)
	fsEntry.WORMEnforcedAtTsNs = message.WormEnforcedAtTsNs
}

func (entry *Entry) ToProtoFullEntry() *filer_pb.FullEntry {
	if entry == nil {
		return nil
	}
	dir, _ := entry.FullPath.DirAndName()
	return &filer_pb.FullEntry{
		Dir:   dir,
		Entry: entry.ToProtoEntry(),
	}
}

func (entry *Entry) GetChunks() []*filer_pb.FileChunk {
	return entry.Chunks
}

func FromPbEntry(dir string, entry *filer_pb.Entry) *Entry {
	t := &Entry{}
	t.FullPath = util.NewFullPath(dir, entry.Name)
	FromPbEntryToExistingEntry(entry, t)
	return t
}

func maxUint64(x, y uint64) uint64 {
	if x > y {
		return x
	}
	return y
}

func (entry *Entry) IsExpireS3Enabled() (exist bool) {
	if entry.Extended != nil {
		_, exist = entry.Extended[s3_constants.SeaweedFSExpiresS3]
	}
	return exist
}

func (entry *Entry) IsS3Versioning() (exist bool) {
	if entry.Extended != nil {
		_, exist = entry.Extended[s3_constants.ExtVersionIdKey]
	}
	return exist
}

func (entry *Entry) GetS3ExpireTime() (expireTime time.Time) {
	if entry.Mtime.IsZero() {
		expireTime = entry.Crtime
	} else {
		expireTime = entry.Mtime
	}
	return expireTime.Add(time.Duration(entry.TtlSec) * time.Second)
}