Commit 6844ec067c by Chris Lu: fix(s3): cache remote-only source before CopyObject (#9304) (#9305)
* fix(s3): cache remote-only source before CopyObject (#9304)

CopyObject from a remote.mount source whose object lives only upstream
created a destination entry with FileSize > 0 but no chunks/content,
because the resolved source entry has no local chunks and the copy path
fell into the "inline/empty chunks" branch with empty entry.Content.
A subsequent GET returned 500 with "data integrity error: size N
reported but no content available". CopyObjectPart had the same shape
via copyChunksForRange iterating an empty chunk list.

Detect entry.IsInRemoteOnly() right after resolving the source in both
CopyObjectHandler and CopyObjectPartHandler and cache the object to the
local cluster first via a new cacheRemoteObjectForCopy helper (a
copy-time analogue of cacheRemoteObjectForStreaming with a bounded 30s
timeout and version-aware path resolution). If caching fails or
produces no chunks, return 503 with Retry-After: 5 instead of writing
a metadata-only destination, mirroring the GetObject behavior added in
the #7817 cold-cache fix.
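
For orientation, a minimal Go sketch of that control flow follows. The
Entry stand-in, the helper wiring, and the response writing are all
schematic, not the actual SeaweedFS code; only the names IsInRemoteOnly
and cacheRemoteObjectForCopy and the 503/Retry-After behavior come from
the change itself.

package sketch

import "net/http"

// Schematic stand-in for a filer entry; the real type is filer_pb.Entry.
type Entry struct {
	Chunks     []string // references to locally stored chunks
	Content    []byte   // small inline payload kept in metadata
	RemoteSize int64    // bytes known to exist only upstream
}

// No local chunks, but the upstream copy has data: the object is remote-only.
func (e *Entry) IsInRemoteOnly() bool {
	return len(e.Chunks) == 0 && e.RemoteSize > 0
}

// ensureLocalForCopy mirrors the shape of the fix: cache first, and refuse
// with 503 + Retry-After rather than write a metadata-only destination.
func ensureLocalForCopy(w http.ResponseWriter, entry *Entry, cache func() *Entry) (*Entry, bool) {
	if !entry.IsInRemoteOnly() {
		return entry, true // source already has local data; copy proceeds as before
	}
	cached := cache() // stands in for cacheRemoteObjectForCopy (bounded 30s timeout)
	if cached == nil || (len(cached.Chunks) == 0 && len(cached.Content) == 0) {
		w.Header().Set("Retry-After", "5")
		w.WriteHeader(http.StatusServiceUnavailable) // mirrors the GetObject cold-cache behavior
		return nil, false
	}
	return cached, true
}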

Adds TestCopyObjectRemoteOnlySourceDetection pinning the four entry
shapes the fix branches on plus the pre-fix broken-output shape.

* address PR review on remote-only copy fix

- Use the resolved entry's version id when srcVersionId is empty so a
  CopyObject reading the latest object in a versioning-enabled bucket
  caches the correct .versions/v_<id> path instead of getting stuck in
  a 503 retry loop. New helper resolvedSourceVersionId handles the
  fallback for both CopyObject and CopyObjectPart.
- Drop the redundant cachedEntry.IsInRemoteOnly() recheck in both
  handlers; the cache helper now reports success based on local data
  presence, and IsInRemoteOnly does not look at inline Content so
  keeping the check would 503 on small inline-cached objects.
- Treat inline Content as a successful cache result in both
  cacheRemoteObjectForStreaming and cacheRemoteObjectForCopy via a
  shared cachedEntryHasLocalData predicate. The CopyObject inline
  branch already handles entries that have Content but no chunks.
- Extract buildVersionedRemoteObjectPath so the streaming and copy
  cache helpers share path construction.

Adds TestResolvedSourceVersionId and TestCachedEntryHasLocalData to
pin the new helpers' contracts.
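
A sketch of the fallback contract, assuming the helper takes the raw
query value plus whatever version id the resolved entry carries (the
real signature may differ):

// resolvedSourceVersionId: prefer the explicitly requested version; for a
// latest-version read in a versioning-enabled bucket the query value is
// empty even though the entry really lives at .versions/v_<id>, so fall
// back to the id recorded on the resolved entry.
func resolvedSourceVersionId(srcVersionId, entryVersionId string) string {
	if srcVersionId != "" {
		return srcVersionId
	}
	return entryVersionId
}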

* narrow streaming cache contract back to chunks-only

CodeRabbit flagged that cacheRemoteObjectForStreaming's caller in
streamFromVolumeServers (lines 997-1002) still required non-empty
chunks, so content-only cache hits would fall through to a 503 retry
loop instead of being honored.

Resolve by keeping the helper's contract chunks-only: the filer's
caching code only ever writes chunks, the streaming downstream isn't
wired to read inline Content from a cached entry, and a partial range-
aware inline writer here would be overkill for a path that doesn't
actually occur in practice. cacheRemoteObjectForCopy keeps the relaxed
contract since the copy path's inline branch genuinely handles both
chunked and content-only entries.

Document the asymmetry on cachedEntryHasLocalData and on
cacheRemoteObjectForStreaming so a future reader can see why the two
helpers diverge.
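
The asymmetry reduces to something like the following sketch, with an
assumed boolean parameterization (the real predicate may be shaped
differently):

// cachedEntryHasLocalData, schematically: chunks always count; inline
// Content only counts when the caller can actually serve it.
func cachedEntryHasLocalData(numChunks, contentLen int, acceptInlineContent bool) bool {
	if numChunks > 0 {
		return true
	}
	return acceptInlineContent && contentLen > 0
}

// streaming path: acceptInlineContent = false  (the filer's caching only
//   ever writes chunks, and streamFromVolumeServers cannot serve inline Content)
// copy path:      acceptInlineContent = true   (the inline branch handles
//   Content-only entries)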

* extend version-id resolution to streaming cache path

CodeRabbit flagged that GetObjectHandler still passed the raw query
versionId to cacheRemoteObjectForStreaming. For latest-version reads
in versioning-enabled buckets that stays empty even though the
resolved entry lives at .versions/v_<id>, so remote-only GETs would
keep caching the wrong path and 503-ing forever. Reuse the new
resolvedSourceVersionId helper at the streaming call site too.

Also document on cachedEntryHasLocalData that the zero-byte case
flagged in the same review is handled upstream (IsInRemoteOnly
requires RemoteSize > 0, so the cache helper is never invoked for
empty remote objects -- CopyObject's pre-existing inline branch
writes a correct empty destination directly). Pin this with a new
test case.
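
Restated as a test-shaped sketch, reusing the schematic Entry from the
first code block above; the point is only the invariant, not the real
test:

import "testing"

// A zero-byte remote object has RemoteSize == 0, so IsInRemoteOnly is false
// and the cache helpers are never invoked for it; CopyObject's pre-existing
// inline branch writes a correct empty destination directly.
func TestZeroByteRemoteEntryIsNotRemoteOnly(t *testing.T) {
	empty := &Entry{RemoteSize: 0}
	if empty.IsInRemoteOnly() {
		t.Fatal("zero-byte entries must not route into the remote cache path")
	}
}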

* trim verbose comments

Drop tutorial-style and review-history comments. Keep only the WHY
that isn't obvious from identifiers: the #9304 reference on the new
branches in CopyObject / CopyObjectPart, the latest-version-fallback
rationale on resolvedSourceVersionId, and the streaming/copy contract
asymmetry on cachedEntryHasLocalData.

* drop issue references from comments

Issue numbers belong in PR descriptions and commit messages, not in
source comments where they rot. Replace with the underlying invariant
the code is preserving.

* test: drive remote-object cache helpers through real gRPC

Existing tests only re-enacted helper-function branching in test
space, so they could not have caught a handler that consumed a
remote-only entry without going through the cache. Stand up an
in-process filer gRPC server (UnimplementedSeaweedFilerServer +
configurable CacheRemoteObjectToLocalCluster response) and exercise
the two cache helpers end-to-end.
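
The stub can be as small as the following sketch; the generated
filer_pb names match the commit, but the request/response field layout
noted in comments is from memory and worth double-checking:

import (
	"context"
	"sync"

	"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
)

type stubFiler struct {
	filer_pb.UnimplementedSeaweedFilerServer

	mu     sync.Mutex
	gotReq *filer_pb.CacheRemoteObjectToLocalClusterRequest // tests read this back
	resp   *filer_pb.CacheRemoteObjectToLocalClusterResponse
	err    error
}

func (s *stubFiler) CacheRemoteObjectToLocalCluster(
	ctx context.Context,
	req *filer_pb.CacheRemoteObjectToLocalClusterRequest,
) (*filer_pb.CacheRemoteObjectToLocalClusterResponse, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.gotReq = req // e.g. assert the requested path ends in .versions/v_<id>
	return s.resp, s.err
}

// Wiring, schematically:
//   lis, _ := net.Listen("tcp", "127.0.0.1:0")
//   srv := grpc.NewServer()
//   filer_pb.RegisterSeaweedFilerServer(srv, &stubFiler{...})
//   go srv.Serve(lis)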

What's pinned:
- cacheRemoteObjectForCopy returns nil when the cache makes no
  progress (response is still remote-only), lets gRPC errors
  through as nil, accepts both chunked and inline-content cache
  hits, and surfaces deadline-exceeded as nil so callers can 503
  instead of holding the request open.
- Versioned source paths route to .versions/v_<id>; non-versioned
  and "null" stay at the bucket-relative path. Captured by reading
  the request the stub server received.
- cacheRemoteObjectForStreaming holds the stricter chunks-only
  contract: a content-only cache hit is not propagated, since
  streamFromVolumeServers' downstream isn't wired to read from
  inline Content there.

Any current or future handler that calls these helpers exercises
the same gRPC path under test, so the bug class is closed for
helper-routed cache calls.

* move remote-only copy test into the integration suite

The previous gRPC-stub test in weed/s3api/ was integration-flavored
but stubbed; relocate the coverage to the existing two-server suite
under test/s3/remote_cache/, which already exercises the real
remote.mount + remote.uncache flow against a primary SeaweedFS plus
a secondary acting as remote storage.

The new test/s3/remote_cache/remote_cache_copy_test.go drives:
- TestRemoteCacheCopyObject: upload to primary, uncache (entry now
  remote-only), CopyObject to a new key, GET the destination. Pre-
  fix the GET returned 500 'data integrity error: size N reported
  but no content'; this pins the fixed behavior over real HTTP
  through the actual handler stack.
- TestRemoteCacheCopyObjectPart: same shape via multipart
  UploadPartCopy on a 6 MiB object split into two parts, exercising
  CopyObjectPartHandler's range-copy path.

Drop weed/s3api/s3api_remote_storage_grpc_test.go: the helper-level
classification tests in s3api_remote_storage_test.go still cover
the contract pieces (cachedEntryHasLocalData, resolvedSourceVersionId,
the remote-only entry shape), and the integration suite covers the
end-to-end behavior that those classifications enable.

Remote Object Cache Integration Tests

This directory contains integration tests for the remote object caching feature with singleflight deduplication.

Test Flow

Each test follows this pattern:

  1. Write to local - Upload data to primary SeaweedFS (local storage)
  2. Uncache - Push data to remote storage and remove local chunks
  3. Read - Read data (triggers caching from remote back to local)

This tests the full remote caching workflow including singleflight deduplication.
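
In client terms the pattern looks roughly like the sketch below. It
assumes aws-sdk-go v1 and a hypothetical runWeedShell helper that pipes
one command to `weed shell -master localhost:9333`; the suite's real
helpers may differ.

import (
	"bytes"
	"io"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func writeUncacheRead() ([]byte, error) {
	sess := session.Must(session.NewSession(&aws.Config{
		Endpoint:         aws.String("http://localhost:8333"), // primary S3 API
		Region:           aws.String("us-east-1"),
		S3ForcePathStyle: aws.Bool(true),
	}))
	svc := s3.New(sess)

	// 1. Write to local
	if _, err := svc.PutObject(&s3.PutObjectInput{
		Bucket: aws.String("testbucket"),
		Key:    aws.String("dir/hello.txt"),
		Body:   bytes.NewReader([]byte("hello")),
	}); err != nil {
		return nil, err
	}

	// 2. Uncache: push to remote storage, drop the local chunks
	if err := runWeedShell("remote.uncache -dir=/buckets/testbucket"); err != nil {
		return nil, err
	}

	// 3. Read: triggers caching from remote back to local
	out, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: aws.String("testbucket"),
		Key:    aws.String("dir/hello.txt"),
	})
	if err != nil {
		return nil, err
	}
	defer out.Body.Close()
	return io.ReadAll(out.Body)
}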

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Test Client                               │
│                                                                  │
│    1. PUT data to primary SeaweedFS                             │
│    2. remote.uncache (push to remote, purge local)              │
│    3. GET data (triggers caching from remote)                   │
│    4. Verify singleflight deduplication                         │
└──────────────────────────────────┬──────────────────────────────┘
                                   │
                 ┌─────────────────┴─────────────────┐
                 ▼                                   ▼
┌────────────────────────────────────┐   ┌────────────────────────────────┐
│     Primary SeaweedFS              │   │     Remote SeaweedFS           │
│        (port 8333)                 │   │        (port 8334)             │
│                                    │   │                                │
│  - Being tested                    │   │  - Acts as "remote" S3         │
│  - Has remote storage mounted      │──▶│  - Receives uncached data      │
│  - Caches remote objects           │   │  - Serves data for caching     │
│  - Singleflight deduplication      │   │                                │
└────────────────────────────────────┘   └────────────────────────────────┘

What's Being Tested

Test Files and Coverage

Test File                           Commands Tested                                      Tests  Description
remote_cache_test.go                Basic caching                                        5      Original caching workflow and singleflight tests
remote_cache_copy_test.go           S3 CopyObject / UploadPartCopy, remote-only source   2      Source object lives only in remote storage; CopyObject and UploadPartCopy must cache it locally before persisting the destination so the result is readable
command_remote_configure_test.go    remote.configure                                     6      Configuration management
command_remote_mount_test.go        remote.mount, remote.unmount, remote.mount.buckets   10     Mount operations
command_remote_cache_test.go        remote.cache, remote.uncache                         13     Cache/uncache with filters
command_remote_copy_local_test.go   remote.copy.local                                    12     Local-to-remote copy (new in PR #8033)
command_remote_meta_sync_test.go    remote.meta.sync                                     8      Metadata synchronization
command_edge_cases_test.go          All commands                                         11     Edge cases and stress tests

Total: 67 test cases covering 8 weed shell commands and the S3 copy paths for remote-only sources

Commands Tested

  1. remote.configure - Configure remote storage backends
  2. remote.mount - Mount remote storage to local directory
  3. remote.unmount - Unmount remote storage
  4. remote.mount.buckets - Mount all buckets from remote
  5. remote.cache - Cache remote files locally
  6. remote.uncache - Remove local cache, keep metadata
  7. remote.copy.local - Copy local files to remote (NEW in PR #8033)
  8. remote.meta.sync - Sync metadata from remote

Test Coverage

Basic Operations:

  • Basic caching workflow (Write → Uncache → Read)
  • Singleflight deduplication (concurrent reads trigger ONE cache operation)
  • Large object caching (5MB-100MB files)
  • Range requests (partial reads)
  • Not found handling

File Filtering:

  • Include patterns (*.pdf, *.txt, etc.)
  • Exclude patterns
  • Size filters (-minSize, -maxSize)
  • Age filters (-minAge, -maxAge)
  • Combined filters

Command Options:

  • Dry run mode (-dryRun=true)
  • Concurrency settings (-concurrent=N)
  • Force update (-forceUpdate=true)
  • Non-empty directory mounting (-nonempty=true)

Edge Cases:

  • Empty directories
  • Nested directory hierarchies
  • Special characters in filenames
  • Very large files (100MB+)
  • Many small files (100+)
  • Rapid cache/uncache cycles
  • Concurrent command execution
  • Invalid paths
  • Zero-byte files

Running Tests

Run All Tests

# Full automated workflow
make test-with-server

# Or manually
go test -v ./...

Run Specific Test Files

# Test remote.configure command
go test -v -run TestRemoteConfigure

# Test remote.mount/unmount commands
go test -v -run TestRemoteMount
go test -v -run TestRemoteUnmount

# Test remote.cache/uncache commands  
go test -v -run TestRemoteCache
go test -v -run TestRemoteUncache

# Test remote.copy.local command (PR #8033)
go test -v -run TestRemoteCopyLocal

# Test remote.meta.sync command
go test -v -run TestRemoteMetaSync

# Test edge cases
go test -v -run TestEdgeCase

Quick Start

# Build SeaweedFS, start both servers, run tests, stop servers
make test-with-server

Manual Steps

# 1. Build SeaweedFS binary
make build-weed

# 2. Start remote SeaweedFS (acts as "remote" storage)
make start-remote

# 3. Start primary SeaweedFS (the one being tested)
make start-primary

# 4. Configure remote storage mount
make setup-remote

# 5. Run tests
make test

# 6. Clean up
make clean

Configuration

Primary SeaweedFS (Being Tested)

Service   Port
S3 API    8333
Filer     8888
Master    9333
Volume    8080

Remote SeaweedFS (Remote Storage)

Service   Port
S3 API    8334
Filer     8889
Master    9334
Volume    8081

Makefile Targets

make help           # Show all available targets
make build-weed     # Build SeaweedFS binary
make start-remote   # Start remote SeaweedFS
make start-primary  # Start primary SeaweedFS
make setup-remote   # Configure remote storage mount
make test           # Run tests
make test-with-server  # Full automated test workflow
make logs           # Show server logs
make health         # Check server status
make clean          # Stop servers and clean up

Test Details

TestRemoteCacheBasic

Basic workflow test:

  1. Write object to primary (local)
  2. Uncache (push to remote, remove local chunks)
  3. Read (triggers caching from remote)
  4. Read again (from local cache - should be faster)

TestRemoteCacheConcurrent

Singleflight deduplication test:

  1. Write 1MB object
  2. Uncache to remote
  3. Launch 10 concurrent reads
  4. All should succeed with correct data
  5. Only ONE caching operation should run (singleflight)
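
The concurrent phase can be expressed with errgroup, as sketched below;
verifying that exactly one cache operation ran is elided, since that
depends on server-side instrumentation:

import (
	"io"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
	"golang.org/x/sync/errgroup"
)

func readConcurrently(svc *s3.S3) error {
	var g errgroup.Group
	for i := 0; i < 10; i++ {
		g.Go(func() error {
			out, err := svc.GetObject(&s3.GetObjectInput{
				Bucket: aws.String("testbucket"),
				Key:    aws.String("big-1mb.bin"),
			})
			if err != nil {
				return err
			}
			defer out.Body.Close()
			// Every reader must see the full, correct body even though
			// singleflight collapses the ten cache fetches into one.
			_, err = io.Copy(io.Discard, out.Body)
			return err
		})
	}
	return g.Wait()
}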

TestRemoteCacheLargeObject

Large file test (5MB) to verify chunked transfer works correctly.

TestRemoteCacheRangeRequest

Tests HTTP range requests work correctly after caching.

TestRemoteCacheNotFound

Tests proper error handling for non-existent objects.
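
TestRemoteCacheCopyObject / TestRemoteCacheCopyObjectPart

From remote_cache_copy_test.go: copy an object whose only copy lives in
remote storage, then read the destination. Roughly, assuming the same
client setup as the earlier sketch (bucket and key names here are
illustrative):

func copyRemoteOnly(svc *s3.S3) error {
	// The source was uploaded and then uncached, so it is remote-only here.
	if _, err := svc.CopyObject(&s3.CopyObjectInput{
		Bucket:     aws.String("testbucket"),
		Key:        aws.String("copied/hello.txt"),
		CopySource: aws.String("testbucket/dir/hello.txt"),
	}); err != nil {
		return err
	}
	// Pre-fix, this GET failed with 500 "data integrity error".
	out, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: aws.String("testbucket"),
		Key:    aws.String("copied/hello.txt"),
	})
	if err != nil {
		return err
	}
	return out.Body.Close()
}

func copyPartRemoteOnly(svc *s3.S3) error {
	// Multipart variant: a 6 MiB remote-only source split into two
	// 3 MiB parts drives CopyObjectPartHandler's range-copy path.
	up, err := svc.CreateMultipartUpload(&s3.CreateMultipartUploadInput{
		Bucket: aws.String("testbucket"),
		Key:    aws.String("copied/big.bin"),
	})
	if err != nil {
		return err
	}
	_, err = svc.UploadPartCopy(&s3.UploadPartCopyInput{
		Bucket:          aws.String("testbucket"),
		Key:             aws.String("copied/big.bin"),
		UploadId:        up.UploadId,
		PartNumber:      aws.Int64(1),
		CopySource:      aws.String("testbucket/big.bin"),
		CopySourceRange: aws.String("bytes=0-3145727"), // first 3 MiB
	})
	// Second part, CompleteMultipartUpload, and the verifying GET elided.
	return err
}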

Troubleshooting

View logs

make logs           # Show recent logs from both servers
make logs-primary   # Follow primary logs in real-time
make logs-remote    # Follow remote logs in real-time

Check server health

make health

Clean up and retry

make clean
make test-with-server