mirror of https://github.com/seaweedfs/seaweedfs.git synced 2026-05-19 08:11:29 +00:00

Files

Chris Lu 82cf60a44f fix(s3api): re-encrypt UploadPartCopy bytes for the destination's SSE config (#8908) (#9280)

* fix(s3api): re-encrypt UploadPartCopy bytes for the destination's SSE config (#8908)

The remaining failure mode in #8908 was that Docker Registry's blob
finalization (server-side Move via UploadPartCopy) silently corrupts
SSE-S3 multipart objects. Reproduces with `aws s3api upload-part-copy`
under bucket-default SSE-S3: the GET on the completed object returns
deterministic wrong bytes (correct length, same wrong SHA-256 across
runs). The metadata is mathematically self-consistent — every chunk's
stored IV equals `calculateIVWithOffset(baseIV_dst, partLocalOffset)` —
but the bytes on disk were encrypted with the SOURCE upload's key+baseIV.

Root cause:

- `copyChunksForRange` (and `createDestinationChunk`) constructs new
chunks for UploadPartCopy without copying `SseType` / `SseMetadata`,
so destination chunks are written with `SseType=NONE`.

- At completion, `completedMultipartChunk` (PR #9224's NONE→SSE_S3
backfill, intended to recover from a different missing-metadata
bug) sees those NONE chunks under an SSE-S3 multipart upload and
backfills SSE-S3 metadata derived from the destination upload's
baseIV. The chunk metadata is now internally consistent and the
GET path applies decryption — but the bytes on disk are encrypted
with the source upload's key, not the destination's. Decryption
produces deterministic garbage. Docker Registry pulls then fail
with "Digest did not match".

Fix: when either the source object or the destination multipart upload
has any SSE configured, take a slow-path UploadPartCopy that
(1) opens a plaintext reader of the source range — decrypting the
source's per-chunk SSE-S3 metadata if needed via a reused
`buildMultipartSSES3Reader`, and
(2) feeds that plaintext through `putToFiler`'s existing encryption
pipeline by staging the destination upload entry's SSE-S3/SSE-KMS
headers on a cloned request. Encryption then matches PutObjectPart's
contract: every part starts a fresh CTR stream from counter 0 with
`baseIV_dst`, and each internal chunk's metadata records
`calculateIVWithOffset(baseIV_dst, chunk.partLocalOffset)`.

The `non-SSE → non-SSE` case still takes the existing fast raw-byte
copy path — bytes on disk are plaintext on both sides, so chunk-level
metadata is irrelevant.

Cross-encryption from SSE-KMS / SSE-C sources is left as TODO — the
new path returns an explicit error rather than the previous silent
corruption. SSE-S3 (the user-reported case) round-trips correctly.
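
The per-chunk IV relationship and the failure mode described above can be sketched with the standard library alone. This is an illustration under assumptions, not the SeaweedFS code: `ivWithOffset` is a hypothetical stand-in for `calculateIVWithOffset`, assumed here to add `offset/16` to the IV as a 128-bit big-endian counter, and `ctrAt`, the keys, and the IV are invented for the example.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"encoding/binary"
	"fmt"
)

// ivWithOffset is a hypothetical stand-in for calculateIVWithOffset: it adds
// offset/16 to the IV, treated as a 128-bit big-endian counter, and returns
// the number of keystream bytes to skip inside the first block.
func ivWithOffset(baseIV []byte, offset int64) ([]byte, int) {
	iv := make([]byte, aes.BlockSize)
	copy(iv, baseIV)
	blocks := uint64(offset / aes.BlockSize)
	// Add to the low 64 bits, carrying into the high 64 bits on overflow.
	lo := binary.BigEndian.Uint64(iv[8:])
	hi := binary.BigEndian.Uint64(iv[:8])
	sum := lo + blocks
	if sum < lo {
		hi++
	}
	binary.BigEndian.PutUint64(iv[8:], sum)
	binary.BigEndian.PutUint64(iv[:8], hi)
	return iv, int(offset % aes.BlockSize)
}

// ctrAt encrypts or decrypts data that sits at the given offset of a CTR stream.
func ctrAt(key, baseIV []byte, offset int64, data []byte) []byte {
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	iv, skip := ivWithOffset(baseIV, offset)
	stream := cipher.NewCTR(block, iv)
	// Discard the keystream bytes that precede the offset within its block.
	discard := make([]byte, skip)
	stream.XORKeyStream(discard, discard)
	out := make([]byte, len(data))
	stream.XORKeyStream(out, data)
	return out
}

func main() {
	srcKey := bytes.Repeat([]byte{0x11}, 32) // illustrative keys, not real ones
	dstKey := bytes.Repeat([]byte{0x22}, 32)
	baseIV := bytes.Repeat([]byte{0xA0}, aes.BlockSize)
	plain := []byte("UploadPartCopy must re-encrypt bytes for the destination upload")

	// Encrypt the whole part from counter 0, then decrypt a chunk at offset 21.
	ct := ctrAt(dstKey, baseIV, 0, plain)
	chunk := ctrAt(dstKey, baseIV, 21, ct[21:])
	fmt.Println(bytes.Equal(chunk, plain[21:]))

	// Bytes encrypted with the source key but decrypted with the destination
	// key come back with the right length and the wrong content.
	wrong := ctrAt(dstKey, baseIV, 0, ctrAt(srcKey, baseIV, 0, plain))
	fmt.Println(len(wrong) == len(plain), bytes.Equal(wrong, plain))
}
```

Because CTR is a pure XOR with a keystream, a key mismatch cannot be detected from metadata or length alone, which is consistent with the "correct length, same wrong SHA-256" symptom reported above.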

Tests:

- test/s3/sse/s3_sse_uploadpartcopy_integration_test.go pins three
  UploadPartCopy shapes against bucket-default SSE-S3:
  * Docker-Registry-shape 32MB+tail (the user's exact 5-chunk /
    2-part metadata layout)
  * single full-object UploadPartCopy
  * many small range copies
  Each shape round-trips with a matching SHA-256.

- test/s3/sse/s3_sse_concurrent_repro_test.go covers the parallel
multipart-upload shape from the user report (5 blobs in parallel,
full GET and chunked range GET both hash-checked) — pre-existing
coverage; added here as a regression sentinel.

* test(s3-sse): rename UploadPartCopy regression test so CI matches it

The CI workflow .github/workflows/s3-sse-tests.yml dispatches on the
TEST_PATTERN ".*Multipart.*Integration" — i.e. the test name must contain
both "Multipart" and "Integration" for CI to run it. The previous name
TestSSES3UploadPartCopyIntegration had only "Integration"; "UploadPart"
isn't "Multipart". Rename to TestSSES3MultipartUploadPartCopyIntegration
so the regression test actually runs in CI rather than only locally.
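
The name mismatch can be checked with Go's `regexp` package. The sketch assumes the workflow hands TEST_PATTERN to `go test -run`, which treats its argument as an unanchored regular expression; `ciSelects` is a hypothetical helper written for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// ciSelects reports whether testName matches pattern the way an unanchored
// regular-expression filter would.
func ciSelects(pattern, testName string) bool {
	return regexp.MustCompile(pattern).MatchString(testName)
}

func main() {
	const pattern = ".*Multipart.*Integration"
	for _, name := range []string{
		"TestSSES3UploadPartCopyIntegration",          // old name
		"TestSSES3MultipartUploadPartCopyIntegration", // new name
	} {
		fmt.Printf("%-45s selected=%v\n", name, ciSelects(pattern, name))
	}
}
```

Only the renamed test contains the literal substring "Multipart", so only it is selected.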

* fix(s3api): map unsupported UploadPartCopy SSE source to 501, not 500 (review feedback on #9280)

openSourcePlaintextReader explicitly rejects SSE-KMS and SSE-C sources
(SSE-S3 is the only one wired up in this slow path so far). Earlier
the caller blanket-mapped that to ErrInternalError, which collapses
"this shape isn't implemented yet" into the same 500 response a
real server failure would produce, so clients could not tell whether
they hit a feature gap or a bug.

Introduce a sentinel errCopySourceSSEUnsupported and have
copyObjectPartViaReencryption errors.Is-check it; on match, return
ErrNotImplemented (501) instead of ErrInternalError (500). Other
failures still map to 500.
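
A minimal sketch of this sentinel pattern follows. The sentinel name comes from the commit message; its error text, the `statusForCopyError` helper, and the use of plain HTTP status codes in place of the S3 error-code constants are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errCopySourceSSEUnsupported mirrors the sentinel named above; the message
// text is invented for this sketch.
var errCopySourceSSEUnsupported = errors.New("copy source SSE type not supported")

// statusForCopyError is a hypothetical helper that maps an error from the
// re-encryption path to an HTTP status code.
func statusForCopyError(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errCopySourceSSEUnsupported):
		return http.StatusNotImplemented // feature gap: 501
	default:
		return http.StatusInternalServerError // anything else: 500
	}
}

func main() {
	// errors.Is sees through wrapping added by intermediate callers.
	wrapped := fmt.Errorf("open source reader: %w", errCopySourceSSEUnsupported)
	fmt.Println(statusForCopyError(wrapped))
	fmt.Println(statusForCopyError(errors.New("disk error")))
}
```

Using `errors.Is` rather than `==` keeps the mapping correct even when the sentinel is wrapped with extra context on its way up to the handler.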

Found by coderabbitai review on PR #9280.

* fix(s3api): UploadPartCopy must fail with NoSuchUpload when upload entry is missing (review feedback on #9280)

CopyObjectPartHandler's earlier checkUploadId call only verifies that
the uploadID's hash prefix matches dstObject; it does not prove the
upload directory exists in the filer. The previous logic silently
swallowed filer_pb.ErrNotFound from getEntry(uploadDir) and fell
through with uploadEntry=nil, which then skipped the destination
SSE check and could route a plain-source copy through the raw-byte
fast path even though the destination's encryption state is unknown.

Treat ErrNotFound as ErrNoSuchUpload so the client sees the right
status, matching the AWS S3 contract for UploadPartCopy on a
non-existent upload.

Found by coderabbitai review on PR #9280.

* feat(s3api): set SSE response headers on UploadPartCopy slow path (review feedback on #9280)

PutObjectPartHandler writes x-amz-server-side-encryption (and the KMS
key-id header for SSE-KMS) on every successful part response so clients
can confirm the destination's encryption state. The new UploadPartCopy
slow path was missing this — it returned only the ETag in the response
body and no SSE response headers.

Plumb putToFiler's SSEResponseMetadata back through
copyObjectPartViaReencryption to the handler, then call
setSSEResponseHeaders before writing the XML response, matching the
PutObjectPart contract.

Found by gemini-code-assist review on PR #9280.

* fix(s3api): map transient filer errors on UploadPartCopy upload-entry fetch to 503 (review feedback on #9280)

Previously, every non-ErrNotFound error from getEntry(uploadDir, uploadID)
returned 500 InternalError, which most SDKs treat as fatal, even
though a transient filer outage (gRPC Unavailable, leader election in
flight, deadline exceeded) is exactly the kind of failure SDK retry
logic is supposed to recover from.

Add an isTransientFilerError helper that recognises:
- context.DeadlineExceeded / context.Canceled
- gRPC codes.Unavailable, DeadlineExceeded, ResourceExhausted, Aborted

When the upload-entry fetch fails for one of those reasons, return
503 ServiceUnavailable so the client retries; everything else still
maps to 500. Log line now also carries dstObject (in addition to
dstBucket and uploadID) to make incident triage easier.

Found by gemini-code-assist review on PR #9280.

2026-04-29 09:46:44 -07:00

docker-compose.yml

fix: add missing backslash for volume extraArgs in helm chart (#7676)

2025-12-08 23:21:02 -08:00

github_7562_copy_test.go

chore: execute goimports to format the code (#7983)

2026-01-07 13:06:08 -08:00

Makefile

feat(s3): support WEED_S3_SSE_KEY env var for SSE-S3 KEK (#8904)

2026-04-03 13:01:21 -07:00

README_KMS.md

S3 API: Add integration with KMS providers (#7152)

2025-08-22 22:10:30 -07:00

README.md

S3 API: Add integration with KMS providers (#7152)

2025-08-22 22:10:30 -07:00

s3_kms.json

S3 API: Add integration with KMS providers (#7152)

2025-08-22 22:10:30 -07:00

s3_range_headers_test.go

S3: Directly read write volume servers (#7481)

2025-11-18 23:18:35 -08:00

s3_sse_concurrent_repro_test.go

fix(s3api): re-encrypt UploadPartCopy bytes for the destination's SSE config (#8908) (#9280)

2026-04-29 09:46:44 -07:00

s3_sse_integration_test.go

fix(s3api): stream multipart-SSE chunks lazily to avoid truncated GETs (#8908) (#9228)

2026-04-26 16:31:42 -07:00

s3_sse_multipart_copy_test.go

Add Kafka Gateway (#7231)

2025-10-13 18:05:17 -07:00

s3_sse_range_coverage_test.go

fix(s3api): stream multipart-SSE chunks lazily to avoid truncated GETs (#8908) (#9228)

2026-04-26 16:31:42 -07:00

s3_sse_range_server_test.go

S3: Directly read write volume servers (#7481)

2025-11-18 23:18:35 -08:00

s3_sse_uploadpartcopy_integration_test.go

fix(s3api): re-encrypt UploadPartCopy bytes for the destination's SSE config (#8908) (#9280)

2026-04-29 09:46:44 -07:00

s3_volume_encryption_test.go

go fix

2026-02-20 18:42:00 -08:00

s3-config-template.json

S3 API: Add integration with KMS providers (#7152)

2025-08-22 22:10:30 -07:00

setup_openbao_sse.sh

Add Kafka Gateway (#7231)

2025-10-13 18:05:17 -07:00

simple_sse_test.go

Add Kafka Gateway (#7231)

2025-10-13 18:05:17 -07:00

sse_kms_openbao_test.go

Add Kafka Gateway (#7231)

2025-10-13 18:05:17 -07:00

test_single_ssec.txt

S3 API: Add SSE-KMS (#7144)

2025-08-21 08:28:07 -07:00

README.md

S3 Server-Side Encryption (SSE) Integration Tests

This directory contains comprehensive integration tests for SeaweedFS S3 API Server-Side Encryption functionality. These tests validate the complete end-to-end encryption/decryption pipeline from S3 API requests through filer metadata storage.

Overview

The SSE integration tests cover three main encryption methods:

SSE-C (Customer-Provided Keys): Client provides encryption keys via request headers
SSE-KMS (Key Management Service): Server manages encryption keys through a KMS provider
SSE-S3 (Server-Managed Keys): Server automatically manages encryption keys
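
For SSE-C, the key travels in three request headers defined by the S3 API. The sketch below builds them with the standard library only; `sseCHeaders` is a hypothetical helper, and the tests in this directory use the AWS SDK for Go v2 rather than setting headers by hand.

```go
package main

import (
	"crypto/md5"
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"net/http"
)

// sseCHeaders builds the three SSE-C request headers for a 256-bit key:
// the algorithm, the base64-encoded key, and the base64-encoded MD5 of the key.
func sseCHeaders(key []byte) (http.Header, error) {
	if len(key) != 32 {
		return nil, fmt.Errorf("SSE-C key must be 32 bytes, got %d", len(key))
	}
	sum := md5.Sum(key)
	h := http.Header{}
	h.Set("x-amz-server-side-encryption-customer-algorithm", "AES256")
	h.Set("x-amz-server-side-encryption-customer-key", base64.StdEncoding.EncodeToString(key))
	h.Set("x-amz-server-side-encryption-customer-key-MD5", base64.StdEncoding.EncodeToString(sum[:]))
	return h, nil
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	h, err := sseCHeaders(key)
	if err != nil {
		panic(err)
	}
	// Print header names and value lengths only, never the key itself.
	for name, values := range h {
		fmt.Println(name+":", len(values[0]), "characters")
	}
}
```

The same three headers must accompany every later GET of the object, because the server does not store the customer key.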

🆕 Real KMS Integration

The tests now include real KMS integration with OpenBao, providing:

✅ Actual encryption/decryption operations (not mock keys)
✅ Multiple KMS keys for different security levels
✅ Per-bucket KMS configuration testing
✅ Performance benchmarking with real KMS operations

See README_KMS.md for detailed KMS integration documentation.

Why Integration Tests Matter

These integration tests were created to close a critical gap in test coverage. The SeaweedFS codebase had comprehensive unit tests for SSE components, but no integration tests that validated the complete request flow:

Client Request → S3 API → Filer Storage → Metadata Persistence → Retrieval → Decryption

The Bug These Tests Would Have Caught

A critical bug was discovered where:

✅ S3 API correctly encrypted data and sent metadata headers to the filer
❌ Filer did not process SSE metadata headers, losing all encryption metadata
❌ Objects could be encrypted but never decrypted (metadata was lost)

Unit tests passed because they tested components in isolation, but the integration was broken. These integration tests specifically validate that:

Encryption metadata is correctly sent to the filer
Filer properly processes and stores the metadata
Objects can be successfully retrieved and decrypted
Copy operations preserve encryption metadata
Multipart uploads maintain encryption consistency

Test Structure

Core Integration Tests

Basic Functionality

TestSSECIntegrationBasic - Basic SSE-C PUT/GET cycle
TestSSEKMSIntegrationBasic - Basic SSE-KMS PUT/GET cycle

Data Size Validation

TestSSECIntegrationVariousDataSizes - SSE-C with various data sizes (0B to 1MB)
TestSSEKMSIntegrationVariousDataSizes - SSE-KMS with various data sizes

Object Copy Operations

TestSSECObjectCopyIntegration - SSE-C object copying (key rotation, encryption changes)
TestSSEKMSObjectCopyIntegration - SSE-KMS object copying

Multipart Uploads

TestSSEMultipartUploadIntegration - SSE multipart uploads for large objects

Error Conditions

TestSSEErrorConditions - Invalid keys, malformed requests, error handling

Performance Tests

BenchmarkSSECThroughput - SSE-C performance benchmarking
BenchmarkSSEKMSThroughput - SSE-KMS performance benchmarking

Running Tests

Prerequisites

Build SeaweedFS: Ensure the weed binary is built and available in PATH
```
cd /path/to/seaweedfs
make
```
Dependencies: Tests use the AWS SDK for Go v2 and testify; Go modules resolve both

Quick Test

Run basic SSE integration tests:

make test-basic

Comprehensive Testing

Run all SSE integration tests:

make test

Specific Test Categories

make test-ssec      # SSE-C tests only
make test-ssekms    # SSE-KMS tests only  
make test-copy      # Copy operation tests
make test-multipart # Multipart upload tests
make test-errors    # Error condition tests

Performance Testing

make benchmark      # Performance benchmarks
make perf          # Various data size performance tests

KMS Integration Testing

make setup-openbao          # Set up OpenBao KMS
make test-with-kms          # Run all SSE tests with real KMS
make test-ssekms-integration # Run SSE-KMS with OpenBao only
make clean-kms             # Clean up KMS environment

Development Testing

make manual-start   # Start SeaweedFS for manual testing
# ... run manual tests ...
make manual-stop    # Stop and cleanup

Test Configuration

Default Configuration

The tests use these default settings:

S3 Endpoint: http://127.0.0.1:8333
Access Key: some_access_key1
Secret Key: some_secret_key1
Region: us-east-1
Bucket Prefix: test-sse-

Custom Configuration

Override defaults via environment variables:

S3_PORT=8444 FILER_PORT=8889 make test
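
One way a test helper could honor such an override is sketched below. How the Makefile and test code actually consume `S3_PORT` is not shown in this README, so `envOr` and `endpoint` are hypothetical names used only to illustrate the default-plus-override pattern.

```go
package main

import (
	"fmt"
	"os"
)

// envOr returns the value of the named environment variable, or def when the
// variable is unset or empty.
func envOr(name, def string) string {
	if v := os.Getenv(name); v != "" {
		return v
	}
	return def
}

// endpoint builds the S3 endpoint URL from the default host and an optional
// S3_PORT override (hypothetical wiring for this sketch).
func endpoint() string {
	return "http://127.0.0.1:" + envOr("S3_PORT", "8333")
}

func main() {
	fmt.Println(endpoint())
}
```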

Test Environment

Each test run:

Starts a complete SeaweedFS cluster (master, volume, filer, s3)
Configures KMS support for SSE-KMS tests
Creates temporary buckets with unique names
Runs tests with real HTTP requests
Cleans up all test artifacts

Test Data Coverage

Data Sizes Tested

0 bytes: Empty files (edge case)
1 byte: Minimal data
16 bytes: Single AES block
31 bytes: Just under two blocks
32 bytes: Exactly two blocks
100 bytes: Small file
1 KB: Small text file
8 KB: Medium file
64 KB: Large file
1 MB: Very large file

Encryption Key Scenarios

SSE-C: Random 256-bit keys, key rotation, wrong keys
SSE-KMS: Various key IDs, encryption contexts, bucket keys
Copy Operations: Same key, different keys, encryption transitions

Critical Test Scenarios

Metadata Persistence Validation

The integration tests specifically validate scenarios that would catch metadata storage bugs:

// 1. Upload with SSE-C
client.PutObject(..., SSECustomerKey: key)  // ← Metadata sent to filer

// 2. Retrieve with SSE-C  
client.GetObject(..., SSECustomerKey: key)  // ← Metadata retrieved from filer

// 3. Verify decryption works
assert.Equal(originalData, decryptedData)    // ← Would fail if metadata lost

Content-Length Validation

Tests verify that Content-Length headers are correct, which would catch bugs related to IV handling:

assert.Equal(int64(originalSize), resp.ContentLength)  // ← Would catch IV-in-stream bugs
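
The reason a length check catches IV handling bugs can be shown with the standard library. The sketch assumes a length-preserving stream mode (AES-CTR here) and uses a hypothetical `encryptCTR` helper; it is not the SeaweedFS encryption path.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// encryptCTR encrypts plain with AES-CTR under a random IV and returns the
// IV and the ciphertext as separate values.
func encryptCTR(key, plain []byte) ([]byte, []byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, nil, err
	}
	iv := make([]byte, aes.BlockSize)
	if _, err := rand.Read(iv); err != nil {
		return nil, nil, err
	}
	ct := make([]byte, len(plain))
	cipher.NewCTR(block, iv).XORKeyStream(ct, plain)
	return iv, ct, nil
}

func main() {
	key := make([]byte, 32) // all-zero key, for illustration only
	for _, size := range []int{0, 1, 16, 31, 32, 100, 1024} {
		iv, ct, err := encryptCTR(key, make([]byte, size))
		if err != nil {
			panic(err)
		}
		// Storing the IV in metadata keeps the stored length equal to the
		// plaintext length; prepending it to the stream adds 16 bytes.
		withIV := append(append([]byte{}, iv...), ct...)
		fmt.Printf("plain=%d stored=%d iv-in-stream=%d\n", size, len(ct), len(withIV))
	}
}
```

If the IV leaks into the stored byte stream, every object grows by one AES block, and a Content-Length assertion fails even for the 0-byte case.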

Debugging

View Logs

make debug-logs     # Show recent log entries
make debug-status   # Show process and port status

Manual Testing

make manual-start   # Start SeaweedFS
# Test with S3 clients, curl, etc.
make manual-stop    # Cleanup

Integration Test Benefits

These integration tests provide:

End-to-End Validation: Complete request pipeline testing
Metadata Persistence: Validates filer storage/retrieval of encryption metadata
Real Network Communication: Uses actual HTTP requests and responses
Production-Like Environment: Full SeaweedFS cluster with all components
Regression Protection: Prevents critical integration bugs
Performance Baselines: Benchmarking for performance monitoring

Continuous Integration

For CI/CD pipelines, use:

make ci-test        # Quick tests suitable for CI
make stress         # Stress testing for stability validation

Key Differences from Unit Tests

| Aspect | Unit Tests | Integration Tests |
|--------|------------|-------------------|
| Scope | Individual functions | Complete request pipeline |
| Dependencies | Mocked/simulated | Real SeaweedFS cluster |
| Network | None | Real HTTP requests |
| Storage | In-memory | Real filer database |
| Metadata | Manual simulation | Actual storage/retrieval |
| Speed | Fast (milliseconds) | Slower (seconds) |
| Coverage | Component logic | System integration |

Conclusion

These integration tests ensure that SeaweedFS SSE functionality works correctly in production-like environments. They complement the existing unit tests by validating that all components work together properly, providing confidence that encryption/decryption operations will succeed for real users.

Most importantly, these tests would have immediately caught the critical filer metadata storage bug that was previously undetected, demonstrating the crucial importance of integration testing for distributed systems.