mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-05-14 05:41:29 +00:00
* fix(s3): encrypt SSE-S3 KEK at rest using passphrase-derived wrapping key

* fix(s3): surface KEK migration failures instead of silently dropping them

The legacy-plaintext -> encrypted-at-rest path used to swallow both wrapKEK and updateKEKContent errors. An operator who configured a passphrase but had a filer permission issue (or a wrap failure) saw nothing in the logs and the KEK stayed on disk in plaintext, with the migration retried on every restart and silently failing every time.

Log each failure path explicitly so the unmigrated state is visible. The server still starts with the in-memory key loaded; refusing to boot here would be worse than the warning.

Addresses gemini and coderabbit reviews on PR #8880.

* fix(s3): use a per-installation random salt for KEK wrapping HKDF

The original implementation hardcoded "seaweedfs-sse-s3-kek-wrapping-v1" as the HKDF salt for the KEK wrapping key. Two SeaweedFS installations using the same passphrase therefore produced byte-identical wrapping keys, opening a precomputation/rainbow-table angle against weaker passphrases.

Generate a random 32-byte salt every time wrapKEK runs and embed it in the on-disk payload alongside the AES-GCM ciphertext. The new format is base64(magic("SWv2") || salt || nonce || ciphertext+tag); unwrapKEK detects the magic and reads the salt out of the payload. KEKs wrapped under the legacy fixed-salt format still unwrap cleanly and are opportunistically re-wrapped into v2 on the next load so operators get the stronger format without manual migration.

Addresses gemini review on PR #8880.

* fix(s3): plumb KEK passphrase from env into the global key manager

InitializeGlobalSSES3KeyManager used to ignore the kekPassphrase field entirely: the global manager was always constructed with the empty constructor, so the encrypted-at-rest code path never engaged in production.
Read the passphrase from WEED_S3_SSE_KEK_PASSPHRASE and apply it before InitializeWithFiler so the load path picks up the encrypted format. Log a warning when the env var is unset to make the plaintext fallback visible to operators upgrading from earlier builds.

Adds SetKEKPassphrase as the public seam used by the global init and by tests, plus regression tests for the wrap/unwrap round-trip, random-salt independence across managers sharing a passphrase, and the no-passphrase fallback that preserves the legacy hex-decode path.

Addresses coderabbit review on PR #8880.

* fix(s3): drop redundant base64 decode in KEK migration check

unwrapKEK already does the base64 decode and the magic-prefix check; isV2WrappedKEK was repeating both passes purely so the migration branch in loadSuperKeyFromFiler could ask "was this v1 or v2". Have unwrapKEK return the version flag directly and delete the redundant helper. Single decode pass per load.

Addresses gemini review on PR #8880.

* fix(s3): updateKEKContent must overwrite, not create

filer_pb.MkFile maps to CreateEntry, which is O_EXCL: it fails with ErrEntryAlreadyExists when the file already exists. Both KEK migration paths (legacy v1→v2 rewrap and plaintext→encrypted) call updateKEKContent against the entry they just read, so MkFile errored out every time and the migrations only ran in memory while the on-disk KEK stayed in its old format. The previous commit logged the failure loudly but the result was the same: a pre-existing deployment never got migrated.

Switch updateKEKContent to LookupDirectoryEntry + UpdateEntry so the overwrite actually persists. Surface the lookup/update errors so the caller's existing "migration failed; KEK still on disk in old form" warnings fire on the right cases.

Addresses coderabbit critical review on PR #8880.
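
The create-versus-overwrite distinction in the updateKEKContent fix can be illustrated with a local-filesystem analogy. This is only a sketch: it uses os.OpenFile with O_EXCL and os.WriteFile rather than the filer API, and the helper names createExclusive, overwriteExisting, and demo are hypothetical, not functions from the SeaweedFS code base.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// createExclusive mimics create-only semantics: it refuses to touch an existing file.
func createExclusive(path string, data []byte) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(data)
	return err
}

// overwriteExisting mimics lookup-then-update: the file must already exist,
// and both failure modes are surfaced to the caller with context.
func overwriteExisting(path string, data []byte) error {
	if _, err := os.Stat(path); err != nil {
		return fmt.Errorf("lookup %s: %w", path, err)
	}
	if err := os.WriteFile(path, data, 0o600); err != nil {
		return fmt.Errorf("update %s: %w", path, err)
	}
	return nil
}

// demo writes an "old-format" file, then tries both strategies to replace it.
// It reports whether the exclusive create was rejected and the final content.
func demo() (bool, string, error) {
	dir, err := os.MkdirTemp("", "kek-demo")
	if err != nil {
		return false, "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "sse_kek")

	if err := createExclusive(path, []byte("old-format")); err != nil {
		return false, "", err
	}
	// A second exclusive create against the same path fails: the file exists.
	rejected := errors.Is(createExclusive(path, []byte("new-format")), fs.ErrExist)

	// Lookup-then-update replaces the content in place.
	if err := overwriteExisting(path, []byte("new-format")); err != nil {
		return rejected, "", err
	}
	data, err := os.ReadFile(path)
	return rejected, string(data), err
}

func main() {
	rejected, content, err := demo()
	fmt.Println("exclusive create rejected:", rejected)
	fmt.Println("content after overwrite:", content, "err:", err)
}
```

The point of the analogy is that a create-only call can never complete a migration of a file that was just read, whereas a lookup followed by an update can, and it also yields distinct errors for the two steps.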
* chore(s3): drop unused generateAndSaveSuperKeyToFiler

Master removed this as dead code in #8913 and I reintroduced it during the merge resolution thinking the migration paths still needed it. On second look it has no callers in this branch: every KEK-creation path on PR #8880 goes through the existing reader code that handles "file not present, generate" inline. Drop the duplicate.

Addresses gemini medium review on PR #8880.

* fix(s3): updateKEKContent honours km.kekPath instead of hardcoded path

The migration write path was always pointing at /etc/s3/sse_kek even when the manager was configured with an operator-overridden kekPath. Split km.kekPath at the last "/" so the lookup + UpdateEntry land on the same file the read path used. Defaults match defaultKEKPath when kekPath is unset.

Addresses gemini medium review on PR #8880.

* fix(s3): KEK passphrase via Viper key, with env-var fallback

The KEK passphrase was read straight from os.Getenv, but every other SSE-S3 secret (s3.sse.kek, s3.sse.key) goes through Viper so an operator can set them in security.toml or via WEED_ env vars interchangeably. Add s3.sse.kek.passphrase to the same path; the existing SSES3KEKPassphraseEnv lookup stays as a fallback so deployments wired before this commit keep working.

Addresses gemini medium review on PR #8880.
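
The v2 wrapped-KEK layout from the random-salt commit above, base64(magic("SWv2") || salt || nonce || ciphertext+tag), can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code from the PR: the function signatures are hypothetical, HKDF is hand-rolled as a single SHA-256 block instead of using a library, the info string "kek-wrap" and the default 12-byte GCM nonce are guesses, and the legacy fixed-salt format is not handled.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
)

const (
	magicV2  = "SWv2" // magic prefix named in the commit message
	saltSize = 32     // salt length named in the commit message
)

// deriveKey is HKDF-SHA256 (RFC 5869) reduced to one expand block,
// which yields exactly the 32 bytes an AES-256 key needs.
func deriveKey(passphrase string, salt []byte) []byte {
	extract := hmac.New(sha256.New, salt)
	extract.Write([]byte(passphrase))
	prk := extract.Sum(nil)

	expand := hmac.New(sha256.New, prk)
	expand.Write([]byte("kek-wrap")) // assumed info string
	expand.Write([]byte{0x01})       // block counter for T(1)
	return expand.Sum(nil)
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// wrapKEK draws a fresh salt and nonce on every call and embeds both in the payload.
func wrapKEK(kek []byte, passphrase string) (string, error) {
	salt := make([]byte, saltSize)
	if _, err := rand.Read(salt); err != nil {
		return "", err
	}
	gcm, err := newGCM(deriveKey(passphrase, salt))
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	out := append([]byte(magicV2), salt...)
	out = append(out, nonce...)
	out = gcm.Seal(out, nonce, kek, nil) // appends ciphertext+tag
	return base64.StdEncoding.EncodeToString(out), nil
}

// unwrapKEK checks the magic, reads the salt back out of the payload, and decrypts.
func unwrapKEK(encoded, passphrase string) ([]byte, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, err
	}
	if !bytes.HasPrefix(raw, []byte(magicV2)) {
		return nil, errors.New("not a v2 payload")
	}
	rest := raw[len(magicV2):]
	if len(rest) < saltSize {
		return nil, errors.New("payload too short for salt")
	}
	salt, rest := rest[:saltSize], rest[saltSize:]
	gcm, err := newGCM(deriveKey(passphrase, salt))
	if err != nil {
		return nil, err
	}
	if len(rest) < gcm.NonceSize() {
		return nil, errors.New("payload too short for nonce")
	}
	nonce, ciphertext := rest[:gcm.NonceSize()], rest[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	kek := []byte("0123456789abcdef0123456789abcdef")
	a, _ := wrapKEK(kek, "correct horse")
	b, _ := wrapKEK(kek, "correct horse")
	got, err := unwrapKEK(a, "correct horse")
	fmt.Println("payloads differ:", a != b)
	fmt.Println("round trip ok:", bytes.Equal(got, kek), "err:", err)
}
```

Because the salt is random per wrap, two payloads produced from the same passphrase and KEK differ, which is the property the random-salt regression test in the commit series checks across managers.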