# Bring Your Own Storage (BYOS)

## Overview

ATCR supports "Bring Your Own Storage" (BYOS) for blob storage. Users can:
- Deploy their own hold service with an embedded PDS
- Control access via crew membership in the hold's PDS
- Keep blob data in their own S3/Storj/Minio while manifests stay in their user PDS

## Architecture

```
┌──────────────────────────────────────────┐
│         ATCR AppView (API)               │
│  - Manifests → User's PDS                │
│  - Auth & service token management       │
│  - Blob routing via XRPC                 │
│  - Profile management                    │
└────────────┬─────────────────────────────┘
             │
             │ Hold discovery priority:
             │ 1. io.atcr.sailor.profile.defaultHold (DID)
             │ 2. io.atcr.hold records (legacy)
             │ 3. AppView default_hold_did
             ▼
┌──────────────────────────────────────────┐
│                User's PDS                │
│  - io.atcr.sailor.profile (hold DID)     │
│  - io.atcr.manifest (with holdDid)       │
└────────────┬─────────────────────────────┘
             │
             │ Service token from user's PDS
             ▼
┌──────────────────────────────────────────┐
│  Hold Service (did:web:hold.example.com) │
│  ├── Embedded PDS                        │
│  │   ├── Captain record (ownership)      │
│  │   └── Crew records (access control)   │
│  ├── XRPC multipart upload endpoints     │
│  └── Storage driver (S3/Storj/etc.)      │
└──────────────────────────────────────────┘
```

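The hold discovery priority in the diagram can be read as a first-non-empty lookup. The sketch below is only an illustration of that ordering: `HoldSources`, its field names, and `ResolveHoldDID` are hypothetical, and it assumes each source has already been reduced to a DID string (legacy `io.atcr.hold` records may need extra translation in practice).

```go
package main

import (
	"errors"
	"fmt"
)

// HoldSources collects the three candidates from the diagram above.
// The struct and its field names are hypothetical.
type HoldSources struct {
	ProfileDefaultHold string // io.atcr.sailor.profile.defaultHold
	LegacyHoldRecord   string // value derived from io.atcr.hold records (legacy)
	AppViewDefault     string // AppView default_hold_did
}

// ResolveHoldDID returns the first non-empty candidate in priority order.
func ResolveHoldDID(s HoldSources) (string, error) {
	candidates := []string{s.ProfileDefaultHold, s.LegacyHoldRecord, s.AppViewDefault}
	for _, did := range candidates {
		if did != "" {
			return did, nil
		}
	}
	return "", errors.New("no hold DID available from profile, legacy records, or AppView default")
}

func main() {
	// No sailor profile entry, so the legacy record wins over the AppView default.
	did, err := ResolveHoldDID(HoldSources{
		LegacyHoldRecord: "did:web:legacy.example.com",
		AppViewDefault:   "did:web:default.example.com",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(did)
}
```
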
## Hold Service Components

Each hold is a full ATProto actor with:
- **DID**: `did:web:hold.example.com` (hold's identity)
- **Embedded PDS**: Stores captain + crew records (shared data)
- **Storage backend**: S3, Storj, Minio, filesystem, etc.
- **XRPC endpoints**: Standard ATProto + custom OCI multipart upload

### Records in Hold's PDS

**Captain record** (`io.atcr.hold.captain/self`):
```json
{
  "$type": "io.atcr.hold.captain",
  "owner": "did:plc:alice123",
  "public": false,
  "deployedAt": "2025-10-14T...",
  "region": "iad",
  "provider": "fly.io"
}
```

**Crew records** (`io.atcr.hold.crew/{rkey}`):
```json
{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:bob456",
  "role": "admin",
  "permissions": ["blob:read", "blob:write"],
  "addedAt": "2025-10-14T..."
}
```

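For readers working in Go, the two record shapes above map naturally onto structs with JSON tags. This is a sketch derived only from the JSON examples: the type names, the `HasPermission` helper, and the choice to keep timestamps as plain strings are assumptions, not the hold service's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CaptainRecord mirrors the io.atcr.hold.captain example above.
type CaptainRecord struct {
	Type       string `json:"$type"`
	Owner      string `json:"owner"`
	Public     bool   `json:"public"`
	DeployedAt string `json:"deployedAt"`
	Region     string `json:"region,omitempty"`
	Provider   string `json:"provider,omitempty"`
}

// CrewRecord mirrors the io.atcr.hold.crew example above.
type CrewRecord struct {
	Type        string   `json:"$type"`
	Member      string   `json:"member"`
	Role        string   `json:"role"`
	Permissions []string `json:"permissions"`
	AddedAt     string   `json:"addedAt,omitempty"`
}

// HasPermission reports whether the crew record lists perm.
func (c CrewRecord) HasPermission(perm string) bool {
	for _, p := range c.Permissions {
		if p == perm {
			return true
		}
	}
	return false
}

// ParseCrew decodes a crew record and rejects any other $type.
func ParseCrew(data []byte) (CrewRecord, error) {
	var c CrewRecord
	if err := json.Unmarshal(data, &c); err != nil {
		return CrewRecord{}, fmt.Errorf("decode crew record: %w", err)
	}
	if c.Type != "io.atcr.hold.crew" {
		return CrewRecord{}, fmt.Errorf("unexpected $type %q", c.Type)
	}
	return c, nil
}

func main() {
	raw := []byte(`{
	  "$type": "io.atcr.hold.crew",
	  "member": "did:plc:bob456",
	  "role": "admin",
	  "permissions": ["blob:read", "blob:write"]
	}`)
	crew, err := ParseCrew(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(crew.Member, "can write:", crew.HasPermission("blob:write"))
}
```
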
### Sailor Profile (User's PDS)

Users set their preferred hold in their sailor profile:

```json
{
  "$type": "io.atcr.sailor.profile",
  "defaultHold": "did:web:hold.example.com",
  "createdAt": "2025-10-02T...",
  "updatedAt": "2025-10-02T..."
}
```

## Deployment

### Configuration

The hold service is configured entirely via environment variables:

```bash
# Hold identity (REQUIRED)
HOLD_PUBLIC_URL=https://hold.example.com
HOLD_OWNER=did:plc:your-did-here

# Storage backend
STORAGE_DRIVER=s3
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
S3_BUCKET=my-blobs

# Access control
HOLD_PUBLIC=false           # Require authentication for reads
HOLD_ALLOW_ALL_CREW=false   # Only explicit crew members can write

# Embedded PDS
HOLD_DATABASE_PATH=/var/lib/atcr-hold/hold.db
HOLD_DATABASE_KEY_PATH=/var/lib/atcr-hold/keys
```

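As an illustration of how these variables might be consumed, the sketch below loads a subset of them and enforces the two marked REQUIRED. The `Config` struct, `LoadConfig`, and the rule that unset booleans default to `false` are assumptions made for the example; the real hold service may parse its environment differently.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// Config holds a subset of the variables listed above (hypothetical type).
type Config struct {
	PublicURL     string
	Owner         string
	StorageDriver string
	Public        bool
	AllowAllCrew  bool
	DatabasePath  string
}

// boolEnv parses a boolean variable; an unset variable is treated as false
// (an assumption for this sketch).
func boolEnv(getenv func(string) string, key string) (bool, error) {
	v := getenv(key)
	if v == "" {
		return false, nil
	}
	b, err := strconv.ParseBool(v)
	if err != nil {
		return false, fmt.Errorf("%s: %w", key, err)
	}
	return b, nil
}

// LoadConfig reads settings through getenv, which is os.Getenv in normal use
// and can be swapped for a map lookup when testing.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		PublicURL:     getenv("HOLD_PUBLIC_URL"),
		Owner:         getenv("HOLD_OWNER"),
		StorageDriver: getenv("STORAGE_DRIVER"),
		DatabasePath:  getenv("HOLD_DATABASE_PATH"),
	}
	if cfg.PublicURL == "" {
		return Config{}, errors.New("HOLD_PUBLIC_URL is required")
	}
	if cfg.Owner == "" {
		return Config{}, errors.New("HOLD_OWNER is required")
	}
	var err error
	if cfg.Public, err = boolEnv(getenv, "HOLD_PUBLIC"); err != nil {
		return Config{}, err
	}
	if cfg.AllowAllCrew, err = boolEnv(getenv, "HOLD_ALLOW_ALL_CREW"); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the validation easy to exercise without touching the process environment.
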
### Running Locally

```bash
# Build
go build -o bin/atcr-hold ./cmd/hold

# Run (with env vars or .env file)
export HOLD_PUBLIC_URL=http://localhost:8080
export HOLD_OWNER=did:plc:your-did-here
export STORAGE_DRIVER=filesystem
export STORAGE_ROOT_DIR=/tmp/atcr-hold
export HOLD_DATABASE_PATH=/tmp/atcr-hold/hold.db

./bin/atcr-hold
```

On first run, the hold service creates:
- Captain record in embedded PDS (making you the owner)
- Crew record for owner with all permissions
- DID document at `/.well-known/did.json`

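The DID document path follows the did:web method, where the identifier's host maps to an HTTPS URL ending in `/.well-known/did.json`. The sketch below shows that mapping, including the `%3A` port encoding and the path form from the did:web rules; `DIDWebDocumentURL` is a hypothetical helper, and note that did:web always resolves over HTTPS, so a plain `http://localhost:8080` setup is a local-development exception this function does not model.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// DIDWebDocumentURL maps a did:web identifier to the URL of its DID document.
func DIDWebDocumentURL(did string) (string, error) {
	const prefix = "did:web:"
	if !strings.HasPrefix(did, prefix) {
		return "", fmt.Errorf("not a did:web identifier: %q", did)
	}
	// Colons separate the host from optional path segments.
	parts := strings.Split(strings.TrimPrefix(did, prefix), ":")
	// A port, if present, is percent-encoded in the host segment as %3A.
	host, err := url.PathUnescape(parts[0])
	if err != nil || host == "" {
		return "", fmt.Errorf("invalid host in %q", did)
	}
	if len(parts) == 1 {
		return "https://" + host + "/.well-known/did.json", nil
	}
	return "https://" + host + "/" + strings.Join(parts[1:], "/") + "/did.json", nil
}

func main() {
	for _, did := range []string{"did:web:hold.example.com", "did:web:localhost%3A8080"} {
		docURL, err := DIDWebDocumentURL(did)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(did, "->", docURL)
	}
}
```
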
### Deploy to Fly.io

```bash
# Create fly.toml
cat > fly.toml <<EOF
app = "my-atcr-hold"
primary_region = "ord"

[env]
  HOLD_PUBLIC_URL = "https://my-atcr-hold.fly.dev"
  STORAGE_DRIVER = "s3"
  AWS_REGION = "us-east-1"
  S3_BUCKET = "my-blobs"
  HOLD_PUBLIC = "false"
  HOLD_ALLOW_ALL_CREW = "false"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory_mb = 256
EOF

# Create the app from fly.toml without deploying yet
fly launch --no-deploy

# Set secrets before the first deploy (HOLD_OWNER is required at startup)
fly secrets set AWS_ACCESS_KEY_ID=...
fly secrets set AWS_SECRET_ACCESS_KEY=...
fly secrets set HOLD_OWNER=did:plc:your-did-here

# Deploy
fly deploy
```

## Request Flow

### Push with BYOS

```
1. Client: docker push atcr.io/alice/myapp:latest

2. AppView resolves alice → did:plc:alice123

3. AppView discovers hold DID:
   - Check alice's sailor profile for defaultHold
   - Returns: "did:web:alice-storage.fly.dev"

4. AppView gets service token from alice's PDS:
   GET /xrpc/com.atproto.server.getServiceAuth?aud=did:web:alice-storage.fly.dev
   Response: { "token": "eyJ..." }

5. AppView initiates multipart upload to hold:
   POST https://alice-storage.fly.dev/xrpc/io.atcr.hold.initiateUpload
   Authorization: Bearer {serviceToken}
   Body: { "digest": "sha256:abc..." }
   Response: { "uploadId": "xyz" }

6. For each part:
   - AppView: POST /xrpc/io.atcr.hold.getPartUploadUrl
   - Hold validates service token, checks crew membership
   - Hold returns: { "url": "https://s3.../presigned" }
   - Client uploads directly to S3 presigned URL

7. AppView completes upload:
   POST /xrpc/io.atcr.hold.completeUpload
   Body: { "uploadId": "xyz", "digest": "sha256:abc...", "parts": [...] }

8. Manifest stored in alice's PDS:
   - holdDid: "did:web:alice-storage.fly.dev"
   - holdEndpoint: "https://alice-storage.fly.dev" (backward compat)
```

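Steps 4 and 5 of the push flow can be sketched as two request builders. This is a minimal illustration using only the endpoints, header, and body shown above; the function names and the `pds.example.com` host are hypothetical, nothing is sent over the network, and the response handling and later steps are left out.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// ServiceAuthURL builds the getServiceAuth URL for step 4, with the hold's
// DID as the audience.
func ServiceAuthURL(pdsEndpoint, holdDID string) (string, error) {
	u, err := url.Parse(pdsEndpoint)
	if err != nil {
		return "", fmt.Errorf("parse PDS endpoint: %w", err)
	}
	u.Path = "/xrpc/com.atproto.server.getServiceAuth"
	u.RawQuery = url.Values{"aud": {holdDID}}.Encode()
	return u.String(), nil
}

// NewInitiateUploadRequest builds the step 5 request to the hold.
func NewInitiateUploadRequest(holdEndpoint, serviceToken, digest string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"digest": digest})
	if err != nil {
		return nil, err
	}
	endpoint := strings.TrimRight(holdEndpoint, "/") + "/xrpc/io.atcr.hold.initiateUpload"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+serviceToken)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	authURL, err := ServiceAuthURL("https://pds.example.com", "did:web:alice-storage.fly.dev")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("GET", authURL)

	req, err := NewInitiateUploadRequest("https://alice-storage.fly.dev", "service-token-placeholder", "sha256:abc")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```

The query encoder percent-encodes the colons in the DID, which is equivalent to the unescaped form shown in the flow.
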
### Pull with BYOS

```
1. Client: docker pull atcr.io/alice/myapp:latest

2. AppView fetches manifest from alice's PDS

3. Manifest contains:
   - holdDid: "did:web:alice-storage.fly.dev"

4. AppView caches hold DID for 10 minutes (covers pull operation)

5. Client requests blob: GET /v2/alice/myapp/blobs/sha256:abc123

6. AppView uses cached hold DID from manifest

7. AppView gets service token from alice's PDS

8. AppView calls hold XRPC:
   GET /xrpc/com.atproto.sync.getBlob?did={userDID}&cid=sha256:abc123
   Authorization: Bearer {serviceToken}
   Response: { "url": "https://s3.../presigned-download" }

9. AppView redirects client to presigned S3 URL

10. Client downloads directly from S3
```

**Key insight:** Pull uses the `holdDid` stored in the manifest, ensuring blobs are fetched from where they were originally pushed.

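The 10-minute hold DID cache from step 4 could be as simple as an in-memory map with expiry times. The sketch below assumes a single AppView instance and a cache keyed by repository name; `HoldCache` and its methods are hypothetical names, not the AppView's actual implementation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// DefaultHoldTTL mirrors the 10-minute window described in the pull flow.
const DefaultHoldTTL = 10 * time.Minute

type cacheEntry struct {
	holdDID string
	expires time.Time
}

// HoldCache maps a repository key to the hold DID named in its manifest.
type HoldCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]cacheEntry
}

// NewHoldCache returns an empty cache whose entries live for ttl.
func NewHoldCache(ttl time.Duration) *HoldCache {
	return &HoldCache{ttl: ttl, entries: make(map[string]cacheEntry)}
}

// Set records the hold DID seen in a manifest for repo.
func (c *HoldCache) Set(repo, holdDID string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[repo] = cacheEntry{holdDID: holdDID, expires: time.Now().Add(c.ttl)}
}

// Get returns the cached hold DID, or false if it is missing or expired.
func (c *HoldCache) Get(repo string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[repo]
	if !ok {
		return "", false
	}
	if !time.Now().Before(e.expires) {
		delete(c.entries, repo) // expired: the caller should re-read the manifest
		return "", false
	}
	return e.holdDID, true
}

func main() {
	cache := NewHoldCache(DefaultHoldTTL)
	cache.Set("alice/myapp", "did:web:alice-storage.fly.dev")
	did, ok := cache.Get("alice/myapp")
	fmt.Println(did, ok)
}
```
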
## Access Control

### Read Access

- **Public hold** (`HOLD_PUBLIC=true`): Anonymous + authenticated users
- **Private hold** (`HOLD_PUBLIC=false`): Authenticated users with crew membership

### Write Access

- Hold owner (captain) OR crew members only
- Verified via `io.atcr.hold.crew` records in hold's embedded PDS
- Service token proves user identity (from user's PDS)

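The read and write rules above can be condensed into one decision function. This sketch makes two assumptions beyond the text: crew access is narrowed by the `blob:read` and `blob:write` permission strings from the crew record, and `HOLD_ALLOW_ALL_CREW` is left out entirely. The `Hold` type and `Allowed` method are hypothetical.

```go
package main

import "fmt"

// Action names reuse the permission strings from the crew record example.
type Action string

const (
	Read  Action = "blob:read"
	Write Action = "blob:write"
)

// Hold is a simplified in-memory view of a hold's access data.
type Hold struct {
	Public bool                // HOLD_PUBLIC
	Owner  string              // captain's DID
	Crew   map[string][]string // member DID -> permissions
}

// Allowed applies the rules above; an empty userDID means an anonymous caller.
func (h Hold) Allowed(userDID string, action Action) bool {
	if action == Read && h.Public {
		return true // public holds serve reads to everyone
	}
	if userDID == "" {
		return false // everything else needs an authenticated user
	}
	if userDID == h.Owner {
		return true // the captain can always read and write
	}
	for _, p := range h.Crew[userDID] {
		if Action(p) == action {
			return true
		}
	}
	return false
}

func main() {
	hold := Hold{
		Owner: "did:plc:alice123",
		Crew:  map[string][]string{"did:plc:bob456": {"blob:read", "blob:write"}},
	}
	fmt.Println(hold.Allowed("did:plc:bob456", Write)) // crew member with write permission
	fmt.Println(hold.Allowed("", Read))                // anonymous read on a private hold
}
```
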
### Authorization Flow

```
1. AppView gets service token from user's PDS
2. AppView sends request to hold with service token
3. Hold validates service token (checks it's from user's PDS)
4. Hold extracts user's DID from token
5. Hold checks crew records in its embedded PDS
6. If crew member found → allow, else → deny
```

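Steps 3 and 4 can be partly illustrated by decoding the token's claims. The sketch below assumes the service token is a JWT whose `iss` is the user's DID and whose `aud` is the hold's DID; it checks audience and expiry only and deliberately does NOT verify the signature, which a real hold must do against the signing key in the issuer's DID document. `Claims`, `UserDIDFromToken`, and the demo-only `unsignedToken` helper are hypothetical.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims holds the JWT fields this sketch inspects.
type Claims struct {
	Issuer   string `json:"iss"` // user's DID
	Audience string `json:"aud"` // hold's DID
	Expires  int64  `json:"exp"` // Unix seconds
}

// UserDIDFromToken decodes the payload and applies audience and expiry checks.
// Signature verification is intentionally omitted from this sketch.
func UserDIDFromToken(token, holdDID string, nowUnix int64) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("token is not a three-part JWT")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return "", fmt.Errorf("decode payload: %w", err)
	}
	var c Claims
	if err := json.Unmarshal(payload, &c); err != nil {
		return "", fmt.Errorf("parse claims: %w", err)
	}
	switch {
	case c.Audience != holdDID:
		return "", errors.New("token audience is not this hold")
	case c.Expires <= nowUnix:
		return "", errors.New("token has expired")
	case !strings.HasPrefix(c.Issuer, "did:"):
		return "", errors.New("issuer is not a DID")
	}
	return c.Issuer, nil
}

// unsignedToken builds a demo token with a placeholder signature.
func unsignedToken(c Claims) string {
	enc := base64.RawURLEncoding
	payload, _ := json.Marshal(c)
	return enc.EncodeToString([]byte(`{"alg":"none"}`)) + "." + enc.EncodeToString(payload) + ".sig"
}

func main() {
	now := time.Now().Unix()
	token := unsignedToken(Claims{
		Issuer:   "did:plc:alice123",
		Audience: "did:web:hold.example.com",
		Expires:  now + 60,
	})
	did, err := UserDIDFromToken(token, "did:web:hold.example.com", now)
	fmt.Println(did, err)
}
```
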
## Managing Crew Members

### Add Crew Member

Use an ATProto client to create a crew record in the hold's PDS:

```bash
# Via XRPC (if hold supports it)
curl -X POST https://hold.example.com/xrpc/io.atcr.hold.requestCrew \
  -H "Authorization: Bearer ${USER_OAUTH_TOKEN}"

# Or manually via captain's OAuth to hold's PDS
atproto put-record \
  --pds https://hold.example.com \
  --collection io.atcr.hold.crew \
  --rkey "{memberDID}" \
  --value '{
    "$type": "io.atcr.hold.crew",
    "member": "did:plc:bob456",
    "role": "admin",
    "permissions": ["blob:read", "blob:write"]
  }'
```

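The manual route amounts to writing a record into the hold's repository. The sketch below builds a JSON body in the shape of the standard `com.atproto.repo.putRecord` input (`repo`, `collection`, `rkey`, `record`), using the member DID as the record key as in the command above. Whether a given hold accepts this call, and with which credentials, is an assumption; `NewCrewPutRecord` is a hypothetical helper and nothing is sent.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// CrewRecord mirrors the io.atcr.hold.crew record shown earlier.
type CrewRecord struct {
	Type        string   `json:"$type"`
	Member      string   `json:"member"`
	Role        string   `json:"role"`
	Permissions []string `json:"permissions"`
	AddedAt     string   `json:"addedAt"`
}

// PutRecordInput follows the input fields of com.atproto.repo.putRecord.
type PutRecordInput struct {
	Repo       string     `json:"repo"`
	Collection string     `json:"collection"`
	RKey       string     `json:"rkey"`
	Record     CrewRecord `json:"record"`
}

// NewCrewPutRecord prepares a crew record keyed by the member's DID.
func NewCrewPutRecord(holdDID, memberDID, role string, perms []string) PutRecordInput {
	return PutRecordInput{
		Repo:       holdDID,
		Collection: "io.atcr.hold.crew",
		RKey:       memberDID,
		Record: CrewRecord{
			Type:        "io.atcr.hold.crew",
			Member:      memberDID,
			Role:        role,
			Permissions: perms,
			AddedAt:     time.Now().UTC().Format(time.RFC3339),
		},
	}
}

func main() {
	input := NewCrewPutRecord("did:web:hold.example.com", "did:plc:bob456", "admin",
		[]string{"blob:read", "blob:write"})
	body, err := json.MarshalIndent(input, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// This body would be POSTed to /xrpc/com.atproto.repo.putRecord on the hold.
	fmt.Println(string(body))
}
```
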
### Remove Crew Member

```bash
atproto delete-record \
  --pds https://hold.example.com \
  --collection io.atcr.hold.crew \
  --rkey "{memberDID}"
```

## Storage Drivers

The hold service supports all Distribution storage drivers:
- **S3** - AWS S3, Minio, Storj (via S3 gateway)
- **Filesystem** - Local disk (for testing)
- **Azure** - Azure Blob Storage
- **GCS** - Google Cloud Storage
- **Swift** - OpenStack Swift

## Example: Team Hold

```bash
# 1. Deploy hold service
#    (shown as exports for brevity; on Fly.io these belong in fly.toml [env]
#    and `fly secrets set`, as in "Deploy to Fly.io" above)
export HOLD_PUBLIC_URL=https://team-hold.fly.dev
export HOLD_OWNER=did:plc:admin
export HOLD_PUBLIC=false  # Private
export STORAGE_DRIVER=s3
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export S3_BUCKET=team-blobs

fly deploy

# 2. Hold auto-creates captain + crew records on first run

# 3. Admin adds team members via hold's PDS (requires OAuth)
# (TODO: Implement crew management UI/CLI)

# 4. Team members set their sailor profile:
atproto put-record \
  --collection io.atcr.sailor.profile \
  --rkey "self" \
  --value '{
    "$type": "io.atcr.sailor.profile",
    "defaultHold": "did:web:team-hold.fly.dev"
  }'

# 5. Team members can now push/pull using team hold
```

## Limitations

### Current IAM Challenges

See [EMBEDDED_PDS.md](./EMBEDDED_PDS.md#iam-challenges) for a detailed discussion.

**Known issues:**
1. **RPC permission format**: Service tokens don't work with IP-based DIDs in local dev
2. **Dynamic hold discovery**: AppView can't dynamically OAuth arbitrary holds from sailor profiles
3. **Manual profile management**: No UI for updating sailor profile (must use ATProto client)

**Workaround:** Use hostname-based DIDs (`did:web:hold.example.com`) and public holds for now.

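One way to catch the first issue early is to flag hold DIDs whose host is a bare IP address before they are used. This is a sketch of such a guard, not something the AppView is known to do; `UsesIPHost` is a hypothetical helper and it only considers `did:web` identifiers.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// UsesIPHost reports whether a did:web identifier points at an IP address
// rather than a hostname.
func UsesIPHost(did string) bool {
	host := strings.TrimPrefix(did, "did:web:")
	host, _, _ = strings.Cut(host, ":") // drop any path segments after the host
	if unescaped, err := url.PathUnescape(host); err == nil {
		host = unescaped // a port is encoded as %3A in did:web
	}
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	return net.ParseIP(host) != nil
}

func main() {
	for _, did := range []string{"did:web:hold.example.com", "did:web:127.0.0.1%3A8080"} {
		fmt.Println(did, "uses IP host:", UsesIPHost(did))
	}
}
```
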
## Future Improvements

1. **Crew management UI** - Web interface for adding/removing crew members
2. **Dynamic OAuth** - Support for arbitrary BYOS holds without pre-configuration
3. **Hold migration** - Tools for moving blobs between holds
4. **Storage analytics** - Track usage per user/repository
5. **Distributed cache** - Redis for hold DID cache in multi-instance deployments

## References

- [EMBEDDED_PDS.md](./EMBEDDED_PDS.md) - Embedded PDS architecture and IAM details
- [ATProto Lexicon Spec](https://atproto.com/specs/lexicon)
- [Distribution Storage Drivers](https://distribution.github.io/distribution/storage-drivers/)
- [S3 Presigned URLs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/PresignedUrlUploadObject.html)