# Bring Your Own Storage (BYOS)
## Overview
ATCR supports "Bring Your Own Storage" (BYOS) for blob storage. This allows users to:
- Deploy their own storage service backed by S3/Storj/Minio/filesystem
- Control who can use their storage (public or private)
- Keep blob data in their own infrastructure while manifests remain in their ATProto PDS
## Architecture
```
┌─────────────────────────────────────────────┐
│ ATCR AppView (API) │
│ - Manifests → ATProto PDS │
│ - Auth & token validation │
│ - Blob routing (issues redirects) │
│ - Profile management │
└─────────────────┬───────────────────────────┘
│ Hold discovery priority:
│ 1. io.atcr.sailor.profile.defaultHold
│ 2. io.atcr.hold records
│ 3. AppView default_storage_endpoint
┌─────────────────────────────────────────────┐
│ User's PDS │
│ - io.atcr.sailor.profile (hold preference) │
│ - io.atcr.hold records (own holds) │
│ - io.atcr.manifest records (with holdEP) │
└─────────────────┬───────────────────────────┘
│ Redirects to hold
┌─────────────────────────────────────────────┐
│ Storage Service (Hold) │
│ - Blob storage (S3/Storj/Minio/filesystem) │
│ - Presigned URL generation │
│ - Authorization (DID-based) │
└─────────────────────────────────────────────┘
```
## ATProto Records
### io.atcr.sailor.profile
**NEW:** User profile for hold selection preferences. Created automatically on first authentication.
```json
{
"$type": "io.atcr.sailor.profile",
"defaultHold": "https://team-hold.example.com",
"createdAt": "2025-10-02T12:00:00Z",
"updatedAt": "2025-10-02T12:00:00Z"
}
```
**Record key:** Always `"self"` (only one profile per user)
**Behavior:**
- Created automatically when the user first authenticates (OAuth or Basic Auth)
- If the AppView has `default_storage_endpoint` configured, the profile gets that value as its initial `defaultHold`
- The user can update `defaultHold` to join a shared hold or to use their own hold
- Set `defaultHold` to `null` to clear the preference; discovery then falls back to the user's own `io.atcr.hold` records and, after that, to the AppView default
**This solves the multi-hold problem:** Users who are crew members of multiple holds can explicitly choose which one to use via their profile.
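The discovery priority can be sketched in Go as follows. This is a minimal illustration, not the AppView's actual code: `SailorProfile`, `resolveHold`, `ownHolds`, and `appViewDefault` are hypothetical names, and picking the first of the user's own holds is an assumption, since the order among several `io.atcr.hold` records is not specified here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// SailorProfile mirrors the io.atcr.sailor.profile record shown above.
// DefaultHold is a pointer so that JSON null (or an absent field) decodes
// to nil and can be told apart from a real URL.
type SailorProfile struct {
	Type        string  `json:"$type"`
	DefaultHold *string `json:"defaultHold"`
}

// resolveHold applies the documented discovery priority. ownHolds (endpoints
// from the user's io.atcr.hold records) and appViewDefault are hypothetical
// inputs; picking the first own hold is an assumption of this sketch.
func resolveHold(p *SailorProfile, ownHolds []string, appViewDefault string) (string, error) {
	if p != nil && p.DefaultHold != nil && *p.DefaultHold != "" {
		return *p.DefaultHold, nil // 1. profile preference
	}
	if len(ownHolds) > 0 {
		return ownHolds[0], nil // 2. user's own hold
	}
	if appViewDefault != "" {
		return appViewDefault, nil // 3. AppView default_storage_endpoint
	}
	return "", errors.New("no hold endpoint available")
}

func main() {
	// A profile whose defaultHold was explicitly set to null.
	raw := []byte(`{"$type":"io.atcr.sailor.profile","defaultHold":null}`)
	var p SailorProfile
	if err := json.Unmarshal(raw, &p); err != nil {
		panic(err)
	}
	hold, err := resolveHold(&p, []string{"https://alice-storage.example.com"}, "https://storage.atcr.io")
	if err != nil {
		panic(err)
	}
	fmt.Println(hold)
}
```

With `defaultHold` set to `null`, the sketch skips the profile and selects the user's own hold; with no own holds it would return the AppView default.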
### io.atcr.hold
Users create a hold record in their PDS to configure their own storage:
```json
{
"$type": "io.atcr.hold",
"endpoint": "https://alice-storage.example.com",
"owner": "did:plc:alice123",
"public": false,
"createdAt": "2025-10-01T12:00:00Z"
}
```
### io.atcr.hold.crew
Hold owners can add crew members (for shared storage):
```json
{
"$type": "io.atcr.hold.crew",
"hold": "at://did:plc:alice/io.atcr.hold/my-storage",
"member": "did:plc:bob456",
"role": "write",
"addedAt": "2025-10-01T12:00:00Z"
}
```
**Note:** Crew records are stored in the **hold owner's PDS**, not the crew member's PDS. This ensures the hold owner maintains full control over access.
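As a rough sketch of how crew records might be evaluated for a write check, the Go snippet below filters a list of crew records fetched from the owner's PDS. `CrewRecord` and `canWrite` are hypothetical names, and treating `"write"` as the only role that grants write access is an assumption; the set of valid roles is not listed here.

```go
package main

import "fmt"

// CrewRecord mirrors the io.atcr.hold.crew record shown above.
type CrewRecord struct {
	Hold   string `json:"hold"`
	Member string `json:"member"`
	Role   string `json:"role"`
}

// canWrite reports whether did may write to the hold identified by holdURI.
// The owner is always allowed; anyone else needs a crew record for that
// hold with role "write" (an assumption of this sketch).
func canWrite(did, ownerDID, holdURI string, crew []CrewRecord) bool {
	if did == ownerDID {
		return true
	}
	for _, c := range crew {
		if c.Hold == holdURI && c.Member == did && c.Role == "write" {
			return true
		}
	}
	return false
}

func main() {
	holdURI := "at://did:plc:alice/io.atcr.hold/my-storage"
	crew := []CrewRecord{
		{Hold: holdURI, Member: "did:plc:bob456", Role: "write"},
	}
	fmt.Println(canWrite("did:plc:bob456", "did:plc:alice", holdURI, crew))
	fmt.Println(canWrite("did:plc:carol789", "did:plc:alice", holdURI, crew))
}
```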
## Storage Service
### Deployment
The storage service is a lightweight HTTP server that:
1. Accepts presigned URL requests
2. Verifies DID authorization
3. Generates presigned URLs for S3/Storj/etc
4. Returns URLs to AppView for client redirect
### Configuration
The hold service is configured entirely via environment variables. See `.env.example` for all options.
**Required environment variables:**
```bash
# Hold service public URL (REQUIRED)
HOLD_PUBLIC_URL=https://storage.example.com
# Storage driver type
STORAGE_DRIVER=s3
# For S3/Minio
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
S3_BUCKET=my-blobs
# For Storj (optional - custom S3 endpoint)
# S3_ENDPOINT=https://gateway.storjshare.io
# For filesystem storage
# STORAGE_DRIVER=filesystem
# STORAGE_ROOT_DIR=/var/lib/atcr-storage
```
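A minimal sketch of how these variables might be read and validated at startup is shown below. The `Config` struct and `loadConfig` function are hypothetical, and the sketch only covers the `s3` and `filesystem` drivers documented in this section; it is not the hold service's actual loader.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config holds the settings documented above. The struct itself is a
// hypothetical shape for this sketch.
type Config struct {
	PublicURL  string // HOLD_PUBLIC_URL (required)
	Driver     string // STORAGE_DRIVER
	Bucket     string // S3_BUCKET
	S3Endpoint string // S3_ENDPOINT (optional, e.g. a Storj gateway)
	RootDir    string // STORAGE_ROOT_DIR
}

// loadConfig reads settings through getenv so it can be exercised with a
// fake environment as well as os.Getenv.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		PublicURL:  getenv("HOLD_PUBLIC_URL"),
		Driver:     getenv("STORAGE_DRIVER"),
		Bucket:     getenv("S3_BUCKET"),
		S3Endpoint: getenv("S3_ENDPOINT"),
		RootDir:    getenv("STORAGE_ROOT_DIR"),
	}
	if cfg.PublicURL == "" {
		return cfg, errors.New("HOLD_PUBLIC_URL is required")
	}
	switch cfg.Driver {
	case "s3":
		if cfg.Bucket == "" {
			return cfg, errors.New("S3_BUCKET is required for the s3 driver")
		}
	case "filesystem":
		if cfg.RootDir == "" {
			return cfg, errors.New("STORAGE_ROOT_DIR is required for the filesystem driver")
		}
	default:
		// Other drivers exist but are outside the scope of this sketch.
		return cfg, fmt.Errorf("STORAGE_DRIVER %q is not covered by this sketch", cfg.Driver)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Failing fast on a missing `HOLD_PUBLIC_URL` mirrors the "REQUIRED" note in the example above.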
**Authorization:**
ATCR follows ATProto's public-by-default model with gated anonymous access:
**Read Access:**
- **Public hold** (`HOLD_PUBLIC=true`): Anonymous reads allowed (no authentication)
- **Private hold** (`HOLD_PUBLIC=false`): Requires authentication (any ATCR user with an `io.atcr.sailor.profile` record)
**Write Access:**
- Always requires authentication
- Must be hold owner OR crew member (verified via `io.atcr.hold.crew` records in owner's PDS)
**Key Points:**
- "Private" just means "no anonymous access" - not "limited user access"
- Any authenticated ATCR user can read from private holds
- Crew membership only controls WRITE access, not READ access
- This aligns with ATProto's public records model (no private PDS records yet)
### Running
```bash
# Build
go build -o atcr-hold ./cmd/hold
# Set environment variables (or use .env file)
export HOLD_PUBLIC_URL=https://storage.example.com
export STORAGE_DRIVER=s3
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1
export S3_BUCKET=my-blobs
# Run
./atcr-hold
```
**Registration (required):**
The hold service must be registered in a PDS to be discoverable by the AppView.
**Standard registration workflow:**
1. Set `HOLD_OWNER` to your DID:
```bash
export HOLD_OWNER=did:plc:your-did-here
```
2. Start the hold service:
```bash
./atcr-hold
```
3. **Check the logs** for the OAuth authorization URL:
```
================================================================================
OAUTH AUTHORIZATION REQUIRED
================================================================================
Please visit this URL to authorize the hold service:
https://bsky.app/authorize?client_id=...
Waiting for authorization...
================================================================================
```
4. Visit the URL in your browser and authorize
5. The hold service will:
- Exchange the authorization code for a token
- Create `io.atcr.hold` record in your PDS
- Create `io.atcr.hold.crew` record (making you the owner)
- Save registration state
6. On subsequent runs, the service checks if already registered and skips OAuth
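The "skip OAuth when already registered" check in step 6 could look roughly like the Go sketch below. How the real service persists registration state is not described here, so the JSON state file, its path, and the `registrationState` fields are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// registrationState is a hypothetical on-disk record of a completed
// registration; the real service may persist different fields.
type registrationState struct {
	OwnerDID string `json:"ownerDid"`
	HoldURI  string `json:"holdUri"`
}

// loadState returns nil (and no error) when the state file does not exist.
func loadState(path string) (*registrationState, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	var s registrationState
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return &s, nil
}

// needsOAuth reports whether the OAuth flow has to run: there is no saved
// state, the state is incomplete, or it belongs to a different owner.
func needsOAuth(s *registrationState, ownerDID string) bool {
	return s == nil || s.HoldURI == "" || s.OwnerDID != ownerDID
}

func main() {
	state, err := loadState("hold-registration.json") // hypothetical path
	if err != nil {
		fmt.Println("cannot read registration state:", err)
		return
	}
	if needsOAuth(state, os.Getenv("HOLD_OWNER")) {
		fmt.Println("not registered: start OAuth flow")
	} else {
		fmt.Println("already registered: skipping OAuth")
	}
}
```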
**Alternative methods:**
- **Manual API registration**: Call `POST /register` with your own OAuth token
- **Completely manual**: Create PDS records yourself using any ATProto client
### Deploy to Fly.io
```bash
# Create fly.toml
cat > fly.toml <<EOF
app = "my-atcr-hold"
primary_region = "ord"
[env]
HOLD_PUBLIC_URL = "https://my-atcr-hold.fly.dev"
HOLD_SERVER_ADDR = ":8080"
STORAGE_DRIVER = "s3"
AWS_REGION = "us-east-1"
S3_BUCKET = "my-blobs"
HOLD_PUBLIC = "false"
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 0
[[vm]]
cpu_kind = "shared"
cpus = 1
memory_mb = 256
EOF
# Deploy
fly launch
fly deploy
# Set secrets
fly secrets set AWS_ACCESS_KEY_ID=...
fly secrets set AWS_SECRET_ACCESS_KEY=...
fly secrets set HOLD_OWNER=did:plc:your-did-here
# Check logs for OAuth URL on first run
fly logs
# Visit the OAuth URL shown in logs to authorize
# The hold service will register itself in your PDS
```
## Request Flow
### Push with BYOS
1. **Docker push** `atcr.io/alice/myapp:latest`
2. **AppView** resolves `alice` → `did:plc:alice123`
3. **AppView** discovers hold via priority logic:
- Check alice's `io.atcr.sailor.profile` for `defaultHold`
- If not set, check alice's `io.atcr.hold` records
- Fall back to AppView's `default_storage_endpoint`
4. **Found:** `alice.profile.defaultHold = "https://team-hold.example.com"`
5. **AppView** → team-hold: POST `/put-presigned-url`
```json
{
"did": "did:plc:alice123",
"digest": "sha256:abc123...",
"size": 1048576
}
```
6. **Hold service**:
- Verifies alice is authorized (checks crew records)
- Generates S3 presigned upload URL (15min expiry)
- Returns: `{"url": "https://s3.../blob?signature=..."}`
7. **AppView** → Docker: `307 Temporary Redirect` to presigned URL
8. **Docker** → S3: PUT blob directly (no proxy)
9. **Manifest** stored in alice's PDS with `holdEndpoint: "https://team-hold.example.com"`
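Steps 5 to 7 of the push flow can be sketched in Go using only the standard library. The request and response shapes follow the JSON shown above; everything else is an assumption for illustration: the `fakeHold` test server stands in for a real hold, the hold is assumed to answer with `200 OK`, and the registry upload path used in `simulatePush` is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type presignRequest struct {
	DID    string `json:"did"`
	Digest string `json:"digest"`
	Size   int64  `json:"size"`
}

type presignResponse struct {
	URL string `json:"url"`
}

// requestUploadURL asks the hold for a presigned upload URL (steps 5 and 6).
func requestUploadURL(client *http.Client, holdURL string, req presignRequest) (string, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	resp, err := client.Post(holdURL+"/put-presigned-url", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK { // assumed success status
		return "", fmt.Errorf("hold returned %s", resp.Status)
	}
	var out presignResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	if out.URL == "" {
		return "", errors.New("hold response has no url")
	}
	return out.URL, nil
}

// fakeHold is a stand-in hold service that returns a fake presigned URL.
func fakeHold() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/put-presigned-url" {
			http.NotFound(w, r)
			return
		}
		var req presignRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(presignResponse{URL: "https://s3.example.com/" + req.Digest + "?signature=fake"})
	}))
}

// simulatePush runs steps 5-7 against the fake hold and returns the status
// code and Location header the AppView would send back to Docker.
func simulatePush() (int, string, error) {
	hold := fakeHold()
	defer hold.Close()

	uploadURL, err := requestUploadURL(hold.Client(), hold.URL, presignRequest{
		DID: "did:plc:alice123", Digest: "sha256:abc123", Size: 1048576,
	})
	if err != nil {
		return 0, "", err
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPut, "/v2/alice/myapp/blobs/uploads/", nil) // hypothetical path
	http.Redirect(rec, req, uploadURL, http.StatusTemporaryRedirect)
	return rec.Code, rec.Header().Get("Location"), nil
}

func main() {
	status, location, err := simulatePush()
	if err != nil {
		panic(err)
	}
	fmt.Println(status, location)
}
```

A 307 is used rather than a 302 so that the client repeats the same method and body against the presigned URL.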
### Pull with BYOS
1. **Docker pull** `atcr.io/alice/myapp:latest`
2. **AppView** fetches manifest from alice's PDS
3. **Manifest** contains `holdEndpoint: "https://team-hold.example.com"`
4. **AppView** caches: `(alice's DID, "myapp") → "https://team-hold.example.com"` (10min TTL)
5. **Docker** requests blobs: GET `/v2/alice/myapp/blobs/sha256:abc123`
6. **AppView** uses **cached hold from manifest** (not re-discovered)
7. **AppView** → team-hold: POST `/get-presigned-url`
8. **Hold service** returns presigned download URL
9. **AppView** → Docker: `307 Temporary Redirect`
10. **Docker** → S3: GET blob directly
**Key insight:** Pull uses the historical `holdEndpoint` from the manifest, ensuring blobs are fetched from where they were originally pushed, even if alice later changes her profile's `defaultHold`.
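The hold endpoint cache from step 4 can be illustrated with a small in-memory TTL map keyed by DID and repository. This is a sketch under assumptions: `HoldCache` and its methods are hypothetical names, and expired entries are simply dropped on lookup.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultHoldTTL matches the 10 minute TTL mentioned in the pull flow.
const defaultHoldTTL = 10 * time.Minute

type holdKey struct{ did, repo string }

type holdEntry struct {
	endpoint string
	expires  time.Time
}

// HoldCache is an in-memory, per-process cache of (DID, repo) -> hold
// endpoint. The type and method names are illustrative only.
type HoldCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[holdKey]holdEntry
}

func NewHoldCache(ttl time.Duration) *HoldCache {
	return &HoldCache{ttl: ttl, entries: make(map[holdKey]holdEntry)}
}

// Set records the hold endpoint taken from a manifest.
func (c *HoldCache) Set(did, repo, endpoint string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[holdKey{did, repo}] = holdEntry{endpoint, time.Now().Add(c.ttl)}
}

// Get returns the cached endpoint, or false if it is missing or expired.
func (c *HoldCache) Get(did, repo string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	k := holdKey{did, repo}
	e, ok := c.entries[k]
	if !ok {
		return "", false
	}
	if !time.Now().Before(e.expires) {
		delete(c.entries, k) // expired: force a fresh manifest lookup
		return "", false
	}
	return e.endpoint, true
}

func main() {
	cache := NewHoldCache(defaultHoldTTL)
	cache.Set("did:plc:alice123", "myapp", "https://team-hold.example.com")
	endpoint, ok := cache.Get("did:plc:alice123", "myapp")
	fmt.Println(endpoint, ok)
}
```

On a cache miss the AppView would re-read the manifest from the PDS rather than re-running hold discovery, which keeps pulls pinned to the hold recorded at push time.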
## Default Registry
The AppView can run its own storage service as the default:
### AppView config
```yaml
middleware:
- name: registry
options:
atproto-resolver:
default_storage_endpoint: https://storage.atcr.io
```
### Default hold service config
```bash
# Accept any authenticated DID
HOLD_PUBLIC=false # Requires authentication
# Or allow public reads (uncomment to use instead)
# HOLD_PUBLIC=true # Public reads, auth required for writes
```
This provides free-tier shared storage for users who don't want to deploy their own.
## Storage Drivers Supported
The storage service uses distribution's storage drivers:
- **S3** - AWS S3, Minio, Storj (via S3 gateway)
- **Filesystem** - Local disk (for testing)
- **Azure** - Azure Blob Storage
- **GCS** - Google Cloud Storage
- **Swift** - OpenStack Swift
- **OSS** - Alibaba Cloud OSS
## Quotas
Quotas are NOT implemented in the storage service. Instead, use:
- **S3**: Bucket policies, lifecycle rules
- **Storj**: Project limits in Storj dashboard
- **Minio**: Quota enforcement features
- **Filesystem**: Disk quotas at OS level
## Security
### Authorization
Authorization is based on ATProto's public-by-default model:
**Read Authorization:**
- **Public hold** (`public: true` in hold record):
- Anonymous users: ✅ Allowed
- Any authenticated user: ✅ Allowed
- **Private hold** (`public: false` in hold record):
- Anonymous users: ❌ 401 Unauthorized
- Any authenticated ATCR user: ✅ Allowed (no crew membership required)
**Write Authorization:**
- Anonymous users: ❌ 401 Unauthorized
- Authenticated non-crew: ❌ 403 Forbidden
- Authenticated crew member: ✅ Allowed
- Hold owner: ✅ Allowed
**Implementation:**
- Hold service queries owner's PDS for `io.atcr.hold.crew` records
- Crew records are public ATProto records (read without authentication)
- "Private" holds only gate anonymous access, not authenticated user access
- This reflects ATProto's current limitation: no private PDS records
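The read and write rules above reduce to a small decision function. The Go sketch below encodes them directly; `Caller`, `authorizeRead`, and `authorizeWrite` are hypothetical names, and returning `200` for the allowed cases is an assumption made so the result can be compared with the 401 and 403 cases listed above.

```go
package main

import (
	"fmt"
	"net/http"
)

// Caller describes what the hold service knows about a request's identity.
// The struct is a hypothetical shape for this sketch.
type Caller struct {
	Authenticated bool
	IsOwner       bool
	IsCrew        bool
}

// authorizeRead: public holds allow everyone; private holds allow any
// authenticated user and reject anonymous requests with 401.
func authorizeRead(holdPublic bool, c Caller) int {
	if holdPublic || c.Authenticated {
		return http.StatusOK
	}
	return http.StatusUnauthorized
}

// authorizeWrite: anonymous requests get 401, authenticated users who are
// neither owner nor crew get 403, owner and crew are allowed.
func authorizeWrite(c Caller) int {
	switch {
	case !c.Authenticated:
		return http.StatusUnauthorized
	case c.IsOwner || c.IsCrew:
		return http.StatusOK
	default:
		return http.StatusForbidden
	}
}

func main() {
	anon := Caller{}
	user := Caller{Authenticated: true}
	crew := Caller{Authenticated: true, IsCrew: true}

	fmt.Println("private read, anonymous:", authorizeRead(false, anon))
	fmt.Println("private read, user:     ", authorizeRead(false, user))
	fmt.Println("write, user (non-crew): ", authorizeWrite(user))
	fmt.Println("write, crew:            ", authorizeWrite(crew))
}
```

Note that the hold's public flag does not appear in `authorizeWrite`: writes are gated the same way on public and private holds.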
### Presigned URLs
- 15 minute expiry
- Client uploads/downloads directly to storage
- No data flows through AppView or hold service
### Private Holds
"Private" holds gate anonymous access while remaining accessible to authenticated users:
**What "Private" Means:**
- `HOLD_PUBLIC=false` prevents anonymous reads
- Any authenticated ATCR user can still read
- This aligns with ATProto's public records model
**Write Control:**
- Only hold owner and crew members can write
- Crew membership managed via `io.atcr.hold.crew` records in owner's PDS
- Removing crew member immediately revokes write access
**Future: True Private Access**
- When ATProto adds private PDS records, ATCR can support truly private repos
- For now, "private" = "authenticated-only access"
## Example: Personal Storage
Alice wants to use her own Storj account:
1. **Set environment variables**:
```bash
export HOLD_PUBLIC_URL=https://alice-storage.fly.dev
export HOLD_OWNER=did:plc:alice123
export STORAGE_DRIVER=s3
export AWS_ACCESS_KEY_ID=your_storj_access_key
export AWS_SECRET_ACCESS_KEY=your_storj_secret_key
export S3_ENDPOINT=https://gateway.storjshare.io
export S3_BUCKET=alice-blobs
```
2. **Deploy hold service** to Fly.io - auto-registration creates hold + crew record
3. **Push images** - AppView automatically routes to her storage
## Example: Team Hold
A company wants shared storage for their team:
1. **Deploy hold service** with S3 credentials and auto-registration:
```bash
export HOLD_PUBLIC_URL=https://company-hold.fly.dev
export HOLD_OWNER=did:plc:admin
export HOLD_PUBLIC=false
export STORAGE_DRIVER=s3
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export S3_BUCKET=company-blobs
```
2. **Hold service auto-registers** on first run, creating:
- Hold record in admin's PDS
- Crew record making admin the owner
3. **Admin adds crew members** via ATProto client or manually:
```bash
# Using atproto client
atproto put-record \
--collection io.atcr.hold.crew \
--rkey "company-did:plc:engineer1" \
--value '{
"$type": "io.atcr.hold.crew",
"hold": "at://did:plc:admin/io.atcr.hold/company",
"member": "did:plc:engineer1",
"role": "write"
}'
```
4. **Team members set their profile** to use the shared hold:
```bash
# Engineer updates their sailor profile
atproto put-record \
--collection io.atcr.sailor.profile \
--rkey "self" \
--value '{
"$type": "io.atcr.sailor.profile",
"defaultHold": "https://company-hold.fly.dev"
}'
```
5. **Hold service queries PDS** for crew records to authorize writes
6. **Engineers push/pull** using `atcr.io/engineer1/myapp` - blobs go to company hold
## Limitations
1. **No resume/partial uploads** - Storage service doesn't track upload state
2. **No advanced features** - Just basic put/get, no deduplication logic
3. **In-memory cache** - The hold endpoint cache is in-memory and per-instance; production deployments with multiple instances would need a shared cache such as Redis
4. **Manual profile updates** - No UI for updating sailor profile (must use ATProto client)
## Performance Optimization: S3 Presigned URLs
**Status:** Planned implementation (see [PRESIGNED_URLS.md](./PRESIGNED_URLS.md))
Currently, hold services act as proxies for blob data. With presigned URLs:
- **Downloads:** Docker → S3 direct (via 307 redirect)
- **Uploads:** Docker → AppView → S3 (via presigned URL)
- **Hold service bandwidth:** Reduced by 99.98% (only orchestration)
**Benefits:**
- Hold services can run on minimal infrastructure ($5/month instances)
- Direct S3 transfers at maximum speed
- Scales to arbitrarily large images
- Works with Storj, MinIO, Backblaze B2, Cloudflare R2
See [PRESIGNED_URLS.md](./PRESIGNED_URLS.md) for complete technical details and implementation guide.
## Future Improvements
1. **S3 Presigned URLs** - Implement direct S3 URLs (see [PRESIGNED_URLS.md](./PRESIGNED_URLS.md))
2. **Automatic failover** - Multiple storage endpoints, fallback to default
3. **Storage analytics** - Track usage per DID
4. **Quota integration** - Optional quota tracking in storage service
5. **Profile management UI** - Web interface for users to manage their sailor profile
6. **Distributed cache** - Redis/Memcached for hold endpoint cache in multi-instance deployments
## Comparison to Default Storage
| Feature | Default (Shared S3) | BYOS |
|---------|---------------------|------|
| Setup | None required | Deploy storage service |
| Cost | Free (with quota) | User pays for S3/Storj |
| Control | Limited | Full control |
| Performance | Shared | Dedicated |
| Quotas | Enforced by AppView | User managed |
| Privacy | Blobs in shared bucket | Blobs in user's bucket |
## References
- [ATProto Lexicon Spec](https://atproto.com/specs/lexicon)
- [Distribution Storage Drivers](https://distribution.github.io/distribution/storage-drivers/)
- [S3 Presigned URLs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/PresignedUrlUploadObject.html)
- [Storj Documentation](https://docs.storj.io/)