Bring Your Own Storage (BYOS)

Overview

ATCR supports "Bring Your Own Storage" (BYOS) for blob storage. This allows users to:

  • Deploy their own storage service backed by S3/Storj/Minio/filesystem
  • Control who can use their storage (public or private)
  • Keep blob data in their own infrastructure while manifests remain in their ATProto PDS

Architecture

┌─────────────────────────────────────────────┐
│ ATCR AppView (API)                          │
│ - Manifests → ATProto PDS                   │
│ - Auth & token validation                   │
│ - Blob routing (issues redirects)           │
│ - Profile management                        │
└─────────────────┬───────────────────────────┘
                  │
                  │ Hold discovery priority:
                  │ 1. io.atcr.sailor.profile.defaultHold
                  │ 2. io.atcr.hold records
                  │ 3. AppView default_storage_endpoint
                  ▼
┌─────────────────────────────────────────────┐
│ User's PDS                                  │
│ - io.atcr.sailor.profile (hold preference)  │
│ - io.atcr.hold records (own holds)          │
│ - io.atcr.manifest records (with holdEP)    │
└─────────────────┬───────────────────────────┘
                  │
                  │ Redirects to hold
                  ▼
┌─────────────────────────────────────────────┐
│ Storage Service (Hold)                      │
│ - Blob storage (S3/Storj/Minio/filesystem)  │
│ - Presigned URL generation                  │
│ - Authorization (DID-based)                 │
└─────────────────────────────────────────────┘

ATProto Records

io.atcr.sailor.profile

NEW: User profile for hold selection preferences. Created automatically on first authentication.

{
  "$type": "io.atcr.sailor.profile",
  "defaultHold": "https://team-hold.example.com",
  "createdAt": "2025-10-02T12:00:00Z",
  "updatedAt": "2025-10-02T12:00:00Z"
}

Record key: Always "self" (only one profile per user)

Behavior:

  • Created automatically when user first authenticates (OAuth or Basic Auth)
  • If AppView has default_storage_endpoint, profile gets that as initial defaultHold
  • User can update to join shared holds or use their own hold
  • Set defaultHold to null to skip the profile preference; discovery then falls through to the user's own io.atcr.hold records, and finally to the AppView default

This solves the multi-hold problem: Users who are crew members of multiple holds can explicitly choose which one to use via their profile.
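
The selection order described here can be sketched in Go. This is a minimal illustration, not ATCR's actual code: the names SailorProfile and ResolveHold are hypothetical, and only the JSON field names come from the record above. The document does not say which record wins when a user has several io.atcr.hold records, so the sketch assumes the first one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SailorProfile mirrors the io.atcr.sailor.profile record shown above.
// DefaultHold is a pointer so a JSON null (no preference) stays distinguishable.
type SailorProfile struct {
	Type        string  `json:"$type"`
	DefaultHold *string `json:"defaultHold"`
	CreatedAt   string  `json:"createdAt,omitempty"`
	UpdatedAt   string  `json:"updatedAt,omitempty"`
}

// ResolveHold is a hypothetical helper applying the documented priority:
// profile defaultHold, then the user's own hold records, then the AppView default.
func ResolveHold(p *SailorProfile, ownHolds []string, appViewDefault string) string {
	if p != nil && p.DefaultHold != nil && *p.DefaultHold != "" {
		return *p.DefaultHold
	}
	if len(ownHolds) > 0 {
		return ownHolds[0] // assumption: the first hold record wins
	}
	return appViewDefault
}

func main() {
	// A profile with defaultHold set to null falls through to the user's own hold.
	raw := []byte(`{"$type":"io.atcr.sailor.profile","defaultHold":null}`)
	var p SailorProfile
	if err := json.Unmarshal(raw, &p); err != nil {
		panic(err)
	}
	own := []string{"https://alice-storage.example.com"}
	fmt.Println(ResolveHold(&p, own, "https://storage.atcr.io"))
}
```

Using a pointer for defaultHold is one way to keep "null" and "set" apart after decoding; a real implementation may model this differently.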

io.atcr.hold

Users create a hold record in their PDS to configure their own storage:

{
  "$type": "io.atcr.hold",
  "endpoint": "https://alice-storage.example.com",
  "owner": "did:plc:alice123",
  "public": false,
  "createdAt": "2025-10-01T12:00:00Z"
}

io.atcr.hold.crew

Hold owners can add crew members (for shared storage):

{
  "$type": "io.atcr.hold.crew",
  "hold": "at://did:plc:alice/io.atcr.hold/my-storage",
  "member": "did:plc:bob456",
  "role": "write",
  "addedAt": "2025-10-01T12:00:00Z"
}

Note: Crew records are stored in the hold owner's PDS, not the crew member's PDS. This ensures the hold owner maintains full control over access.
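
As a rough sketch of how crew records could drive a write check, the Go snippet below matches a caller's DID against a list of crew records already fetched from the owner's PDS. CrewRecord and CanWrite are hypothetical names, fetching the records is out of scope, and treating "write" as the only role that grants write access is an assumption (it is the only role the example record shows).

```go
package main

import "fmt"

// CrewRecord mirrors the io.atcr.hold.crew record shown above.
type CrewRecord struct {
	Hold    string `json:"hold"`   // AT-URI of the hold record
	Member  string `json:"member"` // DID of the crew member
	Role    string `json:"role"`
	AddedAt string `json:"addedAt"`
}

// CanWrite is a hypothetical check: the owner may always write, and a crew
// member may write if a matching record with role "write" exists.
func CanWrite(did, ownerDID, holdURI string, crew []CrewRecord) bool {
	if did == "" {
		return false // anonymous callers never write
	}
	if did == ownerDID {
		return true
	}
	for _, c := range crew {
		// Assumption: only the "write" role grants write access.
		if c.Hold == holdURI && c.Member == did && c.Role == "write" {
			return true
		}
	}
	return false
}

func main() {
	holdURI := "at://did:plc:alice/io.atcr.hold/my-storage"
	crew := []CrewRecord{{Hold: holdURI, Member: "did:plc:bob456", Role: "write"}}

	fmt.Println(CanWrite("did:plc:bob456", "did:plc:alice", holdURI, crew))
	fmt.Println(CanWrite("did:plc:carol789", "did:plc:alice", holdURI, crew))
}
```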

Storage Service

Deployment

The storage service is a lightweight HTTP server that:

  1. Accepts presigned URL requests
  2. Verifies DID authorization
  3. Generates presigned URLs for S3/Storj/etc
  4. Returns URLs to AppView for client redirect

Configuration

The hold service is configured entirely via environment variables. See .env.example for all options.

Required environment variables:

# Hold service public URL (REQUIRED)
HOLD_PUBLIC_URL=https://storage.example.com

# Storage driver type
STORAGE_DRIVER=s3

# For S3/Minio
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
S3_BUCKET=my-blobs

# For Storj (optional - custom S3 endpoint)
# S3_ENDPOINT=https://gateway.storjshare.io

# For filesystem storage
# STORAGE_DRIVER=filesystem
# STORAGE_ROOT_DIR=/var/lib/atcr-storage
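
For illustration, the Go sketch below shows one way a service might read and validate these variables at startup. HoldConfig and LoadConfig are hypothetical names, and the validation rules (for example, that the s3 driver needs S3_BUCKET) are assumptions rather than documented behavior. Only the two drivers shown above are handled.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// HoldConfig holds the settings named in the environment block above.
type HoldConfig struct {
	PublicURL string
	Driver    string
	Bucket    string
	Region    string
	Endpoint  string // optional custom S3 endpoint, e.g. a Storj gateway
	RootDir   string // used by the filesystem driver
}

// LoadConfig reads variables through getenv so it can be exercised without
// touching the real process environment.
func LoadConfig(getenv func(string) string) (HoldConfig, error) {
	cfg := HoldConfig{
		PublicURL: getenv("HOLD_PUBLIC_URL"),
		Driver:    getenv("STORAGE_DRIVER"),
		Bucket:    getenv("S3_BUCKET"),
		Region:    getenv("AWS_REGION"),
		Endpoint:  getenv("S3_ENDPOINT"),
		RootDir:   getenv("STORAGE_ROOT_DIR"),
	}
	if cfg.PublicURL == "" {
		return cfg, errors.New("HOLD_PUBLIC_URL is required")
	}
	switch cfg.Driver {
	case "s3":
		if cfg.Bucket == "" {
			return cfg, errors.New("S3_BUCKET is required for the s3 driver") // assumption
		}
	case "filesystem":
		if cfg.RootDir == "" {
			return cfg, errors.New("STORAGE_ROOT_DIR is required for the filesystem driver") // assumption
		}
	default:
		return cfg, fmt.Errorf("STORAGE_DRIVER %q is not handled in this sketch", cfg.Driver)
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```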

Authorization:

ATCR follows ATProto's public-by-default model with gated anonymous access:

Read Access:

  • Public hold (HOLD_PUBLIC=true): Anonymous reads allowed (no authentication)
  • Private hold (HOLD_PUBLIC=false): Requires authentication (any ATCR user with sailor.profile)

Write Access:

  • Always requires authentication
  • Must be hold owner OR crew member (verified via io.atcr.hold.crew records in owner's PDS)

Key Points:

  • "Private" just means "no anonymous access" - not "limited user access"
  • Any authenticated ATCR user can read from private holds
  • Crew membership only controls WRITE access, not READ access
  • This aligns with ATProto's public records model (no private PDS records yet)

Running

# Build
go build -o atcr-hold ./cmd/hold

# Set environment variables (or use .env file)
export HOLD_PUBLIC_URL=https://storage.example.com
export STORAGE_DRIVER=s3
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1
export S3_BUCKET=my-blobs

# Run
./atcr-hold

Registration (required):

The hold service must be registered in a PDS to be discoverable by the AppView.

Standard registration workflow:

  1. Set HOLD_OWNER to your DID:

    export HOLD_OWNER=did:plc:your-did-here
    
  2. Start the hold service:

    ./atcr-hold
    
  3. Check the logs for the OAuth authorization URL:

    ================================================================================
    OAUTH AUTHORIZATION REQUIRED
    ================================================================================
    
    Please visit this URL to authorize the hold service:
    
      https://bsky.app/authorize?client_id=...
    
    Waiting for authorization...
    ================================================================================
    
  4. Visit the URL in your browser and authorize

  5. The hold service will:

    • Exchange the authorization code for a token
    • Create io.atcr.hold record in your PDS
    • Create io.atcr.hold.crew record (making you the owner)
    • Save registration state
  6. On subsequent runs, the service checks if already registered and skips OAuth

Alternative methods:

  • Manual API registration: Call POST /register with your own OAuth token
  • Completely manual: Create PDS records yourself using any ATProto client

Deploy to Fly.io

# Create fly.toml
cat > fly.toml <<EOF
app = "my-atcr-hold"
primary_region = "ord"

[env]
  HOLD_PUBLIC_URL = "https://my-atcr-hold.fly.dev"
  HOLD_SERVER_ADDR = ":8080"
  STORAGE_DRIVER = "s3"
  AWS_REGION = "us-east-1"
  S3_BUCKET = "my-blobs"
  HOLD_PUBLIC = "false"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory_mb = 256
EOF

# Create the app without deploying, so secrets are in place for the first boot
fly launch --no-deploy

# Set secrets
fly secrets set AWS_ACCESS_KEY_ID=...
fly secrets set AWS_SECRET_ACCESS_KEY=...
fly secrets set HOLD_OWNER=did:plc:your-did-here

# Deploy
fly deploy

# Check logs for OAuth URL on first run
fly logs

# Visit the OAuth URL shown in logs to authorize
# The hold service will register itself in your PDS

Request Flow

Push with BYOS

  1. Docker push atcr.io/alice/myapp:latest
  2. AppView resolves alice → did:plc:alice123
  3. AppView discovers hold via priority logic:
    • Check alice's io.atcr.sailor.profile for defaultHold
    • If not set, check alice's io.atcr.hold records
    • Fall back to AppView's default_storage_endpoint
  4. Found: alice.profile.defaultHold = "https://team-hold.example.com"
  5. AppView → team-hold: POST /put-presigned-url
    {
      "did": "did:plc:alice123",
      "digest": "sha256:abc123...",
      "size": 1048576
    }
    
  6. Hold service:
    • Verifies alice is authorized (checks crew records)
    • Generates S3 presigned upload URL (15min expiry)
    • Returns: {"url": "https://s3.../blob?signature=..."}
  7. AppView → Docker: 307 Redirect to presigned URL
  8. Docker → S3: PUT blob directly (no proxy)
  9. Manifest stored in alice's PDS with holdEndpoint: "https://team-hold.example.com"
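
Steps 5 to 7 can be sketched from the AppView's side as follows. This is a minimal sketch under assumptions: the function names are hypothetical, the hold is assumed to answer 200 OK with the JSON body shown in step 6, and a local test server stands in for a real hold so the example runs on its own.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// PresignRequest is the JSON body from step 5.
type PresignRequest struct {
	DID    string `json:"did"`
	Digest string `json:"digest"`
	Size   int64  `json:"size"`
}

// PresignResponse is the JSON reply from step 6.
type PresignResponse struct {
	URL string `json:"url"`
}

// RequestUploadURL (hypothetical) asks the hold for a presigned upload URL.
func RequestUploadURL(client *http.Client, holdEndpoint string, req PresignRequest) (string, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	resp, err := client.Post(holdEndpoint+"/put-presigned-url", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK { // assumption: success is 200 OK
		return "", fmt.Errorf("hold returned %s", resp.Status)
	}
	var out PresignResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.URL, nil
}

// demoPush runs steps 5-7 against a fake hold and reports the redirect.
func demoPush() (status int, location string, err error) {
	hold := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/put-presigned-url" {
			http.NotFound(w, r)
			return
		}
		json.NewEncoder(w).Encode(PresignResponse{URL: "https://s3.example.com/blob?signature=fake"})
	}))
	defer hold.Close()

	url, err := RequestUploadURL(hold.Client(), hold.URL,
		PresignRequest{DID: "did:plc:alice123", Digest: "sha256:abc123", Size: 1048576})
	if err != nil {
		return 0, "", err
	}
	// Step 7: answer the registry client with a 307 to the presigned URL.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPut, "/v2/alice/myapp/blobs/uploads/", nil)
	http.Redirect(rec, req, url, http.StatusTemporaryRedirect)
	return rec.Code, rec.Header().Get("Location"), nil
}

func main() {
	fmt.Println(demoPush())
}
```

A 307 (rather than 302) matters for pushes because it tells the client to repeat the same method and body at the new location.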

Pull with BYOS

  1. Docker pull atcr.io/alice/myapp:latest
  2. AppView fetches manifest from alice's PDS
  3. Manifest contains holdEndpoint: "https://team-hold.example.com"
  4. AppView caches: (alice's DID, "myapp") → "https://team-hold.example.com" (10min TTL)
  5. Docker requests blobs: GET /v2/alice/myapp/blobs/sha256:abc123
  6. AppView uses cached hold from manifest (not re-discovered)
  7. AppView → team-hold: POST /get-presigned-url
  8. Hold service returns presigned download URL
  9. AppView → Docker: 307 Redirect
  10. Docker → S3: GET blob directly

Key insight: Pull uses the historical holdEndpoint from the manifest, ensuring blobs are fetched from where they were originally pushed, even if alice later changes her profile's defaultHold.
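
The cache from step 4 might look roughly like the Go sketch below: a map keyed by (DID, repository) whose entries expire after a TTL. HoldCache and its methods are hypothetical names, and the clock is injected only so expiry can be demonstrated without waiting. What the AppView does on a miss is not stated above; re-reading the manifest is the assumed behavior.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type holdKey struct{ DID, Repo string }

type holdEntry struct {
	endpoint string
	expires  time.Time
}

// HoldCache is a hypothetical in-memory cache of (DID, repository) -> hold endpoint.
type HoldCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock
	entries map[holdKey]holdEntry
}

func NewHoldCache(ttl time.Duration, now func() time.Time) *HoldCache {
	return &HoldCache{ttl: ttl, now: now, entries: make(map[holdKey]holdEntry)}
}

// Put records the holdEndpoint read from a manifest.
func (c *HoldCache) Put(did, repo, endpoint string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[holdKey{did, repo}] = holdEntry{endpoint, c.now().Add(c.ttl)}
}

// Get returns the cached endpoint, or false if it is absent or expired.
func (c *HoldCache) Get(did, repo string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	key := holdKey{did, repo}
	e, ok := c.entries[key]
	if !ok {
		return "", false
	}
	if !c.now().Before(e.expires) {
		delete(c.entries, key) // expired: caller re-reads the manifest (assumption)
		return "", false
	}
	return e.endpoint, true
}

// demoExpiry checks a lookup 9 minutes and 11 minutes after a Put with a 10 minute TTL.
func demoExpiry() (freshHit, staleHit bool) {
	clock := time.Date(2025, 10, 2, 12, 0, 0, 0, time.UTC)
	c := NewHoldCache(10*time.Minute, func() time.Time { return clock })
	c.Put("did:plc:alice123", "myapp", "https://team-hold.example.com")

	clock = clock.Add(9 * time.Minute)
	_, freshHit = c.Get("did:plc:alice123", "myapp")
	clock = clock.Add(2 * time.Minute)
	_, staleHit = c.Get("did:plc:alice123", "myapp")
	return freshHit, staleHit
}

func main() {
	fmt.Println(demoExpiry())
}
```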

Default Registry

The AppView can run its own storage service as the default:

AppView config

middleware:
  - name: registry
    options:
      atproto-resolver:
        default_storage_endpoint: https://storage.atcr.io

Default hold service config

# Accept any authenticated DID (no anonymous access)
HOLD_PUBLIC=false

# Or allow public reads (auth still required for writes)
# HOLD_PUBLIC=true

This provides free-tier shared storage for users who don't want to deploy their own.

Storage Drivers Supported

The storage service uses distribution's storage drivers:

  • S3 - AWS S3, Minio, Storj (via S3 gateway)
  • Filesystem - Local disk (for testing)
  • Azure - Azure Blob Storage
  • GCS - Google Cloud Storage
  • Swift - OpenStack Swift
  • OSS - Alibaba Cloud OSS

Quotas

Quotas are NOT implemented in the storage service. Instead, use:

  • S3: Bucket policies, lifecycle rules
  • Storj: Project limits in Storj dashboard
  • Minio: Quota enforcement features
  • Filesystem: Disk quotas at OS level

Security

Authorization

Authorization is based on ATProto's public-by-default model:

Read Authorization:

  • Public hold (public: true in hold record):

    • Anonymous users: Allowed
    • Any authenticated user: Allowed
  • Private hold (public: false in hold record):

    • Anonymous users: 401 Unauthorized
    • Any authenticated ATCR user: Allowed (no crew membership required)

Write Authorization:

  • Anonymous users: 401 Unauthorized
  • Authenticated non-crew: 403 Forbidden
  • Authenticated crew member: Allowed
  • Hold owner: Allowed

Implementation:

  • Hold service queries owner's PDS for io.atcr.hold.crew records
  • Crew records are public ATProto records (read without authentication)
  • "Private" holds only gate anonymous access, not authenticated user access
  • This reflects ATProto's current limitation: no private PDS records
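
The decision table above maps directly onto HTTP status codes. The Go sketch below encodes it as two small functions; Caller, AuthorizeRead, and AuthorizeWrite are hypothetical names, and a non-empty DID stands in for "authenticated ATCR user".

```go
package main

import (
	"fmt"
	"net/http"
)

// Caller describes who is making the request. An empty DID means anonymous.
type Caller struct {
	DID     string
	IsOwner bool
	IsCrew  bool
}

// AuthorizeRead returns the status for a blob read: public holds admit
// everyone, private holds admit any authenticated caller.
func AuthorizeRead(holdPublic bool, c Caller) int {
	if holdPublic || c.DID != "" {
		return http.StatusOK
	}
	return http.StatusUnauthorized
}

// AuthorizeWrite returns the status for a blob write: anonymous callers get
// 401, authenticated non-crew get 403, owner and crew are allowed.
func AuthorizeWrite(c Caller) int {
	switch {
	case c.DID == "":
		return http.StatusUnauthorized
	case c.IsOwner || c.IsCrew:
		return http.StatusOK
	default:
		return http.StatusForbidden
	}
}

func main() {
	anon := Caller{}
	bob := Caller{DID: "did:plc:bob456"}
	crew := Caller{DID: "did:plc:bob456", IsCrew: true}

	fmt.Println("private read, anonymous:", AuthorizeRead(false, anon))
	fmt.Println("private read, authenticated:", AuthorizeRead(false, bob))
	fmt.Println("write, authenticated non-crew:", AuthorizeWrite(bob))
	fmt.Println("write, crew:", AuthorizeWrite(crew))
}
```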

Presigned URLs

  • 15 minute expiry
  • Client uploads/downloads directly to storage
  • No data flows through AppView or hold service

Private Holds

"Private" holds gate anonymous access while remaining accessible to authenticated users:

What "Private" Means:

  • HOLD_PUBLIC=false prevents anonymous reads
  • Any authenticated ATCR user can still read
  • This aligns with ATProto's public records model

Write Control:

  • Only hold owner and crew members can write
  • Crew membership managed via io.atcr.hold.crew records in owner's PDS
  • Removing crew member immediately revokes write access

Future: True Private Access

  • When ATProto adds private PDS records, ATCR can support truly private repos
  • For now, "private" = "authenticated-only access"

Example: Personal Storage

Alice wants to use her own Storj account:

  1. Set environment variables:

    export HOLD_PUBLIC_URL=https://alice-storage.fly.dev
    export HOLD_OWNER=did:plc:alice123
    export STORAGE_DRIVER=s3
    export AWS_ACCESS_KEY_ID=your_storj_access_key
    export AWS_SECRET_ACCESS_KEY=your_storj_secret_key
    export S3_ENDPOINT=https://gateway.storjshare.io
    export S3_BUCKET=alice-blobs
    
  2. Deploy hold service to Fly.io - auto-registration creates hold + crew record

  3. Push images - AppView automatically routes to her storage

Example: Team Hold

A company wants shared storage for their team:

  1. Deploy hold service with S3 credentials and auto-registration:

    export HOLD_PUBLIC_URL=https://company-hold.fly.dev
    export HOLD_OWNER=did:plc:admin
    export HOLD_PUBLIC=false
    export STORAGE_DRIVER=s3
    export AWS_ACCESS_KEY_ID=...
    export AWS_SECRET_ACCESS_KEY=...
    export S3_BUCKET=company-blobs
    
  2. Hold service auto-registers on first run, creating:

    • Hold record in admin's PDS
    • Crew record making admin the owner
  3. Admin adds crew members via ATProto client or manually:

    # Using atproto client
    atproto put-record \
      --collection io.atcr.hold.crew \
      --rkey "company-did:plc:engineer1" \
      --value '{
        "$type": "io.atcr.hold.crew",
        "hold": "at://did:plc:admin/io.atcr.hold/company",
        "member": "did:plc:engineer1",
        "role": "write"
      }'
    
  4. Team members set their profile to use the shared hold:

    # Engineer updates their sailor profile
    atproto put-record \
      --collection io.atcr.sailor.profile \
      --rkey "self" \
      --value '{
        "$type": "io.atcr.sailor.profile",
        "defaultHold": "https://company-hold.fly.dev"
      }'
    
  5. Hold service queries PDS for crew records to authorize writes

  6. Engineers push/pull using atcr.io/engineer1/myapp - blobs go to company hold

Limitations

  1. No resume/partial uploads - Storage service doesn't track upload state
  2. No advanced features - Just basic put/get, no deduplication logic
  3. In-memory cache - Hold endpoint cache is in-memory (for production, use Redis)
  4. Manual profile updates - No UI for updating sailor profile (must use ATProto client)

Performance Optimization: S3 Presigned URLs

Status: Planned implementation (see PRESIGNED_URLS.md)

Currently, hold services act as proxies for blob data. With presigned URLs:

  • Downloads: Docker → S3 direct (via 307 redirect)
  • Uploads: Docker → AppView → S3 (via presigned URL)
  • Hold service bandwidth: Reduced by 99.98% (only orchestration)

Benefits:

  • Hold services can run on minimal infrastructure ($5/month instances)
  • Direct S3 transfers at maximum speed
  • Scales to arbitrarily large images
  • Works with Storj, MinIO, Backblaze B2, Cloudflare R2

See PRESIGNED_URLS.md for complete technical details and implementation guide.

Future Improvements

  1. S3 Presigned URLs - Implement direct S3 URLs (see PRESIGNED_URLS.md)
  2. Automatic failover - Multiple storage endpoints, fallback to default
  3. Storage analytics - Track usage per DID
  4. Quota integration - Optional quota tracking in storage service
  5. Profile management UI - Web interface for users to manage their sailor profile
  6. Distributed cache - Redis/Memcached for hold endpoint cache in multi-instance deployments

Comparison to Default Storage

| Feature     | Default (Shared S3)    | BYOS                   |
|-------------|------------------------|------------------------|
| Setup       | None required          | Deploy storage service |
| Cost        | Free (with quota)      | User pays for S3/Storj |
| Control     | Limited                | Full control           |
| Performance | Shared                 | Dedicated              |
| Quotas      | Enforced by AppView    | User managed           |
| Privacy     | Blobs in shared bucket | Blobs in user's bucket |

References