Evan Jarrett 487fc8a47e wording
2026-01-04 23:37:31 -06:00


ATCR Quota System

This document describes ATCR's storage quota implementation using ATProto records for per-user layer tracking.

Overview

ATCR implements per-user storage quotas to:

  1. Limit storage consumption on shared hold services
  2. Provide transparency (show users their storage usage)
  3. Enable fair billing (users pay for what they use)

Key principle: Users pay for layers they reference, deduplicated per-user. If you push the same layer in multiple images, you only pay once.

Example Scenario

Alice pushes myapp:v1 (layers A, B, C - each 100MB)
→ Creates 3 layer records in hold's PDS
→ Alice's quota: 300MB (3 unique layers)

Alice pushes myapp:v2 (layers A, B, D)
→ Creates 3 more layer records (A, B again, plus D)
→ Alice's quota: 400MB (4 unique layers: A, B, C, D)
→ Layers A, B appear twice in records but deduplicated in quota calc

Bob pushes his-app:latest (layers A, E)
→ Creates 2 layer records for Bob
→ Bob's quota: 200MB (2 unique layers: A, E)
→ Layer A shared with Alice in S3, but Bob pays for his own usage

Physical S3 storage: 500MB (A, B, C, D, E - deduplicated globally)
Alice's quota: 400MB
Bob's quota: 200MB

Quota Model

Everyone Pays for What They Upload

Each user is charged for all unique layers they reference, regardless of whether those layers exist in S3 from other users' uploads.

Why this model?

  • Simple mental model: "I pushed 500MB of layers, I use 500MB of quota"
  • Predictable: Your quota doesn't change based on others' actions
  • Clean deletion: Delete manifest → layer records removed → quota freed
  • No cross-user dependencies: Users are isolated

Trade-off:

  • Total claimed storage can exceed physical S3 storage
  • This is acceptable - deduplication is an operational benefit for ATCR, not a billing feature

ATProto-Native Storage

Layer tracking uses ATProto records stored in the hold's embedded PDS:

  • Collection: io.atcr.hold.layer
  • Repository: Hold's DID (e.g., did:web:hold01.atcr.io)
  • Records: One per manifest-layer relationship (TID-based keys)

This approach:

  • Keeps quota data in ATProto (no separate database)
  • Enables standard ATProto sync/query mechanisms
  • Provides full audit trail of layer usage

Layer Record Schema

LayerRecord

// pkg/atproto/lexicon.go

type LayerRecord struct {
    Type      string `json:"$type"`     // "io.atcr.hold.layer"
    Digest    string `json:"digest"`    // Layer digest (sha256:abc123...)
    Size      int64  `json:"size"`      // Size in bytes
    MediaType string `json:"mediaType"` // e.g., "application/vnd.oci.image.layer.v1.tar+gzip"
    Manifest  string `json:"manifest"`  // at://did:plc:alice/io.atcr.manifest/abc123
    UserDID   string `json:"userDid"`   // User's DID for quota grouping
    CreatedAt string `json:"createdAt"` // ISO 8601 timestamp
}

Record Key

Records use TID (timestamp-based ID) as the rkey. This means:

  • Multiple records can exist for the same layer (from different manifests)
  • Deduplication happens at query time, not storage time
  • Simple append-only writes on manifest push

Example Records

Manifest A (layers X, Y, Z) → creates 3 records
Manifest B (layers X, W)    → creates 2 records

io.atcr.hold.layer collection:
┌──────────────┬────────┬──────┬───────────────────────────────────┬─────────────────┐
│ rkey (TID)   │ digest │ size │ manifest                          │ userDid         │
├──────────────┼────────┼──────┼───────────────────────────────────┼─────────────────┤
│ 3jui7...001  │ X      │ 100  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...002  │ Y      │ 200  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...003  │ Z      │ 150  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...004  │ X      │ 100  │ at://did:plc:alice/.../manifestB  │ did:plc:alice   │  ← duplicate digest
│ 3jui7...005  │ W      │ 300  │ at://did:plc:alice/.../manifestB  │ did:plc:alice   │
└──────────────┴────────┴──────┴───────────────────────────────────┴─────────────────┘

Quota Calculation

Query: User's Unique Storage

-- Calculate quota by deduplicating layers (illustrative SQL)
SELECT SUM(size) FROM (
    SELECT DISTINCT digest, size
    FROM io.atcr.hold.layer
    WHERE userDid = ?
) AS unique_layers

Using the example above:

  • Layer X appears twice but counted once: 100
  • Layers Y, Z, W counted once each: 200 + 150 + 300
  • Total: 750 bytes

Implementation

// pkg/hold/quota/quota.go

type QuotaManager struct {
    pds *pds.Server  // Hold's embedded PDS
}

// GetUsage calculates a user's current quota usage
func (q *QuotaManager) GetUsage(ctx context.Context, userDID string) (int64, error) {
    // List all layer records for this user
    records, err := q.pds.ListRecords(ctx, LayerCollection, userDID)
    if err != nil {
        return 0, err
    }

    // Deduplicate by digest
    uniqueLayers := make(map[string]int64) // digest -> size
    for _, record := range records {
        var layer LayerRecord
        if err := json.Unmarshal(record.Value, &layer); err != nil {
            continue
        }
        if layer.UserDID == userDID {
            uniqueLayers[layer.Digest] = layer.Size
        }
    }

    // Sum unique layer sizes
    var total int64
    for _, size := range uniqueLayers {
        total += size
    }

    return total, nil
}

// CheckQuota returns true if user has space for additional bytes
func (q *QuotaManager) CheckQuota(ctx context.Context, userDID string, additional int64, limit int64) (bool, int64, error) {
    current, err := q.GetUsage(ctx, userDID)
    if err != nil {
        return false, 0, err
    }

    return current+additional <= limit, current, nil
}

Quota Response

type QuotaInfo struct {
    Used      int64 `json:"used"`      // Current usage (deduplicated)
    Limit     int64 `json:"limit"`     // User's quota limit
    Available int64 `json:"available"` // Remaining space
}

Push Flow

Step-by-Step: User Pushes Image

┌──────────┐          ┌──────────┐          ┌──────────┐          ┌──────────┐
│  Client  │          │ AppView  │          │   Hold   │          │ User PDS │
│ (Docker) │          │          │          │ Service  │          │          │
└──────────┘          └──────────┘          └──────────┘          └──────────┘
     │                      │                      │                      │
     │ 1. Upload blobs      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │ 2. Route to hold     │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 3. Store in S3       │
     │                      │                      │                      │
     │ 4. PUT manifest      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │                      │                      │
     │                      │ 5. Calculate quota   │                      │
     │                      │    impact for new    │                      │
     │                      │    layers            │                      │
     │                      │                      │                      │
     │                      │ 6. Check quota limit │                      │
     │                      ├─────────────────────>│                      │
     │                      │<─────────────────────┤                      │
     │                      │                      │                      │
     │                      │ 7. Store manifest    │                      │
     │                      ├──────────────────────┼─────────────────────>│
     │                      │                      │                      │
     │                      │ 8. Create layer      │                      │
     │                      │    records           │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 9. Write to          │
     │                      │                      │    hold's PDS        │
     │                      │                      │                      │
     │ 10. 201 Created      │                      │                      │
     │<─────────────────────┤                      │                      │

Implementation

// pkg/appview/storage/routing_repository.go

func (r *RoutingRepository) PutManifest(ctx context.Context, manifest distribution.Manifest) error {
    // Parse manifest to get layers
    layers := extractLayers(manifest)

    // Get user's current unique layers from hold
    existingLayers, err := r.holdClient.GetUserLayers(ctx, r.userDID)
    if err != nil {
        return err
    }
    existingSet := makeDigestSet(existingLayers)

    // Calculate quota impact (only new unique layers)
    var quotaImpact int64
    for _, layer := range layers {
        if !existingSet[layer.Digest] {
            quotaImpact += layer.Size
        }
    }

    // Check quota
    ok, current, err := r.quotaManager.CheckQuota(ctx, r.userDID, quotaImpact, r.quotaLimit)
    if err != nil {
        return err
    }
    if !ok {
        return fmt.Errorf("quota exceeded: used=%d, impact=%d, limit=%d",
            current, quotaImpact, r.quotaLimit)
    }

    // Store manifest in user's PDS
    manifestURI, err := r.atprotoClient.PutManifest(ctx, manifest)
    if err != nil {
        return err
    }

    // Create layer records in hold's PDS
    for _, layer := range layers {
        record := LayerRecord{
            Type:      "io.atcr.hold.layer",
            Digest:    layer.Digest,
            Size:      layer.Size,
            MediaType: layer.MediaType,
            Manifest:  manifestURI,
            UserDID:   r.userDID,
            CreatedAt: time.Now().Format(time.RFC3339),
        }
        if err := r.holdClient.CreateLayerRecord(ctx, record); err != nil {
            log.Printf("Warning: failed to create layer record: %v", err)
            // Continue - reconciliation will fix
        }
    }

    return nil
}

Quota Check Timing

Quota is checked when the manifest is pushed (after blobs are uploaded):

  • Blobs upload first via presigned URLs
  • Manifest pushed last triggers quota check
  • If quota exceeded, manifest is rejected (orphaned blobs cleaned by GC)

This matches Harbor's approach and is common practice among container registries.

Delete Flow

Manifest Deletion

When a user deletes a manifest:

┌──────────┐          ┌──────────┐          ┌──────────┐          ┌──────────┐
│   User   │          │ AppView  │          │   Hold   │          │ User PDS │
│    UI    │          │          │          │ Service  │          │          │
└──────────┘          └──────────┘          └──────────┘          └──────────┘
     │                      │                      │                      │
     │ DELETE manifest      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │                      │                      │
     │                      │ 1. Delete manifest   │                      │
     │                      │    from user's PDS   │                      │
     │                      ├──────────────────────┼─────────────────────>│
     │                      │                      │                      │
     │                      │ 2. Delete layer      │                      │
     │                      │    records for this  │                      │
     │                      │    manifest          │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 3. Remove records    │
     │                      │                      │    where manifest    │
     │                      │                      │    == deleted URI    │
     │                      │                      │                      │
     │ 4. 204 No Content    │                      │                      │
     │<─────────────────────┤                      │                      │

Implementation

// pkg/appview/handlers/manifest.go

func (h *ManifestHandler) DeleteManifest(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    userDID := auth.GetDID(ctx)
    digest := chi.URLParam(r, "digest")

    // Build the manifest URI before deletion so layer records can be matched
    manifestURI := fmt.Sprintf("at://%s/%s/%s", userDID, ManifestCollection, digest)

    // Delete manifest from user's PDS
    if err := h.atprotoClient.DeleteRecord(ctx, ManifestCollection, digest); err != nil {
        http.Error(w, "failed to delete manifest", http.StatusInternalServerError)
        return
    }

    // Delete associated layer records from hold's PDS
    if err := h.holdClient.DeleteLayerRecords(ctx, manifestURI); err != nil {
        log.Printf("Warning: failed to delete layer records: %v", err)
        // Continue - reconciliation will clean up
    }

    w.WriteHeader(http.StatusNoContent)
}

Hold Service: Delete Layer Records

// pkg/hold/pds/xrpc.go

func (s *Server) DeleteLayerRecords(ctx context.Context, manifestURI string) error {
    // List all layer records
    records, err := s.ListRecords(ctx, LayerCollection, "")
    if err != nil {
        return err
    }

    // Delete records matching this manifest
    for _, record := range records {
        var layer LayerRecord
        if err := json.Unmarshal(record.Value, &layer); err != nil {
            continue
        }
        if layer.Manifest == manifestURI {
            if err := s.DeleteRecord(ctx, LayerCollection, record.RKey); err != nil {
                log.Printf("Failed to delete layer record %s: %v", record.RKey, err)
            }
        }
    }

    return nil
}

Quota After Deletion

After deleting a manifest:

  • Layer records for that manifest are removed
  • Quota recalculated with SELECT DISTINCT query
  • If layer was only in deleted manifest → quota decreases
  • If layer exists in other manifests → quota unchanged (still deduplicated)

Garbage Collection

Orphaned Blobs

Orphaned blobs accumulate when:

  1. Manifest push fails after blobs uploaded
  2. Quota exceeded - manifest rejected
  3. User deletes manifest - blobs may no longer be referenced

GC Process

// pkg/hold/gc/gc.go

func (gc *GarbageCollector) Run(ctx context.Context) error {
    // Step 1: Get all referenced digests from layer records
    records, err := gc.pds.ListRecords(ctx, LayerCollection, "")
    if err != nil {
        return err
    }

    referenced := make(map[string]bool)
    for _, record := range records {
        var layer LayerRecord
        if err := json.Unmarshal(record.Value, &layer); err != nil {
            continue
        }
        referenced[layer.Digest] = true
    }

    log.Printf("Found %d referenced blobs", len(referenced))

    // Step 2: Walk S3 blobs and delete unreferenced
    var deleted, reclaimed int64
    err = gc.driver.Walk(ctx, "/docker/registry/v2/blobs", func(fi storagedriver.FileInfo) error {
        if fi.IsDir() {
            return nil
        }

        digest := extractDigestFromPath(fi.Path())
        if !referenced[digest] {
            size := fi.Size()
            if err := gc.driver.Delete(ctx, fi.Path()); err != nil {
                log.Printf("Failed to delete %s: %v", digest, err)
                return nil
            }
            deleted++
            reclaimed += size
            log.Printf("GC: deleted %s (%d bytes)", digest, size)
        }
        return nil
    })

    log.Printf("GC complete: deleted %d blobs, reclaimed %d bytes", deleted, reclaimed)
    return err
}

GC Schedule

# Environment variables
GC_ENABLED=true
GC_INTERVAL=24h  # Daily by default

Configuration

Hold Service Environment Variables

# .env.hold

# Quota Configuration
QUOTA_ENABLED=true
QUOTA_DEFAULT_LIMIT=10737418240  # 10GB in bytes

# Garbage Collection
GC_ENABLED=true
GC_INTERVAL=24h

Quota Limits by Bytes

Size      Bytes
1 GB      1073741824
5 GB      5368709120
10 GB     10737418240
50 GB     53687091200
100 GB    107374182400

Future Enhancements

1. Quota API Endpoints

GET  /xrpc/io.atcr.hold.getQuota?did={userDID}  - Get user's quota usage
GET  /xrpc/io.atcr.hold.getQuotaBreakdown       - Storage by repository

2. Quota Alerts

  • Warning thresholds at 80%, 90%, 95%
  • Email/webhook notifications
  • Grace period before hard enforcement

3. Tier-Based Quotas (Implemented)

ATCR uses quota tiers to limit storage per crew member, configured via quotas.yaml:

# quotas.yaml
tiers:
  deckhand:        # Entry-level crew
    quota: 5GB
  bosun:           # Mid-level crew
    quota: 50GB
  quartermaster:   # High-level crew
    quota: 100GB

defaults:
  new_crew_tier: deckhand  # Default tier for new crew members

Tier             Limit      Description
deckhand         5 GB       Entry-level crew member
bosun            50 GB      Mid-level crew member
quartermaster    100 GB     Senior crew member
owner (captain)  Unlimited  Hold owner always has unlimited storage

Tier Resolution:

  1. If user is captain (owner) → unlimited
  2. If crew member has explicit tier → use that tier's limit
  3. If crew member has no tier → use defaults.new_crew_tier
  4. If default tier not found → unlimited

Crew Record Example:

{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:alice123",
  "role": "writer",
  "permissions": ["blob:write"],
  "tier": "bosun",
  "addedAt": "2026-01-04T12:00:00Z"
}

4. Rate Limiting

Pull rate limits (Docker Hub style):

  • Anonymous: 100 pulls per 6 hours per IP
  • Authenticated: 200 pulls per 6 hours
  • Paid: Unlimited

5. Quota Purchasing

  • Stripe integration for additional storage
  • $0.10/GB/month pricing (industry standard)

Document Version: 2.0
Last Updated: 2026-01-04
Model: Per-user layer tracking with ATProto records