# ATCR Quota System

This document describes ATCR's storage quota implementation, which uses ATProto records for per-user layer tracking.

## Table of Contents

- [Overview](#overview)
- [Quota Model](#quota-model)
- [Layer Record Schema](#layer-record-schema)
- [Quota Calculation](#quota-calculation)
- [Push Flow](#push-flow)
- [Delete Flow](#delete-flow)
- [Garbage Collection](#garbage-collection)
- [Configuration](#configuration)
- [Future Enhancements](#future-enhancements)

## Overview

ATCR implements per-user storage quotas to:

1. **Limit storage consumption** on shared hold services
2. **Provide transparency** (show users their storage usage)
3. **Enable fair billing** (users pay for what they use)

**Key principle:** Users pay for the layers they reference, deduplicated per user. If you push the same layer in multiple images, you only pay for it once.

### Example Scenario

```
Alice pushes myapp:v1 (layers A, B, C - each 100MB)
→ Creates 3 layer records in hold's PDS
→ Alice's quota: 300MB (3 unique layers)

Alice pushes myapp:v2 (layers A, B, D)
→ Creates 3 more layer records (A, B again, plus D)
→ Alice's quota: 400MB (4 unique layers: A, B, C, D)
→ Layers A, B appear twice in records but are deduplicated in the quota calculation

Bob pushes his-app:latest (layers A, E)
→ Creates 2 layer records for Bob
→ Bob's quota: 200MB (2 unique layers: A, E)
→ Layer A is shared with Alice in S3, but Bob pays for his own usage

Physical S3 storage: 500MB (A, B, C, D, E - deduplicated globally)
Alice's quota: 400MB
Bob's quota: 200MB
```

## Quota Model

### Everyone Pays for What They Upload

Each user is charged for all unique layers they reference, regardless of whether those layers already exist in S3 from other users' uploads.
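The charging rule above can be sketched in a few lines of Go. This is a standalone illustration of the per-user deduplication, not the actual ATCR implementation; the `record` struct here is a simplified stand-in for `LayerRecord`.

```go
package main

import "fmt"

// record is a simplified manifest-layer relationship, one per push.
type record struct {
	userDID string
	digest  string
	size    int64 // bytes
}

// usage returns each user's quota usage, counting every unique
// digest a user references exactly once, regardless of how many
// manifests reference it.
func usage(records []record) map[string]int64 {
	seen := map[string]map[string]bool{} // userDID -> set of digests
	totals := map[string]int64{}
	for _, r := range records {
		if seen[r.userDID] == nil {
			seen[r.userDID] = map[string]bool{}
		}
		if !seen[r.userDID][r.digest] {
			seen[r.userDID][r.digest] = true
			totals[r.userDID] += r.size
		}
	}
	return totals
}

func main() {
	const mb = int64(1024 * 1024)
	records := []record{
		// Alice: myapp:v1 (A, B, C) then myapp:v2 (A, B, D)
		{"alice", "A", 100 * mb}, {"alice", "B", 100 * mb}, {"alice", "C", 100 * mb},
		{"alice", "A", 100 * mb}, {"alice", "B", 100 * mb}, {"alice", "D", 100 * mb},
		// Bob: his-app:latest (A, E)
		{"bob", "A", 100 * mb}, {"bob", "E", 100 * mb},
	}
	totals := usage(records)
	fmt.Println(totals["alice"]/mb, totals["bob"]/mb) // prints: 400 200
}
```

Note that Bob's duplicate reference to layer A counts against his quota even though S3 stores the blob once; global deduplication is an operational saving, not a billing input.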
**Why this model?**

- **Simple mental model**: "I pushed 500MB of layers, I use 500MB of quota"
- **Predictable**: Your quota doesn't change based on others' actions
- **Clean deletion**: Delete a manifest → its layer records are removed → quota is freed
- **No cross-user dependencies**: Users are isolated

**Trade-off:**

- Total claimed storage can exceed physical S3 storage
- This is acceptable: deduplication is an operational benefit for ATCR, not a billing feature

### ATProto-Native Storage

Layer tracking uses ATProto records stored in the hold's embedded PDS:

- **Collection**: `io.atcr.hold.layer`
- **Repository**: Hold's DID (e.g., `did:web:hold01.atcr.io`)
- **Records**: One per manifest-layer relationship (TID-based keys)

This approach:

- Keeps quota data in ATProto (no separate database)
- Enables standard ATProto sync/query mechanisms
- Provides a full audit trail of layer usage

## Layer Record Schema

### LayerRecord

```go
// pkg/atproto/lexicon.go
type LayerRecord struct {
	Type      string `json:"$type"`     // "io.atcr.hold.layer"
	Digest    string `json:"digest"`    // Layer digest (sha256:abc123...)
	Size      int64  `json:"size"`      // Size in bytes
	MediaType string `json:"mediaType"` // e.g., "application/vnd.oci.image.layer.v1.tar+gzip"
	Manifest  string `json:"manifest"`  // at://did:plc:alice/io.atcr.manifest/abc123
	UserDID   string `json:"userDid"`   // User's DID for quota grouping
	CreatedAt string `json:"createdAt"` // ISO 8601 timestamp
}
```

### Record Key

Records use a TID (timestamp-based ID) as the rkey.
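For illustration, the value stored under such a TID rkey might look like the following JSON (the digest and timestamp are placeholder values in the doc's own style):

```json
{
  "$type": "io.atcr.hold.layer",
  "digest": "sha256:abc123...",
  "size": 104857600,
  "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
  "manifest": "at://did:plc:alice/io.atcr.manifest/abc123",
  "userDid": "did:plc:alice",
  "createdAt": "2026-01-04T12:00:00Z"
}
```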
This means:

- Multiple records can exist for the same layer (from different manifests)
- Deduplication happens at query time, not storage time
- Manifest pushes are simple append-only writes

### Example Records

```
Manifest A (layers X, Y, Z) → creates 3 records
Manifest B (layers X, W)    → creates 2 records

io.atcr.hold.layer collection:

┌──────────────┬────────┬──────┬───────────────────────────────────┬─────────────────┐
│ rkey (TID)   │ digest │ size │ manifest                          │ userDid         │
├──────────────┼────────┼──────┼───────────────────────────────────┼─────────────────┤
│ 3jui7...001  │ X      │ 100  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...002  │ Y      │ 200  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...003  │ Z      │ 150  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...004  │ X      │ 100  │ at://did:plc:alice/.../manifestB  │ did:plc:alice   │ ← duplicate digest
│ 3jui7...005  │ W      │ 300  │ at://did:plc:alice/.../manifestB  │ did:plc:alice   │
└──────────────┴────────┴──────┴───────────────────────────────────┴─────────────────┘
```

## Quota Calculation

### Query: User's Unique Storage

```sql
-- Calculate quota by deduplicating layers
SELECT SUM(size) FROM (
    SELECT DISTINCT digest, size
    FROM io.atcr.hold.layer
    WHERE userDid = ?
)
```

Using the example above:

- Layer X appears twice but is counted once: 100
- Layers Y, Z, W are counted once each: 200 + 150 + 300
- **Total: 750 bytes**

### Implementation

```go
// pkg/hold/quota/quota.go
type QuotaManager struct {
	pds *pds.Server // Hold's embedded PDS
}

// GetUsage calculates a user's current quota usage.
func (q *QuotaManager) GetUsage(ctx context.Context, userDID string) (int64, error) {
	// List all layer records for this user
	records, err := q.pds.ListRecords(ctx, LayerCollection, userDID)
	if err != nil {
		return 0, err
	}

	// Deduplicate by digest
	uniqueLayers := make(map[string]int64) // digest -> size
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		if layer.UserDID == userDID {
			uniqueLayers[layer.Digest] = layer.Size
		}
	}

	// Sum unique layer sizes
	var total int64
	for _, size := range uniqueLayers {
		total += size
	}
	return total, nil
}

// CheckQuota reports whether the user has space for additional bytes,
// along with their current usage.
func (q *QuotaManager) CheckQuota(ctx context.Context, userDID string, additional, limit int64) (bool, int64, error) {
	current, err := q.GetUsage(ctx, userDID)
	if err != nil {
		return false, 0, err
	}
	return current+additional <= limit, current, nil
}
```

### Quota Response

```go
type QuotaInfo struct {
	Used      int64 `json:"used"`      // Current usage (deduplicated)
	Limit     int64 `json:"limit"`     // User's quota limit
	Available int64 `json:"available"` // Remaining space
}
```

## Push Flow

### Step-by-Step: User Pushes an Image

```
┌──────────┐           ┌──────────┐           ┌──────────┐           ┌──────────┐
│  Client  │           │ AppView  │           │   Hold   │           │ User PDS │
│ (Docker) │           │          │           │ Service  │           │          │
└──────────┘           └──────────┘           └──────────┘           └──────────┘
     │                      │                      │                      │
     │ 1. Upload blobs      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │ 2. Route to hold     │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 3. Store in S3       │
     │                      │                      │                      │
     │ 4. PUT manifest      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │                      │                      │
     │                      │ 5. Calculate quota   │                      │
     │                      │    impact for new    │                      │
     │                      │    layers            │                      │
     │                      │                      │                      │
     │                      │ 6. Check quota limit │                      │
     │                      ├─────────────────────>│                      │
     │                      │<─────────────────────┤                      │
     │                      │                      │                      │
     │                      │ 7. Store manifest    │                      │
     │                      ├──────────────────────┼─────────────────────>│
     │                      │                      │                      │
     │                      │ 8. Create layer      │                      │
     │                      │    records           │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 9. Write to          │
     │                      │                      │    hold's PDS        │
     │                      │                      │                      │
     │ 10. 201 Created      │                      │                      │
     │<─────────────────────┤                      │                      │
```

### Implementation

```go
// pkg/appview/storage/routing_repository.go
func (r *RoutingRepository) PutManifest(ctx context.Context, manifest distribution.Manifest) error {
	// Parse the manifest to get its layers
	layers := extractLayers(manifest)

	// Get the user's current unique layers from the hold
	existingLayers, err := r.holdClient.GetUserLayers(ctx, r.userDID)
	if err != nil {
		return err
	}
	existingSet := makeDigestSet(existingLayers)

	// Calculate quota impact (only new unique layers)
	var quotaImpact int64
	for _, layer := range layers {
		if !existingSet[layer.Digest] {
			quotaImpact += layer.Size
		}
	}

	// Check quota
	ok, current, err := r.quotaManager.CheckQuota(ctx, r.userDID, quotaImpact, r.quotaLimit)
	if err != nil {
		return err
	}
	if !ok {
		return fmt.Errorf("quota exceeded: used=%d, impact=%d, limit=%d", current, quotaImpact, r.quotaLimit)
	}

	// Store the manifest in the user's PDS
	manifestURI, err := r.atprotoClient.PutManifest(ctx, manifest)
	if err != nil {
		return err
	}

	// Create layer records in the hold's PDS
	for _, layer := range layers {
		record := LayerRecord{
			Type:      "io.atcr.hold.layer",
			Digest:    layer.Digest,
			Size:      layer.Size,
			MediaType: layer.MediaType,
			Manifest:  manifestURI,
			UserDID:   r.userDID,
			CreatedAt: time.Now().Format(time.RFC3339),
		}
		if err := r.holdClient.CreateLayerRecord(ctx, record); err != nil {
			log.Printf("Warning: failed to create layer record: %v", err)
			// Continue - reconciliation will fix this up later
		}
	}
	return nil
}
```

### Quota Check Timing

Quota is checked when the **manifest is pushed** (after the blobs are uploaded):

- Blobs upload first via presigned URLs
- Manifest
pushed last triggers the quota check
- If the quota is exceeded, the manifest is rejected (orphaned blobs are cleaned up by GC)

This matches Harbor's approach and is the industry standard.

## Delete Flow

### Manifest Deletion

When a user deletes a manifest:

```
┌──────────┐           ┌──────────┐           ┌──────────┐           ┌──────────┐
│   User   │           │ AppView  │           │   Hold   │           │ User PDS │
│    UI    │           │          │           │ Service  │           │          │
└──────────┘           └──────────┘           └──────────┘           └──────────┘
     │                      │                      │                      │
     │ DELETE manifest      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │                      │                      │
     │                      │ 1. Delete manifest   │                      │
     │                      │    from user's PDS   │                      │
     │                      ├──────────────────────┼─────────────────────>│
     │                      │                      │                      │
     │                      │ 2. Delete layer      │                      │
     │                      │    records for this  │                      │
     │                      │    manifest          │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 3. Remove records    │
     │                      │                      │    where manifest    │
     │                      │                      │    == deleted URI    │
     │                      │                      │                      │
     │ 4. 204 No Content    │                      │                      │
     │<─────────────────────┤                      │                      │
```

### Implementation

```go
// pkg/appview/handlers/manifest.go
func (h *ManifestHandler) DeleteManifest(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	userDID := auth.GetDID(ctx)
	digest := chi.URLParam(r, "digest")

	// Build the manifest URI before deletion
	manifestURI := fmt.Sprintf("at://%s/%s/%s", userDID, ManifestCollection, digest)

	// Delete the manifest from the user's PDS
	if err := h.atprotoClient.DeleteRecord(ctx, ManifestCollection, digest); err != nil {
		http.Error(w, "failed to delete manifest", http.StatusInternalServerError)
		return
	}

	// Delete associated layer records from the hold's PDS
	if err := h.holdClient.DeleteLayerRecords(ctx, manifestURI); err != nil {
		log.Printf("Warning: failed to delete layer records: %v", err)
		// Continue - reconciliation will clean up
	}

	w.WriteHeader(http.StatusNoContent)
}
```

### Hold Service: Delete Layer Records

```go
// pkg/hold/pds/xrpc.go
func (s *Server) DeleteLayerRecords(ctx context.Context, manifestURI string) error {
	// List all layer records
	records, err := s.ListRecords(ctx, LayerCollection, "")
	if err != nil {
		return err
	}

	// Delete records that reference this manifest
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		if layer.Manifest == manifestURI {
			if err := s.DeleteRecord(ctx, LayerCollection, record.RKey); err != nil {
				log.Printf("Failed to delete layer record %s: %v", record.RKey, err)
			}
		}
	}
	return nil
}
```

### Quota After Deletion

After deleting a manifest:

- Layer records for that manifest are removed
- Quota is recalculated with the `SELECT DISTINCT` query
- If a layer was referenced only by the deleted manifest → quota decreases
- If a layer is still referenced by other manifests → quota is unchanged (still deduplicated)

## Garbage Collection

### Orphaned Blobs

Orphaned blobs accumulate when:

1. A manifest push fails after its blobs were uploaded
2. The quota is exceeded and the manifest is rejected
3. A user deletes a manifest whose blobs are no longer referenced

### GC Process

```go
// pkg/hold/gc/gc.go
func (gc *GarbageCollector) Run(ctx context.Context) error {
	// Step 1: Collect all referenced digests from layer records
	records, err := gc.pds.ListRecords(ctx, LayerCollection, "")
	if err != nil {
		return err
	}
	referenced := make(map[string]bool)
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		referenced[layer.Digest] = true
	}
	log.Printf("Found %d referenced blobs", len(referenced))

	// Step 2: Walk S3 blobs and delete unreferenced ones
	var deleted, reclaimed int64
	err = gc.driver.Walk(ctx, "/docker/registry/v2/blobs", func(fi storagedriver.FileInfo) error {
		if fi.IsDir() {
			return nil
		}
		digest := extractDigestFromPath(fi.Path())
		if !referenced[digest] {
			size := fi.Size()
			if err := gc.driver.Delete(ctx, fi.Path()); err != nil {
				log.Printf("Failed to delete %s: %v", digest, err)
				return nil
			}
			deleted++
			reclaimed += size
			log.Printf("GC: deleted %s (%d bytes)", digest, size)
		}
		return nil
	})

	log.Printf("GC complete: deleted %d blobs, reclaimed %d bytes", deleted, reclaimed)
	return err
}
```

### GC Schedule

```bash
# Environment variables
GC_ENABLED=true
GC_INTERVAL=24h   # Daily by default
```

## Configuration

### Hold Service Environment Variables

```bash
# .env.hold

# Quota Configuration
QUOTA_ENABLED=true
QUOTA_DEFAULT_LIMIT=10737418240   # 10GB in bytes

# Garbage Collection
GC_ENABLED=true
GC_INTERVAL=24h
```

### Quota Limits in Bytes

| Size | Bytes |
|------|-------|
| 1 GB | 1073741824 |
| 5 GB | 5368709120 |
| 10 GB | 10737418240 |
| 50 GB | 53687091200 |
| 100 GB | 107374182400 |

## Future Enhancements

### 1. Quota API Endpoints

```
GET /xrpc/io.atcr.hold.getQuota?did={userDID}  - Get user's quota usage
GET /xrpc/io.atcr.hold.getQuotaBreakdown       - Storage by repository
```

### 2. Quota Alerts

- Warning thresholds at 80%, 90%, 95%
- Email/webhook notifications
- Grace period before hard enforcement

### 3. Tier-Based Quotas (Implemented)

ATCR uses quota tiers to limit storage per crew member, configured via `quotas.yaml`:

```yaml
# quotas.yaml
tiers:
  deckhand:        # Entry-level crew
    quota: 5GB
  bosun:           # Mid-level crew
    quota: 50GB
  quartermaster:   # High-level crew
    quota: 100GB

defaults:
  new_crew_tier: deckhand  # Default tier for new crew members
```

| Tier | Limit | Description |
|------|-------|-------------|
| deckhand | 5 GB | Entry-level crew member |
| bosun | 50 GB | Mid-level crew member |
| quartermaster | 100 GB | Senior crew member |
| owner (captain) | Unlimited | Hold owner always has unlimited storage |

**Tier Resolution:**

1. If the user is the captain (owner) → unlimited
2. If the crew member has an explicit tier → use that tier's limit
3. If the crew member has no tier → use `defaults.new_crew_tier`
4. If the default tier is not found → unlimited

**Crew Record Example:**

```json
{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:alice123",
  "role": "writer",
  "permissions": ["blob:write"],
  "tier": "bosun",
  "addedAt": "2026-01-04T12:00:00Z"
}
```

### 4. Rate Limiting

Pull rate limits (Docker Hub style):

- Anonymous: 100 pulls per 6 hours per IP
- Authenticated: 200 pulls per 6 hours
- Paid: unlimited

### 5. Quota Purchasing

- Stripe integration for additional storage
- $0.10/GB/month pricing (industry standard)

## References

- **Harbor Quotas:** https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- **ATProto Spec:** https://atproto.com/specs/record
- **OCI Distribution Spec:** https://github.com/opencontainers/distribution-spec

---

**Document Version:** 2.0
**Last Updated:** 2026-01-04
**Model:** Per-user layer tracking with ATProto records