# ATCR Quota System

This document describes ATCR's storage quota implementation, which uses ATProto records for per-user layer tracking.

## Table of Contents

- [Overview](#overview)
- [Quota Model](#quota-model)
- [Layer Record Schema](#layer-record-schema)
- [Quota Calculation](#quota-calculation)
- [Push Flow](#push-flow)
- [Delete Flow](#delete-flow)
- [Garbage Collection](#garbage-collection)
- [Configuration](#configuration)
- [Future Enhancements](#future-enhancements)

## Overview

ATCR implements per-user storage quotas to:

1. **Limit storage consumption** on shared hold services
2. **Provide transparency** (show users their storage usage)
3. **Enable fair billing** (users pay for what they use)

**Key principle:** Users pay for the layers they reference, deduplicated per user. If you push the same layer in multiple images, you only pay for it once.

### Example Scenario

```
Alice pushes myapp:v1 (layers A, B, C - each 100MB)
→ Creates 3 layer records in hold's PDS
→ Alice's quota: 300MB (3 unique layers)

Alice pushes myapp:v2 (layers A, B, D)
→ Creates 3 more layer records (A, B again, plus D)
→ Alice's quota: 400MB (4 unique layers: A, B, C, D)
→ Layers A, B appear twice in records but are deduplicated in the quota calculation

Bob pushes his-app:latest (layers A, E)
→ Creates 2 layer records for Bob
→ Bob's quota: 200MB (2 unique layers: A, E)
→ Layer A is shared with Alice in S3, but Bob pays for his own usage

Physical S3 storage: 500MB (A, B, C, D, E - deduplicated globally)
Alice's quota: 400MB
Bob's quota: 200MB
```

## Quota Model

### Everyone Pays for What They Upload

Each user is charged for all unique layers they reference, regardless of whether those layers already exist in S3 from other users' uploads.
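The charging rule above can be sketched in a few lines of Go. This is a standalone illustration of the per-user deduplication, not the actual ATCR implementation; the `record` struct here is a simplified stand-in for `LayerRecord`.

```go
package main

import "fmt"

// record is a simplified manifest-layer relationship, one per push.
type record struct {
	userDID string
	digest  string
	size    int64 // bytes
}

// usage returns each user's quota usage, counting every unique
// digest a user references exactly once, regardless of how many
// manifests reference it.
func usage(records []record) map[string]int64 {
	seen := map[string]map[string]bool{} // userDID -> set of digests
	totals := map[string]int64{}
	for _, r := range records {
		if seen[r.userDID] == nil {
			seen[r.userDID] = map[string]bool{}
		}
		if !seen[r.userDID][r.digest] {
			seen[r.userDID][r.digest] = true
			totals[r.userDID] += r.size
		}
	}
	return totals
}

func main() {
	const mb = int64(1024 * 1024)
	records := []record{
		// Alice: myapp:v1 (A, B, C) then myapp:v2 (A, B, D)
		{"alice", "A", 100 * mb}, {"alice", "B", 100 * mb}, {"alice", "C", 100 * mb},
		{"alice", "A", 100 * mb}, {"alice", "B", 100 * mb}, {"alice", "D", 100 * mb},
		// Bob: his-app:latest (A, E)
		{"bob", "A", 100 * mb}, {"bob", "E", 100 * mb},
	}
	totals := usage(records)
	fmt.Println(totals["alice"]/mb, totals["bob"]/mb) // prints: 400 200
}
```

Note that Bob's duplicate reference to layer A counts against his quota even though S3 stores the blob once; global deduplication is an operational saving, not a billing input.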
**Why this model?**

- **Simple mental model**: "I pushed 500MB of layers, I use 500MB of quota"
- **Predictable**: Your quota doesn't change based on others' actions
- **Clean deletion**: Delete a manifest → its layer records are removed → quota is freed
- **No cross-user dependencies**: Users are isolated

**Trade-off:**

- Total claimed storage can exceed physical S3 storage
- This is acceptable: deduplication is an operational benefit for ATCR, not a billing feature

### ATProto-Native Storage

Layer tracking uses ATProto records stored in the hold's embedded PDS:

- **Collection**: `io.atcr.hold.layer`
- **Repository**: Hold's DID (e.g., `did:web:hold01.atcr.io`)
- **Records**: One per manifest-layer relationship (TID-based keys)

This approach:

- Keeps quota data in ATProto (no separate database)
- Enables standard ATProto sync/query mechanisms
- Provides a full audit trail of layer usage

## Layer Record Schema

### LayerRecord

```go
// pkg/atproto/lexicon.go
type LayerRecord struct {
	Type      string `json:"$type"`     // "io.atcr.hold.layer"
	Digest    string `json:"digest"`    // Layer digest (sha256:abc123...)
	Size      int64  `json:"size"`      // Size in bytes
	MediaType string `json:"mediaType"` // e.g., "application/vnd.oci.image.layer.v1.tar+gzip"
	Manifest  string `json:"manifest"`  // at://did:plc:alice/io.atcr.manifest/abc123
	UserDID   string `json:"userDid"`   // User's DID for quota grouping
	CreatedAt string `json:"createdAt"` // ISO 8601 timestamp
}
```

### Record Key

Records use a TID (timestamp-based ID) as the rkey.
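For illustration, the value stored under such a TID rkey might look like the following JSON (the digest and timestamp are placeholder values in the doc's own style):

```json
{
  "$type": "io.atcr.hold.layer",
  "digest": "sha256:abc123...",
  "size": 104857600,
  "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
  "manifest": "at://did:plc:alice/io.atcr.manifest/abc123",
  "userDid": "did:plc:alice",
  "createdAt": "2026-01-04T12:00:00Z"
}
```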
This means:

- Multiple records can exist for the same layer (from different manifests)
- Deduplication happens at query time, not storage time
- Manifest pushes are simple append-only writes

### Example Records

```
Manifest A (layers X, Y, Z) → creates 3 records
Manifest B (layers X, W)    → creates 2 records

io.atcr.hold.layer collection:

┌──────────────┬────────┬──────┬───────────────────────────────────┬─────────────────┐
│ rkey (TID)   │ digest │ size │ manifest                          │ userDid         │
├──────────────┼────────┼──────┼───────────────────────────────────┼─────────────────┤
│ 3jui7...001  │ X      │ 100  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...002  │ Y      │ 200  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...003  │ Z      │ 150  │ at://did:plc:alice/.../manifestA  │ did:plc:alice   │
│ 3jui7...004  │ X      │ 100  │ at://did:plc:alice/.../manifestB  │ did:plc:alice   │ ← duplicate digest
│ 3jui7...005  │ W      │ 300  │ at://did:plc:alice/.../manifestB  │ did:plc:alice   │
└──────────────┴────────┴──────┴───────────────────────────────────┴─────────────────┘
```

## Quota Calculation

### Query: User's Unique Storage

```sql
-- Calculate quota by deduplicating layers
SELECT SUM(size) FROM (
    SELECT DISTINCT digest, size
    FROM io.atcr.hold.layer
    WHERE userDid = ?
)
```

Using the example above:

- Layer X appears twice but is counted once: 100
- Layers Y, Z, W are counted once each: 200 + 150 + 300
- **Total: 750 bytes**

### Implementation

```go
// pkg/hold/quota/quota.go
type QuotaManager struct {
	pds *pds.Server // Hold's embedded PDS
}

// GetUsage calculates a user's current quota usage.
func (q *QuotaManager) GetUsage(ctx context.Context, userDID string) (int64, error) {
	// List all layer records for this user
	records, err := q.pds.ListRecords(ctx, LayerCollection, userDID)
	if err != nil {
		return 0, err
	}

	// Deduplicate by digest
	uniqueLayers := make(map[string]int64) // digest -> size
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		if layer.UserDID == userDID {
			uniqueLayers[layer.Digest] = layer.Size
		}
	}

	// Sum unique layer sizes
	var total int64
	for _, size := range uniqueLayers {
		total += size
	}
	return total, nil
}

// CheckQuota reports whether the user has space for additional bytes,
// along with their current usage.
func (q *QuotaManager) CheckQuota(ctx context.Context, userDID string, additional, limit int64) (bool, int64, error) {
	current, err := q.GetUsage(ctx, userDID)
	if err != nil {
		return false, 0, err
	}
	return current+additional <= limit, current, nil
}
```

### Quota Response

```go
type QuotaInfo struct {
	Used      int64 `json:"used"`      // Current usage (deduplicated)
	Limit     int64 `json:"limit"`     // User's quota limit
	Available int64 `json:"available"` // Remaining space
}
```

## Push Flow

### Step-by-Step: User Pushes an Image

```
┌──────────┐           ┌──────────┐           ┌──────────┐           ┌──────────┐
│  Client  │           │ AppView  │           │   Hold   │           │ User PDS │
│ (Docker) │           │          │           │ Service  │           │          │
└──────────┘           └──────────┘           └──────────┘           └──────────┘
     │                      │                      │                      │
     │ 1. Upload blobs      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │ 2. Route to hold     │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 3. Store in S3       │
     │                      │                      │                      │
     │ 4. PUT manifest      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │                      │                      │
     │                      │ 5. Calculate quota   │                      │
     │                      │    impact for new    │                      │
     │                      │    layers            │                      │
     │                      │                      │                      │
     │                      │ 6. Check quota limit │                      │
     │                      ├─────────────────────>│                      │
     │                      │<─────────────────────┤                      │
     │                      │                      │                      │
     │                      │ 7. Store manifest    │                      │
     │                      ├──────────────────────┼─────────────────────>│
     │                      │                      │                      │
     │                      │ 8. Create layer      │                      │
     │                      │    records           │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 9. Write to          │
     │                      │                      │    hold's PDS        │
     │                      │                      │                      │
     │ 10. 201 Created      │                      │                      │
     │<─────────────────────┤                      │                      │
```

### Implementation

```go
// pkg/appview/storage/routing_repository.go
func (r *RoutingRepository) PutManifest(ctx context.Context, manifest distribution.Manifest) error {
	// Parse the manifest to get its layers
	layers := extractLayers(manifest)

	// Get the user's current unique layers from the hold
	existingLayers, err := r.holdClient.GetUserLayers(ctx, r.userDID)
	if err != nil {
		return err
	}
	existingSet := makeDigestSet(existingLayers)

	// Calculate quota impact (only new unique layers)
	var quotaImpact int64
	for _, layer := range layers {
		if !existingSet[layer.Digest] {
			quotaImpact += layer.Size
		}
	}

	// Check quota
	ok, current, err := r.quotaManager.CheckQuota(ctx, r.userDID, quotaImpact, r.quotaLimit)
	if err != nil {
		return err
	}
	if !ok {
		return fmt.Errorf("quota exceeded: used=%d, impact=%d, limit=%d", current, quotaImpact, r.quotaLimit)
	}

	// Store the manifest in the user's PDS
	manifestURI, err := r.atprotoClient.PutManifest(ctx, manifest)
	if err != nil {
		return err
	}

	// Create layer records in the hold's PDS
	for _, layer := range layers {
		record := LayerRecord{
			Type:      "io.atcr.hold.layer",
			Digest:    layer.Digest,
			Size:      layer.Size,
			MediaType: layer.MediaType,
			Manifest:  manifestURI,
			UserDID:   r.userDID,
			CreatedAt: time.Now().Format(time.RFC3339),
		}
		if err := r.holdClient.CreateLayerRecord(ctx, record); err != nil {
			log.Printf("Warning: failed to create layer record: %v", err)
			// Continue - reconciliation will fix this up later
		}
	}
	return nil
}
```

### Quota Check Timing

Quota is checked when the **manifest is pushed** (after the blobs are uploaded):

- Blobs upload first via presigned URLs
- Manifest
pushed last triggers the quota check
- If the quota is exceeded, the manifest is rejected (orphaned blobs are cleaned up by GC)

This matches Harbor's approach and is the industry standard.

## Delete Flow

### Manifest Deletion

When a user deletes a manifest:

```
┌──────────┐           ┌──────────┐           ┌──────────┐           ┌──────────┐
│   User   │           │ AppView  │           │   Hold   │           │ User PDS │
│    UI    │           │          │           │ Service  │           │          │
└──────────┘           └──────────┘           └──────────┘           └──────────┘
     │                      │                      │                      │
     │ DELETE manifest      │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │                      │                      │
     │                      │ 1. Delete manifest   │                      │
     │                      │    from user's PDS   │                      │
     │                      ├──────────────────────┼─────────────────────>│
     │                      │                      │                      │
     │                      │ 2. Delete layer      │                      │
     │                      │    records for this  │                      │
     │                      │    manifest          │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │ 3. Remove records    │
     │                      │                      │    where manifest    │
     │                      │                      │    == deleted URI    │
     │                      │                      │                      │
     │ 4. 204 No Content    │                      │                      │
     │<─────────────────────┤                      │                      │
```

### Implementation

```go
// pkg/appview/handlers/manifest.go
func (h *ManifestHandler) DeleteManifest(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	userDID := auth.GetDID(ctx)
	digest := chi.URLParam(r, "digest")

	// Build the manifest URI before deletion
	manifestURI := fmt.Sprintf("at://%s/%s/%s", userDID, ManifestCollection, digest)

	// Delete the manifest from the user's PDS
	if err := h.atprotoClient.DeleteRecord(ctx, ManifestCollection, digest); err != nil {
		http.Error(w, "failed to delete manifest", http.StatusInternalServerError)
		return
	}

	// Delete associated layer records from the hold's PDS
	if err := h.holdClient.DeleteLayerRecords(ctx, manifestURI); err != nil {
		log.Printf("Warning: failed to delete layer records: %v", err)
		// Continue - reconciliation will clean up
	}

	w.WriteHeader(http.StatusNoContent)
}
```

### Hold Service: Delete Layer Records

```go
// pkg/hold/pds/xrpc.go
func (s *Server) DeleteLayerRecords(ctx context.Context, manifestURI string) error {
	// List all layer records
	records, err := s.ListRecords(ctx, LayerCollection, "")
	if err != nil {
		return err
	}

	// Delete records that reference this manifest
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		if layer.Manifest == manifestURI {
			if err := s.DeleteRecord(ctx, LayerCollection, record.RKey); err != nil {
				log.Printf("Failed to delete layer record %s: %v", record.RKey, err)
			}
		}
	}
	return nil
}
```

### Quota After Deletion

After deleting a manifest:

- Layer records for that manifest are removed
- Quota is recalculated with the `SELECT DISTINCT` query
- If a layer was referenced only by the deleted manifest → quota decreases
- If a layer is still referenced by other manifests → quota is unchanged (still deduplicated)

## Garbage Collection

### Orphaned Blobs

Orphaned blobs accumulate when:

1. A manifest push fails after its blobs were uploaded
2. The quota is exceeded and the manifest is rejected
3. A user deletes a manifest whose blobs are no longer referenced

### GC Process

```go
// pkg/hold/gc/gc.go
func (gc *GarbageCollector) Run(ctx context.Context) error {
	// Step 1: Collect all referenced digests from layer records
	records, err := gc.pds.ListRecords(ctx, LayerCollection, "")
	if err != nil {
		return err
	}
	referenced := make(map[string]bool)
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		referenced[layer.Digest] = true
	}
	log.Printf("Found %d referenced blobs", len(referenced))

	// Step 2: Walk S3 blobs and delete unreferenced ones
	var deleted, reclaimed int64
	err = gc.driver.Walk(ctx, "/docker/registry/v2/blobs", func(fi storagedriver.FileInfo) error {
		if fi.IsDir() {
			return nil
		}
		digest := extractDigestFromPath(fi.Path())
		if !referenced[digest] {
			size := fi.Size()
			if err := gc.driver.Delete(ctx, fi.Path()); err != nil {
				log.Printf("Failed to delete %s: %v", digest, err)
				return nil
			}
			deleted++
			reclaimed += size
			log.Printf("GC: deleted %s (%d bytes)", digest, size)
		}
		return nil
	})

	log.Printf("GC complete: deleted %d blobs, reclaimed %d bytes", deleted, reclaimed)
	return err
}
```

### GC Schedule

```bash
# Environment variables
GC_ENABLED=true
GC_INTERVAL=24h   # Daily by default
```

## Configuration

### Hold Service Environment Variables

```bash
# .env.hold

# Quota Configuration
QUOTA_ENABLED=true
QUOTA_DEFAULT_LIMIT=10737418240   # 10GB in bytes

# Garbage Collection
GC_ENABLED=true
GC_INTERVAL=24h
```

### Quota Limits in Bytes

| Size | Bytes |
|------|-------|
| 1 GB | 1073741824 |
| 5 GB | 5368709120 |
| 10 GB | 10737418240 |
| 50 GB | 53687091200 |
| 100 GB | 107374182400 |

## Future Enhancements

### 1. Quota API Endpoints

```
GET /xrpc/io.atcr.hold.getQuota?did={userDID}  - Get user's quota usage
GET /xrpc/io.atcr.hold.getQuotaBreakdown       - Storage by repository
```

### 2. Quota Alerts

- Warning thresholds at 80%, 90%, 95%
- Email/webhook notifications
- Grace period before hard enforcement

### 3. Tier-Based Quotas (Implemented)

ATCR uses quota tiers to limit storage per crew member, configured via `quotas.yaml`:

```yaml
# quotas.yaml
tiers:
  deckhand:        # Entry-level crew
    quota: 5GB
  bosun:           # Mid-level crew
    quota: 50GB
  quartermaster:   # High-level crew
    quota: 100GB

defaults:
  new_crew_tier: deckhand  # Default tier for new crew members
```

| Tier | Limit | Description |
|------|-------|-------------|
| deckhand | 5 GB | Entry-level crew member |
| bosun | 50 GB | Mid-level crew member |
| quartermaster | 100 GB | Senior crew member |
| owner (captain) | Unlimited | Hold owner always has unlimited storage |

**Tier Resolution:**

1. If the user is the captain (owner) → unlimited
2. If the crew member has an explicit tier → use that tier's limit
3. If the crew member has no tier → use `defaults.new_crew_tier`
4. If the default tier is not found → unlimited

**Crew Record Example:**

```json
{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:alice123",
  "role": "writer",
  "permissions": ["blob:write"],
  "tier": "bosun",
  "addedAt": "2026-01-04T12:00:00Z"
}
```

### 4. Rate Limiting

Pull rate limits (Docker Hub style):

- Anonymous: 100 pulls per 6 hours per IP
- Authenticated: 200 pulls per 6 hours
- Paid: unlimited

### 5. Quota Purchasing

- Stripe integration for additional storage
- $0.10/GB/month pricing (industry standard)

## References

- **Harbor Quotas:** https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- **ATProto Spec:** https://atproto.com/specs/record
- **OCI Distribution Spec:** https://github.com/opencontainers/distribution-spec

---

**Document Version:** 2.0
**Last Updated:** 2026-01-04
**Model:** Per-user layer tracking with ATProto records