# ATCR Quota System
This document describes ATCR's storage quota implementation using ATProto records for per-user layer tracking.
## Table of Contents
- [Overview](#overview)
- [Quota Model](#quota-model)
- [Layer Record Schema](#layer-record-schema)
- [Quota Calculation](#quota-calculation)
- [Push Flow](#push-flow)
- [Delete Flow](#delete-flow)
- [Garbage Collection](#garbage-collection)
- [Configuration](#configuration)
- [Future Enhancements](#future-enhancements)
## Overview
ATCR implements per-user storage quotas to:
1. **Limit storage consumption** on shared hold services
2. **Provide transparency** (show users their storage usage)
3. **Enable fair billing** (users pay for what they use)
**Key principle:** Users pay for layers they reference, deduplicated per-user. If you push the same layer in multiple images, you only pay once.
### Example Scenario
```
Alice pushes myapp:v1 (layers A, B, C - each 100MB)
→ Creates 3 layer records in hold's PDS
→ Alice's quota: 300MB (3 unique layers)
Alice pushes myapp:v2 (layers A, B, D - D also 100MB)
→ Creates 3 more layer records (A, B again, plus D)
→ Alice's quota: 400MB (4 unique layers: A, B, C, D)
→ Layers A, B appear twice in records but deduplicated in quota calc
Bob pushes his-app:latest (layers A, E - E also 100MB)
→ Creates 2 layer records for Bob
→ Bob's quota: 200MB (2 unique layers: A, E)
→ Layer A shared with Alice in S3, but Bob pays for his own usage
Physical S3 storage: 500MB (A, B, C, D, E - deduplicated globally)
Alice's quota: 400MB
Bob's quota: 200MB
```
## Quota Model
### Everyone Pays for What They Upload
Each user is charged for all unique layers they reference, regardless of whether those layers exist in S3 from other users' uploads.
**Why this model?**
- **Simple mental model**: "I pushed 500MB of layers, I use 500MB of quota"
- **Predictable**: Your quota doesn't change based on others' actions
- **Clean deletion**: Delete manifest → layer records removed → quota freed
- **No cross-user dependencies**: Users are isolated
**Trade-off:**
- Total claimed storage can exceed physical S3 storage
- This is acceptable - deduplication is an operational benefit for ATCR, not a billing feature
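To make the trade-off concrete, here is a short sketch using the sizes from the example scenario above (in MB); `uniqueSize` is an illustrative helper, not ATCR code, but it is the same deduplicate-then-sum rule the quota calculation uses:

```go
package main

import "fmt"

// uniqueSize sums layer sizes after deduplicating digests -
// exactly how a single user's quota is computed.
func uniqueSize(digests []string, sizes map[string]int64) int64 {
	seen := map[string]bool{}
	var total int64
	for _, d := range digests {
		if !seen[d] {
			seen[d] = true
			total += sizes[d]
		}
	}
	return total
}

func main() {
	sizes := map[string]int64{"A": 100, "B": 100, "C": 100, "D": 100, "E": 100} // MB

	alice := []string{"A", "B", "C", "A", "B", "D"} // myapp:v1 + myapp:v2
	bob := []string{"A", "E"}                       // his-app:latest
	everyone := append(append([]string{}, alice...), bob...)

	fmt.Println(uniqueSize(alice, sizes))    // Alice's quota: 400
	fmt.Println(uniqueSize(bob, sizes))      // Bob's quota: 200
	fmt.Println(uniqueSize(everyone, sizes)) // physical S3 storage: 500
}
```

Summing the two quotas gives 600MB of claimed storage against 500MB of physical storage, which is the acceptable over-count described above.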
### ATProto-Native Storage
Layer tracking uses ATProto records stored in the hold's embedded PDS:
- **Collection**: `io.atcr.hold.layer`
- **Repository**: Hold's DID (e.g., `did:web:hold01.atcr.io`)
- **Records**: One per manifest-layer relationship (TID-based keys)
This approach:
- Keeps quota data in ATProto (no separate database)
- Enables standard ATProto sync/query mechanisms
- Provides full audit trail of layer usage
## Layer Record Schema
### LayerRecord
```go
// pkg/atproto/lexicon.go
type LayerRecord struct {
	Type      string `json:"$type"`     // "io.atcr.hold.layer"
	Digest    string `json:"digest"`    // Layer digest (sha256:abc123...)
	Size      int64  `json:"size"`      // Size in bytes
	MediaType string `json:"mediaType"` // e.g., "application/vnd.oci.image.layer.v1.tar+gzip"
	Manifest  string `json:"manifest"`  // at://did:plc:alice/io.atcr.manifest/abc123
	UserDID   string `json:"userDid"`   // User's DID for quota grouping
	CreatedAt string `json:"createdAt"` // ISO 8601 timestamp
}
```
### Record Key
Records use TID (timestamp-based ID) as the rkey. This means:
- Multiple records can exist for the same layer (from different manifests)
- Deduplication happens at query time, not storage time
- Simple append-only writes on manifest push
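The useful property of TID rkeys is that lexicographic order matches creation order. The real TID format is defined by the ATProto spec (it also packs a clock identifier); as a simplified illustration only, this sketch encodes a microsecond timestamp with the same sortable base32 alphabet TIDs use:

```go
package main

import "fmt"

// The base32-sortable alphabet used by ATProto TIDs: its characters
// are in ascending ASCII order, so fixed-width strings sort numerically.
const b32sortable = "234567abcdefghijklmnopqrstuvwxyz"

// sortableKey encodes a 64-bit value as 13 fixed-width characters.
// This is NOT a spec-complete TID implementation - just a demonstration
// of why TID-keyed records list back in creation order.
func sortableKey(n uint64) string {
	buf := make([]byte, 13)
	for i := 12; i >= 0; i-- {
		buf[i] = b32sortable[n&0x1f]
		n >>= 5
	}
	return string(buf)
}

func main() {
	// Two records created in sequence get lexicographically increasing
	// rkeys, so listing by rkey is also listing by creation time.
	earlier := sortableKey(1704400000000000) // e.g. microseconds since epoch
	later := sortableKey(1704400000000001)
	fmt.Println(earlier < later) // true
}
```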
### Example Records
```
Manifest A (layers X, Y, Z) → creates 3 records
Manifest B (layers X, W) → creates 2 records
io.atcr.hold.layer collection:
┌──────────────┬────────┬──────┬───────────────────────────────────┬─────────────────┐
│ rkey (TID) │ digest │ size │ manifest │ userDid │
├──────────────┼────────┼──────┼───────────────────────────────────┼─────────────────┤
│ 3jui7...001 │ X │ 100 │ at://did:plc:alice/.../manifestA │ did:plc:alice │
│ 3jui7...002 │ Y │ 200 │ at://did:plc:alice/.../manifestA │ did:plc:alice │
│ 3jui7...003 │ Z │ 150 │ at://did:plc:alice/.../manifestA │ did:plc:alice │
│ 3jui7...004 │ X │ 100 │ at://did:plc:alice/.../manifestB │ did:plc:alice │ ← duplicate digest
│ 3jui7...005 │ W │ 300 │ at://did:plc:alice/.../manifestB │ did:plc:alice │
└──────────────┴────────┴──────┴───────────────────────────────────┴─────────────────┘
```
## Quota Calculation
### Query: User's Unique Storage
```sql
-- Calculate quota by deduplicating layers
SELECT SUM(size) FROM (
    SELECT DISTINCT digest, size
    FROM io.atcr.hold.layer
    WHERE userDid = ?
) AS unique_layers
```
Using the example above:
- Layer X appears twice but counted once: 100
- Layers Y, Z, W counted once each: 200 + 150 + 300
- **Total: 750 bytes**
### Implementation
```go
// pkg/hold/quota/quota.go
type QuotaManager struct {
	pds *pds.Server // Hold's embedded PDS
}

// GetUsage calculates a user's current quota usage
func (q *QuotaManager) GetUsage(ctx context.Context, userDID string) (int64, error) {
	// List all layer records for this user
	records, err := q.pds.ListRecords(ctx, LayerCollection, userDID)
	if err != nil {
		return 0, err
	}

	// Deduplicate by digest
	uniqueLayers := make(map[string]int64) // digest -> size
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		if layer.UserDID == userDID {
			uniqueLayers[layer.Digest] = layer.Size
		}
	}

	// Sum unique layer sizes
	var total int64
	for _, size := range uniqueLayers {
		total += size
	}
	return total, nil
}

// CheckQuota returns true if user has space for additional bytes
func (q *QuotaManager) CheckQuota(ctx context.Context, userDID string, additional int64, limit int64) (bool, int64, error) {
	current, err := q.GetUsage(ctx, userDID)
	if err != nil {
		return false, 0, err
	}
	return current+additional <= limit, current, nil
}
```
### Quota Response
```go
type QuotaInfo struct {
	Used      int64 `json:"used"`      // Current usage (deduplicated)
	Limit     int64 `json:"limit"`     // User's quota limit
	Available int64 `json:"available"` // Remaining space
}
```
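A small constructor for this response might look like the following sketch (`newQuotaInfo` is a hypothetical helper, not shown in the source tree); clamping `Available` at zero keeps over-quota users from seeing a negative number:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type QuotaInfo struct {
	Used      int64 `json:"used"`
	Limit     int64 `json:"limit"`
	Available int64 `json:"available"`
}

// newQuotaInfo derives the response from usage and limit,
// clamping Available at zero for over-quota users.
func newQuotaInfo(used, limit int64) QuotaInfo {
	avail := limit - used
	if avail < 0 {
		avail = 0
	}
	return QuotaInfo{Used: used, Limit: limit, Available: avail}
}

func main() {
	out, _ := json.Marshal(newQuotaInfo(750, 1024))
	fmt.Println(string(out)) // {"used":750,"limit":1024,"available":274}
}
```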
## Push Flow
### Step-by-Step: User Pushes Image
```
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Client │ │ AppView │ │ Hold │ │ User PDS │
│ (Docker) │ │ │ │ Service │ │ │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
│ │ │ │
│ 1. Upload blobs │ │ │
├─────────────────────>│ │ │
│ │ 2. Route to hold │ │
│ ├─────────────────────>│ │
│ │ │ 3. Store in S3 │
│ │ │ │
│ 4. PUT manifest │ │ │
├─────────────────────>│ │ │
│ │ │ │
│ │ 5. Calculate quota │ │
│ │ impact for new │ │
│ │ layers │ │
│ │ │ │
│ │ 6. Check quota limit │ │
│ ├─────────────────────>│ │
│ │<─────────────────────┤ │
│ │ │ │
│ │ 7. Store manifest │ │
│ ├──────────────────────┼─────────────────────>│
│ │ │ │
│ │ 8. Create layer │ │
│ │ records │ │
│ ├─────────────────────>│ │
│ │ │ 9. Write to │
│ │ │ hold's PDS │
│ │ │ │
│ 10. 201 Created │ │ │
│<─────────────────────┤ │ │
```
### Implementation
```go
// pkg/appview/storage/routing_repository.go
func (r *RoutingRepository) PutManifest(ctx context.Context, manifest distribution.Manifest) error {
	// Parse manifest to get layers
	layers := extractLayers(manifest)

	// Get user's current unique layers from hold
	existingLayers, err := r.holdClient.GetUserLayers(ctx, r.userDID)
	if err != nil {
		return err
	}
	existingSet := makeDigestSet(existingLayers)

	// Calculate quota impact (only new unique layers)
	var quotaImpact int64
	for _, layer := range layers {
		if !existingSet[layer.Digest] {
			quotaImpact += layer.Size
		}
	}

	// Check quota
	ok, current, err := r.quotaManager.CheckQuota(ctx, r.userDID, quotaImpact, r.quotaLimit)
	if err != nil {
		return err
	}
	if !ok {
		return fmt.Errorf("quota exceeded: used=%d, impact=%d, limit=%d",
			current, quotaImpact, r.quotaLimit)
	}

	// Store manifest in user's PDS
	manifestURI, err := r.atprotoClient.PutManifest(ctx, manifest)
	if err != nil {
		return err
	}

	// Create layer records in hold's PDS
	for _, layer := range layers {
		record := LayerRecord{
			Type:      "io.atcr.hold.layer",
			Digest:    layer.Digest,
			Size:      layer.Size,
			MediaType: layer.MediaType,
			Manifest:  manifestURI,
			UserDID:   r.userDID,
			CreatedAt: time.Now().Format(time.RFC3339),
		}
		if err := r.holdClient.CreateLayerRecord(ctx, record); err != nil {
			log.Printf("Warning: failed to create layer record: %v", err)
			// Continue - reconciliation will fix
		}
	}
	return nil
}
```
### Quota Check Timing
Quota is checked when the **manifest is pushed** (after blobs are uploaded):
- Blobs are uploaded first via presigned URLs
- The manifest push arrives last and triggers the quota check
- If the quota is exceeded, the manifest is rejected (orphaned blobs are cleaned up by GC)
This matches Harbor's approach and is common practice among container registries.
## Delete Flow
### Manifest Deletion
When a user deletes a manifest:
```
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ User │ │ AppView │ │ Hold │ │ User PDS │
│ UI │ │ │ │ Service │ │ │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
│ │ │ │
│ DELETE manifest │ │ │
├─────────────────────>│ │ │
│ │ │ │
│ │ 1. Delete manifest │ │
│ │ from user's PDS │ │
│ ├──────────────────────┼─────────────────────>│
│ │ │ │
│ │ 2. Delete layer │ │
│ │ records for this │ │
│ │ manifest │ │
│ ├─────────────────────>│ │
│ │ │ 3. Remove records │
│ │ │ where manifest │
│ │ │ == deleted URI │
│ │ │ │
│ 4. 204 No Content │ │ │
│<─────────────────────┤ │ │
```
### Implementation
```go
// pkg/appview/handlers/manifest.go
func (h *ManifestHandler) DeleteManifest(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	userDID := auth.GetDID(ctx)
	digest := chi.URLParam(r, "digest")

	// Get manifest URI before deletion
	manifestURI := fmt.Sprintf("at://%s/%s/%s", userDID, ManifestCollection, digest)

	// Delete manifest from user's PDS
	if err := h.atprotoClient.DeleteRecord(ctx, ManifestCollection, digest); err != nil {
		http.Error(w, "failed to delete manifest", http.StatusInternalServerError)
		return
	}

	// Delete associated layer records from hold's PDS
	if err := h.holdClient.DeleteLayerRecords(ctx, manifestURI); err != nil {
		log.Printf("Warning: failed to delete layer records: %v", err)
		// Continue - reconciliation will clean up
	}

	w.WriteHeader(http.StatusNoContent)
}
```
### Hold Service: Delete Layer Records
```go
// pkg/hold/pds/xrpc.go
func (s *Server) DeleteLayerRecords(ctx context.Context, manifestURI string) error {
	// List all layer records
	records, err := s.ListRecords(ctx, LayerCollection, "")
	if err != nil {
		return err
	}

	// Delete records matching this manifest
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		if layer.Manifest == manifestURI {
			if err := s.DeleteRecord(ctx, LayerCollection, record.RKey); err != nil {
				log.Printf("Failed to delete layer record %s: %v", record.RKey, err)
			}
		}
	}
	return nil
}
```
### Quota After Deletion
After deleting a manifest:
- Layer records for that manifest are removed
- Quota recalculated with `SELECT DISTINCT` query
- If layer was only in deleted manifest → quota decreases
- If layer exists in other manifests → quota unchanged (still deduplicated)
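Continuing the worked example (manifests A and B, layers X/Y/Z/W with the sizes from the table above): deleting manifest B removes its records for X and W, and recomputation drops the total from 750 to 450, because X is still referenced by manifest A while W is freed. A self-contained sketch of that recomputation:

```go
package main

import "fmt"

type rec struct {
	digest   string
	size     int64
	manifest string
}

// usage deduplicates by digest and sums, as in GetUsage above.
func usage(records []rec) int64 {
	unique := map[string]int64{}
	for _, r := range records {
		unique[r.digest] = r.size
	}
	var total int64
	for _, s := range unique {
		total += s
	}
	return total
}

func main() {
	records := []rec{
		{"X", 100, "manifestA"}, {"Y", 200, "manifestA"}, {"Z", 150, "manifestA"},
		{"X", 100, "manifestB"}, {"W", 300, "manifestB"},
	}
	fmt.Println(usage(records)) // 750

	// Delete manifest B: drop its layer records, then recompute
	var kept []rec
	for _, r := range records {
		if r.manifest != "manifestB" {
			kept = append(kept, r)
		}
	}
	fmt.Println(usage(kept)) // 450: X survives via manifestA, W is freed
}
```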
## Garbage Collection
### Orphaned Blobs
Orphaned blobs accumulate when:
1. Manifest push fails after blobs uploaded
2. Quota exceeded - manifest rejected
3. User deletes manifest - blobs may no longer be referenced
### GC Process
```go
// pkg/hold/gc/gc.go
func (gc *GarbageCollector) Run(ctx context.Context) error {
	// Step 1: Get all referenced digests from layer records
	records, err := gc.pds.ListRecords(ctx, LayerCollection, "")
	if err != nil {
		return err
	}
	referenced := make(map[string]bool)
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		referenced[layer.Digest] = true
	}
	log.Printf("Found %d referenced blobs", len(referenced))

	// Step 2: Walk S3 blobs and delete unreferenced
	var deleted, reclaimed int64
	err = gc.driver.Walk(ctx, "/docker/registry/v2/blobs", func(fi storagedriver.FileInfo) error {
		if fi.IsDir() {
			return nil
		}
		digest := extractDigestFromPath(fi.Path())
		if !referenced[digest] {
			size := fi.Size()
			if err := gc.driver.Delete(ctx, fi.Path()); err != nil {
				log.Printf("Failed to delete %s: %v", digest, err)
				return nil
			}
			deleted++
			reclaimed += size
			log.Printf("GC: deleted %s (%d bytes)", digest, size)
		}
		return nil
	})
	log.Printf("GC complete: deleted %d blobs, reclaimed %d bytes", deleted, reclaimed)
	return err
}
```
### GC Schedule
```bash
# Environment variable
GC_ENABLED=true
GC_INTERVAL=24h # Daily by default
```
## Configuration
### Hold Service Environment Variables
```bash
# .env.hold
# Quota Configuration
QUOTA_ENABLED=true
QUOTA_DEFAULT_LIMIT=10737418240 # 10GB in bytes
# Garbage Collection
GC_ENABLED=true
GC_INTERVAL=24h
```
### Quota Limits by Bytes
| Size | Bytes |
|------|-------|
| 1 GB | 1073741824 |
| 5 GB | 5368709120 |
| 10 GB | 10737418240 |
| 50 GB | 53687091200 |
| 100 GB | 107374182400 |
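Note the table's "GB" values are binary multiples (1 GB = 1024³ = 1073741824 bytes). A conversion helper matching that convention might look like this sketch (`parseSize` is hypothetical; the tier file below uses the same unit strings):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize converts strings like "10GB" to bytes using binary
// multiples, matching the table above (1GB = 1073741824 bytes).
func parseSize(s string) (int64, error) {
	units := []struct {
		suffix string
		factor int64
	}{{"GB", 1 << 30}, {"MB", 1 << 20}, {"KB", 1 << 10}, {"B", 1}}
	s = strings.TrimSpace(s)
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(s, u.suffix), 10, 64)
			if err != nil {
				return 0, err
			}
			return n * u.factor, nil
		}
	}
	return 0, fmt.Errorf("unknown size: %q", s)
}

func main() {
	n, _ := parseSize("10GB")
	fmt.Println(n) // 10737418240
}
```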
## Future Enhancements
### 1. Quota API Endpoints
```
GET /xrpc/io.atcr.hold.getQuota?did={userDID} - Get user's quota usage
GET /xrpc/io.atcr.hold.getQuotaBreakdown - Storage by repository
```
### 2. Quota Alerts
- Warning thresholds at 80%, 90%, 95%
- Email/webhook notifications
- Grace period before hard enforcement
### 3. Tier-Based Quotas (Implemented)
ATCR uses quota tiers to limit storage per crew member, configured via `quotas.yaml`:
```yaml
# quotas.yaml
tiers:
  deckhand:        # Entry-level crew
    quota: 5GB
  bosun:           # Mid-level crew
    quota: 50GB
  quartermaster:   # High-level crew
    quota: 100GB
defaults:
  new_crew_tier: deckhand  # Default tier for new crew members
```
| Tier | Limit | Description |
|------|-------|-------------|
| deckhand | 5 GB | Entry-level crew member |
| bosun | 50 GB | Mid-level crew member |
| quartermaster | 100 GB | Senior crew member |
| owner (captain) | Unlimited | Hold owner always has unlimited |
**Tier Resolution:**
1. If user is captain (owner) → unlimited
2. If crew member has explicit tier → use that tier's limit
3. If crew member has no tier → use `defaults.new_crew_tier`
4. If default tier not found → unlimited
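The four resolution steps can be sketched as follows (the `-1` unlimited sentinel and the `Config` shape are assumptions for illustration, not the actual types):

```go
package main

import "fmt"

// Unlimited marks quotas with no cap; the hold owner always gets it.
const Unlimited int64 = -1

type Config struct {
	Tiers       map[string]int64 // tier name -> limit in bytes
	DefaultTier string           // defaults.new_crew_tier
}

// resolveLimit implements the four resolution steps above.
func resolveLimit(cfg Config, did, ownerDID, memberTier string) int64 {
	if did == ownerDID {
		return Unlimited // 1. captain (owner)
	}
	if limit, ok := cfg.Tiers[memberTier]; ok {
		return limit // 2. explicit tier on the crew record
	}
	if limit, ok := cfg.Tiers[cfg.DefaultTier]; ok {
		return limit // 3. fall back to defaults.new_crew_tier
	}
	return Unlimited // 4. default tier missing from config
}

func main() {
	cfg := Config{
		Tiers:       map[string]int64{"deckhand": 5 << 30, "bosun": 50 << 30, "quartermaster": 100 << 30},
		DefaultTier: "deckhand",
	}
	fmt.Println(resolveLimit(cfg, "did:plc:bob", "did:plc:captain", "bosun")) // 53687091200
	fmt.Println(resolveLimit(cfg, "did:plc:new", "did:plc:captain", ""))      // 5368709120
	fmt.Println(resolveLimit(cfg, "did:plc:captain", "did:plc:captain", "")) // -1
}
```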
**Crew Record Example:**
```json
{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:alice123",
  "role": "writer",
  "permissions": ["blob:write"],
  "tier": "bosun",
  "addedAt": "2026-01-04T12:00:00Z"
}
```
### 4. Rate Limiting
Pull rate limits (Docker Hub style):
- Anonymous: 100 pulls per 6 hours per IP
- Authenticated: 200 pulls per 6 hours
- Paid: Unlimited
### 5. Quota Purchasing
- Stripe integration for additional storage
- Pricing on the order of $0.10/GB/month, in line with typical registry storage pricing
## References
- **Harbor Quotas:** https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- **ATProto Spec:** https://atproto.com/specs/record
- **OCI Distribution Spec:** https://github.com/opencontainers/distribution-spec
---
**Document Version:** 2.0
**Last Updated:** 2026-01-04
**Model:** Per-user layer tracking with ATProto records