# Embedded PDS Architecture for Hold Services

This document explores the evolution of ATCR's hold service architecture toward becoming an embedded ATProto PDS (Personal Data Server).

## Motivation

### Comparison to Other ATProto Projects

Several ATProto projects face similar challenges with large data storage:
| Project | Large Data | Metadata | Current Solution |
|---|---|---|---|
| tangled.org | Git objects | Issues, PRs, comments | External knot storage |
| stream.place | Video segments | Stream info, chat | Embedded "static PDS" |
| ATCR | Container blobs | Manifests, comments, builds | External hold service |
**Common problem:** Large binary data can't realistically live in user PDSs, but interaction metadata gets fragmented across different users' PDSs.

**Emerging pattern:** Application-specific storage services with embedded minimal PDS implementations.
## The Fragmentation Problem

### Tangled.org Example

```
user/myproject repository
├── Git data → Knot (external storage)
├── Issues → Created by @alice → Lives in alice's PDS
├── PRs → Created by @bob → Lives in bob's PDS
└── Comments → Created by @charlie → Lives in charlie's PDS
```

Problems:

- The repo owner can't easily export all issues and PRs
- No single source of truth for repo metadata
- Interaction history is fragmented across PDSs
- Repo data can't be encrypted while maintaining collaboration
### ATCR's Similar Challenge

```
atcr.io/alice/myapp
├── Manifests → alice's PDS
├── Blobs → Hold service (external)
└── Future: Comments, builds, attestations → Where?
```
### Stream.place's Approach

Stream.place built a minimal "static PDS" embedded in their application with just the XRPC endpoints they need:

- `com.atproto.repo.describeRepo`
- `com.atproto.sync.subscribeRepos`
- Minimal read methods

Why: to avoid rate-limiting Bluesky's infrastructure with video segments while staying ATProto-native.
## Current Hold Service Architecture

The current hold service is intentionally minimal:

Hold Service =

- OAuth token validation (call the user's PDS)
- Generate presigned S3 URLs
- Return HTTP redirects
- Optional crew membership checks

Endpoints:

- `POST /get-presigned-url` → S3 download URL
- `POST /put-presigned-url` → S3 upload URL
- `GET /blobs/{digest}` → Proxy fallback
- `PUT /blobs/{digest}` → Proxy fallback
- `GET /health` → Health check

Resource footprint:

- Single Go binary (~20MB)
- No database (stateless)
- No PDS (validates against the user's PDS)
- Minimal memory/CPU (just signing URLs)
- S3 does all the heavy lifting

This is already about as cheap as the service can get for what it does: OAuth validation plus URL signing.
## Why Not Force Blobs into User PDSs?

### Size Considerations

PDS blob limits: default ~50MB (Bluesky's may be lower)

Container layer sizes:

- Alpine base: ~5MB ✓
- Config blobs: ~1-5KB ✓
- Small Go binaries: 10-30MB ✓
- Node.js base: 100-200MB ✗
- Python base: 50-100MB ✗
- ML models: 500MB - 10GB ✗
- Large datasets: huge ✗

Reality: many, if not most, layers exceed 50MB. A split-brain approach would be the norm, not the exception.
### Split-Brain Complexity

```go
func (s *SplitBlobStore) Create(ctx context.Context, options ...) {
	// Challenges:
	// 1. Monolithic uploads: size known upfront ✓
	// 2. Chunked uploads: size unknown until complete ✗
	// 3. Resumable uploads: state management across PDS/hold ✗
	// 4. Mount/cross-repo: which backend to check? ✗
}
```

Detection works for simple cases but breaks down with:

- Multipart/chunked uploads (no size until complete)
- Resumable uploads (stateful across boundaries)
- Cross-repository blob mounts (which backend?)
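To make the brittleness concrete, here is a minimal routing sketch under an assumed ~50MB PDS blob limit (`routeBlob` and the size conventions are hypothetical, not ATCR code). Chunked uploads report an unknown size, which is exactly the case upfront routing cannot handle:

```go
package main

import (
	"errors"
	"fmt"
)

const pdsBlobLimit = 50 << 20 // assumed ~50 MiB default PDS blob limit

var errUnknownSize = errors.New("size unknown until upload completes: cannot route upfront")

// routeBlob picks a backend by declared size. Chunked uploads report -1
// (unknown size), which is exactly where upfront routing fails.
func routeBlob(size int64) (string, error) {
	switch {
	case size < 0:
		return "", errUnknownSize
	case size <= pdsBlobLimit:
		return "pds", nil
	default:
		return "hold", nil
	}
}

func main() {
	for _, size := range []int64{5 << 20, 200 << 20, -1} {
		backend, err := routeBlob(size)
		fmt.Println(backend, err)
	}
}
```

Every new upload path (resumable, cross-repo mount) adds another case that has to answer "which backend?" before the data exists, which is why the document accepts the trade-off below.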
## Pragmatic Decision

Accept the trade-off:

- Blobs live in holds (practical for large data)
- Manifests live in the user's PDS (ownership of metadata)
- Focus on making holds easy to deploy and migrate

Users still own the important part: the manifest is the source of truth for what the image is.
## Embedded PDS Vision

### Key Insight: Hold Is the PDS

Because blobs are content-addressed and deduplicated globally, there is no singular owner of blob data: multiple images share the same base layer blobs.

Therefore: the hold itself is the PDS (with identity did:web:hold1.example.com), not the individual image repositories.

### Proposed Architecture

```
Hold Service = Minimal PDS (did:web:hold1.example.com)
├── Standard ATProto blob endpoints:
│   ├── com.atproto.sync.uploadBlob
│   ├── com.atproto.sync.getBlob
│   └── Blob storage → S3 (like a normal PDS)
├── Custom XRPC methods:
│   ├── io.atcr.hold.delegateAccess (IAM)
│   ├── io.atcr.hold.getUploadUrl (optimization)
│   ├── io.atcr.hold.getDownloadUrl (optimization)
│   ├── io.atcr.hold.exportImage (data portability)
│   └── io.atcr.hold.getStats (metadata)
└── Records (hold's own PDS):
    ├── io.atcr.hold.crew (crew membership)
    └── io.atcr.hold.config (hold configuration)
```
### Benefits

- **ATProto-native**: uses standard XRPC, not a custom REST API
- **Discoverable**: the hold's DID document advertises its capabilities
- **Portable**: users can export images via XRPC
- **Standardized**: blob operations follow ATProto conventions
- **Future-proof**: more XRPC methods can be added as needed
- **Interoperable**: works with ATProto tooling
## Implementation Details

### 1. SHA256 to CID Mapping

ATProto uses CIDs (Content Identifiers) for blobs, while OCI uses SHA256 digests. Fortunately, CIDs support SHA256 as the hash function.

Key insight: we can construct CIDs directly from SHA256 digests, with no additional storage needed.

```go
// pkg/hold/cid.go
func DigestToCID(digest string) (cid.Cid, error) {
	// sha256:abc123... → raw hash bytes
	hash, err := parseDigest(digest)
	if err != nil {
		return cid.Undef, err
	}
	// Wrap the raw bytes in a sha256 multihash; cid.NewCidV1 takes
	// a codec and a multihash, so the digest must be encoded first
	mh, err := multihash.Encode(hash, multihash.SHA2_256)
	if err != nil {
		return cid.Undef, err
	}
	// Construct a CIDv1 with the raw codec
	return cid.NewCidV1(cid.Raw, mh), nil
}

func CIDToDigest(c cid.Cid) string {
	// c.Hash() is the full multihash (header + digest);
	// decode it to recover the raw sha256 bytes
	decoded, _ := multihash.Decode(c.Hash())
	return fmt.Sprintf("sha256:%x", decoded.Digest)
}
```

Mapping:

```
OCI digest:   sha256:abc123...
ATProto CID:  bafkrei... (CIDv1, raw codec, sha256, base32 encoded)
Storage path: s3://bucket/blobs/sha256/ab/abc123...
```

Blobs stay in distribution's layout; we just compute the CID on the fly. No mapping records needed.
### 2. Storage: Distribution Layout with PDS Interface

The hold's blob storage uses distribution's driver directly, with no encoding or transformation:

```go
type HoldBlobStore struct {
	storageDriver storagedriver.StorageDriver // S3, filesystem, etc.
}

// UploadBlob implements the ATProto blob interface.
func (h *HoldBlobStore) UploadBlob(ctx context.Context, data io.Reader) (cid.Cid, error) {
	// 1. Buffer the upload while computing its sha256
	content, digest, err := readAndDigest(data)
	if err != nil {
		return cid.Undef, err
	}
	// 2. Store at distribution's path: blobs/sha256/ab/abc123...
	if err := h.storageDriver.PutContent(ctx, h.blobPath(digest), content); err != nil {
		return cid.Undef, err
	}
	// 3. Return the CID computed from the sha256 digest
	return DigestToCID(digest)
}

func (h *HoldBlobStore) GetBlob(ctx context.Context, c cid.Cid) (io.ReadCloser, error) {
	// 1. Convert CID → sha256 digest
	digest := CIDToDigest(c)
	// 2. Fetch from distribution's path
	return h.storageDriver.Reader(ctx, h.blobPath(digest), 0)
}
```

Storage continues to use distribution's existing S3 layout. The PDS interface is just a wrapper.
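Distribution shards blobs by the first two hex characters of the digest. A sketch of the `blobPath` helper assumed above (the exact path prefix is an assumption based on the layout shown earlier, not distribution's literal registry prefix):

```go
package main

import (
	"fmt"
	"strings"
)

// blobPath maps an OCI digest to a sharded blob layout,
// e.g. sha256:abc123... → blobs/sha256/ab/abc123.../data
func blobPath(digest string) (string, error) {
	algo, hex, ok := strings.Cut(digest, ":")
	if !ok || algo != "sha256" || len(hex) < 2 {
		return "", fmt.Errorf("invalid digest %q", digest)
	}
	return fmt.Sprintf("blobs/%s/%s/%s/data", algo, hex[:2], hex), nil
}

func main() {
	p, _ := blobPath("sha256:abc123")
	fmt.Println(p) // blobs/sha256/ab/abc123/data
}
```

Because the shard directory is derived from the digest itself, the path is a pure function of the digest, which is what lets the hold serve both OCI pulls and CID lookups from one layout.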
### 3. Authentication & IAM

Challenge: ATProto operations are authenticated AS the account owner. For hold operations, we need actions to be performed AS the hold (not as individual users), but authorized BY crew members.

Important context: AppView manages the user's OAuth session. When users authenticate via the credential helper, they actually authenticate through AppView's web interface. AppView obtains and stores the user's OAuth token and DPoP key; the credential helper only receives a registry JWT.

#### Proposed: DPoP Proof Delegation (Standard ATProto Federation)

1. User authenticates via AppView (OAuth flow):
   - AppView obtains: OAuth token, refresh token, DPoP key, DID
   - AppView stores these in its token storage
   - The credential helper receives the registry JWT only
2. When AppView needs blob access, it calls the hold:

   ```
   POST /xrpc/io.atcr.hold.delegateAccess
   Headers: Authorization: DPoP <user-oauth-token>
            DPoP: <proof-signed-with-user-dpop-key>
   Body: {
     "userDid": "did:plc:alice123",
     "purpose": "blob-upload",
     "duration": 900
   }
   ```

3. The hold validates the request (standard ATProto token validation):
   - Verify the DPoP proof signature matches the token's bound key
   - Call the user's PDS: `com.atproto.server.getSession` (validates the token)
   - Extract the user's DID from the validated session
   - Check the user's DID against the hold's crew records
   - If authorized, issue a temporary token for blob operations
4. AppView uses the delegated token for blob operations:

   ```
   POST /xrpc/com.atproto.sync.uploadBlob
   Headers: Authorization: DPoP <hold-token>
            DPoP: <proof>
   ```

This is standard ATProto federation: services pass OAuth tokens with DPoP proofs between each other. The hold independently validates tokens against the user's PDS, so no trust relationship is required.
Crew records stored in the hold's PDS:

```
{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:alice123",
  "role": "admin",
  "permissions": ["blob:read", "blob:write", "crew:manage"],
  "addedAt": "2025-10-14T..."
}
```

Security considerations:

- The user's OAuth token is exposed to the hold during delegation
- However, the hold independently validates it (it can't be forged)
- Tokens are short-lived (15 minutes is typical)
- The hold only accepts tokens for crew members
- The hold validates the DPoP binding (requires the private key)
- Standard ATProto security model
### 4. Presigned URLs for Optimized Egress

While the standard ATProto blob endpoints work, direct S3 access is more efficient. The hold can expose custom XRPC methods:

```go
// io.atcr.hold.getUploadUrl - get a presigned upload URL
type GetUploadUrlRequest struct {
	Digest string // sha256:abc...
	Size   int64
}

type GetUploadUrlResponse struct {
	UploadURL string // presigned S3 URL
	ExpiresAt time.Time
}

// io.atcr.hold.getDownloadUrl - get a presigned download URL
type GetDownloadUrlRequest struct {
	Digest string
}

type GetDownloadUrlResponse struct {
	DownloadURL string // presigned S3 URL
	ExpiresAt   time.Time
}
```

AppView uses the optimized path:

```go
func (a *ATProtoBlobStore) ServeBlob(ctx context.Context, w http.ResponseWriter, r *http.Request, dgst string) error {
	// Try the optimized presigned-URL endpoint first
	resp, err := a.client.GetDownloadUrl(ctx, dgst)
	if err == nil {
		// Redirect the client directly to S3
		http.Redirect(w, r, resp.DownloadURL, http.StatusTemporaryRedirect)
		return nil
	}

	// Fallback: standard ATProto blob endpoint (proxied through AppView)
	c, err := DigestToCID(dgst)
	if err != nil {
		return err
	}
	reader, err := a.client.GetBlob(ctx, a.holdDID, c)
	if err != nil {
		return err
	}
	defer reader.Close()
	_, err = io.Copy(w, reader)
	return err
}
```

Best of both worlds: a standard ATProto interface plus S3 optimization for bandwidth efficiency.
### 5. Image Export for Portability

A custom XRPC method enables users to export entire images:

```go
// io.atcr.hold.exportImage - export all blobs for an image
type ExportImageRequest struct {
	Manifest *oci.Manifest // user provides the manifest
}

type ExportImageResponse struct {
	ArchiveURL string // presigned S3 URL to a tar.gz
	ExpiresAt  time.Time
}

// Implementation outline:
// 1. Extract all blob digests from the manifest (config + layers)
// 2. Create a tar.gz containing all blobs
// 3. Upload it to a temporary S3 location
// 4. Return a presigned download URL (15-minute expiry)
```

Users can request all blobs for their images and migrate to different holds.
## Changes Required

### AppView Changes

Current:

```go
type ProxyBlobStore struct {
	holdURL string // HTTP endpoint
}

func (p *ProxyBlobStore) ServeBlob(...) {
	// POST /put-presigned-url
	// Return redirect
}
```

New:

```go
type ATProtoBlobStore struct {
	holdDID        string          // did:web:hold1.example.com
	holdURL        string          // resolved from the DID document
	client         *atproto.Client // XRPC client
	delegatedToken string          // from io.atcr.hold.delegateAccess
}

func (a *ATProtoBlobStore) ServeBlob(ctx, w, r, dgst) error {
	// Try optimized: io.atcr.hold.getDownloadUrl
	// Fallback: com.atproto.sync.getBlob
}
```
### Hold Service Changes

Transform the hold from a simple HTTP server into a minimal PDS:

```go
// cmd/hold/main.go
func main() {
	// Storage driver (unchanged)
	storageDriver := buildStorageDriver()

	// NEW: embedded PDS
	pds := hold.NewEmbeddedPDS(hold.Config{
		DID:       "did:web:hold1.example.com",
		BlobStore: storageDriver,
		Collections: []string{
			"io.atcr.hold.crew",
			"io.atcr.hold.config",
		},
	})

	// Serve XRPC endpoints
	mux.Handle("/xrpc/", pds.Handler())

	// Legacy endpoints (optional, for backwards compatibility)
	// mux.Handle("/get-presigned-url", legacyHandler)
}
```
## Open Questions

### 1. Docker Hub Size Limits

Research findings: Docker Hub has soft limits of around 10-20GB per layer, with practical issues beyond that; there is no hard-coded enforcement.

For ATCR: hold services can theoretically support larger blobs if the S3 and network infrastructure allow it. We may want configurable limits to prevent abuse.

### 2. Token Delegation Security Model

Recommended approach: DPoP proof delegation (the standard ATProto federation pattern).

Open questions:

- How long should delegated tokens last? (15 minutes, like presigned URLs?)
- Should delegation be per-operation or session-based?
- Do we need audit logs for delegated operations?
- Can AppView cache delegated tokens across requests?
- Should we implement token refresh for long-running operations?
### 3. Migration Path

- Do we support both HTTP and XRPC APIs during the transition?
- How do existing manifests with `holdEndpoint: "https://..."` migrate to `holdDid: "did:web:..."`?
- Can AppView auto-detect whether a hold supports XRPC or only the legacy API?
### 4. PDS Implementation Scope

Minimal endpoints needed:

- `com.atproto.sync.uploadBlob`
- `com.atproto.sync.getBlob`
- `com.atproto.repo.describeRepo` (discovery)
- Custom XRPC methods (delegation, presigned URLs, export)

Not needed:

- `com.atproto.repo.*` (no user repos)
- `com.atproto.server.*` (no user sessions)
- Most sync/admin endpoints

Can we build a reusable "static PDS" library for apps like ATCR, tangled.org, and stream.place?
### 5. Crew Management

- How are crew members added and removed?
- A UI in AppView? A CLI tool? Direct XRPC calls?
- Can crew members delegate to other crew members?
- Role hierarchy (owner > admin > member)?

### 6. Hold Discovery & Registration

Current: a hold registers by creating records in the owner's PDS.
New: a hold is its own identity, so how does AppView discover available holds?

Possibilities:

- Holds publish to feeds
- AppView maintains a directory
- DIDs are manually configured
- An ATProto directory service

### 7. Multi-Tenancy

Could one hold PDS serve multiple "logical holds" for different organizations?

```
did:web:hold-provider.com/org1
did:web:hold-provider.com/org2
```

Or should each hold be a separate deployment?
### 8. Blob Deduplication

Current behavior: global deduplication (the same layer is shared across all images).

With an embedded PDS:

- Does dedup stay global across all crew/users?
- Or is it per-hold (isolated storage)?
- How do we track blob references for garbage collection?

### 9. Cost Model

- Who pays for S3 storage and egress?
- The hold operator? The image owner? Per-pull?
- How do we implement metering/billing via XRPC?

### 10. Disaster Recovery

- How do we back up the hold's PDS (crew records, config)?
- Can holds replicate to other holds?
- Image export handles blobs, but what about metadata?
## Implementation Plan

### Phase 1: Basic PDS with Carstore ✅ COMPLETED

Implementation: uses indigo's carstore with SQLite and a DeltaSession.

```go
import (
	"github.com/bluesky-social/indigo/carstore"
	"github.com/bluesky-social/indigo/models"
	"github.com/bluesky-social/indigo/repo"
)

type HoldPDS struct {
	did      string
	carstore carstore.CarStore
	session  *carstore.DeltaSession // provides the blockstore interface
	repo     *repo.Repo
	dbPath   string
	uid      models.Uid // user ID for the carstore (fixed: 1)
}

func NewHoldPDS(ctx context.Context, did, dbPath string) (*HoldPDS, error) {
	// Create a SQLite-backed carstore
	sqlStore, err := carstore.NewSqliteStore(dbPath)
	if err != nil {
		return nil, err
	}
	if err := sqlStore.Open(dbPath); err != nil {
		return nil, err
	}
	cs := sqlStore.CarStore()

	// For single-hold use, a fixed UID
	uid := models.Uid(1)

	// Create a DeltaSession (provides the blockstore interface)
	session, err := cs.NewDeltaSession(ctx, uid, nil)
	if err != nil {
		return nil, err
	}

	// Create the repo with the session as its blockstore
	r := repo.NewRepo(ctx, did, session)

	return &HoldPDS{
		did:      did,
		carstore: cs,
		session:  session,
		repo:     r,
		dbPath:   dbPath,
		uid:      uid,
	}, nil
}
```
Key learnings:

- ✅ Carstore provides the blockstore via `DeltaSession` (not direct access)
- ✅ `models.Uid` is the user ID type (we use a fixed UID of 1)
- ✅ The DeltaSession needs to be a pointer (`*carstore.DeltaSession`)
- ✅ `repo.NewRepo()` accepts the session directly as a blockstore

Storage:

- Single file: `/var/lib/atcr-hold/hold.db` (SQLite)
- Contains MST nodes, records, and commits in carstore tables
- Proper indigo repo/MST implementation (production-tested)

Why the SQLite carstore:

- ✅ Single-file persistence (like AppView's SQLite)
- ✅ Official indigo storage backend
- ✅ Handles compaction/cleanup automatically
- ✅ Migration path to Postgres/Scylla if needed
- ✅ Easy to replicate (Litestream, LiteFS, rsync)
- ✅ CAR import/export support built in

Scale considerations:

- The SQLite carstore is marked "experimental" but is suitable for single-hold use
- The MST is designed for massive scale (O(log n) operations)
- 1000 crew records = ~1-2MB of database (trivial)
- Bluesky PDSs use carstore for millions of records
- If needed: migrate to the Postgres-backed carstore (same API)
### Hold as a Proper ATProto User

Decision: make holds full ATProto actors for discoverability and ecosystem integration.

What this enables:

- The hold becomes discoverable via the ATProto directory
- It can have a profile (`app.bsky.actor.profile`)
- It can post status updates (`app.bsky.feed.post`)
- Users can follow holds
- Social proof/reputation via the ATProto social graph

MVP scope: we're building the minimal PDS needed for discoverability, not a full social client:

- ✅ Signing keys (ES256K via `atproto/atcrypto`)
- ✅ DID document (did:web at `/.well-known/did.json`)
- ✅ Standard XRPC endpoints (`describeRepo`, `getRecord`, `listRecords`)
- ✅ Profile record (`app.bsky.actor.profile`)
- ⏸️ Posting functionality (later; other services can read our records)

Key insight: other ATProto services will "just work" as long as they can retrieve records from the hold's PDS. We don't need to implement full social features for the hold to participate in the ecosystem.
### Crew Management: Individual Records

Decision: one individual crew record per user (remove the wildcard logic).

```
// io.atcr.hold.crew/{rkey}
{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:alice123",
  "role": "admin", // or "member"
  "permissions": ["blob:read", "blob:write"],
  "addedAt": "2025-10-14T..."
}

// io.atcr.hold.config/policy
{
  "$type": "io.atcr.hold.config",
  "access": "public",   // or "allowlist"
  "allowAny": true,     // public: allow any authenticated user
  "requireAuth": true,  // require authentication (no anonymous access)
  "maxUsers": 1000      // optional limit
}
```
Authorization logic:

```go
func (p *HoldPDS) CheckAccess(ctx context.Context, userDID string) (bool, error) {
	policy := p.GetPolicy(ctx)

	if policy.Access == "public" && policy.AllowAny {
		// Public hold: any authenticated ATCR user is allowed,
		// no individual crew record needed
		return true, nil
	}

	if policy.Access == "allowlist" {
		// Check explicit crew membership
		_, err := p.GetCrewMember(ctx, userDID)
		return err == nil, nil
	}

	return false, nil
}
```

Benefits of individual records:

- Auditability (track who has access)
- Per-user permissions (admin vs. member)
- Explicit revocation capabilities
- Analytics (usage tracking)
- Rate limiting (per-user quotas)
- subscribeRepos events on crew changes

Use cases:

- Public community hold: `access: "public", allowAny: true` (no crew records needed)
- Private team hold: `access: "allowlist"` (explicit crew membership)
- Hybrid: public access plus explicit admin crew records for elevated permissions
## Next Steps

1. **Add indigo dependencies**: carstore, repo, MST
2. **Implement HoldPDS with carstore**: create `pkg/hold/pds`
3. **Add crew management**: CRUD operations for crew records
4. **Implement standard PDS endpoints**: describeServer, describeRepo, getRecord, listRecords
5. **Add the DID document**: did:web identity generation
6. **Custom XRPC methods**: getUploadUrl, getDownloadUrl (presigned URLs)
7. **Wire up in cmd/hold**: serve XRPC alongside the existing HTTP endpoints
8. **Test basic operations**: add/list crew, policy checks
9. **Design delegation/IAM**: token exchange for authenticated operations
10. **Implement the AppView XRPC client**: support PDS-based holds
## References

- Stream.place embedded PDS: https://streamplace.leaflet.pub/3lut7mgni5s2k/l-quote/6_318-6_554#6
- ATProto OAuth spec: https://atproto.com/specs/oauth
- ATProto XRPC spec: https://atproto.com/specs/xrpc
- CID spec: https://github.com/multiformats/cid
- OCI Distribution Spec: https://github.com/opencontainers/distribution-spec