# Incremental Migration: Vendored repomgr → Direct `indigo/repo` ## Context The hold PDS uses a vendored copy of indigo's `repomgr` (~1450 lines in `pkg/hold/pds/repomgr.go`). Upstream, repomgr is [soft-deprecated](https://github.com/bluesky-social/indigo/pull/1102#issuecomment-2985956040) — bnewbold recommends using `indigo/repo` directly (as [cocoon](https://github.com/haileyok/cocoon) does). The vendored copy already has custom patches (PutRecord, UpsertRecord, prevData) and will continue to drift. This migration defines a clean interface, then swaps the implementation behind it. ## Phase 0: Save plan to docs, remove dead code, define interface ✅ **Goal:** Persist this migration plan as a reference doc. Shrink repomgr.go from ~1450 lines to ~750 by removing dead code. Fix import.go encapsulation. Define `RepoOperator` interface so all code accesses repomgr through it. No behavior change. **Completed 2026-02-28.** repomgr.go: 1453 → 871 lines (-40%). import.go: 189 → 88 lines (-53%). New repo_operator.go: 82 lines. ### Step 1: Save plan to docs ✅ - Write this plan to `docs/REPOMGR_MIGRATION.md` ### Step 2: Dead code deleted from `repomgr.go` ✅ - `HandleExternalUserEvent()` + `handleExternalUserEventNoArchive()` + `handleExternalUserEventArchive()` - `ImportNewRepo()` + `processNewRepo()` + `walkTree()` + `processOp()` + `stringOrNil()` - `CheckRepoSig()` - `TakeDownRepo()`, `ResetRepo()`, `VerifyRepo()` - `GetProfile()` - `CarStore()` - `NextTID()` / `nextTID()` (removed entirely — `BatchWrite` uses `rm.clk.Next()` directly) - `noArchive` field removed from struct - 12 unused imports cleaned up ### Step 3: Fixed `import.go` encapsulation ✅ Added `BulkUpsert()` method to `RepoManager`. Rewrote `ImportFromCAR` to call `p.repomgr.BulkUpsert()` instead of reaching into `p.repomgr.lockUser`, `p.repomgr.cs`, `p.repomgr.kmgr`, `p.repomgr.events`. Removed the 88-line `bulkImportRecords()` private method. ### Step 4: Defined `RepoOperator` interface ✅ New file: `pkg/hold/pds/repo_operator.go` — interface with 16 methods, compile-time check, shared types (`RepoEvent`, `RepoOp`, `EventKind`, `BulkRecord`). ### Step 5: Updated callers to use interface ✅ - `pkg/hold/pds/server.go` — `repomgr *RepoManager` → `repomgr RepoOperator`, `RepomgrRef()` returns `RepoOperator` - Downstream callers (`hold/server.go`, `admin/handlers_relays.go`, tests) unchanged — they go through `RepomgrRef()` which returns the interface - `var _ RepoOperator = (*RepoManager)(nil)` compile-time check in `repo_operator.go` ### Verification: ✅ - `make lint` — 0 issues (also fixed pre-existing unchecked error in `events.go`) - `make test` — all tests pass --- ## Phase 1: Test hardening against the interface ✅ **Goal:** Write tests against `RepoOperator` that verify current behavior while `RepoManager` is the only implementation. These become the regression safety net when swapping to the new implementation in Phase 3. **Completed 2026-02-28.** New `repo_operator_test.go`: 38 subtests covering all CRUD, read, event emission, error paths, and edge cases. `runRepoOperatorTests(t, setup)` pattern ready for Phase 2's `DirectRepoOperator`. ### Gaps covered ✅ - `CreateRecord` — round-trip, TID 13-char rkey format, no-panic without event handler - `UpdateRecord` — CID changes, new data returned, non-existent record error, hydrated events - `PutRecord` — explicit rkey, duplicate rkey error, hydrated events - `UpsertRecord` — create path (created=true), update path (created=false, CID changes) - `DeleteRecord` — delete + verify gone, non-existent rkey error - `BatchWrite` — create+delete batch, update write type, auto-rkey (nil Rkey), delete-not-found error, empty write elem error, multi-op event emission with hydration, update hydration - `BulkUpsert` — create + re-upsert with changed data, multi-op event emission - `GetRecord` — CID match, CID mismatch error, not-found error - `GetRecordProof` — head CID + proof blocks, not-found error, no-repo error - `GetRepoRoot` — defined CID after init, changes after write - `GetRepoRev` — non-empty, changes after write - `ReadRepo` — non-empty CAR output, incremental export with `since` - `InitNewActor` — empty DID error, zero user error, event emission with hydration - Event emission — create/update/delete events verified: prevData, ops, oldRoot, newRoot, rev, since, repoSlice, hydration ### Coverage ✅ All RepoOperator methods 81–100%. Remaining uncovered lines are internal infrastructure error guards (`GetUserRepoRev`, `NewDeltaSession`, `OpenRepo`, `Commit`, `CloseWithRoot`) — not reachable without mocking the carstore. ### Files modified ✅ - `pkg/hold/pds/repo_operator_test.go` — new file, 38 subtests ### Verification ✅ - `make lint` — 0 issues - `make test` — all tests pass --- ## Phase 2: Build new implementation ✅ **Goal:** Create `DirectRepoOperator` using `indigo/repo` directly (cocoon pattern). **Completed 2026-02-28.** New `pkg/hold/pds/repo.go`: 548 lines (vs 852 in repomgr.go, ~36% reduction). All 37 subtests pass identically for both implementations. Race detector, shuffled order, and parallel execution all clean. ### New file: `pkg/hold/pds/repo.go` ✅ Key differences from vendored repomgr: - **Single `sync.Mutex`** instead of per-user lock map (`lklk` + `userLocks` + `userLock` struct + reference counting) - **No OpenTelemetry tracing** (`otel.Tracer` calls removed) - **No `gorm` dependency** (`RepoHead` struct removed, `gorm.io/gorm` dropped from go.mod) - **`openWriteSession` / `commitWrite` helpers** extract the repeated 6-step write pattern Core mutation pattern (same as current, just cleaner): 1. Lock → get rev → open delta session → open repo 2. Capture `r.DataCid()` for prevData 3. Perform operation(s) 4. `r.Commit()` → `ds.CloseWithRoot()` → emit event → unlock ### Shared types moved to `repo_operator.go` ✅ - `KeyManager` interface and `ActorInfo` struct moved from `repomgr.go` - Both implementations import from the same location ### Test wiring ✅ - `setupTestDirectRepoOperator` — creates carstore + key manager directly (no `NewHoldPDS`) - `runRepoOperatorTests` refactored to accept optional `freshSetup` for `InitNewActor_EventEmission` - `TestDirectRepoOperator` runs all 37 subtests identically ### Verification ✅ - `go build ./cmd/hold` — compiles - `TestRepoManager` — 37/37 subtests pass - `TestDirectRepoOperator` — 37/37 subtests pass - `-race -shuffle=on -count=5 -parallel=8` — all clean - `make lint` — 0 issues - `make test` — all tests pass --- ## Phase 3: Config flag + switchover **Goal:** Feature flag to select implementation, default old. ### Changes: - `pkg/hold/config.go` — add `UseDirectRepo bool` to DatabaseConfig - `pkg/hold/pds/server.go` — select implementation based on config in `NewHoldPDS`/`NewHoldPDSWithDB` - Regenerate example configs ### Verification: - Deploy with `use_direct_repo: false` (default) - Test with `use_direct_repo: true` in staging - `make lint && make test` --- ## Phase 4: Remove vendored repomgr **Goal:** After production validation, delete the old code. - Delete `repomgr.go` - Remove config flag, make `DirectRepoOperator` the only implementation - Rename `repo_direct.go` → `repo_operator_impl.go` - Regenerate example configs - `make lint && make test` --- ## Decision log - **`indigo/repo` over `atproto/repo`**: `atproto/repo` has the MST primitives (`Insert`, `Remove`, `ApplyOp`) but no high-level PDS API (`OpenRepo`, `CreateRecord`, `Commit(signFn)`). Its own doc.go says "does not yet work for implementing a repository host (PDS)." `indigo/repo` is what the reference PDS and cocoon use. The `RepoOperator` interface means we can swap later if `atproto/repo` adds PDS support.