7.7 KiB
Incremental Migration: Vendored repomgr → Direct indigo/repo
Context
The hold PDS uses a vendored copy of indigo's repomgr (~1450 lines in pkg/hold/pds/repomgr.go). Upstream, repomgr is soft-deprecated — bnewbold recommends using indigo/repo directly (as cocoon does). The vendored copy already has custom patches (PutRecord, UpsertRecord, prevData) and will continue to drift. This migration defines a clean interface, then swaps the implementation behind it.
Phase 0: Save plan to docs, remove dead code, define interface ✅
Goal: Persist this migration plan as a reference doc. Shrink repomgr.go from ~1450 lines to ~750 by removing dead code. Fix import.go encapsulation. Define RepoOperator interface so all code accesses repomgr through it. No behavior change.
Completed 2026-02-28. repomgr.go: 1453 → 871 lines (-40%). import.go: 189 → 88 lines (-53%). New repo_operator.go: 82 lines.
Step 1: Save plan to docs ✅
- Write this plan to
docs/REPOMGR_MIGRATION.md
Step 2: Dead code deleted from repomgr.go ✅
HandleExternalUserEvent()+handleExternalUserEventNoArchive()+handleExternalUserEventArchive()ImportNewRepo()+processNewRepo()+walkTree()+processOp()+stringOrNil()CheckRepoSig()TakeDownRepo(),ResetRepo(),VerifyRepo()GetProfile()CarStore()NextTID()/nextTID()(removed entirely —BatchWriteusesrm.clk.Next()directly)noArchivefield removed from struct- 12 unused imports cleaned up
Step 3: Fixed import.go encapsulation ✅
Added BulkUpsert() method to RepoManager. Rewrote ImportFromCAR to call p.repomgr.BulkUpsert() instead of reaching into p.repomgr.lockUser, p.repomgr.cs, p.repomgr.kmgr, p.repomgr.events. Removed the 88-line bulkImportRecords() private method.
Step 4: Defined RepoOperator interface ✅
New file: pkg/hold/pds/repo_operator.go — interface with 16 methods, compile-time check, shared types (RepoEvent, RepoOp, EventKind, BulkRecord).
Step 5: Updated callers to use interface ✅
pkg/hold/pds/server.go—repomgr *RepoManager→repomgr RepoOperator,RepomgrRef()returnsRepoOperator- Downstream callers (
hold/server.go,admin/handlers_relays.go, tests) unchanged — they go throughRepomgrRef()which returns the interface var _ RepoOperator = (*RepoManager)(nil)compile-time check inrepo_operator.go
Verification: ✅
make lint— 0 issues (also fixed pre-existing unchecked error inevents.go)make test— all tests pass
Phase 1: Test hardening against the interface ✅
Goal: Write tests against RepoOperator that verify current behavior while RepoManager is the only implementation. These become the regression safety net when swapping to the new implementation in Phase 3.
Completed 2026-02-28. New repo_operator_test.go: 38 subtests covering all CRUD, read, event emission, error paths, and edge cases. runRepoOperatorTests(t, setup) pattern ready for Phase 2's DirectRepoOperator.
Gaps covered ✅
CreateRecord— round-trip, TID 13-char rkey format, no-panic without event handlerUpdateRecord— CID changes, new data returned, non-existent record error, hydrated eventsPutRecord— explicit rkey, duplicate rkey error, hydrated eventsUpsertRecord— create path (created=true), update path (created=false, CID changes)DeleteRecord— delete + verify gone, non-existent rkey errorBatchWrite— create+delete batch, update write type, auto-rkey (nil Rkey), delete-not-found error, empty write elem error, multi-op event emission with hydration, update hydrationBulkUpsert— create + re-upsert with changed data, multi-op event emissionGetRecord— CID match, CID mismatch error, not-found errorGetRecordProof— head CID + proof blocks, not-found error, no-repo errorGetRepoRoot— defined CID after init, changes after writeGetRepoRev— non-empty, changes after writeReadRepo— non-empty CAR output, incremental export withsinceInitNewActor— empty DID error, zero user error, event emission with hydration- Event emission — create/update/delete events verified: prevData, ops, oldRoot, newRoot, rev, since, repoSlice, hydration
Coverage ✅
All RepoOperator methods 81–100%. Remaining uncovered lines are internal infrastructure error guards (GetUserRepoRev, NewDeltaSession, OpenRepo, Commit, CloseWithRoot) — not reachable without mocking the carstore.
Files modified ✅
pkg/hold/pds/repo_operator_test.go— new file, 38 subtests
Verification ✅
make lint— 0 issuesmake test— all tests pass
Phase 2: Build new implementation ✅
Goal: Create DirectRepoOperator using indigo/repo directly (cocoon pattern).
Completed 2026-02-28. New pkg/hold/pds/repo.go: 548 lines (vs 852 in repomgr.go, ~36% reduction). All 37 subtests pass identically for both implementations. Race detector, shuffled order, and parallel execution all clean.
New file: pkg/hold/pds/repo.go ✅
Key differences from vendored repomgr:
- Single
sync.Mutexinstead of per-user lock map (lklk+userLocks+userLockstruct + reference counting) - No OpenTelemetry tracing (
otel.Tracercalls removed) - No
gormdependency (RepoHeadstruct removed,gorm.io/gormdropped from go.mod) openWriteSession/commitWritehelpers extract the repeated 6-step write pattern
Core mutation pattern (same as current, just cleaner):
- Lock → get rev → open delta session → open repo
- Capture
r.DataCid()for prevData - Perform operation(s)
r.Commit()→ds.CloseWithRoot()→ emit event → unlock
Shared types moved to repo_operator.go ✅
KeyManagerinterface andActorInfostruct moved fromrepomgr.go- Both implementations import from the same location
Test wiring ✅
setupTestDirectRepoOperator— creates carstore + key manager directly (noNewHoldPDS)runRepoOperatorTestsrefactored to accept optionalfreshSetupforInitNewActor_EventEmissionTestDirectRepoOperatorruns all 37 subtests identically
Verification ✅
go build ./cmd/hold— compilesTestRepoManager— 37/37 subtests passTestDirectRepoOperator— 37/37 subtests pass-race -shuffle=on -count=5 -parallel=8— all cleanmake lint— 0 issuesmake test— all tests pass
Phase 3: Config flag + switchover
Goal: Feature flag to select implementation, default old.
Changes:
pkg/hold/config.go— addUseDirectRepo boolto DatabaseConfigpkg/hold/pds/server.go— select implementation based on config inNewHoldPDS/NewHoldPDSWithDB- Regenerate example configs
Verification:
- Deploy with
use_direct_repo: false(default) - Test with
use_direct_repo: truein staging make lint && make test
Phase 4: Remove vendored repomgr
Goal: After production validation, delete the old code.
- Delete
repomgr.go - Remove config flag, make
DirectRepoOperatorthe only implementation - Rename
repo_direct.go→repo_operator_impl.go - Regenerate example configs
make lint && make test
Decision log
indigo/repooveratproto/repo:atproto/repohas the MST primitives (Insert,Remove,ApplyOp) but no high-level PDS API (OpenRepo,CreateRecord,Commit(signFn)). Its own doc.go says "does not yet work for implementing a repository host (PDS)."indigo/repois what the reference PDS and cocoon use. TheRepoOperatorinterface means we can swap later ifatproto/repoadds PDS support.