Greenfield Go multi-tenant IPFS Pinning Service wire-compatible with the
IPFS Pinning Services API spec. Paired 1:1 with Kubo over localhost RPC,
clustered via embedded NATS JetStream, Postgres source-of-truth with
RLS-enforced tenancy, Fiber + huma v2 for the HTTP surface, Authentik
OIDC for session login with kid-rotated HS256 JWT API tokens.
Feature-complete against the 22-milestone build plan, including the
ship-it v1.0 gap items:
* admin CLIs: drain/uncordon, maintenance, mint-token, rotate-key,
prune-denylist, rebalance --dry-run, cache-stats, cluster-presences
* TTL leader election via NATS KV, fence tokens, JetStream dedup
* rebalancer (plan/apply split), reconciler, requeue sweeper
* ristretto caches with NATS-backed cross-node invalidation
(placements live-nodes + token denylist)
* maintenance watchdog for stuck cluster-pause flag
* Prometheus /metrics with CIDR ACL, HTTP/pin/scheduler/cache gauges
* rate limiting: session (10/min) + anonymous global (120/min)
* integration tests: rebalance, refcount multi-org, RLS belt
* goreleaser (tar + deb/rpm/apk + Alpine Docker) targeting Gitea
Stack: Cobra/Viper, Fiber v2 + huma v2, embedded NATS JetStream,
pgx/sqlc/golang-migrate, ristretto, TypeID, prometheus/client_golang,
testcontainers-go.
Deploying anchorage
Two supported shapes — both use the same container image + binaries produced by GoReleaser.
Before either path, configure your OIDC provider: see ../docs/authentik-setup.md for the Authentik walkthrough. Without valid
auth.authentik.issuer/clientID/audience values in anchorage.yaml the web UI login won't work — though anchorage admin mint-token and the API still do.
Option 1 — Linux packages (deb / rpm)
GoReleaser emits .deb and .rpm artifacts with a bundled systemd
unit, lifecycle hooks, and directory structure under
/etc/anchorage, /var/lib/anchorage, /var/log/anchorage.
# Debian / Ubuntu
apt install ./anchorage_${VERSION}_linux_amd64.deb
# RHEL / Fedora / Alma
dnf install ./anchorage_${VERSION}_linux_amd64.rpm
Post-install flow:
cp /etc/anchorage/anchorage.yaml.example /etc/anchorage/anchorage.yaml
# edit anchorage.yaml — postgres DSN, authentik issuer, ipfs.rpc, …
openssl rand -base64 48 > /etc/anchorage/jwt.key
chmod 0400 /etc/anchorage/jwt.key
chown anchorage:anchorage /etc/anchorage/jwt.key
# apply schema (advisory-lock-safe on a cluster)
/usr/bin/anchorage migrate up --config /etc/anchorage/anchorage.yaml
systemctl start anchorage
systemctl status anchorage
journalctl -u anchorage -f
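A minimal anchorage.yaml sketch to start from. It uses only keys this document mentions; the top-level shapes for the Postgres DSN and Kubo RPC address (postgres.dsn, ipfs.rpc) are assumptions — treat anchorage.yaml.example as authoritative and all values here as placeholders:

```yaml
# illustrative only — check anchorage.yaml.example for the real schema
postgres:
  dsn: "postgres://anchorage:CHANGE_ME@127.0.0.1:5432/anchorage?sslmode=disable"
ipfs:
  rpc: "http://127.0.0.1:5001"        # the 1:1-paired Kubo daemon
auth:
  authentik:
    issuer: "https://auth.example.com/application/o/anchorage/"
    clientID: "anchorage"
    audience: "anchorage"
  apiToken:
    signingKeys:
      - id: "2026-04"
        path: /etc/anchorage/jwt.key
        primary: true                 # the minting key
```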
Option 2 — Docker Swarm (three-node stack)
The stack in docker-compose.yml runs three
anchorage instances, each paired 1:1 with its own Kubo daemon, against
a single Postgres. An nginx LB fronts HTTP and upgrades /v1/events
to WebSocket.
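The /v1/events upgrade follows the standard nginx WebSocket shape. The bundled LB config in the stack may differ; the upstream name and port below are placeholders:

```nginx
# illustrative fragment — upstream name and port 8080 are assumptions
upstream anchorage_backend {
    server anchorage-1:8080;
    server anchorage-2:8080;
    server anchorage-3:8080;
}

location /v1/events {
    proxy_pass http://anchorage_backend;
    proxy_http_version 1.1;                  # required for the Upgrade handshake
    proxy_set_header Upgrade $http_upgrade;  # forward the client's upgrade request
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 1h;                   # keep long-lived event streams open
}
```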
Prerequisites
- Three Docker Swarm nodes (or one node if you don't care about HA —
  just drop the placement.constraints lines).
- Each anchorage-hosting node needs anchorage.anchor-id as a label and
  anchorage.anchor=true:
docker swarm init
docker node update --label-add anchorage.db=true node-1
docker node update --label-add anchorage.anchor=true node-1
docker node update --label-add anchorage.anchor=true node-2
docker node update --label-add anchorage.anchor=true node-3
docker node update --label-add anchorage.anchor-id=anchor-1 node-1
docker node update --label-add anchorage.anchor-id=anchor-2 node-2
docker node update --label-add anchorage.anchor-id=anchor-3 node-3
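A mislabeled node silently fails the placement constraints, so it is worth verifying before deploying. The helper below (labels_ok is our name, not an anchorage command) checks the JSON label map that docker node inspect prints:

```shell
# Pure helper: given the JSON label map printed by
#   docker node inspect --format '{{json .Spec.Labels}}' NODE
# confirm both required anchorage labels are present.
labels_ok() {
  printf '%s' "$1" | grep -q '"anchorage.anchor":"true"' &&
    printf '%s' "$1" | grep -q '"anchorage.anchor-id":"anchor-'
}

# Against a live swarm (node name is a placeholder):
# labels_ok "$(docker node inspect --format '{{json .Spec.Labels}}' node-1)" \
#   && echo "node-1 labeled correctly"
```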
Secrets
openssl rand -base64 32 | docker secret create anchorage_postgres_password -
openssl rand -base64 48 | docker secret create anchorage_jwt_key -
Env file
cat > .env <<'EOF'
ANCHORAGE_IMAGE=git.anomalous.dev/alphacentri/anchorage:latest
ANCHORAGE_DOMAIN=anchor.example.com
ANCHORAGE_AUTHENTIK_URL=https://auth.example.com/application/o/anchorage/
POSTGRES_REPLICAS=0
EOF
Deploy
docker stack deploy -c docker-compose.yml anchorage
docker stack services anchorage
docker service logs anchorage_anchorage-1
Verify
curl -fsS https://anchor.example.com/v1/health
curl -fsS https://anchor.example.com/v1/ready
Upgrade
# Bump the image tag in .env, then:
docker stack deploy -c docker-compose.yml anchorage
# Before a disruptive rolling restart, pause the cluster rebalancer
# so brief node absences don't trigger placement thrash:
anchorage admin maintenance on --reason "upgrade to v1.2" --ttl 30m
# …wait for the stack to converge, then:
anchorage admin maintenance off
Drain a single node for hardware work:
anchorage admin drain nod_anchor_2 # also visible in audit log
anchorage admin uncordon nod_anchor_2
Minting a JWT for IPFS clients (ipfs pin remote)
Before any OIDC user exists — or when handing a long-lived token to a
CI pipeline or a headless service — use anchorage admin mint-token.
It reads the signing key directly off disk and emits a signed JWT
to stdout; no live anchorage process is required.
# Sysadmin break-glass token, default 395-day TTL (1 year + 30-day grace)
TOKEN=$(anchorage admin mint-token \
--signing-key /etc/anchorage/jwt.key \
--issuer https://auth.example.com/application/o/anchorage/ \
--audience anchorage)
# Hand it to the IPFS CLI:
ipfs pin remote service add anchor https://anchor.example.com/v1 "$TOKEN"
ipfs pin remote add --service=anchor --name "my-dataset" bafybeig...
--issuer and --audience must match the running anchorage's
auth.authentik.* config — when mint-token is run from the same host
as the server it reads these from anchorage.yaml automatically.
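Since mint-token writes the JWT to stdout, a stray log line would silently corrupt $TOKEN. A quick shape check catches that before the IPFS CLI does — a compact JWT is three non-empty base64url segments joined by dots (jwt_wellformed is our helper name, not an anchorage command):

```shell
# True iff the argument looks like a compact JWT:
# three non-empty base64url segments separated by dots.
jwt_wellformed() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$'
}

jwt_wellformed "${TOKEN:-}" || echo "warning: \$TOKEN does not look like a JWT" >&2

# then confirm the CLI accepted the service registration:
# ipfs pin remote service ls
```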
Shorter-lived tokens (e.g., a developer session):
anchorage admin mint-token --role member --org org_... --ttl 8h
Minted tokens are standalone — they don't appear in
GET /v1/tokens and can't be revoked individually. To revoke one,
either write its jti to the denylist via the /v1/tokens/{jti}
DELETE endpoint (if registered) or rotate the signing key to invalidate
every outstanding token at once.
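Finding a token's jti for the denylist only requires decoding its payload locally — no verification needed. Base64url must have its alphabet mapped back and its padding restored before base64 -d accepts it. A sketch (jwt_jti is our helper; the sed grab assumes jti is a plain string claim):

```shell
# Extract the `jti` claim from a compact JWT without verifying it.
jwt_jti() {
  p=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')    # payload: base64url -> base64
  while [ $(( ${#p} % 4 )) -ne 0 ]; do p="${p}="; done  # restore stripped padding
  printf '%s' "$p" | base64 -d | sed -n 's/.*"jti" *: *"\([^"]*\)".*/\1/p'
}

# Then, if the DELETE endpoint is registered ($ADMIN_TOKEN is a placeholder):
# curl -fsS -X DELETE -H "Authorization: Bearer $ADMIN_TOKEN" \
#   "https://anchor.example.com/v1/tokens/$(jwt_jti "$TOKEN")"
```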
Rotating the JWT signing key
anchorage supports overlap-style rotation — load the new key alongside the old, flip which one mints new tokens, then drop the retired key once outstanding tokens have expired or been re-minted. No mass re-auth event.
Every token carries a kid header naming the key that signed it.
The verifier picks the matching key from the currently-loaded set,
so "verify against either A or B" works unambiguously.
Config shape
auth.apiToken.signingKeys is a list. Exactly one entry has
primary: true — the minting key; any additional entries are
verify-only.
Steady state:
auth:
apiToken:
signingKeys:
- id: "2026-04"
path: /etc/anchorage/jwt.key
primary: true
During a rotation overlap:
auth:
apiToken:
signingKeys:
- id: "2026-04"
path: /etc/anchorage/jwt.key
primary: true # still the minting key
- id: "2026-10"
path: /etc/anchorage/jwt.key.2026-10
# verify-only until we flip `primary` below
Procedure
Step 1 — generate the new key and stage it.
anchorage admin rotate-signing-key --id 2026-10 --out /etc/anchorage/jwt.key.2026-10
# prints a YAML snippet to stdout — append it to auth.apiToken.signingKeys
Distribute the new key file to every anchorage node (Swarm secret, k8s Secret, Ansible, whatever you already use). The file must have identical bytes on every node.
Apply the config change adding the new entry (no primary: true) and
roll-restart the fleet. Every anchorage now verifies against both
keys but continues minting with the old primary.
Step 2 — flip primary. Edit the config so primary: true moves
from the old entry to the new one:
signingKeys:
- id: "2026-04"
path: /etc/anchorage/jwt.key
- id: "2026-10"
path: /etc/anchorage/jwt.key.2026-10
primary: true
Roll-restart. New mints now use kid=2026-10. Tokens already in the
wild with kid=2026-04 continue to verify.
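To confirm the flip took effect, mint a fresh token and decode its header — the kid should now read 2026-10. A small unverified-decode helper (jwt_kid is our name, not an anchorage command):

```shell
# Read the `kid` from a compact JWT's header without verifying it.
jwt_kid() {
  h=$(printf '%s' "$1" | cut -d. -f1 | tr '_-' '/+')    # header: base64url -> base64
  while [ $(( ${#h} % 4 )) -ne 0 ]; do h="${h}="; done  # restore stripped padding
  printf '%s' "$h" | base64 -d | sed -n 's/.*"kid" *: *"\([^"]*\)".*/\1/p'
}

# After the roll-restart, on a freshly minted token:
# jwt_kid "$TOKEN"   # expect: 2026-10
```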
Step 3 — drop the retired key. Wait until outstanding old-key
tokens have expired or been re-minted. auth.apiToken.maxTTL is the
upper bound:
- 24h default TTL + sessions only: wait 25h and you're done.
- 395-day IPFS client tokens: either wait the full window, or mass-revoke via the denylist and ask users to re-mint. Most shops pick the second path for security-driven rotations and the first for scheduled ones.
Remove the old entry:
signingKeys:
- id: "2026-10"
path: /etc/anchorage/jwt.key.2026-10
primary: true
Roll-restart. Any straggler token still signed with the old key is
now rejected with token: unknown kid "2026-04". Delete
/etc/anchorage/jwt.key from every node once the restart is complete.
When to rotate
- Scheduled (annual / per-security-policy) — follow the full three-step procedure. Invisible to users whose tokens renew inside the overlap window.
- Suspected compromise — do steps 1+2 immediately (seconds apart), then mass-denylist every outstanding old-key token or skip directly to step 3 and accept the breakage.
- Algorithm migration (HS256 → ed25519 / RS256) — not yet supported; the
  token package is HS256-only today. When it lands, the same three-step
  rotation pattern will apply.
Observability: Prometheus /metrics
anchorage serves a Prometheus scrape endpoint at /metrics at the
root (not under /v1) so standard service-discovery selectors work.
Gated by a CIDR allowlist on the direct TCP peer IP. Defaults to
loopback + RFC1918, which matches the typical compose / swarm / k8s
intra-cluster scrape path without leaking /metrics through a public
LB. Tighten or disable via server.metrics.allowCIDRs in
anchorage.yaml.
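An illustrative shape for tightening the allowlist to just the scrape subnet — the key path is as given above; the CIDR values are placeholders for your network:

```yaml
server:
  metrics:
    allowCIDRs:
      - 127.0.0.0/8     # local probes
      - 10.42.0.0/16    # Prometheus scrape network only
```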
Series exposed:
anchorage_http_requests_total{method,status_class}
anchorage_pin_ops_total{op,result}
anchorage_scheduler_fetch_total{node,result}
anchorage_scheduler_acks_total{node,status}
anchorage_cache_hits_total{name}
anchorage_cache_misses_total{name}
anchorage_leader_is_elected
anchorage_cluster_nodes_live
anchorage_placements_by_status{status}
Scrape with the standard Prometheus job config (scrape each anchorage
pod / container directly — the LB is bypassed). Alerting rules are
left to the operator; a reasonable starter set watches for
anchorage_leader_is_elected == 0 across every node (nobody is the
leader), rate(anchorage_pin_ops_total{result="err"}[5m]) spikes, and
anchorage_cluster_nodes_live falling below minReplicas.
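The starter set above, written out as hedged Prometheus alerting rules — the thresholds, durations, and the minReplicas value (here 2) are placeholders to tune per deployment:

```yaml
groups:
  - name: anchorage-starter
    rules:
      - alert: AnchorageNoLeader
        expr: max(anchorage_leader_is_elected) == 0   # nobody holds the lease
        for: 2m
        annotations:
          summary: "no anchorage node is the elected leader"
      - alert: AnchoragePinErrors
        expr: sum(rate(anchorage_pin_ops_total{result="err"}[5m])) > 0.1
        for: 10m
        annotations:
          summary: "sustained pin operation error rate"
      - alert: AnchorageNodesBelowMinReplicas
        expr: anchorage_cluster_nodes_live < 2        # placeholder for minReplicas
        for: 5m
        annotations:
          summary: "fewer live nodes than minReplicas"
```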
Rate limiting
Two layers:
- POST /v1/auth/session — capped per IP per minute
  (server.rateLimit.sessionPerMinute, default 10). Brute-force guard.
- All anonymous requests — capped per IP per minute
  (server.rateLimit.anonymousPerMinute, default 120). Authenticated
  traffic (valid Bearer or session cookie) is exempt. Probe paths
  (/v1/health, /v1/ready, /metrics) are exempt.
Storage is per-process in-memory. Sticky sessions at the LB make this
effectively global; without sticky sessions an attacker can burst
across N anchorage nodes for N× the throughput. If that matters in
your deployment, deploy behind a proxy that enforces its own global
limits (e.g., nginx limit_req_zone, envoy local_ratelimit).
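A minimal nginx global limit in front of the fleet, mirroring the two layers above — zone sizes, burst values, and the upstream name are illustrative:

```nginx
# http{} context: shared zones keyed by client IP, matching the defaults above
limit_req_zone $binary_remote_addr zone=anchorage_login:10m rate=10r/m;
limit_req_zone $binary_remote_addr zone=anchorage_anon:10m  rate=120r/m;

# server{} context
location = /v1/auth/session {
    limit_req zone=anchorage_login burst=5 nodelay;
    proxy_pass http://anchorage_backend;   # upstream name is a placeholder
}
location / {
    limit_req zone=anchorage_anon burst=40 nodelay;
    proxy_pass http://anchorage_backend;
}
```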
Backing up Postgres
docker exec -it $(docker ps -q -f name=anchorage_postgres) \
pg_dump -U anchorage -Fc anchorage > anchorage_$(date +%F).pgdump
Backing up NATS state
NATS state under /var/lib/anchorage/nats is non-authoritative — it
holds in-flight jobs and the leader / cluster-maintenance KV. Losing
it trips the requeue sweeper once, after which the cluster recovers on
its own; Postgres remains the source of truth.
Still, if you want it captured:
docker run --rm -v anchorage_anchorage_1_data:/data \
busybox tar czf - /data/nats > nats_1_$(date +%F).tar.gz