6 Commits

Author SHA1 Message Date
02a4bece5d Ollama: keep loaded models resident until evicted (KEEP_ALIVE=-1)
Was 30m, which evicts after 30 minutes of inactivity and forces a
reload penalty on the next request. Setting -1 holds models in VRAM
indefinitely; MAX_LOADED_MODELS=3 caps how many can stay resident
simultaneously (vs the previous 2). Raise MAX_LOADED_MODELS if you
rotate between more than three models and your GPU has the VRAM for
it; a comment in the compose file explains the trade-off.

For the live srvno.de stack: OLLAMA_KEEP_ALIVE=-1 takes effect on
the next `docker compose up -d ollama`. Models loaded before the
restart are not carried over; each is loaded again on its next request.
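
A minimal sketch of the two settings as they might appear in .env. It assumes the compose file forwards them to the ollama service under Ollama's own variable names (OLLAMA_KEEP_ALIVE, OLLAMA_MAX_LOADED_MODELS); the actual compose wiring is not shown here.

```shell
# Assumed .env entries; the compose file is presumed to pass them to the ollama service.
OLLAMA_KEEP_ALIVE=-1          # negative value: keep a loaded model in memory indefinitely
OLLAMA_MAX_LOADED_MODELS=3    # upper bound on models resident at once (was 2)

# Apply and inspect on the host (commented out: needs Docker and the running stack):
#   docker compose up -d ollama
#   docker compose exec ollama ollama ps   # lists the currently loaded models
```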

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 19:25:42 -05:00
def27087c1 Make every image tag in compose pinnable via .env
Floating tags (`latest`, `main`) made deploys non-deterministic: a
container recreate could pull a newer Open WebUI, Ollama, or Anubis at
any time. Wrapped every `image:` source in a ${VAR:-default}
substitution, surfaced the full set in .env.example with a header
explaining where to find current versions, and bumped the
COMFYUI_IMAGE_TAG default to 0.2.1 (the just-tagged version with the
transformers pin).

Vars added: CADDY_TAG, OLLAMA_TAG, OPEN_WEBUI_TAG, ALPINE_TAG,
ANUBIS_TAG (COMFYUI_IMAGE_TAG already existed). Where I wasn't
confident which specific version to pin (Ollama, Open WebUI, Anubis),
the defaults keep the previous floating-tag behaviour; operators should
update those to verified versions for production deploys.
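
The fallback rule behind ${VAR:-default} is the same in POSIX shell as in Compose interpolation, so it can be tried outside Compose. The sketch below uses a hypothetical helper, image_ref; the `latest` default and the 1.2.3 tag are placeholders, not values taken from the compose file.

```shell
# Hypothetical helper showing the ${VAR:-default} fallback; Compose interpolation
# applies the same rule to image: lines such as ollama/ollama:${OLLAMA_TAG:-latest}.
image_ref() {
  # $1 = repository, $2 = tag taken from the environment (may be empty)
  printf '%s:%s\n' "$1" "${2:-latest}"
}

unset OLLAMA_TAG
image_ref ollama/ollama "${OLLAMA_TAG:-}"   # no pin set: the floating default is used

OLLAMA_TAG=1.2.3                            # placeholder, not a verified release
image_ref ollama/ollama "$OLLAMA_TAG"       # pinned: the .env value wins
```

An empty value behaves like an unset one under `:-`, so a blank OLLAMA_TAG= line in .env still resolves to the default.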

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:27:08 -05:00
5a34ced8f1 Add S3 mirror path for Ollama models + mirror-ollama-model.sh helper
Three pieces:

1. mirror-ollama-model.sh — run on any machine that has the model
   pulled. Parses the manifest at
   ~/.ollama/models/manifests/registry.ollama.ai/<ns>/<name>/<tag>,
   greps every sha256:* digest, tars manifest + referenced blobs into
   one .tgz. Output is portable — extract over any other Ollama
   data dir and the model is immediately visible.

2. init-models.sh gains an s3_pull function that curls a tarball from
   $S3_OLLAMA_BASE and extracts it into /root/.ollama/models/. It falls
   back to ollama pull when S3_OLLAMA_BASE is unset, so s3_pull lines
   are safe to commit before the bucket is ready.
   huihui_ai/qwen3.5-abliterated:9b is promoted to s3_pull as the example.

3. The model-init service in docker-compose.yml passes S3_OLLAMA_BASE
   through from .env. The script installs curl at startup because the
   ollama/ollama image doesn't always ship it.
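
The fallback behaviour described in item 2 could look roughly like this. It is a sketch, not the contents of init-models.sh: the tarball naming scheme and the OLLAMA_MODELS_DIR override are assumptions made for illustration.

```shell
# Sketch only: the tarball naming and OLLAMA_MODELS_DIR are assumptions, not the script's.
s3_pull() {
  model=$1
  if [ -z "${S3_OLLAMA_BASE:-}" ]; then
    # No mirror configured: fall back to the registry.
    ollama pull "$model"
    return
  fi
  # Assumed naming: "/" and ":" in the model name become "_".
  tarball=$(printf '%s' "$model" | tr '/:' '__').tgz
  # Stream the archive straight into the Ollama models directory.
  curl -fsSL "$S3_OLLAMA_BASE/$tarball" |
    tar -xzf - -C "${OLLAMA_MODELS_DIR:-/root/.ollama/models}"
}
```

With S3_OLLAMA_BASE unset, `s3_pull huihui_ai/qwen3.5-abliterated:9b` simply runs `ollama pull` on that name, which is what makes such lines safe to commit early.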

README documents the mirror workflow under "Mirroring models to S3".
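
The digest-collection step of mirror-ollama-model.sh can be sketched as below. manifest_blobs is a hypothetical name, and the blobs/sha256-<hex> file naming is an assumption about Ollama's on-disk layout rather than something stated in this commit.

```shell
# Sketch: list the blob files a manifest refers to, relative to the models dir.
# Assumes blobs are stored as blobs/sha256-<hex>.
manifest_blobs() {
  # $1 = path to a manifest file (JSON containing "sha256:<hex>" digests)
  grep -o 'sha256:[0-9a-f]\{64\}' "$1" | sort -u | sed 's/^sha256:/blobs\/sha256-/'
}
```

The resulting paths, together with the manifest path itself, are what would be handed to tar to build the single .tgz.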

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 13:43:26 -05:00
0ad99b6199 Add comfyui-model-init sidecar for ComfyUI model preseeding
Mirrors the Ollama model-init pattern: a one-shot Alpine container that
mounts the comfyui-models volume and runs comfyui-init-models.sh, which
curls direct download URLs (HuggingFace by default) into the right
subdirectories. Idempotent — already-present files are skipped.

HF_TOKEN is plumbed through for gated repos (Flux-dev, SD3, etc.) and is
opt-in via .env. The default list ships SD 1.5 only, matching the
placeholder filename in workflows/*.json. Examples for SDXL, Flux, and
upscalers are commented in the script.
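
A rough sketch of the idempotent download step, assuming a helper named fetch_if_missing (hypothetical; the real comfyui-init-models.sh may be structured differently). The bearer-token header is sent only when HF_TOKEN is set.

```shell
# Sketch: download a model file unless a non-empty copy is already present.
fetch_if_missing() {
  url=$1 dest=$2
  if [ -s "$dest" ]; then
    echo "skip $dest"
    return 0
  fi
  mkdir -p "$(dirname "$dest")"
  # Download to a .part file so an interrupted transfer is never mistaken for a finished one.
  if [ -n "${HF_TOKEN:-}" ]; then
    curl -fL -H "Authorization: Bearer $HF_TOKEN" -o "$dest.part" "$url"
  else
    curl -fL -o "$dest.part" "$url"
  fi && mv "$dest.part" "$dest"
}
```

Writing to a temporary name and renaming at the end keeps the "already present" check trustworthy across restarts of the one-shot container.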

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:57:24 -05:00
21e976e275 Externalise WEBUI_URL / LLM_URL to .env
So changing the deployment's hostnames is a one-file edit (.env) instead
of touching docker-compose.yml. WEBUI_URL is the full URL with scheme
(Open WebUI uses it for auth redirects); LLM_URL is the bare hostname
(Anubis wants it for COOKIE_DOMAIN).
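
Because the two values take different shapes, a small pre-flight check can catch a mix-up before the stack starts. check_urls is a hypothetical helper and the hostnames in the usage note are placeholders.

```shell
# Hypothetical sanity check for the two .env values described above.
check_urls() {
  # $1 = WEBUI_URL (must include a scheme), $2 = LLM_URL (bare hostname only)
  case $1 in
    http://*|https://*) ;;
    *) echo "WEBUI_URL needs a scheme: $1"; return 1 ;;
  esac
  case $2 in
    *://*|*/*) echo "LLM_URL must be a bare hostname: $2"; return 1 ;;
  esac
  echo ok
}
```

For example, `check_urls https://chat.example.com llm.example.com` passes, while giving LLM_URL a scheme or a path is rejected.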

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 10:49:29 -05:00
5b61caa197 Add deployments/ai-stack — combined production-shape example
Sanitized snapshot of the live srvno.de stack: Caddy + Ollama (with
preseed) + ComfyUI + Open WebUI + Anubis stub. Real hostnames,
secrets, and bcrypt hash replaced with placeholders so the dir is safe
to commit.

Caddyfile updated to point at comfyui:8188 (the source file pointed at
the now-removed forge service). Dropped FIGMENT_/FORGE_/SEGMENT_IMAGE_TAG
from the env example. Harmonised the init-models.sh mount path between
ollama and model-init services.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 10:40:41 -05:00