User confirmed huihui_ai/qwen3-vl-abliterated:8b works end-to-end
after the multi-base-model search. Settled on it because Qwen 3 VL's
fine-tune lineage isn't damaged by abliteration the way Qwen 3.5's is,
so it both calls tools reliably and won't refuse to dispatch on NSFW
edit prompts.
Updated:
- image_studio.json base_model_id → huihui_ai/qwen3-vl-abliterated:8b
- init-models.sh: pulls the abliterated VL model in place of the
non-working standard qwen3.5:9b
- image_studio.md: setup table base-model row + vision-section
'why this and not the alternatives' explanation
function_calling stays 'default' and tool_choice stays 'required'.
Operators can flip to native and drop tool_choice once they've
verified the new base behaves with structured tool calls (which would
also remove the need for a separate Task Model for title generation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The abliterated 9B was the source of the tool-call format mangling
(both Native XML leaks and Default Python-syntax leaks). Standard
qwen3.5:9b is the same family and the same 9B size (6.6 GB), it is
vision-capable, and its native tool calling actually works.
The uncensored image content was always going to come from the SDXL
checkpoints in ComfyUI; the LLM is just a dispatcher. Picking a
well-behaved tool-caller for that role doesn't compromise output
content.
Updated:
- image_studio.json base_model_id → qwen3.5:9b
- init-models.sh: pulls qwen3.5:9b as a standard registry pull,
in addition to the existing abliterated 9B (which stays for
other chat models)
- image_studio.md setup table + vision section explaining why
we chose standard over abliterated for the dispatcher role
function_calling stays as 'default' and tool_choice as 'required'
for now — they don't hurt with a reliable tool-caller and operators
can flip back to native + drop tool_choice once they verify it
works for them (which also removes the need for a separate Task
Model for title generation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three pieces:
1. mirror-ollama-model.sh — run on any machine that has the model
pulled. Parses the manifest at
~/.ollama/models/manifests/registry.ollama.ai/<ns>/<name>/<tag>,
greps every sha256:* digest, tars manifest + referenced blobs into
one .tgz. Output is portable — extract over any other Ollama
data dir and the model is immediately visible.
2. init-models.sh gains an s3_pull function that curls a tarball from
$S3_OLLAMA_BASE and extracts it into /root/.ollama/models/. Falls back
to ollama pull when S3_OLLAMA_BASE is unset, so s3_pull lines are
safe to commit before the bucket is ready.
huihui_ai/qwen3.5-abliterated:9b promoted to s3_pull as the example.
3. docker-compose.yml model-init service propagates S3_OLLAMA_BASE
from .env. The script installs curl at startup if it is missing,
because the ollama/ollama image doesn't always ship it.
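
A minimal sketch of the s3_pull fallback described in item 2, not the
actual init-models.sh code. The tarball file name argument, the
MODELS_DIR override, and the ensure_curl helper (which assumes an
apt-based image) are assumptions made for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical override so the sketch can be tried outside the container;
# the commit itself extracts into /root/.ollama/models/.
MODELS_DIR="${MODELS_DIR:-/root/.ollama/models}"

# Assumption: the image is apt-based. Intended to be called once at
# script start, before the first s3_pull line.
ensure_curl() {
  command -v curl >/dev/null 2>&1 && return 0
  apt-get update && apt-get install -y curl
}

# s3_pull MODEL TARBALL
# MODEL is the name passed to `ollama pull`; TARBALL is the (assumed)
# object name under $S3_OLLAMA_BASE.
s3_pull() {
  model="$1"
  tarball="$2"
  if [ -z "${S3_OLLAMA_BASE:-}" ]; then
    # No bucket configured: behave like a plain registry pull.
    ollama pull "$model"
    return
  fi
  # Stream the tarball straight into the Ollama data dir.
  curl -fsSL "$S3_OLLAMA_BASE/$tarball" | tar -xzf - -C "$MODELS_DIR"
}
```

Because the unset case degrades to ollama pull, a line such as
`s3_pull huihui_ai/qwen3.5-abliterated:9b <tarball>` behaves the same
as a registry pull until S3_OLLAMA_BASE is defined.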
README documents the mirror workflow under "Mirroring models to S3".
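
The mirror step in item 1 can be sketched roughly as follows. This is
an illustration, not the actual mirror-ollama-model.sh: the function
name and argument order are hypothetical, and it assumes each digest
sha256:<hex> is stored as blobs/sha256-<hex> next to the manifests
directory.

```shell
#!/bin/sh
set -eu

# mirror_model MODELS_DIR NS NAME TAG OUT_TGZ   (hypothetical helper)
mirror_model() {
  models_dir="$1"
  out="$5"
  rel="manifests/registry.ollama.ai/$2/$3/$4"
  [ -f "$models_dir/$rel" ] || {
    echo "no manifest: $models_dir/$rel" >&2
    return 1
  }

  # Every sha256:<hex> digest mentioned in the manifest, de-duplicated.
  digests=$(grep -oE 'sha256:[0-9a-f]{64}' "$models_dir/$rel" | sort -u)

  # Build the file list: the manifest plus one blob per digest.
  # Assumption: digest sha256:<hex> lives at blobs/sha256-<hex>.
  set -- "$rel"
  for d in $digests; do
    set -- "$@" "blobs/sha256-${d#sha256:}"
  done

  # Paths are relative to MODELS_DIR so the archive extracts in place
  # over another Ollama data dir.
  tar -czf "$out" -C "$models_dir" "$@"
}
```

A call might look like
`mirror_model "$HOME/.ollama/models" huihui_ai qwen3.5-abliterated 9b /tmp/model.tgz`,
where the output path is an arbitrary example.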
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five models from the production GPU host's current pull set. Picks up
the idempotency-checking loop pattern from the source script so re-runs
print "already present" instead of re-pulling.
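
The re-run behaviour could look roughly like this. A sketch only:
pull_if_missing and pull_all are hypothetical names, the exact message
format is illustrative, and it assumes the first column of `ollama list`
holds the name:tag of each installed model.

```shell
#!/bin/sh
set -eu

# Pull MODEL only if `ollama list` does not already show it.
pull_if_missing() {
  model="$1"
  # Skip the header row and keep the first column (name:tag).
  installed=$(ollama list | awk 'NR > 1 { print $1 }')
  if printf '%s\n' "$installed" | grep -qxF "$model"; then
    echo "already present: $model"
  else
    ollama pull "$model"
  fi
}

# Apply the check to every model name given as an argument.
pull_all() {
  for m in "$@"; do
    pull_if_missing "$m"
  done
}
```

The exact-line match (grep -x -F) keeps a model such as qwen3.5:9b from
being treated as present just because a longer name contains it.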
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sanitized snapshot of the live srvno.de stack: Caddy + Ollama (with
preseed) + ComfyUI + Open WebUI + Anubis stub. Real hostnames,
secrets, and bcrypt hash replaced with placeholders so the dir is safe
to commit.
Caddyfile updated to point at comfyui:8188 (the source file pointed at
the now-removed forge service). Dropped FIGMENT_/FORGE_/SEGMENT_IMAGE_TAG
from the env example. Harmonised the init-models.sh mount path between
ollama and model-init services.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>