Image Studio: ship function_calling=default — Native leaks Qwen 3.5 XML

Qwen 3.5 abliterated emits its native tool-call format (<function=...><parameter=...>) wrapped in <tool_call> tags that the current Open WebUI / Ollama parser does not reliably round-trip: the XML leaks to chat as plain text instead of executing. Switching the preset to Function Calling: Default, which uses Open WebUI's own prompt-injection wrapper, fires the tool reliably.

Native is documented as the right choice only when the operator has swapped the base model to one with proven OWUI-side parser support (mistral-nemo:12b, qwen2.5vl:7b). For the shipped Qwen 3.5 abliterated default, Default is the working setting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:36:05 -05:00
parent 18a205d69d
commit 1ed2e7293e
2 changed files with 10 additions and 5 deletions
@@ -7,7 +7,7 @@
      "system": "/no_think\n\nYou are an image-tool dispatcher. You do not respond in prose. Every user message MUST result in exactly one tool call.\n\nROUTING:\n- If the user attached an image (including images you previously generated in this chat) → call edit_image(prompt=..., ...)\n- Otherwise → call generate_image(prompt=..., ...)\nBoth tools take `prompt` as the first argument — same name on both. Do NOT invent `edit_instruction`.\n\nFire the tool on the FIRST message, with no preamble. Do not write a 'plan', 'approach', 'steps', 'breakdown', or any explanation before calling. Do not ask clarifying questions. Do not say what you are about to do. If the request is vague, pick reasonable defaults and call the tool — the user iterates after.\n\nSTYLES (pick one):\n  photo         photorealistic photo / portrait / cinematic\n  juggernaut    alternate photoreal — sharper, more saturated\n  pony          anime, cartoon, manga, stylised illustration\n  general       catch-all when nothing else fits\n  furry-nai     anthropomorphic, NAI-trained mix\n  furry-noob    anthropomorphic, NoobAI base\n  furry-il      anthropomorphic, Illustrious base (default for any furry/anthro request)\n\nSTYLE FOR edit_image — pick in this order:\n- The image was generated by you earlier in this chat → omit `style`, the tool auto-inherits from the previous call.\n- The user just UPLOADED an image → look at it and pick the style that matches what you see (anthropomorphic furry/scaly/feathered character → furry-il, pony score-tag art → pony, photo / portrait → photo or juggernaut, anime → pony, ambiguous → general). 
Then keep using that style for subsequent edits in the same chat.\n- Always pick for the DESIRED OUTPUT, but for normal edits the desired output IS the input style — only override when the user explicitly wants a style change ('turn this anime into a photo').\n\nedit_image has TWO MODES — pick based on whether the change is local or global:\n- LOCAL change (\"change the ball to a basketball\", \"add a hat to the dog\", \"remove the bird\", \"recolor the car red\") → set `mask_text` to a brief noun phrase naming the region (\"the ball\", \"the dog\", \"the bird\", \"the car\"). Only that region is repainted; rest stays pixel-perfect.\n- GLOBAL change (\"make this a sunset\", \"turn this into anime\", \"restyle as oil painting\") → leave mask_text unset. The whole image is reimagined.\nALWAYS prefer LOCAL when the user names a specific object, person, or region. GLOBAL is only for whole-image style/lighting transformations.\n\nDenoise:\n- LOCAL (mask_text set): default 1.0. Drop to 0.6–0.8 only for subtle local edits that should retain some original structure.\n- GLOBAL (no mask_text): default 0.7. Use 0.3–0.5 for subtle restyle, 0.85–1.0 for radical reimagining.\n\nPick style for the DESIRED OUTPUT, not the input image.\n\nWrite rich, descriptive prompts (subject, action, environment, lighting, mood, framing). Do NOT add quality tags like 'masterpiece', 'best quality', 'score_9', 'absurdres' — the tool prepends the correct tags per style. Do NOT set sampler, CFG, steps, scheduler — the tool picks them.\n\nAFTER the tool returns, write at most one short sentence noting your style/mode choice and offering one iteration idea. The image is already shown to the user; do not describe it.",
      "temperature": 0.5,
      "top_p": 0.9,
-      "function_calling": "native",
+      "function_calling": "default",
      "custom_params": {
        "tool_choice": "required"
      }
@@ -47,7 +47,7 @@ In the **Advanced Params** section:

 | Field | Value |
 | ----- | ----- |
-| Function Calling | `Native` (mandatory) |
+| Function Calling | `Default` — `Native` leaks Qwen 3.5's tool-call XML to chat as text instead of executing it. `Default` uses Open WebUI's own prompt-injection wrapper, which the parser reliably handles for any base model. Use `Native` only if you've swapped the base model to one with proven Open-WebUI-side parser support (e.g. `mistral-nemo:12b`). |
 | Temperature | `0.5` (lower = more reliable tool-calling) |
 | Top P | `0.9` |
 | Context Length | leave default |
@@ -207,9 +207,14 @@ the background to a sunset") that doesn't matter.
 - **The system prompt is unambiguous.** No room for the model to
  decide "I'll just describe it in text instead."
 - **Only one tool is attached.** No competing tools to choose between.
- **Native function calling is mandatory.** The "Default" mode in
-  Open WebUI uses prompt-injection tool emulation that fails silently
-  on a lot of local models.
+- **Function Calling: Default** is the safer choice for Qwen 3.x
+  abliterated. Native mode expects the parser to recognise the
+  model's structured tool-call format, which currently leaks Qwen
+  3.5's `<function=...><parameter=...>` XML to chat as plain text on
+  the published Open WebUI / Ollama versions. Default mode uses Open
+  WebUI's own prompt-injection wrapper that round-trips reliably.
+  Try Native only after swapping the base model to one known to work
+  end-to-end (mistral-nemo, qwen2.5vl).
 - **Lower temperature.** Tool calling is more reliable with less
  sampling randomness.
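
The leak described in this commit can be spotted mechanically, which may help when checking whether a swapped-in base model is safe to run in Native mode. The following is a minimal Python sketch, not part of the commit: it scans an assistant message for tool-call markup that arrived as plain text. The source only states that the markup is `<function=...><parameter=...>` wrapped in `<tool_call>` tags; the closing `</parameter>` and `</function>` tags, the helper name `find_leaked_tool_calls`, and the `LeakedCall` type are assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Assumed shape of the leaked markup. Only the opening tags and the
# <tool_call> wrapper come from the commit message; the closing
# </parameter> and </function> tags are a guess.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
FUNCTION_RE = re.compile(r"<function=([^>\s]+)>(.*?)</function>", re.DOTALL)
PARAMETER_RE = re.compile(r"<parameter=([^>\s]+)>(.*?)</parameter>", re.DOTALL)


@dataclass
class LeakedCall:
    """One tool call that showed up as chat text instead of executing."""
    name: str
    arguments: dict = field(default_factory=dict)


def find_leaked_tool_calls(message: str) -> list[LeakedCall]:
    """Return every tool call found as literal markup in a chat message."""
    calls = []
    for block in TOOL_CALL_RE.findall(message):
        for name, body in FUNCTION_RE.findall(block):
            # Strip the whitespace that surrounds each parameter value.
            args = {key: value.strip() for key, value in PARAMETER_RE.findall(body)}
            calls.append(LeakedCall(name=name, arguments=args))
    return calls


# Hypothetical leaked message, using the tool and argument names from the preset.
sample = (
    "<tool_call>\n"
    "<function=generate_image>\n"
    "<parameter=prompt>\na red fox in fresh snow\n</parameter>\n"
    "<parameter=style>\nphoto\n</parameter>\n"
    "</function>\n"
    "</tool_call>"
)

leaked = find_leaked_tool_calls(sample)
for call in leaked:
    print(call.name, call.arguments)
```

A non-empty result for a reply that should have produced an image suggests the parser did not consume the call, which is the symptom that motivated switching the preset to `Default`.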