698136c37e
When the user generated an image and then asked to modify it ("make
her hair red", "now at sunset"), the OWUI default-mode tool decider
picked generate_image and produced a fresh, unrelated image. When the
base model emitted the call as text instead, the chat showed a
narrated generate_image(...) and nothing was dispatched.
Root cause was in the tool docstrings, which the decider weights
heavily. Both were skewed the same way, toward user-attached images:
- edit_image led with "an image the user has ATTACHED to the chat"
and "the user uploads an image" — both phrasings exclude assistant-
emitted images, so the decider read follow-up turns as "no source,
edit_image invalid."
- generate_image's exclusion clause ("they have NOT attached an
existing image") still held on follow-up turns: assistant-emitted
images aren't "attached," so generate_image stayed valid there too.
Result: on iteration turns, the decider saw generate_image as the
only valid choice and dispatched it (or the base model emitted a
pseudo-call when the decider declined).
Rewrite both leads symmetrically:
- edit_image now covers "any image already in this chat" with
explicit mention of assistant-emitted sources.
- generate_image now defers to edit_image whenever ANY image is
visible above, even when the user's phrasing sounds like a fresh
request — that last clause is what catches "make her hair red."
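A minimal sketch of what symmetric leads could look like. The function
names come from this commit; the signatures, the return values, the
exact docstring wording, and the lead() helper are assumptions made
for illustration, not the shipped text.

```python
# Hypothetical sketch: only the two tool names are taken from the commit.
# Signatures, bodies, and docstring wording are illustrative assumptions.


def edit_image(prompt: str) -> str:
    """Modify ANY image already in this chat.

    The source may be an image the user attached OR an image the
    assistant generated in an earlier turn. Use this for follow-ups
    such as "make her hair red" or "now at sunset".
    """
    return f"edit:{prompt}"  # placeholder body, no real image work


def generate_image(prompt: str) -> str:
    """Create a brand-new image from a text description.

    Use ONLY when no image is visible above in this chat. If any image
    is visible, user-attached or assistant-emitted, use edit_image
    instead, even when the request is phrased like a fresh one.
    """
    return f"generate:{prompt}"  # placeholder body, no real image work


def lead(fn) -> str:
    """Return the first line of a function's docstring (its lead)."""
    return (fn.__doc__ or "").strip().splitlines()[0]


if __name__ == "__main__":
    # Print both leads side by side to check they stay symmetric.
    for tool in (edit_image, generate_image):
        print(f"{tool.__name__}: {lead(tool)}")
```

Printing both leads side by side makes it easy to check that neither
one narrows the source to user-attached images again.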
The extraction code itself already handled assistant-emitted images
correctly (paths #2 and #4 in _extract_attached_image, including the
chat-DB fallback from f26dfbe). Only the docstrings were lying to
the decider.
smart_image_gen.py 0.7.10 -> 0.7.11
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>