feat: Qwen Image Edit 2511 — full pipeline with LoRA, GGUF, quantization, and UI#131
@copilot When I attempt to run a generation with Qwen Image (Diffusers version), I get this stack trace:
The crash was caused by Fixed in

@copilot Can you figure out where the WARNING in the fourth line of log output is coming from and remove it?
The warning came from

@copilot I'm getting the following error upon generating with the model:
Fixed in

@copilot I'm still getting tokenizer errors:
... Fixed in

@copilot
Fixed in
Processor loading: Replaced
Reference images UI: Added
txt2img without reference images: No warning or block when no reference images are set. The text encoder works with an empty list, so txt2img works fine.

@copilot New error:
The negative conditioning node was always added to the graph, causing the text encoder to be loaded twice even when CFG=1 (where the negative prediction is unused). Now only adds the negative node when cfg_scale > 1.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
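The conditional described in this commit can be sketched roughly as follows. This is an illustrative Python sketch only: `build_conditioning_nodes`, the node ids, and the `"text_encoder"` type string are hypothetical names, not the actual InvokeAI graph builder.

```python
def build_conditioning_nodes(prompt: str, negative_prompt: str, cfg_scale: float) -> list[dict]:
    """Return the conditioning nodes a graph would need (hypothetical schema)."""
    # The positive conditioning node is always required.
    nodes = [{"id": "pos_cond", "type": "text_encoder", "prompt": prompt}]

    # With cfg_scale <= 1 the negative prediction is unused, so adding the
    # node would only trigger a second, wasted text-encoder load.
    if cfg_scale > 1:
        nodes.append({"id": "neg_cond", "type": "text_encoder", "prompt": negative_prompt})

    return nodes


if __name__ == "__main__":
    print([n["id"] for n in build_conditioning_nodes("a cat", "", cfg_scale=1.0)])
    print([n["id"] for n in build_conditioning_nodes("a cat", "blurry", cfg_scale=4.0)])
```

Keeping the check in the graph-building step means the text encoder is never asked for a negative embedding that nothing downstream would consume.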
Kohya LoRAs use underscore-separated keys like lora_unet_transformer_blocks_0_attn_to_k.lokr_w1 instead of the diffusers dot-separated format. Add:
- Kohya key detection (lora_unet_transformer_blocks_*)
- Key conversion mapping from Kohya underscores to model dot-paths
- Updated LoRA config detection to recognize Kohya format + LoKR suffixes
- Flux Kohya exclusion (lora_unet_double_blocks, lora_unet_single_blocks)
- Test model for Kohya LoKR identification
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
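A minimal sketch of the detection and key-conversion idea, for illustration. The `KNOWN_SUBMODULES` table and the function names are assumptions made for this example; the real mapping in the PR covers more sub-modules and may be structured differently. An explicit table is used because underscores are ambiguous: `to_k` must stay intact while `attn_to_k` becomes `attn.to_k`.

```python
import re
from typing import Optional

# Hypothetical subset of sub-module names; the real table is larger.
KNOWN_SUBMODULES = {
    "attn_to_q": "attn.to_q",
    "attn_to_k": "attn.to_k",
    "attn_to_v": "attn.to_v",
    "attn_to_out_0": "attn.to_out.0",
}

KOHYA_LAYER = re.compile(r"^lora_unet_transformer_blocks_(\d+)_(.+)$")


def is_kohya_key(key: str) -> bool:
    """Detect Kohya-style keys for this model family.

    Flux Kohya keys (lora_unet_double_blocks / lora_unet_single_blocks)
    do not share this prefix, so they are excluded automatically.
    """
    return key.startswith("lora_unet_transformer_blocks_")


def convert_kohya_key(key: str) -> Optional[str]:
    """Map a Kohya key to a dot-separated path, or None if it is unknown."""
    layer, _, suffix = key.partition(".")
    match = KOHYA_LAYER.match(layer)
    if match is None or not suffix:
        return None
    block_index, submodule = match.groups()
    dotted = KNOWN_SUBMODULES.get(submodule)
    if dotted is None:
        return None  # unknown sub-modules are skipped rather than guessed
    return f"transformer_blocks.{block_index}.{dotted}.{suffix}"


if __name__ == "__main__":
    print(convert_kohya_key("lora_unet_transformer_blocks_0_attn_to_k.lokr_w1"))
```

Returning `None` for unknown sub-modules lets the caller skip them rather than guess at a dot-path.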
Float16 noise has limited precision that creates quantization patterns visible as vertical ripple artifacts, especially with few-step Lightning LoRA generation where the denoiser doesn't have enough steps to smooth them out. Use float32 (matching Z-Image and the diffusers pipeline).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
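The precision loss behind this change can be shown with the standard library alone. The sketch below is not the PR's noise-generation code and does not reproduce the ripple artifacts; it only round-trips Gaussian samples through IEEE 754 half precision (the `"e"` format of `struct`) to show how many distinct values collapse together. The sample count and seed are arbitrary choices.

```python
import random
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


# Draw standard-normal samples, as a noise tensor would contain.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
half = [to_float16(x) for x in samples]

# Half precision has fewer finite values than there are samples here,
# so many different samples are forced onto the same float16 value.
distinct_full = len(set(samples))
distinct_half = len(set(half))
max_error = max(abs(a - b) for a, b in zip(samples, half))

print(f"distinct values at full precision: {distinct_full}")
print(f"distinct values after float16 rounding: {distinct_half}")
print(f"largest rounding error: {max_error:.6f}")
```

Generating the noise in float32 and casting later, if at all, avoids starting the denoiser from an input that is already quantized this coarsely.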
@copilot Here are the comments:
…70b1-43ca-a314-4367e472cd4f Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Test 1: Model identification (stripped model fixture)
- Kohya LoKR format LoRA detected as type=lora, format=lycoris, base=qwen-image

Test 2: Conversion unit tests (12 tests)
- Format detection: Kohya vs diffusers vs empty state dicts
- Kohya key conversion: attention, MLP, unknown, non-matching keys
- Full model conversion: Kohya and diffusers produce correct prefixed layer keys
- Prefix stripping: "transformer." prefix handled correctly

Test 3: Unknown key behavior tests
- Unknown Kohya sub-modules are silently skipped (1 known + 1 unknown = 1 layer)
- All-unknown state dict produces empty ModelPatchRaw (0 layers)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ba333f3 to e806505
…raph builder
chore(backend): add test for heuristic detection of Qwen Image Edit GGUF model variant
chore(frontend): add regression test for ref images not added to qwen image in generate mode
fix(frontend): graph build handling of Qwen Image when CFG <=1
chore(frontend): add regression test for optimal dimension selection
a9252c3 to c2d7b8b
e2710c4 to b7de7d7
Summary
Complete implementation of the Qwen Image Edit 2511 pipeline for InvokeAI, including text-to-image generation, image editing with reference images, LoRA support (including Lightning distillation), GGUF quantized transformers, and BitsAndBytes encoder quantization.
Key Features
Backend Changes
zero_cond_t modulation, LoRA application via LayerPatcher with sidecar patching for GGUF, shift override for Lightning
ModelLoader), zero_cond_t=True, correct in_channels
>=4.56.0 (the video processor fallback imports already handle this)
Frontend Changes
qwenImageEditComponentSource, qwenImageEditQuantization, qwenImageEditShift in params slice with persistence and model-switch cleanup
Functional Testing Guide
1. Text-to-Image Generation (Basic)
2. GGUF Quantized Transformer
3. BitsAndBytes Encoder Quantization
4. LoRA Support
5. Image Editing with Reference Image
6. Multiple Reference Images
7. Model Switching Cleanup
🤖 Generated with Claude Code