
feat: Add Modular Pipeline for Stable Diffusion 3 (SD3)#13324

Open
AlanPonnachan wants to merge 3 commits into huggingface:main from AlanPonnachan:feat/sd3-modular-pipeline

Conversation

@AlanPonnachan (Contributor)

What does this PR do?

This PR introduces the modular architecture for Stable Diffusion 3 (SD3), implementing both Text-to-Image (T2I) and Image-to-Image (I2I) pipelines.

Key additions:

  • Added SD3ModularPipeline and SD3AutoBlocks to the dynamic modular pipeline resolver.
  • Migrated SD3-specific mechanics to the new BlockState.
  • Added corresponding dummy objects and lazy-loading fallbacks.
  • Added TestSD3ModularPipelineFast and TestSD3Img2ImgModularPipelineFast test suites.

Related issue: #13295
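For readers unfamiliar with the modular design, the BlockState mentioned above is essentially a mutable attribute container that each pipeline block reads inputs from and writes intermediates to, so blocks compose without threading dozens of arguments. A toy illustration of the idea (not diffusers' actual implementation; all names here are illustrative):

```python
# Toy illustration of the BlockState idea (not diffusers' actual class):
# a mutable attribute bag shared by the pipeline blocks.
class BlockState:
    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __repr__(self):
        fields = ", ".join(f"{k}={v!r}" for k, v in vars(self).items())
        return f"BlockState({fields})"


# A block mutates the shared state instead of returning loose values.
def set_timesteps_block(state, num_inference_steps):
    state.timesteps = list(range(num_inference_steps, 0, -1))
    return state


state = BlockState(prompt="a photo of a cat", guidance_scale=7.0)
state = set_timesteps_block(state, num_inference_steps=4)
print(state.timesteps)  # [4, 3, 2, 1]
```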

Before submitting

Who can review?

@sayakpaul @asomoza

class TestSD3Img2ImgModularPipelineFast(ModularPipelineTesterMixin):
    pipeline_class = SD3ModularPipeline
    pipeline_blocks_class = SD3AutoBlocks
    pretrained_model_name_or_path = "hf-internal-testing/tiny-sd3-pipe"
AlanPonnachan (Contributor, Author):

The tests currently point to hf-internal-testing/tiny-sd3-pipe, so they are failing at the moment. I think this happens because tiny-sd3-pipe lacks a modular_model_index.json.

Could someone on the team push an hf-internal-testing/tiny-sd3-modular testing repository with a modular_model_index.json?

Once that infrastructure is available on the Hub, I will update the pretrained_model_name_or_path and run my tests.

Member:

I think for now, we could first load "hf-internal-testing/tiny-sd3-pipe" in a ModularPipeline and then save and push it to a repository on the Hub. We can use it throughout the PR. Once the PR is close to merge, we can transfer the repo to the "hf-internal-testing" org. Does that work?

@sayakpaul sayakpaul requested review from asomoza and yiyixuxu March 25, 2026 02:22
@sayakpaul (Member)

sayakpaul commented Mar 25, 2026

@AlanPonnachan thanks for this PR! Could you also provide some test code and sample outputs?


@sayakpaul sayakpaul left a comment


Thanks for getting started on this! I left some comments (majorly on the use of guidance).

logger = logging.get_logger(__name__)


def retrieve_timesteps(

If we're using the one from here

let's directly import or add a "# Copied from ..." comment.
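For context, the helper under discussion follows the standard diffusers pattern: delegate the schedule construction to scheduler.set_timesteps and return the resulting timesteps. A simplified, self-contained sketch with a dummy scheduler (the real helper in diffusers also handles a device argument and scheduler-specific kwargs):

```python
# Simplified sketch of the retrieve_timesteps pattern discussed above.
# The real helper in diffusers also accepts `device` and extra kwargs.
def retrieve_timesteps(scheduler, num_inference_steps=None, timesteps=None, sigmas=None):
    if timesteps is not None and sigmas is not None:
        raise ValueError("Only one of `timesteps` or `sigmas` can be passed.")
    if timesteps is not None:
        scheduler.set_timesteps(timesteps=timesteps)
    elif sigmas is not None:
        scheduler.set_timesteps(sigmas=sigmas)
    else:
        scheduler.set_timesteps(num_inference_steps=num_inference_steps)
    timesteps = scheduler.timesteps
    return timesteps, len(timesteps)


# Dummy stand-in for a diffusers scheduler, for illustration only.
class DummyScheduler:
    def set_timesteps(self, num_inference_steps=None, timesteps=None, sigmas=None):
        if timesteps is not None:
            self.timesteps = list(timesteps)
        elif sigmas is not None:
            self.timesteps = list(sigmas)
        else:
            self.timesteps = list(range(num_inference_steps, 0, -1))


ts, n = retrieve_timesteps(DummyScheduler(), num_inference_steps=4)
print(ts, n)  # [4, 3, 2, 1] 4
```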

return timesteps, num_inference_steps


def calculate_shift(

Same as above.
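For reference, calculate_shift in the existing SD3/Flux pipelines linearly interpolates the flow-matching shift parameter mu from the image sequence length. A self-contained sketch (default constants assumed from the existing pipelines; the PR's version may differ):

```python
# Sketch of the calculate_shift helper referenced above: linear interpolation
# of the flow-matching shift `mu` as a function of image sequence length.
# Default constants assumed from the existing SD3/Flux pipelines.
def calculate_shift(image_seq_len, base_seq_len=256, max_seq_len=4096,
                    base_shift=0.5, max_shift=1.16):
    m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    b = base_shift - m * base_seq_len
    return image_seq_len * m + b


print(calculate_shift(256))   # 0.5 at the base sequence length
print(calculate_shift(4096))  # ~1.16 at the max sequence length
```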

return timesteps, num_inference_steps


class SD3SetTimestepsStep(ModularPipelineBlocks):

Let's try to follow this semantic for the rest of the blocks as well.

Suggested change:
- class SD3SetTimestepsStep(ModularPipelineBlocks):
+ class StableDiffusion3SetTimestepsStep(ModularPipelineBlocks):

@property
def inputs(self) -> list[tuple[str, Any]]:
    return [
        InputParam("joint_attention_kwargs"),

Let's provide a type hint here.

return_dict=False,
)[0]

if block_state.do_classifier_free_guidance:

So, the guidance bit should be implemented using our guiders components. Refer to

def __call__(self, components: QwenImageModularPipeline, block_state: BlockState, i: int, t: torch.Tensor):

as an example.
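For reference, classifier-free guidance combines an unconditional and a conditional noise prediction; the guider components encapsulate this step so individual blocks don't branch on do_classifier_free_guidance themselves. A minimal numerical sketch of the combination rule (plain Python, not the diffusers guider API):

```python
# Minimal sketch of what a guider encapsulates: classifier-free guidance
# combines unconditional and conditional predictions as
#   pred = uncond + scale * (cond - uncond).
# Plain Python lists here; the real guider components operate on tensors
# and support additional guidance strategies.
def apply_cfg(noise_pred_uncond, noise_pred_cond, guidance_scale):
    return [u + guidance_scale * (c - u)
            for u, c in zip(noise_pred_uncond, noise_pred_cond)]


uncond = [0.0, 1.0]
cond = [1.0, 3.0]
print(apply_cfg(uncond, cond, guidance_scale=2.0))  # [2.0, 5.0]
```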


logger = logging.get_logger(__name__)

def retrieve_latents(

Same comment as above.

return [OutputParam(name="processed_image")]

@staticmethod
def check_inputs(height, width, vae_scale_factor, patch_size):

Good addition!
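For illustration, a shape check of this kind typically verifies that height and width are divisible by the effective downsampling factor, i.e. vae_scale_factor times the transformer patch_size. A self-contained sketch (the exact message and behavior in the PR may differ):

```python
# Sketch of a check_inputs-style validation (exact behavior in the PR may
# differ): SD3 latents are downsampled by the VAE and then patchified by the
# transformer, so height/width must be divisible by
# vae_scale_factor * patch_size.
def check_inputs(height, width, vae_scale_factor, patch_size):
    multiple = vae_scale_factor * patch_size
    if height % multiple != 0 or width % multiple != 0:
        raise ValueError(
            f"`height` and `width` must be divisible by {multiple}, "
            f"got height={height}, width={width}."
        )


check_inputs(1024, 1024, vae_scale_factor=8, patch_size=2)  # OK: 1024 % 16 == 0
```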

return prompt_embeds.to(dtype=components.text_encoder_3.dtype, device=device)

@staticmethod
def _get_clip_prompt_embeds(components, prompt, device, clip_skip, clip_model_index):

Any chance we could use
"# Copied from diffusers.pipelines.stable_diffusion_3.pipeline_stable_diffusion_3.StableDiffusion3Pipeline._get_t5_prompt_embeds"?
Perhaps with "# Copied from ... with self -> components"?
