Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement


Note: This is a simplified demo of RAG-Diffusion. For more complex layouts, please run our code directly.

Step 1. Choose layout example
Tip: Adjusting HB_replace and SR_delta can improve the generated image.
Step 2. Press “Run” to get results
Note: Errors may appear if computing resources are insufficient.
Layout Example ⬇️
Examples
Each example sets the following fields:
Layout
Prompt
Regional Fundamental Prompt: BREAK is the delimiter.
Regional Highly Descriptive Prompt: BREAK is the delimiter.
HB_replace: The number of times hard binding is applied. A higher value makes position control more precise but may produce visible boundaries.
SR_delta: The fusion strength between the image latent and the region-aware local latent. This parameter is flexible; try 0.25, 0.5, 0.75, or 1.0.
Number of inference steps
Guidance Scale
Seed
Random seed
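The two regional prompt fields and SR_delta can be illustrated with a small sketch. This is not the RAG-Diffusion implementation: split_regional_prompt and blend_latents are hypothetical helper names, the linear blend is only one possible reading of "fusion strength", and the [0, 1] bound on sr_delta is an assumption based on the suggested values.

```python
import re


def split_regional_prompt(text: str) -> list[str]:
    """Split a BREAK-delimited prompt into one stripped string per region.

    Word boundaries keep words such as "BREAKFAST" from being split.
    """
    parts = re.split(r"\bBREAK\b", text)
    return [part.strip() for part in parts if part.strip()]


def blend_latents(
    image_latent: list[float],
    regional_latent: list[float],
    sr_delta: float,
) -> list[float]:
    """Hypothetical linear blend of two latents (an assumption, not the paper's formula).

    sr_delta = 0 returns the image latent unchanged;
    sr_delta = 1 returns the regional latent unchanged.
    """
    if not 0.0 <= sr_delta <= 1.0:
        raise ValueError("sr_delta is assumed to lie in [0, 1]")
    if len(image_latent) != len(regional_latent):
        raise ValueError("latents must have the same length")
    return [
        (1.0 - sr_delta) * a + sr_delta * b
        for a, b in zip(image_latent, regional_latent)
    ]


if __name__ == "__main__":
    regions = split_regional_prompt("a red cat on the left BREAK a blue dog on the right")
    print(regions)
    print(blend_latents([0.0, 2.0], [1.0, 4.0], sr_delta=0.5))
```

Under this reading, a larger SR_delta gives the region-aware latent more weight in the fused result, which is why trying several values is worthwhile.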