Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers 🎨
- 2025-03-25: Our paper, with supplementary material, Attention Specialty for Diffusion Transformers, is now available on arXiv.
- 2025-03-25: We release the code!
A training-free method for DiT-based models (e.g., FLUX.1-dev, FLUX.1-schnell, SD v3.5) that allows users to precisely place instances and accurately attribute their representations in detailed multi-instance layouts from preliminary sketches, while maintaining overall image quality.
- arXiv paper with supplementary material
- Inference Code
- More demos. Coming soon, stay tuned! 🚀
- ComfyUI support
- Huggingface Space support
git clone https://github.com/bitzhangcy/MIS_DiT.git
cd MIS_DiT
conda create -n ast python=3.10
conda activate ast
pip install -r requirements.txt
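After installation, a quick optional sanity check (not part of the repository's instructions; it assumes requirements.txt installs PyTorch with CUDA support) is to confirm that PyTorch can see your GPU:

```python
# Optional environment check: print the installed PyTorch version and GPU visibility.
import torch
print(torch.__version__, torch.cuda.is_available())
```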
The default checkpoint is FLUX.1-dev (link). Additionally, FLUX.1-schnell and SD v3.5 are also supported, with FLUX.1-schnell utilizing different hyperparameters and SD v3.5 featuring a distinct model architecture and parameter set.
Get the access token from FLUX.1-dev and set it at line 63 in flux_hcrt.py as hf_token = "your_access_token".
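If you prefer not to hard-code the token, one common alternative (an assumption about your workflow, not something flux_hcrt.py requires) is to read it from an environment variable and authenticate with huggingface_hub before launching the script:

```python
# Optional: authenticate once via huggingface_hub instead of editing flux_hcrt.py.
# Assumes you have exported HF_TOKEN yourself, e.g. `export HF_TOKEN=your_access_token`.
import os
from huggingface_hub import login

login(token=os.environ["HF_TOKEN"])  # caches credentials for the gated FLUX.1-dev download
```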
You can quickly perform precise multi-instance synthesis using the following Gradio interface and the instructions below:
python flux_hcrt.py
- Create the image layout.
- Enter the text prompt and label each segment.
- Check the generated images, and tune the hyperparameters below if needed (a schematic sketch of what they modulate follows the list).
wc : Degree of the text-to-text (T2T) attention modulation module.
wd : Degree of the image-to-text (I2T) attention modulation module.
wf : Degree of the image-to-image (I2I) attention modulation module.
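These weights follow the DenseDiffusion idea of modulating attention logits. The snippet below is only a schematic sketch of that idea (the tensor names, the linear +w/-w bias, and the toy shapes are assumptions for illustration, not the code in flux_hcrt.py): same-segment query/key pairs have their logits raised and cross-segment pairs lowered, with wc, wd, and wf controlling the strength for the T2T, I2T, and I2I sub-blocks of the joint text-image attention map.

```python
# Schematic sketch only (assumptions, not the repository's implementation).
import torch

def modulate(scores: torch.Tensor, same_segment: torch.Tensor, w: float) -> torch.Tensor:
    """Add +w to logits of same-segment query/key pairs and -w to cross-segment pairs."""
    return scores + w * (2.0 * same_segment.float() - 1.0)

# Toy T2T example: 4 text tokens, the first two describe segment A, the last two segment B.
logits = torch.randn(4, 4)                   # pre-softmax text-to-text attention logits
seg = torch.tensor([0, 0, 1, 1])             # segment id of each text token
same = (seg[:, None] == seg[None, :])        # True where query and key share a segment
attn = torch.softmax(modulate(logits, same, w=0.5), dim=-1)  # w plays the role of wc for T2T
```

The same kind of bias, with wd and wf, would apply to the I2T and I2I sub-blocks; per the paper's title, the strengths are presumably also tuned step-layer-wise, i.e. varied across denoising steps and transformer layers.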
We sincerely thank the authors of DenseDiffusion for their open-source code, which serves as the foundation of our project.
If you find this repository useful, please cite using the following BibTeX entry:
@misc{zhang2025hierarchical,
title={Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers},
author={Zhang, Chunyang and Sun, Zhenhong and Zhang, Zhicheng and Wang, Junyan and Zhang, Yu and Gong, Dong and Mo, Huadong and Dong, Daoyi},
year={2025},
eprint={2504.10148},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2504.10148},
}
If you have any questions or suggestions, please feel free to contact us 😆!