This paper proposes a lightweight, data-efficient method for controllable generation of medical image-mask pairs. Our method fine-tunes Stable Diffusion on limited data (under 30 minutes and 24 GB of memory) and applies an automated quality assessment protocol to filter the generated samples, improving their reliability and diversity. During inference, a guide mask controls the shape and location of the lesion area. A lightweight diffusion model also serves as a mask generator, which improves the versatility of the generated images. Experiments on five segmentation tasks show that models trained with our synthetic data achieve an average accuracy improvement of 3%.
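To make the idea of a guide mask concrete, here is a minimal sketch that draws an elliptical lesion region at a chosen location. The helper `make_ellipse_mask`, the 512x512 size (chosen to match the training resolution), and the convention that white (255) marks the lesion area are illustrative assumptions, not part of the released code; check the repository's data format before using such masks.

```python
import numpy as np


def make_ellipse_mask(height, width, center, axes):
    """Return a uint8 mask (0 or 255) with a filled ellipse.

    Hypothetical helper for illustration only.
    center: (row, col) of the ellipse center in pixels.
    axes:   (row_radius, col_radius) of the ellipse in pixels.
    """
    cy, cx = center
    ry, rx = axes
    # Open grids of row and column indices that broadcast to (height, width).
    yy, xx = np.ogrid[:height, :width]
    inside = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    # Assumed convention: 255 = lesion region, 0 = background.
    return inside.astype(np.uint8) * 255


# A lesion centered at row 200, column 300, 80 px tall and 120 px wide.
mask = make_ellipse_mask(512, 512, center=(200, 300), axes=(40, 60))
```

Changing `center` moves the lesion and changing `axes` reshapes it, which is the kind of control over location and shape that the guide mask is meant to provide.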
# Create a new conda environment named 'MedDiff-FT' with Python 3.10.14
conda create -n MedDiff-FT python=3.10.14
# Activate the newly created environment
conda activate MedDiff-FT
# Change directory to the project folder
cd MedDiff-FT
# Install all required Python packages listed in requirements.txt
pip install -r requirements.txt
- HuggingFace Hub Path: stable-diffusion-v1-5/stable-diffusion-inpainting
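One possible way to fetch the base model from the Hub path above is `snapshot_download` from the `huggingface_hub` package. This is a sketch under assumptions: the function name `download_base_model` and the target directory `checkpoints/sd-inpainting` are hypothetical, and the repository may expect a different layout.

```python
BASE_MODEL_REPO = "stable-diffusion-v1-5/stable-diffusion-inpainting"


def download_base_model(local_dir="checkpoints/sd-inpainting"):
    """Download the base inpainting model and return its local path.

    Hypothetical helper; local_dir is an illustrative default.
    """
    # Imported lazily so the module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=BASE_MODEL_REPO, local_dir=local_dir)
```

The returned directory could then presumably be passed to the training script as `--pretrained_model_name_or_path`.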
cd MedDiff-FT/main
accelerate launch train.py --pretrained_model_name_or_path='/path/checkpoint' \
--instance_data_dir='../data' \
--output_dir='../check/test' \
--resolution=512 \
--train_batch_size=1 \
--gradient_accumulation_steps=2 \
--learning_rate=3e-6 \
--max_train_steps=500
cd MedDiff-FT/main
# --input_path: normal images; --label_path: generated masks
python infer.py \
--model_path /path/to/model \
--input_path /path/to/input_images \
--label_path /path/to/masks \
--out_path /path/to/output
- Initialize project repository
- Release training and inference code
- Release the Mask Generator code
- Release the Non-lesion Image Generator code
- Release the checkpoints
If you find this work helpful for your research, please consider citing our paper:
@misc{xie2025meddiffft,
title={MedDiff-FT: Data-Efficient Diffusion Model Fine-tuning with Structural Guidance for Controllable Medical Image Synthesis},
author={Xie Jianhao and Zhang Ziang and Weng Zhenyu and Zhu Yuesheng and Luo Guibo},
year={2025},
eprint={2507.00377},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2507.00377}
}