Difference between text to image training scripts #11598

nadavpo · 2025-05-21T18:27:29Z

nadavpo
May 21, 2025

Hi,
I’m trying to find the best approach for SD fine-tuning and I’m a little bit confused. Most of the scripts have the same structure with small changes between them.
I saw these options:

1. train_text_to_image.py - fine-tunes the UNet parameters
2. train_text_to_image_lora.py - trains LoRA layers in the UNet
3. train_dreambooth.py - fine-tunes the UNet parameters and, optionally, the text encoder as well
4. train_dreambooth_lora.py - same as 3, but trains LoRA layers instead of the full set of parameters

Am I right? Are these the differences between the scripts?
I also saw that DreamBooth has a special dataset class, but I didn’t understand what is special about it.
Besides that, there is a script for VAE training. Can’t I just add the VAE parameters to the optimizer in one of the scripts above?

Thank you

asomoza · 2025-05-21T21:33:07Z

asomoza
May 21, 2025
Maintainer

DreamBooth is a technique for training text-to-image models on a single subject; it is not the same as a general text-to-image fine-tune or a text-to-image LoRA. You can read more about it here.
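To make the full fine-tune versus LoRA distinction concrete, here is a minimal sketch of the parameter arithmetic for a single linear layer. The layer size (320 x 768) and the rank of 4 are illustrative assumptions, not values read from any of the scripts. The idea is that LoRA keeps the original weight W frozen and trains two small matrices, B (d_out x r) and A (r x d_in), whose product is added to W.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole matrix W (d_out x d_in) is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights when W is frozen and only B (d_out x r) and A (r x d_in) are trained."""
    return rank * (d_out + d_in)


# Illustrative sizes only (assumed for this example, not taken from a real model).
d_out, d_in, rank = 320, 768, 4
full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full fine-tune: {full} weights, LoRA rank {rank}: {lora} weights")
```

With these assumed sizes the LoRA update trains a small fraction of the weights of the full matrix, which is why the LoRA scripts are much lighter on memory even though the training loop looks almost the same.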

It's not in all the examples, but you can train the text encoders with any of them; it's just optional. Some full fine-tunes also trained the text encoders. It's supposed to bring better results, but that really depends on what you're training. You can also do more specific training with the advanced scripts, such as DoRA or B-LoRA.
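In practice, "optional" here mostly comes down to which parameters are handed to the optimizer. The following is a minimal sketch of that idea, not code from the scripts: `Component`, `select_trainable`, and the parameter counts are all hypothetical stand-ins, and the flag name mirrors the `--train_text_encoder` option found in the DreamBooth scripts.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Stand-in for a model component; `num_params` is a made-up size."""
    name: str
    num_params: int
    trainable: bool = False


def select_trainable(components, train_text_encoder=False):
    """Mark the UNet as trainable, and the text encoder only when requested."""
    for c in components:
        c.trainable = c.name == "unet" or (train_text_encoder and c.name == "text_encoder")
    return [c for c in components if c.trainable]


# Hypothetical pipeline parts; the VAE stays frozen in both cases.
parts = [Component("vae", 80), Component("text_encoder", 120), Component("unet", 860)]
unet_only = [c.name for c in select_trainable(parts)]
with_text_encoder = [c.name for c in select_trainable(parts, train_text_encoder=True)]
print(unet_only, with_text_encoder)
```

A script without the option can usually be adapted in the same spirit: unfreeze the text encoder and add its parameters to the list passed to the optimizer.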

About the VAE: no, you can't just add its parameters, because it's a different model with a different kind of training. The UNet/transformer model uses the embeddings from the text encoders to generate images, while the VAE is used for encoding images into latents and decoding latents back into images.
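As a rough sketch of why the two kinds of training differ, the usual objectives can be written side by side. The notation is assumed here and not taken from the scripts: $\mathcal{E}$ and $\mathcal{D}$ are the VAE encoder and decoder, $\epsilon_\theta$ is the UNet/transformer, and $c$ is the text embedding. The denoiser loss is shown in its noise-prediction form, and real VAE training recipes often add further terms such as perceptual or adversarial losses.

```latex
% Denoiser (UNet/transformer) objective: the VAE encoder is frozen and only supplies latents.
z_0 = \mathcal{E}(x), \qquad
z_t = \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon, \qquad
\mathcal{L}_{\text{denoiser}} = \mathbb{E}_{x, c, \epsilon, t}\left[ \lVert \epsilon - \epsilon_\theta(z_t, t, c) \rVert_2^2 \right]

% VAE objective: reconstruct the image itself, with a KL regularizer on the latent distribution.
\mathcal{L}_{\text{VAE}} = \mathbb{E}_{x}\left[ \lVert x - \mathcal{D}(z) \rVert_2^2 \right]
  + \beta\, \mathrm{KL}\left( q(z \mid x) \,\Vert\, \mathcal{N}(0, I) \right), \qquad z \sim q(z \mid x)
```

The denoiser loss never compares against pixels and the VAE loss never sees the text embedding, so putting the VAE parameters into the denoiser's optimizer would not train the VAE for the job it actually does.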

Finally, the training scripts are meant for learning and for you to improve on, so you can read them, try to understand what's being done, and add more specific parts for your needs.

4 replies

nadavpo May 22, 2025
Author

Thanks @asomoza.
I’m a little bit familiar with DreamBooth. I meant that in the training scripts I don’t see large differences between the methods I mentioned.
For example, I expected the DreamBooth scripts to have some parameters for a token, so that the model learns from sentences with and without this token, as in the diagram below, which was taken from the original paper.

asomoza May 22, 2025
Maintainer

In the DreamBooth training scripts, what you're referring to are the class and instance prompts. They are present both in the arguments and in the code, which you can search and read.
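To illustrate how instance and class prompts typically show up in a DreamBooth-style dataset, here is a simplified toy version. `ToyDreamBoothDataset`, the file names, and the "sks" token are hypothetical; real dataset classes also load, resize, and tokenize, which is omitted here. The part that is special compared with a plain text-to-image dataset is that every sample pairs a subject image (prompt with the token) with a generic class image (prompt without it).

```python
class ToyDreamBoothDataset:
    """Pairs each instance example with a class example (hypothetical toy version)."""

    def __init__(self, instance_images, instance_prompt, class_images=None, class_prompt=None):
        # At least one instance image is required; class images are optional.
        self.instance_images = list(instance_images)
        self.instance_prompt = instance_prompt
        self.class_images = list(class_images or [])
        self.class_prompt = class_prompt
        # Iterate over the larger set so every image is seen at least once per epoch.
        self._length = max(len(self.instance_images), len(self.class_images))

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        # Wrap around with modulo: a handful of instance images is reused many times.
        example = {
            "instance_image": self.instance_images[index % len(self.instance_images)],
            "instance_prompt": self.instance_prompt,
        }
        if self.class_images:
            example["class_image"] = self.class_images[index % len(self.class_images)]
            example["class_prompt"] = self.class_prompt
        return example


dataset = ToyDreamBoothDataset(
    instance_images=["dog_0.jpg", "dog_1.jpg"],
    instance_prompt="a photo of sks dog",
    class_images=["class_0.jpg", "class_1.jpg", "class_2.jpg"],
    class_prompt="a photo of a dog",
)
print(len(dataset), dataset[2])
```

Searching the scripts for `instance_prompt` and `class_prompt` should lead to the corresponding arguments and to the place where the two prompts are attached to each sample.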

There isn't that much difference between the methods. DreamBooth is basically just a focused text-to-image fine-tune that can overfit the model really fast compared to normal LoRAs and fine-tuning. I don't know if that's what you're referring to, but with some small changes you can convert either script to do DreamBooth or not.
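One of those "small changes" is the loss. A minimal sketch, assuming the usual prior-preservation formulation: the ordinary denoising loss on the subject images plus a weighted copy of the same loss on class images. The function names and the plain-list inputs are hypothetical; in the scripts the corresponding options are `--with_prior_preservation` and `--prior_loss_weight`, and the losses are computed on tensors.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target, class_pred=None, class_target=None,
                    prior_loss_weight=1.0):
    """Instance loss, plus a weighted prior-preservation term when class data is given."""
    loss = mse(instance_pred, instance_target)
    if class_pred is not None and class_target is not None:
        loss += prior_loss_weight * mse(class_pred, class_target)
    return loss


# Without class data this is the same loss a plain text-to-image fine-tune would use.
plain = dreambooth_loss([0.0, 2.0], [0.0, 0.0])
with_prior = dreambooth_loss([0.0, 2.0], [0.0, 0.0], [1.0, 1.0], [0.0, 0.0],
                             prior_loss_weight=0.5)
print(plain, with_prior)
```

Dropping the class term (and the class images) leaves a regular fine-tune on a very small dataset, which is also why it overfits quickly.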

nadavpo May 22, 2025
Author

OK, understood, thanks!

Last question: do you know what the difference is between train_custom_diffusion.py and train_text_to_image.py? Don’t they do the same thing?

asomoza May 22, 2025
Maintainer

I haven't really read the custom diffusion script (it was added before I was part of the team). I did a quick read and it seems more similar to DreamBooth than to a text-to-image fine-tune: it uses the instance and class prompts, and the main difference is that it trains only some specific layers. Maybe it does something more, but I don't have the time to read the whole code or the paper right now.
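As a hedged illustration of what "trains only some specific layers" can look like: the Custom Diffusion paper describes updating only the key and value projections of the cross-attention layers (plus, optionally, a new token embedding). The sketch below assumes that description and filters parameters by name. The helper and the parameter names are hypothetical, written in the style of a diffusers UNet, and have not been checked against train_custom_diffusion.py.

```python
def select_cross_attention_kv(param_names):
    """Keep only key/value projection weights of cross-attention (`attn2`) layers."""
    return [
        name for name in param_names
        if "attn2" in name and (".to_k." in name or ".to_v." in name)
    ]


# Hypothetical parameter names, written in the style of a diffusers UNet.
names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_k.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_q.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight",
    "down_blocks.0.resnets.0.conv1.weight",
]
trainable = select_cross_attention_kv(names)
print(trainable)
```

Everything not selected would stay frozen, which is the main contrast with train_text_to_image.py, where the whole UNet is updated.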
