Description
Hi, this is Phyllis. I really appreciate your marvelous work on SD-VAE compression and acceleration!!!! I am currently training an AutoencoderTiny decoder from scratch using the LDM training structure, but I find that the generated images are not very clear. My training is exactly the same as LDM training apart from your extra loss, distance(disc(real).mean(), disc(fake).mean()), in my decoder/generator objective (the extra term indeed helped with stability and FID, many thanks!!!!). I train the decoder using SD1.5 encoder outputs as input for 190k steps with batch_size=4 and lr=1e-4, but the generated images are still not clear. Would you mind giving some hints (losses used, number of training steps, any finetuning stage?) on how to align with your TAESD results?
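For reference, here is a minimal sketch of the extra term I mean, as I currently understand it from your description: a distance between the discriminator's mean score on real images and its mean score on decoder reconstructions. The function name `adversarial_gap_loss` and the choice of absolute difference as the distance are my own assumptions, not necessarily what TAESD actually uses:

```python
import numpy as np

def adversarial_gap_loss(disc_real_scores, disc_fake_scores):
    """Hypothetical sketch: penalize the gap between the discriminator's
    mean score on real images and on decoder reconstructions.
    The absolute-difference distance here is an assumption on my part."""
    gap = np.mean(disc_real_scores) - np.mean(disc_fake_scores)
    return float(np.abs(gap))

# Example: identical mean scores give zero extra loss.
real_scores = np.array([0.9, 0.8, 1.0])
fake_scores = np.array([0.9, 0.8, 1.0])
loss = adversarial_gap_loss(real_scores, fake_scores)
```

In my setup this term is simply added to the usual LDM reconstruction/perceptual losses with a fixed weight; if you weight or schedule it differently, that could explain the gap in my results.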

