Description
Hey,
I have an NVIDIA GeForce GTX 1660 SUPER (6 GB), and I want to train LoRA models with it.
This is my configuration:
```sh
accelerate launch --num_cpu_threads_per_process 4 train_network.py \
  --network_module="networks.lora" \
  --pretrained_model_name_or_path=/mnt/models/animefull-final-pruned.ckpt \
  --vae=/mnt/models/animevae.pt \
  --train_data_dir=/mnt/datasets/character \
  --output_dir=/mnt/out --output_name=character \
  --caption_extension=.txt --shuffle_caption \
  --prior_loss_weight=1 \
  --network_dim=128 --network_alpha=128 \
  --resolution=512 \
  --enable_bucket --min_bucket_reso=320 --max_bucket_reso=768 \
  --train_batch_size=1 --gradient_accumulation_steps=1 \
  --learning_rate=0.0001 --text_encoder_lr=0.00005 --lr_scheduler=constant \
  --max_train_epochs=20 \
  --mixed_precision=fp16 --save_precision=fp16 \
  --use_8bit_adam --xformers \
  --save_every_n_epochs=1 --save_model_as=safetensors \
  --clip_skip=2 --max_token_length=225 \
  --flip_aug --color_aug --face_crop_aug_range="2.0,4.0"
```
The training directory is named 3_Concept1, so 3 repeats are used.
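For reference, the dataset follows the usual kohya-ss `<repeats>_<name>` subfolder convention (the file names below are just placeholders):

```
/mnt/datasets/character/
└── 3_Concept1/
    ├── 001.png
    ├── 001.txt   # caption file, matching --caption_extension=.txt
    └── ...
```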
The script does not throw any errors, but the loss is NaN and the resulting U-Nets come out corrupted.
I've tried setting --mixed_precision to no, but then I run out of VRAM.
I've also tried disabling xformers, but again, I run out of VRAM.
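In case it helps, here is a minimal sketch (not part of the training script; the tensor sizes are arbitrary) of the kind of check that should show whether fp16 math itself already misbehaves on this GPU:

```python
# Hypothetical fp16 sanity check: do plain half-precision matmuls, and the same
# under autocast (which is what --mixed_precision=fp16 uses), stay finite?
import torch

assert torch.cuda.is_available()

x = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
w = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
print("fp16 matmul has NaN:", torch.isnan(x @ w).any().item())

with torch.autocast(device_type="cuda", dtype=torch.float16):
    y = torch.randn(1024, 1024, device="cuda") @ torch.randn(1024, 1024, device="cuda")
print("autocast matmul has NaN:", torch.isnan(y).any().item())
```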
I've compiled xformers myself using `pip install ninja && MAX_JOBS=4 pip install -v .`
I've also tried several other xformers versions, such as 0.0.16 and the one suggested in the README.
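A small smoke test like the following (shapes are arbitrary, in the [batch, seq, heads, head_dim] layout) should confirm that the self-built wheel actually imports and that memory_efficient_attention runs in fp16 without producing NaNs:

```python
# Hypothetical xformers smoke test for the self-built wheel.
import torch
import xformers
import xformers.ops

print("xformers version:", xformers.__version__)

q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
out = xformers.ops.memory_efficient_attention(q, q, q)  # self-attention, q=k=v
print("attention output has NaN:", torch.isnan(out).any().item())
```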
I've tried both CUDA 11.6 and 11.7.
Python version: 3.10.6
PyTorch version: `torch==1.12.1+cu116`, `torchvision==0.13.1+cu116`
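For completeness, this quick report (run from the same venv that `accelerate launch` uses) shows what PyTorch itself sees; the expected values in the comments are my assumptions based on the versions above:

```python
import torch

print(torch.__version__)                    # expect 1.12.1+cu116
print(torch.version.cuda)                   # expect 11.6
print(torch.cuda.is_available())            # expect True
print(torch.cuda.get_device_name(0))        # expect NVIDIA GeForce GTX 1660 SUPER
print(torch.cuda.get_device_capability(0))  # Turing TU116 reports (7, 5)
```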
Any help is much appreciated!
Thank you!