
loss=nan on 1660 SUPER 6GB #293

@martianbit

Description

Hey,
I have an NVIDIA GeForce GTX 1660 SUPER (6 GB) and I want to train LoRA models with it.
This is my configuration:

accelerate launch --num_cpu_threads_per_process 4 train_network.py \
  --network_module="networks.lora" \
  --pretrained_model_name_or_path=/mnt/models/animefull-final-pruned.ckpt \
  --vae=/mnt/models/animevae.pt \
  --train_data_dir=/mnt/datasets/character \
  --output_dir=/mnt/out --output_name=character \
  --caption_extension=.txt --shuffle_caption \
  --prior_loss_weight=1 --network_alpha=128 \
  --resolution=512 --enable_bucket --min_bucket_reso=320 --max_bucket_reso=768 \
  --train_batch_size=1 --gradient_accumulation_steps=1 \
  --learning_rate=0.0001 --text_encoder_lr=0.00005 \
  --max_train_epochs=20 \
  --mixed_precision=fp16 --save_precision=fp16 \
  --use_8bit_adam --xformers \
  --save_every_n_epochs=1 --save_model_as=safetensors \
  --clip_skip=2 --flip_aug --color_aug --face_crop_aug_range="2.0,4.0" \
  --network_dim=128 --max_token_length=225 --lr_scheduler=constant

The training image directory is named 3_Concept1, so 3 repeats are used.
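For reference, the dataset layout looks roughly like this (the image and caption file names below are just placeholders):

/mnt/datasets/character/
└── 3_Concept1/
    ├── 001.png
    ├── 001.txt
    ├── 002.png
    └── 002.txt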
The script does not throw any errors, but loss=nan is reported and the saved U-Nets come out corrupted.
I've tried setting mixed_precision to no, but then I run out of VRAM.
I've also tried disabling xformers, but again I run out of VRAM.
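In case it helps narrow things down, this is the kind of minimal fp16 check I can run outside of train_network.py (just a sketch, the tensor sizes are arbitrary):

import torch

# Standalone fp16 sanity check on the GPU, independent of the training script.
# If NaNs already show up here, the problem is in the card/driver fp16 path
# rather than in the LoRA training code.
device = torch.device("cuda")
x = torch.randn(1024, 1024, device=device, dtype=torch.float16)
w = torch.randn(1024, 1024, device=device, dtype=torch.float16)

y_fp16 = x @ w
print("NaN in fp16 matmul:", torch.isnan(y_fp16).any().item())

# Same multiplication in fp32 for comparison.
y_fp32 = x.float() @ w.float()
print("NaN in fp32 matmul:", torch.isnan(y_fp32).any().item())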
I've compiled xformers myself from source, using pip install ninja && MAX_JOBS=4 pip install -v . inside the xformers checkout.
I've also tried several other xformers versions, such as 0.0.16 and the one suggested in the README.
I've tried both CUDA 11.6 and 11.7.
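Similarly, a quick way I can exercise the xformers path directly (again just a sketch; the shapes loosely mimic SD attention with head size 40, nothing from the training code):

import torch
import xformers
import xformers.ops as xops

print("xformers version:", xformers.__version__)

# Small fp16 memory-efficient attention call, independent of the training script.
# Shapes are (batch * heads, sequence, head_dim), chosen only for this check.
q = torch.randn(8, 64, 40, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = xops.memory_efficient_attention(q, k, v)
print("NaN in xformers attention output:", torch.isnan(out).any().item())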

Python version: 3.10.6
PyTorch version: torch==1.12.1+cu116 torchvision==0.13.1+cu116

Any help is much appreciated!
Thank you!
