For the weights linked in this repo, what kind of GPUs was the model trained on, and how many were used? And how long did the training take?