Hi! I was trying to run the inference.ipynb notebook and got a RuntimeError.
When I ran model = LlamaForCausalLM.from_pretrained(train_config.model_name, device_map="auto", config=config).to(device), I got RuntimeError: You can't move a model that has some modules offloaded to cpu or disk.
I suspect it is related to device = "cuda" if torch.cuda.is_available() else "cpu", but my GPU memory is available and not full.
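For context, this is the device selection as I understand it from the notebook (a minimal sketch; the full notebook also builds config and train_config before loading the model):

```python
import torch

# Device selection from the notebook: prefer the GPU when one is visible.
device = "cuda" if torch.cuda.is_available() else "cpu"

# My understanding (please correct me if wrong): from_pretrained(...,
# device_map="auto") already dispatches the model's modules across devices,
# possibly offloading some to CPU or disk, so the trailing .to(device) then
# tries to move an already-dispatched model and raises the RuntimeError.
print(device)
```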
How can I solve this? Thank you.
Best regards,
Maggie
