RuntimeError: Invalid device string: 'cuda:None'

During training on a single Tesla V100-PCIE-16GB, I get the following error:

```shell
Train:   0%|          | 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/anaconda/envs/rtfm/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/anaconda/envs/rtfm/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/dev-medekm-gpu/code/Users/michael.medek/rtfm/rtfm/finetune.py", line 451, in <module>
    main(
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/dev-medekm-gpu/code/Users/michael.medek/rtfm/rtfm/finetune.py", line 408, in main
    results = train(
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/dev-medekm-gpu/code/Users/michael.medek/rtfm/rtfm/train_utils.py", line 274, in train
    batch[key] = batch[key].to(f"cuda:{local_rank}")
RuntimeError: Invalid device string: 'cuda:None'
Train:   0%| 
```

This traces to the following line:

https://github.com/mlfoundations/rtfm/blob/9884a6be456db74941d48babc1293a2eefb4df6b/rtfm/train_utils.py#L274

There, `local_rank` is `None`, so the f-string produces `'cuda:None'`, which is not a valid device string. How is this supposed to work? The function's default is `local_rank=None`, which looks invalid, since the value has to be an int, right? By contrast, `evaluate()` declares only `local_rank: int`.

Adding the following works around the issue:

```python
local_rank = 0
rank = 0
print("WARNING! Overwriting local_rank and rank to 0!")
```
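
A less invasive variant might resolve the rank only when it is missing instead of always overwriting it. The sketch below is an assumption on my side, not the repository's code: `resolve_local_rank` and `device_string` are hypothetical helper names, and it assumes that multi-GPU runs are started with a launcher such as `torchrun`, which exports `LOCAL_RANK` for each worker, while a plain single-GPU launch leaves it unset.

```python
import os
from typing import Optional


def resolve_local_rank(local_rank: Optional[int] = None) -> int:
    """Return a usable GPU index for this process (hypothetical helper)."""
    if local_rank is not None:
        # An explicitly passed rank always wins.
        return int(local_rank)
    # Assumption: distributed launchers such as torchrun set LOCAL_RANK;
    # when it is absent we fall back to GPU 0 for single-device runs.
    return int(os.environ.get("LOCAL_RANK", "0"))


def device_string(local_rank: Optional[int] = None) -> str:
    """Build a device string that can never be 'cuda:None'."""
    return f"cuda:{resolve_local_rank(local_rank)}"
```

With something like this, the failing line could read `batch[key] = batch[key].to(device_string(local_rank))`, so a single-GPU run falls back to `cuda:0` and a distributed run keeps its per-process device.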
