Closed
Labels: Hacktoberfest, PyDataGlobal, bug, help wanted, module: engine
Description
🐛 Bug description
When a checkpoint saved while training with an Engine is loaded and training is resumed with a DeterministicEngine, the following error is raised:
src/training/engine.py:44: in valid
self.trainer.valid(self.my_task, checkpoint_path)
src/training/trainers/single_trainer.py:65: in valid
valid_engine.run(valid_loader)
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/engine/engine.py:701: in run
return self._internal_run()
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/engine/engine.py:774: in _internal_run
self._handle_exception(e)
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/engine/engine.py:469: in _handle_exception
raise e
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/engine/engine.py:751: in _internal_run
self._fire_event(Events.EPOCH_COMPLETED)
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/engine/engine.py:424: in _fire_event
func(*first, *(event_args + others), **kwargs)
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/handlers/checkpoint.py:373: in __call__
checkpoint = self._setup_checkpoint()
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/handlers/checkpoint.py:437: in _setup_checkpoint
checkpoint[k] = obj.state_dict()
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/engine/deterministic.py:186: in state_dict
state_dict = super(DeterministicEngine, self).state_dict()
../../miniconda3/envs/training-py36/lib/python3.6/site-packages/ignite/engine/engine.py:504: in state_dict
return OrderedDict([(k, getattr(self.state, k)) for k in keys])
.0 = <tuple_iterator object at 0x166c104a8>
> return OrderedDict([(k, getattr(self.state, k)) for k in keys])
E AttributeError: 'State' object has no attribute 'rng_states'
Expected behavior:
Since the checkpoint doesn't contain rng_states, DeterministicEngine should print a warning and ignore the missing rng_states (recreating them on the fly) instead of raising.
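To make the requested behavior concrete, here is a minimal sketch in plain Python. It does not use Ignite's code: `State` is a hypothetical stand-in for the engine state, and `tolerant_state_dict` is an illustrative helper name, not an existing Ignite function. It only shows the idea of warning and falling back to `None` when an expected key such as `rng_states` is missing, instead of letting `getattr` raise `AttributeError`.

```python
import warnings
from collections import OrderedDict


class State:
    """Hypothetical stand-in for an engine state: a plain attribute bag."""

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            setattr(self, name, value)


def tolerant_state_dict(state, keys):
    """Collect `keys` from `state`; warn and store None when a key is missing.

    Illustrative helper only (not part of Ignite).
    """
    result = OrderedDict()
    for key in keys:
        if hasattr(state, key):
            result[key] = getattr(state, key)
        else:
            warnings.warn(
                "State has no attribute '{}'; storing None instead.".format(key)
            )
            result[key] = None
    return result


# A state restored from a checkpoint written by a plain Engine: no rng_states.
restored = State(epoch=3, epoch_length=100, max_epochs=10)
keys = ("epoch_length", "max_epochs", "epoch", "rng_states")

# Record warnings so the example can show that exactly one was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    checkpoint = tolerant_state_dict(restored, keys)

print(checkpoint["rng_states"])  # None
print(len(caught))               # 1
```

Whether the actual fix belongs in `state_dict`, in `load_state_dict`, or in the engine's setup step is a design decision for the maintainers; the sketch only illustrates the warn-and-continue behavior described above.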
How to reproduce:
- Train a pytorch model with an Engine
- Save a checkpoint
- Resume the training using a DeterministicEngine
Environment
- PyTorch Version (e.g., 1.4): 1.7.1
- Ignite Version (e.g., 0.3.0): 0.4.6
- OS (e.g., Linux): macOS
- How you installed Ignite (conda, pip, source): pip
- Python version: 3.6.10