[bug-fix] Fix save/restore critic, add test #5062


Merged

ervteng merged 3 commits into main from develop-fix-resume on Mar 10, 2021

Conversation

@ervteng (Contributor) commented Mar 9, 2021

Proposed change(s)

Add the critic to the optimizer's list of modules so that it is saved and restored properly. The bug was introduced in #4939 and doesn't affect older releases of ML-Agents.
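For reference, a minimal sketch of the shape of this fix, not the actual diff: it assumes the optimizer exposes the get_modules() mapping mentioned later in this thread and that the model saver checkpoints every entry it returns. The class name SketchOptimizer, the attribute names, and the dictionary keys are illustrative.

```python
import torch
from torch import nn

# Minimal sketch of the shape of the fix (illustrative names, not the PR's code):
# the optimizer must list its critic among the modules handed to the model saver,
# otherwise the critic's weights never reach the checkpoint and a resumed run
# starts with a freshly initialized critic.
class SketchOptimizer:
    def __init__(self, actor: nn.Module, critic: nn.Module):
        self.critic = critic
        self.optimizer = torch.optim.Adam(
            list(actor.parameters()) + list(critic.parameters()), lr=3e-4
        )

    def get_modules(self):
        # Everything returned here is saved and restored with the checkpoint.
        return {
            "Optimizer:adam": self.optimizer,  # Adam moments / step counts
            "Optimizer:critic": self.critic,   # the fix: include the critic network
        }
```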

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

@ervteng requested review from dongruoping and andrewcoh on March 9, 2021 at 16:19
@dongruoping (Contributor) left a comment:

Is this able to run on GPU? In _compare_two_policies we move the actors explicitly and I just saw that we also added self.actor.to(default_device()) to TorchPolicy. I didn't see that in optimizers and I wonder if we need to do the same thing as well.

model_saver.save_checkpoint("MockBrain", 2000)

# create a new optimizer and policy
optimizer2 = OptimizerClass(policy, trainer_settings)
Contributor (inline comment on the snippet above):

Is it intentional that optimizer2 is using policy, not policy2?

Contributor Author (reply):

No, this should be policy2 - good catch.
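To make the thread easier to follow, here is a rough sketch of how the corrected round-trip test could look. It assumes a TorchModelSaver-style API (register / initialize_or_load / save_checkpoint), a policy factory helper, and a comparison function like _compare_two_policies; the import path, helper names, and argument lists are assumptions for illustration, not the exact test added in this PR.

```python
from mlagents.trainers.model_saver.torch_model_saver import TorchModelSaver  # assumed import path


def check_optimizer_save_restore(
    OptimizerClass, trainer_settings, create_policy_mock, compare_two_policies, tmp_path
):
    # Build a policy/optimizer pair, register both, and write a checkpoint.
    policy = create_policy_mock(trainer_settings)
    optimizer = OptimizerClass(policy, trainer_settings)

    saver = TorchModelSaver(trainer_settings, str(tmp_path))
    saver.register(policy)
    saver.register(optimizer)  # with this PR, the critic is in optimizer.get_modules()
    saver.initialize_or_load()
    saver.save_checkpoint("MockBrain", 2000)

    # Create a fresh policy and optimizer, then restore the checkpoint into them.
    policy2 = create_policy_mock(trainer_settings)
    optimizer2 = OptimizerClass(policy2, trainer_settings)  # fixed: policy2, not policy

    saver2 = TorchModelSaver(trainer_settings, str(tmp_path), load=True)
    saver2.register(policy2)
    saver2.register(optimizer2)
    saver2.initialize_or_load()

    # If the critic was saved and restored correctly, the restored policy (and the
    # values produced by the restored critic) should match the originals.
    compare_two_policies(policy, policy2)
```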

@ervteng (Contributor Author) commented Mar 9, 2021

> Is this able to run on GPU? In _compare_two_policies we move the actors explicitly and I just saw that we also added self.actor.to(default_device()) to TorchPolicy. I didn't see that in optimizers and I wonder if we need to do the same thing as well.

I think I need to add this too, but the different optimizers have different components. I'll try iterating through get_modules() and moving each of those to the default device.

Edit: Never mind, this should work because the optimizer moves itself to the default device on initialization. So if it fails, it means the optimizer itself is broken.
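For completeness, a sketch of the iteration idea described above, which per the edit turned out to be unnecessary: walk get_modules() and move each torch module to the default device. The import path for default_device and the helper name are assumptions.

```python
import torch
from mlagents.torch_utils import default_device  # assumed import path


# Sketch of the approach considered above (ultimately not needed, since the
# optimizer already moves itself to the default device on initialization):
# walk whatever the optimizer exposes via get_modules() and move each
# torch.nn.Module onto the default device.
def move_optimizer_modules(optimizer) -> None:
    for module in optimizer.get_modules().values():
        if isinstance(module, torch.nn.Module):
            module.to(default_device())
        # torch.optim.Optimizer instances track device through their parameters,
        # so they need no explicit move here.
```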

@ervteng merged commit 78cb833 into main on Mar 10, 2021.
The delete-merged-branch bot deleted the develop-fix-resume branch on March 10, 2021 at 15:09.
The github-actions bot locked this conversation as resolved and limited it to collaborators on Mar 10, 2022.