Update distributed example tests in run_python_examples.sh
#1250
Hi @sirutBuasai! Thank you for your pull request and welcome to our community.
Action Required
In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.
Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged. If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
I wonder if we should just move over those lines to https://github.com/sirutBuasai/examples/blob/release/2.3/.github/workflows/main_distributed.yaml which calls this script https://github.com/pytorch/examples/blob/main/run_distributed_examples.sh
Why not add these lines to the distributed test script instead?
Please go for it, it'd be an extremely useful contribution
I've added the script to the distributed test script. I considered just calling Let me know what you think about whether we should consolidate the distributed tests into
Indeed I'm suggesting that all distributed tests should only be in run_distributed_examples.sh
DRY'd the code a little but it might be heavier than necessary. lmk if there are any other changes I should make.
Cool thank you!
…#1250)
* Fix distributed test
* fix parallel scripts
* install dill
* remove dill
* run 2 gpu
* remove gpucount, use default
* Add examples to distributed examples
* refactor distributed test
* fx ERRORS overwriting
* run with base dir
* remove distributed from run_python_examples.sh
* move basedir to source
* separate init
---------
Co-authored-by: Sirut Buasai <[email protected]>
Update distributed example tests in run_python_examples.sh to use run_examples.sh. The current script fails the distributed tests when they retrieve torchrun environment variables such as WORLD_SIZE, because those variables are not set unless the tests are launched from run_examples.sh.
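For context, here is a minimal sketch of the failure mode described above. It is not code from this PR: the helper name read_torchrun_env and the error message are hypothetical, and the only assumption about torchrun is that it exports RANK, LOCAL_RANK, and WORLD_SIZE to each worker process. A script that reads these variables directly fails with a bare KeyError when started outside torchrun; checking for them up front gives a clearer message.

```python
import os

# Variables that torchrun sets for every worker it launches.
TORCHRUN_VARS = ("RANK", "LOCAL_RANK", "WORLD_SIZE")


def read_torchrun_env(env=None):
    """Return the torchrun variables as ints (hypothetical helper).

    `env` defaults to os.environ; passing a dict makes the function testable.
    """
    env = os.environ if env is None else env
    missing = [name for name in TORCHRUN_VARS if name not in env]
    if missing:
        # Outside torchrun these variables are normally unset, so fail clearly.
        raise RuntimeError(
            "missing torchrun variables: " + ", ".join(missing)
            + "; was this script launched with torchrun?"
        )
    return {name: int(env[name]) for name in TORCHRUN_VARS}


if __name__ == "__main__":
    try:
        print(read_torchrun_env())
    except RuntimeError as err:
        print(f"not running under torchrun: {err}")
```

Run directly with python, the script would typically report the missing variables; launched through torchrun, it prints the rank and world size for each worker.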