fix: OIDC Parallel Requests error #350

Merged (1 commit) on Jan 18, 2022

paragbhingre

This PR fixes issue #299.

Description - When multiple workflows try to assume a role at the same time, the OIDC provider returns an error: "Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements".

With this fix, a request that fails to assume a role is retried instead of failing the workflow. The retryAndBackoff logic waits a random amount of time before each retry, which makes it unlikely that two parallel requests will ask the OIDC provider for credentials at the same moment.
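The retry behavior described above can be sketched roughly as follows. This is an illustrative Python sketch of retry with randomized exponential backoff, not the action's actual `retryAndBackoff` code; the function name `retry_with_backoff`, its parameters, and the default values (12 attempts, 50 ms base delay) are all assumptions made for the example.

```python
import random
import time


def retry_with_backoff(fn, is_retryable=lambda exc: True, max_attempts=12,
                       base_delay=0.05, sleep=time.sleep, rng=random.random):
    """Call fn(), retrying failures after a random, exponentially growing delay.

    All names and defaults here are hypothetical; they are not taken from
    the configure-aws-credentials source.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            is_last_attempt = attempt == max_attempts - 1
            if is_last_attempt or not is_retryable(exc):
                # Give up: re-raise the original error to the caller.
                raise
            # Random wait in [0, base_delay * 2**attempt) so that parallel
            # callers are unlikely to retry at the same instant.
            sleep(rng() * base_delay * (2 ** attempt))
```

A caller would wrap the role-assumption call, for example `retry_with_backoff(lambda: assume_role(params))`, where `assume_role` stands in for whatever function requests the credentials. The `sleep` and `rng` parameters exist only so the delays can be controlled in tests.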

Screenshots -
We ran 40 parallel workflows, and all of them were able to assume a role successfully.

[Screenshot: Screen Shot 2022-01-10 at 2 26 31 PM]

[Screenshot: Screen Shot 2022-01-10 at 2 29 27 PM]

GrahamCampbell left a comment

Great change. I've been running into this issue myself.

paragbhingre merged commit 6b3c017 into aws-actions:master on Jan 18, 2022
facebook-github-bot pushed a commit to pytorch/torchx that referenced this pull request Feb 11, 2022
Summary:
Our integration tests are hitting an error when acquiring AWS credentials.
```
botocore.errorfactory.InvalidIdentityTokenException: An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements
```

https://github.com/pytorch/torchx/runs/5147711223?check_suite_focus=true

This switches to the `aws-actions/configure-aws-credentials` GitHub action since it includes retries as of aws-actions/configure-aws-credentials#350

Pull Request resolved: #386

Test Plan: CI

Reviewed By: kiukchung

Differential Revision: D34158910

Pulled By: d4l3k

fbshipit-source-id: 6b6b9516b0233ea5b0f05f0fc5e6483829587c82