Add retries to `configureIndex` and `update` operations #318
Conversation
```diff
@@ -198,6 +198,8 @@ export const mapHttpStatusError = (failedRequestInfo: FailedRequestInfo) => {
       return new PineconeInternalServerError(failedRequestInfo);
     case 501:
       return new PineconeNotImplementedError(failedRequestInfo);
+    case 503:
```
Whoops, forgot to map this the first time around.
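For context, the switch in the hunk above can be sketched end-to-end. Only `case 503:` is visible in the hunk, so the `PineconeUnavailableError` name is my guess at what the added lines return, and the simplified classes stand in for the SDK's real error types:

```typescript
// Simplified stand-ins for the SDK's error classes; FailedRequestInfo is
// reduced to just the fields this sketch needs.
interface FailedRequestInfo {
  status: number;
  message?: string;
}

class BaseHttpError extends Error {
  constructor(public readonly info: FailedRequestInfo) {
    super(info.message ?? `HTTP ${info.status}`);
  }
}
class PineconeInternalServerError extends BaseHttpError {}
class PineconeNotImplementedError extends BaseHttpError {}
// Assumed name for the class the new `case 503:` returns.
class PineconeUnavailableError extends BaseHttpError {}

const mapHttpStatusError = (
  failedRequestInfo: FailedRequestInfo
): BaseHttpError => {
  switch (failedRequestInfo.status) {
    case 500:
      return new PineconeInternalServerError(failedRequestInfo);
    case 501:
      return new PineconeNotImplementedError(failedRequestInfo);
    case 503:
      return new PineconeUnavailableError(failedRequestInfo);
    default:
      return new BaseHttpError(failedRequestInfo);
  }
};
```

The point of adding the 503 case is that the retry logic can then recognize "service unavailable" responses as retryable instead of surfacing a generic error.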
```typescript
// Scale up podType to x2
let state = true;
let retryCount = 0;
const maxRetries = 10;
while (state && retryCount < maxRetries) {
  try {
    await pinecone.configureIndex(podIndexName, {
      spec: { pod: { podType: 'p1.x2' } },
    });
    state = false;
  } catch (e) {
    if (e instanceof PineconeInternalServerError) {
      retryCount++;
      await sleep(2000);
    } else {
      console.log('Unexpected error:', e);
      throw e;
    }
  }
}
```
Don't need this now that we've got retries!
```typescript
    expect(callCount).toBe(2);
  });

  test('Update operation should retry 1x if server responds 1x with error and 1x with success', async () => {
```
Figured duplicating this type of test across `upsert` and `update` would be enough to justify me not duplicating it again for `configureIndex`, but lmk if you disagree!
(We should really centralize this type of thing to avoid duplicating this logic, but I think this is okay for now.)
It seems useful to validate that the calls themselves trigger retries as we'd expect, is that what you're talking about centralizing?
Mmm, I'm not 100% sure we're on the same page -- when you say "the calls themselves," are you talking about the different async funcs that we could pass into the `RetryWrapper`?

Assuming you answer yes to the above, yes, that's what I'd like to centralize... something like a single parameterized test that confirms that <whatever async func> is retried `n` times.
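The parameterized-test idea from this thread could be sketched like so. `withRetries` is a stand-in for the SDK's actual `RetryWrapper`, and `countAttempts` is an invented helper that reports how many times an operation failing a given number of times gets invoked:

```typescript
// Stand-in retry wrapper: call `fn` until it succeeds or `maxAttempts`
// calls have been made, then rethrow the last error.
type AsyncFn = () => Promise<void>;

const withRetries = async (fn: AsyncFn, maxAttempts: number): Promise<void> => {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (e) {
      if (attempt >= maxAttempts) throw e;
    }
  }
};

// Simulate an op that throws on its first `failures` calls, then succeeds,
// and report the total number of calls made under the retry wrapper.
const countAttempts = async (
  failures: number,
  maxAttempts: number
): Promise<number> => {
  let calls = 0;
  const op: AsyncFn = async () => {
    calls++;
    if (calls <= failures) throw new Error('simulated 5xx');
  };
  await withRetries(op, maxAttempts).catch(() => undefined);
  return calls;
};
```

A single parameterized Jest test (e.g. via `test.each`) could then assert `await countAttempts(1, 3) === 2` for each async func passed into the wrapper, instead of duplicating the retry test per operation.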
…connections" This reverts commit 6e357c1.
Overall LGTM, it's nice that this is a pretty easy replacement due to how you set things up.
```diff
  // Helper function to start the server with a specific response pattern
  const startMockServer = (shouldSucceedOnSecondCall: boolean) => {
    // Create http server
-   server = http.createServer((req, res) => {
+   server = http.createServer({ keepAlive: false }, (req, res) => {
```
Just curious, what happens when setting this to false?
It simply ensures any outstanding http connections close once whatever you're doing on the spun-up server concludes. It defaults to `true` in Node 20+, which is a new development, and something I thought might be the cause of our failing tests. Unfortunately, setting it to `false` (so that it remains `false` across 18 and 20) didn't fix the problem.
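The `startMockServer` pattern being discussed can be sketched in full. Only the `keepAlive: false` option comes from the diff; the response bodies, the ephemeral port, and the `run` helper are illustrative:

```typescript
// A mock server created with keepAlive: false that answers the first
// request with a 503 and every subsequent request with a 200, which is
// the shape the retry tests rely on.
import * as http from 'http';
import { AddressInfo } from 'net';

let callCount = 0;
const server = http.createServer({ keepAlive: false }, (_req, res) => {
  callCount++;
  const status = callCount === 1 ? 503 : 200;
  res.writeHead(status, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ status }));
});

// Issue one GET (no keep-alive pooling) and resolve with the status code.
const getStatus = (port: number): Promise<number> =>
  new Promise((resolve, reject) => {
    http
      .get({ port, path: '/', agent: false }, (res) => {
        res.resume(); // drain the body so the socket can close
        resolve(res.statusCode ?? 0);
      })
      .on('error', reject);
  });

// Start the server, hit it twice, shut it down, return both statuses.
const run = async (): Promise<number[]> => {
  await new Promise<void>((resolve) => server.listen(0, resolve));
  const { port } = server.address() as AddressInfo;
  const statuses = [await getStatus(port), await getStatus(port)];
  server.close();
  return statuses;
};
```

Calling `run()` should yield `[503, 200]`: the first request fails as the retry tests expect, and the retried request succeeds, with no idle keep-alive sockets left to hold the process open between tests.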
```typescript
  if (error?.status) {
    return mapHttpStatusError(error);
  }
  return error; // Return original error if no mapping is needed
```
Do we know what errors we're seeing that end up without an associated `error.status`?
I can't quite remember off the top of my head, and definitely need to look into this in the future at some point, but basically something about the BasePineconeError class sometimes has `status` in the JSON obj that's printed out in the console (if you print the error), but that shows up as `undefined` when you do `error.status`.
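One way that symptom can arise is when the status lives on a nested field rather than on the error instance itself. This is a hypothetical illustration, not the real `BasePineconeError` internals: the `details` field is invented purely to show why the `error?.status` guard and pass-through branch are needed:

```typescript
// Hypothetical error shape: a status is visible when the error is
// serialized (nested under `details`), while `error.status` itself
// reads as undefined.
class BasePineconeError extends Error {
  status?: number; // never populated in this scenario
  constructor(message: string, public details?: { status?: number }) {
    super(message);
  }
}

const mapOrPassThrough = (error: BasePineconeError): Error => {
  if (error?.status) {
    return new Error(`mapped HTTP ${error.status}`);
  }
  return error; // no usable status: hand back the original error
};

const e = new BasePineconeError('upstream failure', { status: 500 });
console.log(JSON.stringify(e)); // the nested status shows up here...
console.log(e.status); // ...but direct access is undefined
console.log(mapOrPassThrough(e) === e); // true: passed through unchanged
```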
Re: CI/CD failures in Node 20+: I did a lot of research into why the
Basically, I had to add a
I tried a bunch of different ways to add a
Problem

We first shipped retries to `upsert` as a POC. Now that we are happy with that, we are expanding retries to the `configureIndex` and `update` operations.

This PR includes updates to the retry logic itself, too: I found some errors when I applied it to `configureIndex`, yay!

Type of Change
Test Plan
CI passes.