prevent op-proposer Send tx failure errors & clean up op-proposer logging #6138
Perhaps we should just add a flag so that the blocknumber/blockhash is set to 0 in the output proposals, so that these errors can be completely avoided.
That would certainly work, I just wasn't sure how important the hash matching check is. This is turning out to be more invasive than I would like to clean up what is mostly an aesthetic issue with the logging, though I figure we should at least document somewhere in the code what's going on in order to hopefully spare the next poor soul who might follow the error messages down this same rabbit hole.
This PR is now eliminating the failed proposer tx errors (at least in normal operation) & reclassifies some noisy info logs as Debug. Resulting Info+ logs are much more manageable. I can also add a flag to bypass the blocknum/blockhash and related logic if you think it would be useful.
Thanks for this.
You probably need to rebase to get some of the tests passing again; there have been a few build improvements lately.
The mocks failure can be fixed by running `make generate-mocks` in `op-service`, though you'd need to have the right version of mockery installed (v2.28.1). I can run that and push a commit up for you if it's easier.
Otherwise I think this looks good to me. I'll try to track down someone a bit more familiar with the proposer to double check the log level changes, but the logic makes sense to me. Hopefully a rebase and regenerated mocks will get the tests passing, but if not we can work through whatever's left.
The logging changes and the approach to waiting look good to me.
done
done
ooo, so close. I was hoping the hive tests would start passing too since they really don't provide much info about what's failing... I'll dig into that and see what I can find out.
OK, hive is hitting a timeout as part of waiting for withdrawals (the logs in the workspace download have some useful info). It could be affected by this PR, but it is also using code that isn't calculating the wait period correctly (fixing that in #6193). I can reproduce the failure locally - had to build the images locally and tag them as
It wasn't easy, but I've pulled those withdrawal timing fixes into hive (many yaks were shaved along the way), and combined with increasing the timeout on one test, those hive tests are now passing. So I think it's just that the proposer is now slightly slower to submit proposals, and that, combined with some bad logic for waiting for withdrawals to finalise, was pushing it past the timeout.
Ah, with this change op-proposer is constantly falling behind in the hive tests. It has a submission interval of 6 (so a new output every 12 seconds) and L1 blocks happen every 15 seconds. So the test winds up having to wait for a number of proposals to be published rather than just one, and that takes quite a while. The current withdrawal waiting logic is incorrectly waiting for multiple output proposals, which just makes it worse. We can likely make all the hive tests a lot faster by reducing the L1 block time. So I think we'll need to merge #6193, then ethereum-optimism/hive#84, and then we can merge this one. Thanks for your patience, it's been a bit crazy how many different issues have all converged around this.
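To make the timing argument above concrete, here is a rough back-of-the-envelope sketch in Go. It assumes a 2 second L2 block time (implied by "6 blocks, so every 12 seconds") and, as a simplification that is not taken from the op-proposer code, that at most one proposal can be published per L1 block. The function name `proposalBacklog` is hypothetical.

```go
package main

// Timings taken from the comment above; the 2s L2 block time is inferred
// from "submission interval of 6 (so every 12 seconds)".
const (
	l2BlockTime        = 2  // seconds per L2 block (assumed)
	submissionInterval = 6  // L2 blocks between output proposals
	l1BlockTime        = 15 // seconds per L1 block in the hive setup
)

// proposalBacklog returns how many due proposals are still unpublished after
// elapsed seconds, under the simplifying assumption that at most one
// proposal can be published per L1 block.
func proposalBacklog(elapsed int) int {
	due := elapsed / (l2BlockTime * submissionInterval) // one output due every 12s
	published := elapsed / l1BlockTime                  // at most one per 15s L1 block
	if published > due {
		published = due // cannot publish more than is due
	}
	return due - published
}
```

Under these assumptions the backlog grows by one proposal per minute (5 due, at most 4 published), which is consistent with the test having to wait for several proposals rather than one.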
@roberto-bayardo Pre-reqs for this are now merged, including #6225 which was a new surprise. :) Could you please rebase this? Hopefully everything should then pass.
Codecov Report
@@            Coverage Diff             @@
##           develop    #6138      +/-   ##
===========================================
+ Coverage    44.16%   44.21%   +0.04%
===========================================
  Files          436      436
  Lines        29025    29081      +56
  Branches       709      709
===========================================
+ Hits         12818    12857      +39
- Misses       15126    15145      +19
+ Partials      1081     1079       -2
LGTM. Thanks for your patience on this.
- Proposal transaction errors were commonly being logged due to another esoteric EstimateGas quirk. This PR avoids them by making sure the head of the L1 chain is sufficiently advanced before submitting the output proposal.
- Changes a spammy info log to debug.
- Changes the log emitted on failure to fetch a receipt from warning to info level, since this is typical behavior when txs are replaced with increased fees.
- Adds documentation explaining some of the expected logging outcomes.
- Fixes a function name typo.
Context: it turns out that "EstimateGas" actually defaults to estimating gas at the "Latest" block state instead of the "Pending" block state (the godoc in ethclient is outdated, though there is a PR to update it). This means gas estimation will fail with a tx revert when l1BlockNumber == latest, because the contract calls blockhash(l1BlockNumber), which returns 0 instead of the block hash when asked about the block currently being executed.
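The waiting approach described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `HeadReader`, `safeToEstimate`, and `waitForL1Head` are hypothetical names, and the only thing assumed about go-ethereum is that `ethclient.Client` exposes `BlockNumber(ctx) (uint64, error)`, which would satisfy the interface.

```go
package main

import (
	"context"
	"time"
)

// HeadReader is a hypothetical minimal interface for reading the L1 head.
// go-ethereum's ethclient.Client has a BlockNumber method with this signature.
type HeadReader interface {
	BlockNumber(ctx context.Context) (uint64, error)
}

// safeToEstimate reports whether gas estimation at the "latest" block can see
// a non-zero blockhash(l1BlockNumber). That requires the head to be strictly
// past l1BlockNumber, since blockhash of the block being executed is zero.
func safeToEstimate(latest, l1BlockNumber uint64) bool {
	return latest > l1BlockNumber
}

// waitForL1Head polls the L1 head until it has advanced past l1BlockNumber,
// or until ctx is cancelled. poll must be positive (time.NewTicker panics
// otherwise).
func waitForL1Head(ctx context.Context, r HeadReader, l1BlockNumber uint64, poll time.Duration) error {
	ticker := time.NewTicker(poll)
	defer ticker.Stop()
	for {
		latest, err := r.BlockNumber(ctx)
		if err != nil {
			return err
		}
		if safeToEstimate(latest, l1BlockNumber) {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			// Poll again on the next tick.
		}
	}
}
```

A caller would invoke `waitForL1Head` with the L1 block number referenced by the proposal before sending the transaction, so that estimation against the latest state no longer reverts on the blockhash check.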