-
-
Notifications
You must be signed in to change notification settings - Fork 1k
Refactor engine JIT stage #2806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
yield return GetOverheadNoUnrollIterationData(); | ||
yield return GetDummyIterationData(dummy2Action); | ||
yield return GetWorkloadNoUnrollIterationData(); | ||
yield return GetDummyIterationData(dummy3Action); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndreyAkinshin You added dummy actions in 2017. I don't know what they are for. Do we still need them?
d5e6cd4
to
8efb670
Compare
Honestly, I think you shouldn't use |
Can you elaborate on that? Why would we need more than 1 invocation per tier for throughput benchmarks? 30 invocations is too much for the stage to complete in a timely manner for long-running benchmarks. Also, I tried |
I think the profile will not be representable (a benchmark may invoke the same method from different places and we don't have context-sensitive profiling yet) + we have optimizations like we intentionally make call counting for some methods smaller so their callers are guaranteed to be promoted later (it's for some internal calls so we can bake final addresses of their Tier1 code versions directly instead of having indirect calls), although, I am mostly concerned about PGO quality. |
Thanks, that makes sense. I guess I can remove that env var and just run the jit stage with a timeout, and if it doesn't fully reach tier1, we can allow the pilot/warmup stages to handle it later (#1210). Can you also verify the logic in |
How do you check that? I don't think there is a way to check whether a benchmark and all of its callees are fully warmed up |
We don't. We just run a number of invocations based on the configured values retrieved from |
8a143d6
to
f02de9b
Compare
dotnet/runtime#117787 (comment)
@AndyAyersMS (to not derail that issue), how can we account for OSR in the jit stage here? |
I think for BDN specifically OSR is just some intermediate tier it doesn't have to care about, it shouldn't impact the Tier0->Tier1 promotion velocity. Since the method is too slow, I guess BDN decided not too call it too many times? |
This is purely for the jit stage, where the number of invocations are fixed (in an attempt to push it through all tiers). I'm not sure what the jit thinks is not called enough times. Perhaps because of how the stages work, it only invokes once per iteration, and the jit can't see that the iterations are being ran multiple times? If we called it through the |
That's what I thought, but the evidence shows otherwise. It took 60 invocations to fully reach tier1, instead of 30 (DPGO disabled). |
Did you try profiling the example from dotnet/runtime#117787? If not, I can do it soonish. |
Nope, I don't have much experience to know what to look for. If you're going to do it from this branch, add +2 to |
Don't jit overhead methods if the job is configured to not measure it. Remove extra call counting delay for in-process benchmarks. Set CallCountingDelayMs env var if DisassemblyDiagnoser is not used. Added a test for very long first invocation time.
Fixes #2004
Fixes #1466
Contributes to #2787, #1993, #1780, #1210
EngineFactory
to a properEngineJitStage
.EngineFactory
to a new pilot stage (JIT stage, according to its name, now only focuses on jitting).IEngine
(breaking changes).LegacyJit
.