Refactor engine JIT stage #2806

timcassell · 2025-07-09T19:23:24Z

Fixes #2004
Fixes #1466
Contributes to #2787, #1993, #1780, #1210

Moved JIT stage from EngineFactory to a proper EngineJitStage.
- JIT stage now attempts to push the benchmarked method through all JIT tiers.
Moved the heuristic from EngineFactory to a new pilot stage (the JIT stage, as its name suggests, now focuses only on jitting).
- Fixed the heuristic to never include the first invocation.
Cleanup around IEngine (breaking changes).
Improved check for LegacyJit.

timcassell · 2025-07-09T19:24:27Z

src/BenchmarkDotNet/Engines/EngineJitStage.cs

+            yield return GetOverheadNoUnrollIterationData();
+            yield return GetDummyIterationData(dummy2Action);
+            yield return GetWorkloadNoUnrollIterationData();
+            yield return GetDummyIterationData(dummy3Action);


@AndreyAkinshin You added dummy actions in 2017. I don't know what they are for. Do we still need them?

timcassell · 2025-07-11T14:58:42Z

cc @AndyAyersMS @EgorBo

EgorBo · 2025-07-13T20:04:03Z

> JIT stage now attempts to push the benchmarked method through all JIT tiers.
> Set environment variable for the runtime to enable aggressive tiering by default.

Honestly, I think you shouldn't use TC_AggressiveTiering; promoting to Tier1 after just 1 iteration is mostly for internal testing. I think CallCountingDelayMs=0 should be enough.

timcassell · 2025-07-13T20:09:59Z

> Honestly, I think you shouldn't use TC_AggressiveTiering; promoting to Tier1 after just 1 iteration is mostly for internal testing. I think CallCountingDelayMs=0 should be enough.

Can you elaborate on that? Why would we need more than 1 invocation per tier for throughput benchmarks? 30 invocations is too much for the stage to complete in a timely manner for long-running benchmarks.

Also, I tried CallCountingDelayMs=0, but it breaks the disassembler (dotnet/runtime#117339).

EgorBo · 2025-07-13T20:17:30Z

> Can you elaborate on that?

I think the profile will not be representative (a benchmark may invoke the same method from different places, and we don't have context-sensitive profiling yet). We also have optimizations where we intentionally make the call-counting threshold smaller for some methods so that their callers are guaranteed to be promoted later (this is for some internal calls, so we can bake the final addresses of their Tier1 code versions directly instead of using indirect calls). Mostly, though, I am concerned about PGO quality.

timcassell · 2025-07-13T20:28:58Z

Thanks, that makes sense. I guess I can remove that env var and just run the jit stage with a timeout, and if it doesn't fully reach tier1, we can allow the pilot/warmup stages to handle it later (#1210).

Can you also verify the logic in JitInfo.cs?
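
The "run the jit stage with a timeout" idea can be illustrated with a minimal sketch. This is not the code from this PR; `JitStageBudgetSketch` and `InvokeWithBudget` are hypothetical names, and the policy of always invoking at least once is an assumption made so that the method is jitted even with a tiny budget.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of BenchmarkDotNet.
public static class JitStageBudgetSketch
{
    // Invokes the workload until the target count is reached or the time budget runs out.
    // Always performs at least one invocation so the method gets jitted.
    // Returns the number of invocations actually completed.
    public static int InvokeWithBudget(Action workload, int targetInvocations, TimeSpan budget)
    {
        if (workload == null) throw new ArgumentNullException(nameof(workload));
        if (targetInvocations < 1) throw new ArgumentOutOfRangeException(nameof(targetInvocations));

        var stopwatch = Stopwatch.StartNew();
        int completed = 0;
        do
        {
            workload();
            completed++;
        }
        while (completed < targetInvocations && stopwatch.Elapsed < budget);

        return completed;
    }
}
```

If the returned count is lower than the target, the caller knows the stage stopped early and can leave any remaining tier promotion to the later pilot/warmup stages.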

EgorBo · 2025-07-13T20:42:28Z

> Thanks, that makes sense. I guess I can remove that env var and just run the jit stage with a timeout, and if it doesn't fully reach tier1

How do you check that? I don't think there is a way to check whether a benchmark and all of its callees are fully warmed up

timcassell · 2025-07-13T20:46:02Z

> > Thanks, that makes sense. I guess I can remove that env var and just run the jit stage with a timeout, and if it doesn't fully reach tier1
>
> How do you check that? I don't think there is a way to check whether a benchmark and all of its callees are fully warmed up

We don't. We just run a number of invocations based on the configured values retrieved from JitInfo and hope for the best. The pilot/warmup stages will have to work with some sort of heuristic to try to determine if tiering caused the measured time to significantly drop.
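
As a rough illustration of "a number of invocations based on the configured values", the sketch below multiplies a call-count threshold by the number of tier promotions to get through. The names `JitInvocationPlanSketch` and `GetInvocationCount` are hypothetical, the formula is an assumption rather than the logic in JitInfo.cs, and the default of 30 is the per-tier figure mentioned in this thread. It also ignores the call counting delay and background compilation time.

```csharp
using System;

// Hypothetical helper, not the JitInfo.cs logic from this PR.
public static class JitInvocationPlanSketch
{
    // Per-tier call count discussed in this thread (assumed default).
    public const int DefaultCallCountThreshold = 30;

    // Estimates how many invocations to run so that a method called
    // `callCountThreshold` times per promotion passes through `promotions` promotions.
    // Optionally adds one final invocation to exercise the last tier's code.
    public static int GetInvocationCount(int callCountThreshold, int promotions, bool extraFinalInvocation = true)
    {
        if (callCountThreshold < 1) throw new ArgumentOutOfRangeException(nameof(callCountThreshold));
        if (promotions < 0) throw new ArgumentOutOfRangeException(nameof(promotions));

        int count = checked(callCountThreshold * promotions);
        return extraFinalInvocation ? checked(count + 1) : count;
    }
}
```

For example, `GetInvocationCount(JitInvocationPlanSketch.DefaultCallCountThreshold, 1)` yields 31: thirty counted calls plus one final invocation.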

timcassell · 2025-07-17T22:17:05Z

dotnet/runtime#117787 (comment)

> The "third tier" you see may be OSR, since your method loops a lot and isn't called often.

@AndyAyersMS (to not derail that issue), how can we account for OSR in the jit stage here?

EgorBo · 2025-07-17T22:35:33Z

> dotnet/runtime#117787 (comment)
>
> > The "third tier" you see may be OSR, since your method loops a lot and isn't called often.
>
> @AndyAyersMS (to not derail that issue), how can we account for OSR in the jit stage here?

I think for BDN specifically OSR is just some intermediate tier it doesn't have to care about; it shouldn't impact the Tier0->Tier1 promotion velocity. Since the method is too slow, I guess BDN decided not to call it too many times?

timcassell · 2025-07-17T22:39:19Z

> Since the method is too slow, I guess BDN decided not to call it too many times?

This is purely for the jit stage, where the number of invocations is fixed (in an attempt to push it through all tiers). I'm not sure why the jit thinks it is not called enough times. Perhaps because of how the stages work, it only invokes once per iteration, and the jit can't see that the iterations are being run multiple times? If we called it through the WorkloadUnroll method (with unrollFactor = 16), would the jit skip the OSR?

timcassell · 2025-07-17T22:43:13Z

> I think for BDN specifically OSR is just some intermediate tier it doesn't have to care about; it shouldn't impact the Tier0->Tier1 promotion velocity.

That's what I thought, but the evidence shows otherwise. It took 60 invocations to fully reach tier1, instead of 30 (DPGO disabled).

AndyAyersMS · 2025-07-17T23:05:01Z

Did you try profiling the example from dotnet/runtime#117787? If not, I can do it soonish.

timcassell · 2025-07-17T23:09:47Z

> Did you try profiling the example from dotnet/runtime#117787? If not, I can do it soonish.

Nope, I don't have enough experience to know what to look for. If you're going to do it from this branch, add +2 to remainingTiers in the jit stage to see the results of all tiers. Appreciate it.

Don't jit overhead methods if the job is configured to not measure it. Remove extra call counting delay for in-process benchmarks. Set CallCountingDelayMs env var if DisassemblyDiagnoser is not used. Added a test for very long first invocation time.
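
The rule "set the CallCountingDelayMs env var if DisassemblyDiagnoser is not used" could look roughly like the sketch below. The class and method names are hypothetical and this is not the code from the commit; `DOTNET_TC_CallCountingDelayMs` is assumed to be the environment-variable spelling of the runtime's `TC_CallCountingDelayMs` setting discussed above.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of BenchmarkDotNet.
public static class TieringEnvironmentSketch
{
    // Assumed environment-variable name for the runtime's call counting delay setting.
    public const string CallCountingDelayVariable = "DOTNET_TC_CallCountingDelayMs";

    // Returns the environment variables to apply to the benchmark process.
    // The delay is left at its default when the disassembler is in use,
    // because a zero delay was reported above to break it (dotnet/runtime#117339).
    public static IReadOnlyDictionary<string, string> GetVariables(bool usesDisassemblyDiagnoser)
    {
        var variables = new Dictionary<string, string>();
        if (!usesDisassemblyDiagnoser)
        {
            variables[CallCountingDelayVariable] = "0";
        }
        return variables;
    }
}
```

The returned pairs could then be copied into `ProcessStartInfo.Environment` before the child benchmark process is started.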

timcassell added the Area:Engine label Jul 9, 2025


timcassell added the breaking change label Jul 9, 2025

timcassell force-pushed the jit-stage branch 4 times, most recently from d5e6cd4 to 8efb670 Compare July 10, 2025 18:45

timcassell mentioned this pull request Jul 10, 2025

Improve memory diagnoser accuracy #2562

Merged

timcassell force-pushed the jit-stage branch from a5c1dc0 to fee6992 Compare July 13, 2025 19:05

Refactored engine JIT stage.

cf37148

timcassell force-pushed the jit-stage branch from fee6992 to cf37148 Compare July 13, 2025 19:07

timcassell force-pushed the jit-stage branch 2 times, most recently from 8a143d6 to f02de9b Compare July 13, 2025 21:42

PR feedback.

52cabdd

timcassell force-pushed the jit-stage branch from f02de9b to 52cabdd Compare July 13, 2025 21:43

IsRyuJit field instead of property.

ca3d7b9

timcassell force-pushed the jit-stage branch from 9ade172 to ca3d7b9 Compare July 13, 2025 22:45

timcassell requested a review from AndreyAkinshin July 13, 2025 22:54

Fix call counts.

778a1a7

timcassell mentioned this pull request Jul 17, 2025

Random.Next Tier1 slower than Tier0 dotnet/runtime#117787

Open

Added an extra invocation to the end of jit stage.

1251e63

timcassell force-pushed the jit-stage branch from c41913e to 54424b0 Compare July 20, 2025 00:22

timcassell force-pushed the jit-stage branch from 54424b0 to a9458f1 Compare July 20, 2025 00:24

Fixed GetMaxMeasurementCount.

e9ad5b8

timcassell force-pushed the jit-stage branch from 76798ab to e9ad5b8 Compare July 20, 2025 00:34

timcassell added 3 commits July 20, 2025 02:03

Run jit stage iterations in batches if there is enough time.

aaf32bb

Fixed EnginePilotStageInitial.CorrectValues.

42b0838

Fix batch calculation based on single invocation.

e30aa78
