Investigation: Optimizer/executor/traces are behaving strangely #669

Closed
@Fidget-Spinner

Description

I decided to benchmark nbody using pyperf on my computer.

With the optimizer (optimize_uops) turned on in CPython main + JIT + Guido's exponential backoff and GHCCC, nbody is 3-10% slower than with it off.

However, with the same settings and the trace length reduced from 800 to 200, the optimizer on is 2% faster than off.
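
For reference, here is a minimal sketch of the kind of pyperf measurement involved. The real numbers came from the nbody benchmark run on the two builds; the workload below is just a stand-in, and comparing the two configurations with `pyperf compare_to` afterwards is an assumption about the workflow rather than a description of the exact commands used.

```python
# Minimal pyperf sketch (stand-in workload, not the actual nbody benchmark).
import pyperf

def workload():
    # Placeholder for the nbody inner loop.
    total = 0.0
    for i in range(10_000):
        total += i * 0.5
    return total

runner = pyperf.Runner()
runner.bench_func("nbody-stand-in", workload)

# Run once per build (optimizer on / off), saving results with -o on.json / -o off.json,
# then compare: python -m pyperf compare_to off.json on.json
```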

My hunch is that the trace lengths are too long to be worth optimizing right now. What I think is happening is that we are falling off the trace way too early. Say, for a trace of 800, the optimizer abstract-interprets all 800 bytecode instructions, whereas in reality maybe only 40 of those instructions are executed before we fall off in most cases. In that example, only 40 of the 800 instructions were worth optimizing.
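
As a back-of-the-envelope illustration of that hunch, the sketch below uses the 800/40/200 numbers from above; the linear cost model (optimizer effort proportional to projected trace length) is an assumption for illustration, not a measurement.

```python
# Toy model: the optimizer pays for every instruction in the projected trace,
# but only the instructions executed before falling off the trace benefit.
def useful_fraction(trace_length, executed_before_exit):
    return min(executed_before_exit, trace_length) / trace_length

print(useful_fraction(800, 40))  # 0.05 -> ~95% of the optimization work is wasted
print(useful_fraction(200, 40))  # 0.20 -> a shorter trace wastes far less work
```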

We should seriously reconsider reducing the trace length.

In other news, I tried encouraging inlining of all the optimizer functions by placing them in the same compilation unit. There was no speedup with LTO, which suggests LTO is already doing a good job.
