@DaniPopes DaniPopes commented May 24, 2025

Currently, calling {Criterion,BenchmarkGroup}::bench_function instantiates a separate copy of BenchmarkGroup::run_bench, including all the Routine trait methods, for every unique bench closure. This duplication is unnecessary and has no effect on benchmark performance, because the closure in question is the one taking &mut Bencher, not the routine that is actually timed.

This PR roughly halves the number of LLVM IR lines generated by the Rust compiler, measured on recmo/uint's benchmarking suite with cargo llvm-lines -p ruint --bench bench_uint. The suite has around 214 unique calls to criterion.bench_function. Compilation of a release build of the benchmark drops from ~15s to ~12s on my machine; the gain is likely larger in more resource-constrained environments such as CI, or when using fewer codegen units.

Before:

  Lines                 Copies               Function name
  -----                 ------               -------------
  911857                26592                (TOTAL)

After:

  Lines                 Copies               Function name
  -----                 ------               -------------
  413095                15305                (TOTAL)
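
For illustration, here is a minimal sketch of the general technique for avoiding this kind of per-closure duplication: keep the generic entry point as a thin wrapper and move the heavy body into a non-generic function that takes the closure as a trait object. The Group and Bencher types below are hypothetical stand-ins rather than Criterion's real types, and the sketch is not necessarily the exact change made in this PR.

```rust
// Hypothetical stand-ins for illustration; these are NOT Criterion's real types.
pub struct Bencher {
    iterations: u64,
}

impl Bencher {
    /// Runs the timed routine a fixed number of times.
    /// This stays generic on purpose: it is the code that is actually measured.
    pub fn iter<R>(&mut self, mut routine: impl FnMut() -> R) {
        for _ in 0..self.iterations {
            std::hint::black_box(routine());
        }
    }
}

pub struct Group {
    /// Names of the benchmarks that have been run, in order.
    pub ran: Vec<String>,
}

impl Group {
    /// Generic entry point. Only this thin wrapper is instantiated
    /// once per unique closure type `F`.
    pub fn bench_function<F>(&mut self, name: &str, mut f: F) -> &mut Self
    where
        F: FnMut(&mut Bencher),
    {
        // `&mut F` coerces to `&mut dyn FnMut(&mut Bencher)` here.
        self.run_bench(name, &mut f);
        self
    }

    /// Non-generic body. It is compiled once, no matter how many
    /// different closures are passed to `bench_function`.
    fn run_bench(&mut self, name: &str, f: &mut dyn FnMut(&mut Bencher)) {
        let mut bencher = Bencher { iterations: 3 };
        f(&mut bencher);
        self.ran.push(name.to_string());
    }
}

fn main() {
    let mut group = Group { ran: Vec::new() };
    let mut calls = 0u32;
    group
        .bench_function("add", |b| b.iter(|| 1u64 + 2))
        .bench_function("count", |b| b.iter(|| calls += 1));
    assert_eq!(calls, 3);
    assert_eq!(group.ran.len(), 2);
}
```

The dynamic dispatch happens once per call to the outer closure, outside the timed loop inside iter, which is why it should not affect the measurements.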