Attempt to measure runtime quicker #573

Draft
marcelm wants to merge 1 commit into main from tsc

Collaborator

@marcelm marcelm commented Mar 20, 2026

This should probably not be merged, but I wanted to report this somewhere.

When profiling strobealign on my laptop, I noticed that measuring elapsed time (calls to std::time::Instant::now() and std::time::Instant::elapsed()) accounted for about 5% of the total runtime. On x86, elapsed time can instead be measured with the RDTSC machine instruction, which reads the TSC (Time Stamp Counter) register and is very fast.
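To get a rough feel for the per-call cost on a given machine, a small standalone program along these lines may help. It is only a sketch, not code from this PR: the helper names `instant_now_cost` and `read_tsc` are made up for illustration, and the numbers it prints depend on the CPU, the kernel and the compiler optimization level.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: average cost of one `Instant::now()` call,
/// estimated over `calls` calls. `calls` must be greater than zero.
fn instant_now_cost(calls: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..calls {
        // black_box keeps the compiler from optimizing the call away.
        black_box(Instant::now());
    }
    start.elapsed() / calls
}

/// Hypothetical helper: read the Time Stamp Counter directly (x86-64 only).
/// The result is a raw tick count, not a duration in nanoseconds.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)]
fn read_tsc() -> u64 {
    unsafe { core::arch::x86_64::_rdtsc() }
}

fn main() {
    let per_call = instant_now_cost(1_000_000);
    println!("Instant::now(): roughly {per_call:?} per call");

    #[cfg(target_arch = "x86_64")]
    {
        let (a, b) = (read_tsc(), read_tsc());
        println!("two consecutive TSC reads: {} ticks apart", b.wrapping_sub(a));
    }
}
```

Building with `--release` gives more meaningful numbers. Converting TSC ticks into wall-clock time requires calibrating against a known clock, which this sketch does not attempt.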

The fastant crate provides a drop-in replacement for std::time::Instant that measures time in that way.

This PR replaces all uses of std::time::Instant with fastant::Instant.

Now the weird part: On my laptop, this made strobealign about 5% faster, but on my desktop PC, I measure no difference at all (both have similar CPUs). Even the profiler output is clearly different.

std::time::Instant::now() is documented to use clock_gettime, which is part of the C library. My hypothesis is that on my desktop PC, clock_gettime actually uses RDTSC and is therefore fast, but that the laptop uses a slower implementation. I’ll need to check this when I have access to the laptop again.
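One way to probe this hypothesis on Linux is to look at the kernel's active clocksource, which is exposed through sysfs. As far as I know, clock_gettime can be served quickly from user space when the clocksource is `tsc`, whereas other clocksources (for example `hpet`) make it fall back to a real system call. The sketch below just reads that file; the helper name `current_clocksource` is made up, and the assumption that the two machines differ in this setting is untested.

```rust
use std::fs;

/// Hypothetical helper: name of the kernel's active clocksource,
/// or None if the sysfs file is not available (e.g. not on Linux).
fn current_clocksource() -> Option<String> {
    let path = "/sys/devices/system/clocksource/clocksource0/current_clocksource";
    fs::read_to_string(path)
        .ok()
        .map(|s| s.trim().to_string())
}

fn main() {
    match current_clocksource() {
        Some(name) => println!("current clocksource: {name}"),
        None => println!("clocksource information not available on this system"),
    }
}
```

If the desktop reports `tsc` and the laptop reports something else, that would be consistent with the hypothesis, though it would not prove it.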

Measuring the current time with std::time::Instant comes with quite some
overhead. On x86, we can use the TSC (Time Stamp Counter) register instead,
which can be read out with a single machine instruction.

This replaces all uses of `std::time::Instant` with `fastant::Instant`
from the [fastant crate](https://crates.io/crates/fastant).