Add NearestFrom for faster fast_blur #2868

Open
RunDevelopment wants to merge 4 commits into image-rs:main from RunDevelopment:nearest-from
@RunDevelopment (Member) commented Mar 16, 2026

After talking about it here, I removed the f32::round in the hot path of fast_blur.

I did this by adding a new private trait: NearestFrom. This has similar semantics to NumCast::from, but it will round to nearest for float->int conversions and saturate on numeric bounds instead of returning Option. This makes it a natural fit for performance-sensitive code that needs to convert f32 to subpixels.
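A minimal sketch of what such a trait could look like follows. The trait name comes from this PR, but the method name `nearest_from`, the `+ 0.5` rounding trick, and the helper `quantize_row` are assumptions for illustration, not the code in the diff. It relies on Rust's float-to-int `as` casts, which truncate toward zero and saturate at the target type's bounds (NaN becomes 0).

```rust
/// Hypothetical sketch: round-to-nearest, saturating numeric conversion.
/// The real trait in the PR may have a different shape.
trait NearestFrom<T> {
    fn nearest_from(value: T) -> Self;
}

impl NearestFrom<f32> for u8 {
    #[inline]
    fn nearest_from(value: f32) -> Self {
        // Adding 0.5 and truncating rounds non-negative values to nearest.
        // The `as` cast saturates: negatives -> 0, values above 255 -> 255,
        // NaN -> 0. This can differ from `f32::round` on rare inputs that
        // sit just below a .5 boundary.
        (value + 0.5) as u8
    }
}

impl NearestFrom<f32> for u16 {
    #[inline]
    fn nearest_from(value: f32) -> Self {
        (value + 0.5) as u16
    }
}

impl NearestFrom<f32> for f32 {
    #[inline]
    fn nearest_from(value: f32) -> Self {
        // Float subpixels need no rounding or clamping.
        value
    }
}

/// Hypothetical helper: convert a row of accumulated f32 values to u8 subpixels.
fn quantize_row(src: &[f32]) -> Vec<u8> {
    src.iter().map(|&v| u8::nearest_from(v)).collect()
}
```

Unlike `NumCast::from`, nothing here returns an `Option`, so the hot loop has no branch on a failed conversion.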

For now, I only used the trait in fast_blur, but other operations can use it as well for both correctness and performance. I specifically designed the trait to have uses beyond fast_blur.

This PR does make fast_blur significantly faster, even for the u8 case. Here are the benchmark results from my machine (Intel i7-8700K):

| bench | before | after | change |
| --- | --- | --- | --- |
| fast blur: sigma 3.0 | 125.60 ms | 17.616 ms | -85.975% |
| fast blur: sigma 7.0 | 112.48 ms | 17.462 ms | -84.475% |
| fast blur: sigma 50.0 | 108.44 ms | 18.044 ms | -83.360% |

That's roughly 6 to 7 times faster.
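For reference, the change column and the speedup factor can be derived from the before/after timings as in the sketch below. The function names `percent_change` and `speedup` are hypothetical helpers, not part of the benchmark harness.

```rust
/// Relative change in percent; negative means the new time is lower.
fn percent_change(before_ms: f64, after_ms: f64) -> f64 {
    (after_ms - before_ms) / before_ms * 100.0
}

/// How many times faster the new timing is compared to the old one.
fn speedup(before_ms: f64, after_ms: f64) -> f64 {
    before_ms / after_ms
}
```

With the timings above, `speedup` gives a little over 7 for sigma 3.0 and a little over 6 for sigma 50.0.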


Note that this does not compete with #2846 and its fixed-point implementation for u8. On my machine, #2846 reaches 8 ms on the same benchmark.


I also want to mention that some implementations of NearestFrom still leave performance on the table. See this comment for example. I just optimized the important primitives for now. Everything else can follow later.

@fintelia (Contributor)

We may want to spin everything related to PrimitiveSealed into a separate source file. I suspect we're going to end up with a bunch

@RunDevelopment (Member, Author)

Depends. For example, I also want to add a trait to make RGB->Luma conversions faster. This would also need to be a supertrait of PrimitiveSealed, but I'd like to keep its definition and implementation right next to the (internal) code that uses it for local reasoning.

So I'm not too sure that the code in traits.rs is going to grow a lot, but I'm also not against moving things into a separate source file if it does.
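The supertrait arrangement discussed here can be sketched roughly as follows. This is a simplified illustration of the sealed-trait pattern, assuming a private module named `sealed` and a generic helper `quantize`. The actual definitions of `Primitive` and `PrimitiveSealed` in the crate are not shown in this conversation and may differ.

```rust
mod sealed {
    /// Hypothetical conversion trait, as sketched for this PR.
    pub trait NearestFrom<T> {
        fn nearest_from(value: T) -> Self;
    }

    /// Sealed supertrait: it lives in a private module, so downstream
    /// crates cannot name it and therefore cannot implement `Primitive`.
    /// Every internal capability is listed here as a supertrait.
    pub trait PrimitiveSealed: Sized + NearestFrom<f32> {}

    impl NearestFrom<f32> for u8 {
        fn nearest_from(value: f32) -> Self {
            // Saturating cast after a +0.5 offset (assumed rounding strategy).
            (value + 0.5) as u8
        }
    }
    impl NearestFrom<f32> for f32 {
        fn nearest_from(value: f32) -> Self {
            value
        }
    }

    impl PrimitiveSealed for u8 {}
    impl PrimitiveSealed for f32 {}
}

/// Public trait; the sealed bound gives generic code access to the
/// internal conversions without exposing them to users.
pub trait Primitive: sealed::PrimitiveSealed + Copy {}
impl Primitive for u8 {}
impl Primitive for f32 {}

/// Generic internal code can rely on the supertrait for any `Primitive`.
pub fn quantize<P: Primitive>(value: f32) -> P {
    <P as sealed::NearestFrom<f32>>::nearest_from(value)
}
```

Each new internal trait adds one bound to `PrimitiveSealed`, while its definition and impls can stay in whichever module uses them.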

@RunDevelopment (Member, Author)

> We may want to spin everything related to PrimitiveSealed into a separate source file. I suspect we're going to end up with a bunch

After thinking about it a bit more, I implemented your suggestion. I don't see it growing a lot now, but there's also no harm in doing so.
