Skip to content

Branch on Bool alpha in scaling mul! #1286

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 20, 2025
Merged

Conversation

jishnub
Copy link
Member

@jishnub jishnub commented Apr 14, 2025

By dealing with the alpha == 0 case separately, we ensure that if alpha::Bool, it must be true. This reduces the branches in _lscale_add from 4 to 2 in the common case of 3-argument mul!. This leads to a latency reduction, as each branch has to compile a different broadcast expression, and we currently compile four but use only one. Primarily, this PR leads to a reduction in allocations.

julia> using LinearAlgebra

julia> v = 1:4; w = similar(v);

julia> @time mul!(w, 1, v);
  0.171120 seconds (1.04 M allocations: 52.799 MiB, 99.98% compilation time) # nightly
  0.163178 seconds (702.63 k allocations: 35.533 MiB, 99.98% compilation time) # this PR

Something similar usually doesn't lead to a big gain in the _rscale_add method, as s * alpha often has the same type as s, and therefore the branches on alpha compile the same code.

@jishnub jishnub force-pushed the jishnub/mul_lscale_alphabool branch from 4153569 to 613935b Compare April 14, 2025 17:36
Copy link

codecov bot commented Apr 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.11%. Comparing base (830ea2f) to head (b937fae).
Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1286      +/-   ##
==========================================
+ Coverage   92.03%   92.11%   +0.08%     
==========================================
  Files          34       34              
  Lines       15499    15507       +8     
==========================================
+ Hits        14265    14285      +20     
+ Misses       1234     1222      -12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jishnub jishnub added the ttfx The change pertains to first-call latency label Apr 19, 2025
@jishnub jishnub merged commit 07725da into master Apr 20, 2025
4 checks passed
@jishnub jishnub deleted the jishnub/mul_lscale_alphabool branch April 20, 2025 06:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ttfx The change pertains to first-call latency
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants