Commit f204148

Keno authored and KristofferC committed
Unicode: Force-inline isgraphemebreak! (#58674)
When this API was added, this function inlined, which is important, because the API relies on the allocation of the `Ref` being elided. At some point (I went back to 1.8) this regressed. For example, it is currently responsible for substantially all non-Expr allocations in JuliaParser.

Before (parsing all of Base with JuliaParser):
```
│ Memory estimate: 76.93 MiB, allocs estimate: 719922.
```
After:
```
│ Memory estimate: 53.31 MiB, allocs estimate: 156.
```
Also add a test to make sure this doesn't regress again.

(cherry picked from commit d6294ba)
1 parent 371f633 commit f204148
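
To illustrate why the `Ref` matters, here is a minimal sketch of how a caller might thread the state through `isgraphemebreak!`. The helper name `count_graphemes` is hypothetical and is not part of this commit. `isgraphemebreak!` is an unexported function in `Base.Unicode`, so it is imported explicitly. Whether the `Ref{Int32}` allocation is actually elided in this sketch depends on the compiler inlining the call, which is what the `@inline` annotation is meant to ensure.

```julia
using Base.Unicode: isgraphemebreak!

# Hypothetical helper: count grapheme clusters by passing one shared
# state through every adjacent pair of characters, in order.
function count_graphemes(s::AbstractString)
    isempty(s) && return 0
    state = Ref{Int32}(0)      # state must start at 0, per the comment in unicode.jl
    n = 1                      # a non-empty string has at least one grapheme
    prev = first(s)
    for c in Iterators.drop(s, 1)
        # isgraphemebreak! returns a Bool and updates `state` in place
        n += isgraphemebreak!(state, prev, c)
        prev = c
    end
    return n
end

println(count_graphemes("äöüx"))
```

If `isgraphemebreak!` is not inlined into a loop like this, the `Ref` generally has to live on the heap, which matches the per-call allocations described in the commit message.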

2 files changed, +3 -1 lines changed

base/strings/unicode.jl

Lines changed: 1 addition & 1 deletion
@@ -725,7 +725,7 @@ isgraphemebreak(c1::AbstractChar, c2::AbstractChar) =
 # Stateful grapheme break required by Unicode-9 rules: the string
 # must be processed in sequence, with state initialized to Ref{Int32}(0).
 # Requires utf8proc v2.0 or later.
-function isgraphemebreak!(state::Ref{Int32}, c1::AbstractChar, c2::AbstractChar)
+@inline function isgraphemebreak!(state::Ref{Int32}, c1::AbstractChar, c2::AbstractChar)
     if ismalformed(c1) || ismalformed(c2)
         state[] = 0
         return true

stdlib/Unicode/test/runtests.jl

Lines changed: 2 additions & 0 deletions
@@ -284,6 +284,8 @@ end
     @test_throws BoundsError graphemes("äöüx", 2:5)
     @test_throws BoundsError graphemes("äöüx", 5:5)
     @test_throws ArgumentError graphemes("äöüx", 0:1)
+
+    @test @allocated(length(graphemes("äöüx"))) == 0
 end

 @testset "#3721, #6939 up-to-date character widths" begin
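
The added regression test can be tried outside the test suite with a sketch like the following. The helper name `grapheme_length_allocs` is hypothetical. The warm-up call is a common precaution so that compilation is less likely to be counted in the measurement; it is not something the commit itself does. On a build that includes this change the reported value should be zero, while older builds may report a nonzero number.

```julia
using Unicode: graphemes

# Hypothetical helper: run once to compile, then measure the bytes
# allocated by a second, already-compiled run.
function grapheme_length_allocs(s::AbstractString)
    length(graphemes(s))                     # warm-up call
    return @allocated length(graphemes(s))   # bytes allocated by this call
end

bytes = grapheme_length_allocs("äöüx")
println("allocated bytes: ", bytes)
```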
