
Commit 40f6d6b

Keno authored and KristofferC committed
Unicode: Force-inline isgraphemebreak! (#58674)
When this API was added, this function inlined, which is important, because the API relies on the allocation of the `Ref` being elided. At some point (I went back to 1.8) this regressed. For example, it is currently responsible for substantially all non-Expr allocations in JuliaParser.

Before (parsing all of Base with JuliaParser):
```
│ Memory estimate: 76.93 MiB, allocs estimate: 719922.
```
After:
```
│ Memory estimate: 53.31 MiB, allocs estimate: 156.
```
Also add a test to make sure this doesn't regress again.

(cherry picked from commit d6294ba)
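The dependence on inlining described above can be illustrated with a minimal sketch. It is not code from this commit: `bump_noinline!`, `bump_inline!`, and the two `count_*` callers are hypothetical names chosen for illustration. The assumption is that a `Ref` passed to a call the compiler does not inline escapes the caller and is therefore heap-allocated, whereas an inlined callee gives the optimizer a chance to elide it. The actual numbers depend on the Julia version, so none are shown.

```julia
# Hypothetical out-parameter helpers; they differ only in the inlining hint.
@noinline function bump_noinline!(state::Ref{Int}, x::Int)
    state[] += x
    return state[] > 0
end

@inline function bump_inline!(state::Ref{Int}, x::Int)
    state[] += x
    return state[] > 0
end

# The Ref is passed to a call that is not inlined, so it escapes this function.
function count_noinline(xs)
    state = Ref(0)
    n = 0
    for x in xs
        n += bump_noinline!(state, x)
    end
    return n
end

# Same loop; once the callee is inlined, the Ref never leaves this function.
function count_inline(xs)
    state = Ref(0)
    n = 0
    for x in xs
        n += bump_inline!(state, x)
    end
    return n
end

xs = collect(1:1000)
count_noinline(xs); count_inline(xs)   # warm up so compilation is not measured
println("noinline: ", @allocated(count_noinline(xs)), " bytes")
println("inline:   ", @allocated(count_inline(xs)), " bytes")
```

Both callers return the same count; only the allocation behaviour is expected to differ.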
1 parent b180986 commit 40f6d6b

2 files changed: +3 -1 lines changed

base/strings/unicode.jl

Lines changed: 1 addition & 1 deletion
@@ -776,7 +776,7 @@ isgraphemebreak(c1::AbstractChar, c2::AbstractChar) =
 # Stateful grapheme break required by Unicode-9 rules: the string
 # must be processed in sequence, with state initialized to Ref{Int32}(0).
 # Requires utf8proc v2.0 or later.
-function isgraphemebreak!(state::Ref{Int32}, c1::AbstractChar, c2::AbstractChar)
+@inline function isgraphemebreak!(state::Ref{Int32}, c1::AbstractChar, c2::AbstractChar)
     if ismalformed(c1) || ismalformed(c2)
         state[] = 0
         return true
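
For readers unfamiliar with the stateful API touched by this hunk, here is a rough sketch of how a caller might drive it. `manual_grapheme_count` is a hypothetical helper, not part of Base, and `Base.Unicode.isgraphemebreak!` is an internal, unexported function, so this usage is an assumption based on the signature and comment shown in the diff: the state starts at `Ref{Int32}(0)` and the string is processed in sequence.

```julia
# Hypothetical helper: count grapheme clusters with the stateful break test.
function manual_grapheme_count(s::AbstractString)
    isempty(s) && return 0
    state = Ref{Int32}(0)          # initial state, as required by the comment in the diff
    n = 1                          # the first character always starts a cluster
    prev = first(s)
    for c in Iterators.drop(s, 1)
        # A break between prev and c means c starts a new cluster.
        n += Base.Unicode.isgraphemebreak!(state, prev, c)
        prev = c
    end
    return n
end

println(manual_grapheme_count("äöüx"))
```

Because `state` is created in the caller and mutated on every step, the callee has to be inlined for the compiler to have a chance of keeping the `Ref` off the heap.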

stdlib/Unicode/test/runtests.jl

Lines changed: 2 additions & 0 deletions
@@ -284,6 +284,8 @@ end
     @test_throws BoundsError graphemes("äöüx", 2:5)
     @test_throws BoundsError graphemes("äöüx", 5:5)
     @test_throws ArgumentError graphemes("äöüx", 0:1)
+
+    @test @allocated(length(graphemes("äöüx"))) == 0
 end
 
 @testset "#3721, #6939 up-to-date character widths" begin
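
The allocation check added in this hunk can be reproduced outside the test suite with a sketch like the following. `grapheme_count` is a hypothetical wrapper introduced here so that the measurement runs inside a compiled function, and the warm-up call is an assumption about how to keep compilation out of the measurement. The reported value depends on whether the Julia build in use contains this change, so no expected output is given.

```julia
using Unicode

# Hypothetical wrapper around the expression used in the new test.
grapheme_count(s::AbstractString) = length(graphemes(s))

grapheme_count("äöüx")                         # first call compiles; not measured
allocated = @allocated grapheme_count("äöüx")  # bytes allocated by the second call
println("bytes allocated: ", allocated)
```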
