
Commit b207111

Unicode: Force-inline isgraphemebreak!
When this API was added, this function inlined, which is important because the API relies on the allocation of the `Ref` being elided. At some point (I went back to 1.8) this regressed. For example, it is currently responsible for substantially all non-Expr allocations in JuliaParser.

Before (parsing all of Base with JuliaParser):
```
│ Memory estimate: 76.93 MiB, allocs estimate: 719922.
```
After:
```
│ Memory estimate: 53.31 MiB, allocs estimate: 156.
```
Also add a test to make sure this doesn't regress again.
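To illustrate the pattern the message describes, here is a minimal sketch, not the Base implementation. `step!` and `count_breaks` are hypothetical names, and the break rule is a toy stand-in for the real grapheme logic. The caller creates a `Ref{Int32}` and passes it to an `@inline` helper; once the helper is inlined, the compiler can usually keep the state in a register instead of allocating the `Ref` on the heap.

```julia
# Hypothetical stand-in for a stateful break test (toy rule, not Unicode rules).
@inline function step!(state::Ref{Int32}, a::Char, b::Char)
    brk = a != b                                  # toy rule: break when chars differ
    state[] = brk ? Int32(0) : state[] + Int32(1) # reset or advance the state
    return brk
end

# The Ref lives only inside this function; when step! is inlined,
# the allocation can typically be elided.
function count_breaks(s::String)
    state = Ref{Int32}(0)
    n = 0
    prev = '\0'
    for c in s
        n += step!(state, prev, c)
        prev = c
    end
    return n
end

count_breaks("aabbcc")                     # warm up so compilation is not measured
bytes = @allocated count_breaks("aabbcc")  # bytes allocated by the call
println(bytes)
```

Whether the reported byte count is zero depends on the Julia version and on the compiler's inlining decisions, which is why the commit pins the behaviour down with a test.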
1 parent d6b3669 commit b207111

2 files changed: 3 additions, 1 deletion

base/strings/unicode.jl

Lines changed: 1 addition & 1 deletion
@@ -800,7 +800,7 @@ isgraphemebreak(c1::AbstractChar, c2::AbstractChar) =
 # Stateful grapheme break required by Unicode-9 rules: the string
 # must be processed in sequence, with state initialized to Ref{Int32}(0).
 # Requires utf8proc v2.0 or later.
-function isgraphemebreak!(state::Ref{Int32}, c1::AbstractChar, c2::AbstractChar)
+@inline function isgraphemebreak!(state::Ref{Int32}, c1::AbstractChar, c2::AbstractChar)
     if ismalformed(c1) || ismalformed(c2)
         state[] = 0
         return true

stdlib/Unicode/test/runtests.jl

Lines changed: 2 additions & 0 deletions
@@ -284,6 +284,8 @@ end
     @test_throws BoundsError graphemes("äöüx", 2:5)
     @test_throws BoundsError graphemes("äöüx", 5:5)
     @test_throws ArgumentError graphemes("äöüx", 0:1)
+
+    @test @allocated(length(graphemes("äöüx"))) == 0
 end
 
 @testset "#3721, #6939 up-to-date character widths" begin
