Skip to content

String::push reallocates too aggressively in corner cases #26124

Closed
@pmarcelll

Description

@pmarcelll

There was a post recently on /r/rust about microbenchmarks in different programming languages (https://github.com/kostya/benchmarks), and I was trying to understand how the Rust implementation works.
I noticed that in the Base64 benchmark the author used this code to construct a srting with str_size copies of a character:

let str_size = 10000000;
let mut str: String = "".to_string();
for _ in 0..str_size { str.push_str("a"); }

I thought that push_str is a bit inefficient for a single character, so I looked at the source code. I started playing with String::push and found a trick in the implemetation for ASCII characters (relevant PR: #20079).

The actual problem:
If the character is 1 byte long in UTF-8, it is treated as an ASCII character (as in C), but if it's bigger than 1 byte, the algorythm always reserves 4 bytes in the vector even if the pushed character's length is only 2 or 3 bytes. So even if the vector has just enough space for one of these smaller characters, the vector reallocates its storage (and doubles the size of the buffer). As a result, the string now has an unnecessarily doubled capacity.
I know it doesn't seem like a big problem, but if someone uses push somehow with non-English text, it might become a significant memory overhead. The issue happens if the length of the string becomes a power of 2, so strings with smaller length are more affected.

A possible solution:

// This is a modified version of the original
pub fn push(&mut self, ch: char) {
    match ch.len_utf8() {
        1 => self.vec.push(ch as u8),
        ch_len => {
            let cur_len = self.len();

            self.vec.reserve(ch_len);

            unsafe {
                // Attempt to not use an intermediate buffer by just pushing bytes
                // directly onto this string.
                let slice = slice::from_raw_parts_mut (
                    self.vec.as_mut_ptr().offset(cur_len as isize),
                    ch_len
                );
                let used = ch.encode_utf8(slice).unwrap_or(0);
                self.vec.set_len(cur_len + used);
            }
        }
    }
}

I benchmarked this version, therer was no measurable difference.
There's one problem though: char::len_utf8 is not marked with #[inline], so without LTO, it's currently slower.

Metadata

Metadata

Assignees

No one assigned

    Labels

    I-slowIssue: Problems and improvements with respect to performance of generated code.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions