[rust-guard] Rust Guard: Eliminate String allocations in policy_scope_token and integrity_rank_normalized hot paths #5739

@github-actions

Improvement 1: policy_scope_token + normalize_scope → Cow<'_, str>

Category: Performance
File(s): guards/github-guard/rust-guard/src/labels/helpers.rs
Effort: Medium (15–30 min)
Risk: Low

Problem

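policy_scope_token (line 194) allocates whenever at least one non-empty label exists: a Vec<&str> to collect the labels, then a String from labels.join(" | "). This holds even in the overwhelmingly common single-scope case, where the join simply clones the one scope_label: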

pub(crate) fn policy_scope_token(scopes: &[PolicyScopeEntry]) -> String {
    let labels: Vec<&str> = scopes.iter()
        .map(|s| s.scope_label.as_str())
        .filter(|s| !s.is_empty())
        .collect();
    if labels.is_empty() {
        String::new()
    } else {
        labels.join(" | ")   // ← allocates even when labels.len() == 1
    }
}

normalize_scope (line 159) also allocates a String via scope.to_string() in its fallthrough branches. Both are in the hot path: normalize_scope is called by every reader_integrity, writer_integrity, merged_integrity, none_integrity, max_integrity, and cap_integrity invocation — one call per labeled item.

Suggested Change

Return Cow<'_, str> from both functions so the single-scope common case borrows from PolicyContext without allocating:

Before

pub(crate) fn policy_scope_token(scopes: &[PolicyScopeEntry]) -> String {
    let labels: Vec<&str> = scopes.iter()
        .map(|s| s.scope_label.as_str())
        .filter(|s| !s.is_empty())
        .collect();
    if labels.is_empty() {
        String::new()
    } else {
        labels.join(" | ")
    }
}

fn normalize_scope(scope: &str, ctx: &PolicyContext) -> String {
    let token = policy_scope_token(&ctx.scopes);
    if token.is_empty() {
        scope.to_string()
    } else if ctx.scopes.iter().any(|e| matches!(e.scope_kind, ScopeKind::All | ScopeKind::Public)) {
        token
    } else if let Some((owner, repo)) = split_repo_id(scope) {
        let matches_any = ctx.scopes.iter().any(|e| {
            repo_matches_scope(e.scope_kind, owner, repo,
                e.scope_owner.as_deref().unwrap_or(""),
                e.scope_repo.as_deref().unwrap_or(""))
        });
        if matches_any { token } else { scope.to_string() }
    } else {
        scope.to_string()
    }
}

After

use std::borrow::Cow;

pub(crate) fn policy_scope_token(scopes: &[PolicyScopeEntry]) -> Cow<'_, str> {
    let mut labels = scopes.iter()
        .map(|s| s.scope_label.as_str())
        .filter(|s| !s.is_empty());
    match (labels.next(), labels.next()) {
        (None, _)         => Cow::Borrowed(""),
        (Some(first), None) => Cow::Borrowed(first),           // ← zero alloc
        (Some(first), Some(second)) => {
            let mut s = String::from(first);
            s.push_str(" | ");
            s.push_str(second);
            for rest in labels { s.push_str(" | "); s.push_str(rest); }
            Cow::Owned(s)
        }
    }
}

fn normalize_scope<'a>(scope: &'a str, ctx: &'a PolicyContext) -> Cow<'a, str> {
    let token = policy_scope_token(&ctx.scopes);
    if token.is_empty() {
        Cow::Borrowed(scope)
    } else if ctx.scopes.iter().any(|e| matches!(e.scope_kind, ScopeKind::All | ScopeKind::Public)) {
        token
    } else if let Some((owner, repo)) = split_repo_id(scope) {
        let matches_any = ctx.scopes.iter().any(|e| {
            repo_matches_scope(e.scope_kind, owner, repo,
                e.scope_owner.as_deref().unwrap_or(""),
                e.scope_repo.as_deref().unwrap_or(""))
        });
        if matches_any { token } else { Cow::Borrowed(scope) }
    } else {
        Cow::Borrowed(scope)
    }
}

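Callers that pass &normalize_scope(...) to a parameter of type &str should compile unchanged, because &Cow<'_, str> deref-coerces to &str. Where coercion does not apply (for example, a generic parameter), use &*normalize_scope(...) or normalize_scope(...).as_ref(). Callers that need an owned String can call .into_owned(). Callers of policy_scope_token that check .is_empty() work unchanged via Deref<Target = str>.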
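The borrowing behavior can be checked in isolation. The following is a minimal sketch, not the crate's code: PolicyScopeEntry here is a hypothetical stand-in that models only the scope_label field, and entry and label_len are illustrative helpers that do not exist in the repository.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for the crate's PolicyScopeEntry; only the
// field read by policy_scope_token is modeled.
struct PolicyScopeEntry {
    scope_label: String,
}

// Same shape as the proposed function: borrow when zero or one
// non-empty label exists, build an owned String only when joining.
fn policy_scope_token(scopes: &[PolicyScopeEntry]) -> Cow<'_, str> {
    let mut labels = scopes
        .iter()
        .map(|s| s.scope_label.as_str())
        .filter(|s| !s.is_empty());
    match (labels.next(), labels.next()) {
        (None, _) => Cow::Borrowed(""),
        (Some(first), None) => Cow::Borrowed(first),
        (Some(first), Some(second)) => {
            let mut joined = String::from(first);
            for label in std::iter::once(second).chain(labels) {
                joined.push_str(" | ");
                joined.push_str(label);
            }
            Cow::Owned(joined)
        }
    }
}

// Illustrative helper for building test entries.
fn entry(label: &str) -> PolicyScopeEntry {
    PolicyScopeEntry { scope_label: label.to_string() }
}

// Illustrative caller that takes &str, to show that &Cow<str>
// can be passed where &str is expected.
fn label_len(scope: &str) -> usize {
    scope.len()
}
```

With a single entry the result is the Cow::Borrowed variant, and label_len(&token) compiles without any explicit conversion. With two or more non-empty labels the result is Cow::Owned, holding the labels joined by " | ".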

Why This Matters

In the common single-scope case (almost all production deployments), this eliminates 1–2 heap allocations per integrity label call. Integrity functions (reader_integrity, writer_integrity, etc.) are called once per item in every response — for a 100-item search result, this avoids 200+ allocations.


Improvement 2: Zero-allocation predicate in integrity_rank_normalized

Category: Performance
File(s): guards/github-guard/rust-guard/src/labels/helpers.rs
Effort: Small (< 15 min)
Risk: Low

Problem

integrity_rank_normalized (line 1286) checks a label's rank by calling format_integrity_label for each level (up to 4), allocating a new String per iteration just to test equality:

fn integrity_rank_normalized(normalized_scope: &str, labels: &[String]) -> u8 {
    for (rank, (prefix, base)) in INTEGRITY_LEVELS.iter().enumerate().rev() {
        let tag = format_integrity_label(prefix, normalized_scope, base);  // ← String alloc
        if labels.iter().any(|l| l == &tag) {
            return (rank + 1) as u8;
        }
    }
    0
}

format_integrity_label allocates for all three label shapes:

  • empty scope → base.to_string() (clones a static str)
  • multi-scope (|) → format!("integrity={};scopes=...")
  • common single scope → format!("{}{}", prefix, scope)

integrity_rank_normalized is called from cap_integrity, max_integrity, and integrity_rank — on every response item that requires integrity comparison.

Suggested Change

Add a label_matches_normalized predicate that checks equality for the common case without allocating. Only fall back to format_integrity_label for the rare multi-part scope case:

Before

fn integrity_rank_normalized(normalized_scope: &str, labels: &[String]) -> u8 {
    for (rank, (prefix, base)) in INTEGRITY_LEVELS.iter().enumerate().rev() {
        let tag = format_integrity_label(prefix, normalized_scope, base);
        if labels.iter().any(|l| l == &tag) {
            return (rank + 1) as u8;
        }
    }
    0
}

After

/// Check whether `label` equals `format_integrity_label(prefix, scope, base)`
/// without allocating for the common (non-multi-scope) cases.
#[inline]
fn label_matches_normalized(label: &str, prefix: &str, scope: &str, base: &str) -> bool {
    if scope.is_empty() {
        label == base
    } else if scope.contains('|') {
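        // Multi-scope: fall back to allocation (rare path). This formats once
        // per label tested; hoist the tag out of the caller's loop if this
        // path ever becomes hot.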
        label == format_integrity_label(prefix, scope, base)
    } else {
        // Common path: "{prefix}{scope}" — zero allocation
        label.len() == prefix.len() + scope.len()
            && label.starts_with(prefix)
            && label[prefix.len()..] == *scope
    }
}

fn integrity_rank_normalized(normalized_scope: &str, labels: &[String]) -> u8 {
    for (rank, (prefix, base)) in INTEGRITY_LEVELS.iter().enumerate().rev() {
        if labels.iter().any(|l| label_matches_normalized(l, prefix, normalized_scope, base)) {
            return (rank + 1) as u8;
        }
    }
    0
}
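One way to gain confidence in the predicate is to compare it against an allocating reference for the two label shapes spelled out above (empty scope and single scope). The sketch below makes several assumptions: the INTEGRITY_LEVELS values are hypothetical placeholders, reference_label models only those two shapes because the multi-scope format is elided in this issue, and the function names are illustrative. It uses strip_prefix in place of the manual length and slice checks, which is an alternative formulation rather than the proposed code.

```rust
// Hypothetical level table: (prefix, base) pairs ordered from lowest
// to highest rank. The real INTEGRITY_LEVELS values are not shown here.
const INTEGRITY_LEVELS: [(&str, &str); 2] = [("low:", "low"), ("high:", "high")];

// Allocating reference for the empty-scope and single-scope shapes only.
// The multi-scope format is intentionally not modeled.
fn reference_label(prefix: &str, scope: &str, base: &str) -> String {
    if scope.is_empty() {
        base.to_string()
    } else {
        format!("{}{}", prefix, scope)
    }
}

// Allocation-free check that `label` equals the reference label
// for a scope that does not contain '|'.
fn label_matches_single(label: &str, prefix: &str, scope: &str, base: &str) -> bool {
    if scope.is_empty() {
        label == base
    } else {
        // strip_prefix returns the remainder only if `label` starts with `prefix`.
        label.strip_prefix(prefix) == Some(scope)
    }
}

// Rank lookup in the same shape as the proposed integrity_rank_normalized:
// scan levels from highest to lowest and return the 1-based rank of the
// first level with a matching label, or 0 if none matches.
fn integrity_rank_sketch(scope: &str, labels: &[String]) -> u8 {
    for (rank, (prefix, base)) in INTEGRITY_LEVELS.iter().enumerate().rev() {
        if labels.iter().any(|l| label_matches_single(l, prefix, scope, base)) {
            return (rank + 1) as u8;
        }
    }
    0
}
```

Because levels are scanned from the highest rank down, a label set containing both levels for the same scope yields the higher rank. A test that compares label_matches_single against label == reference_label(...) over a small grid of scopes and labels would be a cheap regression guard if this change is adopted.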

Why This Matters

integrity_rank_normalized is called twice inside max_integrity (once for current, once for candidate) and twice inside cap_integrity. Each call previously formatted one String per level it visited, up to 4 per call. In a response with N items, each going through max_integrity, this eliminates up to 4 × 2 × N String allocations. For a 100-item response that is up to 800 avoided small-string allocations, which is meaningful for a WASM guard operating under tight memory constraints.


Codebase Health Summary

  • Total Rust files: 9
  • Total lines: 13,814
  • Areas analyzed: lib.rs, labels/helpers.rs, labels/mod.rs, labels/tool_rules.rs, labels/backend.rs, labels/response_items.rs, labels/response_paths.rs, labels/constants.rs, tools.rs
  • Areas with no further improvements: tools.rs (well-covered), labels/constants.rs (complete)

Generated by Rust Guard Improver • Run: §25911841563

  • expires on May 22, 2026, 10:09 AM UTC
