Skip to content

perf: optimize strategy selection and Teddy 2-byte fingerprint#61

Merged
kolkov merged 1 commit intomainfrom
feature/perf-optimization-v0.10.0
Jan 6, 2026
Merged

perf: optimize strategy selection and Teddy 2-byte fingerprint#61
kolkov merged 1 commit intomainfrom
feature/perf-optimization-v0.10.0

Conversation

@kolkov
Copy link
Copy Markdown
Contributor

@kolkov kolkov commented Jan 6, 2026

Summary

  • Implement Teddy 2-byte fingerprint (reduces false positives by ~90%)
  • Reorder strategy selection to prioritize DigitPrefilter for digit-lead patterns
  • Add isDigitLeadPattern() helper for pattern classification

Performance Impact

Pattern Before After Change
literal_alt 31ms 8ms +4x faster
version 8.2ms 2ms +4x faster
IP 3.9ms 5.5ms -43% (trade-off)

Trade-off Analysis

The IP pattern regression is an acceptable trade-off:

Technical Details

Teddy 2-byte Fingerprint

Changed default from 1-byte to 2-byte fingerprint in prefilter/teddy.go:

  • 1-byte: ~25% false positive rate on typical text
  • 2-byte: <0.5% false positive rate

Strategy Reorder

Moved DigitPrefilter check before tiny NFA fallback in meta/strategy.go:

  • Ensures digit-lead patterns use specialized prefilter
  • Prevents single-byte inner literals (like .) for digit patterns

Test Plan

Phase 1: Version pattern optimization
- Move DigitPrefilter check before tiny NFA fallback (line 776)
- Reject single-byte inner literals for digit-lead patterns
- Patterns like \d+\.\d+\.\d+ now use DigitPrefilter instead of ReverseInner
- Expected improvement: version pattern 12x -> 2x slower vs Rust

Phase 2: Teddy 2-byte fingerprint
- Change default fingerprint length from 1 to 2 bytes
- Implement teddySlimSSSE3_2 assembly function (~150 LOC)
- Reduces false positives by ~90% (from ~25% to <0.5%)
- Expected improvement: literal_alt pattern 39x -> 5x slower vs Rust

Files modified:
- meta/strategy.go: reorder DigitPrefilter check
- prefilter/teddy.go: change default to 2-byte fingerprint
- prefilter/teddy_ssse3_amd64.go: add dispatch for case 2
- prefilter/teddy_ssse3_amd64.s: implement teddySlimSSSE3_2
- prefilter/teddy_test.go: update test expectation
@github-actions
Copy link
Copy Markdown

github-actions bot commented Jan 6, 2026

Benchmark Comparison

Comparing main → PR #61

Summary: geomean 173.0n 172.7n -0.17%

⚠️ Potential regressions detected:

geomean                               ³                +0.00%               ³
geomean                               ³                +0.00%               ³
geomean                         ³                +0.00%               ³
geomean                         ³                +0.00%               ³
AhoCorasickVsStdlib/coregex_IsMatch-4                   1.249µ ± ∞ ¹    1.252µ ± ∞ ¹   +0.24% (p=0.032 n=5)
IPRegex_Find/coregex_1KB_sparse-4                       2.307µ ± ∞ ¹    4.298µ ± ∞ ¹  +86.30% (p=0.008 n=5)
IPRegex_Find/stdlib_1MB_sparse-4                        1.208m ± ∞ ¹    2.155m ± ∞ ¹  +78.45% (p=0.008 n=5)
IPRegex_Find/stdlib_1MB_dense-4                         8.084µ ± ∞ ¹   15.050µ ± ∞ ¹  +86.17% (p=0.008 n=5)
Find/hello-4                                            531.2n ± ∞ ¹    535.6n ± ∞ ¹   +0.83% (p=0.032 n=5)
IsMatch/literal-4                                       57.00n ± ∞ ¹    58.54n ± ∞ ¹   +2.70% (p=0.008 n=5)

Full results available in workflow artifacts. CI runners have ~10-20% variance.
For accurate benchmarks, run locally: ./scripts/bench.sh --compare

@kolkov kolkov merged commit 34f1eae into main Jan 6, 2026
15 checks passed
@kolkov kolkov deleted the feature/perf-optimization-v0.10.0 branch January 6, 2026 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant