Hir::is_match_empty returns false for \b, but should return true #859

Closed
@BurntSushi

The predicate in question: https://docs.rs/regex-syntax/latest/regex_syntax/hir/struct.Hir.html#method.is_match_empty

The issue here is that is_match_empty returns true for \B but not for \b. I did this because \B matches "" but \b does not. However, as of regex 1.5.5, the following program runs without panicking:

use regex::Regex;

fn main() {
    let wb = Regex::new(r"\b").unwrap();
    let notwb = Regex::new(r"\B").unwrap();

    // On the empty haystack, only \B matches.
    assert!(!wb.is_match(""));
    assert!(notwb.is_match(""));

    // On "a", \b reports two zero-width matches, one on each side of the word.
    let got: Vec<_> = wb.find_iter("a").map(|m| m.range()).collect();
    assert_eq!(vec![0..0, 1..1], got);

    // On "a", \B reports no matches at all.
    let got: Vec<_> = notwb.find_iter("a").map(|m| m.range()).collect();
    assert!(got.is_empty());
}

This proves that \b does indeed report matches that correspond to the empty string: both 0..0 and 1..1 are zero-width. Therefore, it is a bug that is_match_empty returns false for \b. The underlying problem is that neither \B nor \b matches every empty string. Instead, each matches only a subset of empty strings, depending on the surrounding characters.
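To make the "subset of empty strings" point concrete, here is a minimal sketch that lists every position in a haystack where a word-boundary assertion holds. It is not how the regex crate implements \b: the helper names (is_word_byte, is_word_boundary, boundary_positions) are hypothetical, and it assumes ASCII-only word characters, whereas \b in the regex crate is Unicode-aware by default.

```rust
/// Returns true if `b` is an ASCII word byte, i.e. `[0-9A-Za-z_]`.
/// (Assumption: ASCII only; the regex crate's default \b is Unicode-aware.)
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Returns true if the zero-width position `at` lies between a word byte
/// and a non-word byte (or the start/end of the haystack).
fn is_word_boundary(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(haystack[at - 1]);
    let after = at < haystack.len() && is_word_byte(haystack[at]);
    before != after
}

/// Collects every position where the assertion holds.
/// `negate = false` models \b, `negate = true` models \B.
fn boundary_positions(haystack: &str, negate: bool) -> Vec<usize> {
    let bytes = haystack.as_bytes();
    (0..=bytes.len())
        .filter(|&at| is_word_boundary(bytes, at) != negate)
        .collect()
}
```

Under these assumptions, boundary_positions("a", false) yields positions 0 and 1, while boundary_positions("a", true) yields nothing; on the empty haystack the roles flip, with only the negated form holding at position 0. Every position is an empty string, but each assertion accepts only some of them.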
