Commit dfe4121

evhub and claude committed
Allow parenthesized match patterns in string/sequence/iter captures
In unambiguous capture positions (head-tail, init-last, head-last splits), captures can now be parenthesized match patterns like view patterns or equality checks, e.g. `(int -> n) + "G" = "8G"`. Search split positions (two captures) still require bare variable names.

Closes #882

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ccb0df5 commit dfe4121
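As a rough illustration of what the commit message describes, the Python sketch below emulates a single-capture string split such as `(int -> n) + "G" = "8G"`: the fixed prefix and suffix determine the slice, and the view function is applied to whatever remains. The helper name `match_affix_capture` and its return shape are hypothetical and are not part of Coconut; the code Coconut actually compiles to may differ, and this sketch simply lets any exception from the view function propagate.

```python
def match_affix_capture(value, prefix, suffix, view):
    """Emulate `prefix + (view -> n) + suffix = value` (hypothetical helper).

    Returns (True, n) when the fixed prefix and suffix match, else (False, None).
    """
    if not isinstance(value, str):
        return False, None
    # The prefix and suffix must not overlap inside the value.
    if len(value) < len(prefix) + len(suffix):
        return False, None
    if not (value.startswith(prefix) and value.endswith(suffix)):
        return False, None
    # With only one capture, the captured slice is fully determined.
    middle = value[len(prefix):len(value) - len(suffix)]
    return True, view(middle)


print(match_affix_capture("8G", "", "G", int))
print(match_affix_capture("[42]", "[", "]", int))
```

Because there is exactly one capture, the split point is fixed by the literals alone, which is one way to read the "unambiguous capture positions" wording above.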

6 files changed: +45 -17 lines changed

DOCS.md

Lines changed: 15 additions & 13 deletions
@@ -1197,6 +1197,8 @@ infix_pattern ::= bar_or_pattern ("`" EXPR "`" [EXPR])* # infix check
 
 bar_or_pattern ::= pattern ("|" pattern)*  # match any
 
+capture ::= NAME | "(" pattern ")"  # in unambiguous capture positions
+
 base_pattern ::= (
     "(" pattern ")"  # parentheses
     | "None" | "True" | "False"  # constants
@@ -1230,29 +1232,29 @@ base_pattern ::= (
     | [(  # sequence splits
         "(" patterns ")"
         | "[" patterns "]"
-      ) "+"] NAME ["+" (
+      ) "+"] capture ["+" (
         "(" patterns ")"  # this match must be the same
         | "[" patterns "]"  # construct as the first match
-      )] ["+" NAME ["+" (
-        "(" patterns ")"  # and same here
+      )] ["+" NAME ["+" (  # search splits require NAME
+        "(" patterns ")"  # must be same as first match
         | "[" patterns "]"
       )]]
     | [(  # iterable splits
         "(" patterns ")"
         | "[" patterns "]"
         | "(|" patterns "|)"
-      ) "::"] NAME ["::" (
+      ) "::"] capture ["::" (
         "(" patterns ")"
         | "[" patterns "]"
         | "(|" patterns "|)"
-      )] [ "::" NAME [
+      )] [ "::" NAME [  # search splits require NAME
         "(" patterns ")"
         | "[" patterns "]"
         | "(|" patterns "|)"
       ]]
-    | [STRING "+"] NAME  # complex string matching
+    | [STRING "+"] capture  # complex string matching
         ["+" STRING]
-        ["+" NAME ["+" STRING]]
+        ["+" NAME ["+" STRING]]  # search splits require NAME
     )
 ```

@@ -1287,13 +1289,13 @@ base_pattern ::= (
 - Sequence Destructuring:
     - Lists (`[<patterns>]`), Tuples (`(<patterns>)`): will only match a sequence (`collections.abc.Sequence`) of the same length, and will check the contents against `<patterns>` (Coconut automatically registers `numpy` arrays and `collections.deque` objects as sequences).
     - Lazy lists (`(|<patterns>|)`): same as list or tuple matching, but checks for an Iterable (`collections.abc.Iterable`) instead of a Sequence.
-    - Head-Tail Splits (`<list/tuple> + <var>` or `(<patterns>, *<var>)`): will match the beginning of the sequence against the `<list/tuple>`/`<patterns>`, then bind the rest to `<var>`, and make it the type of the construct used.
-    - Init-Last Splits (`<var> + <list/tuple>` or `(*<var>, <patterns>)`): exactly the same as head-tail splits, but on the end instead of the beginning of the sequence.
-    - Head-Last Splits (`<list/tuple> + <var> + <list/tuple>` or `(<patterns>, *<var>, <patterns>)`): the combination of a head-tail and an init-last split.
-    - Search Splits (`<var1> + <list/tuple> + <var2>` or `(*<var1>, <patterns>, *<var2>)`): searches for the first occurrence of the `<list/tuple>`/`<patterns>` in the sequence, then puts everything before into `<var1>` and everything after into `<var2>`.
+    - Head-Tail Splits (`<list/tuple> + <capture>` or `(<patterns>, *<var>)`): will match the beginning of the sequence against the `<list/tuple>`/`<patterns>`, then bind the rest to `<capture>`, and make it the type of the construct used. `<capture>` can be a variable name or a parenthesized match pattern (e.g. `(int -> x)`).
+    - Init-Last Splits (`<capture> + <list/tuple>` or `(*<var>, <patterns>)`): exactly the same as head-tail splits, but on the end instead of the beginning of the sequence.
+    - Head-Last Splits (`<list/tuple> + <capture> + <list/tuple>` or `(<patterns>, *<var>, <patterns>)`): the combination of a head-tail and an init-last split.
+    - Search Splits (`<var1> + <list/tuple> + <var2>` or `(*<var1>, <patterns>, *<var2>)`): searches for the first occurrence of the `<list/tuple>`/`<patterns>` in the sequence, then puts everything before into `<var1>` and everything after into `<var2>`. Search split captures must be variable names, not parenthesized patterns.
     - Head-Last Search Splits (`<list/tuple> + <var> + <list/tuple> + <var> + <list/tuple>` or `(<patterns>, *<var>, <patterns>, *<var>, <patterns>)`): the combination of a head-tail split and a search split.
-    - Iterable Splits (`<list/tuple/lazy list> :: <var> :: <list/tuple/lazy list> :: <var> :: <list/tuple/lazy list>`): same as other sequence destructuring, but works on any iterable (`collections.abc.Iterable`), including infinite iterators (note that if an iterator is matched against it will be modified unless it is [`reiterable`](#reiterable)).
-    - Complex String Matching (`<string> + <var> + <string> + <var> + <string>`): string matching supports the same destructuring options as above.
+    - Iterable Splits (`<list/tuple/lazy list> :: <capture> :: <list/tuple/lazy list> :: <var> :: <list/tuple/lazy list>`): same as other sequence destructuring, but works on any iterable (`collections.abc.Iterable`), including infinite iterators (note that if an iterator is matched against it will be modified unless it is [`reiterable`](#reiterable)).
+    - Complex String Matching (`<string> + <capture> + <string> + <var> + <string>`): string matching supports the same destructuring options as above. In unambiguous positions (single capture), `<capture>` can be a variable name or a parenthesized match pattern (e.g. `(int -> n) + "px"`).
 
 _Note: Like [iterator slicing](#iterator-slicing), iterator and lazy list matching make no guarantee that the original iterator matched against be preserved (to preserve the iterator, use Coconut's [`reiterable`](#reiterable) built-in)._
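The search-split bullet in the DOCS.md change above describes finding the first occurrence of the middle pattern and handing everything before and after it to two plain variables. A minimal Python sketch of that documented behavior follows; the function name `search_split` is hypothetical, the middle is simplified to a literal subsequence rather than arbitrary patterns, and this is not how Coconut itself implements the match.

```python
def search_split(seq, middle):
    """Split `seq` around the first occurrence of `middle` (hypothetical helper).

    Returns (before, after), or None when `middle` does not occur in `seq`.
    """
    n = len(middle)
    for i in range(len(seq) - n + 1):
        # Compare element-wise so lists, tuples, and strings all work.
        if list(seq[i:i + n]) == list(middle):
            return seq[:i], seq[i + n:]
    return None


print(search_split([1, 2, 3, 2, 3, 4], [2, 3]))
print(search_split("a1b1c", "1"))
```

In this sketch the split point is chosen by the search alone, so the two captures only receive the resulting pieces. That is consistent with the documented rule that search split captures are bare names.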

coconut/compiler/grammar.py

Lines changed: 7 additions & 3 deletions
@@ -2056,6 +2056,10 @@ class Grammar(object):
     del_stmt = addspace(keyword("del") - simple_assignlist)
 
     interior_name_match = labeled_group(setname, "var")
+    interior_capture_match = (
+        interior_name_match
+        | labeled_group(lparen.suppress() + match + rparen.suppress(), "paren")
+    )
     matchlist_anon_named_tuple_item = (
         Group(Optional(dot) + unsafe_name) + equals + match
         | Group(Optional(dot) + interior_name_match) + equals
@@ -2103,18 +2107,18 @@ class Grammar(object):
     match_string = interleaved_tokenlist(
         # f_string_atom must come first
         f_string_atom("f_string") | fixed_len_string_tokens("string"),
-        interior_name_match("capture"),
+        interior_capture_match("capture"),
         plus,
         at_least_two=True,
     )("string_sequence")
     sequence_match = interleaved_tokenlist(
         (match_list | match_tuple)("literal"),
-        interior_name_match("capture"),
+        interior_capture_match("capture"),
         plus,
     )("sequence")
     iter_match = interleaved_tokenlist(
         (match_list | match_tuple | match_lazy)("literal"),
-        interior_name_match("capture"),
+        interior_capture_match("capture"),
         unsafe_dubcolon,
         at_least_two=True,
     )("iter")

coconut/compiler/matching.py

Lines changed: 2 additions & 0 deletions
@@ -812,6 +812,8 @@ def handle_sequence(self, seq_type, seq_groups, item, iter_match=False):
         if len(seq_groups) == 3:
             (front_gtype, front_match), mid_group, (back_gtype, back_match) = seq_groups
             internal_assert(front_gtype == "capture" == back_gtype, "invalid sequence match middle groups", seq_groups)
+            if "paren" in front_match or "paren" in back_match:
+                raise CoconutDeferredSyntaxError("parenthesized match patterns cannot be used in sequence search patterns", self.loc)
             mid_gtype, mid_contents = mid_group
 
             if iter_match:
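The check added to `handle_sequence` rejects a search split as soon as either of its two captures is labeled `"paren"`. The standalone Python sketch below mirrors that logic with plain dicts standing in for the parser's labeled groups. `DeferredSyntaxError`, `check_search_split`, and the dict-based group shape are all hypothetical stand-ins for illustration, not Coconut's real types.

```python
class DeferredSyntaxError(Exception):
    """Stand-in for CoconutDeferredSyntaxError (hypothetical, illustration only)."""


def check_search_split(seq_groups):
    """Raise if a capture/middle/capture split uses a parenthesized capture.

    Each group is assumed to be a (group_type, payload) pair, where a capture
    payload is a dict keyed by "var" or "paren".
    """
    if len(seq_groups) != 3:
        return  # not a three-group split; nothing to check in this sketch
    (front_gtype, front_match), _mid_group, (back_gtype, back_match) = seq_groups
    if front_gtype == "capture" == back_gtype:
        if "paren" in front_match or "paren" in back_match:
            raise DeferredSyntaxError(
                "parenthesized match patterns cannot be used in sequence search patterns"
            )


# A search split with two bare names passes silently.
check_search_split([
    ("capture", {"var": "x"}),
    ("string", '"1"'),
    ("capture", {"var": "y"}),
])
```

The two new `extras.coco` assertions exercise the same rule from the outside: `x + "1" + (=="2")` and `(==x) + "1" + y` are both expected to raise `CoconutSyntaxError`.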

coconut/root.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
 VERSION = "3.2.0"
 VERSION_NAME = None
 # False for release, int >= 1 for develop
-DEVELOP = 13
+DEVELOP = 14
 ALPHA = False  # for pre releases rather than post releases
 
 assert DEVELOP is False or DEVELOP >= 1, "DEVELOP must be False or an int >= 1"

coconut/tests/src/cocotest/agnostic/primary_2.coco

Lines changed: 16 additions & 0 deletions
@@ -732,6 +732,22 @@ def primary_test_2() -> bool:
     assert isinstance(lazy_d, lazy_deque)
     assert issubclass(type(lazy_d), lazy_deque)
 
+    # Issue #882: parenthesized match patterns in string captures
+    (int -> n882) + "G" = "8G"
+    assert n882 == 8
+    "G" + (int -> n882b) = "G8"
+    assert n882b == 8
+    "[" + (int -> n882c) + "]" = "[42]"
+    assert n882c == 42
+
+    a882 = "hello"
+    (==a882) + " world" = "hello world"
+
+    match (int -> mv882) + "px" in "100px":
+        assert mv882 == 100
+    else:
+        assert False
+
     # deferred ImportError on lazy imports
     lazy import coconut_nonexistent_test_module
     assert_raises(=> coconut_nonexistent_test_module.attr, ImportError)

coconut/tests/src/extras.coco

Lines changed: 4 additions & 0 deletions
@@ -143,6 +143,10 @@ def test_setup_none() -> bool:
     assert_raises(-> parse("def f(x) = return x"), CoconutSyntaxError)
     assert_raises(-> parse("def f(x) =\n return x"), CoconutSyntaxError)
     assert_raises(-> parse("10 20"), CoconutSyntaxError)
+    assert_raises(-> parse('x + "1" + (=="2") = "test"'), CoconutSyntaxError)
+    assert_raises(-> parse('(==x) + "1" + y = "test"'), CoconutSyntaxError)
+    assert_raises(-> parse(r'd"\n hello"'), CoconutSyntaxError)
+    assert_raises(-> parse('d""" hello"""'), CoconutSyntaxError)
 
     assert_raises(-> parse("()[(())"), CoconutSyntaxError, err_has="""
 unclosed open '[' (line 1)
