
fix(countries/MH): make postal_code_regex +4 extension optional (#1039)#1414

Merged
dr5hn merged 1 commit into master from fix/mh-postal-regex-extension-optional
Apr 25, 2026

@dr5hn
Owner

@dr5hn dr5hn commented Apr 25, 2026

Summary

Fixes a latent bug in Marshall Islands' postal_code_regex that was discovered while preparing postcode-data PRs for #1039.

- "postal_code_regex": "^969\\d{2}(-\\d{4})$"
+ "postal_code_regex": "^969\\d{2}(?:-\\d{4})?$"

Why

The MH regex required the -#### extension to be present. That blocked the legitimate 5-digit ZIPs 96960 (Majuro) and 96970 (Ebeye) from passing the cross-reference validator's check of code against postal_code_regex, introduced in #1398. Two issues in one regex:

  1. Missing ? after the extension group → made the +4 mandatory
  2. Capturing group (...) instead of non-capturing (?:...) → inconsistent with how VI and PR encode the same #####-#### shape
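
The two issues can be reproduced in isolation. The snippet below is a minimal sketch using Python's `re` module as a stand-in for whatever engine consumes the field; the names `BEFORE` and `AFTER` are illustrative only. The patterns are written as raw strings, so the doubled backslashes from the JSON file appear single here.

```python
import re

# Patterns from the diff, unescaped from their JSON form.
BEFORE = re.compile(r"^969\d{2}(-\d{4})$")
AFTER = re.compile(r"^969\d{2}(?:-\d{4})?$")

# Issue 1: without the trailing '?', the extension group is mandatory,
# so a plain 5-digit ZIP fails.
print(BEFORE.match("96960") is not None)  # False
print(AFTER.match("96960") is not None)   # True

# Issue 2: the capturing form defines one group (the extension);
# the non-capturing form defines none.
print(BEFORE.groups)  # 1
print(AFTER.groups)   # 0
```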

Comparison with similar countries

| Country | Format | Regex (before / after) |
|---|---|---|
| US Virgin Islands (VI) | #####-#### | ^008\d{2}(?:-\d{4})?$ ← already correct |
| Puerto Rico (PR) | #####-#### | ^00[679]\d{2}(?:-\d{4})?$ ← already correct |
| Marshall Islands (MH) | #####-#### | was: ^969\d{2}(-\d{4})$ now: ^969\d{2}(?:-\d{4})?$ |
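
As a rough consistency check, the three "after" patterns can be run against a 5-digit code and the same code with a +4 extension. This is a sketch only: the sample codes in `SAMPLES` and the helper `accepts_both_forms` are assumptions chosen to fit each pattern, not values taken from the dataset.

```python
import re

# "After" patterns for the three countries compared in this section.
PATTERNS = {
    "VI": r"^008\d{2}(?:-\d{4})?$",
    "PR": r"^00[679]\d{2}(?:-\d{4})?$",
    "MH": r"^969\d{2}(?:-\d{4})?$",
}

# Illustrative 5-digit inputs, picked only because they fit each prefix.
SAMPLES = {"VI": "00801", "PR": "00901", "MH": "96960"}


def accepts_both_forms(pattern: str, base: str) -> bool:
    """True if the pattern matches both the 5-digit and the ZIP+4 form."""
    rx = re.compile(pattern)
    return bool(rx.match(base)) and bool(rx.match(base + "-1234"))


for iso2, pattern in PATTERNS.items():
    print(iso2, accepts_both_forms(pattern, SAMPLES[iso2]))
```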

Validation

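| Code | Before | After | Expected |
|---|---|---|---|
| 96960 (Majuro) | no match | match | match |
| 96970 (Ebeye) | no match | match | match |
| 96960-1234 | match | match | match |
| 96960-12 | no match | no match | no match |
| 123456 | no match | no match | no match |
| 969XX | no match | no match | no match |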
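
The six codes in this section can be re-checked with a short script. This is a sketch using Python's `re` module, not the project's own test harness; `CASES`, `BEFORE`, and `AFTER` are illustrative names.

```python
import re

BEFORE = re.compile(r"^969\d{2}(-\d{4})$")
AFTER = re.compile(r"^969\d{2}(?:-\d{4})?$")

# (code, expected result under the corrected regex)
CASES = [
    ("96960", True),       # Majuro
    ("96970", True),       # Ebeye
    ("96960-1234", True),  # full ZIP+4
    ("96960-12", False),   # truncated extension
    ("123456", False),     # wrong prefix and length
    ("969XX", False),      # non-digit characters
]

for code, expected in CASES:
    before = bool(BEFORE.match(code))
    after = bool(AFTER.match(code))
    print(f"{code:<12}{before!s:<8}{after!s:<8}{expected}")
    # The corrected pattern should agree with the expectation for every case.
    assert after == expected
```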

Impact

  • Unblocks a future MH postcode-data PR (would have hit code does not match postal_code_regex errors)
  • No effect on existing data — countries.json MH record has postal_code_format: "#####-####", no actual postcodes use this regex yet
  • Diff is exactly 1 line
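
To illustrate why a postcode-data PR would have been blocked, here is a hypothetical sketch of a cross-reference check. The function `check_postcodes` and the record shape (a `code` key per postcode, a `postal_code_regex` key on the country) are assumptions made for illustration; the actual validator from #1398 may be structured differently.

```python
import re


def check_postcodes(country: dict, postcodes: list[dict]) -> list[str]:
    """Return one error message per code that fails the country's regex.

    Hypothetical helper; not the project's real validator.
    """
    rx = re.compile(country["postal_code_regex"])
    return [
        f"{rec['code']}: code does not match postal_code_regex"
        for rec in postcodes
        if not rx.match(rec["code"])
    ]


mh_postcodes = [{"code": "96960"}, {"code": "96970"}]

old_mh = {"postal_code_regex": r"^969\d{2}(-\d{4})$"}
new_mh = {"postal_code_regex": r"^969\d{2}(?:-\d{4})?$"}

print(check_postcodes(old_mh, mh_postcodes))  # both codes reported
print(check_postcodes(new_mh, mh_postcodes))  # []
```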

Refs: #1039

The Marshall Islands postal_code_regex required the +4 extension
('-####') to be present:

  Before:  ^969\d{2}(-\d{4})$
  After:   ^969\d{2}(?:-\d{4})?$

This was inconsistent with how the same #####-#### format is encoded
for VI (Virgin Islands) and PR (Puerto Rico), where the extension is
optional. It also blocked legitimate 5-digit MH ZIPs (96960 Majuro,
96970 Ebeye) from passing the cross-reference validator's regex check
introduced in #1398.

Two changes in one regex:
1. Added '?' after the extension group → makes it optional
2. Changed '(...)' → '(?:...)' → non-capturing group, matching the
   convention used in VI/PR

Validated:
- 96960 (Majuro), 96970 (Ebeye) now match
- 96960-1234 (full +4) still matches
- Invalid codes (96960-12, 123456, 969XX) correctly rejected

Refs: #1039

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 25, 2026 15:30
@dosubot dosubot Bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Apr 25, 2026
@github-actions
Contributor

CSC Validation Report

PR Format

  • ✅ Description provided
  • ❌ Data source linked
  • ✅ Issue linked (recommended for data changes)
  • ✅ Justification / context provided

Labels applied: data:countries

Schema Validation (250 records)

Errors (blocking):

  • ❌ contributions/countries/countries.json: Record 1 ("Afghanistan"): "id" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 1 ("Afghanistan"): "created_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 1 ("Afghanistan"): "updated_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 1 ("Afghanistan"): "flag" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 2 ("Aland Islands"): "id" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 2 ("Aland Islands"): "created_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 2 ("Aland Islands"): "updated_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 2 ("Aland Islands"): "flag" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 3 ("Albania"): "id" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 3 ("Albania"): "created_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 3 ("Albania"): "updated_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 3 ("Albania"): "flag" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 4 ("Algeria"): "id" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 4 ("Algeria"): "created_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 4 ("Algeria"): "updated_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 4 ("Algeria"): "flag" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 5 ("American Samoa"): "id" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 5 ("American Samoa"): "created_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 5 ("American Samoa"): "updated_at" must not be included (auto-managed)
  • ❌ contributions/countries/countries.json: Record 5 ("American Samoa"): "flag" must not be included (auto-managed)
  • ...and 980 more errors

Warnings:

  • ⚠️ contributions/countries/countries.json: Record 1 ("Afghanistan"): unknown field "population"
  • ⚠️ contributions/countries/countries.json: Record 1 ("Afghanistan"): unknown field "gdp"
  • ⚠️ contributions/countries/countries.json: Record 1 ("Afghanistan"): unknown field "area_sq_km"
  • ⚠️ contributions/countries/countries.json: Record 1 ("Afghanistan"): unknown field "postal_code_format"
  • ⚠️ contributions/countries/countries.json: Record 1 ("Afghanistan"): unknown field "postal_code_regex"
  • ⚠️ contributions/countries/countries.json: Record 2 ("Aland Islands"): unknown field "population"
  • ⚠️ contributions/countries/countries.json: Record 2 ("Aland Islands"): unknown field "gdp"
  • ⚠️ contributions/countries/countries.json: Record 2 ("Aland Islands"): unknown field "area_sq_km"
  • ⚠️ contributions/countries/countries.json: Record 2 ("Aland Islands"): unknown field "postal_code_format"
  • ⚠️ contributions/countries/countries.json: Record 2 ("Aland Islands"): unknown field "postal_code_regex"
  • ...and 1240 more warnings

Duplicate Detection

  • ⚠️ contributions/countries/countries.json: Record 1 ("Afghanistan") appears to be a duplicate of existing "Afghanistan" (id: 1, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 2 ("Aland Islands") appears to be a duplicate of existing "Aland Islands" (id: 2, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 3 ("Albania") appears to be a duplicate of existing "Albania" (id: 3, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 4 ("Algeria") appears to be a duplicate of existing "Algeria" (id: 4, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 5 ("American Samoa") appears to be a duplicate of existing "American Samoa" (id: 5, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 6 ("Andorra") appears to be a duplicate of existing "Andorra" (id: 6, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 7 ("Angola") appears to be a duplicate of existing "Angola" (id: 7, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 8 ("Anguilla") appears to be a duplicate of existing "Anguilla" (id: 8, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 9 ("Antarctica") appears to be a duplicate of existing "Antarctica" (id: 9, distance: 0.0km)
  • ⚠️ contributions/countries/countries.json: Record 10 ("Antigua and Barbuda") appears to be a duplicate of existing "Antigua and Barbuda" (id: 10, distance: 0.0km)

1000 error(s), 1500 warning(s) | Status: Changes required

Please fix the errors above and push a new commit. Refer to our Contribution Guidelines for details.

@dosubot dosubot Bot added bug Something isn't working fixed Issue has been fixed labels Apr 25, 2026

Copilot AI left a comment


Pull request overview

Updates the Marshall Islands (MH) country record so its postal_code_regex accepts both 5-digit ZIPs and optional ZIP+4 extensions, aligning validation behavior with similar territories (e.g., PR/VI) and unblocking postcode cross-reference validation.

Changes:

  • Adjust MH postal_code_regex to make the -#### extension optional.
  • Switch the extension group to non-capturing for consistency with existing patterns.

@dr5hn dr5hn merged commit 1ebd8fa into master Apr 25, 2026
5 checks passed
@dr5hn dr5hn deleted the fix/mh-postal-regex-extension-optional branch April 25, 2026 15:32
dr5hn added a commit that referenced this pull request Apr 25, 2026
Adds the 2 main Marshall Islands ZIP codes:
  96960 Majuro (capital, Majuro Atoll)
  96970 Ebeye (Kwajalein Atoll)

Uses US ZIP system (Compact of Free Association). Now passes the
cross-reference validator after the MH regex fix in #1414 made the
+4 extension optional.

Refs: #1039
Builds on: #1414

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Labels

bug Something isn't working data:countries fixed Issue has been fixed needs-changes size:XS This PR changes 0-9 lines, ignoring generated files.


2 participants