
feat(postcodes/TH): bulk-import 1,189 Thailand postcodes via Thai Post (#1039)#1454

Merged
dr5hn merged 2 commits into master from feat/postcodes-thailand
Apr 27, 2026


@dr5hn dr5hn commented Apr 27, 2026

Summary

Adds 1,189 Thailand postcodes spanning all 77 Thai provinces, sourced
from the chawatvish/thailand_postcode_json mirror of Thai Post's
catalogue.

  • Coverage: 100% province resolution (77/77 in the source feed).
  • Granularity: amphoe (district) level — one record per
    (postcode, district) pair.
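
The "one record per (postcode, district) pair" rule can be sketched as below. This is a minimal illustration, not the importer's actual code: the field names `code` and `district` and the sample rows are assumptions.

```python
from typing import Iterable


def dedupe_records(rows: Iterable[dict]) -> list:
    """Keep the first record seen for each (postcode, district) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["code"], row["district"])  # hypothetical field names
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique


# Illustrative rows only; one postcode may legitimately serve two districts.
rows = [
    {"code": "10110", "district": "Khlong Toei"},
    {"code": "10110", "district": "Watthana"},
    {"code": "10110", "district": "Khlong Toei"},  # exact duplicate pair
]
deduped = dedupe_records(rows)
```

Keying on the pair rather than on the postcode alone keeps a postcode that is shared by several districts as several records.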

How it works

Thai Post's 5-digit code carries the province administrative number in
its first two digits, and CSC's states.json already uses the same
2-digit numeric iso2 for Thai provinces (10 = Bangkok, 50 = Chiang
Mai, 81 = Krabi, etc.). State resolution is a direct code[:2]
lookup — no name normalisation needed.
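
The prefix lookup described above amounts to something like the following sketch. The table holds only the three examples named in this description, and `state_from_prefix` is a hypothetical helper name, not a function from the importer.

```python
from typing import Optional

# Illustrative subset of states.json, keyed by the 2-digit numeric iso2.
STATES_BY_ISO2 = {"10": "Bangkok", "50": "Chiang Mai", "81": "Krabi"}


def state_from_prefix(postcode: str) -> Optional[str]:
    """Resolve a province from the first two digits of a 5-digit postcode."""
    return STATES_BY_ISO2.get(postcode[:2])


chiang_mai = state_from_prefix("50200")
unknown = state_from_prefix("99999")  # prefix not in the table
```

The lookup returns `None` for an unknown prefix, so a caller can count unresolved records instead of failing on the first one.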

Source & licence

Validation

  • python3 -m py_compile clean.
  • 100% of 1,189 codes match country.postal_code_regex (^(\d{5})$).
  • 100% of records resolve a valid state_id whose country_id == 219
    and whose iso2 matches state_code.
  • No auto-managed fields (id, created_at, updated_at, flag).
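
The checks listed above could be expressed roughly as follows. This is a sketch under assumptions: the record field names (`code`, `state_id`, `state_code`), the `validate` helper, and the state id `3500` are illustrative, not taken from the CSC validator.

```python
import re

POSTAL_CODE_REGEX = re.compile(r"^(\d{5})$")
THAILAND_COUNTRY_ID = 219
AUTO_MANAGED_FIELDS = {"id", "created_at", "updated_at", "flag"}


def validate(record: dict, states_by_id: dict) -> list:
    """Return a list of human-readable problems; empty means the record passes."""
    errors = []
    if not POSTAL_CODE_REGEX.match(record["code"]):
        errors.append("code does not match postal_code_regex")
    state = states_by_id.get(record["state_id"])
    if state is None:
        errors.append("state_id does not resolve")
    else:
        if state["country_id"] != THAILAND_COUNTRY_ID:
            errors.append("state belongs to another country")
        if state["iso2"] != record["state_code"]:
            errors.append("state_code disagrees with state iso2")
    leaked = AUTO_MANAGED_FIELDS.intersection(record)
    if leaked:
        errors.append("auto-managed fields present: " + ", ".join(sorted(leaked)))
    return errors


# Hypothetical state id; real ids come from states.json.
states_by_id = {3500: {"country_id": 219, "iso2": "10"}}
good = {"code": "10110", "state_id": 3500, "state_code": "10"}
bad = {"code": "1011", "state_id": 9999, "state_code": "10", "id": 1}
good_errors = validate(good, states_by_id)
bad_errors = validate(bad, states_by_id)
```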

Test plan

  • Importer compiles and runs on a clean checkout.
  • Cross-reference validator passes (regex + FK + state_code agreement).
  • Idempotent merge verified.
  • CI pipeline green.

Refs #1039.

🤖 Generated with Claude Code

#1039)

Source: Thai Post catalogue redistributed via the
chawatvish/thailand_postcode_json mirror. All 77 source provinces
resolve at 100% via direct postcode-prefix lookup — Thai Post's
5-digit code carries the province in its first two digits, which
exactly match CSC's 2-digit numeric state.iso2 (e.g. 10 = Bangkok,
50 = Chiang Mai). Records dedupe at (postcode, district) granularity.

Refs #1039.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 27, 2026 13:08

Copilot AI left a comment


Copilot wasn't able to review any files in this pull request.

@dosubot (bot) added the size:XS and enhancement labels Apr 27, 2026
@github-actions
Contributor

CSC Validation Report

PR Format

  • ✅ Description provided
  • ✅ Data source linked
  • ✅ Issue linked (recommended for data changes)
  • ✅ Justification / context provided

Labels applied: data:postcodes, large-contribution

⚠️ Large Contribution

This PR contains 1189 records. Large contributions require manual review.

Schema Validation (1189 records)

✅ All records passed validation

Cross-Reference Validation

✅ 2378 reference(s) verified

Source URL Verification

✅ 2 source URL(s) accessible


All checks passed | Status: Ready for review

… postal prefix

The original importer assumed that the first two digits of Thai Post's
5-digit postcode match the ISO 3166-2:TH iso2 ("Thai Post's 5-digit code
carries the province in its first two digits, which exactly match CSC's
2-digit numeric state.iso2"). This is wrong for ~10 provinces:

- Thai Post treats Bangkok and Samut Prakan as one zone (both 10xxx),
  so the postal prefix of each following province in the 11xxx-18xxx
  range is one lower than its CSC iso2.
- Concretely: 11xxx is Nonthaburi (not Samut Prakan), 12xxx is
  Pathum Thani (not Nonthaburi), ..., 18xxx is Saraburi (not Chai Nat).
- ~10% of records (115+) were attributed to the wrong province; the
  most visible miss was Saraburi (iso2=19), which was entirely absent
  while its 20 districts ended up under Chai Nat.
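
The mismatch can be reproduced with a small check like the one below. It is only a sketch: the iso2 table is a hand-picked subset, and the sample postcodes with their province labels are illustrative stand-ins for the source feed.

```python
# Subset of CSC's iso2 -> province mapping (ISO 3166-2:TH numbering).
ISO2_NAMES = {
    "10": "Bangkok",
    "11": "Samut Prakan",
    "12": "Nonthaburi",
    "18": "Chai Nat",
    "19": "Saraburi",
}

# (postcode, province the source feed files it under); illustrative samples.
SAMPLES = [
    ("10110", "Bangkok"),
    ("11000", "Nonthaburi"),
    ("18000", "Saraburi"),
]

# Collect every sample where the prefix-based guess disagrees with the feed.
mismatches = [
    (code, ISO2_NAMES.get(code[:2]), actual)
    for code, actual in SAMPLES
    if ISO2_NAMES.get(code[:2]) != actual
]
```

Here the Bangkok sample resolves correctly, while the 11xxx and 18xxx samples are attributed to Samut Prakan and Chai Nat respectively.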

The source feed already nests every postcode under its real province
name in Thai script. Switching to native-name lookup against
states.json gives the correct mapping:

- 11xxx → Nonthaburi (12)
- 17xxx → Chai Nat (18)
- 18xxx → Saraburi (19)
- 77 of 78 CSC provinces covered (only the Pattaya special
  administrative area is missing, as expected: it is not a separate
  entry in the source's 77-province list).
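
A native-name lookup of this kind might look like the sketch below. The `native` field name, the shape of the feed (province name mapped to a list of postcodes), and the helper name are assumptions made for illustration.

```python
def build_native_index(states: list) -> dict:
    """Index state records by their native-script name."""
    return {s["native"]: s for s in states if s.get("native")}


# Illustrative subset of states.json.
states = [
    {"iso2": "12", "name": "Nonthaburi", "native": "นนทบุรี"},
    {"iso2": "18", "name": "Chai Nat", "native": "ชัยนาท"},
    {"iso2": "19", "name": "Saraburi", "native": "สระบุรี"},
]

# Hypothetical feed shape: Thai province name -> postcodes nested under it.
feed = {"นนทบุรี": ["11000"], "ชัยนาท": ["17000"], "สระบุรี": ["18000"]}

index = build_native_index(states)
resolved = {
    code: index[province]["iso2"]
    for province, codes in feed.items()
    for code in codes
}
```

Because the province comes from the feed's own nesting, the postcode digits no longer influence which state a record is attached to.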

Two CSC-side native-name discrepancies required aliases:
- Bangkok native is "กรุงเทพฯ" (abbreviated) in CSC vs "กรุงเทพมหานคร" (formal) in source
- Nan native is "แนน" in CSC (typo) vs "น่าน" (correct) in source

Both are handled by NATIVE_ALIASES, so the fix doesn't depend on a
separate states.json correction.
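
One way such an alias table could work is sketched here. Only the name NATIVE_ALIASES and the two name pairs come from the commit message; the mapping direction (source spelling to CSC spelling), the `resolve_iso2` helper, and the tiny index are assumptions.

```python
from typing import Optional

# Source-feed spelling -> spelling currently stored in CSC's states.json.
NATIVE_ALIASES = {
    "กรุงเทพมหานคร": "กรุงเทพฯ",  # formal name vs CSC's abbreviated form
    "น่าน": "แนน",  # correct spelling vs CSC's typo
}

# Illustrative native-name index: CSC native name -> iso2.
CSC_NATIVE_TO_ISO2 = {"กรุงเทพฯ": "10", "แนน": "55", "สระบุรี": "19"}


def resolve_iso2(source_name: str) -> Optional[str]:
    """Translate a source province name through the aliases, then look it up."""
    csc_name = NATIVE_ALIASES.get(source_name, source_name)
    return CSC_NATIVE_TO_ISO2.get(csc_name)


bangkok = resolve_iso2("กรุงเทพมหานคร")
nan = resolve_iso2("น่าน")
saraburi = resolve_iso2("สระบุรี")  # no alias needed
```

Keeping the aliases on the importer side means a later correction in states.json only requires deleting the matching alias entry.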
@dr5hn dr5hn merged commit 518b3f0 into master Apr 27, 2026
1 check passed
@dr5hn dr5hn deleted the feat/postcodes-thailand branch April 27, 2026 13:18

Labels

data:postcodes, enhancement, large-contribution, ready-for-review, size:XS
